* [PATCH] drm/i915: Cancel pending execlists irq handler upon idling
@ 2017-06-27 15:25 Chris Wilson
  2017-06-27 15:45 ` ✓ Fi.CI.BAT: success for " Patchwork
  2017-06-28  8:59 ` [PATCH] " Tvrtko Ursulin
  0 siblings, 2 replies; 6+ messages in thread
From: Chris Wilson @ 2017-06-27 15:25 UTC (permalink / raw)
  To: intel-gfx

Due to the slight asynchronicity in handling the execlists interrupts
(i.e. we defer the work to a handler that may consume more than one
interrupt event), when the engine is idle we may still have an irq
tasklet queued (especially when it has been deferred to a ksoftirqd).
At the beginning of the tasklet, we assert that we do hold a device
wakeref for the access we are about to perform. This assumes that when
we idle and release the GT wakeref, all execlists work has been
completed (since the elsp tracking says the hw is idle). However, there
may still be a tasklet queued, so as we mark the engine idle, also
cancel any pending tasklet.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
 Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
---
 drivers/gpu/drm/i915/intel_engine_cs.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/gpu/drm/i915/intel_engine_cs.c b/drivers/gpu/drm/i915/intel_engine_cs.c
index 3b46c1f7b88b..49e875c46c96 100644
--- a/drivers/gpu/drm/i915/intel_engine_cs.c
+++ b/drivers/gpu/drm/i915/intel_engine_cs.c
@@ -1328,6 +1328,7 @@ void intel_engines_mark_idle(struct drm_i915_private *i915)
 	for_each_engine(engine, i915, id) {
 		intel_engine_disarm_breadcrumbs(engine);
 		i915_gem_batch_pool_fini(&engine->batch_pool);
+		tasklet_kill(&engine->irq_tasklet);
 		engine->no_priolist = false;
 	}
 }
-- 
2.13.1

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx
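
The race described in the commit message can be sketched as a small, single-threaded toy model. This is only an illustration under stated assumptions: every name below (toy_engine, toy_irq, toy_tasklet_run, toy_mark_idle, toy_simulate) is hypothetical and is not i915 code, and the cancel_pending flag merely plays the role that tasklet_kill() plays in the patch. Instead of asserting on the missing wakeref, the model counts handler runs that happen after the wakeref has been dropped.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model only: none of these names exist in the i915 driver. */
struct toy_engine {
	bool gt_awake;       /* stands in for the GT wakeref */
	bool irq_posted;     /* interrupt seen but not yet consumed */
	bool tasklet_queued; /* handler scheduled but not yet run */
	int late_runs;       /* handler runs that happened without a wakeref */
};

/* Interrupt: post the event, then queue the deferred handler. */
static void toy_irq(struct toy_engine *e)
{
	e->irq_posted = true;
	e->tasklet_queued = true;
}

/* Deferred handler: the real one asserts the wakeref; here we count. */
static void toy_tasklet_run(struct toy_engine *e)
{
	e->tasklet_queued = false;
	if (!e->gt_awake)
		e->late_runs++;
	e->irq_posted = false;
}

/* Idle transition: optionally cancel queued work, then drop the wakeref. */
static void toy_mark_idle(struct toy_engine *e, bool cancel_pending)
{
	assert(!e->irq_posted); /* the idle check has already passed */
	if (cancel_pending)
		e->tasklet_queued = false; /* role of tasklet_kill() */
	e->gt_awake = false;
}

/* Returns how many handler runs occurred without a wakeref. */
static int toy_simulate(bool cancel_pending)
{
	struct toy_engine e = { .gt_awake = true };

	toy_irq(&e);              /* first interrupt queues the handler */
	e.tasklet_queued = false; /* the handler starts running */
	toy_irq(&e);              /* second interrupt re-queues it */
	e.irq_posted = false;     /* running handler consumes both events */

	toy_mark_idle(&e, cancel_pending);
	if (e.tasklet_queued)
		toy_tasklet_run(&e); /* stale run, e.g. from ksoftirqd */
	return e.late_runs;
}
```

Calling toy_simulate(false) models the behaviour before the patch, where the stale handler run happens after the wakeref is gone; toy_simulate(true) models cancelling the pending work while marking the engine idle.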


* ✓ Fi.CI.BAT: success for drm/i915: Cancel pending execlists irq handler upon idling
  2017-06-27 15:25 [PATCH] drm/i915: Cancel pending execlists irq handler upon idling Chris Wilson
@ 2017-06-27 15:45 ` Patchwork
  2017-06-28  8:59 ` [PATCH] " Tvrtko Ursulin
  1 sibling, 0 replies; 6+ messages in thread
From: Patchwork @ 2017-06-27 15:45 UTC (permalink / raw)
  To: Chris Wilson; +Cc: intel-gfx

== Series Details ==

Series: drm/i915: Cancel pending execlists irq handler upon idling
URL   : https://patchwork.freedesktop.org/series/26441/
State : success

== Summary ==

Series 26441v1 drm/i915: Cancel pending execlists irq handler upon idling
https://patchwork.freedesktop.org/api/1.0/series/26441/revisions/1/mbox/

Test gem_exec_flush:
        Subgroup basic-batch-kernel-default-uc:
                fail       -> PASS       (fi-snb-2600) fdo#100007
Test gem_exec_suspend:
        Subgroup basic-s4-devices:
                pass       -> DMESG-WARN (fi-kbl-r) fdo#100125
Test kms_cursor_legacy:
        Subgroup basic-busy-flip-before-cursor-legacy:
                fail       -> PASS       (fi-snb-2600) fdo#100215
Test kms_pipe_crc_basic:
        Subgroup hang-read-crc-pipe-a:
                pass       -> DMESG-WARN (fi-pnv-d510) fdo#101597 +1

fdo#100007 https://bugs.freedesktop.org/show_bug.cgi?id=100007
fdo#100125 https://bugs.freedesktop.org/show_bug.cgi?id=100125
fdo#100215 https://bugs.freedesktop.org/show_bug.cgi?id=100215
fdo#101597 https://bugs.freedesktop.org/show_bug.cgi?id=101597

fi-bdw-5557u     total:279  pass:268  dwarn:0   dfail:0   fail:0   skip:11  time:443s
fi-bdw-gvtdvm    total:279  pass:257  dwarn:8   dfail:0   fail:0   skip:14  time:433s
fi-blb-e6850     total:279  pass:224  dwarn:1   dfail:0   fail:0   skip:54  time:353s
fi-bsw-n3050     total:279  pass:242  dwarn:1   dfail:0   fail:0   skip:36  time:524s
fi-bxt-j4205     total:279  pass:260  dwarn:0   dfail:0   fail:0   skip:19  time:508s
fi-byt-j1900     total:279  pass:253  dwarn:2   dfail:0   fail:0   skip:24  time:499s
fi-byt-n2820     total:279  pass:249  dwarn:2   dfail:0   fail:0   skip:28  time:486s
fi-glk-2a        total:279  pass:260  dwarn:0   dfail:0   fail:0   skip:19  time:600s
fi-hsw-4770      total:279  pass:263  dwarn:0   dfail:0   fail:0   skip:16  time:435s
fi-hsw-4770r     total:279  pass:263  dwarn:0   dfail:0   fail:0   skip:16  time:419s
fi-ilk-650       total:279  pass:229  dwarn:0   dfail:0   fail:0   skip:50  time:419s
fi-ivb-3520m     total:279  pass:261  dwarn:0   dfail:0   fail:0   skip:18  time:494s
fi-ivb-3770      total:279  pass:261  dwarn:0   dfail:0   fail:0   skip:18  time:470s
fi-kbl-7500u     total:279  pass:261  dwarn:0   dfail:0   fail:0   skip:18  time:471s
fi-kbl-7560u     total:279  pass:269  dwarn:0   dfail:0   fail:0   skip:10  time:571s
fi-kbl-r         total:279  pass:260  dwarn:1   dfail:0   fail:0   skip:18  time:585s
fi-pnv-d510      total:279  pass:222  dwarn:2   dfail:0   fail:0   skip:55  time:556s
fi-skl-6260u     total:279  pass:269  dwarn:0   dfail:0   fail:0   skip:10  time:451s
fi-skl-6700hq    total:279  pass:223  dwarn:1   dfail:0   fail:30  skip:24  time:340s
fi-skl-6700k     total:279  pass:257  dwarn:4   dfail:0   fail:0   skip:18  time:464s
fi-skl-6770hq    total:279  pass:269  dwarn:0   dfail:0   fail:0   skip:10  time:489s
fi-skl-gvtdvm    total:279  pass:266  dwarn:0   dfail:0   fail:0   skip:13  time:433s
fi-snb-2520m     total:279  pass:251  dwarn:0   dfail:0   fail:0   skip:28  time:536s
fi-snb-2600      total:279  pass:250  dwarn:0   dfail:0   fail:0   skip:29  time:402s

bf26e1dbbba24a7697559f1131d4be99747b7646 drm-tip: 2017y-06m-27d-13h-59m-07s UTC integration manifest
1e7aaf7 drm/i915: Cancel pending execlists irq handler upon idling

== Logs ==

For more details see: https://intel-gfx-ci.01.org/CI/Patchwork_5051/

* Re: [PATCH] drm/i915: Cancel pending execlists irq handler upon idling
  2017-06-27 15:25 [PATCH] drm/i915: Cancel pending execlists irq handler upon idling Chris Wilson
  2017-06-27 15:45 ` ✓ Fi.CI.BAT: success for " Patchwork
@ 2017-06-28  8:59 ` Tvrtko Ursulin
  2017-06-28 10:01   ` Chris Wilson
  1 sibling, 1 reply; 6+ messages in thread
From: Tvrtko Ursulin @ 2017-06-28  8:59 UTC (permalink / raw)
  To: Chris Wilson, intel-gfx


On 27/06/2017 16:25, Chris Wilson wrote:
> Due to the slight asynchronicity in handling the execlists interrupts
> (i.e. we defer the work to a handler that may consume more than one
> interrupt event), when the engine is idle we may still have an irq
> tasklet queued (especially when it has been deferred to a ksoftirqd).
> At the beginning of the tasklet, we assert that we do hold a device
> wakeref for the access we are about to perform. This assumes that when
> we idle and release the GT wakeref, all execlists work has been
> completed (since the elsp tracking says the hw is idle). However, there
> may still be a tasklet queued, so as we mark the engine idle, also
> cancel any pending tasklet.

We check the irq_posted bit which should correspond with a pending 
tasklet (intel_engines_are_idle/intel_engine_is_idle), before 
transitioning to idle so I don't understand this.

Regards,

Tvrtko

> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
>   Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> ---
>   drivers/gpu/drm/i915/intel_engine_cs.c | 1 +
>   1 file changed, 1 insertion(+)
> 
> diff --git a/drivers/gpu/drm/i915/intel_engine_cs.c b/drivers/gpu/drm/i915/intel_engine_cs.c
> index 3b46c1f7b88b..49e875c46c96 100644
> --- a/drivers/gpu/drm/i915/intel_engine_cs.c
> +++ b/drivers/gpu/drm/i915/intel_engine_cs.c
> @@ -1328,6 +1328,7 @@ void intel_engines_mark_idle(struct drm_i915_private *i915)
>   	for_each_engine(engine, i915, id) {
>   		intel_engine_disarm_breadcrumbs(engine);
>   		i915_gem_batch_pool_fini(&engine->batch_pool);
> +		tasklet_kill(&engine->irq_tasklet);
>   		engine->no_priolist = false;
>   	}
>   }
> 

* Re: [PATCH] drm/i915: Cancel pending execlists irq handler upon idling
  2017-06-28  8:59 ` [PATCH] " Tvrtko Ursulin
@ 2017-06-28 10:01   ` Chris Wilson
  2017-06-28 10:15     ` Tvrtko Ursulin
  0 siblings, 1 reply; 6+ messages in thread
From: Chris Wilson @ 2017-06-28 10:01 UTC (permalink / raw)
  To: Tvrtko Ursulin, intel-gfx

Quoting Tvrtko Ursulin (2017-06-28 09:59:04)
> 
> On 27/06/2017 16:25, Chris Wilson wrote:
> > Due to the slight asynchronicity in handling the execlists interrupts
> > (i.e. we defer the work to a handler that may consume more than one
> > interrupt event), when the engine is idle we may still have an irq
> > tasklet queued (especially when it has been deferred to a ksoftirqd).
> > At the beginning of the tasklet, we assert that we do hold a device
> > wakeref for the access we are about to perform. This assumes that when
> > we idle and release the GT wakeref, all execlists work has been
> > completed (since the elsp tracking says the hw is idle). However, there
> > may still be a tasklet queued, so as we mark the engine idle, also
> > cancel any pending tasklet.
> 
> We check the irq_posted bit which should correspond with a pending 
> tasklet (intel_engines_are_idle/intel_engine_is_idle), before 
> transitioning to idle so I don't understand this.

Exactly, we've processed the interrupt in the current irq handler, but
due to the ordering (which is essential to ensure that we don't miss an
interrupt, i.e. the strong ordering is via the tasklet atomic ops so
that each interrupt is always followed by a tasklet, at least if we do
have elsp[]!) we can queue a second tasklet execution despite it already
being handled concurrently.

Run long enough and this rare event will then coincide with idling.
-Chris
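
The ordering argument above can be sketched with a toy "scheduled" bit. This is a minimal model under stated assumptions, not the kernel implementation: toy_tasklet, toy_tasklet_schedule, toy_tasklet_begin and toy_requeue_count are hypothetical names. It relies on one property of kernel tasklets, namely that the scheduled bit is cleared before the handler function is invoked, so an interrupt arriving while the handler runs queues a fresh execution, while interrupts arriving before the handler starts are coalesced into one run.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Toy model only: these names are illustrative, not kernel API. */
struct toy_tasklet {
	atomic_bool sched; /* models the tasklet's "scheduled" state bit */
	int queued;        /* times the tasklet was put on the run list */
};

/* Queue the tasklet unless it is already scheduled (test-and-set). */
static bool toy_tasklet_schedule(struct toy_tasklet *t)
{
	if (atomic_exchange(&t->sched, true))
		return false; /* already pending: the events coalesce */
	t->queued++;
	return true;
}

/* The scheduled bit is cleared before the handler body runs. */
static void toy_tasklet_begin(struct toy_tasklet *t)
{
	assert(atomic_load(&t->sched)); /* only a scheduled tasklet may run */
	atomic_store(&t->sched, false);
}

/* Three interrupts: two before the handler starts, one while it runs. */
static int toy_requeue_count(void)
{
	struct toy_tasklet t = { .queued = 0 };

	atomic_init(&t.sched, false);
	toy_tasklet_schedule(&t); /* irq 1: queued */
	toy_tasklet_schedule(&t); /* irq 2: coalesced with irq 1 */
	toy_tasklet_begin(&t);    /* handler starts, bit cleared */
	toy_tasklet_schedule(&t); /* irq 3: queued again */
	return t.queued;
}
```

In this model the second queued execution is the one that can later coincide with idling, even though the running handler may already have consumed the event that triggered it.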

* Re: [PATCH] drm/i915: Cancel pending execlists irq handler upon idling
  2017-06-28 10:01   ` Chris Wilson
@ 2017-06-28 10:15     ` Tvrtko Ursulin
  2017-06-28 10:29       ` Chris Wilson
  0 siblings, 1 reply; 6+ messages in thread
From: Tvrtko Ursulin @ 2017-06-28 10:15 UTC (permalink / raw)
  To: Chris Wilson, intel-gfx


On 28/06/2017 11:01, Chris Wilson wrote:
> Quoting Tvrtko Ursulin (2017-06-28 09:59:04)
>>
>> On 27/06/2017 16:25, Chris Wilson wrote:
>>> Due to the slight asynchronicity in handling the execlists interrupts
>>> (i.e. we defer the work to a handler that may consume more than one
>>> interrupt event), when the engine is idle we may still have an irq
>>> tasklet queued (especially when it has been deferred to a ksoftirqd).
>>> At the beginning of the tasklet, we assert that we do hold a device
>>> wakeref for the access we are about to perform. This assumes that when
>>> we idle and release the GT wakeref, all execlists work has been
>>> completed (since the elsp tracking says the hw is idle). However, there
>>> may still be a tasklet queued, so as we mark the engine idle, also
>>> cancel any pending tasklet.
>>
>> We check the irq_posted bit which should correspond with a pending
>> tasklet (intel_engines_are_idle/intel_engine_is_idle), before
>> transitioning to idle so I don't understand this.
> 
> Exactly, we've processed the interrupt in the current irq handler, but
> due to the ordering (which is essential to ensure that we don't miss an
> interrupt, i.e. the strong ordering is via the tasklet atomic ops so
> that each interrupt is always followed by a tasklet, at least if we do
> have elsp[]!) we can queue a second tasklet execution despite it already
> being handled concurrently.
> 
> Run long enough and this rare event will then coincide with idling.

Ah rings a bell now. That would mean the irq_posted bit being consumed 
by the current tasklet, and then the next one coming along as designed.

Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

Regards,

Tvrtko

* Re: [PATCH] drm/i915: Cancel pending execlists irq handler upon idling
  2017-06-28 10:15     ` Tvrtko Ursulin
@ 2017-06-28 10:29       ` Chris Wilson
  0 siblings, 0 replies; 6+ messages in thread
From: Chris Wilson @ 2017-06-28 10:29 UTC (permalink / raw)
  To: Tvrtko Ursulin, intel-gfx

Quoting Tvrtko Ursulin (2017-06-28 11:15:36)
> 
> On 28/06/2017 11:01, Chris Wilson wrote:
> > Quoting Tvrtko Ursulin (2017-06-28 09:59:04)
> >>
> >> On 27/06/2017 16:25, Chris Wilson wrote:
> >>> Due to the slight asynchronicity in handling the execlists interrupts
> >>> (i.e. we defer the work to a handler that may consume more than one
> >>> interrupt event), when the engine is idle we may still have an irq
> >>> tasklet queued (especially when it has been deferred to a ksoftirqd).
> >>> At the beginning of the tasklet, we assert that we do hold a device
> >>> wakeref for the access we are about to perform. This assumes that when
> >>> we idle and release the GT wakeref, all execlists work has been
> >>> completed (since the elsp tracking says the hw is idle). However, there
> >>> may still be a tasklet queued, so as we mark the engine idle, also
> >>> cancel any pending tasklet.
> >>
> >> We check the irq_posted bit which should correspond with a pending
> >> tasklet (intel_engines_are_idle/intel_engine_is_idle), before
> >> transitioning to idle so I don't understand this.
> > 
> > Exactly, we've processed the interrupt in the current irq handler, but
> > due to the ordering (which is essential to ensure that we don't miss an
> > interrupt, i.e. the strong ordering is via the tasklet atomic ops so
> > that each interrupt is always followed by a tasklet, at least if we do
> > have elsp[]!) we can queue a second tasklet execution despite it already
> > being handled concurrently.
> > 
> > Run long enough and this rare event will then coincide with idling.
> 
> Ah rings a bell now. That would mean the irq_posted bit being consumed 
> by the current tasklet, and then the next one coming along as designed.

Phew, I didn't have to draw the ascii flow chart. Now, for the final
execution that triggers the bug, we shouldn't actually touch hw (the irq
posted bit should be clear, and the execlists queue should also be
empty), so we could argue that the bug-on is in error -- but I think it
is a useful bug-on to document the intent of the wakeref being held on
our behalf, and cancelling the extra work seems sensible.
 
> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

Ta, consider it pushed. Not sure if CI has seen it yet, so no bugzilla
to close.
-Chris

Thread overview: 6+ messages
2017-06-27 15:25 [PATCH] drm/i915: Cancel pending execlists irq handler upon idling Chris Wilson
2017-06-27 15:45 ` ✓ Fi.CI.BAT: success for " Patchwork
2017-06-28  8:59 ` [PATCH] " Tvrtko Ursulin
2017-06-28 10:01   ` Chris Wilson
2017-06-28 10:15     ` Tvrtko Ursulin
2017-06-28 10:29       ` Chris Wilson
