* [PATCH] drm/i915/execlists: Split the atomic test_and_clear_bit for irq handler
@ 2017-03-21 11:33 Chris Wilson
  2017-03-21 12:59 ` ✓ Fi.CI.BAT: success for " Patchwork
  2017-03-21 14:09 ` [PATCH] " Tvrtko Ursulin
  0 siblings, 2 replies; 4+ messages in thread
From: Chris Wilson @ 2017-03-21 11:33 UTC
  To: intel-gfx

Rather than impose the cost of a locked test before queuing a new
request, reduce it to a simple test_bit() with a following clear_bit()
prior to doing the CSB check. This ensures that if an interrupt does
occur whilst reading from the CSB, we still detect it (the interrupt
would trigger a rescheduling of the tasklet anyway).

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
 drivers/gpu/drm/i915/intel_lrc.c | 7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
index 296d125d8665..3154b98dc971 100644
--- a/drivers/gpu/drm/i915/intel_lrc.c
+++ b/drivers/gpu/drm/i915/intel_lrc.c
@@ -530,13 +530,18 @@ static void intel_lrc_irq_handler(unsigned long data)
 
 	intel_uncore_forcewake_get(dev_priv, engine->fw_domains);
 
-	while (test_and_clear_bit(ENGINE_IRQ_EXECLIST, &engine->irq_posted)) {
+	/* Prefer doing test_and_clear_bit() as a two stage operation to avoid
+	 * imposing the cost of a locked atomic transaction when submitting a
+	 * new request (outside of the context-switch interrupt).
+	 */
+	while (test_bit(ENGINE_IRQ_EXECLIST, &engine->irq_posted)) {
 		u32 __iomem *csb_mmio =
 			dev_priv->regs + i915_mmio_reg_offset(RING_CONTEXT_STATUS_PTR(engine));
 		u32 __iomem *buf =
 			dev_priv->regs + i915_mmio_reg_offset(RING_CONTEXT_STATUS_BUF_LO(engine, 0));
 		unsigned int csb, head, tail;
 
+		clear_bit(ENGINE_IRQ_EXECLIST, &engine->irq_posted);
 		csb = readl(csb_mmio);
 		head = GEN8_CSB_READ_PTR(csb);
 		tail = GEN8_CSB_WRITE_PTR(csb);
-- 
2.11.0

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

* ✓ Fi.CI.BAT: success for drm/i915/execlists: Split the atomic test_and_clear_bit for irq handler
  2017-03-21 11:33 [PATCH] drm/i915/execlists: Split the atomic test_and_clear_bit for irq handler Chris Wilson
@ 2017-03-21 12:59 ` Patchwork
  2017-03-21 14:09 ` [PATCH] " Tvrtko Ursulin
  1 sibling, 0 replies; 4+ messages in thread
From: Patchwork @ 2017-03-21 12:59 UTC
  To: Chris Wilson; +Cc: intel-gfx

== Series Details ==

Series: drm/i915/execlists: Split the atomic test_and_clear_bit for irq handler
URL   : https://patchwork.freedesktop.org/series/21603/
State : success

== Summary ==

Series 21603v1 drm/i915/execlists: Split the atomic test_and_clear_bit for irq handler
https://patchwork.freedesktop.org/api/1.0/series/21603/revisions/1/mbox/

Test gem_exec_suspend:
        Subgroup basic-s4-devices:
                dmesg-warn -> PASS       (fi-bxt-t5700) fdo#100125

fdo#100125 https://bugs.freedesktop.org/show_bug.cgi?id=100125

fi-bdw-5557u     total:278  pass:267  dwarn:0   dfail:0   fail:0   skip:11  time: 457s
fi-bsw-n3050     total:278  pass:239  dwarn:0   dfail:0   fail:0   skip:39  time: 589s
fi-bxt-j4205     total:278  pass:259  dwarn:0   dfail:0   fail:0   skip:19  time: 532s
fi-bxt-t5700     total:278  pass:258  dwarn:0   dfail:0   fail:0   skip:20  time: 555s
fi-byt-j1900     total:278  pass:251  dwarn:0   dfail:0   fail:0   skip:27  time: 499s
fi-byt-n2820     total:278  pass:247  dwarn:0   dfail:0   fail:0   skip:31  time: 502s
fi-hsw-4770      total:278  pass:262  dwarn:0   dfail:0   fail:0   skip:16  time: 439s
fi-hsw-4770r     total:278  pass:262  dwarn:0   dfail:0   fail:0   skip:16  time: 431s
fi-ilk-650       total:278  pass:228  dwarn:0   dfail:0   fail:0   skip:50  time: 434s
fi-ivb-3520m     total:278  pass:260  dwarn:0   dfail:0   fail:0   skip:18  time: 517s
fi-ivb-3770      total:278  pass:260  dwarn:0   dfail:0   fail:0   skip:18  time: 493s
fi-kbl-7500u     total:278  pass:260  dwarn:0   dfail:0   fail:0   skip:18  time: 491s
fi-skl-6260u     total:278  pass:268  dwarn:0   dfail:0   fail:0   skip:10  time: 478s
fi-skl-6700hq    total:278  pass:261  dwarn:0   dfail:0   fail:0   skip:17  time: 590s
fi-skl-6700k     total:278  pass:256  dwarn:4   dfail:0   fail:0   skip:18  time: 493s
fi-skl-6770hq    total:278  pass:268  dwarn:0   dfail:0   fail:0   skip:10  time: 510s
fi-snb-2520m     total:278  pass:250  dwarn:0   dfail:0   fail:0   skip:28  time: 545s
fi-snb-2600      total:278  pass:249  dwarn:0   dfail:0   fail:0   skip:29  time: 417s

84b98c6d4f5f6ce0ce17b8fd07629dfc71e7d829 drm-tip: 2017y-03m-21d-11h-08m-23s UTC integration manifest
bf4538e drm/i915/execlists: Split the atomic test_and_clear_bit for irq handler

== Logs ==

For more details see: https://intel-gfx-ci.01.org/CI/Patchwork_4244/

* Re: [PATCH] drm/i915/execlists: Split the atomic test_and_clear_bit for irq handler
  2017-03-21 11:33 [PATCH] drm/i915/execlists: Split the atomic test_and_clear_bit for irq handler Chris Wilson
  2017-03-21 12:59 ` ✓ Fi.CI.BAT: success for " Patchwork
@ 2017-03-21 14:09 ` Tvrtko Ursulin
  2017-03-21 14:18   ` Chris Wilson
  1 sibling, 1 reply; 4+ messages in thread
From: Tvrtko Ursulin @ 2017-03-21 14:09 UTC
  To: Chris Wilson, intel-gfx


On 21/03/2017 11:33, Chris Wilson wrote:
> Rather than impose the cost of a locked test before queuing a new
> request, reduce it to a simple test_bit() with a following clear_bit()
> prior to doing the CSB check. This ensures that if an interrupt does
> occur whilst reading from the CSB, we still detect it (the interrupt
> would trigger a rescheduling of the tasklet anyway).
>
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> ---
>  drivers/gpu/drm/i915/intel_lrc.c | 7 ++++++-
>  1 file changed, 6 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
> index 296d125d8665..3154b98dc971 100644
> --- a/drivers/gpu/drm/i915/intel_lrc.c
> +++ b/drivers/gpu/drm/i915/intel_lrc.c
> @@ -530,13 +530,18 @@ static void intel_lrc_irq_handler(unsigned long data)
>
>  	intel_uncore_forcewake_get(dev_priv, engine->fw_domains);
>
> -	while (test_and_clear_bit(ENGINE_IRQ_EXECLIST, &engine->irq_posted)) {
> +	/* Prefer doing test_and_clear_bit() as a two stage operation to avoid
> +	 * imposing the cost of a locked atomic transaction when submitting a
> +	 * new request (outside of the context-switch interrupt).
> +	 */
> +	while (test_bit(ENGINE_IRQ_EXECLIST, &engine->irq_posted)) {
>  		u32 __iomem *csb_mmio =
>  			dev_priv->regs + i915_mmio_reg_offset(RING_CONTEXT_STATUS_PTR(engine));
>  		u32 __iomem *buf =
>  			dev_priv->regs + i915_mmio_reg_offset(RING_CONTEXT_STATUS_BUF_LO(engine, 0));
>  		unsigned int csb, head, tail;
>
> +		clear_bit(ENGINE_IRQ_EXECLIST, &engine->irq_posted);
>  		csb = readl(csb_mmio);
>  		head = GEN8_CSB_READ_PTR(csb);
>  		tail = GEN8_CSB_WRITE_PTR(csb);
>

Looks safe to me from the point of view of potential races. If a new 
interrupt arrives and sets the bit just before the tasklet clears it, we 
would process the full set of CSB updates on the following line.

I can also confirm that it has a real effect of bringing the CPU usage 
of this interrupt handler down.
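
For readers who want to see the reasoning above in isolation, here is a
minimal userspace sketch using C11 atomics. It is not the i915 code:
struct fake_engine, fake_irq(), fake_tasklet(), the csb_pending counter and
the inject_mid_pass knob are all hypothetical stand-ins, and the CSB is
modelled as a simple event counter. It only illustrates that testing with a
plain load and clearing the flag before reading the status buffer does not
lose events, at the cost of an occasional extra pass through the loop.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for ENGINE_IRQ_EXECLIST. */
#define IRQ_EXECLIST_BIT 0u
static_assert(IRQ_EXECLIST_BIT < 8 * sizeof(unsigned long),
	      "bit must fit in the flag word");

/* Hypothetical model of an engine; the CSB is reduced to a counter. */
struct fake_engine {
	atomic_ulong irq_posted;      /* flag word set by the "interrupt" */
	atomic_uint csb_pending;      /* events written but not yet consumed */
	unsigned int processed;       /* events consumed by the "tasklet" */
	unsigned int inject_mid_pass; /* test knob: interrupts to fire mid-pass */
};

/* Interrupt side: publish the event first, then post the flag. */
static void fake_irq(struct fake_engine *e)
{
	atomic_fetch_add(&e->csb_pending, 1);
	atomic_fetch_or(&e->irq_posted, 1ul << IRQ_EXECLIST_BIT);
}

/* Analogue of test_bit(): a plain load, no read-modify-write. */
static bool flag_test(struct fake_engine *e)
{
	unsigned long v = atomic_load_explicit(&e->irq_posted,
					       memory_order_relaxed);
	return (v & (1ul << IRQ_EXECLIST_BIT)) != 0;
}

/* Analogue of clear_bit(): an atomic read-modify-write, paid only when
 * the flag was actually seen set. */
static void flag_clear(struct fake_engine *e)
{
	atomic_fetch_and(&e->irq_posted, ~(1ul << IRQ_EXECLIST_BIT));
}

/* Tasklet side: returns how many passes through the loop were made. */
static unsigned int fake_tasklet(struct fake_engine *e)
{
	unsigned int passes = 0;

	while (flag_test(e)) {
		/* Clear BEFORE reading, so a later interrupt re-posts it. */
		flag_clear(e);

		/* Simulate an interrupt landing after the clear. */
		if (e->inject_mid_pass) {
			e->inject_mid_pass--;
			fake_irq(e);
		}

		/* "Read the CSB": consume everything published so far. */
		e->processed += atomic_exchange(&e->csb_pending, 0);
		passes++;
	}
	return passes;
}
```

When nothing is posted, fake_tasklet() only performs the plain load and
returns without touching the flag word atomically, which is the case the
patch is trying to make cheap. When an interrupt lands after the clear, its
event is consumed by the read that follows and the re-posted flag simply
causes one more (empty) pass.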

Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

Regards,

Tvrtko

* Re: [PATCH] drm/i915/execlists: Split the atomic test_and_clear_bit for irq handler
  2017-03-21 14:09 ` [PATCH] " Tvrtko Ursulin
@ 2017-03-21 14:18   ` Chris Wilson
  0 siblings, 0 replies; 4+ messages in thread
From: Chris Wilson @ 2017-03-21 14:18 UTC
  To: Tvrtko Ursulin; +Cc: intel-gfx

On Tue, Mar 21, 2017 at 02:09:43PM +0000, Tvrtko Ursulin wrote:
> 
> On 21/03/2017 11:33, Chris Wilson wrote:
> >Rather than impose the cost of a locked test before queuing a new
> >request, reduce it to a simple test_bit() with a following clear_bit()
> >prior to doing the CSB check. This ensures that if an interrupt does
> >occur whilst reading from the CSB, we still detect it (the interrupt
> >would trigger a rescheduling of the tasklet anyway).
> >
> >Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> >---
> > drivers/gpu/drm/i915/intel_lrc.c | 7 ++++++-
> > 1 file changed, 6 insertions(+), 1 deletion(-)
> >
> >diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
> >index 296d125d8665..3154b98dc971 100644
> >--- a/drivers/gpu/drm/i915/intel_lrc.c
> >+++ b/drivers/gpu/drm/i915/intel_lrc.c
> >@@ -530,13 +530,18 @@ static void intel_lrc_irq_handler(unsigned long data)
> >
> > 	intel_uncore_forcewake_get(dev_priv, engine->fw_domains);
> >
> >-	while (test_and_clear_bit(ENGINE_IRQ_EXECLIST, &engine->irq_posted)) {
> >+	/* Prefer doing test_and_clear_bit() as a two stage operation to avoid
> >+	 * imposing the cost of a locked atomic transaction when submitting a
> >+	 * new request (outside of the context-switch interrupt).
> >+	 */
> >+	while (test_bit(ENGINE_IRQ_EXECLIST, &engine->irq_posted)) {
> > 		u32 __iomem *csb_mmio =
> > 			dev_priv->regs + i915_mmio_reg_offset(RING_CONTEXT_STATUS_PTR(engine));
> > 		u32 __iomem *buf =
> > 			dev_priv->regs + i915_mmio_reg_offset(RING_CONTEXT_STATUS_BUF_LO(engine, 0));
> > 		unsigned int csb, head, tail;
> >
> >+		clear_bit(ENGINE_IRQ_EXECLIST, &engine->irq_posted);
> > 		csb = readl(csb_mmio);
> > 		head = GEN8_CSB_READ_PTR(csb);
> > 		tail = GEN8_CSB_WRITE_PTR(csb);
> >
> 
> Looks safe to me from the point of view of potential races. If a new
> interrupt arrives and sets the bit just before the tasklet clears
> it, we would process the full set of CSB updates on the following
> line.
> 
> I can also confirm that it has a real effect of bringing the CPU
> usage of this interrupt handler down.
> 
> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

Thanks, pushed because CI is getting bored (or I'm getting bored of it
ignoring some patches ;)
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre