* [PATCH] drm/i915/execlists: Use a locked clear_bit() for synchronisation with interrupt
@ 2018-03-21  9:10 Chris Wilson
  2018-03-21  9:19 ` ✗ Fi.CI.CHECKPATCH: warning for " Patchwork
                   ` (8 more replies)
  0 siblings, 9 replies; 17+ messages in thread
From: Chris Wilson @ 2018-03-21  9:10 UTC (permalink / raw)
  To: intel-gfx; +Cc: Mika Kuoppala

We were relying on the uncached reads when processing the CSB to provide
ourselves with the serialisation with the interrupt handler (so we could
detect new interrupts in the middle of processing the old one). However,
in commit 767a983ab255 ("drm/i915/execlists: Read the context-status HEAD
from the HWSP") those uncached reads were eliminated (on one path at
least) and along with them our serialisation. The result is that we
would very rarely miss notification of a new interrupt and leave a
context-switch unprocessed, hanging the GPU.

Fixes: 767a983ab255 ("drm/i915/execlists: Read the context-status HEAD from the HWSP")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Michel Thierry <michel.thierry@intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Mika Kuoppala <mika.kuoppala@intel.com>
---
 drivers/gpu/drm/i915/intel_lrc.c | 21 ++++++++-------------
 1 file changed, 8 insertions(+), 13 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
index 53f1c009ed7b..67b6a0f658d6 100644
--- a/drivers/gpu/drm/i915/intel_lrc.c
+++ b/drivers/gpu/drm/i915/intel_lrc.c
@@ -831,7 +831,8 @@ static void execlists_submission_tasklet(unsigned long data)
 	struct drm_i915_private *dev_priv = engine->i915;
 	bool fw = false;
 
-	/* We can skip acquiring intel_runtime_pm_get() here as it was taken
+	/*
+	 * We can skip acquiring intel_runtime_pm_get() here as it was taken
 	 * on our behalf by the request (see i915_gem_mark_busy()) and it will
 	 * not be relinquished until the device is idle (see
 	 * i915_gem_idle_work_handler()). As a precaution, we make sure
@@ -840,7 +841,8 @@ static void execlists_submission_tasklet(unsigned long data)
 	 */
 	GEM_BUG_ON(!dev_priv->gt.awake);
 
-	/* Prefer doing test_and_clear_bit() as a two stage operation to avoid
+	/*
+	 * Prefer doing test_and_clear_bit() as a two stage operation to avoid
 	 * imposing the cost of a locked atomic transaction when submitting a
 	 * new request (outside of the context-switch interrupt).
 	 */
@@ -856,17 +858,10 @@ static void execlists_submission_tasklet(unsigned long data)
 			execlists->csb_head = -1; /* force mmio read of CSB ptrs */
 		}
 
-		/* The write will be ordered by the uncached read (itself
-		 * a memory barrier), so we do not need another in the form
-		 * of a locked instruction. The race between the interrupt
-		 * handler and the split test/clear is harmless as we order
-		 * our clear before the CSB read. If the interrupt arrived
-		 * first between the test and the clear, we read the updated
-		 * CSB and clear the bit. If the interrupt arrives as we read
-		 * the CSB or later (i.e. after we had cleared the bit) the bit
-		 * is set and we do a new loop.
-		 */
-		__clear_bit(ENGINE_IRQ_EXECLIST, &engine->irq_posted);
+		/* Clear before reading to catch new interrupts */
+		clear_bit(ENGINE_IRQ_EXECLIST, &engine->irq_posted);
+		smp_mb__after_atomic();
+
 		if (unlikely(execlists->csb_head == -1)) { /* following a reset */
 			if (!fw) {
 				intel_uncore_forcewake_get(dev_priv,
-- 
2.16.2

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx
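
The clear-before-read ordering that the patch above restores can be modelled in userspace. What follows is a minimal sketch, not driver code: it assumes C11 <stdatomic.h> in place of the kernel's clear_bit() and smp_mb__after_atomic(), and the names engine_model, model_init, model_irq and model_tasklet are hypothetical, introduced only for illustration.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-in for ENGINE_IRQ_EXECLIST. */
#define MODEL_IRQ_EXECLIST 0u

/* Hypothetical userspace model of the state shared with the interrupt. */
struct engine_model {
	atomic_ulong irq_posted; /* bit set by the "interrupt", cleared by the "tasklet" */
	atomic_uint csb_tail;    /* number of events published so far */
	unsigned int csb_head;   /* number of events consumed so far */
};

static void model_init(struct engine_model *e)
{
	assert(e != NULL);
	atomic_init(&e->irq_posted, 0ul);
	atomic_init(&e->csb_tail, 0u);
	e->csb_head = 0;
}

/* Model of the interrupt handler: publish an event, then post the bit. */
static void model_irq(struct engine_model *e)
{
	atomic_fetch_add(&e->csb_tail, 1u);
	atomic_fetch_or(&e->irq_posted, 1ul << MODEL_IRQ_EXECLIST);
}

/* Model of the tasklet loop; returns the number of events processed. */
static unsigned int model_tasklet(struct engine_model *e)
{
	unsigned int processed = 0;

	while (atomic_load(&e->irq_posted) & (1ul << MODEL_IRQ_EXECLIST)) {
		/* Clear before reading, using an atomic read-modify-write. */
		atomic_fetch_and(&e->irq_posted, ~(1ul << MODEL_IRQ_EXECLIST));

		/*
		 * Stands in for smp_mb__after_atomic(): the tail read below
		 * must not be ordered before the clear above. (A seq_cst RMW
		 * already orders this in C11; the fence is kept to mirror
		 * the shape of the patch.)
		 */
		atomic_thread_fence(memory_order_seq_cst);

		unsigned int tail = atomic_load(&e->csb_tail);

		while (e->csb_head != tail) {
			e->csb_head++;
			processed++;
		}
		/* An interrupt arriving after the clear re-sets the bit, so we loop. */
	}

	return processed;
}
```

Calling model_irq() twice and then model_tasklet() once consumes both events and leaves the bit clear. If the clear were ordered after the tail read, an event published in between could be left unprocessed with the bit already cleared, which is the rare missed context-switch the commit message describes.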

* ✗ Fi.CI.CHECKPATCH: warning for drm/i915/execlists: Use a locked clear_bit() for synchronisation with interrupt
  2018-03-21  9:10 [PATCH] drm/i915/execlists: Use a locked clear_bit() for synchronisation with interrupt Chris Wilson
@ 2018-03-21  9:19 ` Patchwork
  2018-03-21  9:35 ` ✗ Fi.CI.BAT: " Patchwork
                   ` (7 subsequent siblings)
  8 siblings, 0 replies; 17+ messages in thread
From: Patchwork @ 2018-03-21  9:19 UTC (permalink / raw)
  To: Chris Wilson; +Cc: intel-gfx

== Series Details ==

Series: drm/i915/execlists: Use a locked clear_bit() for synchronisation with interrupt
URL   : https://patchwork.freedesktop.org/series/40359/
State : warning

== Summary ==

$ dim checkpatch origin/drm-tip
d0c27de732e3 drm/i915/execlists: Use a locked clear_bit() for synchronisation with interrupt
-:63: WARNING:MEMORY_BARRIER: memory barrier without comment
#63: FILE: drivers/gpu/drm/i915/intel_lrc.c:863:
+		smp_mb__after_atomic();

total: 0 errors, 1 warnings, 0 checks, 39 lines checked


* ✗ Fi.CI.BAT: warning for drm/i915/execlists: Use a locked clear_bit() for synchronisation with interrupt
  2018-03-21  9:10 [PATCH] drm/i915/execlists: Use a locked clear_bit() for synchronisation with interrupt Chris Wilson
  2018-03-21  9:19 ` ✗ Fi.CI.CHECKPATCH: warning for " Patchwork
@ 2018-03-21  9:35 ` Patchwork
  2018-03-21 10:14 ` [PATCH] " Tvrtko Ursulin
                   ` (6 subsequent siblings)
  8 siblings, 0 replies; 17+ messages in thread
From: Patchwork @ 2018-03-21  9:35 UTC (permalink / raw)
  To: Chris Wilson; +Cc: intel-gfx

== Series Details ==

Series: drm/i915/execlists: Use a locked clear_bit() for synchronisation with interrupt
URL   : https://patchwork.freedesktop.org/series/40359/
State : warning

== Summary ==

Series 40359v1 drm/i915/execlists: Use a locked clear_bit() for synchronisation with interrupt
https://patchwork.freedesktop.org/api/1.0/series/40359/revisions/1/mbox/

---- Possible new issues:

Test kms_flip:
        Subgroup basic-flip-vs-wf_vblank:
                pass       -> DMESG-WARN (fi-skl-6700hq)
        Subgroup basic-plain-flip:
                pass       -> DMESG-WARN (fi-skl-6700hq)

---- Known issues:

Test gem_mmap_gtt:
        Subgroup basic-small-bo-tiledx:
                pass       -> FAIL       (fi-gdg-551) fdo#102575
Test kms_busy:
        Subgroup basic-flip-b:
                pass       -> DMESG-WARN (fi-skl-6700hq) fdo#101144 +2
Test kms_pipe_crc_basic:
        Subgroup suspend-read-crc-pipe-a:
                incomplete -> PASS       (fi-cfl-s2) fdo#105641

fdo#102575 https://bugs.freedesktop.org/show_bug.cgi?id=102575
fdo#101144 https://bugs.freedesktop.org/show_bug.cgi?id=101144
fdo#105641 https://bugs.freedesktop.org/show_bug.cgi?id=105641

fi-bdw-5557u     total:285  pass:264  dwarn:0   dfail:0   fail:0   skip:21  time:437s
fi-bdw-gvtdvm    total:285  pass:261  dwarn:0   dfail:0   fail:0   skip:24  time:446s
fi-blb-e6850     total:285  pass:220  dwarn:1   dfail:0   fail:0   skip:64  time:380s
fi-bsw-n3050     total:285  pass:239  dwarn:0   dfail:0   fail:0   skip:46  time:541s
fi-bwr-2160      total:285  pass:180  dwarn:0   dfail:0   fail:0   skip:105 time:298s
fi-bxt-dsi       total:285  pass:255  dwarn:0   dfail:0   fail:0   skip:30  time:527s
fi-bxt-j4205     total:285  pass:256  dwarn:0   dfail:0   fail:0   skip:29  time:513s
fi-byt-j1900     total:285  pass:250  dwarn:0   dfail:0   fail:0   skip:35  time:517s
fi-byt-n2820     total:285  pass:246  dwarn:0   dfail:0   fail:0   skip:39  time:504s
fi-cfl-8700k     total:285  pass:257  dwarn:0   dfail:0   fail:0   skip:28  time:414s
fi-cfl-s2        total:285  pass:259  dwarn:0   dfail:0   fail:0   skip:26  time:577s
fi-cfl-u         total:285  pass:259  dwarn:0   dfail:0   fail:0   skip:26  time:517s
fi-cnl-drrs      total:285  pass:254  dwarn:3   dfail:0   fail:0   skip:28  time:530s
fi-elk-e7500     total:285  pass:225  dwarn:1   dfail:0   fail:0   skip:59  time:428s
fi-gdg-551       total:285  pass:176  dwarn:0   dfail:0   fail:1   skip:108 time:319s
fi-glk-1         total:285  pass:257  dwarn:0   dfail:0   fail:0   skip:28  time:536s
fi-hsw-4770      total:285  pass:258  dwarn:0   dfail:0   fail:0   skip:27  time:406s
fi-ilk-650       total:285  pass:225  dwarn:0   dfail:0   fail:0   skip:60  time:418s
fi-ivb-3520m     total:285  pass:256  dwarn:0   dfail:0   fail:0   skip:29  time:475s
fi-ivb-3770      total:285  pass:252  dwarn:0   dfail:0   fail:0   skip:33  time:435s
fi-kbl-7500u     total:285  pass:260  dwarn:1   dfail:0   fail:0   skip:24  time:478s
fi-kbl-7567u     total:285  pass:265  dwarn:0   dfail:0   fail:0   skip:20  time:466s
fi-kbl-r         total:285  pass:258  dwarn:0   dfail:0   fail:0   skip:27  time:514s
fi-pnv-d510      total:285  pass:219  dwarn:1   dfail:0   fail:0   skip:65  time:655s
fi-skl-6260u     total:285  pass:265  dwarn:0   dfail:0   fail:0   skip:20  time:441s
fi-skl-6600u     total:285  pass:258  dwarn:0   dfail:0   fail:0   skip:27  time:530s
fi-skl-6700hq    total:285  pass:254  dwarn:5   dfail:0   fail:0   skip:26  time:557s
fi-skl-6700k2    total:285  pass:261  dwarn:0   dfail:0   fail:0   skip:24  time:507s
fi-skl-6770hq    total:285  pass:265  dwarn:0   dfail:0   fail:0   skip:20  time:492s
fi-skl-guc       total:285  pass:257  dwarn:0   dfail:0   fail:0   skip:28  time:427s
fi-snb-2520m     total:285  pass:245  dwarn:0   dfail:0   fail:0   skip:40  time:568s
fi-snb-2600      total:285  pass:245  dwarn:0   dfail:0   fail:0   skip:40  time:401s

9d737cebc219c821989021a3115424165ff7b052 drm-tip: 2018y-03m-20d-14h-56m-05s UTC integration manifest
d0c27de732e3 drm/i915/execlists: Use a locked clear_bit() for synchronisation with interrupt

== Logs ==

For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_8426/issues.html

* Re: [PATCH] drm/i915/execlists: Use a locked clear_bit() for synchronisation with interrupt
  2018-03-21  9:10 [PATCH] drm/i915/execlists: Use a locked clear_bit() for synchronisation with interrupt Chris Wilson
  2018-03-21  9:19 ` ✗ Fi.CI.CHECKPATCH: warning for " Patchwork
  2018-03-21  9:35 ` ✗ Fi.CI.BAT: " Patchwork
@ 2018-03-21 10:14 ` Tvrtko Ursulin
  2018-03-21 10:24 ` ✗ Fi.CI.CHECKPATCH: warning for " Patchwork
                   ` (5 subsequent siblings)
  8 siblings, 0 replies; 17+ messages in thread
From: Tvrtko Ursulin @ 2018-03-21 10:14 UTC (permalink / raw)
  To: Chris Wilson, intel-gfx; +Cc: Mika Kuoppala


On 21/03/2018 09:10, Chris Wilson wrote:
> We were relying on the uncached reads when processing the CSB to provide
> ourselves with the serialisation with the interrupt handler (so we could
> detect new interrupts in the middle of processing the old one). However,
> in commit 767a983ab255 ("drm/i915/execlists: Read the context-status HEAD
> from the HWSP") those uncached reads were eliminated (on one path at
> least) and along with them our serialisation. The result is that we
> would very rarely miss notification of a new interrupt and leave a
> context-switch unprocessed, hanging the GPU.
> 
> Fixes: 767a983ab255 ("drm/i915/execlists: Read the context-status HEAD from the HWSP")
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: Michel Thierry <michel.thierry@intel.com>
> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> Cc: Mika Kuoppala <mika.kuoppala@intel.com>
> ---
>   drivers/gpu/drm/i915/intel_lrc.c | 21 ++++++++-------------
>   1 file changed, 8 insertions(+), 13 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
> index 53f1c009ed7b..67b6a0f658d6 100644
> --- a/drivers/gpu/drm/i915/intel_lrc.c
> +++ b/drivers/gpu/drm/i915/intel_lrc.c
> @@ -831,7 +831,8 @@ static void execlists_submission_tasklet(unsigned long data)
>   	struct drm_i915_private *dev_priv = engine->i915;
>   	bool fw = false;
>   
> -	/* We can skip acquiring intel_runtime_pm_get() here as it was taken
> +	/*
> +	 * We can skip acquiring intel_runtime_pm_get() here as it was taken
>   	 * on our behalf by the request (see i915_gem_mark_busy()) and it will
>   	 * not be relinquished until the device is idle (see
>   	 * i915_gem_idle_work_handler()). As a precaution, we make sure
> @@ -840,7 +841,8 @@ static void execlists_submission_tasklet(unsigned long data)
>   	 */
>   	GEM_BUG_ON(!dev_priv->gt.awake);
>   
> -	/* Prefer doing test_and_clear_bit() as a two stage operation to avoid
> +	/*
> +	 * Prefer doing test_and_clear_bit() as a two stage operation to avoid
>   	 * imposing the cost of a locked atomic transaction when submitting a
>   	 * new request (outside of the context-switch interrupt).
>   	 */
> @@ -856,17 +858,10 @@ static void execlists_submission_tasklet(unsigned long data)
>   			execlists->csb_head = -1; /* force mmio read of CSB ptrs */
>   		}
>   
> -		/* The write will be ordered by the uncached read (itself
> -		 * a memory barrier), so we do not need another in the form
> -		 * of a locked instruction. The race between the interrupt
> -		 * handler and the split test/clear is harmless as we order
> -		 * our clear before the CSB read. If the interrupt arrived
> -		 * first between the test and the clear, we read the updated
> -		 * CSB and clear the bit. If the interrupt arrives as we read
> -		 * the CSB or later (i.e. after we had cleared the bit) the bit
> -		 * is set and we do a new loop.
> -		 */
> -		__clear_bit(ENGINE_IRQ_EXECLIST, &engine->irq_posted);
> +		/* Clear before reading to catch new interrupts */
> +		clear_bit(ENGINE_IRQ_EXECLIST, &engine->irq_posted);
> +		smp_mb__after_atomic();
> +

Could theoretically avoid the locked cost in the mmio case by having two
flavours of bit clearing in the "if" branches below, but it doesn't
sound like a worthwhile complication in the wider context.

Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

Now off to apply it, in the desperate hope that it might affect my weird reset issues...

Regards,

Tvrtko

>   		if (unlikely(execlists->csb_head == -1)) { /* following a reset */
>   			if (!fw) {
>   				intel_uncore_forcewake_get(dev_priv,
> 


* ✗ Fi.CI.CHECKPATCH: warning for drm/i915/execlists: Use a locked clear_bit() for synchronisation with interrupt
  2018-03-21  9:10 [PATCH] drm/i915/execlists: Use a locked clear_bit() for synchronisation with interrupt Chris Wilson
                   ` (2 preceding siblings ...)
  2018-03-21 10:14 ` [PATCH] " Tvrtko Ursulin
@ 2018-03-21 10:24 ` Patchwork
  2018-03-21 10:42 ` ✗ Fi.CI.BAT: failure " Patchwork
                   ` (4 subsequent siblings)
  8 siblings, 0 replies; 17+ messages in thread
From: Patchwork @ 2018-03-21 10:24 UTC (permalink / raw)
  To: Chris Wilson; +Cc: intel-gfx

== Series Details ==

Series: drm/i915/execlists: Use a locked clear_bit() for synchronisation with interrupt
URL   : https://patchwork.freedesktop.org/series/40359/
State : warning

== Summary ==

$ dim checkpatch origin/drm-tip
6cf422f94c61 drm/i915/execlists: Use a locked clear_bit() for synchronisation with interrupt
-:64: WARNING:MEMORY_BARRIER: memory barrier without comment
#64: FILE: drivers/gpu/drm/i915/intel_lrc.c:863:
+		smp_mb__after_atomic();

total: 0 errors, 1 warnings, 0 checks, 39 lines checked


* ✗ Fi.CI.BAT: failure for drm/i915/execlists: Use a locked clear_bit() for synchronisation with interrupt
  2018-03-21  9:10 [PATCH] drm/i915/execlists: Use a locked clear_bit() for synchronisation with interrupt Chris Wilson
                   ` (3 preceding siblings ...)
  2018-03-21 10:24 ` ✗ Fi.CI.CHECKPATCH: warning for " Patchwork
@ 2018-03-21 10:42 ` Patchwork
  2018-03-21 10:46 ` [PATCH] " Mika Kuoppala
                   ` (3 subsequent siblings)
  8 siblings, 0 replies; 17+ messages in thread
From: Patchwork @ 2018-03-21 10:42 UTC (permalink / raw)
  To: Chris Wilson; +Cc: intel-gfx

== Series Details ==

Series: drm/i915/execlists: Use a locked clear_bit() for synchronisation with interrupt
URL   : https://patchwork.freedesktop.org/series/40359/
State : failure

== Summary ==

Series 40359v1 drm/i915/execlists: Use a locked clear_bit() for synchronisation with interrupt
https://patchwork.freedesktop.org/api/1.0/series/40359/revisions/1/mbox/

---- Possible new issues:

Test kms_pipe_crc_basic:
        Subgroup read-crc-pipe-c-frame-sequence:
                pass       -> FAIL       (fi-skl-6770hq)

---- Known issues:

Test debugfs_test:
        Subgroup read_all_entries:
                pass       -> INCOMPLETE (fi-snb-2520m) fdo#103713
Test gem_mmap_gtt:
        Subgroup basic-small-bo-tiledx:
                pass       -> FAIL       (fi-gdg-551) fdo#102575
Test kms_pipe_crc_basic:
        Subgroup suspend-read-crc-pipe-a:
                incomplete -> PASS       (fi-cfl-s2) fdo#105641

fdo#103713 https://bugs.freedesktop.org/show_bug.cgi?id=103713
fdo#102575 https://bugs.freedesktop.org/show_bug.cgi?id=102575
fdo#105641 https://bugs.freedesktop.org/show_bug.cgi?id=105641

fi-bdw-5557u     total:285  pass:264  dwarn:0   dfail:0   fail:0   skip:21  time:432s
fi-bdw-gvtdvm    total:285  pass:261  dwarn:0   dfail:0   fail:0   skip:24  time:441s
fi-blb-e6850     total:285  pass:220  dwarn:1   dfail:0   fail:0   skip:64  time:379s
fi-bsw-n3050     total:285  pass:239  dwarn:0   dfail:0   fail:0   skip:46  time:542s
fi-bwr-2160      total:285  pass:180  dwarn:0   dfail:0   fail:0   skip:105 time:296s
fi-bxt-dsi       total:285  pass:255  dwarn:0   dfail:0   fail:0   skip:30  time:514s
fi-bxt-j4205     total:285  pass:256  dwarn:0   dfail:0   fail:0   skip:29  time:517s
fi-byt-j1900     total:285  pass:250  dwarn:0   dfail:0   fail:0   skip:35  time:515s
fi-byt-n2820     total:285  pass:246  dwarn:0   dfail:0   fail:0   skip:39  time:503s
fi-cfl-8700k     total:285  pass:257  dwarn:0   dfail:0   fail:0   skip:28  time:413s
fi-cfl-s2        total:285  pass:259  dwarn:0   dfail:0   fail:0   skip:26  time:574s
fi-cfl-u         total:285  pass:259  dwarn:0   dfail:0   fail:0   skip:26  time:510s
fi-cnl-drrs      total:285  pass:254  dwarn:3   dfail:0   fail:0   skip:28  time:516s
fi-elk-e7500     total:285  pass:225  dwarn:1   dfail:0   fail:0   skip:59  time:420s
fi-gdg-551       total:285  pass:176  dwarn:0   dfail:0   fail:1   skip:108 time:319s
fi-glk-1         total:285  pass:257  dwarn:0   dfail:0   fail:0   skip:28  time:536s
fi-hsw-4770      total:285  pass:258  dwarn:0   dfail:0   fail:0   skip:27  time:401s
fi-ilk-650       total:285  pass:225  dwarn:0   dfail:0   fail:0   skip:60  time:419s
fi-ivb-3520m     total:285  pass:256  dwarn:0   dfail:0   fail:0   skip:29  time:473s
fi-ivb-3770      total:285  pass:252  dwarn:0   dfail:0   fail:0   skip:33  time:429s
fi-kbl-7500u     total:285  pass:260  dwarn:1   dfail:0   fail:0   skip:24  time:473s
fi-kbl-7567u     total:285  pass:265  dwarn:0   dfail:0   fail:0   skip:20  time:466s
fi-kbl-r         total:285  pass:258  dwarn:0   dfail:0   fail:0   skip:27  time:512s
fi-pnv-d510      total:285  pass:219  dwarn:1   dfail:0   fail:0   skip:65  time:660s
fi-skl-6260u     total:285  pass:265  dwarn:0   dfail:0   fail:0   skip:20  time:443s
fi-skl-6600u     total:285  pass:258  dwarn:0   dfail:0   fail:0   skip:27  time:532s
fi-skl-6700k2    total:285  pass:261  dwarn:0   dfail:0   fail:0   skip:24  time:504s
fi-skl-6770hq    total:285  pass:264  dwarn:0   dfail:0   fail:1   skip:20  time:488s
fi-skl-guc       total:285  pass:257  dwarn:0   dfail:0   fail:0   skip:28  time:426s
fi-skl-gvtdvm    total:285  pass:262  dwarn:0   dfail:0   fail:0   skip:23  time:452s
fi-snb-2520m     total:3    pass:2    dwarn:0   dfail:0   fail:0   skip:0  
fi-snb-2600      total:285  pass:245  dwarn:0   dfail:0   fail:0   skip:40  time:401s

9d737cebc219c821989021a3115424165ff7b052 drm-tip: 2018y-03m-20d-14h-56m-05s UTC integration manifest
6cf422f94c61 drm/i915/execlists: Use a locked clear_bit() for synchronisation with interrupt

== Logs ==

For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_8427/issues.html

* Re: [PATCH] drm/i915/execlists: Use a locked clear_bit() for synchronisation with interrupt
  2018-03-21  9:10 [PATCH] drm/i915/execlists: Use a locked clear_bit() for synchronisation with interrupt Chris Wilson
                   ` (4 preceding siblings ...)
  2018-03-21 10:42 ` ✗ Fi.CI.BAT: failure " Patchwork
@ 2018-03-21 10:46 ` Mika Kuoppala
  2018-03-21 17:01   ` Michel Thierry
  2018-03-21 11:31 ` ✗ Fi.CI.CHECKPATCH: warning for " Patchwork
                   ` (2 subsequent siblings)
  8 siblings, 1 reply; 17+ messages in thread
From: Mika Kuoppala @ 2018-03-21 10:46 UTC (permalink / raw)
  To: Chris Wilson, intel-gfx

Chris Wilson <chris@chris-wilson.co.uk> writes:

> We were relying on the uncached reads when processing the CSB to provide
> ourselves with the serialisation with the interrupt handler (so we could
> detect new interrupts in the middle of processing the old one). However,
> in commit 767a983ab255 ("drm/i915/execlists: Read the context-status HEAD
> from the HWSP") those uncached reads were eliminated (on one path at
> least) and along with them our serialisation. The result is that we
> would very rarely miss notification of a new interrupt and leave a
> context-switch unprocessed, hanging the GPU.
>
> Fixes: 767a983ab255 ("drm/i915/execlists: Read the context-status HEAD from the HWSP")
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: Michel Thierry <michel.thierry@intel.com>
> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> Cc: Mika Kuoppala <mika.kuoppala@intel.com>
> ---
>  drivers/gpu/drm/i915/intel_lrc.c | 21 ++++++++-------------
>  1 file changed, 8 insertions(+), 13 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
> index 53f1c009ed7b..67b6a0f658d6 100644
> --- a/drivers/gpu/drm/i915/intel_lrc.c
> +++ b/drivers/gpu/drm/i915/intel_lrc.c
> @@ -831,7 +831,8 @@ static void execlists_submission_tasklet(unsigned long data)
>  	struct drm_i915_private *dev_priv = engine->i915;
>  	bool fw = false;
>  
> -	/* We can skip acquiring intel_runtime_pm_get() here as it was taken
> +	/*
> +	 * We can skip acquiring intel_runtime_pm_get() here as it was taken
>  	 * on our behalf by the request (see i915_gem_mark_busy()) and it will
>  	 * not be relinquished until the device is idle (see
>  	 * i915_gem_idle_work_handler()). As a precaution, we make sure
> @@ -840,7 +841,8 @@ static void execlists_submission_tasklet(unsigned long data)
>  	 */
>  	GEM_BUG_ON(!dev_priv->gt.awake);
>  
> -	/* Prefer doing test_and_clear_bit() as a two stage operation to avoid
> +	/*
> +	 * Prefer doing test_and_clear_bit() as a two stage operation to avoid
>  	 * imposing the cost of a locked atomic transaction when submitting a
>  	 * new request (outside of the context-switch interrupt).
>  	 */
> @@ -856,17 +858,10 @@ static void execlists_submission_tasklet(unsigned long data)
>  			execlists->csb_head = -1; /* force mmio read of CSB ptrs */
>  		}
>  
> -		/* The write will be ordered by the uncached read (itself
> -		 * a memory barrier), so we do not need another in the form
> -		 * of a locked instruction. The race between the interrupt
> -		 * handler and the split test/clear is harmless as we order
> -		 * our clear before the CSB read. If the interrupt arrived
> -		 * first between the test and the clear, we read the updated
> -		 * CSB and clear the bit. If the interrupt arrives as we read
> -		 * the CSB or later (i.e. after we had cleared the bit) the bit
> -		 * is set and we do a new loop.
> -		 */
> -		__clear_bit(ENGINE_IRQ_EXECLIST, &engine->irq_posted);
> +		/* Clear before reading to catch new interrupts */
> +		clear_bit(ENGINE_IRQ_EXECLIST, &engine->irq_posted);
> +		smp_mb__after_atomic();

I was confused about this memory barrier, as our test is in the
same context and is already ordered wrt this. Chris noted on IRC that it
is there to document the ordering wrt the code that follows.

I am fine with that so,
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>

> +
>  		if (unlikely(execlists->csb_head == -1)) { /* following a reset */
>  			if (!fw) {
>  				intel_uncore_forcewake_get(dev_priv,
> -- 
> 2.16.2
>

* ✗ Fi.CI.CHECKPATCH: warning for drm/i915/execlists: Use a locked clear_bit() for synchronisation with interrupt
  2018-03-21  9:10 [PATCH] drm/i915/execlists: Use a locked clear_bit() for synchronisation with interrupt Chris Wilson
                   ` (5 preceding siblings ...)
  2018-03-21 10:46 ` [PATCH] " Mika Kuoppala
@ 2018-03-21 11:31 ` Patchwork
  2018-03-21 11:51 ` ✓ Fi.CI.BAT: success " Patchwork
  2018-03-21 13:40 ` ✓ Fi.CI.IGT: " Patchwork
  8 siblings, 0 replies; 17+ messages in thread
From: Patchwork @ 2018-03-21 11:31 UTC (permalink / raw)
  To: Chris Wilson; +Cc: intel-gfx

== Series Details ==

Series: drm/i915/execlists: Use a locked clear_bit() for synchronisation with interrupt
URL   : https://patchwork.freedesktop.org/series/40359/
State : warning

== Summary ==

$ dim checkpatch origin/drm-tip
6b81af0baaa5 drm/i915/execlists: Use a locked clear_bit() for synchronisation with interrupt
-:65: WARNING:MEMORY_BARRIER: memory barrier without comment
#65: FILE: drivers/gpu/drm/i915/intel_lrc.c:863:
+		smp_mb__after_atomic();

total: 0 errors, 1 warnings, 0 checks, 39 lines checked


* ✓ Fi.CI.BAT: success for drm/i915/execlists: Use a locked clear_bit() for synchronisation with interrupt
  2018-03-21  9:10 [PATCH] drm/i915/execlists: Use a locked clear_bit() for synchronisation with interrupt Chris Wilson
                   ` (6 preceding siblings ...)
  2018-03-21 11:31 ` ✗ Fi.CI.CHECKPATCH: warning for " Patchwork
@ 2018-03-21 11:51 ` Patchwork
  2018-03-21 13:40 ` ✓ Fi.CI.IGT: " Patchwork
  8 siblings, 0 replies; 17+ messages in thread
From: Patchwork @ 2018-03-21 11:51 UTC (permalink / raw)
  To: Chris Wilson; +Cc: intel-gfx

== Series Details ==

Series: drm/i915/execlists: Use a locked clear_bit() for synchronisation with interrupt
URL   : https://patchwork.freedesktop.org/series/40359/
State : success

== Summary ==

Series 40359v1 drm/i915/execlists: Use a locked clear_bit() for synchronisation with interrupt
https://patchwork.freedesktop.org/api/1.0/series/40359/revisions/1/mbox/

---- Known issues:

Test gem_mmap_gtt:
        Subgroup basic-small-bo-tiledx:
                pass       -> FAIL       (fi-gdg-551) fdo#102575
Test kms_pipe_crc_basic:
        Subgroup suspend-read-crc-pipe-a:
                incomplete -> PASS       (fi-cfl-s2) fdo#105641
        Subgroup suspend-read-crc-pipe-b:
                pass       -> INCOMPLETE (fi-snb-2520m) fdo#103713

fdo#102575 https://bugs.freedesktop.org/show_bug.cgi?id=102575
fdo#105641 https://bugs.freedesktop.org/show_bug.cgi?id=105641
fdo#103713 https://bugs.freedesktop.org/show_bug.cgi?id=103713

fi-bdw-5557u     total:285  pass:264  dwarn:0   dfail:0   fail:0   skip:21  time:434s
fi-bdw-gvtdvm    total:285  pass:261  dwarn:0   dfail:0   fail:0   skip:24  time:442s
fi-blb-e6850     total:285  pass:220  dwarn:1   dfail:0   fail:0   skip:64  time:381s
fi-bsw-n3050     total:285  pass:239  dwarn:0   dfail:0   fail:0   skip:46  time:535s
fi-bwr-2160      total:285  pass:180  dwarn:0   dfail:0   fail:0   skip:105 time:299s
fi-bxt-dsi       total:285  pass:255  dwarn:0   dfail:0   fail:0   skip:30  time:510s
fi-bxt-j4205     total:285  pass:256  dwarn:0   dfail:0   fail:0   skip:29  time:516s
fi-byt-j1900     total:285  pass:250  dwarn:0   dfail:0   fail:0   skip:35  time:518s
fi-byt-n2820     total:285  pass:246  dwarn:0   dfail:0   fail:0   skip:39  time:506s
fi-cfl-8700k     total:285  pass:257  dwarn:0   dfail:0   fail:0   skip:28  time:409s
fi-cfl-s2        total:285  pass:259  dwarn:0   dfail:0   fail:0   skip:26  time:583s
fi-cfl-u         total:285  pass:259  dwarn:0   dfail:0   fail:0   skip:26  time:511s
fi-cnl-drrs      total:285  pass:254  dwarn:3   dfail:0   fail:0   skip:28  time:512s
fi-elk-e7500     total:285  pass:225  dwarn:1   dfail:0   fail:0   skip:59  time:421s
fi-gdg-551       total:285  pass:176  dwarn:0   dfail:0   fail:1   skip:108 time:320s
fi-glk-1         total:285  pass:257  dwarn:0   dfail:0   fail:0   skip:28  time:539s
fi-hsw-4770      total:285  pass:258  dwarn:0   dfail:0   fail:0   skip:27  time:411s
fi-ilk-650       total:285  pass:225  dwarn:0   dfail:0   fail:0   skip:60  time:421s
fi-ivb-3520m     total:285  pass:256  dwarn:0   dfail:0   fail:0   skip:29  time:468s
fi-ivb-3770      total:285  pass:252  dwarn:0   dfail:0   fail:0   skip:33  time:436s
fi-kbl-7500u     total:285  pass:260  dwarn:1   dfail:0   fail:0   skip:24  time:474s
fi-kbl-7567u     total:285  pass:265  dwarn:0   dfail:0   fail:0   skip:20  time:466s
fi-kbl-r         total:285  pass:258  dwarn:0   dfail:0   fail:0   skip:27  time:514s
fi-pnv-d510      total:285  pass:219  dwarn:1   dfail:0   fail:0   skip:65  time:659s
fi-skl-6260u     total:285  pass:265  dwarn:0   dfail:0   fail:0   skip:20  time:445s
fi-skl-6600u     total:285  pass:258  dwarn:0   dfail:0   fail:0   skip:27  time:529s
fi-skl-6700k2    total:285  pass:261  dwarn:0   dfail:0   fail:0   skip:24  time:501s
fi-skl-6770hq    total:285  pass:265  dwarn:0   dfail:0   fail:0   skip:20  time:494s
fi-skl-guc       total:285  pass:257  dwarn:0   dfail:0   fail:0   skip:28  time:430s
fi-skl-gvtdvm    total:285  pass:262  dwarn:0   dfail:0   fail:0   skip:23  time:446s
fi-snb-2520m     total:242  pass:208  dwarn:0   dfail:0   fail:0   skip:33 
fi-snb-2600      total:285  pass:245  dwarn:0   dfail:0   fail:0   skip:40  time:399s

9d737cebc219c821989021a3115424165ff7b052 drm-tip: 2018y-03m-20d-14h-56m-05s UTC integration manifest
6b81af0baaa5 drm/i915/execlists: Use a locked clear_bit() for synchronisation with interrupt

== Logs ==

For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_8430/issues.html

* ✓ Fi.CI.IGT: success for drm/i915/execlists: Use a locked clear_bit() for synchronisation with interrupt
  2018-03-21  9:10 [PATCH] drm/i915/execlists: Use a locked clear_bit() for synchronisation with interrupt Chris Wilson
                   ` (7 preceding siblings ...)
  2018-03-21 11:51 ` ✓ Fi.CI.BAT: success " Patchwork
@ 2018-03-21 13:40 ` Patchwork
  8 siblings, 0 replies; 17+ messages in thread
From: Patchwork @ 2018-03-21 13:40 UTC (permalink / raw)
  To: Chris Wilson; +Cc: intel-gfx

== Series Details ==

Series: drm/i915/execlists: Use a locked clear_bit() for synchronisation with interrupt
URL   : https://patchwork.freedesktop.org/series/40359/
State : success

== Summary ==

---- Known issues:

Test kms_flip:
        Subgroup plain-flip-fb-recreate:
                pass       -> FAIL       (shard-hsw) fdo#100368
Test kms_setmode:
        Subgroup basic:
                pass       -> FAIL       (shard-apl) fdo#99912
Test kms_vblank:
        Subgroup pipe-b-ts-continuation-dpms-suspend:
                incomplete -> PASS       (shard-hsw) fdo#105054

fdo#100368 https://bugs.freedesktop.org/show_bug.cgi?id=100368
fdo#99912 https://bugs.freedesktop.org/show_bug.cgi?id=99912
fdo#105054 https://bugs.freedesktop.org/show_bug.cgi?id=105054

shard-apl        total:3478 pass:1814 dwarn:1   dfail:0   fail:7   skip:1655 time:13125s
shard-hsw        total:3478 pass:1767 dwarn:1   dfail:0   fail:2   skip:1707 time:11838s
shard-snb        total:3478 pass:1358 dwarn:1   dfail:0   fail:2   skip:2117 time:7283s
Blacklisted hosts:
shard-kbl        total:3478 pass:1939 dwarn:1   dfail:0   fail:9   skip:1529 time:9974s

== Logs ==

For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_8430/shards.html

* Re: [PATCH] drm/i915/execlists: Use a locked clear_bit() for synchronisation with interrupt
  2018-03-21 10:46 ` [PATCH] " Mika Kuoppala
@ 2018-03-21 17:01   ` Michel Thierry
  2018-03-21 17:05     ` Chris Wilson
  2018-03-21 17:10     ` Chris Wilson
  0 siblings, 2 replies; 17+ messages in thread
From: Michel Thierry @ 2018-03-21 17:01 UTC (permalink / raw)
  To: Mika Kuoppala, Chris Wilson, intel-gfx

On 3/21/2018 3:46 AM, Mika Kuoppala wrote:
> Chris Wilson <chris@chris-wilson.co.uk> writes:
> 
>> We were relying on the uncached reads when processing the CSB to provide
>> ourselves with the serialisation with the interrupt handler (so we could
>> detect new interrupts in the middle of processing the old one). However,
>> in commit 767a983ab255 ("drm/i915/execlists: Read the context-status HEAD
>> from the HWSP") those uncached reads were eliminated (on one path at
>> least) and along with them our serialisation. The result is that we
>> would very rarely miss notification of a new interrupt and leave a
>> context-switch unprocessed, hanging the GPU.
>>
>> Fixes: 767a983ab255 ("drm/i915/execlists: Read the context-status HEAD from the HWSP")
>> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
>> Cc: Michel Thierry <michel.thierry@intel.com>
>> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
>> Cc: Mika Kuoppala <mika.kuoppala@intel.com>
>> ---
>>   drivers/gpu/drm/i915/intel_lrc.c | 21 ++++++++-------------
>>   1 file changed, 8 insertions(+), 13 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
>> index 53f1c009ed7b..67b6a0f658d6 100644
>> --- a/drivers/gpu/drm/i915/intel_lrc.c
>> +++ b/drivers/gpu/drm/i915/intel_lrc.c
>> @@ -831,7 +831,8 @@ static void execlists_submission_tasklet(unsigned long data)
>>   	struct drm_i915_private *dev_priv = engine->i915;
>>   	bool fw = false;
>>   
>> -	/* We can skip acquiring intel_runtime_pm_get() here as it was taken
>> +	/*
>> +	 * We can skip acquiring intel_runtime_pm_get() here as it was taken
>>   	 * on our behalf by the request (see i915_gem_mark_busy()) and it will
>>   	 * not be relinquished until the device is idle (see
>>   	 * i915_gem_idle_work_handler()). As a precaution, we make sure
>> @@ -840,7 +841,8 @@ static void execlists_submission_tasklet(unsigned long data)
>>   	 */
>>   	GEM_BUG_ON(!dev_priv->gt.awake);
>>   
>> -	/* Prefer doing test_and_clear_bit() as a two stage operation to avoid
>> +	/*
>> +	 * Prefer doing test_and_clear_bit() as a two stage operation to avoid
>>   	 * imposing the cost of a locked atomic transaction when submitting a
>>   	 * new request (outside of the context-switch interrupt).
>>   	 */
>> @@ -856,17 +858,10 @@ static void execlists_submission_tasklet(unsigned long data)
>>   			execlists->csb_head = -1; /* force mmio read of CSB ptrs */
>>   		}
>>   
>> -		/* The write will be ordered by the uncached read (itself
>> -		 * a memory barrier), so we do not need another in the form
>> -		 * of a locked instruction. The race between the interrupt
>> -		 * handler and the split test/clear is harmless as we order
>> -		 * our clear before the CSB read. If the interrupt arrived
>> -		 * first between the test and the clear, we read the updated
>> -		 * CSB and clear the bit. If the interrupt arrives as we read
>> -		 * the CSB or later (i.e. after we had cleared the bit) the bit
>> -		 * is set and we do a new loop.
>> -		 */
>> -		__clear_bit(ENGINE_IRQ_EXECLIST, &engine->irq_posted);
>> +		/* Clear before reading to catch new interrupts */
>> +		clear_bit(ENGINE_IRQ_EXECLIST, &engine->irq_posted);
>> +		smp_mb__after_atomic();

Checkpatch wants a comment for the memory barrier... Are we being strict 
about it? (https://patchwork.freedesktop.org/series/40359/)

> 
> I was confused about this memory barrier as our test is in the
> same context and ordered wrt this. Chris noted in irc that this is for
> the documentation for ordering wrt the code that follows.
> 
> I am fine with that so,
> Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
> 

Fine by me too,

Reviewed-by: Michel Thierry <michel.thierry@intel.com>

>> +
>>   		if (unlikely(execlists->csb_head == -1)) { /* following a reset */
>>   			if (!fw) {
>>   				intel_uncore_forcewake_get(dev_priv,
>> -- 
>> 2.16.2
>>

* Re: [PATCH] drm/i915/execlists: Use a locked clear_bit() for synchronisation with interrupt
  2018-03-21 17:01   ` Michel Thierry
@ 2018-03-21 17:05     ` Chris Wilson
  2018-03-21 17:07       ` Chris Wilson
  2018-03-22  9:34       ` Jani Nikula
  2018-03-21 17:10     ` Chris Wilson
  1 sibling, 2 replies; 17+ messages in thread
From: Chris Wilson @ 2018-03-21 17:05 UTC (permalink / raw)
  To: Michel Thierry, Mika Kuoppala, intel-gfx

Quoting Michel Thierry (2018-03-21 17:01:12)
> On 3/21/2018 3:46 AM, Mika Kuoppala wrote:
> > Chris Wilson <chris@chris-wilson.co.uk> writes:
> > 
> >> We were relying on the uncached reads when processing the CSB to provide
> >> ourselves with the serialisation with the interrupt handler (so we could
> >> detect new interrupts in the middle of processing the old one). However,
> >> in commit 767a983ab255 ("drm/i915/execlists: Read the context-status HEAD
> >> from the HWSP") those uncached reads were eliminated (on one path at
> >> least) and along with them our serialisation. The result is that we
> >> would very rarely miss notification of a new interrupt and leave a
> >> context-switch unprocessed, hanging the GPU.
> >>
> >> Fixes: 767a983ab255 ("drm/i915/execlists: Read the context-status HEAD from the HWSP")
> >> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> >> Cc: Michel Thierry <michel.thierry@intel.com>
> >> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> >> Cc: Mika Kuoppala <mika.kuoppala@intel.com>
> >> ---
> >>   drivers/gpu/drm/i915/intel_lrc.c | 21 ++++++++-------------
> >>   1 file changed, 8 insertions(+), 13 deletions(-)
> >>
> >> diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
> >> index 53f1c009ed7b..67b6a0f658d6 100644
> >> --- a/drivers/gpu/drm/i915/intel_lrc.c
> >> +++ b/drivers/gpu/drm/i915/intel_lrc.c
> >> @@ -831,7 +831,8 @@ static void execlists_submission_tasklet(unsigned long data)
> >>      struct drm_i915_private *dev_priv = engine->i915;
> >>      bool fw = false;
> >>   
> >> -    /* We can skip acquiring intel_runtime_pm_get() here as it was taken
> >> +    /*
> >> +     * We can skip acquiring intel_runtime_pm_get() here as it was taken
> >>       * on our behalf by the request (see i915_gem_mark_busy()) and it will
> >>       * not be relinquished until the device is idle (see
> >>       * i915_gem_idle_work_handler()). As a precaution, we make sure
> >> @@ -840,7 +841,8 @@ static void execlists_submission_tasklet(unsigned long data)
> >>       */
> >>      GEM_BUG_ON(!dev_priv->gt.awake);
> >>   
> >> -    /* Prefer doing test_and_clear_bit() as a two stage operation to avoid
> >> +    /*
> >> +     * Prefer doing test_and_clear_bit() as a two stage operation to avoid
> >>       * imposing the cost of a locked atomic transaction when submitting a
> >>       * new request (outside of the context-switch interrupt).
> >>       */
> >> @@ -856,17 +858,10 @@ static void execlists_submission_tasklet(unsigned long data)
> >>                      execlists->csb_head = -1; /* force mmio read of CSB ptrs */
> >>              }
> >>   
> >> -            /* The write will be ordered by the uncached read (itself
> >> -             * a memory barrier), so we do not need another in the form
> >> -             * of a locked instruction. The race between the interrupt
> >> -             * handler and the split test/clear is harmless as we order
> >> -             * our clear before the CSB read. If the interrupt arrived
> >> -             * first between the test and the clear, we read the updated
> >> -             * CSB and clear the bit. If the interrupt arrives as we read
> >> -             * the CSB or later (i.e. after we had cleared the bit) the bit
> >> -             * is set and we do a new loop.
> >> -             */
> >> -            __clear_bit(ENGINE_IRQ_EXECLIST, &engine->irq_posted);
> >> +            /* Clear before reading to catch new interrupts */
> >> +            clear_bit(ENGINE_IRQ_EXECLIST, &engine->irq_posted);
> >> +            smp_mb__after_atomic();
> 
> Checkpatch wants a comment for the memory barrier... Are we being strict 
> about it? (https://patchwork.freedesktop.org/series/40359/)

There's a comment for it not two lines above! Silly perl script.
-Chris

* Re: [PATCH] drm/i915/execlists: Use a locked clear_bit() for synchronisation with interrupt
  2018-03-21 17:05     ` Chris Wilson
@ 2018-03-21 17:07       ` Chris Wilson
  2018-03-22  9:34       ` Jani Nikula
  1 sibling, 0 replies; 17+ messages in thread
From: Chris Wilson @ 2018-03-21 17:07 UTC (permalink / raw)
  To: Michel Thierry, Mika Kuoppala, intel-gfx

Quoting Chris Wilson (2018-03-21 17:05:06)
> Quoting Michel Thierry (2018-03-21 17:01:12)
> > On 3/21/2018 3:46 AM, Mika Kuoppala wrote:
> > > Chris Wilson <chris@chris-wilson.co.uk> writes:
> > > 
> > >> We were relying on the uncached reads when processing the CSB to provide
> > >> ourselves with the serialisation with the interrupt handler (so we could
> > >> detect new interrupts in the middle of processing the old one). However,
> > >> in commit 767a983ab255 ("drm/i915/execlists: Read the context-status HEAD
> > >> from the HWSP") those uncached reads were eliminated (on one path at
> > >> least) and along with them our serialisation. The result is that we
> > >> would very rarely miss notification of a new interrupt and leave a
> > >> context-switch unprocessed, hanging the GPU.
> > >>
> > >> Fixes: 767a983ab255 ("drm/i915/execlists: Read the context-status HEAD from the HWSP")
> > >> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> > >> Cc: Michel Thierry <michel.thierry@intel.com>
> > >> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> > >> Cc: Mika Kuoppala <mika.kuoppala@intel.com>
> > >> ---
> > >>   drivers/gpu/drm/i915/intel_lrc.c | 21 ++++++++-------------
> > >>   1 file changed, 8 insertions(+), 13 deletions(-)
> > >>
> > >> diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
> > >> index 53f1c009ed7b..67b6a0f658d6 100644
> > >> --- a/drivers/gpu/drm/i915/intel_lrc.c
> > >> +++ b/drivers/gpu/drm/i915/intel_lrc.c
> > >> @@ -831,7 +831,8 @@ static void execlists_submission_tasklet(unsigned long data)
> > >>      struct drm_i915_private *dev_priv = engine->i915;
> > >>      bool fw = false;
> > >>   
> > >> -    /* We can skip acquiring intel_runtime_pm_get() here as it was taken
> > >> +    /*
> > >> +     * We can skip acquiring intel_runtime_pm_get() here as it was taken
> > >>       * on our behalf by the request (see i915_gem_mark_busy()) and it will
> > >>       * not be relinquished until the device is idle (see
> > >>       * i915_gem_idle_work_handler()). As a precaution, we make sure
> > >> @@ -840,7 +841,8 @@ static void execlists_submission_tasklet(unsigned long data)
> > >>       */
> > >>      GEM_BUG_ON(!dev_priv->gt.awake);
> > >>   
> > >> -    /* Prefer doing test_and_clear_bit() as a two stage operation to avoid
> > >> +    /*
> > >> +     * Prefer doing test_and_clear_bit() as a two stage operation to avoid
> > >>       * imposing the cost of a locked atomic transaction when submitting a
> > >>       * new request (outside of the context-switch interrupt).
> > >>       */
> > >> @@ -856,17 +858,10 @@ static void execlists_submission_tasklet(unsigned long data)
> > >>                      execlists->csb_head = -1; /* force mmio read of CSB ptrs */
> > >>              }
> > >>   
> > >> -            /* The write will be ordered by the uncached read (itself
> > >> -             * a memory barrier), so we do not need another in the form
> > >> -             * of a locked instruction. The race between the interrupt
> > >> -             * handler and the split test/clear is harmless as we order
> > >> -             * our clear before the CSB read. If the interrupt arrived
> > >> -             * first between the test and the clear, we read the updated
> > >> -             * CSB and clear the bit. If the interrupt arrives as we read
> > >> -             * the CSB or later (i.e. after we had cleared the bit) the bit
> > >> -             * is set and we do a new loop.
> > >> -             */
> > >> -            __clear_bit(ENGINE_IRQ_EXECLIST, &engine->irq_posted);
> > >> +            /* Clear before reading to catch new interrupts */
> > >> +            clear_bit(ENGINE_IRQ_EXECLIST, &engine->irq_posted);
> > >> +            smp_mb__after_atomic();
> > 
> > Checkpatch wants a comment for the memory barrier... Are we being strict 
> > about it? (https://patchwork.freedesktop.org/series/40359/)
> 
> There's a comment for it not two lines above! Silly perl script.

Besides it being only a simulacrum of a mb. Silly perl script :) 
-Chris

* Re: [PATCH] drm/i915/execlists: Use a locked clear_bit() for synchronisation with interrupt
  2018-03-21 17:01   ` Michel Thierry
  2018-03-21 17:05     ` Chris Wilson
@ 2018-03-21 17:10     ` Chris Wilson
  1 sibling, 0 replies; 17+ messages in thread
From: Chris Wilson @ 2018-03-21 17:10 UTC (permalink / raw)
  To: Michel Thierry, Mika Kuoppala, intel-gfx

Quoting Michel Thierry (2018-03-21 17:01:12)
> On 3/21/2018 3:46 AM, Mika Kuoppala wrote:
> > Chris Wilson <chris@chris-wilson.co.uk> writes:
> > 
> >> We were relying on the uncached reads when processing the CSB to provide
> >> ourselves with the serialisation with the interrupt handler (so we could
> >> detect new interrupts in the middle of processing the old one). However,
> >> in commit 767a983ab255 ("drm/i915/execlists: Read the context-status HEAD
> >> from the HWSP") those uncached reads were eliminated (on one path at
> >> least) and along with them our serialisation. The result is that we
> >> would very rarely miss notification of a new interrupt and leave a
> >> context-switch unprocessed, hanging the GPU.
> >>
> >> Fixes: 767a983ab255 ("drm/i915/execlists: Read the context-status HEAD from the HWSP")
> >> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> >> Cc: Michel Thierry <michel.thierry@intel.com>
> >> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> >> Cc: Mika Kuoppala <mika.kuoppala@intel.com>
> >> ---
> >>   drivers/gpu/drm/i915/intel_lrc.c | 21 ++++++++-------------
> >>   1 file changed, 8 insertions(+), 13 deletions(-)
> >>
> >> diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
> >> index 53f1c009ed7b..67b6a0f658d6 100644
> >> --- a/drivers/gpu/drm/i915/intel_lrc.c
> >> +++ b/drivers/gpu/drm/i915/intel_lrc.c
> >> @@ -831,7 +831,8 @@ static void execlists_submission_tasklet(unsigned long data)
> >>      struct drm_i915_private *dev_priv = engine->i915;
> >>      bool fw = false;
> >>   
> >> -    /* We can skip acquiring intel_runtime_pm_get() here as it was taken
> >> +    /*
> >> +     * We can skip acquiring intel_runtime_pm_get() here as it was taken
> >>       * on our behalf by the request (see i915_gem_mark_busy()) and it will
> >>       * not be relinquished until the device is idle (see
> >>       * i915_gem_idle_work_handler()). As a precaution, we make sure
> >> @@ -840,7 +841,8 @@ static void execlists_submission_tasklet(unsigned long data)
> >>       */
> >>      GEM_BUG_ON(!dev_priv->gt.awake);
> >>   
> >> -    /* Prefer doing test_and_clear_bit() as a two stage operation to avoid
> >> +    /*
> >> +     * Prefer doing test_and_clear_bit() as a two stage operation to avoid
> >>       * imposing the cost of a locked atomic transaction when submitting a
> >>       * new request (outside of the context-switch interrupt).
> >>       */
> >> @@ -856,17 +858,10 @@ static void execlists_submission_tasklet(unsigned long data)
> >>                      execlists->csb_head = -1; /* force mmio read of CSB ptrs */
> >>              }
> >>   
> >> -            /* The write will be ordered by the uncached read (itself
> >> -             * a memory barrier), so we do not need another in the form
> >> -             * of a locked instruction. The race between the interrupt
> >> -             * handler and the split test/clear is harmless as we order
> >> -             * our clear before the CSB read. If the interrupt arrived
> >> -             * first between the test and the clear, we read the updated
> >> -             * CSB and clear the bit. If the interrupt arrives as we read
> >> -             * the CSB or later (i.e. after we had cleared the bit) the bit
> >> -             * is set and we do a new loop.
> >> -             */
> >> -            __clear_bit(ENGINE_IRQ_EXECLIST, &engine->irq_posted);
> >> +            /* Clear before reading to catch new interrupts */
> >> +            clear_bit(ENGINE_IRQ_EXECLIST, &engine->irq_posted);
> >> +            smp_mb__after_atomic();
> 
> Checkpatch wants a comment for the memory barrier... Are we being strict 
> about it? (https://patchwork.freedesktop.org/series/40359/)
> 
> > 
> > I was confused about this memory barrier as our test is in the
> > same context and ordered wrt this. Chris noted in irc that this is for
> > the documentation for ordering wrt the code that follows.
> > 
> > I am fine with that so,
> > Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
> > 
> 
> Fine by me too,
> 
> Reviewed-by: Michel Thierry <michel.thierry@intel.com>

It definitely appears to be fixing an issue I've been seeing for the
last few months since using HWSP for execlists. But I was only seeing it
in conjunction with another set of patches, so I presumed the fault lay
with those and not drm-tip (which kept on testing clear).

Thanks for the review, pushed. I'm sure I'll moan about the locked
instruction appearing in the profiles, just as much as I moan about the
locked instructions for tasklet_schedule() dominating some profiles.
-Chris
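
For readers following the thread, the ordering the patch restores can be
sketched in userspace C11. This is only an illustrative analogue, not the
driver code: toy_engine, toy_irq() and toy_tasklet() are hypothetical names,
and C11 atomics stand in for the kernel's clear_bit() and
smp_mb__after_atomic(). The sketch assumes the "interrupt" side publishes its
event before setting the flag; the consumer tests the flag with a plain load
first (the two-stage test/clear mentioned in the patch), clears it
atomically, and only then reads the published state, so an event arriving
after the clear leaves the flag set and forces another pass.

```c
#include <assert.h>
#include <stdatomic.h>

#define TOY_IRQ_EXECLIST 0x1u /* hypothetical stand-in for ENGINE_IRQ_EXECLIST */

/* Hypothetical stand-in for the engine state discussed in the thread. */
struct toy_engine {
	atomic_uint irq_posted; /* flag word set by the "interrupt handler" */
	atomic_uint csb_tail;   /* number of events published so far */
	unsigned int csb_head;  /* number of events consumed so far */
};

static void toy_engine_init(struct toy_engine *e)
{
	atomic_init(&e->irq_posted, 0);
	atomic_init(&e->csb_tail, 0);
	e->csb_head = 0;
}

/* "Interrupt handler": publish the event first, then raise the flag. */
static void toy_irq(struct toy_engine *e)
{
	atomic_fetch_add_explicit(&e->csb_tail, 1, memory_order_release);
	atomic_fetch_or_explicit(&e->irq_posted, TOY_IRQ_EXECLIST,
				 memory_order_seq_cst);
}

/* "Tasklet": returns how many events were consumed by this call. */
static unsigned int toy_tasklet(struct toy_engine *e)
{
	unsigned int processed = 0;

	/* Cheap plain test first, so the idle path avoids an atomic RMW. */
	while (atomic_load_explicit(&e->irq_posted, memory_order_relaxed) &
	       TOY_IRQ_EXECLIST) {
		unsigned int tail;

		/* Clear before reading to catch new interrupts. */
		atomic_fetch_and_explicit(&e->irq_posted, ~TOY_IRQ_EXECLIST,
					  memory_order_relaxed);
		/* Analogue of smp_mb__after_atomic(): order clear vs. reads. */
		atomic_thread_fence(memory_order_seq_cst);

		tail = atomic_load_explicit(&e->csb_tail, memory_order_acquire);
		while (e->csb_head != tail) {
			e->csb_head++;
			processed++;
		}
		/* Everything visible at the time of the read is consumed. */
		assert(e->csb_head == tail);
	}

	return processed;
}
```

If toy_irq() runs after the clear, the flag is set again and the loop makes
another pass; if it ran before the clear, its published tail is already
visible to the read that follows the fence. Either way no event is left
unprocessed, which is the property the thread says was lost once the
uncached reads were removed.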

* Re: [PATCH] drm/i915/execlists: Use a locked clear_bit() for synchronisation with interrupt
  2018-03-21 17:05     ` Chris Wilson
  2018-03-21 17:07       ` Chris Wilson
@ 2018-03-22  9:34       ` Jani Nikula
  2018-03-22  9:36         ` Chris Wilson
  1 sibling, 1 reply; 17+ messages in thread
From: Jani Nikula @ 2018-03-22  9:34 UTC (permalink / raw)
  To: Chris Wilson, Michel Thierry, Mika Kuoppala, intel-gfx; +Cc: Rodrigo Vivi

On Wed, 21 Mar 2018, Chris Wilson <chris@chris-wilson.co.uk> wrote:
> Quoting Michel Thierry (2018-03-21 17:01:12)
>> On 3/21/2018 3:46 AM, Mika Kuoppala wrote:
>> > Chris Wilson <chris@chris-wilson.co.uk> writes:
>> >> -            /* The write will be ordered by the uncached read (itself
>> >> -             * a memory barrier), so we do not need another in the form
>> >> -             * of a locked instruction. The race between the interrupt
>> >> -             * handler and the split test/clear is harmless as we order
>> >> -             * our clear before the CSB read. If the interrupt arrived
>> >> -             * first between the test and the clear, we read the updated
>> >> -             * CSB and clear the bit. If the interrupt arrives as we read
>> >> -             * the CSB or later (i.e. after we had cleared the bit) the bit
>> >> -             * is set and we do a new loop.
>> >> -             */
>> >> -            __clear_bit(ENGINE_IRQ_EXECLIST, &engine->irq_posted);
>> >> +            /* Clear before reading to catch new interrupts */
>> >> +            clear_bit(ENGINE_IRQ_EXECLIST, &engine->irq_posted);
>> >> +            smp_mb__after_atomic();
>> 
>> Checkpatch wants a comment for the memory barrier... Are we being strict 
>> about it? (https://patchwork.freedesktop.org/series/40359/)
>
> There's a comment for it not two lines above! Silly perl script.

Sure, it's nowhere near perfect. But I do like to get the reminder about
this, "hey don't forget to document your memory barriers, locks,
etc.". It does mean we can't use checkpatch for gating, but I think it
can make the reviewer's life easier to be able to just point at the
results, and ask the author to fix the relevant stuff. I think it's less
tedious and less offensive than the reviewer doing the job manually.

BR,
Jani.


-- 
Jani Nikula, Intel Open Source Technology Center

* Re: [PATCH] drm/i915/execlists: Use a locked clear_bit() for synchronisation with interrupt
  2018-03-22  9:34       ` Jani Nikula
@ 2018-03-22  9:36         ` Chris Wilson
  2018-03-22 10:04           ` Jani Nikula
  0 siblings, 1 reply; 17+ messages in thread
From: Chris Wilson @ 2018-03-22  9:36 UTC (permalink / raw)
  To: Jani Nikula, Michel Thierry, Mika Kuoppala, intel-gfx; +Cc: Rodrigo Vivi

Quoting Jani Nikula (2018-03-22 09:34:18)
> On Wed, 21 Mar 2018, Chris Wilson <chris@chris-wilson.co.uk> wrote:
> > Quoting Michel Thierry (2018-03-21 17:01:12)
> >> On 3/21/2018 3:46 AM, Mika Kuoppala wrote:
> >> > Chris Wilson <chris@chris-wilson.co.uk> writes:
> >> >> -            /* The write will be ordered by the uncached read (itself
> >> >> -             * a memory barrier), so we do not need another in the form
> >> >> -             * of a locked instruction. The race between the interrupt
> >> >> -             * handler and the split test/clear is harmless as we order
> >> >> -             * our clear before the CSB read. If the interrupt arrived
> >> >> -             * first between the test and the clear, we read the updated
> >> >> -             * CSB and clear the bit. If the interrupt arrives as we read
> >> >> -             * the CSB or later (i.e. after we had cleared the bit) the bit
> >> >> -             * is set and we do a new loop.
> >> >> -             */
> >> >> -            __clear_bit(ENGINE_IRQ_EXECLIST, &engine->irq_posted);
> >> >> +            /* Clear before reading to catch new interrupts */
> >> >> +            clear_bit(ENGINE_IRQ_EXECLIST, &engine->irq_posted);
> >> >> +            smp_mb__after_atomic();
> >> 
> >> Checkpatch wants a comment for the memory barrier... Are we being strict 
> >> about it? (https://patchwork.freedesktop.org/series/40359/)
> >
> > There's a comment for it not two lines above! Silly perl script.
> 
> Sure, it's nowhere near perfect. But I do like to get the reminder about
> this, "hey don't forget to document your memory barriers, locks,
> etc.". It does mean we can't use checkpatch for gating, but I think it
> can make the reviewer's life easier to be able to just point at the
> results, and ask the author to fix the relevant stuff. I think it's less
> tedious and less offensive than the reviewer doing the job manually.

The complaint was only in jest. The reminder to document locks and mb is
indeed invaluable, just sometimes the limitations of being a "dumb" perl
script show through.
-Chris

* Re: [PATCH] drm/i915/execlists: Use a locked clear_bit() for synchronisation with interrupt
  2018-03-22  9:36         ` Chris Wilson
@ 2018-03-22 10:04           ` Jani Nikula
  0 siblings, 0 replies; 17+ messages in thread
From: Jani Nikula @ 2018-03-22 10:04 UTC (permalink / raw)
  To: Chris Wilson, Michel Thierry, Mika Kuoppala, intel-gfx; +Cc: Rodrigo Vivi

On Thu, 22 Mar 2018, Chris Wilson <chris@chris-wilson.co.uk> wrote:
> Quoting Jani Nikula (2018-03-22 09:34:18)
>> On Wed, 21 Mar 2018, Chris Wilson <chris@chris-wilson.co.uk> wrote:
>> > Quoting Michel Thierry (2018-03-21 17:01:12)
>> >> On 3/21/2018 3:46 AM, Mika Kuoppala wrote:
>> >> > Chris Wilson <chris@chris-wilson.co.uk> writes:
>> >> >> -            /* The write will be ordered by the uncached read (itself
>> >> >> -             * a memory barrier), so we do not need another in the form
>> >> >> -             * of a locked instruction. The race between the interrupt
>> >> >> -             * handler and the split test/clear is harmless as we order
>> >> >> -             * our clear before the CSB read. If the interrupt arrived
>> >> >> -             * first between the test and the clear, we read the updated
>> >> >> -             * CSB and clear the bit. If the interrupt arrives as we read
>> >> >> -             * the CSB or later (i.e. after we had cleared the bit) the bit
>> >> >> -             * is set and we do a new loop.
>> >> >> -             */
>> >> >> -            __clear_bit(ENGINE_IRQ_EXECLIST, &engine->irq_posted);
>> >> >> +            /* Clear before reading to catch new interrupts */
>> >> >> +            clear_bit(ENGINE_IRQ_EXECLIST, &engine->irq_posted);
>> >> >> +            smp_mb__after_atomic();
>> >> 
>> >> Checkpatch wants a comment for the memory barrier... Are we being strict 
>> >> about it? (https://patchwork.freedesktop.org/series/40359/)
>> >
>> > There's a comment for it not two lines above! Silly perl script.
>> 
>> Sure, it's nowhere near perfect. But I do like to get the reminder about
>> this, "hey don't forget to document your memory barriers, locks,
>> etc.". It does mean we can't use checkpatch for gating, but I think it
>> can make the reviewer's life easier to be able to just point at the
>> results, and ask the author to fix the relevant stuff. I think it's less
>> tedious and less offensive than the reviewer doing the job manually.
>
> The complaint was only in jest. The reminder to document locks and mb is
> indeed invaluable, just sometimes the limitations of being a "dumb" perl
> script show through.

Oh, I didn't misread you. I just switched to serious mode because we do
need to evaluate whether the checkpatch reports from CI are net positive
or negative, and, either way, what we can do to further improve the S/N.

BR,
Jani.


-- 
Jani Nikula, Intel Open Source Technology Center

end of thread, other threads:[~2018-03-22 10:03 UTC | newest]

Thread overview: 17+ messages
2018-03-21  9:10 [PATCH] drm/i915/execlists: Use a locked clear_bit() for synchronisation with interrupt Chris Wilson
2018-03-21  9:19 ` ✗ Fi.CI.CHECKPATCH: warning for " Patchwork
2018-03-21  9:35 ` ✗ Fi.CI.BAT: " Patchwork
2018-03-21 10:14 ` [PATCH] " Tvrtko Ursulin
2018-03-21 10:24 ` ✗ Fi.CI.CHECKPATCH: warning for " Patchwork
2018-03-21 10:42 ` ✗ Fi.CI.BAT: failure " Patchwork
2018-03-21 10:46 ` [PATCH] " Mika Kuoppala
2018-03-21 17:01   ` Michel Thierry
2018-03-21 17:05     ` Chris Wilson
2018-03-21 17:07       ` Chris Wilson
2018-03-22  9:34       ` Jani Nikula
2018-03-22  9:36         ` Chris Wilson
2018-03-22 10:04           ` Jani Nikula
2018-03-21 17:10     ` Chris Wilson
2018-03-21 11:31 ` ✗ Fi.CI.CHECKPATCH: warning for " Patchwork
2018-03-21 11:51 ` ✓ Fi.CI.BAT: success " Patchwork
2018-03-21 13:40 ` ✓ Fi.CI.IGT: " Patchwork
