* [PATCH] drm/i915/guc: Disable preemption if it fails
@ 2018-05-31 20:47 Chris Wilson
  2018-05-31 20:57 ` ✗ Fi.CI.CHECKPATCH: warning for " Patchwork
                   ` (3 more replies)
  0 siblings, 4 replies; 7+ messages in thread
From: Chris Wilson @ 2018-05-31 20:47 UTC (permalink / raw)
  To: intel-gfx

If we fail to tell the GuC to perform preemption, we get stuck
attempting to continually retry inject_preempt_context() until we
eventually timeout and reset the GPU (approximately emitting the same
warning 1000 times). Bail after the first failure, emit the WARN and
stop trying to do any further preemption on this engine.

References: https://intel-gfx-ci.01.org/tree/drm-tip/Trybot_2235/shard-apl4/igt@gem_exec_schedule@preempt-bsd.html
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: Michel Thierry <michel.thierry@intel.com>
Cc: Michał Winiarski <michal.winiarski@intel.com>
---
 drivers/gpu/drm/i915/intel_guc_submission.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/gpu/drm/i915/intel_guc_submission.c b/drivers/gpu/drm/i915/intel_guc_submission.c
index 133367a17863..24bdac205c45 100644
--- a/drivers/gpu/drm/i915/intel_guc_submission.c
+++ b/drivers/gpu/drm/i915/intel_guc_submission.c
@@ -588,6 +588,7 @@ static void inject_preempt_context(struct work_struct *work)
 	data[6] = intel_guc_ggtt_offset(guc, guc->shared_data);
 
 	if (WARN_ON(intel_guc_send(guc, data, ARRAY_SIZE(data)))) {
+		engine->flags &= ~I915_ENGINE_HAS_PREEMPTION; /* XXX racy! */
 		execlists_clear_active(&engine->execlists,
 				       EXECLISTS_ACTIVE_PREEMPT);
 		tasklet_schedule(&engine->execlists.tasklet);
-- 
2.17.1

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx
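
The effect described in the commit message can be illustrated with a small
userspace model. This is only a sketch, not driver code: it assumes the
scheduler checks the engine's preemption flag before each attempt, and every
name below (model_engine, model_send_fails, model_try_preempt, model_run,
ENGINE_HAS_PREEMPTION) is hypothetical. With the flag left set, every pass
retries and warns again; clearing it on the first failure stops the retries.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for I915_ENGINE_HAS_PREEMPTION. */
#define ENGINE_HAS_PREEMPTION (1u << 0)

struct model_engine {
	unsigned int flags;
	unsigned int warnings;	/* how many times the failure was reported */
};

/* Stand-in for a send to the GuC that always fails (nonzero return). */
int model_send_fails(void)
{
	return -1;
}

/* One scheduling pass: attempt preemption only while the flag is set. */
void model_try_preempt(struct model_engine *e, bool disable_on_failure)
{
	if (!(e->flags & ENGINE_HAS_PREEMPTION))
		return;

	if (model_send_fails()) {
		e->warnings++;
		if (disable_on_failure)
			e->flags &= ~ENGINE_HAS_PREEMPTION;
	}
}

/* Run a number of passes and return how many warnings were emitted. */
unsigned int model_run(unsigned int passes, bool disable_on_failure)
{
	struct model_engine e = { .flags = ENGINE_HAS_PREEMPTION, .warnings = 0 };

	assert(passes > 0);
	for (unsigned int i = 0; i < passes; i++)
		model_try_preempt(&e, disable_on_failure);

	return e.warnings;
}
```

In this model, model_run(1000, false) reports the failure on every pass,
while model_run(1000, true) reports it once and then stops trying.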


* ✗ Fi.CI.CHECKPATCH: warning for drm/i915/guc: Disable preemption if it fails
  2018-05-31 20:47 [PATCH] drm/i915/guc: Disable preemption if it fails Chris Wilson
@ 2018-05-31 20:57 ` Patchwork
  2018-05-31 21:13 ` ✓ Fi.CI.BAT: success " Patchwork
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 7+ messages in thread
From: Patchwork @ 2018-05-31 20:57 UTC (permalink / raw)
  To: Chris Wilson; +Cc: intel-gfx

== Series Details ==

Series: drm/i915/guc: Disable preemption if it fails
URL   : https://patchwork.freedesktop.org/series/44045/
State : warning

== Summary ==

$ dim checkpatch origin/drm-tip
815a8c14e179 drm/i915/guc: Disable preemption if it fails
-:15: WARNING:COMMIT_LOG_LONG_LINE: Possible unwrapped commit description (prefer a maximum 75 chars per line)
#15: 
References: https://intel-gfx-ci.01.org/tree/drm-tip/Trybot_2235/shard-apl4/igt@gem_exec_schedule@preempt-bsd.html

total: 0 errors, 1 warnings, 0 checks, 7 lines checked



* ✓ Fi.CI.BAT: success for drm/i915/guc: Disable preemption if it fails
  2018-05-31 20:47 [PATCH] drm/i915/guc: Disable preemption if it fails Chris Wilson
  2018-05-31 20:57 ` ✗ Fi.CI.CHECKPATCH: warning for " Patchwork
@ 2018-05-31 21:13 ` Patchwork
  2018-05-31 21:16 ` [PATCH] " Michel Thierry
  2018-05-31 21:17 ` Singh, Satyeshwar
  3 siblings, 0 replies; 7+ messages in thread
From: Patchwork @ 2018-05-31 21:13 UTC (permalink / raw)
  To: Chris Wilson; +Cc: intel-gfx

== Series Details ==

Series: drm/i915/guc: Disable preemption if it fails
URL   : https://patchwork.freedesktop.org/series/44045/
State : success

== Summary ==

= CI Bug Log - changes from CI_DRM_4268 -> Patchwork_9164 =

== Summary - WARNING ==

  Minor unknown changes coming with Patchwork_9164 need to be verified
  manually.
  
  If you think the reported changes have nothing to do with the changes
  introduced in Patchwork_9164, please notify your bug team to allow them
  to document this new failure mode, which will reduce false positives in CI.

  External URL: https://patchwork.freedesktop.org/api/1.0/series/44045/revisions/1/mbox/

== Possible new issues ==

  Here are the unknown changes that may have been introduced in Patchwork_9164:

  === IGT changes ===

    ==== Warnings ====

    igt@gem_exec_gttfill@basic:
      fi-pnv-d510:        PASS -> SKIP

    
== Known issues ==

  Here are the changes found in Patchwork_9164 that come from known issues:

  === IGT changes ===

    ==== Issues hit ====

    igt@gem_exec_create@basic:
      fi-glk-j4005:       PASS -> DMESG-WARN (fdo#105719)

    igt@gem_mmap_gtt@basic-small-bo-tiledx:
      fi-gdg-551:         PASS -> FAIL (fdo#102575)

    
    ==== Possible fixes ====

    igt@kms_chamelium@hdmi-hpd-fast:
      fi-kbl-7500u:       FAIL (fdo#103841, fdo#102672) -> SKIP

    igt@kms_flip@basic-flip-vs-modeset:
      fi-glk-j4005:       DMESG-WARN (fdo#106000) -> PASS

    
  fdo#102575 https://bugs.freedesktop.org/show_bug.cgi?id=102575
  fdo#102672 https://bugs.freedesktop.org/show_bug.cgi?id=102672
  fdo#103841 https://bugs.freedesktop.org/show_bug.cgi?id=103841
  fdo#105719 https://bugs.freedesktop.org/show_bug.cgi?id=105719
  fdo#106000 https://bugs.freedesktop.org/show_bug.cgi?id=106000


== Participating hosts (42 -> 40) ==

  Additional (1): fi-bxt-dsi 
  Missing    (3): fi-ctg-p8600 fi-ilk-m540 fi-skl-6700hq 


== Build changes ==

    * Linux: CI_DRM_4268 -> Patchwork_9164

  CI_DRM_4268: 7138a144b0da03dcbe54731f7ac0dab6948beafb @ git://anongit.freedesktop.org/gfx-ci/linux
  IGT_4503: ae0ea2a0cff1cf8516d18ada5b9db01c56b73ed9 @ git://anongit.freedesktop.org/xorg/app/intel-gpu-tools
  Patchwork_9164: 815a8c14e179ccb07fe20b90affeb87ca3b49368 @ git://anongit.freedesktop.org/gfx-ci/linux


== Linux commits ==

815a8c14e179 drm/i915/guc: Disable preemption if it fails

== Logs ==

For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_9164/issues.html


* Re: [PATCH] drm/i915/guc: Disable preemption if it fails
  2018-05-31 20:47 [PATCH] drm/i915/guc: Disable preemption if it fails Chris Wilson
  2018-05-31 20:57 ` ✗ Fi.CI.CHECKPATCH: warning for " Patchwork
  2018-05-31 21:13 ` ✓ Fi.CI.BAT: success " Patchwork
@ 2018-05-31 21:16 ` Michel Thierry
  2018-05-31 21:17 ` Singh, Satyeshwar
  3 siblings, 0 replies; 7+ messages in thread
From: Michel Thierry @ 2018-05-31 21:16 UTC (permalink / raw)
  To: Chris Wilson, intel-gfx

On 5/31/2018 1:47 PM, Chris Wilson wrote:
> If we fail to tell the GuC to perform preemption, we get stuck
> attempting to continually retry inject_preempt_context() until we
> eventually timeout and reset the GPU (approximately emitting the same
> warning 1000 times). Bail after the first failure, emit the WARN and
I only see 340 warnings in the 4 seconds before it timed out.

<7>[ ] [drm:intel_guc_send_mmio [i915]] INTEL_GUC_SEND: Action 0x2 failed; ret=-110 status=0x00000002 response=0x40000000

The status is the same as the action, so something really bad happened 
inside there.

> stop trying to do any further preemption on this engine.
> 
> References: https://intel-gfx-ci.01.org/tree/drm-tip/Trybot_2235/shard-apl4/igt@gem_exec_schedule@preempt-bsd.html
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: Michal Wajdeczko <michal.wajdeczko@intel.com>
> Cc: Michel Thierry <michel.thierry@intel.com>
> Cc: Michał Winiarski <michal.winiarski@intel.com>

Reviewed-by: Michel Thierry <michel.thierry@intel.com>

> ---
>   drivers/gpu/drm/i915/intel_guc_submission.c | 1 +
>   1 file changed, 1 insertion(+)
> 
> diff --git a/drivers/gpu/drm/i915/intel_guc_submission.c b/drivers/gpu/drm/i915/intel_guc_submission.c
> index 133367a17863..24bdac205c45 100644
> --- a/drivers/gpu/drm/i915/intel_guc_submission.c
> +++ b/drivers/gpu/drm/i915/intel_guc_submission.c
> @@ -588,6 +588,7 @@ static void inject_preempt_context(struct work_struct *work)
>   	data[6] = intel_guc_ggtt_offset(guc, guc->shared_data);
>   
>   	if (WARN_ON(intel_guc_send(guc, data, ARRAY_SIZE(data)))) {
> +		engine->flags &= ~I915_ENGINE_HAS_PREEMPTION; /* XXX racy! */
>   		execlists_clear_active(&engine->execlists,
>   				       EXECLISTS_ACTIVE_PREEMPT);
>   		tasklet_schedule(&engine->execlists.tasklet);
> 


* Re: [PATCH] drm/i915/guc: Disable preemption if it fails
  2018-05-31 20:47 [PATCH] drm/i915/guc: Disable preemption if it fails Chris Wilson
                   ` (2 preceding siblings ...)
  2018-05-31 21:16 ` [PATCH] " Michel Thierry
@ 2018-05-31 21:17 ` Singh, Satyeshwar
  2018-05-31 21:20   ` Chris Wilson
  3 siblings, 1 reply; 7+ messages in thread
From: Singh, Satyeshwar @ 2018-05-31 21:17 UTC (permalink / raw)
  To: Chris Wilson, intel-gfx

Hi Chris,
Isn't this dependent upon the workload submitted to the GuC? Meaning we have one workload that refused to be preempted (really long shader for example) but it went away on its own. Other workloads that come in later are preemptible. However, if we turn off preemption permanently, then all future workloads will not be preempted either which may not be desirable.
-Satyeshwar

-----Original Message-----
From: Intel-gfx [mailto:intel-gfx-bounces@lists.freedesktop.org] On Behalf Of Chris Wilson
Sent: Thursday, May 31, 2018 1:47 PM
To: intel-gfx@lists.freedesktop.org
Subject: [Intel-gfx] [PATCH] drm/i915/guc: Disable preemption if it fails

If we fail to tell the GuC to perform preemption, we get stuck attempting to continually retry inject_preempt_context() until we eventually timeout and reset the GPU (approximately emitting the same warning 1000 times). Bail after the first failure, emit the WARN and stop trying to do any further preemption on this engine.

References: https://intel-gfx-ci.01.org/tree/drm-tip/Trybot_2235/shard-apl4/igt@gem_exec_schedule@preempt-bsd.html
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: Michel Thierry <michel.thierry@intel.com>
Cc: Michał Winiarski <michal.winiarski@intel.com>
---
 drivers/gpu/drm/i915/intel_guc_submission.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/gpu/drm/i915/intel_guc_submission.c b/drivers/gpu/drm/i915/intel_guc_submission.c
index 133367a17863..24bdac205c45 100644
--- a/drivers/gpu/drm/i915/intel_guc_submission.c
+++ b/drivers/gpu/drm/i915/intel_guc_submission.c
@@ -588,6 +588,7 @@ static void inject_preempt_context(struct work_struct *work)
 	data[6] = intel_guc_ggtt_offset(guc, guc->shared_data);
 
 	if (WARN_ON(intel_guc_send(guc, data, ARRAY_SIZE(data)))) {
+		engine->flags &= ~I915_ENGINE_HAS_PREEMPTION; /* XXX racy! */
 		execlists_clear_active(&engine->execlists,
 				       EXECLISTS_ACTIVE_PREEMPT);
 		tasklet_schedule(&engine->execlists.tasklet);
--
2.17.1



* Re: [PATCH] drm/i915/guc: Disable preemption if it fails
  2018-05-31 21:17 ` Singh, Satyeshwar
@ 2018-05-31 21:20   ` Chris Wilson
  2018-06-02  0:02     ` Jeff McGee
  0 siblings, 1 reply; 7+ messages in thread
From: Chris Wilson @ 2018-05-31 21:20 UTC (permalink / raw)
  To: Singh, Satyeshwar, intel-gfx

Quoting Singh, Satyeshwar (2018-05-31 22:17:25)
> Hi Chris,
> Isn't this dependent upon the workload submitted to the GuC? Meaning we have one workload that refused to be preempted (really long shader for example) but it went away on its own. Other workloads that come in later are preemptible. However, if we turn off preemption permanently, then all future workloads will not be preempted either which may not be desirable.

Whoever implements the recovery mechanism can clear the flag. You may
like to clear the flag on reset? We would have to be more careful about
the manipulation of engine->flags as it's not serialised atm (since it's
_meant_ to be write-once during init).
-Chris
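
The serialisation concern raised above can be sketched in userspace C11. This
is an illustration under stated assumptions, not the i915 implementation: a
plain "flags &= ~BIT" is a non-atomic read-modify-write, so a concurrent
writer to the same word could lose an update. Making the flag word atomic is
one way to avoid that. All names here (model_engine, model_engine_init,
model_disable_preemption, model_restore_preemption, model_has_preemption,
ENGINE_HAS_PREEMPTION) are hypothetical, and the kernel has its own bit
helpers rather than <stdatomic.h>.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for I915_ENGINE_HAS_PREEMPTION. */
#define ENGINE_HAS_PREEMPTION (1u << 0)

struct model_engine {
	atomic_uint flags;
};

/* Init-time setup: the engine starts out advertising preemption. */
void model_engine_init(struct model_engine *e)
{
	assert(e);
	atomic_init(&e->flags, ENGINE_HAS_PREEMPTION);
}

/* Failure path: clear the bit as a single atomic read-modify-write. */
void model_disable_preemption(struct model_engine *e)
{
	atomic_fetch_and(&e->flags, ~ENGINE_HAS_PREEMPTION);
}

/* Recovery path (for example after a reset): set the bit again. */
void model_restore_preemption(struct model_engine *e)
{
	atomic_fetch_or(&e->flags, ENGINE_HAS_PREEMPTION);
}

/* Readers test the bit with an atomic load. */
bool model_has_preemption(struct model_engine *e)
{
	return (atomic_load(&e->flags) & ENGINE_HAS_PREEMPTION) != 0;
}
```

Whether a reset is the right place to restore the flag is the open question
in this thread; the sketch only shows that clear and restore can each be made
a single atomic operation.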


* Re: [PATCH] drm/i915/guc: Disable preemption if it fails
  2018-05-31 21:20   ` Chris Wilson
@ 2018-06-02  0:02     ` Jeff McGee
  0 siblings, 0 replies; 7+ messages in thread
From: Jeff McGee @ 2018-06-02  0:02 UTC (permalink / raw)
  To: Chris Wilson; +Cc: intel-gfx

On Thu, May 31, 2018 at 10:20:54PM +0100, Chris Wilson wrote:
> Quoting Singh, Satyeshwar (2018-05-31 22:17:25)
> > Hi Chris,
> > Isn't this dependent upon the workload submitted to the GuC? Meaning we have one workload that refused to be preempted (really long shader for example) but it went away on its own. Other workloads that come in later are preemptible. However, if we turn off preemption permanently, then all future workloads will not be preempted either which may not be desirable.
> 
> Whoever implements the recovery mechanism can clear the flag. You may
> like to clear the flag on reset? We would have to be more careful about
> the manipulation of engine->flags as it's not serialised atm (since it's
> _meant_ to be write-once during init).
> -Chris

The error that would occur here is a failure of GuC to *initiate* the
preemption, and is different from a slow resolution of the preemption on
hardware caused by the workload blocking scenario that Satyeshwar
describes. GuC will wait forever for preemption resolution, as will i915
currently without a timeout mechanism. A failure of GuC to initiate a
preemption would be a very strange and bad thing and probably would
warrant a WARN and disabling. Is anyone actually seeing that with current
firmware? I have not in my own testing. Is it an actual error returned
from GuC or a timeout waiting for GuC response?
-Jeff
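
On the error-versus-timeout question: the log line quoted earlier in the
thread shows ret=-110, and on x86 Linux errno 110 is ETIMEDOUT, which would
suggest a timeout waiting for a response rather than an explicit error code.
A minimal sketch of making that distinction follows. The enum and function
names are hypothetical, and the numeric value of ETIMEDOUT differs on some
other architectures, so the code compares against the symbolic constant.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical classification of a send result (0 or a negative errno). */
enum send_result {
	SEND_OK,
	SEND_TIMED_OUT,
	SEND_OTHER_ERROR,
};

/* Separate "no response in time" from any other failure. */
enum send_result classify_send_result(int ret)
{
	if (ret == 0)
		return SEND_OK;
	if (ret == -ETIMEDOUT)
		return SEND_TIMED_OUT;
	return SEND_OTHER_ERROR;
}
```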

