* [PATCH] drm/i915/guc: Don't read SOFT_SCRATCH(15) on MMIO error
@ 2018-05-28 17:16 Michal Wajdeczko
  2018-05-28 17:52 ` ✗ Fi.CI.BAT: failure for " Patchwork
                   ` (4 more replies)
  0 siblings, 5 replies; 12+ messages in thread
From: Michal Wajdeczko @ 2018-05-28 17:16 UTC (permalink / raw)
  To: intel-gfx

SOFT_SCRATCH(15) is used by GuC for sending MMIO GuC events to host and
those events are now handled by intel_guc_to_host_event_handler_mmio().

We should not try to read it on MMIO action error as 1) we may be using
different set of registers for GuC MMIO communication, and 2) GuC may
use CTB mechanism for sending events to host.

While here, upgrade error message to DRM_ERROR.

Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: Michel Thierry <michel.thierry@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
---
 drivers/gpu/drm/i915/intel_guc.c | 6 ++----
 1 file changed, 2 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_guc.c b/drivers/gpu/drm/i915/intel_guc.c
index 116f4cc..e28a996 100644
--- a/drivers/gpu/drm/i915/intel_guc.c
+++ b/drivers/gpu/drm/i915/intel_guc.c
@@ -346,10 +346,8 @@ int intel_guc_send_mmio(struct intel_guc *guc, const u32 *action, u32 len,
 		ret = -EIO;
 
 	if (ret) {
-		DRM_DEBUG_DRIVER("INTEL_GUC_SEND: Action 0x%X failed;"
-				 " ret=%d status=0x%08X response=0x%08X\n",
-				 action[0], ret, status,
-				 I915_READ(SOFT_SCRATCH(15)));
+		DRM_ERROR("MMIO: GuC action %#x failed with error %d %#x\n",
+			  action[0], ret, status);
 		goto out;
 	}
 
-- 
1.9.1

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


* ✗ Fi.CI.BAT: failure for drm/i915/guc: Don't read SOFT_SCRATCH(15) on MMIO error
  2018-05-28 17:16 [PATCH] drm/i915/guc: Don't read SOFT_SCRATCH(15) on MMIO error Michal Wajdeczko
@ 2018-05-28 17:52 ` Patchwork
  2018-05-29 10:43 ` ✓ Fi.CI.BAT: success " Patchwork
                   ` (3 subsequent siblings)
  4 siblings, 0 replies; 12+ messages in thread
From: Patchwork @ 2018-05-28 17:52 UTC (permalink / raw)
  To: Michal Wajdeczko; +Cc: intel-gfx

== Series Details ==

Series: drm/i915/guc: Don't read SOFT_SCRATCH(15) on MMIO error
URL   : https://patchwork.freedesktop.org/series/43865/
State : failure

== Summary ==

= CI Bug Log - changes from CI_DRM_4249 -> Patchwork_9135 =

== Summary - FAILURE ==

  Serious unknown changes coming with Patchwork_9135 absolutely need to be
  verified manually.
  
  If you think the reported changes have nothing to do with the changes
  introduced in Patchwork_9135, please notify your bug team to allow them
  to document this new failure mode, which will reduce false positives in CI.

  External URL: https://patchwork.freedesktop.org/api/1.0/series/43865/revisions/1/mbox/

== Possible new issues ==

  Here are the unknown changes that may have been introduced in Patchwork_9135:

  === IGT changes ===

    ==== Possible regressions ====

    igt@kms_cursor_legacy@basic-busy-flip-before-cursor-legacy:
      fi-glk-j4005:       PASS -> FAIL

    
== Known issues ==

  Here are the changes found in Patchwork_9135 that come from known issues:

  === IGT changes ===

    ==== Issues hit ====

    igt@gem_busy@basic-hang-default:
      fi-glk-j4005:       PASS -> DMESG-WARN (fdo#106000)

    igt@gem_exec_suspend@basic-s3:
      fi-glk-j4005:       PASS -> DMESG-WARN (fdo#106097) +1

    igt@kms_chamelium@dp-edid-read:
      fi-kbl-7500u:       PASS -> FAIL (fdo#103841)

    
    ==== Possible fixes ====

    igt@kms_chamelium@hdmi-edid-read:
      fi-kbl-7500u:       FAIL (fdo#103841) -> SKIP

    igt@kms_chamelium@hdmi-hpd-fast:
      fi-kbl-7500u:       FAIL (fdo#102672, fdo#103841) -> SKIP

    
  fdo#102672 https://bugs.freedesktop.org/show_bug.cgi?id=102672
  fdo#103841 https://bugs.freedesktop.org/show_bug.cgi?id=103841
  fdo#106000 https://bugs.freedesktop.org/show_bug.cgi?id=106000
  fdo#106097 https://bugs.freedesktop.org/show_bug.cgi?id=106097


== Participating hosts (44 -> 38) ==

  Missing    (6): fi-ilk-m540 fi-byt-j1900 fi-byt-squawks fi-bsw-cyan fi-ctg-p8600 fi-skl-6700hq 


== Build changes ==

    * Linux: CI_DRM_4249 -> Patchwork_9135

  CI_DRM_4249: f460c1f3512a6a2ecea6585156a3a2f1430cc407 @ git://anongit.freedesktop.org/gfx-ci/linux
  IGT_4499: f560ae5a464331f03f0a669ed46b8c9e56526187 @ git://anongit.freedesktop.org/xorg/app/intel-gpu-tools
  Patchwork_9135: a93c347453c64561121dbf7b7902ce575cc22039 @ git://anongit.freedesktop.org/gfx-ci/linux


== Linux commits ==

a93c347453c6 drm/i915/guc: Don't read SOFT_SCRATCH(15) on MMIO error

== Logs ==

For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_9135/issues.html

* ✓ Fi.CI.BAT: success for drm/i915/guc: Don't read SOFT_SCRATCH(15) on MMIO error
  2018-05-28 17:16 [PATCH] drm/i915/guc: Don't read SOFT_SCRATCH(15) on MMIO error Michal Wajdeczko
  2018-05-28 17:52 ` ✗ Fi.CI.BAT: failure for " Patchwork
@ 2018-05-29 10:43 ` Patchwork
  2018-05-29 12:04 ` ✓ Fi.CI.IGT: " Patchwork
                   ` (2 subsequent siblings)
  4 siblings, 0 replies; 12+ messages in thread
From: Patchwork @ 2018-05-29 10:43 UTC (permalink / raw)
  To: Michal Wajdeczko; +Cc: intel-gfx

== Series Details ==

Series: drm/i915/guc: Don't read SOFT_SCRATCH(15) on MMIO error
URL   : https://patchwork.freedesktop.org/series/43865/
State : success

== Summary ==

= CI Bug Log - changes from CI_DRM_4250 -> Patchwork_9138 =

== Summary - SUCCESS ==

  No regressions found.

  External URL: https://patchwork.freedesktop.org/api/1.0/series/43865/revisions/1/mbox/

== Known issues ==

  Here are the changes found in Patchwork_9138 that come from known issues:

  === IGT changes ===

    ==== Issues hit ====

    igt@gem_exec_create@basic:
      fi-glk-j4005:       PASS -> DMESG-WARN (fdo#105719)

    igt@kms_flip@basic-flip-vs-wf_vblank:
      fi-glk-j4005:       PASS -> FAIL (fdo#100368)

    igt@kms_flip@basic-plain-flip:
      fi-glk-j4005:       PASS -> DMESG-WARN (fdo#106097)

    igt@kms_pipe_crc_basic@suspend-read-crc-pipe-c:
      fi-bxt-dsi:         PASS -> INCOMPLETE (fdo#103927)

    
    ==== Possible fixes ====

    igt@gem_exec_suspend@basic-s4-devices:
      fi-kbl-7500u:       DMESG-WARN (fdo#105128) -> PASS

    igt@kms_flip@basic-flip-vs-modeset:
      fi-glk-j4005:       DMESG-WARN (fdo#106000) -> PASS

    igt@kms_frontbuffer_tracking@basic:
      fi-hsw-4200u:       DMESG-FAIL (fdo#106103, fdo#102614) -> PASS

    
  fdo#100368 https://bugs.freedesktop.org/show_bug.cgi?id=100368
  fdo#102614 https://bugs.freedesktop.org/show_bug.cgi?id=102614
  fdo#103927 https://bugs.freedesktop.org/show_bug.cgi?id=103927
  fdo#105128 https://bugs.freedesktop.org/show_bug.cgi?id=105128
  fdo#105719 https://bugs.freedesktop.org/show_bug.cgi?id=105719
  fdo#106000 https://bugs.freedesktop.org/show_bug.cgi?id=106000
  fdo#106097 https://bugs.freedesktop.org/show_bug.cgi?id=106097
  fdo#106103 https://bugs.freedesktop.org/show_bug.cgi?id=106103


== Participating hosts (43 -> 39) ==

  Missing    (4): fi-ilk-m540 fi-byt-squawks fi-bsw-cyan fi-skl-6700hq 


== Build changes ==

    * Linux: CI_DRM_4250 -> Patchwork_9138

  CI_DRM_4250: 01b6423d3fabdb32ac69bc155dd5beb87c6761d8 @ git://anongit.freedesktop.org/gfx-ci/linux
  IGT_4499: f560ae5a464331f03f0a669ed46b8c9e56526187 @ git://anongit.freedesktop.org/xorg/app/intel-gpu-tools
  Patchwork_9138: 4e3d675aa3897bc6b42fb6a3e60639a2bc720bd7 @ git://anongit.freedesktop.org/gfx-ci/linux


== Linux commits ==

4e3d675aa389 drm/i915/guc: Don't read SOFT_SCRATCH(15) on MMIO error

== Logs ==

For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_9138/issues.html

* ✓ Fi.CI.IGT: success for drm/i915/guc: Don't read SOFT_SCRATCH(15) on MMIO error
  2018-05-28 17:16 [PATCH] drm/i915/guc: Don't read SOFT_SCRATCH(15) on MMIO error Michal Wajdeczko
  2018-05-28 17:52 ` ✗ Fi.CI.BAT: failure for " Patchwork
  2018-05-29 10:43 ` ✓ Fi.CI.BAT: success " Patchwork
@ 2018-05-29 12:04 ` Patchwork
  2018-05-29 14:54 ` [PATCH] " Chris Wilson
  2018-05-31 18:13 ` Chris Wilson
  4 siblings, 0 replies; 12+ messages in thread
From: Patchwork @ 2018-05-29 12:04 UTC (permalink / raw)
  To: Michal Wajdeczko; +Cc: intel-gfx

== Series Details ==

Series: drm/i915/guc: Don't read SOFT_SCRATCH(15) on MMIO error
URL   : https://patchwork.freedesktop.org/series/43865/
State : success

== Summary ==

= CI Bug Log - changes from CI_DRM_4250_full -> Patchwork_9138_full =

== Summary - WARNING ==

  Minor unknown changes coming with Patchwork_9138_full need to be verified
  manually.
  
  If you think the reported changes have nothing to do with the changes
  introduced in Patchwork_9138_full, please notify your bug team to allow them
  to document this new failure mode, which will reduce false positives in CI.

  External URL: https://patchwork.freedesktop.org/api/1.0/series/43865/revisions/1/mbox/

== Possible new issues ==

  Here are the unknown changes that may have been introduced in Patchwork_9138_full:

  === IGT changes ===

    ==== Warnings ====

    igt@gem_exec_schedule@deep-bsd1:
      shard-kbl:          SKIP -> PASS +2

    igt@gem_mocs_settings@mocs-rc6-bsd2:
      shard-kbl:          PASS -> SKIP

    igt@kms_busy@basic-flip-a:
      shard-snb:          SKIP -> PASS

    
== Known issues ==

  Here are the changes found in Patchwork_9138_full that come from known issues:

  === IGT changes ===

    ==== Issues hit ====

    igt@drv_selftest@live_hangcheck:
      shard-glk:          PASS -> DMESG-FAIL (fdo#106560)

    igt@kms_cursor_crc@cursor-256x256-suspend:
      shard-kbl:          PASS -> INCOMPLETE (fdo#103665)

    igt@kms_cursor_legacy@2x-nonblocking-modeset-vs-cursor-atomic:
      shard-glk:          PASS -> FAIL (fdo#106509, fdo#105454)

    igt@kms_flip@2x-dpms-vs-vblank-race:
      shard-glk:          PASS -> FAIL (fdo#103060)

    igt@kms_flip@2x-flip-vs-absolute-wf_vblank:
      shard-hsw:          PASS -> FAIL (fdo#100368)

    igt@kms_flip@2x-flip-vs-expired-vblank-interruptible:
      shard-glk:          PASS -> FAIL (fdo#105363) +1

    igt@kms_flip@2x-wf_vblank-ts-check:
      shard-hsw:          PASS -> FAIL (fdo#103928)

    igt@kms_flip@flip-vs-expired-vblank:
      shard-glk:          PASS -> FAIL (fdo#105363, fdo#102887)

    igt@kms_flip@wf_vblank-ts-check:
      shard-glk:          PASS -> FAIL (fdo#100368) +1

    igt@kms_flip_tiling@flip-to-x-tiled:
      shard-glk:          PASS -> FAIL (fdo#103822, fdo#104724)

    
    ==== Possible fixes ====

    igt@drv_selftest@live_hangcheck:
      shard-apl:          DMESG-FAIL (fdo#106560) -> PASS

    igt@drv_selftest@live_workarounds:
      shard-kbl:          INCOMPLETE (fdo#103665) -> PASS

    igt@kms_cursor_legacy@2x-long-cursor-vs-flip-atomic:
      shard-hsw:          FAIL (fdo#105767) -> PASS

    igt@kms_flip@2x-plain-flip-ts-check-interruptible:
      shard-glk:          FAIL (fdo#100368) -> PASS

    igt@kms_flip_tiling@flip-to-y-tiled:
      shard-glk:          FAIL (fdo#103822, fdo#104724) -> PASS

    
  fdo#100368 https://bugs.freedesktop.org/show_bug.cgi?id=100368
  fdo#102887 https://bugs.freedesktop.org/show_bug.cgi?id=102887
  fdo#103060 https://bugs.freedesktop.org/show_bug.cgi?id=103060
  fdo#103665 https://bugs.freedesktop.org/show_bug.cgi?id=103665
  fdo#103822 https://bugs.freedesktop.org/show_bug.cgi?id=103822
  fdo#103928 https://bugs.freedesktop.org/show_bug.cgi?id=103928
  fdo#104724 https://bugs.freedesktop.org/show_bug.cgi?id=104724
  fdo#105363 https://bugs.freedesktop.org/show_bug.cgi?id=105363
  fdo#105454 https://bugs.freedesktop.org/show_bug.cgi?id=105454
  fdo#105767 https://bugs.freedesktop.org/show_bug.cgi?id=105767
  fdo#106509 https://bugs.freedesktop.org/show_bug.cgi?id=106509
  fdo#106560 https://bugs.freedesktop.org/show_bug.cgi?id=106560


== Participating hosts (5 -> 5) ==

  No changes in participating hosts


== Build changes ==

    * Linux: CI_DRM_4250 -> Patchwork_9138

  CI_DRM_4250: 01b6423d3fabdb32ac69bc155dd5beb87c6761d8 @ git://anongit.freedesktop.org/gfx-ci/linux
  IGT_4499: f560ae5a464331f03f0a669ed46b8c9e56526187 @ git://anongit.freedesktop.org/xorg/app/intel-gpu-tools
  Patchwork_9138: 4e3d675aa3897bc6b42fb6a3e60639a2bc720bd7 @ git://anongit.freedesktop.org/gfx-ci/linux

== Logs ==

For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_9138/shards.html

* Re: [PATCH] drm/i915/guc: Don't read SOFT_SCRATCH(15) on MMIO error
  2018-05-28 17:16 [PATCH] drm/i915/guc: Don't read SOFT_SCRATCH(15) on MMIO error Michal Wajdeczko
                   ` (2 preceding siblings ...)
  2018-05-29 12:04 ` ✓ Fi.CI.IGT: " Patchwork
@ 2018-05-29 14:54 ` Chris Wilson
  2018-05-29 15:10   ` Michal Wajdeczko
  2018-07-12 15:31   ` Chris Wilson
  2018-05-31 18:13 ` Chris Wilson
  4 siblings, 2 replies; 12+ messages in thread
From: Chris Wilson @ 2018-05-29 14:54 UTC (permalink / raw)
  To: Michal Wajdeczko, intel-gfx

Quoting Michal Wajdeczko (2018-05-28 18:16:18)
> SOFT_SCRATCH(15) is used by GuC for sending MMIO GuC events to host and
> those events are now handled by intel_guc_to_host_event_handler_mmio().
> 
> We should not try to read it on MMIO action error as 1) we may be using
> different set of registers for GuC MMIO communication, and 2) GuC may
> use CTB mechanism for sending events to host.

Ok.
 
> While here, upgrade error message to DRM_ERROR.

Does the error help? What do you want to convey to the user? For error
handling, we want to propagate the result back anyway for the caller has
to decide what to do next.
-Chris

* Re: [PATCH] drm/i915/guc: Don't read SOFT_SCRATCH(15) on MMIO error
  2018-05-29 14:54 ` [PATCH] " Chris Wilson
@ 2018-05-29 15:10   ` Michal Wajdeczko
  2018-05-29 15:17     ` Chris Wilson
  2018-07-12 15:31   ` Chris Wilson
  1 sibling, 1 reply; 12+ messages in thread
From: Michal Wajdeczko @ 2018-05-29 15:10 UTC (permalink / raw)
  To: intel-gfx, Chris Wilson

On Tue, 29 May 2018 16:54:12 +0200, Chris Wilson  
<chris@chris-wilson.co.uk> wrote:

> Quoting Michal Wajdeczko (2018-05-28 18:16:18)
>> SOFT_SCRATCH(15) is used by GuC for sending MMIO GuC events to host and
>> those events are now handled by intel_guc_to_host_event_handler_mmio().
>>
>> We should not try to read it on MMIO action error as 1) we may be using
>> different set of registers for GuC MMIO communication, and 2) GuC may
>> use CTB mechanism for sending events to host.
>
> Ok.
>
>> While here, upgrade error message to DRM_ERROR.
>
> Does the error help? What do you want to convey to the user? For error
> handling, we want to propagate the result back anyway for the caller has
> to decide what to do next.

We are propagating error code to the caller, but since any error from the
GuC is unexpected, we should rather always log it and don't rely on the
caller or drm debug for that. Note that in case of CTB we also log received
errors using DRM_ERROR (see intel_guc_send_ct).

Michal

* Re: [PATCH] drm/i915/guc: Don't read SOFT_SCRATCH(15) on MMIO error
  2018-05-29 15:10   ` Michal Wajdeczko
@ 2018-05-29 15:17     ` Chris Wilson
  2018-05-29 15:30       ` Michal Wajdeczko
  0 siblings, 1 reply; 12+ messages in thread
From: Chris Wilson @ 2018-05-29 15:17 UTC (permalink / raw)
  To: Michal Wajdeczko, intel-gfx

Quoting Michal Wajdeczko (2018-05-29 16:10:44)
> On Tue, 29 May 2018 16:54:12 +0200, Chris Wilson  
> <chris@chris-wilson.co.uk> wrote:
> 
> > Quoting Michal Wajdeczko (2018-05-28 18:16:18)
> >> SOFT_SCRATCH(15) is used by GuC for sending MMIO GuC events to host and
> >> those events are now handled by intel_guc_to_host_event_handler_mmio().
> >>
> >> We should not try to read it on MMIO action error as 1) we may be using
> >> different set of registers for GuC MMIO communication, and 2) GuC may
> >> use CTB mechanism for sending events to host.
> >
> > Ok.
> >
> >> While here, upgrade error message to DRM_ERROR.
> >
> > Does the error help? What do you want to convey to the user? For error
> > handling, we want to propagate the result back anyway for the caller has
> > to decide what to do next.
> 
> We are propagating error code to the caller, but since any error from the
> GuC is unexpected, we should rather always log it and don't rely on the
> caller or drm debug for that. Note that in case of CTB we also log received
> errors using DRM_ERROR (see intel_guc_send_ct).

But whose error? Ours or the hw? We expect hw errors, or should ;)

But mostly from the pov of the message, is this the right information to
flag as the error or does the caller have better context?
-Chris

* Re: [PATCH] drm/i915/guc: Don't read SOFT_SCRATCH(15) on MMIO error
  2018-05-29 15:17     ` Chris Wilson
@ 2018-05-29 15:30       ` Michal Wajdeczko
  0 siblings, 0 replies; 12+ messages in thread
From: Michal Wajdeczko @ 2018-05-29 15:30 UTC (permalink / raw)
  To: intel-gfx, Chris Wilson

On Tue, 29 May 2018 17:17:02 +0200, Chris Wilson  
<chris@chris-wilson.co.uk> wrote:

> Quoting Michal Wajdeczko (2018-05-29 16:10:44)
>> On Tue, 29 May 2018 16:54:12 +0200, Chris Wilson
>> <chris@chris-wilson.co.uk> wrote:
>>
>> > Quoting Michal Wajdeczko (2018-05-28 18:16:18)
>> >> SOFT_SCRATCH(15) is used by GuC for sending MMIO GuC events to host and
>> >> those events are now handled by intel_guc_to_host_event_handler_mmio().
>> >>
>> >> We should not try to read it on MMIO action error as 1) we may be using
>> >> different set of registers for GuC MMIO communication, and 2) GuC may
>> >> use CTB mechanism for sending events to host.
>> >
>> > Ok.
>> >
>> >> While here, upgrade error message to DRM_ERROR.
>> >
>> > Does the error help? What do you want to convey to the user? For error
>> > handling, we want to propagate the result back anyway for the caller has
>> > to decide what to do next.
>>
>> We are propagating error code to the caller, but since any error from the
>> GuC is unexpected, we should rather always log it and don't rely on the
>> caller or drm debug for that. Note that in case of CTB we also log received
>> errors using DRM_ERROR (see intel_guc_send_ct).
>
> But whose error? Ours or the hw? We expect hw errors, or should ;)

Well, it can be any of i915/FW/HW; hard to tell without the other full logs.

>
> But mostly from the pov of the message, is this the right information to
> flag as the error or does the caller have better context?

Only the caller can easily provide additional info related to the failed command
(such as the index/address that was rejected by FW) that could help diagnose
the problem, but in case of FW/HW errors it does not matter.

At this point, we can only identify request/action ID that has failed.
But that's better than nothing.
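
A minimal userspace sketch of that split, not the i915 code: the low-level
send path can only report the action ID and raw status, while the caller can
add its own context before passing the same error code on. The names
sketch_send and sketch_register_index, the index parameter and the second
message are hypothetical; 0x10 and 0xf000f000 are only sample values.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>

/*
 * Hypothetical stand-in for the low-level send path: it only knows the
 * action ID and the raw status, so that is all it can log.
 */
static int sketch_send(unsigned int action, unsigned int status)
{
	if (status) {
		fprintf(stderr, "MMIO: GuC action %#x failed with error %d %#x\n",
			action, -EIO, status);
		return -EIO;
	}
	return 0;
}

/*
 * Hypothetical caller: it knows which index it was working on and can add
 * that context before propagating the same error code upwards.
 */
static int sketch_register_index(unsigned int index, unsigned int status)
{
	int ret = sketch_send(0x10, status);

	assert(ret <= 0); /* errno-style results are zero or negative */
	if (ret)
		fprintf(stderr, "registering index %u failed: %d\n", index, ret);
	return ret;
}
```

Calling sketch_register_index(3, 0) returns 0 silently, while a non-zero
status yields both messages and -EIO.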

/Michal

* Re: [PATCH] drm/i915/guc: Don't read SOFT_SCRATCH(15) on MMIO error
  2018-05-28 17:16 [PATCH] drm/i915/guc: Don't read SOFT_SCRATCH(15) on MMIO error Michal Wajdeczko
                   ` (3 preceding siblings ...)
  2018-05-29 14:54 ` [PATCH] " Chris Wilson
@ 2018-05-31 18:13 ` Chris Wilson
  2018-05-31 18:23   ` Chris Wilson
  4 siblings, 1 reply; 12+ messages in thread
From: Chris Wilson @ 2018-05-31 18:13 UTC (permalink / raw)
  To: Michal Wajdeczko, intel-gfx

Quoting Michal Wajdeczko (2018-05-28 18:16:18)
> SOFT_SCRATCH(15) is used by GuC for sending MMIO GuC events to host and
> those events are now handled by intel_guc_to_host_event_handler_mmio().
> 
> We should not try to read it on MMIO action error as 1) we may be using
> different set of registers for GuC MMIO communication, and 2) GuC may
> use CTB mechanism for sending events to host.
> 
> While here, upgrade error message to DRM_ERROR.
> 
> Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
> Cc: Michel Thierry <michel.thierry@intel.com>
> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
> Cc: Chris Wilson <chris@chris-wilson.co.uk>

I'm still not totally sold on having the DRM_ERROR here improves
debugging; it doesn't do anything to improve error handling, but

Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>

nevertheless.
-Chris

* Re: [PATCH] drm/i915/guc: Don't read SOFT_SCRATCH(15) on MMIO error
  2018-05-31 18:13 ` Chris Wilson
@ 2018-05-31 18:23   ` Chris Wilson
  0 siblings, 0 replies; 12+ messages in thread
From: Chris Wilson @ 2018-05-31 18:23 UTC (permalink / raw)
  To: Michal Wajdeczko, intel-gfx

Quoting Chris Wilson (2018-05-31 19:13:32)
> Quoting Michal Wajdeczko (2018-05-28 18:16:18)
> > SOFT_SCRATCH(15) is used by GuC for sending MMIO GuC events to host and
> > those events are now handled by intel_guc_to_host_event_handler_mmio().
> > 
> > We should not try to read it on MMIO action error as 1) we may be using
> > different set of registers for GuC MMIO communication, and 2) GuC may
> > use CTB mechanism for sending events to host.
> > 
> > While here, upgrade error message to DRM_ERROR.
> > 
> > Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
> > Cc: Michel Thierry <michel.thierry@intel.com>
> > Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
> > Cc: Chris Wilson <chris@chris-wilson.co.uk>
> 
> I'm still not totally sold on having the DRM_ERROR here improves
> debugging; it doesn't do anything to improve error handling, but
> 
> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
> 
> nevertheless.

And pushed. Thank you for the patch,
-Chris

* Re: [PATCH] drm/i915/guc: Don't read SOFT_SCRATCH(15) on MMIO error
  2018-05-29 14:54 ` [PATCH] " Chris Wilson
  2018-05-29 15:10   ` Michal Wajdeczko
@ 2018-07-12 15:31   ` Chris Wilson
  2018-07-12 17:29     ` Michal Wajdeczko
  1 sibling, 1 reply; 12+ messages in thread
From: Chris Wilson @ 2018-07-12 15:31 UTC (permalink / raw)
  To: Michal Wajdeczko, Michał Winiarski, intel-gfx

Quoting Chris Wilson (2018-05-29 15:54:12)
> Quoting Michal Wajdeczko (2018-05-28 18:16:18)
> > SOFT_SCRATCH(15) is used by GuC for sending MMIO GuC events to host and
> > those events are now handled by intel_guc_to_host_event_handler_mmio().
> > 
> > We should not try to read it on MMIO action error as 1) we may be using
> > different set of registers for GuC MMIO communication, and 2) GuC may
> > use CTB mechanism for sending events to host.
> 
> Ok.
>  
> > While here, upgrade error message to DRM_ERROR.
> 
> Does the error help? What do you want to convey to the user? For error
> handling, we want to propagate the result back anyway for the caller has
> to decide what to do next.

Good news! We see the error in BAT,

[  542.138479] i915: unknown parameter 'enable_guc_loading' ignored
[  542.138483] i915: unknown parameter 'enable_guc_submission' ignored
[  542.138485] Setting dangerous option enable_guc - tainting kernel
[  542.138488] Setting dangerous option live_selftests - tainting kernel
[  542.173291] [drm:intel_guc_send_mmio [i915]] *ERROR* MMIO: GuC action 0x10 failed with error -5 0xf000f000
[  542.367055] i915: probe of 0000:00:02.0 failed with error -25
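
For reference, the message text in the *ERROR* line above follows directly
from the format string added by the patch. The userspace sketch below uses
snprintf in place of DRM_ERROR; the helper names format_guc_error and
matches_bat_log are assumptions made for illustration, and the DRM prefix
("[drm:intel_guc_send_mmio [i915]] *ERROR*") is not reproduced.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Format the message body with the same format string as the patch. */
static int format_guc_error(char *buf, size_t len, unsigned int action,
			    int ret, unsigned int status)
{
	assert(buf != NULL && len > 0);
	return snprintf(buf, len,
			"MMIO: GuC action %#x failed with error %d %#x\n",
			action, ret, status);
}

/* Compare against the message body seen in the BAT log (values from the log). */
static int matches_bat_log(void)
{
	char buf[128];

	format_guc_error(buf, sizeof(buf), 0x10, -5, 0xf000f000u);
	return strcmp(buf,
		      "MMIO: GuC action 0x10 failed with error -5 0xf000f000\n") == 0;
}
```

One detail of %#x worth keeping in mind when reading such logs: a zero value
is printed as plain "0" with no "0x" prefix, unlike the fixed-width 0x%08X
used by the old message.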

And Michał reminded me this wasn't the first time...

commit feb06c151fade9ecaa3dd410d792cce26e8b10de
Author: Michał Winiarski <michal.winiarski@intel.com>
Date:   Mon Mar 19 10:53:47 2018 +0100

    drm/i915/guc: Demote GuC error messages
    
    We're using those functions in selftests, and the callers are expected
    to do the error handling anyways. Let's demote all GuC actions and
    doorbell creation to DEBUG_DRIVER.

So do we kindly ask Michał to resubmit his fix?
-Chris

* Re: [PATCH] drm/i915/guc: Don't read SOFT_SCRATCH(15) on MMIO error
  2018-07-12 15:31   ` Chris Wilson
@ 2018-07-12 17:29     ` Michal Wajdeczko
  0 siblings, 0 replies; 12+ messages in thread
From: Michal Wajdeczko @ 2018-07-12 17:29 UTC (permalink / raw)
  To: Michał Winiarski, intel-gfx, Chris Wilson

On Thu, 12 Jul 2018 17:31:14 +0200, Chris Wilson  
<chris@chris-wilson.co.uk> wrote:

> Quoting Chris Wilson (2018-05-29 15:54:12)
>> Quoting Michal Wajdeczko (2018-05-28 18:16:18)
>> > SOFT_SCRATCH(15) is used by GuC for sending MMIO GuC events to host and
>> > those events are now handled by intel_guc_to_host_event_handler_mmio().
>> >
>> > We should not try to read it on MMIO action error as 1) we may be using
>> > different set of registers for GuC MMIO communication, and 2) GuC may
>> > use CTB mechanism for sending events to host.
>>
>> Ok.
>>
>> > While here, upgrade error message to DRM_ERROR.
>>
>> Does the error help? What do you want to convey to the user? For error
>> handling, we want to propagate the result back anyway for the caller has
>> to decide what to do next.
>
> Good news! We see the error in BAT,
>
> [  542.138479] i915: unknown parameter 'enable_guc_loading' ignored
> [  542.138483] i915: unknown parameter 'enable_guc_submission' ignored
> [  542.138485] Setting dangerous option enable_guc - tainting kernel
> [  542.138488] Setting dangerous option live_selftests - tainting kernel
> [  542.173291] [drm:intel_guc_send_mmio [i915]] *ERROR* MMIO: GuC action 0x10 failed with error -5 0xf000f000
> [  542.367055] i915: probe of 0000:00:02.0 failed with error -25
>
> And Michał reminded me this wasn't the first time...
>
> commit feb06c151fade9ecaa3dd410d792cce26e8b10de
> Author: Michał Winiarski <michal.winiarski@intel.com>
> Date:   Mon Mar 19 10:53:47 2018 +0100
>
>     drm/i915/guc: Demote GuC error messages
>
>     We're using those functions in selftests, and the callers are expected
>     to do the error handling anyways. Let's demote all GuC actions and
>     doorbell creation to DEBUG_DRIVER.
>
> So do we kindly ask Michał to resubmit his fix?

There are more places where DRM_ERROR is used after detecting a GuC error
(see intel_guc_send_ct as an example, more to show up shortly).

I would rather prefer to add GUC_ERROR macro that could be tweaked
under SELFTEST config and runtime flags to demote unwanted errors:

#if IS_ENABLED(CONFIG_DRM_I915_SELFTEST)
#define GUC_ERROR(...) \
	do { \
		if (unlikely(i915_selftest.mock || i915_selftest.live)) \
			DRM_DEBUG_DRIVER(__VA_ARGS__); \
		else \
			DRM_ERROR(__VA_ARGS__); \
	} while (0)
#else
#define GUC_ERROR(...) DRM_ERROR(__VA_ARGS__)
#endif
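
A minimal userspace sketch of how such a macro would behave, under stated
assumptions: the struct and the two logging macros below are stand-ins for
the kernel's i915_selftest flags, DRM_ERROR and DRM_DEBUG_DRIVER, and the
counters and report_failure() are hypothetical additions so the routing can
be observed. The do { } while (0) wrapper carries no trailing semicolon so
the macro can sit in an unbraced if/else.

```c
#include <assert.h>
#include <stdio.h>

/* Stand-ins (assumptions) for the kernel's selftest flags and log macros. */
struct sketch_selftest { int mock; int live; };
static struct sketch_selftest i915_selftest;
static int error_count, debug_count;

#define DRM_ERROR(...) \
	do { \
		error_count++; \
		fprintf(stderr, "*ERROR* "); \
		fprintf(stderr, __VA_ARGS__); \
	} while (0)

#define DRM_DEBUG_DRIVER(...) \
	do { \
		debug_count++; \
		fprintf(stderr, "[debug] "); \
		fprintf(stderr, __VA_ARGS__); \
	} while (0)

/* Demote to debug output while any selftest mode is active. */
#define GUC_ERROR(...) \
	do { \
		if (i915_selftest.mock || i915_selftest.live) \
			DRM_DEBUG_DRIVER(__VA_ARGS__); \
		else \
			DRM_ERROR(__VA_ARGS__); \
	} while (0)

/* Hypothetical helper: log a failed action and propagate the error code. */
static int report_failure(unsigned int action, int ret)
{
	assert(ret <= 0); /* errno-style results are zero or negative */
	if (ret)
		GUC_ERROR("GuC action %#x failed with error %d\n", action, ret);
	else
		fprintf(stderr, "GuC action %#x ok\n", action);
	return ret;
}
```

With both flags clear, report_failure(0x10, -5) goes through the error
path; with i915_selftest.live set, the same call is demoted to the debug
path and the error counter is left untouched.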

/Michal