All of lore.kernel.org
 help / color / mirror / Atom feed
* [Intel-gfx] [PATCH] drm/i915/guc: Don't GEM_BUG_ON on corrupted H2G CTB
@ 2020-01-20 19:18 Michal Wajdeczko
  2020-01-21  2:45 ` [Intel-gfx] ✗ Fi.CI.BAT: failure for " Patchwork
  2020-01-24 18:43 ` [Intel-gfx] [PATCH] " Chris Wilson
  0 siblings, 2 replies; 3+ messages in thread
From: Michal Wajdeczko @ 2020-01-20 19:18 UTC (permalink / raw)
  To: intel-gfx

We should never BUG_ON on any corruption in CTB descriptor as
data there can be also modified by the GuC. Instead we can
use flag "is_in_error" to indicate that we will not process
any further messages over this CTB (until reset).

Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
---
 drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c | 32 ++++++++++++++++-------
 1 file changed, 22 insertions(+), 10 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c b/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c
index 02b543377e2b..d84812683364 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c
+++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c
@@ -317,18 +317,25 @@ static int ct_write(struct intel_guc_ct *ct,
 {
 	struct intel_guc_ct_buffer *ctb = &ct->ctbs[CTB_SEND];
 	struct guc_ct_buffer_desc *desc = ctb->desc;
-	u32 head = desc->head / 4;	/* in dwords */
-	u32 tail = desc->tail / 4;	/* in dwords */
-	u32 size = desc->size / 4;	/* in dwords */
-	u32 used;			/* in dwords */
+	u32 head = desc->head;
+	u32 tail = desc->tail;
+	u32 size = desc->size;
+	u32 used;
 	u32 header;
 	u32 *cmds = ctb->cmds;
 	unsigned int i;
 
-	GEM_BUG_ON(desc->size % 4);
-	GEM_BUG_ON(desc->head % 4);
-	GEM_BUG_ON(desc->tail % 4);
-	GEM_BUG_ON(tail >= size);
+	if (unlikely(desc->is_in_error))
+		return -EPIPE;
+
+	if (unlikely(!IS_ALIGNED(head | tail | size, 4) ||
+		     (tail | head) >= size))
+		goto corrupted;
+
+	/* later calculations will be done in dwords */
+	head /= 4;
+	tail /= 4;
+	size /= 4;
 
 	/*
 	 * tail == head condition indicates empty. GuC FW does not support
@@ -367,12 +374,17 @@ static int ct_write(struct intel_guc_ct *ct,
 		cmds[tail] = action[i];
 		tail = (tail + 1) % size;
 	}
+	GEM_BUG_ON(tail > size);
 
 	/* now update desc tail (back in bytes) */
 	desc->tail = tail * 4;
-	GEM_BUG_ON(desc->tail > desc->size);
-
 	return 0;
+
+corrupted:
+	CT_ERROR(ct, "Corrupted descriptor addr=%#x head=%u tail=%u size=%u\n",
+		 desc->addr, desc->head, desc->tail, desc->size);
+	desc->is_in_error = 1;
+	return -EPIPE;
 }
 
 /**
-- 
2.19.2

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 3+ messages in thread

* [Intel-gfx] ✗ Fi.CI.BAT: failure for drm/i915/guc: Don't GEM_BUG_ON on corrupted H2G CTB
  2020-01-20 19:18 [Intel-gfx] [PATCH] drm/i915/guc: Don't GEM_BUG_ON on corrupted H2G CTB Michal Wajdeczko
@ 2020-01-21  2:45 ` Patchwork
  2020-01-24 18:43 ` [Intel-gfx] [PATCH] " Chris Wilson
  1 sibling, 0 replies; 3+ messages in thread
From: Patchwork @ 2020-01-21  2:45 UTC (permalink / raw)
  To: Michal Wajdeczko; +Cc: intel-gfx

== Series Details ==

Series: drm/i915/guc: Don't GEM_BUG_ON on corrupted H2G CTB
URL   : https://patchwork.freedesktop.org/series/72305/
State : failure

== Summary ==

CI Bug Log - changes from CI_DRM_7781 -> Patchwork_16177
====================================================

Summary
-------

  **FAILURE**

  Serious unknown changes coming with Patchwork_16177 absolutely need to be
  verified manually.
  
  If you think the reported changes have nothing to do with the changes
  introduced in Patchwork_16177, please notify your bug team to allow them
  to document this new failure mode, which will reduce false positives in CI.

  External URL: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_16177/index.html

Possible new issues
-------------------

  Here are the unknown changes that may have been introduced in Patchwork_16177:

### IGT changes ###

#### Possible regressions ####

  * igt@runner@aborted:
    - fi-kbl-7500u:       NOTRUN -> [FAIL][1]
   [1]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_16177/fi-kbl-7500u/igt@runner@aborted.html

  
Known issues
------------

  Here are the changes found in Patchwork_16177 that come from known issues:

### IGT changes ###

#### Issues hit ####

  * igt@gem_close_race@basic-threads:
    - fi-byt-j1900:       [PASS][2] -> [TIMEOUT][3] ([fdo#112271] / [i915#816])
   [2]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7781/fi-byt-j1900/igt@gem_close_race@basic-threads.html
   [3]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_16177/fi-byt-j1900/igt@gem_close_race@basic-threads.html

  * igt@i915_module_load@reload-with-fault-injection:
    - fi-skl-lmem:        [PASS][4] -> [DMESG-WARN][5] ([i915#889])
   [4]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7781/fi-skl-lmem/igt@i915_module_load@reload-with-fault-injection.html
   [5]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_16177/fi-skl-lmem/igt@i915_module_load@reload-with-fault-injection.html

  * igt@i915_pm_rpm@module-reload:
    - fi-kbl-7500u:       [PASS][6] -> [DMESG-WARN][7] ([i915#889]) +1 similar issue
   [6]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7781/fi-kbl-7500u/igt@i915_pm_rpm@module-reload.html
   [7]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_16177/fi-kbl-7500u/igt@i915_pm_rpm@module-reload.html

  * igt@kms_chamelium@dp-edid-read:
    - fi-cml-u2:          [PASS][8] -> [FAIL][9] ([i915#217])
   [8]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7781/fi-cml-u2/igt@kms_chamelium@dp-edid-read.html
   [9]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_16177/fi-cml-u2/igt@kms_chamelium@dp-edid-read.html

  * igt@kms_frontbuffer_tracking@basic:
    - fi-hsw-peppy:       [PASS][10] -> [DMESG-WARN][11] ([i915#44])
   [10]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7781/fi-hsw-peppy/igt@kms_frontbuffer_tracking@basic.html
   [11]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_16177/fi-hsw-peppy/igt@kms_frontbuffer_tracking@basic.html

  
#### Possible fixes ####

  * igt@kms_chamelium@hdmi-hpd-fast:
    - fi-kbl-7500u:       [FAIL][12] ([fdo#111407]) -> [PASS][13]
   [12]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7781/fi-kbl-7500u/igt@kms_chamelium@hdmi-hpd-fast.html
   [13]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_16177/fi-kbl-7500u/igt@kms_chamelium@hdmi-hpd-fast.html

  
#### Warnings ####

  * igt@i915_selftest@live_blt:
    - fi-hsw-4770:        [DMESG-FAIL][14] ([i915#725]) -> [DMESG-FAIL][15] ([i915#563])
   [14]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7781/fi-hsw-4770/igt@i915_selftest@live_blt.html
   [15]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_16177/fi-hsw-4770/igt@i915_selftest@live_blt.html

  
  [fdo#111407]: https://bugs.freedesktop.org/show_bug.cgi?id=111407
  [fdo#112271]: https://bugs.freedesktop.org/show_bug.cgi?id=112271
  [i915#217]: https://gitlab.freedesktop.org/drm/intel/issues/217
  [i915#44]: https://gitlab.freedesktop.org/drm/intel/issues/44
  [i915#563]: https://gitlab.freedesktop.org/drm/intel/issues/563
  [i915#725]: https://gitlab.freedesktop.org/drm/intel/issues/725
  [i915#816]: https://gitlab.freedesktop.org/drm/intel/issues/816
  [i915#889]: https://gitlab.freedesktop.org/drm/intel/issues/889


Participating hosts (50 -> 40)
------------------------------

  Missing    (10): fi-ilk-m540 fi-hsw-4200u fi-skl-guc fi-glk-dsi fi-byt-squawks fi-bwr-2160 fi-bsw-cyan fi-ctg-p8600 fi-whl-u fi-byt-clapper 


Build changes
-------------

  * CI: CI-20190529 -> None
  * Linux: CI_DRM_7781 -> Patchwork_16177

  CI-20190529: 20190529
  CI_DRM_7781: 3f2b341ae1fde67f823aeb715c6f489affdef8b1 @ git://anongit.freedesktop.org/gfx-ci/linux
  IGT_5374: 83c32e859202e43ff6a8cca162c76fcd90ad6e3b @ git://anongit.freedesktop.org/xorg/app/intel-gpu-tools
  Patchwork_16177: 5929bca143b8ff15f2dce5c8ef4b14d67479bbb1 @ git://anongit.freedesktop.org/gfx-ci/linux


== Linux commits ==

5929bca143b8 drm/i915/guc: Don't GEM_BUG_ON on corrupted H2G CTB

== Logs ==

For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_16177/index.html
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 3+ messages in thread

* Re: [Intel-gfx] [PATCH] drm/i915/guc: Don't GEM_BUG_ON on corrupted H2G CTB
  2020-01-20 19:18 [Intel-gfx] [PATCH] drm/i915/guc: Don't GEM_BUG_ON on corrupted H2G CTB Michal Wajdeczko
  2020-01-21  2:45 ` [Intel-gfx] ✗ Fi.CI.BAT: failure for " Patchwork
@ 2020-01-24 18:43 ` Chris Wilson
  1 sibling, 0 replies; 3+ messages in thread
From: Chris Wilson @ 2020-01-24 18:43 UTC (permalink / raw)
  To: Michal Wajdeczko, intel-gfx

Quoting Michal Wajdeczko (2020-01-20 19:18:17)
> We should never BUG_ON on any corruption in CTB descriptor as
> data there can be also modified by the GuC. Instead we can
> use flag "is_in_error" to indicate that we will not process
> any further messages over this CTB (until reset).

Like you already did for ct_read(). I was confused over having deja vu.

> Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
> Cc: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
-Chris
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 3+ messages in thread

end of thread, other threads:[~2020-01-24 18:43 UTC | newest]

Thread overview: 3+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-01-20 19:18 [Intel-gfx] [PATCH] drm/i915/guc: Don't GEM_BUG_ON on corrupted H2G CTB Michal Wajdeczko
2020-01-21  2:45 ` [Intel-gfx] ✗ Fi.CI.BAT: failure for " Patchwork
2020-01-24 18:43 ` [Intel-gfx] [PATCH] " Chris Wilson

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.