* [Intel-gfx] [PATCH] drm/i915/guc: Don't GEM_BUG_ON on corrupted H2G CTB
@ 2020-01-20 19:18 Michal Wajdeczko
2020-01-21 2:45 ` [Intel-gfx] ✗ Fi.CI.BAT: failure for " Patchwork
2020-01-24 18:43 ` [Intel-gfx] [PATCH] " Chris Wilson
0 siblings, 2 replies; 3+ messages in thread
From: Michal Wajdeczko @ 2020-01-20 19:18 UTC
To: intel-gfx
We should never BUG_ON on any corruption in the CTB descriptor, as
the data there can also be modified by the GuC. Instead we can
use the "is_in_error" flag to indicate that we will not process
any further messages over this CTB (until reset).
Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
---
drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c | 32 ++++++++++++++++-------
1 file changed, 22 insertions(+), 10 deletions(-)
diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c b/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c
index 02b543377e2b..d84812683364 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c
+++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c
@@ -317,18 +317,25 @@ static int ct_write(struct intel_guc_ct *ct,
 {
 	struct intel_guc_ct_buffer *ctb = &ct->ctbs[CTB_SEND];
 	struct guc_ct_buffer_desc *desc = ctb->desc;
-	u32 head = desc->head / 4;	/* in dwords */
-	u32 tail = desc->tail / 4;	/* in dwords */
-	u32 size = desc->size / 4;	/* in dwords */
-	u32 used;			/* in dwords */
+	u32 head = desc->head;
+	u32 tail = desc->tail;
+	u32 size = desc->size;
+	u32 used;
 	u32 header;
 	u32 *cmds = ctb->cmds;
 	unsigned int i;

-	GEM_BUG_ON(desc->size % 4);
-	GEM_BUG_ON(desc->head % 4);
-	GEM_BUG_ON(desc->tail % 4);
-	GEM_BUG_ON(tail >= size);
+	if (unlikely(desc->is_in_error))
+		return -EPIPE;
+
+	if (unlikely(!IS_ALIGNED(head | tail | size, 4) ||
+		     (tail | head) >= size))
+		goto corrupted;
+
+	/* later calculations will be done in dwords */
+	head /= 4;
+	tail /= 4;
+	size /= 4;

 	/*
 	 * tail == head condition indicates empty. GuC FW does not support
@@ -367,12 +374,17 @@ static int ct_write(struct intel_guc_ct *ct,
 		cmds[tail] = action[i];
 		tail = (tail + 1) % size;
 	}
+	GEM_BUG_ON(tail > size);

 	/* now update desc tail (back in bytes) */
 	desc->tail = tail * 4;
-	GEM_BUG_ON(desc->tail > desc->size);
-
 	return 0;
+
+corrupted:
+	CT_ERROR(ct, "Corrupted descriptor addr=%#x head=%u tail=%u size=%u\n",
+		 desc->addr, desc->head, desc->tail, desc->size);
+	desc->is_in_error = 1;
+	return -EPIPE;
 }

 /**
--
2.19.2
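
For readers who want to experiment with the idea outside the kernel, here is a
minimal userspace sketch of the same pattern: validate the shared descriptor
before use, and on corruption set a sticky error flag and return -EPIPE rather
than asserting. Everything prefixed with demo_ is hypothetical and is not the
driver's code; the struct is a simplified stand-in for the real descriptor, and
the message header and fence dword that ct_write() handles are left out. One
deliberate difference: the sketch compares head and tail against size
separately. The patch's combined test, (tail | head) >= size, gives the same
answer only when size is a power of two (with size=12, head=8, tail=4 the OR is
12 and the check would trip); whether that always holds for the driver's
buffers is not stated in this thread.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for the shared CTB descriptor. */
struct demo_ctb_desc {
	uint32_t head;        /* read offset in bytes, owned by the consumer */
	uint32_t tail;        /* write offset in bytes, owned by the producer */
	uint32_t size;        /* ring size in bytes */
	uint32_t is_in_error; /* sticky: set once corruption has been seen */
};

/* True if all three values are dword-aligned and both offsets lie in the ring. */
static bool demo_desc_is_sane(uint32_t head, uint32_t tail, uint32_t size)
{
	if ((head | tail | size) & 3u)
		return false;
	/* Separate comparisons: valid for any size, and rejects size == 0. */
	return head < size && tail < size;
}

/* Append len dwords from msg. Returns 0, -EPIPE (broken) or -ENOSPC (full). */
static int demo_ct_write(struct demo_ctb_desc *desc, uint32_t *cmds,
			 const uint32_t *msg, uint32_t len)
{
	uint32_t head = desc->head;
	uint32_t tail = desc->tail;
	uint32_t size = desc->size;
	uint32_t used, i;

	/* Once marked broken, refuse all further writes (until a reset). */
	if (desc->is_in_error)
		return -EPIPE;

	/* The other side can scribble on the descriptor: report, don't crash. */
	if (!demo_desc_is_sane(head, tail, size)) {
		desc->is_in_error = 1;
		return -EPIPE;
	}

	/* Later calculations are done in dwords. */
	head /= 4;
	tail /= 4;
	size /= 4;

	/* tail == head means empty, so one slot must always stay unused. */
	used = tail >= head ? tail - head : size - head + tail;
	if (used + len >= size)
		return -ENOSPC;

	for (i = 0; i < len; i++) {
		cmds[tail] = msg[i];
		tail = (tail + 1) % size;
	}

	/* This is a local invariant, not shared state, so asserting is fine. */
	assert(tail < size);

	/* Publish the new tail, back in bytes. */
	desc->tail = tail * 4;
	return 0;
}
```

A caller would treat -EPIPE as "this channel is dead until it is reset" and
-ENOSPC as "retry once the consumer has advanced head".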
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx
* [Intel-gfx] ✗ Fi.CI.BAT: failure for drm/i915/guc: Don't GEM_BUG_ON on corrupted H2G CTB
From: Patchwork @ 2020-01-21 2:45 UTC
To: Michal Wajdeczko; +Cc: intel-gfx
== Series Details ==
Series: drm/i915/guc: Don't GEM_BUG_ON on corrupted H2G CTB
URL : https://patchwork.freedesktop.org/series/72305/
State : failure
== Summary ==
CI Bug Log - changes from CI_DRM_7781 -> Patchwork_16177
====================================================
Summary
-------
**FAILURE**
Serious unknown changes coming with Patchwork_16177 absolutely need to be
verified manually.
If you think the reported changes have nothing to do with the changes
introduced in Patchwork_16177, please notify your bug team to allow them
to document this new failure mode, which will reduce false positives in CI.
External URL: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_16177/index.html
Possible new issues
-------------------
Here are the unknown changes that may have been introduced in Patchwork_16177:
### IGT changes ###
#### Possible regressions ####
* igt@runner@aborted:
- fi-kbl-7500u: NOTRUN -> [FAIL][1]
[1]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_16177/fi-kbl-7500u/igt@runner@aborted.html
Known issues
------------
Here are the changes found in Patchwork_16177 that come from known issues:
### IGT changes ###
#### Issues hit ####
* igt@gem_close_race@basic-threads:
- fi-byt-j1900: [PASS][2] -> [TIMEOUT][3] ([fdo#112271] / [i915#816])
[2]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7781/fi-byt-j1900/igt@gem_close_race@basic-threads.html
[3]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_16177/fi-byt-j1900/igt@gem_close_race@basic-threads.html
* igt@i915_module_load@reload-with-fault-injection:
- fi-skl-lmem: [PASS][4] -> [DMESG-WARN][5] ([i915#889])
[4]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7781/fi-skl-lmem/igt@i915_module_load@reload-with-fault-injection.html
[5]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_16177/fi-skl-lmem/igt@i915_module_load@reload-with-fault-injection.html
* igt@i915_pm_rpm@module-reload:
- fi-kbl-7500u: [PASS][6] -> [DMESG-WARN][7] ([i915#889]) +1 similar issue
[6]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7781/fi-kbl-7500u/igt@i915_pm_rpm@module-reload.html
[7]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_16177/fi-kbl-7500u/igt@i915_pm_rpm@module-reload.html
* igt@kms_chamelium@dp-edid-read:
- fi-cml-u2: [PASS][8] -> [FAIL][9] ([i915#217])
[8]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7781/fi-cml-u2/igt@kms_chamelium@dp-edid-read.html
[9]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_16177/fi-cml-u2/igt@kms_chamelium@dp-edid-read.html
* igt@kms_frontbuffer_tracking@basic:
- fi-hsw-peppy: [PASS][10] -> [DMESG-WARN][11] ([i915#44])
[10]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7781/fi-hsw-peppy/igt@kms_frontbuffer_tracking@basic.html
[11]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_16177/fi-hsw-peppy/igt@kms_frontbuffer_tracking@basic.html
#### Possible fixes ####
* igt@kms_chamelium@hdmi-hpd-fast:
- fi-kbl-7500u: [FAIL][12] ([fdo#111407]) -> [PASS][13]
[12]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7781/fi-kbl-7500u/igt@kms_chamelium@hdmi-hpd-fast.html
[13]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_16177/fi-kbl-7500u/igt@kms_chamelium@hdmi-hpd-fast.html
#### Warnings ####
* igt@i915_selftest@live_blt:
- fi-hsw-4770: [DMESG-FAIL][14] ([i915#725]) -> [DMESG-FAIL][15] ([i915#563])
[14]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7781/fi-hsw-4770/igt@i915_selftest@live_blt.html
[15]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_16177/fi-hsw-4770/igt@i915_selftest@live_blt.html
[fdo#111407]: https://bugs.freedesktop.org/show_bug.cgi?id=111407
[fdo#112271]: https://bugs.freedesktop.org/show_bug.cgi?id=112271
[i915#217]: https://gitlab.freedesktop.org/drm/intel/issues/217
[i915#44]: https://gitlab.freedesktop.org/drm/intel/issues/44
[i915#563]: https://gitlab.freedesktop.org/drm/intel/issues/563
[i915#725]: https://gitlab.freedesktop.org/drm/intel/issues/725
[i915#816]: https://gitlab.freedesktop.org/drm/intel/issues/816
[i915#889]: https://gitlab.freedesktop.org/drm/intel/issues/889
Participating hosts (50 -> 40)
------------------------------
Missing (10): fi-ilk-m540 fi-hsw-4200u fi-skl-guc fi-glk-dsi fi-byt-squawks fi-bwr-2160 fi-bsw-cyan fi-ctg-p8600 fi-whl-u fi-byt-clapper
Build changes
-------------
* CI: CI-20190529 -> None
* Linux: CI_DRM_7781 -> Patchwork_16177
CI-20190529: 20190529
CI_DRM_7781: 3f2b341ae1fde67f823aeb715c6f489affdef8b1 @ git://anongit.freedesktop.org/gfx-ci/linux
IGT_5374: 83c32e859202e43ff6a8cca162c76fcd90ad6e3b @ git://anongit.freedesktop.org/xorg/app/intel-gpu-tools
Patchwork_16177: 5929bca143b8ff15f2dce5c8ef4b14d67479bbb1 @ git://anongit.freedesktop.org/gfx-ci/linux
== Linux commits ==
5929bca143b8 drm/i915/guc: Don't GEM_BUG_ON on corrupted H2G CTB
== Logs ==
For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_16177/index.html
* Re: [Intel-gfx] [PATCH] drm/i915/guc: Don't GEM_BUG_ON on corrupted H2G CTB
From: Chris Wilson @ 2020-01-24 18:43 UTC
To: Michal Wajdeczko, intel-gfx
Quoting Michal Wajdeczko (2020-01-20 19:18:17)
> We should never BUG_ON on any corruption in the CTB descriptor, as
> the data there can also be modified by the GuC. Instead we can
> use the "is_in_error" flag to indicate that we will not process
> any further messages over this CTB (until reset).
Like you already did for ct_read(). I was confused over having deja vu.
> Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
> Cc: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
-Chris