* [Intel-gfx] [CI] drm/i915: Disable atomics in L3 for gen9
@ 2021-01-25 21:52 Chris Wilson
2021-01-25 22:01 ` [Intel-gfx] [PATCH] " Chris Wilson
` (2 more replies)
0 siblings, 3 replies; 7+ messages in thread
From: Chris Wilson @ 2021-01-25 21:52 UTC
To: intel-gfx
Enabling atomic operations in L3 leads to unrecoverable GPU hangs, as
the machine stops responding milliseconds after receipt of the reset
request [GDRT]. By disabling the cached atomics, the hangs do not occur
and we presume the GPU would reset normally for similar hangs.
Reported-by: Jason Ekstrand <jason@jlekstrand.net>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110998
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Jason Ekstrand <jason@jlekstrand.net>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
---
drivers/gpu/drm/i915/gt/intel_workarounds.c | 8 ++++++++
drivers/gpu/drm/i915/i915_reg.h | 7 +++++++
2 files changed, 15 insertions(+)
diff --git a/drivers/gpu/drm/i915/gt/intel_workarounds.c b/drivers/gpu/drm/i915/gt/intel_workarounds.c
index 82e15c8c7a97..7a1d8c68aefb 100644
--- a/drivers/gpu/drm/i915/gt/intel_workarounds.c
+++ b/drivers/gpu/drm/i915/gt/intel_workarounds.c
@@ -1840,6 +1840,14 @@ rcs_engine_wa_init(struct intel_engine_cs *engine, struct i915_wa_list *wal)
wa_write_or(wal,
GEN8_L3SQCREG4,
GEN8_LQSC_FLUSH_COHERENT_LINES);
+
+ /* Disable atomics in L3 to prevent unrecoverable hangs */
+ wa_write_masked_or(wal, GEN9_SCRATCH_LNCF1,
+ GEN9_LNCF_NONIA_COHERENT_ATOMICS_ENABLE, 0);
+ wa_write_masked_or(wal, GEN8_L3SQCREG4,
+ GEN8_LQSQ_NONIA_COHERENT_ATOMICS_ENABLE, 0);
+ wa_write_masked_or(wal, GEN9_SCRATCH1,
+ EVICTION_PERF_FIX_ENABLE, 0);
}
if (IS_HASWELL(i915)) {
diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
index 8b9bbc6bacb1..fa3866f9ccfc 100644
--- a/drivers/gpu/drm/i915/i915_reg.h
+++ b/drivers/gpu/drm/i915/i915_reg.h
@@ -8222,6 +8222,7 @@ enum {
#define GEN11_LQSC_CLEAN_EVICT_DISABLE (1 << 6)
#define GEN8_LQSC_RO_PERF_DIS (1 << 27)
#define GEN8_LQSC_FLUSH_COHERENT_LINES (1 << 21)
+#define GEN8_LQSQ_NONIA_COHERENT_ATOMICS_ENABLE REG_BIT(22)
/* GEN8 chicken */
#define HDC_CHICKEN0 _MMIO(0x7300)
@@ -12104,6 +12105,12 @@ enum skl_power_gate {
#define __GEN11_VCS2_MOCS0 0x10000
#define GEN11_MFX2_MOCS(i) _MMIO(__GEN11_VCS2_MOCS0 + (i) * 4)
+#define GEN9_SCRATCH_LNCF1 _MMIO(0xb008)
+#define GEN9_LNCF_NONIA_COHERENT_ATOMICS_ENABLE REG_BIT(0)
+
+#define GEN9_SCRATCH1 _MMIO(0xb11c)
+#define EVICTION_PERF_FIX_ENABLE REG_BIT(8)
+
#define GEN10_SCRATCH_LNCF2 _MMIO(0xb0a0)
#define PMFLUSHDONE_LNICRSDROP (1 << 20)
#define PMFLUSH_GAPL3UNBLOCK (1 << 21)
--
2.20.1
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx
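The three calls in the patch above each pass a bit to clear and 0 as the value to set. As a rough illustration only, the sketch below models that as a plain clear-then-set on a 32-bit value. It assumes the helper (wa_write_masked_or in this posting, wa_write_clr_set in the repost) ends up computing (old & ~clear) | set; that is an assumption about its effect, not a copy of the i915 implementation. The names model_clr_set, model_l3_regs, model_disable_l3_atomics and model_self_check are hypothetical, and the starting values are made up. Only the bit positions and register offsets are taken from the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Userspace stand-in for the kernel's REG_BIT(); not the kernel macro itself. */
#define REG_BIT(n) (UINT32_C(1) << (n))

/* Bit positions copied from the i915_reg.h hunks in the patch. */
#define GEN9_LNCF_NONIA_COHERENT_ATOMICS_ENABLE REG_BIT(0)
#define GEN8_LQSQ_NONIA_COHERENT_ATOMICS_ENABLE REG_BIT(22)
#define EVICTION_PERF_FIX_ENABLE                REG_BIT(8)

/* Hypothetical model: clear the bits in clr, then OR in the bits in set. */
uint32_t model_clr_set(uint32_t old, uint32_t clr, uint32_t set)
{
	return (old & ~clr) | set;
}

/* Hypothetical container for the three register values the patch touches. */
struct model_l3_regs {
	uint32_t scratch_lncf1; /* GEN9_SCRATCH_LNCF1, 0xb008 */
	uint32_t l3sqcreg4;     /* GEN8_L3SQCREG4 */
	uint32_t scratch1;      /* GEN9_SCRATCH1, 0xb11c */
};

/* Apply the three updates from the patch to the modelled values. */
void model_disable_l3_atomics(struct model_l3_regs *r)
{
	r->scratch_lncf1 = model_clr_set(r->scratch_lncf1,
					 GEN9_LNCF_NONIA_COHERENT_ATOMICS_ENABLE, 0);
	r->l3sqcreg4 = model_clr_set(r->l3sqcreg4,
				     GEN8_LQSQ_NONIA_COHERENT_ATOMICS_ENABLE, 0);
	r->scratch1 = model_clr_set(r->scratch1,
				    EVICTION_PERF_FIX_ENABLE, 0);
}

/* Sanity check with made-up starting values (all bits set). */
void model_self_check(void)
{
	struct model_l3_regs r = { 0xffffffffu, 0xffffffffu, 0xffffffffu };

	model_disable_l3_atomics(&r);
	assert(r.scratch_lncf1 == 0xfffffffeu); /* bit 0 cleared */
	assert(r.l3sqcreg4 == 0xffbfffffu);     /* bit 22 cleared */
	assert(r.scratch1 == 0xfffffeffu);      /* bit 8 cleared */
}
```

Calling model_self_check() from a small test program exercises all three updates. The model ignores everything the real workaround list does beyond the bit arithmetic, such as when the write is applied and whether it is verified afterwards.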
* [Intel-gfx] [PATCH] drm/i915: Disable atomics in L3 for gen9
2021-01-25 21:52 [Intel-gfx] [CI] drm/i915: Disable atomics in L3 for gen9 Chris Wilson
@ 2021-01-25 22:01 ` Chris Wilson
2021-01-25 23:41 ` [Intel-gfx] ✓ Fi.CI.BAT: success for drm/i915: Disable atomics in L3 for gen9 (rev4) Patchwork
2021-01-26 5:11 ` [Intel-gfx] ✗ Fi.CI.IGT: failure " Patchwork
2 siblings, 0 replies; 7+ messages in thread
From: Chris Wilson @ 2021-01-25 22:01 UTC
To: intel-gfx; +Cc: Jason Ekstrand, Chris Wilson
Enabling atomic operations in L3 leads to unrecoverable GPU hangs, as
the machine stops responding milliseconds after receipt of the reset
request [GDRT]. By disabling the cached atomics, the hangs do not occur
and we presume the GPU would reset normally for similar hangs.
Reported-by: Jason Ekstrand <jason@jlekstrand.net>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110998
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Jason Ekstrand <jason@jlekstrand.net>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
---
drivers/gpu/drm/i915/gt/intel_workarounds.c | 8 ++++++++
drivers/gpu/drm/i915/i915_reg.h | 7 +++++++
2 files changed, 15 insertions(+)
diff --git a/drivers/gpu/drm/i915/gt/intel_workarounds.c b/drivers/gpu/drm/i915/gt/intel_workarounds.c
index 82e15c8c7a97..71d1c19c868b 100644
--- a/drivers/gpu/drm/i915/gt/intel_workarounds.c
+++ b/drivers/gpu/drm/i915/gt/intel_workarounds.c
@@ -1840,6 +1840,14 @@ rcs_engine_wa_init(struct intel_engine_cs *engine, struct i915_wa_list *wal)
wa_write_or(wal,
GEN8_L3SQCREG4,
GEN8_LQSC_FLUSH_COHERENT_LINES);
+
+ /* Disable atomics in L3 to prevent unrecoverable hangs */
+ wa_write_clr_set(wal, GEN9_SCRATCH_LNCF1,
+ GEN9_LNCF_NONIA_COHERENT_ATOMICS_ENABLE, 0);
+ wa_write_clr_set(wal, GEN8_L3SQCREG4,
+ GEN8_LQSQ_NONIA_COHERENT_ATOMICS_ENABLE, 0);
+ wa_write_clr_set(wal, GEN9_SCRATCH1,
+ EVICTION_PERF_FIX_ENABLE, 0);
}
if (IS_HASWELL(i915)) {
diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
index 8b9bbc6bacb1..fa3866f9ccfc 100644
--- a/drivers/gpu/drm/i915/i915_reg.h
+++ b/drivers/gpu/drm/i915/i915_reg.h
@@ -8222,6 +8222,7 @@ enum {
#define GEN11_LQSC_CLEAN_EVICT_DISABLE (1 << 6)
#define GEN8_LQSC_RO_PERF_DIS (1 << 27)
#define GEN8_LQSC_FLUSH_COHERENT_LINES (1 << 21)
+#define GEN8_LQSQ_NONIA_COHERENT_ATOMICS_ENABLE REG_BIT(22)
/* GEN8 chicken */
#define HDC_CHICKEN0 _MMIO(0x7300)
@@ -12104,6 +12105,12 @@ enum skl_power_gate {
#define __GEN11_VCS2_MOCS0 0x10000
#define GEN11_MFX2_MOCS(i) _MMIO(__GEN11_VCS2_MOCS0 + (i) * 4)
+#define GEN9_SCRATCH_LNCF1 _MMIO(0xb008)
+#define GEN9_LNCF_NONIA_COHERENT_ATOMICS_ENABLE REG_BIT(0)
+
+#define GEN9_SCRATCH1 _MMIO(0xb11c)
+#define EVICTION_PERF_FIX_ENABLE REG_BIT(8)
+
#define GEN10_SCRATCH_LNCF2 _MMIO(0xb0a0)
#define PMFLUSHDONE_LNICRSDROP (1 << 20)
#define PMFLUSH_GAPL3UNBLOCK (1 << 21)
--
2.20.1
* [Intel-gfx] ✓ Fi.CI.BAT: success for drm/i915: Disable atomics in L3 for gen9 (rev4)
2021-01-25 21:52 [Intel-gfx] [CI] drm/i915: Disable atomics in L3 for gen9 Chris Wilson
2021-01-25 22:01 ` [Intel-gfx] [PATCH] " Chris Wilson
@ 2021-01-25 23:41 ` Patchwork
2021-01-26 5:11 ` [Intel-gfx] ✗ Fi.CI.IGT: failure " Patchwork
2 siblings, 0 replies; 7+ messages in thread
From: Patchwork @ 2021-01-25 23:41 UTC
To: Chris Wilson; +Cc: intel-gfx
== Series Details ==
Series: drm/i915: Disable atomics in L3 for gen9 (rev4)
URL : https://patchwork.freedesktop.org/series/63969/
State : success
== Summary ==
CI Bug Log - changes from CI_DRM_9681 -> Patchwork_19495
====================================================
Summary
-------
**SUCCESS**
No regressions found.
External URL: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19495/index.html
Known issues
------------
Here are the changes found in Patchwork_19495 that come from known issues:
### IGT changes ###
#### Issues hit ####
* igt@amdgpu/amd_cs_nop@sync-gfx0:
- fi-bsw-n3050: NOTRUN -> [SKIP][1] ([fdo#109271]) +17 similar issues
[1]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19495/fi-bsw-n3050/igt@amdgpu/amd_cs_nop@sync-gfx0.html
* igt@gem_tiled_fence_blits@basic:
- fi-tgl-y: [PASS][2] -> [DMESG-WARN][3] ([i915#402])
[2]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_9681/fi-tgl-y/igt@gem_tiled_fence_blits@basic.html
[3]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19495/fi-tgl-y/igt@gem_tiled_fence_blits@basic.html
* igt@kms_chamelium@hdmi-hpd-fast:
- fi-icl-u2: [PASS][4] -> [DMESG-WARN][5] ([i915#2203])
[4]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_9681/fi-icl-u2/igt@kms_chamelium@hdmi-hpd-fast.html
[5]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19495/fi-icl-u2/igt@kms_chamelium@hdmi-hpd-fast.html
#### Possible fixes ####
* igt@gem_exec_suspend@basic-s0:
- fi-tgl-u2: [FAIL][6] ([i915#1888]) -> [PASS][7]
[6]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_9681/fi-tgl-u2/igt@gem_exec_suspend@basic-s0.html
[7]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19495/fi-tgl-u2/igt@gem_exec_suspend@basic-s0.html
* igt@gem_ringfill@basic-all:
- fi-tgl-y: [DMESG-WARN][8] ([i915#402]) -> [PASS][9]
[8]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_9681/fi-tgl-y/igt@gem_ringfill@basic-all.html
[9]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19495/fi-tgl-y/igt@gem_ringfill@basic-all.html
* igt@i915_selftest@live@execlists:
- fi-bsw-n3050: [INCOMPLETE][10] ([i915#2940]) -> [PASS][11]
[10]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_9681/fi-bsw-n3050/igt@i915_selftest@live@execlists.html
[11]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19495/fi-bsw-n3050/igt@i915_selftest@live@execlists.html
#### Warnings ####
* igt@i915_pm_rpm@basic-rte:
- fi-kbl-guc: [SKIP][12] ([fdo#109271]) -> [FAIL][13] ([i915#704])
[12]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_9681/fi-kbl-guc/igt@i915_pm_rpm@basic-rte.html
[13]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19495/fi-kbl-guc/igt@i915_pm_rpm@basic-rte.html
[fdo#109271]: https://bugs.freedesktop.org/show_bug.cgi?id=109271
[i915#1888]: https://gitlab.freedesktop.org/drm/intel/issues/1888
[i915#2203]: https://gitlab.freedesktop.org/drm/intel/issues/2203
[i915#2940]: https://gitlab.freedesktop.org/drm/intel/issues/2940
[i915#402]: https://gitlab.freedesktop.org/drm/intel/issues/402
[i915#704]: https://gitlab.freedesktop.org/drm/intel/issues/704
Participating hosts (40 -> 36)
------------------------------
Missing (4): fi-ctg-p8600 fi-jsl-1 fi-ilk-m540 fi-hsw-4200u
Build changes
-------------
* Linux: CI_DRM_9681 -> Patchwork_19495
CI-20190529: 20190529
CI_DRM_9681: 1e6338c3a84443c647b7bccf812164f3376f11f9 @ git://anongit.freedesktop.org/gfx-ci/linux
IGT_5971: abef2b7d6ff30f3b948b3e5d39653debb73083f3 @ git://anongit.freedesktop.org/xorg/app/intel-gpu-tools
Patchwork_19495: 1d3fdd58287cf8f4b111636bd85d9e9058e6fdea @ git://anongit.freedesktop.org/gfx-ci/linux
== Linux commits ==
1d3fdd58287c drm/i915: Disable atomics in L3 for gen9
== Logs ==
For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19495/index.html
* [Intel-gfx] ✗ Fi.CI.IGT: failure for drm/i915: Disable atomics in L3 for gen9 (rev4)
2021-01-25 21:52 [Intel-gfx] [CI] drm/i915: Disable atomics in L3 for gen9 Chris Wilson
2021-01-25 22:01 ` [Intel-gfx] [PATCH] " Chris Wilson
2021-01-25 23:41 ` [Intel-gfx] ✓ Fi.CI.BAT: success for drm/i915: Disable atomics in L3 for gen9 (rev4) Patchwork
@ 2021-01-26 5:11 ` Patchwork
2 siblings, 0 replies; 7+ messages in thread
From: Patchwork @ 2021-01-26 5:11 UTC
To: Chris Wilson; +Cc: intel-gfx
== Series Details ==
Series: drm/i915: Disable atomics in L3 for gen9 (rev4)
URL : https://patchwork.freedesktop.org/series/63969/
State : failure
== Summary ==
CI Bug Log - changes from CI_DRM_9681_full -> Patchwork_19495_full
====================================================
Summary
-------
**FAILURE**
Serious unknown changes coming with Patchwork_19495_full absolutely need to be
verified manually.
If you think the reported changes have nothing to do with the changes
introduced in Patchwork_19495_full, please notify your bug team to allow them
to document this new failure mode, which will reduce false positives in CI.
Possible new issues
-------------------
Here are the unknown changes that may have been introduced in Patchwork_19495_full:
### IGT changes ###
#### Possible regressions ####
* igt@gem_ctx_persistence@heartbeat-stop:
- shard-tglb: [PASS][1] -> [FAIL][2]
[1]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_9681/shard-tglb5/igt@gem_ctx_persistence@heartbeat-stop.html
[2]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19495/shard-tglb1/igt@gem_ctx_persistence@heartbeat-stop.html
#### Suppressed ####
The following results come from untrusted machines, tests, or statuses.
They do not affect the overall result.
* {igt@sysfs_clients@busy@vecs0}:
- shard-glk: [PASS][3] -> [FAIL][4]
[3]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_9681/shard-glk8/igt@sysfs_clients@busy@vecs0.html
[4]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19495/shard-glk8/igt@sysfs_clients@busy@vecs0.html
- shard-kbl: [PASS][5] -> [FAIL][6]
[5]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_9681/shard-kbl1/igt@sysfs_clients@busy@vecs0.html
[6]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19495/shard-kbl1/igt@sysfs_clients@busy@vecs0.html
Known issues
------------
Here are the changes found in Patchwork_19495_full that come from known issues:
### IGT changes ###
#### Issues hit ####
* igt@gem_exec_fair@basic-deadline:
- shard-apl: NOTRUN -> [FAIL][7] ([i915#2846])
[7]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19495/shard-apl2/igt@gem_exec_fair@basic-deadline.html
* igt@gem_exec_fair@basic-none@vcs0:
- shard-apl: [PASS][8] -> [FAIL][9] ([i915#2842])
[8]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_9681/shard-apl6/igt@gem_exec_fair@basic-none@vcs0.html
[9]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19495/shard-apl6/igt@gem_exec_fair@basic-none@vcs0.html
* igt@gem_exec_fair@basic-pace@rcs0:
- shard-kbl: [PASS][10] -> [FAIL][11] ([i915#2842]) +2 similar issues
[10]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_9681/shard-kbl7/igt@gem_exec_fair@basic-pace@rcs0.html
[11]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19495/shard-kbl6/igt@gem_exec_fair@basic-pace@rcs0.html
* igt@gem_exec_fair@basic-pace@vcs1:
- shard-iclb: NOTRUN -> [FAIL][12] ([i915#2842])
[12]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19495/shard-iclb1/igt@gem_exec_fair@basic-pace@vcs1.html
- shard-tglb: [PASS][13] -> [FAIL][14] ([i915#2842])
[13]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_9681/shard-tglb5/igt@gem_exec_fair@basic-pace@vcs1.html
[14]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19495/shard-tglb1/igt@gem_exec_fair@basic-pace@vcs1.html
* igt@gem_exec_reloc@basic-wide-active@vcs1:
- shard-iclb: NOTRUN -> [FAIL][15] ([i915#2389])
[15]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19495/shard-iclb2/igt@gem_exec_reloc@basic-wide-active@vcs1.html
* igt@gem_exec_schedule@u-fairslice@bcs0:
- shard-iclb: [PASS][16] -> [DMESG-WARN][17] ([i915#2803])
[16]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_9681/shard-iclb4/igt@gem_exec_schedule@u-fairslice@bcs0.html
[17]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19495/shard-iclb4/igt@gem_exec_schedule@u-fairslice@bcs0.html
* igt@gem_exec_schedule@u-fairslice@rcs0:
- shard-tglb: [PASS][18] -> [DMESG-WARN][19] ([i915#2803])
[18]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_9681/shard-tglb3/igt@gem_exec_schedule@u-fairslice@rcs0.html
[19]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19495/shard-tglb8/igt@gem_exec_schedule@u-fairslice@rcs0.html
* igt@gem_exec_whisper@basic-fds-forked-all:
- shard-glk: [PASS][20] -> [DMESG-WARN][21] ([i915#118] / [i915#95])
[20]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_9681/shard-glk3/igt@gem_exec_whisper@basic-fds-forked-all.html
[21]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19495/shard-glk2/igt@gem_exec_whisper@basic-fds-forked-all.html
* igt@i915_pm_rpm@modeset-lpsp-stress:
- shard-apl: NOTRUN -> [SKIP][22] ([fdo#109271]) +42 similar issues
[22]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19495/shard-apl2/igt@i915_pm_rpm@modeset-lpsp-stress.html
* igt@i915_suspend@debugfs-reader:
- shard-kbl: [PASS][23] -> [INCOMPLETE][24] ([i915#155])
[23]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_9681/shard-kbl7/igt@i915_suspend@debugfs-reader.html
[24]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19495/shard-kbl6/igt@i915_suspend@debugfs-reader.html
* igt@kms_chamelium@dp-hpd-for-each-pipe:
- shard-apl: NOTRUN -> [SKIP][25] ([fdo#109271] / [fdo#111827]) +3 similar issues
[25]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19495/shard-apl2/igt@kms_chamelium@dp-hpd-for-each-pipe.html
* igt@kms_flip@2x-plain-flip-ts-check@ac-hdmi-a1-hdmi-a2:
- shard-glk: [PASS][26] -> [FAIL][27] ([i915#2122])
[26]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_9681/shard-glk4/igt@kms_flip@2x-plain-flip-ts-check@ac-hdmi-a1-hdmi-a2.html
[27]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19495/shard-glk5/igt@kms_flip@2x-plain-flip-ts-check@ac-hdmi-a1-hdmi-a2.html
* igt@kms_pipe_crc_basic@read-crc-pipe-c:
- shard-kbl: [PASS][28] -> [DMESG-WARN][29] ([i915#180] / [i915#78])
[28]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_9681/shard-kbl4/igt@kms_pipe_crc_basic@read-crc-pipe-c.html
[29]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19495/shard-kbl2/igt@kms_pipe_crc_basic@read-crc-pipe-c.html
* igt@kms_psr2_sf@plane-move-sf-dmg-area-3:
- shard-apl: NOTRUN -> [SKIP][30] ([fdo#109271] / [i915#658])
[30]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19495/shard-apl2/igt@kms_psr2_sf@plane-move-sf-dmg-area-3.html
* igt@kms_psr@psr2_primary_blt:
- shard-iclb: [PASS][31] -> [SKIP][32] ([fdo#109441]) +1 similar issue
[31]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_9681/shard-iclb2/igt@kms_psr@psr2_primary_blt.html
[32]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19495/shard-iclb5/igt@kms_psr@psr2_primary_blt.html
#### Possible fixes ####
* igt@feature_discovery@psr2:
- shard-iclb: [SKIP][33] ([i915#658]) -> [PASS][34]
[33]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_9681/shard-iclb4/igt@feature_discovery@psr2.html
[34]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19495/shard-iclb2/igt@feature_discovery@psr2.html
* igt@gem_exec_fair@basic-none-share@rcs0:
- shard-iclb: [FAIL][35] ([i915#2842]) -> [PASS][36]
[35]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_9681/shard-iclb1/igt@gem_exec_fair@basic-none-share@rcs0.html
[36]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19495/shard-iclb3/igt@gem_exec_fair@basic-none-share@rcs0.html
* igt@gem_exec_fair@basic-none@vcs0:
- shard-kbl: [FAIL][37] ([i915#2842]) -> [PASS][38] +1 similar issue
[37]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_9681/shard-kbl1/igt@gem_exec_fair@basic-none@vcs0.html
[38]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19495/shard-kbl3/igt@gem_exec_fair@basic-none@vcs0.html
* igt@gem_exec_fair@basic-pace-share@rcs0:
- shard-tglb: [FAIL][39] ([i915#2842]) -> [PASS][40]
[39]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_9681/shard-tglb2/igt@gem_exec_fair@basic-pace-share@rcs0.html
[40]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19495/shard-tglb6/igt@gem_exec_fair@basic-pace-share@rcs0.html
- shard-glk: [FAIL][41] ([i915#2842]) -> [PASS][42]
[41]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_9681/shard-glk1/igt@gem_exec_fair@basic-pace-share@rcs0.html
[42]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19495/shard-glk3/igt@gem_exec_fair@basic-pace-share@rcs0.html
* igt@i915_pm_dc@dc6-psr:
- shard-iclb: [FAIL][43] ([i915#454]) -> [PASS][44]
[43]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_9681/shard-iclb4/igt@i915_pm_dc@dc6-psr.html
[44]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19495/shard-iclb7/igt@i915_pm_dc@dc6-psr.html
* igt@i915_pm_rpm@basic-rte:
- shard-kbl: [DMESG-WARN][45] ([i915#165] / [i915#180]) -> [PASS][46]
[45]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_9681/shard-kbl2/igt@i915_pm_rpm@basic-rte.html
[46]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19495/shard-kbl1/igt@i915_pm_rpm@basic-rte.html
* igt@kms_cursor_legacy@2x-flip-vs-cursor-atomic:
- shard-glk: [FAIL][47] ([i915#72]) -> [PASS][48]
[47]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_9681/shard-glk1/igt@kms_cursor_legacy@2x-flip-vs-cursor-atomic.html
[48]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19495/shard-glk3/igt@kms_cursor_legacy@2x-flip-vs-cursor-atomic.html
* igt@kms_flip@flip-vs-expired-vblank@a-dp1:
- shard-apl: [FAIL][49] ([i915#79]) -> [PASS][50]
[49]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_9681/shard-apl2/igt@kms_flip@flip-vs-expired-vblank@a-dp1.html
[50]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19495/shard-apl6/igt@kms_flip@flip-vs-expired-vblank@a-dp1.html
* igt@kms_flip@flip-vs-expired-vblank@a-edp1:
- shard-tglb: [FAIL][51] ([i915#2598]) -> [PASS][52] +1 similar issue
[51]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_9681/shard-tglb7/igt@kms_flip@flip-vs-expired-vblank@a-edp1.html
[52]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19495/shard-tglb5/igt@kms_flip@flip-vs-expired-vblank@a-edp1.html
* igt@kms_flip@flip-vs-suspend@a-dp1:
- shard-kbl: [DMESG-WARN][53] ([i915#165]) -> [PASS][54]
[53]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_9681/shard-kbl7/igt@kms_flip@flip-vs-suspend@a-dp1.html
[54]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19495/shard-kbl4/igt@kms_flip@flip-vs-suspend@a-dp1.html
* igt@kms_psr@psr2_primary_page_flip:
- shard-iclb: [SKIP][55] ([fdo#109441]) -> [PASS][56] +4 similar issues
[55]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_9681/shard-iclb4/igt@kms_psr@psr2_primary_page_flip.html
[56]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19495/shard-iclb2/igt@kms_psr@psr2_primary_page_flip.html
* {igt@sysfs_clients@busy@bcs0}:
- shard-kbl: [FAIL][57] -> [PASS][58]
[57]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_9681/shard-kbl1/igt@sysfs_clients@busy@bcs0.html
[58]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19495/shard-kbl1/igt@sysfs_clients@busy@bcs0.html
* {igt@sysfs_clients@sema-10@rcs0}:
- shard-apl: [SKIP][59] ([fdo#109271]) -> [PASS][60]
[59]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_9681/shard-apl7/igt@sysfs_clients@sema-10@rcs0.html
[60]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19495/shard-apl1/igt@sysfs_clients@sema-10@rcs0.html
* {igt@sysfs_clients@split-10@bcs0}:
- shard-glk: [SKIP][61] ([fdo#109271]) -> [PASS][62]
[61]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_9681/shard-glk8/igt@sysfs_clients@split-10@bcs0.html
[62]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19495/shard-glk8/igt@sysfs_clients@split-10@bcs0.html
#### Warnings ####
* igt@gem_exec_fair@basic-throttle@rcs0:
- shard-iclb: [FAIL][63] ([i915#2842]) -> [FAIL][64] ([i915#2849])
[63]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_9681/shard-iclb3/igt@gem_exec_fair@basic-throttle@rcs0.html
[64]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19495/shard-iclb8/igt@gem_exec_fair@basic-throttle@rcs0.html
* igt@i915_pm_rc6_residency@rc6-fence:
- shard-iclb: [WARN][65] ([i915#1804] / [i915#2684]) -> [WARN][66] ([i915#2684])
[65]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_9681/shard-iclb4/igt@i915_pm_rc6_residency@rc6-fence.html
[66]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19495/shard-iclb2/igt@i915_pm_rc6_residency@rc6-fence.html
* igt@i915_pm_rc6_residency@rc6-idle:
- shard-iclb: [WARN][67] ([i915#2681] / [i915#2684]) -> [WARN][68] ([i915#1804] / [i915#2684])
[67]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_9681/shard-iclb8/igt@i915_pm_rc6_residency@rc6-idle.html
[68]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19495/shard-iclb6/igt@i915_pm_rc6_residency@rc6-idle.html
* igt@kms_psr2_sf@overlay-plane-update-sf-dmg-area-1:
- shard-iclb: [SKIP][69] ([i915#2920]) -> [SKIP][70] ([i915#658]) +1 similar issue
[69]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_9681/shard-iclb2/igt@kms_psr2_sf@overlay-plane-update-sf-dmg-area-1.html
[70]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19495/shard-iclb1/igt@kms_psr2_sf@overlay-plane-update-sf-dmg-area-1.html
* igt@kms_psr2_sf@plane-move-sf-dmg-area-3:
- shard-iclb: [SKIP][71] ([i915#658]) -> [SKIP][72] ([i915#2920]) +1 similar issue
[71]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_9681/shard-iclb6/igt@kms_psr2_sf@plane-move-sf-dmg-area-3.html
[72]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19495/shard-iclb2/igt@kms_psr2_sf@plane-move-sf-dmg-area-3.html
* igt@perf_pmu@rc6-suspend:
- shard-kbl: [INCOMPLETE][73] ([i915#155]) -> [DMESG-WARN][74] ([i915#180])
[73]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_9681/shard-kbl3/igt@perf_pmu@rc6-suspend.html
[74]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19495/shard-kbl7/igt@perf_pmu@rc6-suspend.html
* igt@runner@aborted:
- shard-kbl: ([FAIL][75], [FAIL][76], [FAIL][77]) ([i915#2295] / [i915#2505]) -> ([FAIL][78], [FAIL][79], [FAIL][80], [FAIL][81]) ([i915#2292] / [i915#2295])
[75]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_9681/shard-kbl2/igt@runner@aborted.html
[76]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_9681/shard-kbl6/igt@runner@aborted.html
[77]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_9681/shard-kbl7/igt@runner@aborted.html
[78]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19495/shard-kbl4/igt@runner@aborted.html
[79]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19495/shard-kbl3/igt@runner@aborted.html
[80]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19495/shard-kbl7/igt@runner@aborted.html
[81]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19495/shard-kbl7/igt@runner@aborted.html
- shard-iclb: ([FAIL][82], [FAIL][83], [FAIL][84]) ([i915#2295] / [i915#2724]) -> ([FAIL][85], [FAIL][86], [FAIL][87], [FAIL][88]) ([i915#2295] / [i915#2426] / [i915#2724])
[82]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_9681/shard-iclb1/igt@runner@aborted.html
[83]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_9681/shard-iclb3/igt@runner@aborted.html
[84]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_9681/shard-iclb2/igt@runner@aborted.html
[85]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19495/shard-iclb3/igt@runner@aborted.html
[86]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19495/shard-iclb4/igt@runner@aborted.html
[87]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19495/shard-iclb1/igt@runner@aborted.html
[88]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19495/shard-iclb8/igt@runner@aborted.html
- shard-tglb: ([FAIL][89], [FAIL][90], [FAIL][91]) ([i915#1602] / [i915#2295] / [i915#2667]) -> ([FAIL][92], [FAIL][93], [FAIL][94], [FAIL][95]) ([i915#1602] / [i915#2295] / [i915#2426] / [i915#2667] / [i915#2803])
[89]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_9681/shard-tglb1/igt@runner@aborted.html
[90]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_9681/shard-tglb7/igt@runner@aborted.html
[91]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_9681/shard-tglb5/igt@runner@aborted.html
[92]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19495/shard-tglb5/igt@runner@aborted.html
[93]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19495/shard-tglb5/igt@runner@aborted.html
[94]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19495/shard-tglb8/igt@runner@aborted.html
[95]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19495/shard-tglb1/igt@runner@aborted.html
{name}: This element is suppressed. This means it is ignored when computing
the status of the difference (SUCCESS, WARNING, or FAILURE).
[fdo#109271]: https://bugs.freedesktop.org/show_bug.cgi?id=109271
[fdo#109441]: https://bugs.freedesktop.org/show_bug.cgi?id=109441
[fdo#111827]: https://bugs.freedesktop.org/show_bug.cgi?id=111827
[i915#118]: https://gitlab.freedesktop.org/drm/intel/issues/118
[i915#155]: https://gitlab.freedesktop.org/drm/intel/issues/155
[i915#1602]: https://gitlab.freedesktop.org/drm/intel/issues/1602
[i915#165]: https://gitlab.freedesktop.org/drm/intel/issues/165
[i915#180]: https://gitlab.freedesktop.org/drm/intel/issues/180
[i915#1804]: https://gitlab.freedesktop.org/drm/intel/issues/1804
[i915#2122]: https://gitlab.freedesktop.org/drm/intel/issues/2122
[i915#2292]: https://gitlab.freedesktop.org/drm/intel/issues/2292
[i915#2295]: https://gitlab.freedesktop.org/drm/intel/issues/2295
[i915#2389]: https://gitlab.freedesktop.org/drm/intel/issues/2389
[i915#2426]: https://gitlab.freedesktop.org/drm/intel/issues/2426
[i915#2505]: https://gitlab.freedesktop.org/drm/intel/issues/2505
[i915#2598]: https://gitlab.freedesktop.org/drm/intel/issues/2598
[i915#2667]: https://gitlab.freedesktop.org/drm/intel/issues/2667
[i915#2681]: https://gitlab.freedesktop.org/drm/intel/issues/2681
[i915#2684]: https://gitlab.freedesktop.org/drm/intel/issues/2684
[i915#2724]: https://gitlab.freedesktop.org/drm/intel/issues/2724
[i915#2803]: https://gitlab.freedesktop.org/drm/intel/issues/2803
[i915#2842]: https://gitlab.freedesktop.org/drm/intel/issues/2842
[i915#2846]: https://gitlab.freedesktop.org/drm/intel/issues/2846
[i915#2849]: https://gitlab.freedesktop.org/drm/intel/issues/2849
[i915#2920]: https://gitlab.freedesktop.org/drm/intel/issues/2920
[i915#454]: https://gitlab.freedesktop.org/drm/intel/issues/454
[i915#658]: https://gitlab.freedesktop.org/drm/intel/issues/658
[i915#72]: https://gitlab.freedesktop.org/drm/intel/issues/72
[i915#78]: https://gitlab.freedesktop.org/drm/intel/issues/78
[i915#79]: https://gitlab.freedesktop.org/drm/intel/issues/79
[i915#95]: https://gitlab.freedesktop.org/drm/intel/issues/95
Participating hosts (10 -> 10)
------------------------------
No changes in participating hosts
Build changes
-------------
* Linux: CI_DRM_9681 -> Patchwork_19495
CI-20190529: 20190529
CI_DRM_9681: 1e6338c3a84443c647b7bccf812164f3376f11f9 @ git://anongit.freedesktop.org/gfx-ci/linux
IGT_5971: abef2b7d6ff30f3b948b3e5d39653debb73083f3 @ git://anongit.freedesktop.org/xorg/app/intel-gpu-tools
Patchwork_19495: 1d3fdd58287cf8f4b111636bd85d9e9058e6fdea @ git://anongit.freedesktop.org/gfx-ci/linux
piglit_4509: fdc5a4ca11124ab8413c7988896eec4c97336694 @ git://anongit.freedesktop.org/piglit
== Logs ==
For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19495/index.html
* [PATCH] drm/i915: Disable atomics in L3 for gen9
@ 2019-07-20 14:31 Chris Wilson
2019-07-22 11:41 ` Tvrtko Ursulin
0 siblings, 1 reply; 7+ messages in thread
From: Chris Wilson @ 2019-07-20 14:31 UTC
To: intel-gfx
Enabling atomic operations in L3 leads to unrecoverable GPU hangs, as
the machine stops responding milliseconds after receipt of the reset
request [GDRT]. By disabling the cached atomics, the hangs do not occur
and we presume the GPU would reset normally for similar hangs.
Reported-by: Jason Ekstrand <jason@jlekstrand.net>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110998
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Jason Ekstrand <jason@jlekstrand.net>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
---
Jason reports that Windows is not clearing L3SQCREG4:22 and does not
suffer the same GPU hang so it is likely some other w/a that interacts
badly. Fwiw, these 3 are the only registers I could find that mention
atomic ops (and appear to be part of the same chain for memory access).
---
drivers/gpu/drm/i915/gt/intel_workarounds.c | 8 ++++++++
drivers/gpu/drm/i915/i915_reg.h | 7 +++++++
2 files changed, 15 insertions(+)
diff --git a/drivers/gpu/drm/i915/gt/intel_workarounds.c b/drivers/gpu/drm/i915/gt/intel_workarounds.c
index 704ace01e7f5..ac94ed3ba7b6 100644
--- a/drivers/gpu/drm/i915/gt/intel_workarounds.c
+++ b/drivers/gpu/drm/i915/gt/intel_workarounds.c
@@ -1349,6 +1349,14 @@ rcs_engine_wa_init(struct intel_engine_cs *engine, struct i915_wa_list *wal)
 		wa_write_or(wal,
 			    GEN8_L3SQCREG4,
 			    GEN8_LQSC_FLUSH_COHERENT_LINES);
+
+		/* Disable atomics in L3 to prevent unrecoverable hangs */
+		wa_write_masked_or(wal, GEN9_SCRATCH_LNCF1,
+				   GEN9_LNCF_NONIA_COHERENT_ATOMICS_ENABLE, 0);
+		wa_write_masked_or(wal, GEN8_L3SQCREG4,
+				   GEN8_LQSQ_NONIA_COHERENT_ATOMICS_ENABLE, 0);
+		wa_write_masked_or(wal, GEN9_SCRATCH1,
+				   EVICTION_PERF_FIX_ENABLE, 0);
 	}
 }
diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
index 24f2a52a2b42..e23b2200e7fc 100644
--- a/drivers/gpu/drm/i915/i915_reg.h
+++ b/drivers/gpu/drm/i915/i915_reg.h
@@ -7728,6 +7728,7 @@ enum {
#define GEN11_LQSC_CLEAN_EVICT_DISABLE (1 << 6)
#define GEN8_LQSC_RO_PERF_DIS (1 << 27)
#define GEN8_LQSC_FLUSH_COHERENT_LINES (1 << 21)
+#define GEN8_LQSQ_NONIA_COHERENT_ATOMICS_ENABLE REG_BIT(22)
/* GEN8 chicken */
#define HDC_CHICKEN0 _MMIO(0x7300)
@@ -11202,6 +11203,12 @@ enum skl_power_gate {
/* Media decoder 2 MOCS registers */
#define GEN11_MFX2_MOCS(i) _MMIO(0x10000 + (i) * 4)
+#define GEN9_SCRATCH_LNCF1 _MMIO(0xb008)
+#define GEN9_LNCF_NONIA_COHERENT_ATOMICS_ENABLE REG_BIT(0)
+
+#define GEN9_SCRATCH1 _MMIO(0xb11c)
+#define EVICTION_PERF_FIX_ENABLE REG_BIT(8)
+
#define GEN10_SCRATCH_LNCF2 _MMIO(0xb0a0)
#define PMFLUSHDONE_LNICRSDROP (1 << 20)
#define PMFLUSH_GAPL3UNBLOCK (1 << 21)
--
2.22.0
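
For readers following the register writes in the patch above: as far as
I can tell, wa_write_masked_or(wal, reg, mask, val) clears the bits in
mask and then ORs in val, so passing 0 as val simply clears the enable
bit. Below is a minimal userspace sketch of that read-modify-write, not
the i915 implementation. struct l3_regs, masked_or(),
disable_l3_atomics() and l3_atomics_consistent() are hypothetical names,
the register contents are simulated, and only the bit positions are
taken from the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Bit positions copied from the patch; everything else is illustrative. */
#define LNCF_NONIA_COHERENT_ATOMICS_ENABLE (UINT32_C(1) << 0)  /* GEN9_SCRATCH_LNCF1 */
#define LQSQ_NONIA_COHERENT_ATOMICS_ENABLE (UINT32_C(1) << 22) /* GEN8_L3SQCREG4 */
#define EVICTION_PERF_FIX_ENABLE           (UINT32_C(1) << 8)  /* GEN9_SCRATCH1 */

/* Hypothetical stand-in for the three registers touched by the patch. */
struct l3_regs {
	uint32_t scratch_lncf1;
	uint32_t l3sqcreg4;
	uint32_t scratch1;
};

/* Clear the bits in `clear`, then set the bits in `set` (assumed semantics). */
static uint32_t masked_or(uint32_t old, uint32_t clear, uint32_t set)
{
	assert((set & ~clear) == 0); /* only bits inside the mask may be set */
	return (old & ~clear) | set;
}

/* Mirror of the three writes in the patch: clear each enable bit. */
static void disable_l3_atomics(struct l3_regs *r)
{
	r->scratch_lncf1 = masked_or(r->scratch_lncf1,
				     LNCF_NONIA_COHERENT_ATOMICS_ENABLE, 0);
	r->l3sqcreg4 = masked_or(r->l3sqcreg4,
				 LQSQ_NONIA_COHERENT_ATOMICS_ENABLE, 0);
	r->scratch1 = masked_or(r->scratch1, EVICTION_PERF_FIX_ENABLE, 0);
}

/* Returns 1 when all three enable bits are in the same state. */
static int l3_atomics_consistent(const struct l3_regs *r)
{
	int a = !!(r->scratch_lncf1 & LNCF_NONIA_COHERENT_ATOMICS_ENABLE);
	int b = !!(r->l3sqcreg4 & LQSQ_NONIA_COHERENT_ATOMICS_ENABLE);
	int c = !!(r->scratch1 & EVICTION_PERF_FIX_ENABLE);

	return a == b && b == c;
}
```

Calling disable_l3_atomics() on a simulated register set leaves every
other bit untouched, which is the point of using a masked write rather
than a plain store.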
* Re: [PATCH] drm/i915: Disable atomics in L3 for gen9
2019-07-20 14:31 [PATCH] drm/i915: Disable atomics in L3 for gen9 Chris Wilson
@ 2019-07-22 11:41 ` Tvrtko Ursulin
2019-07-23 11:55 ` Chris Wilson
0 siblings, 1 reply; 7+ messages in thread
From: Tvrtko Ursulin @ 2019-07-22 11:41 UTC (permalink / raw)
To: Chris Wilson, intel-gfx
On 20/07/2019 15:31, Chris Wilson wrote:
> Enabling atomic operations in L3 leads to unrecoverable GPU hangs, as
> the machine stops responding milliseconds after receipt of the reset
> request [GDRT]. By disabling the cached atomics, the hangs do not occur
> and we presume the GPU would reset normally for similar hangs.
>
> Reported-by: Jason Ekstrand <jason@jlekstrand.net>
> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110998
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: Jason Ekstrand <jason@jlekstrand.net>
> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
> Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
> ---
> Jason reports that Windows is not clearing L3SQCREG4:22 and does not
> suffer the same GPU hang so it is likely some other w/a that interacts
> badly. Fwiw, these 3 are the only registers I could find that mention
> atomic ops (and appear to be part of the same chain for memory access).
Bit-toggling itself looks fine to me and matches what I could find in
the docs. (All three bits across three registers should be equal.)
What I am curious about is what are the other consequences of disabling
L3 atomics? Performance drop somewhere?
Regards,
Tvrtko
> ---
> drivers/gpu/drm/i915/gt/intel_workarounds.c | 8 ++++++++
> drivers/gpu/drm/i915/i915_reg.h | 7 +++++++
> 2 files changed, 15 insertions(+)
>
> diff --git a/drivers/gpu/drm/i915/gt/intel_workarounds.c b/drivers/gpu/drm/i915/gt/intel_workarounds.c
> index 704ace01e7f5..ac94ed3ba7b6 100644
> --- a/drivers/gpu/drm/i915/gt/intel_workarounds.c
> +++ b/drivers/gpu/drm/i915/gt/intel_workarounds.c
> @@ -1349,6 +1349,14 @@ rcs_engine_wa_init(struct intel_engine_cs *engine, struct i915_wa_list *wal)
> wa_write_or(wal,
> GEN8_L3SQCREG4,
> GEN8_LQSC_FLUSH_COHERENT_LINES);
> +
> + /* Disable atomics in L3 to prevent unrecoverable hangs */
> + wa_write_masked_or(wal, GEN9_SCRATCH_LNCF1,
> + GEN9_LNCF_NONIA_COHERENT_ATOMICS_ENABLE, 0);
> + wa_write_masked_or(wal, GEN8_L3SQCREG4,
> + GEN8_LQSQ_NONIA_COHERENT_ATOMICS_ENABLE, 0);
> + wa_write_masked_or(wal, GEN9_SCRATCH1,
> + EVICTION_PERF_FIX_ENABLE, 0);
> }
> }
>
> diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
> index 24f2a52a2b42..e23b2200e7fc 100644
> --- a/drivers/gpu/drm/i915/i915_reg.h
> +++ b/drivers/gpu/drm/i915/i915_reg.h
> @@ -7728,6 +7728,7 @@ enum {
> #define GEN11_LQSC_CLEAN_EVICT_DISABLE (1 << 6)
> #define GEN8_LQSC_RO_PERF_DIS (1 << 27)
> #define GEN8_LQSC_FLUSH_COHERENT_LINES (1 << 21)
> +#define GEN8_LQSQ_NONIA_COHERENT_ATOMICS_ENABLE REG_BIT(22)
>
> /* GEN8 chicken */
> #define HDC_CHICKEN0 _MMIO(0x7300)
> @@ -11202,6 +11203,12 @@ enum skl_power_gate {
> /* Media decoder 2 MOCS registers */
> #define GEN11_MFX2_MOCS(i) _MMIO(0x10000 + (i) * 4)
>
> +#define GEN9_SCRATCH_LNCF1 _MMIO(0xb008)
> +#define GEN9_LNCF_NONIA_COHERENT_ATOMICS_ENABLE REG_BIT(0)
> +
> +#define GEN9_SCRATCH1 _MMIO(0xb11c)
> +#define EVICTION_PERF_FIX_ENABLE REG_BIT(8)
> +
> #define GEN10_SCRATCH_LNCF2 _MMIO(0xb0a0)
> #define PMFLUSHDONE_LNICRSDROP (1 << 20)
> #define PMFLUSH_GAPL3UNBLOCK (1 << 21)
>
* Re: [PATCH] drm/i915: Disable atomics in L3 for gen9
2019-07-22 11:41 ` Tvrtko Ursulin
@ 2019-07-23 11:55 ` Chris Wilson
2019-07-23 22:19 ` Francisco Jerez
0 siblings, 1 reply; 7+ messages in thread
From: Chris Wilson @ 2019-07-23 11:55 UTC (permalink / raw)
To: Tvrtko Ursulin, intel-gfx
Quoting Tvrtko Ursulin (2019-07-22 12:41:36)
>
> On 20/07/2019 15:31, Chris Wilson wrote:
> > Enabling atomic operations in L3 leads to unrecoverable GPU hangs, as
> > the machine stops responding milliseconds after receipt of the reset
> > request [GDRT]. By disabling the cached atomics, the hangs do not occur
> > and we presume the GPU would reset normally for similar hangs.
> >
> > Reported-by: Jason Ekstrand <jason@jlekstrand.net>
> > Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110998
> > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> > Cc: Jason Ekstrand <jason@jlekstrand.net>
> > Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
> > Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
> > ---
> > Jason reports that Windows is not clearing L3SQCREG4:22 and does not
> > suffer the same GPU hang so it is likely some other w/a that interacts
> > badly. Fwiw, these 3 are the only registers I could find that mention
> > atomic ops (and appear to be part of the same chain for memory access).
>
> Bit-toggling itself looks fine to me and matches what I could find in
> the docs. (All three bits across three registers should be equal.)
>
> What I am curious about is what are the other consequences of disabling
> L3 atomics? Performance drop somewhere?
The test I have where it goes from dead to passing, that's a considerable
performance improvement ;)
I imagine not being able to use L3 for atomics is pretty dire, whether that
has any impact, I have no clue.
It is still very likely that we see this because we are doing something
wrong elsewhere.
-Chris
* Re: [PATCH] drm/i915: Disable atomics in L3 for gen9
2019-07-23 11:55 ` Chris Wilson
@ 2019-07-23 22:19 ` Francisco Jerez
2019-07-24 14:34 ` Chris Wilson
0 siblings, 1 reply; 7+ messages in thread
From: Francisco Jerez @ 2019-07-23 22:19 UTC (permalink / raw)
To: Chris Wilson, Tvrtko Ursulin, intel-gfx
Chris Wilson <chris@chris-wilson.co.uk> writes:
> Quoting Tvrtko Ursulin (2019-07-22 12:41:36)
>>
>> On 20/07/2019 15:31, Chris Wilson wrote:
>> > Enabling atomic operations in L3 leads to unrecoverable GPU hangs, as
>> > the machine stops responding milliseconds after receipt of the reset
>> > request [GDRT]. By disabling the cached atomics, the hangs do not occur
>> > and we presume the GPU would reset normally for similar hangs.
>> >
>> > Reported-by: Jason Ekstrand <jason@jlekstrand.net>
>> > Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110998
>> > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
>> > Cc: Jason Ekstrand <jason@jlekstrand.net>
>> > Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
>> > Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
>> > ---
>> > Jason reports that Windows is not clearing L3SQCREG4:22 and does not
>> > suffer the same GPU hang so it is likely some other w/a that interacts
>> > badly. Fwiw, these 3 are the only registers I could find that mention
>> > atomic ops (and appear to be part of the same chain for memory access).
>>
>> Bit-toggling itself looks fine to me and matches what I could find in
>> the docs. (All three bits across three registers should be equal.)
>>
>> What I am curious about is what are the other consequences of disabling
>> L3 atomics? Performance drop somewhere?
>
> The test I have where it goes from dead to passing, that's a considerable
> performance improvement ;)
>
> I imagine not being able to use L3 for atomics is pretty dire, whether that
> has any impact, I have no clue.
>
> It is still very likely that we see this because we are doing something
> wrong elsewhere.
This reminds me of f3fc4884ebe6ae649d3723be14b219230d3b7fd2 followed by
d351f6d94893f3ba98b1b20c5ef44c35fc1da124 due to the massive impact (of
the order of 20x IIRC) using the L3 turned out to have on the
performance of HDC atomics, on at least that platform. It seems
unfortunate that we're going to lose L3 atomics on Gen9 now, even though
it's only buffer atomics which are broken IIUC, and even though the
Windows driver is somehow getting away without disabling them. Some of
our setup must be wrong either in the kernel or in userspace... Are
these registers at least whitelisted so userspace can re-enable L3
atomics once the problem is addressed? Wouldn't it be a more specific
workaround for userspace to simply use a non-L3-cacheable MOCS for
(rarely used) buffer surfaces, so it could benefit from L3 atomics
elsewhere?
> -Chris
* Re: [PATCH] drm/i915: Disable atomics in L3 for gen9
2019-07-23 22:19 ` Francisco Jerez
@ 2019-07-24 14:34 ` Chris Wilson
2019-07-24 20:02 ` Francisco Jerez
0 siblings, 1 reply; 7+ messages in thread
From: Chris Wilson @ 2019-07-24 14:34 UTC (permalink / raw)
To: Francisco Jerez, Tvrtko Ursulin, intel-gfx
Quoting Francisco Jerez (2019-07-23 23:19:13)
> Chris Wilson <chris@chris-wilson.co.uk> writes:
>
> > Quoting Tvrtko Ursulin (2019-07-22 12:41:36)
> >>
> >> On 20/07/2019 15:31, Chris Wilson wrote:
> >> > Enabling atomic operations in L3 leads to unrecoverable GPU hangs, as
> >> > the machine stops responding milliseconds after receipt of the reset
> >> > request [GDRT]. By disabling the cached atomics, the hangs do not occur
> >> > and we presume the GPU would reset normally for similar hangs.
> >> >
> >> > Reported-by: Jason Ekstrand <jason@jlekstrand.net>
> >> > Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110998
> >> > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> >> > Cc: Jason Ekstrand <jason@jlekstrand.net>
> >> > Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
> >> > Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
> >> > ---
> >> > Jason reports that Windows is not clearing L3SQCREG4:22 and does not
> >> > suffer the same GPU hang so it is likely some other w/a that interacts
> >> > badly. Fwiw, these 3 are the only registers I could find that mention
> >> > atomic ops (and appear to be part of the same chain for memory access).
> >>
> >> Bit-toggling itself looks fine to me and matches what I could find in
> >> the docs. (All three bits across three registers should be equal.)
> >>
> >> What I am curious about is what are the other consequences of disabling
> >> L3 atomics? Performance drop somewhere?
> >
> > The test I have where it goes from dead to passing, that's a considerable
> > performance improvement ;)
> >
> > I imagine not being able to use L3 for atomics is pretty dire, whether that
> > has any impact, I have no clue.
> >
> > It is still very likely that we see this because we are doing something
> > wrong elsewhere.
>
> This reminds me of f3fc4884ebe6ae649d3723be14b219230d3b7fd2 followed by
> d351f6d94893f3ba98b1b20c5ef44c35fc1da124 due to the massive impact (of
> the order of 20x IIRC) using the L3 turned out to have on the
> performance of HDC atomics, on at least that platform. It seems
> unfortunate that we're going to lose L3 atomics on Gen9 now, even though
> it's only buffer atomics which are broken IIUC, and even though the
> Windows driver is somehow getting away without disabling them. Some of
> our setup must be wrong either in the kernel or in userspace... Are
> these registers at least whitelisted so userspace can re-enable L3
> atomics once the problem is addressed? Wouldn't it be a more specific
> workaround for userspace to simply use a non-L3-cacheable MOCS for
> (rarely used) buffer surfaces, so it could benefit from L3 atomics
> elsewhere?
If it was the case that disabling L3 atomics was the only way to prevent
the machine lockup under this scenario, then I think it is
unquestionably the right thing to do, and we could not leave it to
userspace to dtrt. We should never add non-context saved unsafe
registers to the whitelist (if setting a register may cause data
corruption or worse in another context/process, that is bad) despite our
repeated transgressions. However, there's no evidence to say that it does
prevent the machine lockup as it prevents the GPU hang that led to the
lockup on reset.
Other than GPGPU requiring a flush around every sneeze, I did not see
anything in the gen9 w/a list that seemed like a match. Nevertheless, I
expect there is a more precise w/a than a blanket disable.
-Chris
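
The whitelisting rule stated above can be written down as a small
predicate. This is only a sketch of the policy as described in this
mail, not of any i915 code; struct reg_traits and may_whitelist() are
hypothetical names, and how a register would be classified as "unsafe"
is left open.

```c
#include <stdbool.h>

/* Hypothetical description of a register, for illustration only. */
struct reg_traits {
	bool context_saved; /* value is saved and restored per context */
	bool unsafe;        /* a bad value can hurt other contexts or the system */
};

/*
 * Policy sketch: a register that is both unsafe and not context saved
 * must never be exposed to userspace; anything else may be considered.
 */
static bool may_whitelist(struct reg_traits t)
{
	return t.context_saved || !t.unsafe;
}
```

Under that reading, the three registers in this patch would fall in the
rejected class if they are global rather than per-context, which is the
concern raised above.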
* Re: [PATCH] drm/i915: Disable atomics in L3 for gen9
2019-07-24 14:34 ` Chris Wilson
@ 2019-07-24 20:02 ` Francisco Jerez
2020-11-09 19:52 ` [Intel-gfx] " Jason Ekstrand
0 siblings, 1 reply; 7+ messages in thread
From: Francisco Jerez @ 2019-07-24 20:02 UTC (permalink / raw)
To: Chris Wilson, Tvrtko Ursulin, intel-gfx
Chris Wilson <chris@chris-wilson.co.uk> writes:
> Quoting Francisco Jerez (2019-07-23 23:19:13)
>> Chris Wilson <chris@chris-wilson.co.uk> writes:
>>
>> > Quoting Tvrtko Ursulin (2019-07-22 12:41:36)
>> >>
>> >> On 20/07/2019 15:31, Chris Wilson wrote:
>> >> > Enabling atomic operations in L3 leads to unrecoverable GPU hangs, as
>> >> > the machine stops responding milliseconds after receipt of the reset
>> >> > request [GDRT]. By disabling the cached atomics, the hangs do not occur
>> >> > and we presume the GPU would reset normally for similar hangs.
>> >> >
>> >> > Reported-by: Jason Ekstrand <jason@jlekstrand.net>
>> >> > Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110998
>> >> > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
>> >> > Cc: Jason Ekstrand <jason@jlekstrand.net>
>> >> > Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
>> >> > Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
>> >> > ---
>> >> > Jason reports that Windows is not clearing L3SQCREG4:22 and does not
>> >> > suffer the same GPU hang so it is likely some other w/a that interacts
>> >> > badly. Fwiw, these 3 are the only registers I could find that mention
>> >> > atomic ops (and appear to be part of the same chain for memory access).
>> >>
>> >> Bit-toggling itself looks fine to me and matches what I could find in
>> >> the docs. (All three bits across three registers should be equal.)
>> >>
>> >> What I am curious about is what are the other consequences of disabling
>> >> L3 atomics? Performance drop somewhere?
>> >
>> > The test I have where it goes from dead to passing, that's a considerable
>> > performance improvement ;)
>> >
>> > I imagine not being able to use L3 for atomics is pretty dire, whether that
>> > has any impact, I have no clue.
>> >
>> > It is still very likely that we see this because we are doing something
>> > wrong elsewhere.
>>
>> This reminds me of f3fc4884ebe6ae649d3723be14b219230d3b7fd2 followed by
>> d351f6d94893f3ba98b1b20c5ef44c35fc1da124 due to the massive impact (of
>> the order of 20x IIRC) using the L3 turned out to have on the
>> performance of HDC atomics, on at least that platform. It seems
>> unfortunate that we're going to lose L3 atomics on Gen9 now, even though
>> it's only buffer atomics which are broken IIUC, and even though the
>> Windows driver is somehow getting away without disabling them. Some of
>> our setup must be wrong either in the kernel or in userspace... Are
>> these registers at least whitelisted so userspace can re-enable L3
>> atomics once the problem is addressed? Wouldn't it be a more specific
>> workaround for userspace to simply use a non-L3-cacheable MOCS for
>> (rarely used) buffer surfaces, so it could benefit from L3 atomics
>> elsewhere?
>
> If it was the case that disabling L3 atomics was the only way to prevent
> the machine lockup under this scenario, then I think it is
> unquestionably the right thing to do, and we could not leave it to
> userspace to dtrt. We should never add non-context saved unsafe
> registers to the whitelist (if setting a register may cause data
> corruption or worse in another context/process, that is bad) despite our
> repeated transgressions. However, there's no evidence to say that it does
> prevent the machine lockup as it prevents the GPU hang that led to the
> lockup on reset.
>
> Other than GPGPU requiring a flush around every sneeze, I did not see
> anything in the gen9 w/a list that seemed like a match. Nevertheless, I
> expect there is a more precise w/a than a blanket disable.
> -Chris
Supposedly there is a more precise one (setting the surface state MOCS
to UC for buffer images), but it relies on userspace doing the right
thing for the machine not to lock up. There is a good chance that the
reason why L3 atomics hang on such buffers is ultimately under userspace
control, in which case we'll eventually have to undo the programming
done in this patch in order to re-enable L3 atomics once the problem is
addressed. That means that userspace will have the freedom to hang the
machine hard once again, which sounds really bad, but it's no real news
for us (*cough* HSW *cough*), and it might be the only way to match the
performance of the Windows driver.
What can we do here? Add an i915 option to enable performance features
that can lead to the system hanging hard under malicious (or
incompetent) userspace programming? Probably only the user can tell
whether the trade-off between performance and security of the system is
acceptable...
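
To make the question concrete, an opt-in of that kind might look roughly
like the sketch below. It is purely illustrative: no such i915 option is
implied to exist, struct hypothetical_params, allow_unsafe_l3_atomics
and l3sqcreg4_policy() are made-up names, and the only detail taken from
the patch is that the enable bit is L3SQCREG4:22.

```c
#include <stdbool.h>
#include <stdint.h>

/* L3SQCREG4 bit 22, as defined in the patch. */
#define L3_ATOMICS_ENABLE (UINT32_C(1) << 22)

/* Hypothetical user-controlled knob, default off. */
struct hypothetical_params {
	bool allow_unsafe_l3_atomics;
};

/*
 * Return the value the driver would program: keep the hardware default
 * only when the user explicitly accepted the risk, otherwise clear the
 * enable bit as the patch does.
 */
static uint32_t l3sqcreg4_policy(uint32_t hw_default,
				 struct hypothetical_params p)
{
	if (p.allow_unsafe_l3_atomics)
		return hw_default;

	return hw_default & ~L3_ATOMICS_ENABLE;
}
```

The default stays on the safe side, so the performance path is only
taken when the user has opted in.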
* Re: [Intel-gfx] [PATCH] drm/i915: Disable atomics in L3 for gen9
2019-07-24 20:02 ` Francisco Jerez
@ 2020-11-09 19:52 ` Jason Ekstrand
2020-11-09 20:15 ` Chris Wilson
0 siblings, 1 reply; 7+ messages in thread
From: Jason Ekstrand @ 2020-11-09 19:52 UTC (permalink / raw)
To: Francisco Jerez; +Cc: Intel GFX, Marcin Ślusarz, Chris Wilson
We need to land this patch. The number of bugs we have piling up in
Mesa gitlab related to this is getting a lot larger than I'd like.
I've gone back and forth with various HW and SW people internally for
countless e-mail threads and there is no other good workaround. Yes,
the perf hit to atomics sucks but, fortunately, most games don't use
them heavily enough for it to make a significant impact. We should
just eat the perf hit and fix the hangs.
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
--Jason
* Re: [Intel-gfx] [PATCH] drm/i915: Disable atomics in L3 for gen9
2020-11-09 19:52 ` [Intel-gfx] " Jason Ekstrand
@ 2020-11-09 20:15 ` Chris Wilson
2020-11-09 20:48 ` Ville Syrjälä
0 siblings, 1 reply; 7+ messages in thread
From: Chris Wilson @ 2020-11-09 20:15 UTC (permalink / raw)
To: Francisco Jerez, Jason Ekstrand; +Cc: Intel GFX, Marcin Ślusarz
Quoting Jason Ekstrand (2020-11-09 19:52:26)
> We need to land this patch. The number of bugs we have piling up in
> Mesa gitlab related to this is getting a lot larger than I'd like.
> I've gone back and forth with various HW and SW people internally for
> countless e-mail threads and there is no other good workaround. Yes,
> the perf hit to atomics sucks but, fortunately, most games don't use
> them heavily enough for it to make a significant impact. We should
> just eat the perf hit and fix the hangs.
Drat, I thought you had found an alternative fix in the
bad GEN9_DISABLE_OCL_OOB_SUPPRESS_LOGIC w/a.
So be it.
-Chris
* Re: [Intel-gfx] [PATCH] drm/i915: Disable atomics in L3 for gen9
2020-11-09 20:15 ` Chris Wilson
@ 2020-11-09 20:48 ` Ville Syrjälä
0 siblings, 0 replies; 7+ messages in thread
From: Ville Syrjälä @ 2020-11-09 20:48 UTC (permalink / raw)
To: Chris Wilson; +Cc: Intel GFX, Marcin Ślusarz
On Mon, Nov 09, 2020 at 08:15:05PM +0000, Chris Wilson wrote:
> Quoting Jason Ekstrand (2020-11-09 19:52:26)
> > We need to land this patch. The number of bugs we have piling up in
> > Mesa gitlab related to this is getting a lot larger than I'd like.
> > I've gone back and forth with various HW and SW people internally for
> > countless e-mail threads and there is no other good workaround. Yes,
> > the perf hit to atomics sucks but, fortunately, most games don't use
> > them heavily enough for it to make a significant impact. We should
> > just eat the perf hit and fix the hangs.
>
> Drat, I thought you had found an alternative fix in the
> bad GEN9_DISABLE_OCL_OOB_SUPPRESS_LOGIC w/a.
>
> So be it.
I don't suppose this could be just lack of programming the magic
MOCS entry for L3 evictions?
--- a/drivers/gpu/drm/i915/gt/intel_mocs.c
+++ b/drivers/gpu/drm/i915/gt/intel_mocs.c
@@ -132,6 +132,9 @@ static const struct drm_i915_mocs_entry skl_mocs_table[] = {
 	MOCS_ENTRY(I915_MOCS_CACHED,
 		   LE_3_WB | LE_TC_2_LLC_ELLC | LE_LRUM(3),
 		   L3_3_WB)
+	MOCS_ENTRY(63,
+		   LE_3_WB | LE_TC_1_LLC | LE_LRUM(3),
+		   L3_1_UC)
 };
/* NOTE: the LE_TGT_CACHE is not used on Broxton */
The code seems to claim we can't even program that on gen9, but there's
nothing in the current spec to back that up AFAICS.
--
Ville Syrjälä
Intel
Thread overview: 7+ messages
2021-01-25 21:52 [Intel-gfx] [CI] drm/i915: Disable atomics in L3 for gen9 Chris Wilson
2021-01-25 22:01 ` [Intel-gfx] [PATCH] " Chris Wilson
2021-01-25 23:41 ` [Intel-gfx] ✓ Fi.CI.BAT: success for drm/i915: Disable atomics in L3 for gen9 (rev4) Patchwork
2021-01-26 5:11 ` [Intel-gfx] ✗ Fi.CI.IGT: failure " Patchwork
-- strict thread matches above, loose matches on Subject: below --
2019-07-20 14:31 [PATCH] drm/i915: Disable atomics in L3 for gen9 Chris Wilson
2019-07-22 11:41 ` Tvrtko Ursulin
2019-07-23 11:55 ` Chris Wilson
2019-07-23 22:19 ` Francisco Jerez
2019-07-24 14:34 ` Chris Wilson
2019-07-24 20:02 ` Francisco Jerez
2020-11-09 19:52 ` [Intel-gfx] " Jason Ekstrand
2020-11-09 20:15 ` Chris Wilson
2020-11-09 20:48 ` Ville Syrjälä