[PATCH 1/2] drm/i915/gem: Replace reloc chain with terminator on error unwind

* [PATCH 1/2] drm/i915/gem: Replace reloc chain with terminator on error unwind
@ 2020-08-19 10:39 ` Chris Wilson
  0 siblings, 0 replies; 23+ messages in thread
From: Chris Wilson @ 2020-08-19 10:39 UTC (permalink / raw)
  To: intel-gfx; +Cc: Chris Wilson, Pavel Machek, Joonas Lahtinen, stable

If we hit an error during construction of the reloc chain, we need to
replace the chain into the next batch with the terminator so that upon
flushing the relocations so far, we do not execute a hanging batch.

Reported-by: Pavel Machek <pavel@ucw.cz>
Fixes: 964a9b0f611e ("drm/i915/gem: Use chained reloc batches")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Pavel Machek <pavel@ucw.cz>
Cc: <stable@vger.kernel.org> # v5.8+
---
 .../gpu/drm/i915/gem/i915_gem_execbuffer.c    | 31 ++++++++++---------
 1 file changed, 16 insertions(+), 15 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
index 24a1486d2dc5..a09f04eee417 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
@@ -972,21 +972,6 @@ static int reloc_gpu_chain(struct reloc_cache *cache)
 	if (err)
 		goto out_pool;
 
-	GEM_BUG_ON(cache->rq_size + RELOC_TAIL > PAGE_SIZE  / sizeof(u32));
-	cmd = cache->rq_cmd + cache->rq_size;
-	*cmd++ = MI_ARB_CHECK;
-	if (cache->gen >= 8)
-		*cmd++ = MI_BATCH_BUFFER_START_GEN8;
-	else if (cache->gen >= 6)
-		*cmd++ = MI_BATCH_BUFFER_START;
-	else
-		*cmd++ = MI_BATCH_BUFFER_START | MI_BATCH_GTT;
-	*cmd++ = lower_32_bits(batch->node.start);
-	*cmd++ = upper_32_bits(batch->node.start); /* Always 0 for gen<8 */
-	i915_gem_object_flush_map(cache->rq_vma->obj);
-	i915_gem_object_unpin_map(cache->rq_vma->obj);
-	cache->rq_vma = NULL;
-
 	err = intel_gt_buffer_pool_mark_active(pool, rq);
 	if (err == 0) {
 		i915_vma_lock(batch);
@@ -999,15 +984,31 @@ static int reloc_gpu_chain(struct reloc_cache *cache)
 	if (err)
 		goto out_pool;
 
+	GEM_BUG_ON(cache->rq_size + RELOC_TAIL > PAGE_SIZE  / sizeof(u32));
+	cmd = cache->rq_cmd + cache->rq_size;
+	*cmd++ = MI_ARB_CHECK;
+	if (cache->gen >= 8)
+		*cmd++ = MI_BATCH_BUFFER_START_GEN8;
+	else if (cache->gen >= 6)
+		*cmd++ = MI_BATCH_BUFFER_START;
+	else
+		*cmd++ = MI_BATCH_BUFFER_START | MI_BATCH_GTT;
+	*cmd++ = lower_32_bits(batch->node.start);
+	*cmd++ = upper_32_bits(batch->node.start); /* Always 0 for gen<8 */
+
 	cmd = i915_gem_object_pin_map(batch->obj,
 				      cache->has_llc ?
 				      I915_MAP_FORCE_WB :
 				      I915_MAP_FORCE_WC);
 	if (IS_ERR(cmd)) {
+		/* We will replace the BBS with BBE upon flushing the rq */
 		err = PTR_ERR(cmd);
 		goto out_pool;
 	}
 
+	i915_gem_object_flush_map(cache->rq_vma->obj);
+	i915_gem_object_unpin_map(cache->rq_vma->obj);
+
 	/* Return with batch mapping (cmd) still pinned */
 	cache->rq_cmd = cmd;
 	cache->rq_size = 0;
-- 
2.20.1
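
As a rough illustration of the idea in the commit message above, here is a
minimal userspace sketch in C. It is not the i915 code: the opcodes
(OP_CHAIN, OP_END), struct batch, and the helpers chain_to_next(),
emit_terminator() and run() are hypothetical names made up for this example.
It assumes the chain into the next batch is written first, and that if the
next batch cannot be mapped, the same slot is overwritten with a terminator.
The patch defers that replacement to the flush of the request; the sketch
does it immediately to stay short.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical opcodes for this sketch; not the real i915 encodings. */
enum { OP_NOOP = 0, OP_CHAIN = 1, OP_END = 2 };

#define BATCH_WORDS 8
#define TAIL_WORDS  2	/* space reserved for the chain or the terminator */

struct batch {
	uint32_t cmd[BATCH_WORDS];
	size_t used;		/* words already filled with commands */
};

/* Write a jump into batch 'next_index' at the tail of 'cur'. */
static void emit_chain(struct batch *cur, uint32_t next_index)
{
	assert(cur->used + TAIL_WORDS <= BATCH_WORDS);
	cur->cmd[cur->used] = OP_CHAIN;
	cur->cmd[cur->used + 1] = next_index;
}

/* Overwrite whatever sits at the tail of 'cur' with a terminator. */
static void emit_terminator(struct batch *cur)
{
	assert(cur->used + TAIL_WORDS <= BATCH_WORDS);
	cur->cmd[cur->used] = OP_END;
	cur->cmd[cur->used + 1] = 0;
}

/*
 * Chain 'cur' into the next batch. 'map_ok' stands in for the result of
 * mapping that next batch. On failure the chain is replaced by a
 * terminator so that the commands built so far still end cleanly.
 */
static int chain_to_next(struct batch *cur, uint32_t next_index, int map_ok)
{
	emit_chain(cur, next_index);
	if (!map_ok) {
		emit_terminator(cur);
		return -1;
	}
	return 0;
}

/*
 * Follow the chain the way a consumer of the batches would. Returns the
 * number of batches executed, or -1 if execution would run into a batch
 * that does not exist or never reaches a terminator (a "hang").
 */
static int run(const struct batch *batches, size_t count)
{
	size_t idx = 0;
	int executed;

	for (executed = 1; executed <= (int)count; executed++) {
		const struct batch *b;
		uint32_t op;

		if (idx >= count)
			return -1;
		b = &batches[idx];
		op = b->cmd[b->used];
		if (op == OP_END)
			return executed;
		if (op != OP_CHAIN)
			return -1;
		idx = b->cmd[b->used + 1];
	}
	return -1;
}
```

In this model, leaving the OP_CHAIN in place after a failed mapping would
make run() return -1, which corresponds to the hanging batch the commit
message wants to avoid.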


^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH 2/2] drm/i915/gem: Fallback to using a plain kmap if reloc address space is limited
  2020-08-19 10:39 ` [Intel-gfx] " Chris Wilson
@ 2020-08-19 10:39   ` Chris Wilson
  -1 siblings, 0 replies; 23+ messages in thread
From: Chris Wilson @ 2020-08-19 10:39 UTC (permalink / raw)
  To: intel-gfx; +Cc: Chris Wilson, Pavel Machek, Joonas Lahtinen, stable

Since the processor may not support vmap with WC, or the system may be
limited in virtual address space and so may fail to create such a vmap,
fallback to using a plain kmap of the system pages and flush the buffer
on completion.

Reported-by: Pavel Machek <pavel@ucw.cz>
Fixes: 964a9b0f611e ("drm/i915/gem: Use chained reloc batches")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Pavel Machek <pavel@ucw.cz>
Cc: <stable@vger.kernel.org> # v5.8+
---
 .../gpu/drm/i915/gem/i915_gem_execbuffer.c    | 25 +++++++++++++------
 1 file changed, 17 insertions(+), 8 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
index a09f04eee417..44df98d85b38 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
@@ -950,6 +950,21 @@ static void reloc_cache_init(struct reloc_cache *cache,
 
 #define RELOC_TAIL 4
 
+static u32 *__reloc_gpu_map(struct reloc_cache *cache,
+			    struct intel_gt_buffer_pool_node *pool)
+{
+	u32 *map;
+
+	map = i915_gem_object_pin_map(pool->obj,
+				      cache->has_llc ?
+				      I915_MAP_FORCE_WB :
+				      I915_MAP_FORCE_WC);
+	if (IS_ERR(map)) /* try a plain kmap (and flush) if no WC maps */
+		map = i915_gem_object_pin_map(pool->obj, I915_MAP_FORCE_WB);
+
+	return map;
+}
+
 static int reloc_gpu_chain(struct reloc_cache *cache)
 {
 	struct intel_gt_buffer_pool_node *pool;
@@ -996,10 +1011,7 @@ static int reloc_gpu_chain(struct reloc_cache *cache)
 	*cmd++ = lower_32_bits(batch->node.start);
 	*cmd++ = upper_32_bits(batch->node.start); /* Always 0 for gen<8 */
 
-	cmd = i915_gem_object_pin_map(batch->obj,
-				      cache->has_llc ?
-				      I915_MAP_FORCE_WB :
-				      I915_MAP_FORCE_WC);
+	cmd = __reloc_gpu_map(cache, pool);
 	if (IS_ERR(cmd)) {
 		/* We will replace the BBS with BBE upon flushing the rq */
 		err = PTR_ERR(cmd);
@@ -1096,10 +1108,7 @@ static int __reloc_gpu_alloc(struct i915_execbuffer *eb,
 	if (IS_ERR(pool))
 		return PTR_ERR(pool);
 
-	cmd = i915_gem_object_pin_map(pool->obj,
-				      cache->has_llc ?
-				      I915_MAP_FORCE_WB :
-				      I915_MAP_FORCE_WC);
+	cmd = __reloc_gpu_map(cache, pool);
 	if (IS_ERR(cmd)) {
 		err = PTR_ERR(cmd);
 		goto out_pool;
-- 
2.20.1
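
The fallback described in this patch can be sketched in userspace C as
follows. This is only an illustration under stated assumptions: struct
fake_obj, fake_pin_map(), reloc_map_sketch() and the MAP_WB/MAP_WC enum are
hypothetical stand-ins, and err_ptr()/is_err()/ptr_err() are small local
imitations of the kernel's ERR_PTR()/IS_ERR()/PTR_ERR() convention, not the
kernel helpers themselves.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Local imitation of the kernel's error-pointer convention. */
#define MAX_ERRNO 4095

static inline void *err_ptr(long err)
{
	return (void *)(intptr_t)err;
}

static inline int is_err(const void *p)
{
	return (uintptr_t)p >= (uintptr_t)-MAX_ERRNO;
}

static inline long ptr_err(const void *p)
{
	return (long)(intptr_t)p;
}

/* Hypothetical mapping types for this sketch. */
enum map_type { MAP_WB, MAP_WC };

/* Hypothetical object: backing storage plus a switch to fail WC maps. */
struct fake_obj {
	uint32_t pages[4];
	int wc_available;
};

/* Pretend to map the object; a WC request fails when WC is unavailable. */
static uint32_t *fake_pin_map(struct fake_obj *obj, enum map_type type)
{
	if (type == MAP_WC && !obj->wc_available)
		return err_ptr(-ENOMEM);
	return obj->pages;
}

/*
 * Try the preferred mapping first (WB with an LLC, otherwise WC) and
 * retry with a plain WB mapping if that fails. The caller still has to
 * check the result with is_err(), since the retry may fail as well.
 */
static uint32_t *reloc_map_sketch(struct fake_obj *obj, int has_llc)
{
	uint32_t *map;

	map = fake_pin_map(obj, has_llc ? MAP_WB : MAP_WC);
	if (is_err(map)) /* no WC mapping: retry with the plain mapping */
		map = fake_pin_map(obj, MAP_WB);

	return map;
}
```

As the comment in the patch notes, a write-back mapping used in place of WC
needs the buffer to be flushed on completion; the sketch leaves that step
out.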


^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [Intel-gfx] ✓ Fi.CI.BAT: success for series starting with [1/2] drm/i915/gem: Replace reloc chain with terminator on error unwind
  2020-08-19 10:39 ` [Intel-gfx] " Chris Wilson
  (?)
  (?)
@ 2020-08-19 11:48 ` Patchwork
  -1 siblings, 0 replies; 23+ messages in thread
From: Patchwork @ 2020-08-19 11:48 UTC (permalink / raw)
  To: Chris Wilson; +Cc: intel-gfx


== Series Details ==

Series: series starting with [1/2] drm/i915/gem: Replace reloc chain with terminator on error unwind
URL   : https://patchwork.freedesktop.org/series/80795/
State : success

== Summary ==

CI Bug Log - changes from CI_DRM_8907 -> Patchwork_18370
====================================================

Summary
-------

  **SUCCESS**

  No regressions found.

  External URL: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_18370/index.html

Known issues
------------

  Here are the changes found in Patchwork_18370 that come from known issues:

### IGT changes ###

#### Issues hit ####

  * igt@gem_exec_suspend@basic-s0:
    - fi-tgl-u2:          [PASS][1] -> [FAIL][2] ([i915#1888])
   [1]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8907/fi-tgl-u2/igt@gem_exec_suspend@basic-s0.html
   [2]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_18370/fi-tgl-u2/igt@gem_exec_suspend@basic-s0.html

  * igt@i915_module_load@reload:
    - fi-bxt-dsi:         [PASS][3] -> [DMESG-WARN][4] ([i915#1635] / [i915#1982])
   [3]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8907/fi-bxt-dsi/igt@i915_module_load@reload.html
   [4]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_18370/fi-bxt-dsi/igt@i915_module_load@reload.html

  * igt@i915_pm_rpm@basic-pci-d3-state:
    - fi-bsw-kefka:       [PASS][5] -> [DMESG-WARN][6] ([i915#1982])
   [5]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8907/fi-bsw-kefka/igt@i915_pm_rpm@basic-pci-d3-state.html
   [6]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_18370/fi-bsw-kefka/igt@i915_pm_rpm@basic-pci-d3-state.html

  * igt@i915_pm_rpm@module-reload:
    - fi-byt-j1900:       [PASS][7] -> [DMESG-WARN][8] ([i915#1982])
   [7]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8907/fi-byt-j1900/igt@i915_pm_rpm@module-reload.html
   [8]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_18370/fi-byt-j1900/igt@i915_pm_rpm@module-reload.html

  * igt@i915_selftest@live@execlists:
    - fi-icl-y:           [PASS][9] -> [INCOMPLETE][10] ([i915#2276])
   [9]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8907/fi-icl-y/igt@i915_selftest@live@execlists.html
   [10]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_18370/fi-icl-y/igt@i915_selftest@live@execlists.html

  * igt@i915_selftest@live@gt_lrc:
    - fi-tgl-u2:          [PASS][11] -> [DMESG-FAIL][12] ([i915#1233])
   [11]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8907/fi-tgl-u2/igt@i915_selftest@live@gt_lrc.html
   [12]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_18370/fi-tgl-u2/igt@i915_selftest@live@gt_lrc.html

  * igt@kms_flip@basic-flip-vs-wf_vblank@c-hdmi-a2:
    - fi-skl-guc:         [PASS][13] -> [DMESG-WARN][14] ([i915#2203])
   [13]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8907/fi-skl-guc/igt@kms_flip@basic-flip-vs-wf_vblank@c-hdmi-a2.html
   [14]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_18370/fi-skl-guc/igt@kms_flip@basic-flip-vs-wf_vblank@c-hdmi-a2.html

  
#### Possible fixes ####

  * igt@kms_cursor_legacy@basic-busy-flip-before-cursor-atomic:
    - fi-bsw-kefka:       [DMESG-WARN][15] ([i915#1982]) -> [PASS][16]
   [15]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8907/fi-bsw-kefka/igt@kms_cursor_legacy@basic-busy-flip-before-cursor-atomic.html
   [16]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_18370/fi-bsw-kefka/igt@kms_cursor_legacy@basic-busy-flip-before-cursor-atomic.html

  
#### Warnings ####

  * igt@gem_exec_suspend@basic-s0:
    - fi-kbl-x1275:       [DMESG-WARN][17] ([i915#1982] / [i915#62] / [i915#92] / [i915#95]) -> [DMESG-WARN][18] ([i915#62] / [i915#92] / [i915#95])
   [17]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8907/fi-kbl-x1275/igt@gem_exec_suspend@basic-s0.html
   [18]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_18370/fi-kbl-x1275/igt@gem_exec_suspend@basic-s0.html

  * igt@kms_chamelium@common-hpd-after-suspend:
    - fi-skl-6700k2:      [DMESG-WARN][19] ([i915#2203]) -> [INCOMPLETE][20] ([i915#2203])
   [19]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8907/fi-skl-6700k2/igt@kms_chamelium@common-hpd-after-suspend.html
   [20]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_18370/fi-skl-6700k2/igt@kms_chamelium@common-hpd-after-suspend.html

  * igt@kms_cursor_legacy@basic-flip-before-cursor-varying-size:
    - fi-kbl-x1275:       [DMESG-WARN][21] ([i915#62] / [i915#92]) -> [DMESG-WARN][22] ([i915#62] / [i915#92] / [i915#95]) +2 similar issues
   [21]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8907/fi-kbl-x1275/igt@kms_cursor_legacy@basic-flip-before-cursor-varying-size.html
   [22]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_18370/fi-kbl-x1275/igt@kms_cursor_legacy@basic-flip-before-cursor-varying-size.html

  * igt@kms_flip@basic-flip-vs-dpms@a-dp1:
    - fi-kbl-x1275:       [DMESG-WARN][23] ([i915#62] / [i915#92] / [i915#95]) -> [DMESG-WARN][24] ([i915#62] / [i915#92]) +2 similar issues
   [23]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8907/fi-kbl-x1275/igt@kms_flip@basic-flip-vs-dpms@a-dp1.html
   [24]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_18370/fi-kbl-x1275/igt@kms_flip@basic-flip-vs-dpms@a-dp1.html

  
  {name}: This element is suppressed. This means it is ignored when computing
          the status of the difference (SUCCESS, WARNING, or FAILURE).

  [i915#1233]: https://gitlab.freedesktop.org/drm/intel/issues/1233
  [i915#1635]: https://gitlab.freedesktop.org/drm/intel/issues/1635
  [i915#1888]: https://gitlab.freedesktop.org/drm/intel/issues/1888
  [i915#1982]: https://gitlab.freedesktop.org/drm/intel/issues/1982
  [i915#2203]: https://gitlab.freedesktop.org/drm/intel/issues/2203
  [i915#2276]: https://gitlab.freedesktop.org/drm/intel/issues/2276
  [i915#62]: https://gitlab.freedesktop.org/drm/intel/issues/62
  [i915#92]: https://gitlab.freedesktop.org/drm/intel/issues/92
  [i915#95]: https://gitlab.freedesktop.org/drm/intel/issues/95


Participating hosts (39 -> 34)
------------------------------

  Missing    (5): fi-ilk-m540 fi-hsw-4200u fi-byt-squawks fi-bsw-cyan fi-byt-clapper 


Build changes
-------------

  * Linux: CI_DRM_8907 -> Patchwork_18370

  CI-20190529: 20190529
  CI_DRM_8907: f9f7b73d0f125316a33e35f3315f3a5955079e33 @ git://anongit.freedesktop.org/gfx-ci/linux
  IGT_5769: 4e5f76be680b65780204668e302026cf638decc9 @ git://anongit.freedesktop.org/xorg/app/intel-gpu-tools
  Patchwork_18370: 4759f933e1d574c27656dfc1d2523148309ebb39 @ git://anongit.freedesktop.org/gfx-ci/linux


== Linux commits ==

4759f933e1d5 drm/i915/gem: Fallback to using a plain kmap if reloc address space is limited
93fe2eaecd76 drm/i915/gem: Replace reloc chain with terminator on error unwind

== Logs ==

For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_18370/index.html

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH 1/2] drm/i915/gem: Replace reloc chain with terminator on error unwind
  2020-08-19 10:39 ` [Intel-gfx] " Chris Wilson
@ 2020-08-19 17:23   ` Pavel Machek
  -1 siblings, 0 replies; 23+ messages in thread
From: Pavel Machek @ 2020-08-19 17:23 UTC (permalink / raw)
  To: Chris Wilson; +Cc: intel-gfx, Joonas Lahtinen, stable

Hi!

> If we hit an error during construction of the reloc chain, we need to
> replace the chain into the next batch with the terminator so that upon
> flushing the relocations so far, we do not execute a hanging batch.

Thanks for the patches. I assume this should fix the problem from the
"5.9-rc1: graphics regression moved from -next to mainline" thread.

I have applied them over current -next, and my machine seems to be
working so far (but uptime is less than 30 minutes).

If the machine still works tomorrow, I'll assume the problem is solved.

Best regards,
									Pavel
-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH 1/2] drm/i915/gem: Replace reloc chain with terminator on error unwind
  2020-08-19 17:23   ` [Intel-gfx] " Pavel Machek
@ 2020-08-19 17:36     ` Chris Wilson
  -1 siblings, 0 replies; 23+ messages in thread
From: Chris Wilson @ 2020-08-19 17:36 UTC (permalink / raw)
  To: Pavel Machek; +Cc: intel-gfx, Joonas Lahtinen, stable

Quoting Pavel Machek (2020-08-19 18:23:31)
> Hi!
> 
> > If we hit an error during construction of the reloc chain, we need to
> > replace the chain into the next batch with the terminator so that upon
> > flushing the relocations so far, we do not execute a hanging batch.
> 
> Thanks for the patches. I assume this should fix the problem from the
> "5.9-rc1: graphics regression moved from -next to mainline" thread.
> 
> I have applied them over current -next, and my machine seems to be
> working so far (but uptime is less than 30 minutes).
> 
> If the machine still works tomorrow, I'll assume the problem is solved.

Aye, best wait until we have to start competing with Chromium for
memory... The suspicion is that it was the resource allocation failure
path.
-Chris

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH 1/2] drm/i915/gem: Replace reloc chain with terminator on error unwind
  2020-08-19 17:36     ` [Intel-gfx] " Chris Wilson
@ 2020-08-19 19:33       ` Pavel Machek
  -1 siblings, 0 replies; 23+ messages in thread
From: Pavel Machek @ 2020-08-19 19:33 UTC (permalink / raw)
  To: Chris Wilson; +Cc: intel-gfx, Joonas Lahtinen, stable

Hi!

> > > If we hit an error during construction of the reloc chain, we need to
> > > replace the chain into the next batch with the terminator so that upon
> > > flushing the relocations so far, we do not execute a hanging batch.
> > 
> > Thanks for the patches. I assume this should fix the problem from the
> > "5.9-rc1: graphics regression moved from -next to mainline" thread.
> > 
> > I have applied them over current -next, and my machine seems to be
> > working so far (but uptime is less than 30 minutes).
> > 
> > If the machine still works tomorrow, I'll assume the problem is solved.
> 
> Aye, best wait until we have to start competing with Chromium for
> memory... The suspicion is that it was the resource allocation failure
> path.

Yep, my machines are low on memory.

But ... the test did not work that well. I have a dead X and a blinking
screen. The machine still works reasonably well over ssh, so I guess
that's an improvement.

Best regards,
							Pavel

[ 5604.909393] ACPI: EC: event unblocked
[ 5604.913590] usb usb2: root hub lost power or was reset
[ 5604.913812] usb usb3: root hub lost power or was reset
[ 5604.914046] usb usb4: root hub lost power or was reset
[ 5604.918812] ata6: port disabled--ignoring
[ 5604.925353] sd 0:0:0:0: [sda] Starting disk
[ 5605.150042] thinkpad_acpi: ACPI backlight control delay disabled
[ 5605.204955] ata1: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
[ 5605.205931] ata1.00: ACPI cmd ef/02:00:00:00:00:a0 (SET FEATURES) succeeded
[ 5605.205941] ata1.00: ACPI cmd f5/00:00:00:00:00:a0 (SECURITY FREEZE LOCK) filtered out
[ 5605.205949] ata1.00: ACPI cmd ef/10:03:00:00:00:a0 (SET FEATURES) filtered out
[ 5605.207748] ata1.00: ACPI cmd ef/02:00:00:00:00:a0 (SET FEATURES) succeeded
[ 5605.207757] ata1.00: ACPI cmd f5/00:00:00:00:00:a0 (SECURITY FREEZE LOCK) filtered out
[ 5605.207765] ata1.00: ACPI cmd ef/10:03:00:00:00:a0 (SET FEATURES) filtered out
[ 5605.208227] ata1.00: configured for UDMA/133
[ 5605.281913] usb 5-2: reset full-speed USB device number 3 using uhci_hcd
[ 5605.569752] usb 5-1: reset full-speed USB device number 2 using uhci_hcd
[ 5609.082771] PM: resume devices took 4.192 seconds
[ 5609.083380] OOM killer enabled.
[ 5609.083387] Restarting tasks ... done.
[ 5609.103164] video LNXVIDEO:00: Restoring backlight state
[ 5609.150144] PM: suspend exit
[ 5609.190535] sdhci-pci 0000:15:00.2: Will use DMA mode even though HW doesn't fully claim to support it.
[ 5609.239495] sdhci-pci 0000:15:00.2: Will use DMA mode even though HW doesn't fully claim to support it.
[ 5609.287144] sdhci-pci 0000:15:00.2: Will use DMA mode even though HW doesn't fully claim to support it.
[ 5609.344497] sdhci-pci 0000:15:00.2: Will use DMA mode even though HW doesn't fully claim to support it.
[ 5611.426855] wlan0: authenticate with 5c:f4:ab:10:d2:bb
[ 5611.430609] wlan0: send auth to 5c:f4:ab:10:d2:bb (try 1/3)
[ 5611.432552] wlan0: authenticated
[ 5611.433705] wlan0: associate with 5c:f4:ab:10:d2:bb (try 1/3)
[ 5611.436440] wlan0: RX AssocResp from 5c:f4:ab:10:d2:bb (capab=0x411 status=0 aid=1)
[ 5611.439083] wlan0: associated
[ 7744.718473] BUG: unable to handle page fault for address: f8c00000
[ 7744.718484] #PF: supervisor write access in kernel mode
[ 7744.718487] #PF: error_code(0x0002) - not-present page
[ 7744.718491] *pdpt = 0000000031b0b001 *pde = 0000000000000000 
[ 7744.718500] Oops: 0002 [#1] PREEMPT SMP PTI
[ 7744.718506] CPU: 0 PID: 3004 Comm: Xorg Not tainted 5.9.0-rc1-next-20200819+ #134
[ 7744.718509] Hardware name: LENOVO 17097HU/17097HU, BIOS 7BETD8WW (2.19 ) 03/31/2011
[ 7744.718518] EIP: eb_relocate_vma+0xdbf/0xf20
[ 7744.718523] Code: 48 74 8b 41 08 89 41 0c 8b 85 a4 fd ff ff 89 95 a0 fd ff ff e8 c2 12 6c 00 8b 95 a0 fd ff ff e9 03 fc ff ff 8b 85 d0 fd ff ff <c7> 03 01 00 40 10 89 43 04 8b 85 dc fd ff ff 89 43 08 e9 4a f6 ff
[ 7744.718527] EAX: 01397010 EBX: f8c00000 ECX: 01247000 EDX: 00000000
[ 7744.718531] ESI: f519cd80 EDI: f1ac1cd4 EBP: f1ac1c6c ESP: f1ac1a04
[ 7744.718535] DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068 EFLAGS: 00210246
[ 7744.718539] CR0: 80050033 CR2: f8c00000 CR3: 31ac2000 CR4: 000006b0
[ 7744.718543] Call Trace:
[ 7744.718553]  ? shmem_read_mapping_page_gfp+0x32/0x70
[ 7744.718560]  ? eb_lookup_vmas+0x272/0x9f0
[ 7744.718565]  i915_gem_do_execbuffer+0xa7b/0x2730
[ 7744.718573]  ? intel_runtime_pm_put_unchecked+0xd/0x10
[ 7744.718578]  ? i915_gem_gtt_pwrite_fast+0xf6/0x520
[ 7744.718586]  ? __lock_acquire.isra.0+0x223/0x500
[ 7744.718592]  ? cache_alloc_debugcheck_after+0x151/0x180
[ 7744.718596]  ? kvmalloc_node+0x69/0x80
[ 7744.718600]  ? __kmalloc+0x92/0x120
[ 7744.718604]  ? kvmalloc_node+0x69/0x80
[ 7744.718608]  i915_gem_execbuffer2_ioctl+0xdd/0x350
[ 7744.718613]  ? i915_gem_execbuffer_ioctl+0x2a0/0x2a0
[ 7744.718619]  drm_ioctl_kernel+0x91/0xe0
[ 7744.718623]  ? i915_gem_execbuffer_ioctl+0x2a0/0x2a0
[ 7744.718627]  drm_ioctl+0x1fd/0x371
[ 7744.718631]  ? i915_gem_execbuffer_ioctl+0x2a0/0x2a0
[ 7744.718639]  ? posix_get_monotonic_timespec+0x1d/0x80
[ 7744.718645]  ? __sys_recvmsg+0x37/0x80
[ 7744.718649]  ? drm_ioctl_kernel+0xe0/0xe0
[ 7744.718654]  __ia32_sys_ioctl+0x14b/0x7c6
[ 7744.718661]  ? exit_to_user_mode_prepare+0x53/0x100
[ 7744.718667]  do_int80_syscall_32+0x2c/0x40
[ 7744.718674]  entry_INT80_32+0x111/0x111
[ 7744.718678] EIP: 0xb7fd3092
[ 7744.718683] Code: 00 00 00 e9 90 ff ff ff ff a3 24 00 00 00 68 30 00 00 00 e9 80 ff ff ff ff a3 e8 ff ff ff 66 90 00 00 00 00 00 00 00 00 cd 80 <c3> 8d b4 26 00 00 00 00 8d b6 00 00 00 00 8b 1c 24 c3 8d b4 26 00
[ 7744.718687] EAX: ffffffda EBX: 0000000a ECX: c0406469 EDX: bfe67abc
[ 7744.718691] ESI: b73c1000 EDI: c0406469 EBP: 0000000a ESP: bfe67a34
[ 7744.718695] DS: 007b ES: 007b FS: 0000 GS: 0033 SS: 007b EFLAGS: 00200292
[ 7744.718700]  ? asm_exc_nmi+0xcc/0x2bc
[ 7744.718703] Modules linked in:
[ 7744.718709] CR2: 00000000f8c00000
[ 7744.718714] ---[ end trace 121f748dd4d0d6ec ]---
[ 7744.718719] EIP: eb_relocate_vma+0xdbf/0xf20
[ 7744.718723] Code: 48 74 8b 41 08 89 41 0c 8b 85 a4 fd ff ff 89 95 a0 fd ff ff e8 c2 12 6c 00 8b 95 a0 fd ff ff e9 03 fc ff ff 8b 85 d0 fd ff ff <c7> 03 01 00 40 10 89 43 04 8b 85 dc fd ff ff 89 43 08 e9 4a f6 ff
[ 7744.718727] EAX: 01397010 EBX: f8c00000 ECX: 01247000 EDX: 00000000
[ 7744.718731] ESI: f519cd80 EDI: f1ac1cd4 EBP: f1ac1c6c ESP: f1ac1a04
[ 7744.718735] DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068 EFLAGS: 00210246
[ 7744.718739] CR0: 80050033 CR2: f8c00000 CR3: 31ac2000 CR4: 000006b0
[ 7744.723687] BUG: unable to handle page fault for address: f8c02038
[ 7744.723695] #PF: supervisor write access in kernel mode
[ 7744.723699] #PF: error_code(0x0002) - not-present page
[ 7744.723702] *pdpt = 0000000031866001 *pde = 0000000000000000 
[ 7744.723711] Oops: 0002 [#2] PREEMPT SMP PTI
[ 7744.723717] CPU: 1 PID: 3004 Comm: Xorg Tainted: G      D           5.9.0-rc1-next-20200819+ #134
[ 7744.723720] Hardware name: LENOVO 17097HU/17097HU, BIOS 7BETD8WW (2.19 ) 03/31/2011
[ 7744.723728] EIP: n_tty_open+0x26/0x80
[ 7744.723733] Code: 00 00 00 90 55 89 e5 56 53 89 c3 b8 f0 22 00 00 e8 4f 39 cb ff 85 c0 74 62 89 c6 a1 00 2d 27 c5 b9 e8 2a 77 c5 ba 85 83 12 c5 <89> 46 38 8d 86 58 22 00 00 e8 8c 12 c0 ff 8d 86 a4 22 00 00 b9 e0
[ 7744.723738] EAX: 001c65c0 EBX: f2339000 ECX: c5772ae8 EDX: c5128385
[ 7744.723741] ESI: f8c02000 EDI: 00000000 EBP: f1ac1ee4 ESP: f1ac1edc
[ 7744.723745] DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068 EFLAGS: 00210286
[ 7744.723751] CR0: 80050033 CR2: f8c02038 CR3: 31864000 CR4: 000006b0
[ 7744.723755] Call Trace:
[ 7744.723763]  tty_ldisc_open.isra.0+0x23/0x40
[ 7744.723768]  tty_ldisc_reinit+0x99/0xe0
[ 7744.723772]  tty_ldisc_hangup+0xc4/0x1e0
[ 7744.723776]  __tty_hangup.part.0+0x13f/0x250
[ 7744.723781]  tty_vhangup_session+0x11/0x20
[ 7744.723786]  disassociate_ctty.part.0+0x34/0x230
[ 7744.723790]  disassociate_ctty+0x28/0x30
[ 7744.723797]  do_exit+0x456/0x960
[ 7744.723803]  ? exit_to_user_mode_prepare+0x53/0x100
[ 7744.723808]  rewind_stack_do_exit+0x11/0x13
[ 7744.723812] EIP: 0xb7fd3092
[ 7744.723815] Code: Bad RIP value.
[ 7744.723819] EAX: ffffffda EBX: 0000000a ECX: c0406469 EDX: bfe67abc
[ 7744.723823] ESI: b73c1000 EDI: c0406469 EBP: 0000000a ESP: bfe67a34
[ 7744.723827] DS: 007b ES: 007b FS: 0000 GS: 0033 SS: 007b EFLAGS: 00200292
[ 7744.723837]  ? asm_exc_nmi+0xcc/0x2bc
[ 7744.723839] Modules linked in:
[ 7744.723845] CR2: 00000000f8c02038
[ 7744.723851] ---[ end trace 121f748dd4d0d6ed ]---
[ 7744.723857] EIP: eb_relocate_vma+0xdbf/0xf20
[ 7744.723861] Code: 48 74 8b 41 08 89 41 0c 8b 85 a4 fd ff ff 89 95 a0 fd ff ff e8 c2 12 6c 00 8b 95 a0 fd ff ff e9 03 fc ff ff 8b 85 d0 fd ff ff <c7> 03 01 00 40 10 89 43 04 8b 85 dc fd ff ff 89 43 08 e9 4a f6 ff
[ 7744.723865] EAX: 01397010 EBX: f8c00000 ECX: 01247000 EDX: 00000000
[ 7744.723869] ESI: f519cd80 EDI: f1ac1cd4 EBP: f1ac1c6c ESP: f1ac1a04
[ 7744.723873] DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068 EFLAGS: 00210246
[ 7744.723877] CR0: 80050033 CR2: f8c02038 CR3: 31864000 CR4: 000006b0
[ 7744.723880] Fixing recursive fault but reboot is needed!
[ 7749.589011] i915 0000:00:02.0: [drm] GPU HANG: ecode 3:0:00000000
[ 7749.589024] GPU hangs can indicate a bug anywhere in the entire gfx stack, including userspace.
[ 7749.589030] Please file a _new_ bug report at https://gitlab.freedesktop.org/drm/intel/issues/new.
[ 7749.589036] Please see https://gitlab.freedesktop.org/drm/intel/-/wikis/How-to-file-i915-bugs for details.
[ 7749.589041] drm/i915 developers can then reassign to the right component if it's not a kernel issue.
[ 7749.589047] The GPU crash dump is required to analyze GPU hangs, so please always attach it.
[ 7749.589053] GPU crash dump saved to /sys/class/drm/card0/error
[ 7749.909841] i915 0000:00:02.0: [drm] Resetting chip for no heartbeat on rcs0
[ 7756.504232] i915 0000:00:02.0: [drm] GPU HANG: ecode 3:0:00000000
[ 7756.817879] i915 0000:00:02.0: [drm] Resetting chip for no heartbeat on rcs0
[ 7763.672921] i915 0000:00:02.0: [drm] GPU HANG: ecode 3:0:00000000
[ 7763.985882] i915 0000:00:02.0: [drm] Resetting chip for no heartbeat on rcs0
[ 7770.580999] i915 0000:00:02.0: [drm] GPU HANG: ecode 3:0:00000000
[ 7770.897884] i915 0000:00:02.0: [drm] Resetting chip for no heartbeat on rcs0
[ 7777.497036] i915 0000:00:02.0: [drm] GPU HANG: ecode 3:0:00000000
[ 7777.825882] i915 0000:00:02.0: [drm] Resetting chip for no heartbeat on rcs0
[ 7784.664999] i915 0000:00:02.0: [drm] GPU HANG: ecode 3:0:00000000


-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 195 bytes --]

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [Intel-gfx] [PATCH 1/2] drm/i915/gem: Replace reloc chain with terminator on error unwind
@ 2020-08-19 19:33       ` Pavel Machek
  0 siblings, 0 replies; 23+ messages in thread
From: Pavel Machek @ 2020-08-19 19:33 UTC (permalink / raw)
  To: Chris Wilson; +Cc: intel-gfx, stable


[-- Attachment #1.1: Type: text/plain, Size: 10751 bytes --]

Hi!

> > > If we hit an error during construction of the reloc chain, we need to
> > > replace the chain into the next batch with the terminator so that upon
> > > flushing the relocations so far, we do not execute a hanging batch.
> > 
> > Thanks for the patches. I assume this should fix problem from
> > "5.9-rc1: graphics regression moved from -next to mainline" thread.
> > 
> > I have applied them over current -next, and my machine seems to be
> > working so far (but uptime is less than 30 minutes).
> > 
> > If the machine still works tomorrow, I'll assume problem is solved.
> 
> Aye, best wait until we have to start competing with Chromium for
> memory... The suspicion is that it was the resource allocation failure
> path.

Yep, my machines are low on memory.

But ... test did not work that well. I have dead X and blinking
screen. Machine still works reasonably well over ssh, so I guess
that's an improvement.

Best regards,
							Pavel

[ 5604.909393] ACPI: EC: event unblocked
[ 5604.913590] usb usb2: root hub lost power or was reset
[ 5604.913812] usb usb3: root hub lost power or was reset
[ 5604.914046] usb usb4: root hub lost power or was reset
[ 5604.918812] ata6: port disabled--ignoring
[ 5604.925353] sd 0:0:0:0: [sda] Starting disk
[ 5605.150042] thinkpad_acpi: ACPI backlight control delay disabled
[ 5605.204955] ata1: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
[ 5605.205931] ata1.00: ACPI cmd ef/02:00:00:00:00:a0 (SET FEATURES) succeeded
[ 5605.205941] ata1.00: ACPI cmd f5/00:00:00:00:00:a0 (SECURITY FREEZE LOCK) filtered out
[ 5605.205949] ata1.00: ACPI cmd ef/10:03:00:00:00:a0 (SET FEATURES) filtered out
[ 5605.207748] ata1.00: ACPI cmd ef/02:00:00:00:00:a0 (SET FEATURES) succeeded
[ 5605.207757] ata1.00: ACPI cmd f5/00:00:00:00:00:a0 (SECURITY FREEZE LOCK) filtered out
[ 5605.207765] ata1.00: ACPI cmd ef/10:03:00:00:00:a0 (SET FEATURES) filtered out
[ 5605.208227] ata1.00: configured for UDMA/133
[ 5605.281913] usb 5-2: reset full-speed USB device number 3 using uhci_hcd
[ 5605.569752] usb 5-1: reset full-speed USB device number 2 using uhci_hcd
[ 5609.082771] PM: resume devices took 4.192 seconds
[ 5609.083380] OOM killer enabled.
[ 5609.083387] Restarting tasks ... done.
[ 5609.103164] video LNXVIDEO:00: Restoring backlight state
[ 5609.150144] PM: suspend exit
[ 5609.190535] sdhci-pci 0000:15:00.2: Will use DMA mode even though HW doesn't fully claim to support it.
[ 5609.239495] sdhci-pci 0000:15:00.2: Will use DMA mode even though HW doesn't fully claim to support it.
[ 5609.287144] sdhci-pci 0000:15:00.2: Will use DMA mode even though HW doesn't fully claim to support it.
[ 5609.344497] sdhci-pci 0000:15:00.2: Will use DMA mode even though HW doesn't fully claim to support it.
[ 5611.426855] wlan0: authenticate with 5c:f4:ab:10:d2:bb
[ 5611.430609] wlan0: send auth to 5c:f4:ab:10:d2:bb (try 1/3)
[ 5611.432552] wlan0: authenticated
[ 5611.433705] wlan0: associate with 5c:f4:ab:10:d2:bb (try 1/3)
[ 5611.436440] wlan0: RX AssocResp from 5c:f4:ab:10:d2:bb (capab=0x411 status=0 aid=1)
[ 5611.439083] wlan0: associated
[ 7744.718473] BUG: unable to handle page fault for address: f8c00000
[ 7744.718484] #PF: supervisor write access in kernel mode
[ 7744.718487] #PF: error_code(0x0002) - not-present page
[ 7744.718491] *pdpt = 0000000031b0b001 *pde = 0000000000000000 
[ 7744.718500] Oops: 0002 [#1] PREEMPT SMP PTI
[ 7744.718506] CPU: 0 PID: 3004 Comm: Xorg Not tainted 5.9.0-rc1-next-20200819+ #134
[ 7744.718509] Hardware name: LENOVO 17097HU/17097HU, BIOS 7BETD8WW (2.19 ) 03/31/2011
[ 7744.718518] EIP: eb_relocate_vma+0xdbf/0xf20
[ 7744.718523] Code: 48 74 8b 41 08 89 41 0c 8b 85 a4 fd ff ff 89 95 a0 fd ff ff e8 c2 12 6c 00 8b 95 a0 fd ff ff e9 03 fc ff ff 8b 85 d0 fd ff ff <c7> 03 01 00 40 10 89 43 04 8b 85 dc fd ff ff 89 43 08 e9 4a f6 ff
[ 7744.718527] EAX: 01397010 EBX: f8c00000 ECX: 01247000 EDX: 00000000
[ 7744.718531] ESI: f519cd80 EDI: f1ac1cd4 EBP: f1ac1c6c ESP: f1ac1a04
[ 7744.718535] DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068 EFLAGS: 00210246
[ 7744.718539] CR0: 80050033 CR2: f8c00000 CR3: 31ac2000 CR4: 000006b0
[ 7744.718543] Call Trace:
[ 7744.718553]  ? shmem_read_mapping_page_gfp+0x32/0x70
[ 7744.718560]  ? eb_lookup_vmas+0x272/0x9f0
[ 7744.718565]  i915_gem_do_execbuffer+0xa7b/0x2730
[ 7744.718573]  ? intel_runtime_pm_put_unchecked+0xd/0x10
[ 7744.718578]  ? i915_gem_gtt_pwrite_fast+0xf6/0x520
[ 7744.718586]  ? __lock_acquire.isra.0+0x223/0x500
[ 7744.718592]  ? cache_alloc_debugcheck_after+0x151/0x180
[ 7744.718596]  ? kvmalloc_node+0x69/0x80
[ 7744.718600]  ? __kmalloc+0x92/0x120
[ 7744.718604]  ? kvmalloc_node+0x69/0x80
[ 7744.718608]  i915_gem_execbuffer2_ioctl+0xdd/0x350
[ 7744.718613]  ? i915_gem_execbuffer_ioctl+0x2a0/0x2a0
[ 7744.718619]  drm_ioctl_kernel+0x91/0xe0
[ 7744.718623]  ? i915_gem_execbuffer_ioctl+0x2a0/0x2a0
[ 7744.718627]  drm_ioctl+0x1fd/0x371
[ 7744.718631]  ? i915_gem_execbuffer_ioctl+0x2a0/0x2a0
[ 7744.718639]  ? posix_get_monotonic_timespec+0x1d/0x80
[ 7744.718645]  ? __sys_recvmsg+0x37/0x80
[ 7744.718649]  ? drm_ioctl_kernel+0xe0/0xe0
[ 7744.718654]  __ia32_sys_ioctl+0x14b/0x7c6
[ 7744.718661]  ? exit_to_user_mode_prepare+0x53/0x100
[ 7744.718667]  do_int80_syscall_32+0x2c/0x40
[ 7744.718674]  entry_INT80_32+0x111/0x111
[ 7744.718678] EIP: 0xb7fd3092
[ 7744.718683] Code: 00 00 00 e9 90 ff ff ff ff a3 24 00 00 00 68 30 00 00 00 e9 80 ff ff ff ff a3 e8 ff ff ff 66 90 00 00 00 00 00 00 00 00 cd 80 <c3> 8d b4 26 00 00 00 00 8d b6 00 00 00 00 8b 1c 24 c3 8d b4 26 00
[ 7744.718687] EAX: ffffffda EBX: 0000000a ECX: c0406469 EDX: bfe67abc
[ 7744.718691] ESI: b73c1000 EDI: c0406469 EBP: 0000000a ESP: bfe67a34
[ 7744.718695] DS: 007b ES: 007b FS: 0000 GS: 0033 SS: 007b EFLAGS: 00200292
[ 7744.718700]  ? asm_exc_nmi+0xcc/0x2bc
[ 7744.718703] Modules linked in:
[ 7744.718709] CR2: 00000000f8c00000
[ 7744.718714] ---[ end trace 121f748dd4d0d6ec ]---
[ 7744.718719] EIP: eb_relocate_vma+0xdbf/0xf20
[ 7744.718723] Code: 48 74 8b 41 08 89 41 0c 8b 85 a4 fd ff ff 89 95 a0 fd ff ff e8 c2 12 6c 00 8b 95 a0 fd ff ff e9 03 fc ff ff 8b 85 d0 fd ff ff <c7> 03 01 00 40 10 89 43 04 8b 85 dc fd ff ff 89 43 08 e9 4a f6 ff
[ 7744.718727] EAX: 01397010 EBX: f8c00000 ECX: 01247000 EDX: 00000000
[ 7744.718731] ESI: f519cd80 EDI: f1ac1cd4 EBP: f1ac1c6c ESP: f1ac1a04
[ 7744.718735] DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068 EFLAGS: 00210246
[ 7744.718739] CR0: 80050033 CR2: f8c00000 CR3: 31ac2000 CR4: 000006b0
[ 7744.723687] BUG: unable to handle page fault for address: f8c02038
[ 7744.723695] #PF: supervisor write access in kernel mode
[ 7744.723699] #PF: error_code(0x0002) - not-present page
[ 7744.723702] *pdpt = 0000000031866001 *pde = 0000000000000000 
[ 7744.723711] Oops: 0002 [#2] PREEMPT SMP PTI
[ 7744.723717] CPU: 1 PID: 3004 Comm: Xorg Tainted: G      D           5.9.0-rc1-next-20200819+ #134
[ 7744.723720] Hardware name: LENOVO 17097HU/17097HU, BIOS 7BETD8WW (2.19 ) 03/31/2011
[ 7744.723728] EIP: n_tty_open+0x26/0x80
[ 7744.723733] Code: 00 00 00 90 55 89 e5 56 53 89 c3 b8 f0 22 00 00 e8 4f 39 cb ff 85 c0 74 62 89 c6 a1 00 2d 27 c5 b9 e8 2a 77 c5 ba 85 83 12 c5 <89> 46 38 8d 86 58 22 00 00 e8 8c 12 c0 ff 8d 86 a4 22 00 00 b9 e0
[ 7744.723738] EAX: 001c65c0 EBX: f2339000 ECX: c5772ae8 EDX: c5128385
[ 7744.723741] ESI: f8c02000 EDI: 00000000 EBP: f1ac1ee4 ESP: f1ac1edc
[ 7744.723745] DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068 EFLAGS: 00210286
[ 7744.723751] CR0: 80050033 CR2: f8c02038 CR3: 31864000 CR4: 000006b0
[ 7744.723755] Call Trace:
[ 7744.723763]  tty_ldisc_open.isra.0+0x23/0x40
[ 7744.723768]  tty_ldisc_reinit+0x99/0xe0
[ 7744.723772]  tty_ldisc_hangup+0xc4/0x1e0
[ 7744.723776]  __tty_hangup.part.0+0x13f/0x250
[ 7744.723781]  tty_vhangup_session+0x11/0x20
[ 7744.723786]  disassociate_ctty.part.0+0x34/0x230
[ 7744.723790]  disassociate_ctty+0x28/0x30
[ 7744.723797]  do_exit+0x456/0x960
[ 7744.723803]  ? exit_to_user_mode_prepare+0x53/0x100
[ 7744.723808]  rewind_stack_do_exit+0x11/0x13
[ 7744.723812] EIP: 0xb7fd3092
[ 7744.723815] Code: Bad RIP value.
[ 7744.723819] EAX: ffffffda EBX: 0000000a ECX: c0406469 EDX: bfe67abc
[ 7744.723823] ESI: b73c1000 EDI: c0406469 EBP: 0000000a ESP: bfe67a34
[ 7744.723827] DS: 007b ES: 007b FS: 0000 GS: 0033 SS: 007b EFLAGS: 00200292
[ 7744.723837]  ? asm_exc_nmi+0xcc/0x2bc
[ 7744.723839] Modules linked in:
[ 7744.723845] CR2: 00000000f8c02038
[ 7744.723851] ---[ end trace 121f748dd4d0d6ed ]---
[ 7744.723857] EIP: eb_relocate_vma+0xdbf/0xf20
[ 7744.723861] Code: 48 74 8b 41 08 89 41 0c 8b 85 a4 fd ff ff 89 95 a0 fd ff ff e8 c2 12 6c 00 8b 95 a0 fd ff ff e9 03 fc ff ff 8b 85 d0 fd ff ff <c7> 03 01 00 40 10 89 43 04 8b 85 dc fd ff ff 89 43 08 e9 4a f6 ff
[ 7744.723865] EAX: 01397010 EBX: f8c00000 ECX: 01247000 EDX: 00000000
[ 7744.723869] ESI: f519cd80 EDI: f1ac1cd4 EBP: f1ac1c6c ESP: f1ac1a04
[ 7744.723873] DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068 EFLAGS: 00210246
[ 7744.723877] CR0: 80050033 CR2: f8c02038 CR3: 31864000 CR4: 000006b0
[ 7744.723880] Fixing recursive fault but reboot is needed!
[ 7749.589011] i915 0000:00:02.0: [drm] GPU HANG: ecode 3:0:00000000
[ 7749.589024] GPU hangs can indicate a bug anywhere in the entire gfx stack, including userspace.
[ 7749.589030] Please file a _new_ bug report at https://gitlab.freedesktop.org/drm/intel/issues/new.
[ 7749.589036] Please see https://gitlab.freedesktop.org/drm/intel/-/wikis/How-to-file-i915-bugs for details.
[ 7749.589041] drm/i915 developers can then reassign to the right component if it's not a kernel issue.
[ 7749.589047] The GPU crash dump is required to analyze GPU hangs, so please always attach it.
[ 7749.589053] GPU crash dump saved to /sys/class/drm/card0/error
[ 7749.909841] i915 0000:00:02.0: [drm] Resetting chip for no heartbeat on rcs0
[ 7756.504232] i915 0000:00:02.0: [drm] GPU HANG: ecode 3:0:00000000
[ 7756.817879] i915 0000:00:02.0: [drm] Resetting chip for no heartbeat on rcs0
[ 7763.672921] i915 0000:00:02.0: [drm] GPU HANG: ecode 3:0:00000000
[ 7763.985882] i915 0000:00:02.0: [drm] Resetting chip for no heartbeat on rcs0
[ 7770.580999] i915 0000:00:02.0: [drm] GPU HANG: ecode 3:0:00000000
[ 7770.897884] i915 0000:00:02.0: [drm] Resetting chip for no heartbeat on rcs0
[ 7777.497036] i915 0000:00:02.0: [drm] GPU HANG: ecode 3:0:00000000
[ 7777.825882] i915 0000:00:02.0: [drm] Resetting chip for no heartbeat on rcs0
[ 7784.664999] i915 0000:00:02.0: [drm] GPU HANG: ecode 3:0:00000000


-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html

[-- Attachment #1.2: signature.asc --]
[-- Type: application/pgp-signature, Size: 195 bytes --]

[-- Attachment #2: Type: text/plain, Size: 160 bytes --]

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH 1/2] drm/i915/gem: Replace reloc chain with terminator on error unwind
  2020-08-19 19:33       ` [Intel-gfx] " Pavel Machek
@ 2020-08-19 19:40         ` Chris Wilson
  -1 siblings, 0 replies; 23+ messages in thread
From: Chris Wilson @ 2020-08-19 19:40 UTC (permalink / raw)
  To: Pavel Machek; +Cc: intel-gfx, Joonas Lahtinen, stable

Quoting Pavel Machek (2020-08-19 20:33:26)
> Hi!
> 
> > > > If we hit an error during construction of the reloc chain, we need to
> > > > replace the chain into the next batch with the terminator so that upon
> > > > flushing the relocations so far, we do not execute a hanging batch.
> > > 
> > > Thanks for the patches. I assume this should fix problem from
> > > "5.9-rc1: graphics regression moved from -next to mainline" thread.
> > > 
> > > I have applied them over current -next, and my machine seems to be
> > > working so far (but uptime is less than 30 minutes).
> > > 
> > > > If the machine still works tomorrow, I'll assume problem is solved.
> > 
> > Aye, best wait until we have to start competing with Chromium for
> > memory... The suspicion is that it was the resource allocation failure
> > path.
> 
> Yep, my machines are low on memory.
> 
> But ... test did not work that well. I have dead X and blinking
> screen. Machine still works reasonably well over ssh, so I guess
> that's an improvement.

> [ 7744.718473] BUG: unable to handle page fault for address: f8c00000
> [ 7744.718484] #PF: supervisor write access in kernel mode
> [ 7744.718487] #PF: error_code(0x0002) - not-present page
> [ 7744.718491] *pdpt = 0000000031b0b001 *pde = 0000000000000000 
> [ 7744.718500] Oops: 0002 [#1] PREEMPT SMP PTI
> [ 7744.718506] CPU: 0 PID: 3004 Comm: Xorg Not tainted 5.9.0-rc1-next-20200819+ #134
> [ 7744.718509] Hardware name: LENOVO 17097HU/17097HU, BIOS 7BETD8WW (2.19 ) 03/31/2011
> [ 7744.718518] EIP: eb_relocate_vma+0xdbf/0xf20

To save me guessing, paste the above location into
	./scripts/decode_stacktrace.sh ./vmlinux . ./drivers/gpu/drm/i915

The f8c00000 is something running off the end of a kmap, but I didn't
spot a path where we would ignore an error and keep on writing.
Nevertheless it must exist.
-Chris
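
For readers following along, the "running off the end" reading can be sanity-checked from the register dump quoted above. The faulting instruction bytes `<c7> 03` decode to a 32-bit immediate store through EBX, and CR2 holds the address that faulted. A minimal sketch in shell arithmetic, assuming 4 KiB pages (the usual size on 32-bit x86); the variable names are illustrative only:

```sh
#!/bin/sh
# Values copied from the first oops quoted above.
fault=0xf8c00000   # CR2: the address the kernel failed to write
ebx=0xf8c00000     # EBX: base register of the faulting store
page_size=4096     # assumption: 4 KiB pages

# Offset of the faulting address within its page.
offset=$(( $fault & ($page_size - 1) ))

# Did the store target exactly the address held in EBX?
if [ $(( $fault )) -eq $(( $ebx )) ]; then same=yes; else same=no; fi

printf 'page offset: %d, fault == EBX: %s\n' "$offset" "$same"
# prints "page offset: 0, fault == EBX: yes"
```

An offset of zero means the store landed on the first byte of a not-present page, which is what a write cursor looks like after stepping one element past a page-sized mapping. That is consistent with the diagnosis above, but it does not by itself identify the code path.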

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH 1/2] drm/i915/gem: Replace reloc chain with terminator on error unwind
  2020-08-19 19:40         ` [Intel-gfx] " Chris Wilson
@ 2020-08-19 19:47           ` Pavel Machek
  -1 siblings, 0 replies; 23+ messages in thread
From: Pavel Machek @ 2020-08-19 19:47 UTC (permalink / raw)
  To: Chris Wilson; +Cc: intel-gfx, Joonas Lahtinen, stable

[-- Attachment #1: Type: text/plain, Size: 1423 bytes --]

Hi!

> > Yep, my machines are low on memory.
> > 
> > But ... test did not work that well. I have dead X and blinking
> > screen. Machine still works reasonably well over ssh, so I guess
> > that's an improvement.
> 
> > [ 7744.718473] BUG: unable to handle page fault for address: f8c00000
> > [ 7744.718484] #PF: supervisor write access in kernel mode
> > [ 7744.718487] #PF: error_code(0x0002) - not-present page
> > [ 7744.718491] *pdpt = 0000000031b0b001 *pde = 0000000000000000 
> > [ 7744.718500] Oops: 0002 [#1] PREEMPT SMP PTI
> > [ 7744.718506] CPU: 0 PID: 3004 Comm: Xorg Not tainted 5.9.0-rc1-next-20200819+ #134
> > [ 7744.718509] Hardware name: LENOVO 17097HU/17097HU, BIOS 7BETD8WW (2.19 ) 03/31/2011
> > [ 7744.718518] EIP: eb_relocate_vma+0xdbf/0xf20
> 
> To save me guessing, paste the above location into
> 	./scripts/decode_stacktrace.sh ./vmlinux . ./drivers/gpu/drm/i915
> 
> The f8c00000 is something running off the end of a kmap, but I didn't
> spot a path where we would ignore an error and keep on writing.
> Nevertheless it must exist.

Like this?

$ ./scripts/decode_stacktrace.sh ./vmlinux . ./drivers/gpu/drm/i915
f8c00000
f8c00000
eb_relocate_vma+0xdbf/0xf20
eb_relocate_vma (i915_gem_execbuffer.c:?) 

									Pavel

-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 195 bytes --]

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [Intel-gfx] [PATCH 1/2] drm/i915/gem: Replace reloc chain with terminator on error unwind
  2020-08-19 19:47           ` [Intel-gfx] " Pavel Machek
@ 2020-08-19 19:52             ` Chris Wilson
  -1 siblings, 0 replies; 23+ messages in thread
From: Chris Wilson @ 2020-08-19 19:52 UTC (permalink / raw)
  To: Pavel Machek; +Cc: intel-gfx, stable

Quoting Pavel Machek (2020-08-19 20:47:23)
> Hi!
> 
> > > Yep, my machines are low on memory.
> > > 
> > > But ... test did not work that well. I have dead X and blinking
> > > screen. Machine still works reasonably well over ssh, so I guess
> > > that's an improvement.
> > 
> > > [ 7744.718473] BUG: unable to handle page fault for address: f8c00000
> > > [ 7744.718484] #PF: supervisor write access in kernel mode
> > > [ 7744.718487] #PF: error_code(0x0002) - not-present page
> > > [ 7744.718491] *pdpt = 0000000031b0b001 *pde = 0000000000000000 
> > > [ 7744.718500] Oops: 0002 [#1] PREEMPT SMP PTI
> > > [ 7744.718506] CPU: 0 PID: 3004 Comm: Xorg Not tainted 5.9.0-rc1-next-20200819+ #134
> > > [ 7744.718509] Hardware name: LENOVO 17097HU/17097HU, BIOS 7BETD8WW (2.19 ) 03/31/2011
> > > [ 7744.718518] EIP: eb_relocate_vma+0xdbf/0xf20
> > 
> > To save me guessing, paste the above location into
> >       ./scripts/decode_stacktrace.sh ./vmlinux . ./drivers/gpu/drm/i915
> > 
> > The f8c00000 is something running off the end of a kmap, but I didn't
> > spot a path where we would ignore an error and keep on writing.
> > Nevertheless it must exist.
> 
> Like this?
> 
> $ ./scripts/decode_stacktrace.sh ./vmlinux . ./drivers/gpu/drm/i915
> f8c00000
> f8c00000
> eb_relocate_vma+0xdbf/0xf20
> eb_relocate_vma (i915_gem_execbuffer.c:?) 

Ok, that didn't work as well as I'm used to. Thanks,
-Chris
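
For reference, the script reads the log from standard input, so a saved oops can be redirected into it rather than typed in. The `:?` in the output quoted above sits where a line number would normally appear, which suggests the address could not be resolved to a source line in that build. A hedged sketch of the redirected form: the temporary file is illustrative, and the guard simply skips the call when not run from the top of a built kernel tree.

```sh
#!/bin/sh
# Save the oops line to a temporary file (illustrative; a full dmesg capture works too).
oops_file=$(mktemp)
cat > "$oops_file" <<'EOF'
[ 7744.718518] EIP: eb_relocate_vma+0xdbf/0xf20
EOF

# Same arguments as in the thread: vmlinux, base path, modules path.
# Guarded so the sketch is a no-op outside a built kernel tree.
if [ -x ./scripts/decode_stacktrace.sh ] && [ -f ./vmlinux ]; then
    ./scripts/decode_stacktrace.sh ./vmlinux . ./drivers/gpu/drm/i915 < "$oops_file"
else
    echo "decode_stacktrace.sh or vmlinux not found; nothing decoded" >&2
fi
```

Feeding the whole oops this way keeps the surrounding register and call-trace lines in the output, with only the recognised symbol lines rewritten.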

^ permalink raw reply	[flat|nested] 23+ messages in thread

* [Intel-gfx] ✓ Fi.CI.IGT: success for series starting with [1/2] drm/i915/gem: Replace reloc chain with terminator on error unwind
  2020-08-19 10:39 ` [Intel-gfx] " Chris Wilson
                   ` (3 preceding siblings ...)
  (?)
@ 2020-08-19 21:38 ` Patchwork
  -1 siblings, 0 replies; 23+ messages in thread
From: Patchwork @ 2020-08-19 21:38 UTC (permalink / raw)
  To: Chris Wilson; +Cc: intel-gfx


[-- Attachment #1.1: Type: text/plain, Size: 13350 bytes --]

== Series Details ==

Series: series starting with [1/2] drm/i915/gem: Replace reloc chain with terminator on error unwind
URL   : https://patchwork.freedesktop.org/series/80795/
State : success

== Summary ==

CI Bug Log - changes from CI_DRM_8907_full -> Patchwork_18370_full
====================================================

Summary
-------

  **SUCCESS**

  No regressions found.

  

Known issues
------------

  Here are the changes found in Patchwork_18370_full that come from known issues:

### IGT changes ###

#### Issues hit ####

  * igt@gem_exec_whisper@basic-contexts-forked:
    - shard-glk:          [PASS][1] -> [DMESG-WARN][2] ([i915#118] / [i915#95])
   [1]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8907/shard-glk1/igt@gem_exec_whisper@basic-contexts-forked.html
   [2]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_18370/shard-glk3/igt@gem_exec_whisper@basic-contexts-forked.html

  * igt@gen9_exec_parse@allowed-all:
    - shard-apl:          [PASS][3] -> [DMESG-WARN][4] ([i915#1436] / [i915#1635] / [i915#716])
   [3]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8907/shard-apl6/igt@gen9_exec_parse@allowed-all.html
   [4]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_18370/shard-apl4/igt@gen9_exec_parse@allowed-all.html

  * igt@i915_selftest@mock@contexts:
    - shard-skl:          [PASS][5] -> [INCOMPLETE][6] ([i915#198] / [i915#2278])
   [5]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8907/shard-skl6/igt@i915_selftest@mock@contexts.html
   [6]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_18370/shard-skl2/igt@i915_selftest@mock@contexts.html

  * igt@kms_draw_crc@draw-method-rgb565-mmap-cpu-untiled:
    - shard-skl:          [PASS][7] -> [FAIL][8] ([i915#52] / [i915#54])
   [7]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8907/shard-skl10/igt@kms_draw_crc@draw-method-rgb565-mmap-cpu-untiled.html
   [8]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_18370/shard-skl1/igt@kms_draw_crc@draw-method-rgb565-mmap-cpu-untiled.html

  * igt@kms_flip@2x-wf_vblank-ts-check-interruptible@ab-hdmi-a1-hdmi-a2:
    - shard-glk:          [PASS][9] -> [DMESG-WARN][10] ([i915#1982]) +2 similar issues
   [9]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8907/shard-glk3/igt@kms_flip@2x-wf_vblank-ts-check-interruptible@ab-hdmi-a1-hdmi-a2.html
   [10]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_18370/shard-glk6/igt@kms_flip@2x-wf_vblank-ts-check-interruptible@ab-hdmi-a1-hdmi-a2.html

  * igt@kms_flip@flip-vs-fences@a-edp1:
    - shard-skl:          [PASS][11] -> [DMESG-WARN][12] ([i915#1982]) +11 similar issues
   [11]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8907/shard-skl9/igt@kms_flip@flip-vs-fences@a-edp1.html
   [12]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_18370/shard-skl3/igt@kms_flip@flip-vs-fences@a-edp1.html

  * igt@kms_flip_tiling@flip-to-y-tiled:
    - shard-skl:          [PASS][13] -> [FAIL][14] ([i915#167])
   [13]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8907/shard-skl9/igt@kms_flip_tiling@flip-to-y-tiled.html
   [14]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_18370/shard-skl3/igt@kms_flip_tiling@flip-to-y-tiled.html

  * igt@kms_frontbuffer_tracking@fbc-1p-offscren-pri-shrfb-draw-blt:
    - shard-tglb:         [PASS][15] -> [DMESG-WARN][16] ([i915#1982]) +1 similar issue
   [15]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8907/shard-tglb1/igt@kms_frontbuffer_tracking@fbc-1p-offscren-pri-shrfb-draw-blt.html
   [16]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_18370/shard-tglb7/igt@kms_frontbuffer_tracking@fbc-1p-offscren-pri-shrfb-draw-blt.html

  * igt@kms_hdr@bpc-switch-suspend:
    - shard-kbl:          [PASS][17] -> [DMESG-WARN][18] ([i915#180]) +3 similar issues
   [17]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8907/shard-kbl1/igt@kms_hdr@bpc-switch-suspend.html
   [18]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_18370/shard-kbl2/igt@kms_hdr@bpc-switch-suspend.html

  * igt@kms_plane_alpha_blend@pipe-b-coverage-7efc:
    - shard-skl:          [PASS][19] -> [FAIL][20] ([fdo#108145] / [i915#265]) +1 similar issue
   [19]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8907/shard-skl10/igt@kms_plane_alpha_blend@pipe-b-coverage-7efc.html
   [20]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_18370/shard-skl9/igt@kms_plane_alpha_blend@pipe-b-coverage-7efc.html

  * igt@kms_psr@psr2_cursor_plane_onoff:
    - shard-iclb:         [PASS][21] -> [SKIP][22] ([fdo#109441]) +1 similar issue
   [21]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8907/shard-iclb2/igt@kms_psr@psr2_cursor_plane_onoff.html
   [22]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_18370/shard-iclb8/igt@kms_psr@psr2_cursor_plane_onoff.html

  * igt@kms_setmode@basic:
    - shard-apl:          [PASS][23] -> [FAIL][24] ([i915#1635] / [i915#31])
   [23]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8907/shard-apl8/igt@kms_setmode@basic.html
   [24]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_18370/shard-apl7/igt@kms_setmode@basic.html

  
#### Possible fixes ####

  * igt@gem_exec_gttfill@all:
    - shard-glk:          [DMESG-WARN][25] ([i915#118] / [i915#95]) -> [PASS][26]
   [25]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8907/shard-glk8/igt@gem_exec_gttfill@all.html
   [26]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_18370/shard-glk8/igt@gem_exec_gttfill@all.html

  * igt@gen9_exec_parse@allowed-single:
    - shard-skl:          [DMESG-WARN][27] ([i915#1436] / [i915#716]) -> [PASS][28]
   [27]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8907/shard-skl7/igt@gen9_exec_parse@allowed-single.html
   [28]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_18370/shard-skl7/igt@gen9_exec_parse@allowed-single.html

  * igt@kms_cursor_legacy@basic-flip-after-cursor-varying-size:
    - shard-skl:          [DMESG-WARN][29] ([i915#1982]) -> [PASS][30] +7 similar issues
   [29]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8907/shard-skl2/igt@kms_cursor_legacy@basic-flip-after-cursor-varying-size.html
   [30]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_18370/shard-skl10/igt@kms_cursor_legacy@basic-flip-after-cursor-varying-size.html

  * igt@kms_flip@2x-flip-vs-absolute-wf_vblank@bc-vga1-hdmi-a1:
    - shard-hsw:          [INCOMPLETE][31] ([CI#80]) -> [PASS][32]
   [31]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8907/shard-hsw4/igt@kms_flip@2x-flip-vs-absolute-wf_vblank@bc-vga1-hdmi-a1.html
   [32]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_18370/shard-hsw8/igt@kms_flip@2x-flip-vs-absolute-wf_vblank@bc-vga1-hdmi-a1.html

  * igt@kms_flip@plain-flip-fb-recreate@a-edp1:
    - shard-skl:          [FAIL][33] ([i915#2122]) -> [PASS][34]
   [33]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8907/shard-skl7/igt@kms_flip@plain-flip-fb-recreate@a-edp1.html
   [34]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_18370/shard-skl5/igt@kms_flip@plain-flip-fb-recreate@a-edp1.html

  * igt@kms_frontbuffer_tracking@fbc-suspend:
    - shard-kbl:          [INCOMPLETE][35] ([i915#155]) -> [PASS][36]
   [35]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8907/shard-kbl2/igt@kms_frontbuffer_tracking@fbc-suspend.html
   [36]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_18370/shard-kbl1/igt@kms_frontbuffer_tracking@fbc-suspend.html

  * igt@kms_frontbuffer_tracking@psr-1p-primscrn-spr-indfb-draw-render:
    - shard-skl:          [FAIL][37] ([i915#49]) -> [PASS][38]
   [37]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8907/shard-skl4/igt@kms_frontbuffer_tracking@psr-1p-primscrn-spr-indfb-draw-render.html
   [38]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_18370/shard-skl9/igt@kms_frontbuffer_tracking@psr-1p-primscrn-spr-indfb-draw-render.html

  * igt@kms_frontbuffer_tracking@psr-1p-primscrn-spr-indfb-move:
    - shard-tglb:         [DMESG-WARN][39] ([i915#1982]) -> [PASS][40] +2 similar issues
   [39]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8907/shard-tglb5/igt@kms_frontbuffer_tracking@psr-1p-primscrn-spr-indfb-move.html
   [40]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_18370/shard-tglb1/igt@kms_frontbuffer_tracking@psr-1p-primscrn-spr-indfb-move.html

  * igt@kms_plane_alpha_blend@pipe-a-coverage-7efc:
    - shard-skl:          [FAIL][41] ([fdo#108145] / [i915#265]) -> [PASS][42]
   [41]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8907/shard-skl4/igt@kms_plane_alpha_blend@pipe-a-coverage-7efc.html
   [42]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_18370/shard-skl9/igt@kms_plane_alpha_blend@pipe-a-coverage-7efc.html

  * igt@kms_psr@no_drrs:
    - shard-iclb:         [FAIL][43] ([i915#173]) -> [PASS][44]
   [43]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8907/shard-iclb1/igt@kms_psr@no_drrs.html
   [44]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_18370/shard-iclb6/igt@kms_psr@no_drrs.html

  * igt@kms_psr@psr2_primary_mmap_cpu:
    - shard-iclb:         [SKIP][45] ([fdo#109441]) -> [PASS][46]
   [45]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8907/shard-iclb5/igt@kms_psr@psr2_primary_mmap_cpu.html
   [46]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_18370/shard-iclb2/igt@kms_psr@psr2_primary_mmap_cpu.html

  * igt@kms_vblank@pipe-a-ts-continuation-suspend:
    - shard-kbl:          [DMESG-WARN][47] ([i915#180]) -> [PASS][48] +5 similar issues
   [47]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8907/shard-kbl1/igt@kms_vblank@pipe-a-ts-continuation-suspend.html
   [48]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_18370/shard-kbl2/igt@kms_vblank@pipe-a-ts-continuation-suspend.html

  * igt@perf@blocking-parameterized:
    - shard-iclb:         [FAIL][49] ([i915#1542]) -> [PASS][50]
   [49]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8907/shard-iclb4/igt@perf@blocking-parameterized.html
   [50]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_18370/shard-iclb6/igt@perf@blocking-parameterized.html

  * igt@prime_busy@after@vecs0:
    - shard-hsw:          [FAIL][51] ([i915#2258]) -> [PASS][52] +1 similar issue
   [51]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8907/shard-hsw1/igt@prime_busy@after@vecs0.html
   [52]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_18370/shard-hsw8/igt@prime_busy@after@vecs0.html

  
#### Warnings ####

  * igt@gem_exec_reloc@basic-concurrent16:
    - shard-apl:          [INCOMPLETE][53] ([i915#1635] / [i915#1958]) -> [TIMEOUT][54] ([i915#1635] / [i915#1958])
   [53]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8907/shard-apl7/igt@gem_exec_reloc@basic-concurrent16.html
   [54]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_18370/shard-apl6/igt@gem_exec_reloc@basic-concurrent16.html

  * igt@runner@aborted:
    - shard-skl:          [FAIL][55] ([i915#1436]) -> [FAIL][56] ([i915#1436] / [i915#2110])
   [55]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8907/shard-skl7/igt@runner@aborted.html
   [56]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_18370/shard-skl2/igt@runner@aborted.html

  
  [CI#80]: https://gitlab.freedesktop.org/gfx-ci/i915-infra/issues/80
  [fdo#108145]: https://bugs.freedesktop.org/show_bug.cgi?id=108145
  [fdo#109441]: https://bugs.freedesktop.org/show_bug.cgi?id=109441
  [i915#118]: https://gitlab.freedesktop.org/drm/intel/issues/118
  [i915#1436]: https://gitlab.freedesktop.org/drm/intel/issues/1436
  [i915#1542]: https://gitlab.freedesktop.org/drm/intel/issues/1542
  [i915#155]: https://gitlab.freedesktop.org/drm/intel/issues/155
  [i915#1635]: https://gitlab.freedesktop.org/drm/intel/issues/1635
  [i915#167]: https://gitlab.freedesktop.org/drm/intel/issues/167
  [i915#173]: https://gitlab.freedesktop.org/drm/intel/issues/173
  [i915#180]: https://gitlab.freedesktop.org/drm/intel/issues/180
  [i915#1958]: https://gitlab.freedesktop.org/drm/intel/issues/1958
  [i915#198]: https://gitlab.freedesktop.org/drm/intel/issues/198
  [i915#1982]: https://gitlab.freedesktop.org/drm/intel/issues/1982
  [i915#2110]: https://gitlab.freedesktop.org/drm/intel/issues/2110
  [i915#2122]: https://gitlab.freedesktop.org/drm/intel/issues/2122
  [i915#2258]: https://gitlab.freedesktop.org/drm/intel/issues/2258
  [i915#2278]: https://gitlab.freedesktop.org/drm/intel/issues/2278
  [i915#265]: https://gitlab.freedesktop.org/drm/intel/issues/265
  [i915#31]: https://gitlab.freedesktop.org/drm/intel/issues/31
  [i915#49]: https://gitlab.freedesktop.org/drm/intel/issues/49
  [i915#52]: https://gitlab.freedesktop.org/drm/intel/issues/52
  [i915#54]: https://gitlab.freedesktop.org/drm/intel/issues/54
  [i915#716]: https://gitlab.freedesktop.org/drm/intel/issues/716
  [i915#95]: https://gitlab.freedesktop.org/drm/intel/issues/95


Participating hosts (11 -> 12)
------------------------------

  Additional (1): pig-snb-2600 


Build changes
-------------

  * Linux: CI_DRM_8907 -> Patchwork_18370

  CI-20190529: 20190529
  CI_DRM_8907: f9f7b73d0f125316a33e35f3315f3a5955079e33 @ git://anongit.freedesktop.org/gfx-ci/linux
  IGT_5769: 4e5f76be680b65780204668e302026cf638decc9 @ git://anongit.freedesktop.org/xorg/app/intel-gpu-tools
  Patchwork_18370: 4759f933e1d574c27656dfc1d2523148309ebb39 @ git://anongit.freedesktop.org/gfx-ci/linux
  piglit_4509: fdc5a4ca11124ab8413c7988896eec4c97336694 @ git://anongit.freedesktop.org/piglit

== Logs ==

For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_18370/index.html

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH 1/2] drm/i915/gem: Replace reloc chain with terminator on error unwind
  2020-08-19 19:33       ` [Intel-gfx] " Pavel Machek
@ 2020-08-20  7:36         ` Chris Wilson
  -1 siblings, 0 replies; 23+ messages in thread
From: Chris Wilson @ 2020-08-20  7:36 UTC (permalink / raw)
  To: Pavel Machek; +Cc: intel-gfx, Joonas Lahtinen, stable

Quoting Pavel Machek (2020-08-19 20:33:26)
> Hi!
> 
> > > > If we hit an error during construction of the reloc chain, we need to
> > > > replace the chain into the next batch with the terminator so that upon
> > > > flushing the relocations so far, we do not execute a hanging batch.
> > > 
> > > Thanks for the patches. I assume this should fix problem from
> > > "5.9-rc1: graphics regression moved from -next to mainline" thread.
> > > 
> > > I have applied them over current -next, and my machine seems to be
> > > working so far (but uptime is less than 30 minutes).
> > > 
> > > If the machine still works tomorrow, I'll assume problem is solved.
> > 
> > Aye, best wait until we have to start competing with Chromium for
> > memory... The suspicion is that it was the resource allocation failure
> > path.
> 
> Yep, my machines are low on memory.
> 
> But ... test did not work that well. I have dead X and blinking
> screen. Machine still works reasonably well over ssh, so I guess
> that's an improvement.

Well my last remaining 32bit gen3 device is currently pushing up the
daisies, so could you try removing the attempt to use WC? Something like

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
index 44df98d85b38..b26f7de913c3 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
@@ -955,10 +955,7 @@ static u32 *__reloc_gpu_map(struct reloc_cache *cache,
 {
        u32 *map;

-       map = i915_gem_object_pin_map(pool->obj,
-                                     cache->has_llc ?
-                                     I915_MAP_FORCE_WB :
-                                     I915_MAP_FORCE_WC);
+       map = i915_gem_object_pin_map(pool->obj, I915_MAP_FORCE_WB);

on top of the previous patch. Faultinjection didn't turn up anything in
eb_relocate_vma, so we need to dig deeper.
-Chris

^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [Intel-gfx] ✗ Fi.CI.BUILD: failure for series starting with [1/2] drm/i915/gem: Replace reloc chain with terminator on error unwind (rev2)
  2020-08-19 10:39 ` [Intel-gfx] " Chris Wilson
                   ` (4 preceding siblings ...)
  (?)
@ 2020-08-20  7:37 ` Patchwork
  -1 siblings, 0 replies; 23+ messages in thread
From: Patchwork @ 2020-08-20  7:37 UTC (permalink / raw)
  To: Chris Wilson; +Cc: intel-gfx

== Series Details ==

Series: series starting with [1/2] drm/i915/gem: Replace reloc chain with terminator on error unwind (rev2)
URL   : https://patchwork.freedesktop.org/series/80795/
State : failure

== Summary ==

Applying: drm/i915/gem: Replace reloc chain with terminator on error unwind
error: corrupt patch at line 15
error: could not build fake ancestor
hint: Use 'git am --show-current-patch=diff' to see the failed patch
Patch failed at 0001 drm/i915/gem: Replace reloc chain with terminator on error unwind
When you have resolved this problem, run "git am --continue".
If you prefer to skip this patch, run "git am --skip" instead.
To restore the original branch and stop patching, run "git am --abort".

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH 1/2] drm/i915/gem: Replace reloc chain with terminator on error unwind
  2020-08-20  7:36         ` [Intel-gfx] " Chris Wilson
@ 2020-09-08 22:23           ` Pavel Machek
  -1 siblings, 0 replies; 23+ messages in thread
From: Pavel Machek @ 2020-09-08 22:23 UTC (permalink / raw)
  To: Chris Wilson; +Cc: intel-gfx, Joonas Lahtinen, stable

Hi!

> > > > Thanks for the patches. I assume this should fix problem from
> > > > "5.9-rc1: graphics regression moved from -next to mainline" thread.
> > > > 
> > > > I have applied them over current -next, and my machine seems to be
> > > > working so far (but uptime is less than 30 minutes).
> > > > 
> > > > If the machine still works tomorrow, I'll assume problem is solved.
> > > 
> > > Aye, best wait until we have to start competing with Chromium for
> > > memory... The suspicion is that it was the resource allocation failure
> > > path.
> > 
> > Yep, my machines are low on memory.
> > 
> > But ... test did not work that well. I have dead X and blinking
> > screen. Machine still works reasonably well over ssh, so I guess
> > that's an improvement.
> 
> Well my last remaining 32bit gen3 device is currently pushing up the
> daisies, so could you try removing the attempt to use WC? Something like
> 
> +++ b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
> @@ -955,10 +955,7 @@ static u32 *__reloc_gpu_map(struct reloc_cache *cache,
>  {
>         u32 *map;
> 
> -       map = i915_gem_object_pin_map(pool->obj,
> -                                     cache->has_llc ?
> -                                     I915_MAP_FORCE_WB :
> -                                     I915_MAP_FORCE_WC);
> +       map = i915_gem_object_pin_map(pool->obj, I915_MAP_FORCE_WB);
> 
> on top of the previous patch. Faultinjection didn't turn up anything in
> eb_relocate_vma, so we need to dig deeper.

With this on top of other patches, it works.

Tested-by: Pavel Machek <pavel@ucw.cz>

Best regards,
									Pavel

-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html

^ permalink raw reply	[flat|nested] 23+ messages in thread

end of thread, other threads:[~2020-09-08 22:23 UTC | newest]

Thread overview: 23+ messages (download: mbox.gz / follow: Atom feed)
2020-08-19 10:39 [PATCH 1/2] drm/i915/gem: Replace reloc chain with terminator on error unwind Chris Wilson
2020-08-19 10:39 ` [Intel-gfx] " Chris Wilson
2020-08-19 10:39 ` [PATCH 2/2] drm/i915/gem: Fallback to using a plain kmap if reloc address space is limited Chris Wilson
2020-08-19 10:39   ` [Intel-gfx] " Chris Wilson
2020-08-19 11:48 ` [Intel-gfx] ✓ Fi.CI.BAT: success for series starting with [1/2] drm/i915/gem: Replace reloc chain with terminator on error unwind Patchwork
2020-08-19 17:23 ` [PATCH 1/2] " Pavel Machek
2020-08-19 17:23   ` [Intel-gfx] " Pavel Machek
2020-08-19 17:36   ` Chris Wilson
2020-08-19 17:36     ` [Intel-gfx] " Chris Wilson
2020-08-19 19:33     ` Pavel Machek
2020-08-19 19:33       ` [Intel-gfx] " Pavel Machek
2020-08-19 19:40       ` Chris Wilson
2020-08-19 19:40         ` [Intel-gfx] " Chris Wilson
2020-08-19 19:47         ` Pavel Machek
2020-08-19 19:47           ` [Intel-gfx] " Pavel Machek
2020-08-19 19:52           ` Chris Wilson
2020-08-19 19:52             ` Chris Wilson
2020-08-20  7:36       ` Chris Wilson
2020-08-20  7:36         ` [Intel-gfx] " Chris Wilson
2020-09-08 22:23         ` Pavel Machek
2020-09-08 22:23           ` [Intel-gfx] " Pavel Machek
2020-08-19 21:38 ` [Intel-gfx] ✓ Fi.CI.IGT: success for series starting with [1/2] " Patchwork
2020-08-20  7:37 ` [Intel-gfx] ✗ Fi.CI.BUILD: failure for series starting with [1/2] drm/i915/gem: Replace reloc chain with terminator on error unwind (rev2) Patchwork
