* [PATCH 1/2] drm/i915/cmdparser: No-op failed batches on all platforms
From: Daniel Vetter @ 2021-05-19  7:43 UTC
  To: DRI Development, Intel Graphics Development
  Cc: Daniel Vetter, stable, Jason Ekstrand, Marcin Slusarz,
	Jon Bloomfield, Daniel Vetter

On gen9, the blt cmd parser relied on magic fence error propagation to
stop failed batches from executing. That approach has two problems:
- it doesn't work on gen7, because there's no scheduler with ringbuffers
  there yet
- fence error propagation can be weaponized to attack other things, so
  it is not a good design

Instead of magic, do the same thing on gen9 as on gen7.

Kudos to Jason for figuring this out.

Fixes: 9e31c1fe45d5 ("drm/i915: Propagate errors on awaiting already signaled fences")
Cc: <stable@vger.kernel.org> # v5.6+
Cc: Jason Ekstrand <jason.ekstrand@intel.com>
Cc: Marcin Slusarz <marcin.slusarz@intel.com>
Cc: Jon Bloomfield <jon.bloomfield@intel.com>
Relates: https://gitlab.freedesktop.org/drm/intel/-/issues/3080
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
---
 drivers/gpu/drm/i915/i915_cmd_parser.c | 34 +++++++++++++-------------
 1 file changed, 17 insertions(+), 17 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_cmd_parser.c b/drivers/gpu/drm/i915/i915_cmd_parser.c
index 5b4b2bd46e7c..2d3336ab7ba3 100644
--- a/drivers/gpu/drm/i915/i915_cmd_parser.c
+++ b/drivers/gpu/drm/i915/i915_cmd_parser.c
@@ -1509,6 +1509,12 @@ int intel_engine_cmd_parser(struct intel_engine_cs *engine,
 		}
 	}
 
+	/* Batch unsafe to execute with privileges, cancel! */
+	if (ret) {
+		cmd = page_mask_bits(shadow->obj->mm.mapping);
+		*cmd = MI_BATCH_BUFFER_END;
+	}
+
 	if (trampoline) {
 		/*
 		 * With the trampoline, the shadow is executed twice.
@@ -1524,26 +1530,20 @@ int intel_engine_cmd_parser(struct intel_engine_cs *engine,
 		 */
 		*batch_end = MI_BATCH_BUFFER_END;
 
-		if (ret) {
-			/* Batch unsafe to execute with privileges, cancel! */
-			cmd = page_mask_bits(shadow->obj->mm.mapping);
-			*cmd = MI_BATCH_BUFFER_END;
+		/* If batch is unsafe but valid, jump to the original */
+		if (ret == -EACCES) {
+			unsigned int flags;
 
-			/* If batch is unsafe but valid, jump to the original */
-			if (ret == -EACCES) {
-				unsigned int flags;
+			flags = MI_BATCH_NON_SECURE_I965;
+			if (IS_HASWELL(engine->i915))
+				flags = MI_BATCH_NON_SECURE_HSW;
 
-				flags = MI_BATCH_NON_SECURE_I965;
-				if (IS_HASWELL(engine->i915))
-					flags = MI_BATCH_NON_SECURE_HSW;
+			GEM_BUG_ON(!IS_GEN_RANGE(engine->i915, 6, 7));
+			__gen6_emit_bb_start(batch_end,
+					     batch_addr,
+					     flags);
 
-				GEM_BUG_ON(!IS_GEN_RANGE(engine->i915, 6, 7));
-				__gen6_emit_bb_start(batch_end,
-						     batch_addr,
-						     flags);
-
-				ret = 0; /* allow execution */
-			}
+			ret = 0; /* allow execution */
 		}
 	}
 
-- 
2.31.0
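
The first hunk above can be read in isolation: whenever the parser
returns an error, the first dword of the shadow batch is overwritten so
that the hardware stops immediately. Below is a minimal userspace sketch
of that step only. The function name cancel_batch_on_error() and the plain
dword array standing in for shadow->obj->mm.mapping are hypothetical, and
the MI_BATCH_BUFFER_END encoding is assumed to be opcode 0x0a shifted into
bits 28:23. The trampoline and -EACCES handling from the second hunk are
not modelled.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed encoding: opcode 0x0a in bits 28:23, no flags. */
#define MI_BATCH_BUFFER_END (UINT32_C(0x0a) << 23)

/*
 * Hypothetical stand-in for the shadow batch: a plain array of command
 * dwords. ret is the parser result (0 on success, negative errno on
 * failure) and is returned unchanged.
 */
static int cancel_batch_on_error(uint32_t *shadow, size_t ndwords, int ret)
{
	assert(shadow && ndwords > 0);

	/* Parser rejected the batch: terminate it at the first dword. */
	if (ret)
		shadow[0] = MI_BATCH_BUFFER_END;

	return ret;
}
```

Because the check no longer sits inside the trampoline branch, the same
cancellation applies whether or not a trampoline is in use, which is the
point of moving it.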



* [PATCH 2/2] Revert "drm/i915: Propagate errors on awaiting already signaled fences"
From: Daniel Vetter @ 2021-05-19  7:43 UTC
  To: DRI Development, Intel Graphics Development
  Cc: Jason Ekstrand, Jason Ekstrand, Marcin Slusarz, stable,
	Jon Bloomfield, Daniel Vetter

From: Jason Ekstrand <jason@jlekstrand.net>

This reverts commit 9e31c1fe45d555a948ff66f1f0e3fe1f83ca63f7.  Ever
since that commit, we've been having issues where a hang in one client
can propagate to another.  In particular, a hang in an app can propagate
to the X server, which causes the whole desktop to lock up.

Signed-off-by: Jason Ekstrand <jason.ekstrand@intel.com>
Reported-by: Marcin Slusarz <marcin.slusarz@intel.com>
Cc: <stable@vger.kernel.org> # v5.6+
Cc: Jason Ekstrand <jason.ekstrand@intel.com>
Cc: Marcin Slusarz <marcin.slusarz@intel.com>
Cc: Jon Bloomfield <jon.bloomfield@intel.com>
Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/3080
Fixes: 9e31c1fe45d5 ("drm/i915: Propagate errors on awaiting already signaled fences")
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
---
 drivers/gpu/drm/i915/i915_request.c | 8 ++------
 1 file changed, 2 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_request.c b/drivers/gpu/drm/i915/i915_request.c
index 970d8f4986bb..b796197c0772 100644
--- a/drivers/gpu/drm/i915/i915_request.c
+++ b/drivers/gpu/drm/i915/i915_request.c
@@ -1426,10 +1426,8 @@ i915_request_await_execution(struct i915_request *rq,
 
 	do {
 		fence = *child++;
-		if (test_bit(DMA_FENCE_FLAG_SIGNALED_BIT, &fence->flags)) {
-			i915_sw_fence_set_error_once(&rq->submit, fence->error);
+		if (test_bit(DMA_FENCE_FLAG_SIGNALED_BIT, &fence->flags))
 			continue;
-		}
 
 		if (fence->context == rq->fence.context)
 			continue;
@@ -1527,10 +1525,8 @@ i915_request_await_dma_fence(struct i915_request *rq, struct dma_fence *fence)
 
 	do {
 		fence = *child++;
-		if (test_bit(DMA_FENCE_FLAG_SIGNALED_BIT, &fence->flags)) {
-			i915_sw_fence_set_error_once(&rq->submit, fence->error);
+		if (test_bit(DMA_FENCE_FLAG_SIGNALED_BIT, &fence->flags))
 			continue;
-		}
 
 		/*
 		 * Requests on the same timeline are explicitly ordered, along
-- 
2.31.0
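
To make the behavioural difference of the revert concrete, here is a toy
model of the await loop, outside the kernel. All types and names
(toy_fence, toy_request, toy_await) are hypothetical simplifications and
not the i915 structures. The propagate flag selects between the behaviour
being reverted (copy the error of an already signaled fence into the
waiting request) and the behaviour after the revert (skip the fence).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for dma_fence and i915_request. */
struct toy_fence {
	bool signaled;
	int error;		/* 0, or a negative errno set before signaling */
};

struct toy_request {
	int submit_error;	/* models the error slot of rq->submit */
};

/* Record only the first non-zero error, like a set-error-once helper. */
static void toy_set_error_once(struct toy_request *rq, int error)
{
	if (error && !rq->submit_error)
		rq->submit_error = error;
}

/*
 * Walk the fences a request depends on. Returns how many of them are
 * still unsignaled and would need a real wait.
 */
static size_t toy_await(struct toy_request *rq,
			const struct toy_fence *fences, size_t n,
			bool propagate)
{
	size_t pending = 0;

	assert(rq && (fences || n == 0));

	for (size_t i = 0; i < n; i++) {
		if (fences[i].signaled) {
			/* Pre-revert only: inherit the fence's error. */
			if (propagate)
				toy_set_error_once(rq, fences[i].error);
			continue;
		}
		pending++;
	}

	return pending;
}
```

With propagate set, a request that merely depends on a fence from another
client ends up carrying that client's error; with it cleared, the error
stays with the fence that produced it.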



* [Intel-gfx] ✗ Fi.CI.CHECKPATCH: warning for series starting with [1/2] drm/i915/cmdparser: No-op failed batches on all platforms
From: Patchwork @ 2021-05-19 10:04 UTC
  To: Daniel Vetter; +Cc: intel-gfx

== Series Details ==

Series: series starting with [1/2] drm/i915/cmdparser: No-op failed batches on all platforms
URL   : https://patchwork.freedesktop.org/series/90310/
State : warning

== Summary ==

$ dim checkpatch origin/drm-tip
d4f598ce6c56 drm/i915/cmdparser: No-op failed batches on all platforms
-:79: WARNING:FROM_SIGN_OFF_MISMATCH: From:/Signed-off-by: email address mismatch: 'From: Daniel Vetter <daniel.vetter@ffwll.ch>' != 'Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>'

total: 0 errors, 1 warnings, 0 checks, 49 lines checked
f6a21e2aad5a Revert "drm/i915: Propagate errors on awaiting already signaled fences"
-:49: WARNING:FROM_SIGN_OFF_MISMATCH: From:/Signed-off-by: email address mismatch: 'From: Jason Ekstrand <jason@jlekstrand.net>' != 'Signed-off-by: Jason Ekstrand <jason.ekstrand@intel.com>'

total: 0 errors, 1 warnings, 0 checks, 22 lines checked


_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


* [PATCH] Revert "drm/i915: Propagate errors on awaiting already signaled fences"
From: Daniel Vetter @ 2021-05-19 10:15 UTC
  To: Intel Graphics Development, DRI Development
  Cc: Jason Ekstrand, Jason Ekstrand, Marcin Slusarz, stable,
	Jon Bloomfield, Daniel Vetter

From: Jason Ekstrand <jason@jlekstrand.net>

This reverts commit 9e31c1fe45d555a948ff66f1f0e3fe1f83ca63f7.  Ever
since that commit, we've been having issues where a hang in one client
can propagate to another.  In particular, a hang in an app can propagate
to the X server, which causes the whole desktop to lock up.

Error propagation along fences sounds like a good idea, but as this bug
shows, it has surprising consequences, since propagating errors across
security boundaries is not a good thing.

What we do have is tracking of hangs on the ctx, with the information
reported to userspace using RESET_STATS. That's how arb_robustness works.
Also, if my understanding is still correct, the EIO from execbuf is for
when your context is banned (because it is not recoverable or had too many
hangs). In all these cases it's up to userspace to figure out what all is
impacted and should be reported to the application; it's not on the kernel
to guess and automatically propagate.

What's more, we're also building more features on top of ctx error
reporting with the RESET_STATS ioctl: encrypted buffers use the same
mechanism, and the userspace fence wait also relies on it. So it is the
path going forward for reporting gpu hangs and resets to userspace.

So all together that's why I think we should just bury this idea again as
not quite the direction we want to go in, hence why I think the revert is
the right option here.

v2: Augment commit message. Also restore Jason's sob that I
accidentally lost.

Signed-off-by: Jason Ekstrand <jason.ekstrand@intel.com> (v1)
Reported-by: Marcin Slusarz <marcin.slusarz@intel.com>
Cc: <stable@vger.kernel.org> # v5.6+
Cc: Jason Ekstrand <jason.ekstrand@intel.com>
Cc: Marcin Slusarz <marcin.slusarz@intel.com>
Cc: Jon Bloomfield <jon.bloomfield@intel.com>
Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/3080
Fixes: 9e31c1fe45d5 ("drm/i915: Propagate errors on awaiting already signaled fences")
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
---
 drivers/gpu/drm/i915/i915_request.c | 8 ++------
 1 file changed, 2 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_request.c b/drivers/gpu/drm/i915/i915_request.c
index 970d8f4986bb..b796197c0772 100644
--- a/drivers/gpu/drm/i915/i915_request.c
+++ b/drivers/gpu/drm/i915/i915_request.c
@@ -1426,10 +1426,8 @@ i915_request_await_execution(struct i915_request *rq,
 
 	do {
 		fence = *child++;
-		if (test_bit(DMA_FENCE_FLAG_SIGNALED_BIT, &fence->flags)) {
-			i915_sw_fence_set_error_once(&rq->submit, fence->error);
+		if (test_bit(DMA_FENCE_FLAG_SIGNALED_BIT, &fence->flags))
 			continue;
-		}
 
 		if (fence->context == rq->fence.context)
 			continue;
@@ -1527,10 +1525,8 @@ i915_request_await_dma_fence(struct i915_request *rq, struct dma_fence *fence)
 
 	do {
 		fence = *child++;
-		if (test_bit(DMA_FENCE_FLAG_SIGNALED_BIT, &fence->flags)) {
-			i915_sw_fence_set_error_once(&rq->submit, fence->error);
+		if (test_bit(DMA_FENCE_FLAG_SIGNALED_BIT, &fence->flags))
 			continue;
-		}
 
 		/*
 		 * Requests on the same timeline are explicitly ordered, along
-- 
2.31.0
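
The commit message above leaves it to userspace to interpret the per-ctx
counters reported by RESET_STATS. A rough sketch of what that
interpretation could look like follows. The struct below is a local
stand-in whose field names follow struct drm_i915_reset_stats, so that
the example builds without DRM headers. The classification rule and the
names toy_classify() and toy_reset_status are assumptions for
illustration, not taken from this thread or from any particular
userspace driver.

```c
#include <assert.h>
#include <stdint.h>

/* Local stand-in for the counters returned by the reset-stats query. */
struct toy_reset_stats {
	uint32_t reset_count;	/* resets seen overall */
	uint32_t batch_active;	/* this ctx: batches lost while executing */
	uint32_t batch_pending;	/* this ctx: batches lost while queued */
};

enum toy_reset_status {
	TOY_NO_RESET,
	TOY_GUILTY,	/* our own batch was running when the GPU hung */
	TOY_INNOCENT,	/* our batch was only queued behind the hang */
	TOY_UNKNOWN,	/* a reset happened, but this ctx lost nothing */
};

/* Compare two snapshots of the counters taken by the same context. */
static enum toy_reset_status
toy_classify(const struct toy_reset_stats *prev,
	     const struct toy_reset_stats *cur)
{
	assert(prev && cur);

	if (cur->batch_active != prev->batch_active)
		return TOY_GUILTY;
	if (cur->batch_pending != prev->batch_pending)
		return TOY_INNOCENT;
	if (cur->reset_count != prev->reset_count)
		return TOY_UNKNOWN;

	return TOY_NO_RESET;
}
```

The decision about what to tell the application is made entirely on the
userspace side from counters scoped to its own context, so nothing has to
be pushed across client boundaries by the kernel.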



* [Intel-gfx] ✓ Fi.CI.BAT: success for series starting with [1/2] drm/i915/cmdparser: No-op failed batches on all platforms
From: Patchwork @ 2021-05-19 10:34 UTC
  To: Daniel Vetter; +Cc: intel-gfx


== Series Details ==

Series: series starting with [1/2] drm/i915/cmdparser: No-op failed batches on all platforms
URL   : https://patchwork.freedesktop.org/series/90310/
State : success

== Summary ==

CI Bug Log - changes from CI_DRM_10102 -> Patchwork_20152
====================================================

Summary
-------

  **SUCCESS**

  No regressions found.

  External URL: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20152/index.html

Known issues
------------

  Here are the changes found in Patchwork_20152 that come from known issues:

### IGT changes ###

#### Issues hit ####

  * igt@gem_exec_fence@nb-await@bcs0:
    - fi-bsw-nick:        [PASS][1] -> [FAIL][2] ([i915#3457]) +1 similar issue
   [1]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10102/fi-bsw-nick/igt@gem_exec_fence@nb-await@bcs0.html
   [2]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20152/fi-bsw-nick/igt@gem_exec_fence@nb-await@bcs0.html
    - fi-bsw-n3050:       [PASS][3] -> [FAIL][4] ([i915#3457]) +1 similar issue
   [3]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10102/fi-bsw-n3050/igt@gem_exec_fence@nb-await@bcs0.html
   [4]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20152/fi-bsw-n3050/igt@gem_exec_fence@nb-await@bcs0.html

  * igt@gem_exec_fence@nb-await@rcs0:
    - fi-elk-e7500:       [PASS][5] -> [FAIL][6] ([i915#3457])
   [5]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10102/fi-elk-e7500/igt@gem_exec_fence@nb-await@rcs0.html
   [6]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20152/fi-elk-e7500/igt@gem_exec_fence@nb-await@rcs0.html

  * igt@gem_exec_fence@nb-await@vcs0:
    - fi-bsw-kefka:       [PASS][7] -> [FAIL][8] ([i915#3457]) +2 similar issues
   [7]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10102/fi-bsw-kefka/igt@gem_exec_fence@nb-await@vcs0.html
   [8]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20152/fi-bsw-kefka/igt@gem_exec_fence@nb-await@vcs0.html

  * igt@gem_exec_fence@nb-await@vecs0:
    - fi-glk-dsi:         [PASS][9] -> [FAIL][10] ([i915#3457])
   [9]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10102/fi-glk-dsi/igt@gem_exec_fence@nb-await@vecs0.html
   [10]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20152/fi-glk-dsi/igt@gem_exec_fence@nb-await@vecs0.html

  * igt@kms_pipe_crc_basic@nonblocking-crc-pipe-a-frame-sequence:
    - fi-elk-e7500:       [PASS][11] -> [FAIL][12] ([i915#53]) +2 similar issues
   [11]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10102/fi-elk-e7500/igt@kms_pipe_crc_basic@nonblocking-crc-pipe-a-frame-sequence.html
   [12]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20152/fi-elk-e7500/igt@kms_pipe_crc_basic@nonblocking-crc-pipe-a-frame-sequence.html

  * igt@kms_pipe_crc_basic@read-crc-pipe-a:
    - fi-bsw-kefka:       [PASS][13] -> [FAIL][14] ([i915#53]) +1 similar issue
   [13]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10102/fi-bsw-kefka/igt@kms_pipe_crc_basic@read-crc-pipe-a.html
   [14]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20152/fi-bsw-kefka/igt@kms_pipe_crc_basic@read-crc-pipe-a.html

  * igt@kms_pipe_crc_basic@suspend-read-crc-pipe-a:
    - fi-bwr-2160:        [PASS][15] -> [FAIL][16] ([i915#53])
   [15]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10102/fi-bwr-2160/igt@kms_pipe_crc_basic@suspend-read-crc-pipe-a.html
   [16]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20152/fi-bwr-2160/igt@kms_pipe_crc_basic@suspend-read-crc-pipe-a.html

  * igt@runner@aborted:
    - fi-ilk-650:         NOTRUN -> [FAIL][17] ([i915#3475])
   [17]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20152/fi-ilk-650/igt@runner@aborted.html

  
#### Possible fixes ####

  * igt@gem_exec_fence@basic-await@rcs0:
    - fi-bsw-n3050:       [FAIL][18] ([i915#3457]) -> [PASS][19]
   [18]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10102/fi-bsw-n3050/igt@gem_exec_fence@basic-await@rcs0.html
   [19]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20152/fi-bsw-n3050/igt@gem_exec_fence@basic-await@rcs0.html

  * igt@gem_exec_fence@basic-await@vcs0:
    - fi-elk-e7500:       [FAIL][20] ([i915#3457]) -> [PASS][21] +1 similar issue
   [20]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10102/fi-elk-e7500/igt@gem_exec_fence@basic-await@vcs0.html
   [21]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20152/fi-elk-e7500/igt@gem_exec_fence@basic-await@vcs0.html

  * igt@gem_exec_fence@nb-await@bcs0:
    - fi-bsw-kefka:       [FAIL][22] ([i915#3457]) -> [PASS][23]
   [22]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10102/fi-bsw-kefka/igt@gem_exec_fence@nb-await@bcs0.html
   [23]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20152/fi-bsw-kefka/igt@gem_exec_fence@nb-await@bcs0.html

  * igt@gem_exec_fence@nb-await@vcs0:
    - fi-bsw-nick:        [FAIL][24] ([i915#3457]) -> [PASS][25] +2 similar issues
   [24]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10102/fi-bsw-nick/igt@gem_exec_fence@nb-await@vcs0.html
   [25]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20152/fi-bsw-nick/igt@gem_exec_fence@nb-await@vcs0.html

  * igt@gem_wait@busy@all:
    - fi-apl-guc:         [FAIL][26] ([i915#3457]) -> [PASS][27]
   [26]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10102/fi-apl-guc/igt@gem_wait@busy@all.html
   [27]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20152/fi-apl-guc/igt@gem_wait@busy@all.html
    - fi-bsw-kefka:       [FAIL][28] ([i915#3177] / [i915#3457]) -> [PASS][29]
   [28]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10102/fi-bsw-kefka/igt@gem_wait@busy@all.html
   [29]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20152/fi-bsw-kefka/igt@gem_wait@busy@all.html

  * igt@gem_wait@wait@all:
    - fi-bwr-2160:        [FAIL][30] ([i915#3457]) -> [PASS][31]
   [30]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10102/fi-bwr-2160/igt@gem_wait@wait@all.html
   [31]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20152/fi-bwr-2160/igt@gem_wait@wait@all.html

  * igt@kms_chamelium@hdmi-hpd-fast:
    - fi-icl-u2:          [DMESG-WARN][32] ([i915#2868]) -> [PASS][33]
   [32]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10102/fi-icl-u2/igt@kms_chamelium@hdmi-hpd-fast.html
   [33]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20152/fi-icl-u2/igt@kms_chamelium@hdmi-hpd-fast.html

  
#### Warnings ####

  * igt@gem_exec_gttfill@basic:
    - fi-ilk-650:         [FAIL][34] ([i915#3472]) -> [FAIL][35] ([i915#3457] / [i915#3472])
   [34]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10102/fi-ilk-650/igt@gem_exec_gttfill@basic.html
   [35]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20152/fi-ilk-650/igt@gem_exec_gttfill@basic.html

  * igt@i915_module_load@reload:
    - fi-elk-e7500:       [DMESG-WARN][36] ([i915#3457]) -> [DMESG-FAIL][37] ([i915#3457])
   [36]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10102/fi-elk-e7500/igt@i915_module_load@reload.html
   [37]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20152/fi-elk-e7500/igt@i915_module_load@reload.html
    - fi-bsw-nick:        [DMESG-WARN][38] ([i915#3457]) -> [DMESG-FAIL][39] ([i915#3457])
   [38]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10102/fi-bsw-nick/igt@i915_module_load@reload.html
   [39]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20152/fi-bsw-nick/igt@i915_module_load@reload.html

  * igt@runner@aborted:
    - fi-skl-6600u:       [FAIL][40] ([i915#1436] / [i915#3363]) -> [FAIL][41] ([i915#1436] / [i915#2426] / [i915#3363])
   [40]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10102/fi-skl-6600u/igt@runner@aborted.html
   [41]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20152/fi-skl-6600u/igt@runner@aborted.html
    - fi-kbl-7567u:       [FAIL][42] ([i915#1436] / [i915#2426] / [i915#3363]) -> [FAIL][43] ([i915#1436] / [i915#3363])
   [42]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10102/fi-kbl-7567u/igt@runner@aborted.html
   [43]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20152/fi-kbl-7567u/igt@runner@aborted.html

  
  {name}: This element is suppressed. This means it is ignored when computing
          the status of the difference (SUCCESS, WARNING, or FAILURE).

  [i915#1436]: https://gitlab.freedesktop.org/drm/intel/issues/1436
  [i915#1982]: https://gitlab.freedesktop.org/drm/intel/issues/1982
  [i915#2426]: https://gitlab.freedesktop.org/drm/intel/issues/2426
  [i915#2868]: https://gitlab.freedesktop.org/drm/intel/issues/2868
  [i915#2932]: https://gitlab.freedesktop.org/drm/intel/issues/2932
  [i915#2966]: https://gitlab.freedesktop.org/drm/intel/issues/2966
  [i915#3177]: https://gitlab.freedesktop.org/drm/intel/issues/3177
  [i915#3363]: https://gitlab.freedesktop.org/drm/intel/issues/3363
  [i915#3457]: https://gitlab.freedesktop.org/drm/intel/issues/3457
  [i915#3472]: https://gitlab.freedesktop.org/drm/intel/issues/3472
  [i915#3475]: https://gitlab.freedesktop.org/drm/intel/issues/3475
  [i915#53]: https://gitlab.freedesktop.org/drm/intel/issues/53


Participating hosts (44 -> 38)
------------------------------

  Missing    (6): fi-rkl-11500t fi-ilk-m540 fi-hsw-4200u fi-bsw-cyan fi-ctg-p8600 fi-bdw-samus 


Build changes
-------------

  * Linux: CI_DRM_10102 -> Patchwork_20152

  CI-20190529: 20190529
  CI_DRM_10102: 085271acf13cb1943cbe359fdde2c35ec28f4963 @ git://anongit.freedesktop.org/gfx-ci/linux
  IGT_6088: 2c1e9a30f17944d75640a43d3b8c2124b035de1c @ git://anongit.freedesktop.org/xorg/app/intel-gpu-tools
  Patchwork_20152: f6a21e2aad5a5eef2854950534c9a18a43219b05 @ git://anongit.freedesktop.org/gfx-ci/linux


== Linux commits ==

f6a21e2aad5a Revert "drm/i915: Propagate errors on awaiting already signaled fences"
d4f598ce6c56 drm/i915/cmdparser: No-op failed batches on all platforms

== Logs ==

For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20152/index.html

[-- Attachment #1.2: Type: text/html, Size: 12458 bytes --]

[-- Attachment #2: Type: text/plain, Size: 160 bytes --]

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 25+ messages in thread

* [Intel-gfx] ✗ Fi.CI.CHECKPATCH: warning for series starting with [1/2] drm/i915/cmdparser: No-op failed batches on all platforms (rev2)
  2021-05-19  7:43 ` Daniel Vetter
                   ` (4 preceding siblings ...)
  (?)
@ 2021-05-19 11:09 ` Patchwork
  -1 siblings, 0 replies; 25+ messages in thread
From: Patchwork @ 2021-05-19 11:09 UTC (permalink / raw)
  To: Daniel Vetter; +Cc: intel-gfx

== Series Details ==

Series: series starting with [1/2] drm/i915/cmdparser: No-op failed batches on all platforms (rev2)
URL   : https://patchwork.freedesktop.org/series/90310/
State : warning

== Summary ==

$ dim checkpatch origin/drm-tip
2bbc347a556d drm/i915/cmdparser: No-op failed batches on all platforms
-:79: WARNING:FROM_SIGN_OFF_MISMATCH: From:/Signed-off-by: email address mismatch: 'From: Daniel Vetter <daniel.vetter@ffwll.ch>' != 'Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>'

total: 0 errors, 1 warnings, 0 checks, 49 lines checked
fcb029dfc866 Revert "drm/i915: Propagate errors on awaiting already signaled fences"
-:31: WARNING:COMMIT_LOG_LONG_LINE: Possible unwrapped commit description (prefer a maximum 75 chars per line)
#31: 
the right option here.Signed-off-by: Jason Ekstrand <jason.ekstrand@intel.com>

-:73: WARNING:FROM_SIGN_OFF_MISMATCH: From:/Signed-off-by: email address mismatch: 'From: Jason Ekstrand <jason@jlekstrand.net>' != 'Signed-off-by: Jason Ekstrand <jason.ekstrand@intel.com> (v1)'

total: 0 errors, 2 warnings, 0 checks, 22 lines checked


^ permalink raw reply	[flat|nested] 25+ messages in thread

* [Intel-gfx] ✓ Fi.CI.BAT: success for series starting with [1/2] drm/i915/cmdparser: No-op failed batches on all platforms (rev2)
  2021-05-19  7:43 ` Daniel Vetter
                   ` (5 preceding siblings ...)
  (?)
@ 2021-05-19 11:40 ` Patchwork
  -1 siblings, 0 replies; 25+ messages in thread
From: Patchwork @ 2021-05-19 11:40 UTC (permalink / raw)
  To: Daniel Vetter; +Cc: intel-gfx


[-- Attachment #1.1: Type: text/plain, Size: 11185 bytes --]

== Series Details ==

Series: series starting with [1/2] drm/i915/cmdparser: No-op failed batches on all platforms (rev2)
URL   : https://patchwork.freedesktop.org/series/90310/
State : success

== Summary ==

CI Bug Log - changes from CI_DRM_10102 -> Patchwork_20154
====================================================

Summary
-------

  **SUCCESS**

  No regressions found.

  External URL: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20154/index.html

Known issues
------------

  Here are the changes found in Patchwork_20154 that come from known issues:

### IGT changes ###

#### Issues hit ####

  * igt@core_hotunplug@unbind-rebind:
    - fi-bdw-5557u:       NOTRUN -> [WARN][1] ([i915#2283])
   [1]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20154/fi-bdw-5557u/igt@core_hotunplug@unbind-rebind.html

  * igt@gem_busy@busy@all:
    - fi-bsw-n3050:       [PASS][2] -> [FAIL][3] ([i915#3457])
   [2]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10102/fi-bsw-n3050/igt@gem_busy@busy@all.html
   [3]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20154/fi-bsw-n3050/igt@gem_busy@busy@all.html

  * igt@gem_exec_fence@basic-await@rcs0:
    - fi-bsw-nick:        [PASS][4] -> [FAIL][5] ([i915#3457])
   [4]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10102/fi-bsw-nick/igt@gem_exec_fence@basic-await@rcs0.html
   [5]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20154/fi-bsw-nick/igt@gem_exec_fence@basic-await@rcs0.html
    - fi-bsw-kefka:       [PASS][6] -> [FAIL][7] ([i915#3457])
   [6]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10102/fi-bsw-kefka/igt@gem_exec_fence@basic-await@rcs0.html
   [7]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20154/fi-bsw-kefka/igt@gem_exec_fence@basic-await@rcs0.html

  * igt@gem_exec_fence@nb-await@rcs0:
    - fi-elk-e7500:       [PASS][8] -> [FAIL][9] ([i915#3457])
   [8]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10102/fi-elk-e7500/igt@gem_exec_fence@nb-await@rcs0.html
   [9]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20154/fi-elk-e7500/igt@gem_exec_fence@nb-await@rcs0.html

  * igt@gem_exec_fence@nb-await@vecs0:
    - fi-glk-dsi:         [PASS][10] -> [FAIL][11] ([i915#3457])
   [10]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10102/fi-glk-dsi/igt@gem_exec_fence@nb-await@vecs0.html
   [11]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20154/fi-glk-dsi/igt@gem_exec_fence@nb-await@vecs0.html
    - fi-apl-guc:         [PASS][12] -> [FAIL][13] ([i915#3457])
   [12]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10102/fi-apl-guc/igt@gem_exec_fence@nb-await@vecs0.html
   [13]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20154/fi-apl-guc/igt@gem_exec_fence@nb-await@vecs0.html

  * igt@i915_selftest@live@execlists:
    - fi-bdw-5557u:       NOTRUN -> [DMESG-FAIL][14] ([i915#3462])
   [14]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20154/fi-bdw-5557u/igt@i915_selftest@live@execlists.html

  * igt@i915_selftest@live@mman:
    - fi-bdw-5557u:       NOTRUN -> [DMESG-WARN][15] ([i915#3457]) +1 similar issue
   [15]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20154/fi-bdw-5557u/igt@i915_selftest@live@mman.html

  * igt@kms_chamelium@dp-crc-fast:
    - fi-bdw-5557u:       NOTRUN -> [SKIP][16] ([fdo#109271] / [fdo#111827]) +8 similar issues
   [16]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20154/fi-bdw-5557u/igt@kms_chamelium@dp-crc-fast.html

  * igt@kms_pipe_crc_basic@hang-read-crc-pipe-a:
    - fi-bwr-2160:        [PASS][17] -> [FAIL][18] ([i915#53])
   [17]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10102/fi-bwr-2160/igt@kms_pipe_crc_basic@hang-read-crc-pipe-a.html
   [18]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20154/fi-bwr-2160/igt@kms_pipe_crc_basic@hang-read-crc-pipe-a.html

  * igt@kms_pipe_crc_basic@nonblocking-crc-pipe-a:
    - fi-elk-e7500:       [PASS][19] -> [FAIL][20] ([i915#53]) +1 similar issue
   [19]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10102/fi-elk-e7500/igt@kms_pipe_crc_basic@nonblocking-crc-pipe-a.html
   [20]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20154/fi-elk-e7500/igt@kms_pipe_crc_basic@nonblocking-crc-pipe-a.html

  * igt@kms_psr@cursor_plane_move:
    - fi-bdw-5557u:       NOTRUN -> [SKIP][21] ([fdo#109271]) +9 similar issues
   [21]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20154/fi-bdw-5557u/igt@kms_psr@cursor_plane_move.html

  
#### Possible fixes ####

  * igt@gem_exec_fence@basic-await@rcs0:
    - fi-bsw-n3050:       [FAIL][22] ([i915#3457]) -> [PASS][23]
   [22]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10102/fi-bsw-n3050/igt@gem_exec_fence@basic-await@rcs0.html
   [23]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20154/fi-bsw-n3050/igt@gem_exec_fence@basic-await@rcs0.html

  * igt@gem_exec_fence@nb-await@bcs0:
    - fi-bsw-kefka:       [FAIL][24] ([i915#3457]) -> [PASS][25]
   [24]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10102/fi-bsw-kefka/igt@gem_exec_fence@nb-await@bcs0.html
   [25]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20154/fi-bsw-kefka/igt@gem_exec_fence@nb-await@bcs0.html

  * igt@gem_exec_fence@nb-await@rcs0:
    - fi-glk-dsi:         [FAIL][26] ([i915#3457]) -> [PASS][27]
   [26]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10102/fi-glk-dsi/igt@gem_exec_fence@nb-await@rcs0.html
   [27]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20154/fi-glk-dsi/igt@gem_exec_fence@nb-await@rcs0.html

  * igt@gem_exec_fence@nb-await@vcs0:
    - fi-bsw-nick:        [FAIL][28] ([i915#3457]) -> [PASS][29] +2 similar issues
   [28]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10102/fi-bsw-nick/igt@gem_exec_fence@nb-await@vcs0.html
   [29]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20154/fi-bsw-nick/igt@gem_exec_fence@nb-await@vcs0.html

  * igt@gem_wait@busy@all:
    - fi-apl-guc:         [FAIL][30] ([i915#3457]) -> [PASS][31]
   [30]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10102/fi-apl-guc/igt@gem_wait@busy@all.html
   [31]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20154/fi-apl-guc/igt@gem_wait@busy@all.html

  * igt@gem_wait@wait@all:
    - fi-bwr-2160:        [FAIL][32] ([i915#3457]) -> [PASS][33]
   [32]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10102/fi-bwr-2160/igt@gem_wait@wait@all.html
   [33]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20154/fi-bwr-2160/igt@gem_wait@wait@all.html

  * igt@kms_chamelium@hdmi-hpd-fast:
    - fi-icl-u2:          [DMESG-WARN][34] ([i915#2868]) -> [PASS][35]
   [34]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10102/fi-icl-u2/igt@kms_chamelium@hdmi-hpd-fast.html
   [35]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20154/fi-icl-u2/igt@kms_chamelium@hdmi-hpd-fast.html

  * igt@kms_pipe_crc_basic@nonblocking-crc-pipe-a:
    - fi-ilk-650:         [FAIL][36] ([i915#53]) -> [PASS][37] +1 similar issue
   [36]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10102/fi-ilk-650/igt@kms_pipe_crc_basic@nonblocking-crc-pipe-a.html
   [37]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20154/fi-ilk-650/igt@kms_pipe_crc_basic@nonblocking-crc-pipe-a.html

  * igt@kms_pipe_crc_basic@suspend-read-crc-pipe-a:
    - fi-elk-e7500:       [FAIL][38] ([i915#53]) -> [PASS][39]
   [38]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10102/fi-elk-e7500/igt@kms_pipe_crc_basic@suspend-read-crc-pipe-a.html
   [39]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20154/fi-elk-e7500/igt@kms_pipe_crc_basic@suspend-read-crc-pipe-a.html

  
#### Warnings ####

  * igt@i915_module_load@reload:
    - fi-elk-e7500:       [DMESG-WARN][40] ([i915#3457]) -> [DMESG-FAIL][41] ([i915#3457])
   [40]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10102/fi-elk-e7500/igt@i915_module_load@reload.html
   [41]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20154/fi-elk-e7500/igt@i915_module_load@reload.html
    - fi-bsw-kefka:       [DMESG-FAIL][42] ([i915#1982] / [i915#3457]) -> [DMESG-WARN][43] ([i915#1982] / [i915#3457])
   [42]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10102/fi-bsw-kefka/igt@i915_module_load@reload.html
   [43]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20154/fi-bsw-kefka/igt@i915_module_load@reload.html

  * igt@runner@aborted:
    - fi-bdw-5557u:       [FAIL][44] ([i915#1602] / [i915#2029]) -> [FAIL][45] ([i915#3462])
   [44]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10102/fi-bdw-5557u/igt@runner@aborted.html
   [45]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20154/fi-bdw-5557u/igt@runner@aborted.html
    - fi-kbl-guc:         [FAIL][46] ([i915#1436] / [i915#2426] / [i915#3363]) -> [FAIL][47] ([i915#1436] / [i915#3363])
   [46]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10102/fi-kbl-guc/igt@runner@aborted.html
   [47]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20154/fi-kbl-guc/igt@runner@aborted.html
    - fi-skl-6700k2:      [FAIL][48] ([i915#1436] / [i915#3363]) -> [FAIL][49] ([i915#1436] / [i915#2426] / [i915#3363])
   [48]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10102/fi-skl-6700k2/igt@runner@aborted.html
   [49]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20154/fi-skl-6700k2/igt@runner@aborted.html

  
  {name}: This element is suppressed. This means it is ignored when computing
          the status of the difference (SUCCESS, WARNING, or FAILURE).

  [fdo#109271]: https://bugs.freedesktop.org/show_bug.cgi?id=109271
  [fdo#111827]: https://bugs.freedesktop.org/show_bug.cgi?id=111827
  [i915#1436]: https://gitlab.freedesktop.org/drm/intel/issues/1436
  [i915#1602]: https://gitlab.freedesktop.org/drm/intel/issues/1602
  [i915#1982]: https://gitlab.freedesktop.org/drm/intel/issues/1982
  [i915#2029]: https://gitlab.freedesktop.org/drm/intel/issues/2029
  [i915#2283]: https://gitlab.freedesktop.org/drm/intel/issues/2283
  [i915#2426]: https://gitlab.freedesktop.org/drm/intel/issues/2426
  [i915#2868]: https://gitlab.freedesktop.org/drm/intel/issues/2868
  [i915#2932]: https://gitlab.freedesktop.org/drm/intel/issues/2932
  [i915#2966]: https://gitlab.freedesktop.org/drm/intel/issues/2966
  [i915#3363]: https://gitlab.freedesktop.org/drm/intel/issues/3363
  [i915#3457]: https://gitlab.freedesktop.org/drm/intel/issues/3457
  [i915#3462]: https://gitlab.freedesktop.org/drm/intel/issues/3462
  [i915#53]: https://gitlab.freedesktop.org/drm/intel/issues/53


Participating hosts (44 -> 38)
------------------------------

  Missing    (6): fi-rkl-11500t fi-ilk-m540 fi-hsw-4200u fi-bsw-cyan fi-ctg-p8600 fi-bdw-samus 


Build changes
-------------

  * Linux: CI_DRM_10102 -> Patchwork_20154

  CI-20190529: 20190529
  CI_DRM_10102: 085271acf13cb1943cbe359fdde2c35ec28f4963 @ git://anongit.freedesktop.org/gfx-ci/linux
  IGT_6088: 2c1e9a30f17944d75640a43d3b8c2124b035de1c @ git://anongit.freedesktop.org/xorg/app/intel-gpu-tools
  Patchwork_20154: fcb029dfc866c5779796608d76dcc2696de5bde3 @ git://anongit.freedesktop.org/gfx-ci/linux


== Linux commits ==

fcb029dfc866 Revert "drm/i915: Propagate errors on awaiting already signaled fences"
2bbc347a556d drm/i915/cmdparser: No-op failed batches on all platforms

== Logs ==

For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20154/index.html

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 1/2] drm/i915/cmdparser: No-op failed batches on all platforms
  2021-05-19  7:43 ` Daniel Vetter
  (?)
@ 2021-05-19 15:04   ` Jason Ekstrand
  -1 siblings, 0 replies; 25+ messages in thread
From: Jason Ekstrand @ 2021-05-19 15:04 UTC (permalink / raw)
  To: Daniel Vetter
  Cc: DRI Development, Intel Graphics Development, stable,
	Jason Ekstrand, Jon Bloomfield, Marcin Slusarz, Daniel Vetter

On Wed, May 19, 2021 at 2:43 AM Daniel Vetter <daniel.vetter@ffwll.ch> wrote:
>
> On gen9 for blt cmd parser we relied on the magic fence error
> propagation which:
> - doesn't work on gen7, because there's no scheduler with ringbuffers
>   there yet
> - fence error propagation can be weaponized to attack other things, so
>   not a good design idea
>
> Instead of magic, do the same thing on gen9 as on gen7.

I think the commit message could be improved.  Maybe something like this?

When we re-introduced the command parser on Gen9 platforms to protect
against BLT CS register writes, we did things a bit differently than
on previous platforms.  On Gen7 platforms, if a batch contains
unsupported commands, we smash the start of the shadow batch to
MI_BATCH_BUFFER_END to cancel the batch.  If it's mostly ok
(-EACCES), we trampoline to run in unprivileged mode and let the
limited HW parser handle security.  On Gen9, we only care about
rejecting batches because we don't trust the HW parser for a few cases
so we don't need this second trampoline case.

However, instead of stopping there and avoiding the trampoline, we
chose to avoid executing the new batch altogether on Gen9 by use of
dma-fence error propagation.  When the batch parser fails, it returns
a non-zero error and we would propagate that through the chain of
fences and trust the scheduler to know to cancel anything dependent on
a fence with an error.  However, fence error propagation is sketchy at
best and can be weaponized to attack other things so it's not really a
good design.  This commit restores a bit of the Gen7 functionality on
Gen9 (smashing the start of the shadow batch to MI_BB_END) so that
it's always safe to run the batch post-parser.  A later commit will
get rid of the error propagation nonsense.
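
As a rough illustration of the "smash the start of the shadow batch"
idea described above, here is a minimal user-space sketch in C.  It is
not the i915 code: toy_parse(), toy_execute(), TOY_FORBIDDEN_CMD and
the TOY_BATCH_BUFFER_END value are hypothetical stand-ins invented for
this example, and the "hardware" is just a loop that counts commands
until it reaches the end marker.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in opcodes for this toy model; not the real i915 encodings. */
#define TOY_BATCH_BUFFER_END 0xE0D0E0D0u
#define TOY_FORBIDDEN_CMD    0xBAD0BAD0u

/*
 * Hypothetical parser: reject the batch if a forbidden command appears
 * before the end marker.  On any failure the first dword of the shadow
 * copy is overwritten with the end marker, so the batch becomes a no-op.
 */
static int toy_parse(uint32_t *shadow, size_t len)
{
	int ret = 0;
	size_t i;

	for (i = 0; i < len; i++) {
		if (shadow[i] == TOY_BATCH_BUFFER_END)
			break;
		if (shadow[i] == TOY_FORBIDDEN_CMD) {
			ret = -EINVAL;
			break;
		}
	}

	/* Batch unsafe to execute: cancel it in place. */
	if (ret && len > 0)
		shadow[0] = TOY_BATCH_BUFFER_END;

	return ret;
}

/* Toy "hardware": count the commands executed before the end marker. */
static size_t toy_execute(const uint32_t *shadow, size_t len)
{
	size_t n = 0;

	while (n < len && shadow[n] != TOY_BATCH_BUFFER_END)
		n++;
	return n;
}
```

In this model a rejected batch executes zero commands even if the
caller ignores the return value of toy_parse(), which is the property
the commit message is after: safety no longer depends on some later
component noticing and honouring an error code.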

>
> Kudos to Jason for figuring this out.
>
> Fixes: 9e31c1fe45d5 ("drm/i915: Propagate errors on awaiting already signaled fences")
> Cc: <stable@vger.kernel.org> # v5.6+
> Cc: Jason Ekstrand <jason.ekstrand@intel.com>
> Cc: Marcin Slusarz <marcin.slusarz@intel.com>
> Cc: Jon Bloomfield <jon.bloomfield@intel.com>
> Relates: https://gitlab.freedesktop.org/drm/intel/-/issues/3080
> Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
> ---
>  drivers/gpu/drm/i915/i915_cmd_parser.c | 34 +++++++++++++-------------
>  1 file changed, 17 insertions(+), 17 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/i915_cmd_parser.c b/drivers/gpu/drm/i915/i915_cmd_parser.c
> index 5b4b2bd46e7c..2d3336ab7ba3 100644
> --- a/drivers/gpu/drm/i915/i915_cmd_parser.c
> +++ b/drivers/gpu/drm/i915/i915_cmd_parser.c
> @@ -1509,6 +1509,12 @@ int intel_engine_cmd_parser(struct intel_engine_cs *engine,
>                 }
>         }
>
> +       /* Batch unsafe to execute with privileges, cancel! */
> +       if (ret) {
> +               cmd = page_mask_bits(shadow->obj->mm.mapping);
> +               *cmd = MI_BATCH_BUFFER_END;
> +       }
> +
>         if (trampoline) {
>                 /*
>                  * With the trampoline, the shadow is executed twice.
> @@ -1524,26 +1530,20 @@ int intel_engine_cmd_parser(struct intel_engine_cs *engine,
>                  */
>                 *batch_end = MI_BATCH_BUFFER_END;

Bit of a bike shed but, given the new structure of the code, I think
it makes it more clear if we do

if (ret == -EACCES) {
   /* stuff */
   __gen6_emit_bb_start(...);
} else {
   *batch_end = MI_BATCH_BUFFER_END;
}

That way it's clear that we're making a choice between firing off the
client batch in unprivileged mode and ending early.
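
One possible reading of that either/or structure, written out as a
self-contained C sketch.  Everything here is an assumption made for
illustration: toy_finish_trampoline() and toy_emit_bb_start() are
hypothetical helpers (the latter only loosely mirrors the argument
order of the __gen6_emit_bb_start() call in the patch), and the TOY_*
values are made-up encodings, not hardware opcodes.

```c
#include <errno.h>
#include <stdint.h>

/* Made-up encodings for this sketch; not the real i915 values. */
#define TOY_BATCH_BUFFER_END 0xE0D0E0D0u
#define TOY_BB_START         0x31000000u
#define TOY_NON_SECURE       0x00000100u

/* Hypothetical stand-in for emitting a two-dword batch-start command. */
static void toy_emit_bb_start(uint32_t *cs, uint32_t addr, uint32_t flags)
{
	cs[0] = TOY_BB_START | flags;
	cs[1] = addr;
}

/*
 * Finish the trampoline slot at batch_end (needs room for two dwords).
 * Exactly one of two things happens:
 *  - unsafe but valid (-EACCES): jump to the original batch, non-secure;
 *  - anything else: terminate the shadow batch here.
 * Returns the error code the caller should see afterwards.
 */
static int toy_finish_trampoline(int ret, uint32_t *batch_end,
				 uint32_t batch_addr)
{
	if (ret == -EACCES) {
		toy_emit_bb_start(batch_end, batch_addr, TOY_NON_SECURE);
		return 0; /* allow execution */
	}

	*batch_end = TOY_BATCH_BUFFER_END;
	return ret;
}
```

The point of the if/else form is that the two outcomes are mutually
exclusive on the page: the slot either chains to the client batch or
ends the shadow, with no unconditional write that is later overwritten.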

>
> -               if (ret) {
> -                       /* Batch unsafe to execute with privileges, cancel! */
> -                       cmd = page_mask_bits(shadow->obj->mm.mapping);
> -                       *cmd = MI_BATCH_BUFFER_END;
> +               /* If batch is unsafe but valid, jump to the original */
> +               if (ret == -EACCES) {
> +                       unsigned int flags;
>
> -                       /* If batch is unsafe but valid, jump to the original */
> -                       if (ret == -EACCES) {
> -                               unsigned int flags;
> +                       flags = MI_BATCH_NON_SECURE_I965;
> +                       if (IS_HASWELL(engine->i915))
> +                               flags = MI_BATCH_NON_SECURE_HSW;
>
> -                               flags = MI_BATCH_NON_SECURE_I965;
> -                               if (IS_HASWELL(engine->i915))
> -                                       flags = MI_BATCH_NON_SECURE_HSW;
> +                       GEM_BUG_ON(!IS_GEN_RANGE(engine->i915, 6, 7));
> +                       __gen6_emit_bb_start(batch_end,
> +                                            batch_addr,
> +                                            flags);
>
> -                               GEM_BUG_ON(!IS_GEN_RANGE(engine->i915, 6, 7));
> -                               __gen6_emit_bb_start(batch_end,
> -                                                    batch_addr,
> -                                                    flags);
> -
> -                               ret = 0; /* allow execution */
> -                       }
> +                       ret = 0; /* allow execution */
>                 }
>         }
>
> --
> 2.31.0
>

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH] Revert "drm/i915: Propagate errors on awaiting already signaled fences"
  2021-05-19 10:15     ` Daniel Vetter
  (?)
@ 2021-05-19 15:06       ` Jason Ekstrand
  -1 siblings, 0 replies; 25+ messages in thread
From: Jason Ekstrand @ 2021-05-19 15:06 UTC (permalink / raw)
  To: Daniel Vetter
  Cc: Intel Graphics Development, DRI Development, Jason Ekstrand,
	Marcin Slusarz, stable, Jon Bloomfield

Once we no longer rely on error propagation, I think there's a lot we
can rip out.

--Jason

On Wed, May 19, 2021 at 5:15 AM Daniel Vetter <daniel.vetter@ffwll.ch> wrote:
>
> From: Jason Ekstrand <jason@jlekstrand.net>
>
> This reverts commit 9e31c1fe45d555a948ff66f1f0e3fe1f83ca63f7.  Ever
> since that commit, we've been having issues where a hang in one client
> can propagate to another.  In particular, a hang in an app can propagate
> to the X server which causes the whole desktop to lock up.
>
> Error propagation along fences sounds like a good idea, but as your bug
> shows, it has surprising consequences, since propagating errors across
> security boundaries is not a good thing.
>
> What we do have is hang tracking on the ctx, which reports information to
> userspace using RESET_STATS. That's how arb_robustness works. Also, if my
> understanding is still correct, execbuf returns EIO when your context
> is banned (because it is not recoverable or has had too many hangs). In
> all these cases it's up to userspace to figure out what is impacted and
> what should be reported to the application; it's not on the kernel to
> guess and automatically propagate.
>
> What's more, we're also building more features on top of ctx error
> reporting with RESET_STATS ioctl: Encrypted buffers use the same, and the
> userspace fence wait also relies on that mechanism. So it is the path
> going forward for reporting gpu hangs and resets to userspace.
>
> So altogether, that's why I think we should just bury this idea again as
> not quite the direction we want to go in, which is why I think the revert
> is the right option here.
>
> Signed-off-by: Jason Ekstrand <jason.ekstrand@intel.com>
>
> v2: Augment commit message. Also restore Jason's sob that I
> accidentally lost.
>
> Signed-off-by: Jason Ekstrand <jason.ekstrand@intel.com> (v1)
> Reported-by: Marcin Slusarz <marcin.slusarz@intel.com>
> Cc: <stable@vger.kernel.org> # v5.6+
> Cc: Jason Ekstrand <jason.ekstrand@intel.com>
> Cc: Marcin Slusarz <marcin.slusarz@intel.com>
> Cc: Jon Bloomfield <jon.bloomfield@intel.com>
> Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/3080
> Fixes: 9e31c1fe45d5 ("drm/i915: Propagate errors on awaiting already signaled fences")
> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
> ---
>  drivers/gpu/drm/i915/i915_request.c | 8 ++------
>  1 file changed, 2 insertions(+), 6 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/i915_request.c b/drivers/gpu/drm/i915/i915_request.c
> index 970d8f4986bb..b796197c0772 100644
> --- a/drivers/gpu/drm/i915/i915_request.c
> +++ b/drivers/gpu/drm/i915/i915_request.c
> @@ -1426,10 +1426,8 @@ i915_request_await_execution(struct i915_request *rq,
>
>         do {
>                 fence = *child++;
> -               if (test_bit(DMA_FENCE_FLAG_SIGNALED_BIT, &fence->flags)) {
> -                       i915_sw_fence_set_error_once(&rq->submit, fence->error);
> +               if (test_bit(DMA_FENCE_FLAG_SIGNALED_BIT, &fence->flags))
>                         continue;
> -               }
>
>                 if (fence->context == rq->fence.context)
>                         continue;
> @@ -1527,10 +1525,8 @@ i915_request_await_dma_fence(struct i915_request *rq, struct dma_fence *fence)
>
>         do {
>                 fence = *child++;
> -               if (test_bit(DMA_FENCE_FLAG_SIGNALED_BIT, &fence->flags)) {
> -                       i915_sw_fence_set_error_once(&rq->submit, fence->error);
> +               if (test_bit(DMA_FENCE_FLAG_SIGNALED_BIT, &fence->flags))
>                         continue;
> -               }
>
>                 /*
>                  * Requests on the same timeline are explicitly ordered, along
> --
> 2.31.0
>
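
The effect of the two removed i915_sw_fence_set_error_once() calls in the
quoted diff can be illustrated with a toy model. This is only a sketch
under stated assumptions: the toy_* types and the propagate switch are
invented for illustration and do not exist in i915. The model mirrors just
the shape of the loop, where a signaled fence is skipped and, before the
revert, its error was first copied onto the waiting request.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-ins for a dma-fence and a request; not the kernel structs. */
struct toy_fence {
	bool signaled;
	int error;
};

struct toy_request {
	int submit_error; /* first error inherited from a dependency */
	size_t waits;     /* fences the request still has to wait on */
};

/* propagate == true models the code before the revert, false after it. */
static void toy_await(struct toy_request *rq, const struct toy_fence *fences,
		      size_t count, bool propagate)
{
	assert(rq != NULL && fences != NULL);

	for (size_t i = 0; i < count; i++) {
		const struct toy_fence *fence = &fences[i];

		if (fence->signaled) {
			/* "set error once": only the first non-zero error sticks. */
			if (propagate && fence->error && !rq->submit_error)
				rq->submit_error = fence->error;
			continue;
		}

		rq->waits++;
	}
}
```

With propagate set, a request that merely depends on another client's
failed, already signaled fence ends up carrying that client's error; with
it cleared, the request only counts the fences it still has to wait for.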

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH] Revert "drm/i915: Propagate errors on awaiting already signaled fences"
  2021-05-19 15:06       ` Jason Ekstrand
  (?)
@ 2021-05-19 17:16         ` Daniel Vetter
  -1 siblings, 0 replies; 25+ messages in thread
From: Daniel Vetter @ 2021-05-19 17:16 UTC (permalink / raw)
  To: Jason Ekstrand
  Cc: Intel Graphics Development, DRI Development, Jason Ekstrand,
	Marcin Slusarz, stable, Jon Bloomfield

On Wed, May 19, 2021 at 5:06 PM Jason Ekstrand <jason@jlekstrand.net> wrote:
>
> Once we no longer rely on error propagation, I think there's a lot we
> can rip out.

I honestly did not find that much ... what did you uncover?
-Daniel

>
> --Jason
>
> On Wed, May 19, 2021 at 5:15 AM Daniel Vetter <daniel.vetter@ffwll.ch> wrote:
> >
> > From: Jason Ekstrand <jason@jlekstrand.net>
> >
> > This reverts commit 9e31c1fe45d555a948ff66f1f0e3fe1f83ca63f7.  Ever
> > since that commit, we've been having issues where a hang in one client
> > can propagate to another.  In particular, a hang in an app can propagate
> > to the X server which causes the whole desktop to lock up.
> >
> > Error propagation along fences sounds like a good idea, but as your bug
> > shows, it has surprising consequences, since propagating errors across
> > security boundaries is not a good thing.
> >
> > What we do have is hang tracking on the ctx, which reports information to
> > userspace using RESET_STATS. That's how arb_robustness works. Also, if my
> > understanding is still correct, execbuf returns EIO when your context
> > is banned (because it is not recoverable or has had too many hangs). In
> > all these cases it's up to userspace to figure out what is impacted and
> > what should be reported to the application; it's not on the kernel to
> > guess and automatically propagate.
> >
> > What's more, we're also building more features on top of ctx error
> > reporting with RESET_STATS ioctl: Encrypted buffers use the same, and the
> > userspace fence wait also relies on that mechanism. So it is the path
> > going forward for reporting gpu hangs and resets to userspace.
> >
> > So altogether, that's why I think we should just bury this idea again as
> > not quite the direction we want to go in, which is why I think the revert
> > is the right option here.
> >
> > Signed-off-by: Jason Ekstrand <jason.ekstrand@intel.com>
> >
> > v2: Augment commit message. Also restore Jason's sob that I
> > accidentally lost.
> >
> > Signed-off-by: Jason Ekstrand <jason.ekstrand@intel.com> (v1)
> > Reported-by: Marcin Slusarz <marcin.slusarz@intel.com>
> > Cc: <stable@vger.kernel.org> # v5.6+
> > Cc: Jason Ekstrand <jason.ekstrand@intel.com>
> > Cc: Marcin Slusarz <marcin.slusarz@intel.com>
> > Cc: Jon Bloomfield <jon.bloomfield@intel.com>
> > Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/3080
> > Fixes: 9e31c1fe45d5 ("drm/i915: Propagate errors on awaiting already signaled fences")
> > Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
> > ---
> >  drivers/gpu/drm/i915/i915_request.c | 8 ++------
> >  1 file changed, 2 insertions(+), 6 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/i915/i915_request.c b/drivers/gpu/drm/i915/i915_request.c
> > index 970d8f4986bb..b796197c0772 100644
> > --- a/drivers/gpu/drm/i915/i915_request.c
> > +++ b/drivers/gpu/drm/i915/i915_request.c
> > @@ -1426,10 +1426,8 @@ i915_request_await_execution(struct i915_request *rq,
> >
> >         do {
> >                 fence = *child++;
> > -               if (test_bit(DMA_FENCE_FLAG_SIGNALED_BIT, &fence->flags)) {
> > -                       i915_sw_fence_set_error_once(&rq->submit, fence->error);
> > +               if (test_bit(DMA_FENCE_FLAG_SIGNALED_BIT, &fence->flags))
> >                         continue;
> > -               }
> >
> >                 if (fence->context == rq->fence.context)
> >                         continue;
> > @@ -1527,10 +1525,8 @@ i915_request_await_dma_fence(struct i915_request *rq, struct dma_fence *fence)
> >
> >         do {
> >                 fence = *child++;
> > -               if (test_bit(DMA_FENCE_FLAG_SIGNALED_BIT, &fence->flags)) {
> > -                       i915_sw_fence_set_error_once(&rq->submit, fence->error);
> > +               if (test_bit(DMA_FENCE_FLAG_SIGNALED_BIT, &fence->flags))
> >                         continue;
> > -               }
> >
> >                 /*
> >                  * Requests on the same timeline are explicitly ordered, along
> > --
> > 2.31.0
> >



-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH] Revert "drm/i915: Propagate errors on awaiting already signaled fences"
  2021-05-19 17:16         ` Daniel Vetter
@ 2021-05-19 19:01           ` Jason Ekstrand
  -1 siblings, 0 replies; 25+ messages in thread
From: Jason Ekstrand @ 2021-05-19 19:01 UTC (permalink / raw)
  To: Daniel Vetter
  Cc: Intel Graphics Development, stable, Jason Ekstrand,
	Jon Bloomfield, DRI Development, Marcin Slusarz

On May 19, 2021 12:16:15 Daniel Vetter <daniel.vetter@ffwll.ch> wrote:

> On Wed, May 19, 2021 at 5:06 PM Jason Ekstrand <jason@jlekstrand.net> wrote:
>>
>> Once we no longer rely on error propagation, I think there's a lot we
>> can rip out.
>
> I honestly did not find that much ... what did you uncover?

When I was digging through this earlier today, I think I convinced myself 
that the cmdparser is the only user of the entire fence error architecture. 
I may have missed something, though.

--Jason

>
> -Daniel
>
>>
>> --Jason
>>
>> On Wed, May 19, 2021 at 5:15 AM Daniel Vetter <daniel.vetter@ffwll.ch> wrote:
>>>
>>> From: Jason Ekstrand <jason@jlekstrand.net>
>>>
>>> This reverts commit 9e31c1fe45d555a948ff66f1f0e3fe1f83ca63f7.  Ever
>>> since that commit, we've been having issues where a hang in one client
>>> can propagate to another.  In particular, a hang in an app can propagate
>>> to the X server which causes the whole desktop to lock up.
>>>
>>> Error propagation along fences sounds like a good idea, but as your bug
>>> shows, it has surprising consequences, since propagating errors across
>>> security boundaries is not a good thing.
>>>
>>> What we do have is tracking of hangs on the ctx, with the information
>>> reported to userspace through RESET_STATS. That's how arb_robustness works.
>>> Also, if my understanding is still correct, the EIO from execbuf means your
>>> context is banned (because it is not recoverable or has had too many hangs).
>>> In all these cases it's up to userspace to figure out what is impacted and
>>> what should be reported to the application; it's not on the kernel to guess
>>> and automatically propagate.
>>>
>>> What's more, we're also building more features on top of ctx error
>>> reporting with the RESET_STATS ioctl: encrypted buffers use the same, and the
>>> userspace fence wait also relies on that mechanism. So it is the path
>>> going forward for reporting gpu hangs and resets to userspace.
>>>
>>> So all together that's why I think we should just bury this idea again as
>>> not quite the direction we want to go in, hence why I think the revert is
>>> the right option here.
>>> Signed-off-by: Jason Ekstrand <jason.ekstrand@intel.com>
>>>
>>> v2: Augment commit message. Also restore Jason's sob that I
>>> accidentally lost.
>>>
>>> Signed-off-by: Jason Ekstrand <jason.ekstrand@intel.com> (v1)
>>> Reported-by: Marcin Slusarz <marcin.slusarz@intel.com>
>>> Cc: <stable@vger.kernel.org> # v5.6+
>>> Cc: Jason Ekstrand <jason.ekstrand@intel.com>
>>> Cc: Marcin Slusarz <marcin.slusarz@intel.com>
>>> Cc: Jon Bloomfield <jon.bloomfield@intel.com>
>>> Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/3080
>>> Fixes: 9e31c1fe45d5 ("drm/i915: Propagate errors on awaiting already signaled fences")
>>> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
>>> ---
>>>  drivers/gpu/drm/i915/i915_request.c | 8 ++------
>>>  1 file changed, 2 insertions(+), 6 deletions(-)
>>>
>>> diff --git a/drivers/gpu/drm/i915/i915_request.c b/drivers/gpu/drm/i915/i915_request.c
>>> index 970d8f4986bb..b796197c0772 100644
>>> --- a/drivers/gpu/drm/i915/i915_request.c
>>> +++ b/drivers/gpu/drm/i915/i915_request.c
>>> @@ -1426,10 +1426,8 @@ i915_request_await_execution(struct i915_request *rq,
>>>
>>>         do {
>>>                 fence = *child++;
>>> -               if (test_bit(DMA_FENCE_FLAG_SIGNALED_BIT, &fence->flags)) {
>>> -                       i915_sw_fence_set_error_once(&rq->submit, fence->error);
>>> +               if (test_bit(DMA_FENCE_FLAG_SIGNALED_BIT, &fence->flags))
>>>                         continue;
>>> -               }
>>>
>>>                 if (fence->context == rq->fence.context)
>>>                         continue;
>>> @@ -1527,10 +1525,8 @@ i915_request_await_dma_fence(struct i915_request *rq, struct dma_fence *fence)
>>>
>>>         do {
>>>                 fence = *child++;
>>> -               if (test_bit(DMA_FENCE_FLAG_SIGNALED_BIT, &fence->flags)) {
>>> -                       i915_sw_fence_set_error_once(&rq->submit, fence->error);
>>> +               if (test_bit(DMA_FENCE_FLAG_SIGNALED_BIT, &fence->flags))
>>>                         continue;
>>> -               }
>>>
>>>                 /*
>>>                  * Requests on the same timeline are explicitly ordered, along
>>> --
>>> 2.31.0
>
>
>
> --
> Daniel Vetter
> Software Engineer, Intel Corporation
> http://blog.ffwll.ch


^ permalink raw reply	[flat|nested] 25+ messages in thread
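
The behavioural change in the revert quoted in this thread can be illustrated with a small user-space model. This is only a sketch under stated assumptions, not kernel code: the toy_* types and functions are hypothetical stand-ins for dma_fence, i915_request and i915_sw_fence_set_error_once(), and the "record only the first non-zero error" rule is assumed from the helper's name and its use in the diff.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for illustration only; not the kernel's types. */
struct toy_fence {
	bool signaled;	/* models DMA_FENCE_FLAG_SIGNALED_BIT */
	int error;	/* models fence->error: 0 or a negative errno */
};

struct toy_request {
	int submit_error;	/* models the error recorded on rq->submit */
	size_t nr_waits;	/* unsignaled fences the request must wait on */
};

/* Assumed semantics: keep only the first non-zero error. */
static void toy_set_error_once(struct toy_request *rq, int error)
{
	if (error && !rq->submit_error)
		rq->submit_error = error;
}

/* Before the revert: an already signaled fence still hands its error on. */
static void toy_await_propagating(struct toy_request *rq,
				  const struct toy_fence *fences, size_t n)
{
	for (size_t i = 0; i < n; i++) {
		if (fences[i].signaled) {
			toy_set_error_once(rq, fences[i].error);
			continue;
		}
		rq->nr_waits++;
	}
}

/* After the revert: an already signaled fence is skipped, error and all. */
static void toy_await_reverted(struct toy_request *rq,
			       const struct toy_fence *fences, size_t n)
{
	for (size_t i = 0; i < n; i++) {
		if (fences[i].signaled)
			continue;
		rq->nr_waits++;
	}
}
```

In this model, a request that awaits a signaled fence carrying an error (for example one left behind by another client's hang) ends up with that error under toy_await_propagating() and stays clean under toy_await_reverted(). The number of fences it actually waits on is the same in both cases.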

* [Intel-gfx] ✗ Fi.CI.IGT: failure for series starting with [1/2] drm/i915/cmdparser: No-op failed batches on all platforms (rev2)
  2021-05-19  7:43 ` Daniel Vetter
                   ` (7 preceding siblings ...)
  (?)
@ 2021-05-21  1:03 ` Patchwork
  -1 siblings, 0 replies; 25+ messages in thread
From: Patchwork @ 2021-05-21  1:03 UTC (permalink / raw)
  To: Daniel Vetter; +Cc: intel-gfx


== Series Details ==

Series: series starting with [1/2] drm/i915/cmdparser: No-op failed batches on all platforms (rev2)
URL   : https://patchwork.freedesktop.org/series/90310/
State : failure

== Summary ==

CI Bug Log - changes from CI_DRM_10102_full -> Patchwork_20154_full
====================================================

Summary
-------

  **FAILURE**

  Serious unknown changes coming with Patchwork_20154_full absolutely need to be
  verified manually.
  
  If you think the reported changes have nothing to do with the changes
  introduced in Patchwork_20154_full, please notify your bug team to allow them
  to document this new failure mode, which will reduce false positives in CI.

  

Possible new issues
-------------------

  Here are the unknown changes that may have been introduced in Patchwork_20154_full:

### IGT changes ###

#### Possible regressions ####

  * igt@gem_ppgtt@blt-vs-render-ctx0:
    - shard-apl:          NOTRUN -> [FAIL][1]
   [1]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20154/shard-apl7/igt@gem_ppgtt@blt-vs-render-ctx0.html

  * igt@gen9_exec_parse@batch-without-end:
    - shard-skl:          NOTRUN -> [FAIL][2] +2 similar issues
   [2]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20154/shard-skl3/igt@gen9_exec_parse@batch-without-end.html
    - shard-kbl:          [PASS][3] -> [FAIL][4]
   [3]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10102/shard-kbl7/igt@gen9_exec_parse@batch-without-end.html
   [4]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20154/shard-kbl4/igt@gen9_exec_parse@batch-without-end.html

  * igt@gen9_exec_parse@bb-start-param:
    - shard-skl:          [PASS][5] -> [FAIL][6] +1 similar issue
   [5]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10102/shard-skl1/igt@gen9_exec_parse@bb-start-param.html
   [6]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20154/shard-skl1/igt@gen9_exec_parse@bb-start-param.html

  * igt@gen9_exec_parse@unaligned-jump:
    - shard-apl:          [PASS][7] -> [FAIL][8]
   [7]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10102/shard-apl7/igt@gen9_exec_parse@unaligned-jump.html
   [8]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20154/shard-apl7/igt@gen9_exec_parse@unaligned-jump.html
    - shard-glk:          [PASS][9] -> [FAIL][10] +4 similar issues
   [9]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10102/shard-glk7/igt@gen9_exec_parse@unaligned-jump.html
   [10]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20154/shard-glk6/igt@gen9_exec_parse@unaligned-jump.html
    - shard-kbl:          NOTRUN -> [FAIL][11]
   [11]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20154/shard-kbl2/igt@gen9_exec_parse@unaligned-jump.html

  * igt@kms_cursor_legacy@all-pipes-forked-bo:
    - shard-tglb:         [PASS][12] -> [INCOMPLETE][13]
   [12]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10102/shard-tglb1/igt@kms_cursor_legacy@all-pipes-forked-bo.html
   [13]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20154/shard-tglb6/igt@kms_cursor_legacy@all-pipes-forked-bo.html

  
#### Warnings ####

  * igt@gem_render_copy@y-tiled-ccs-to-yf-tiled-ccs:
    - shard-glk:          [INCOMPLETE][14] ([i915#3468]) -> [INCOMPLETE][15]
   [14]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10102/shard-glk8/igt@gem_render_copy@y-tiled-ccs-to-yf-tiled-ccs.html
   [15]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20154/shard-glk7/igt@gem_render_copy@y-tiled-ccs-to-yf-tiled-ccs.html

  
Known issues
------------

  Here are the changes found in Patchwork_20154_full that come from known issues:

### IGT changes ###

#### Issues hit ####

  * igt@drm_read@fault-buffer:
    - shard-apl:          NOTRUN -> [DMESG-WARN][16] ([i915#3457]) +2 similar issues
   [16]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20154/shard-apl7/igt@drm_read@fault-buffer.html

  * igt@gem_create@create-clear:
    - shard-glk:          [PASS][17] -> [FAIL][18] ([i915#1888] / [i915#3160])
   [17]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10102/shard-glk7/igt@gem_create@create-clear.html
   [18]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20154/shard-glk3/igt@gem_create@create-clear.html

  * igt@gem_create@create-massive:
    - shard-snb:          NOTRUN -> [DMESG-WARN][19] ([i915#3002]) +1 similar issue
   [19]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20154/shard-snb2/igt@gem_create@create-massive.html
    - shard-skl:          NOTRUN -> [DMESG-WARN][20] ([i915#3002])
   [20]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20154/shard-skl1/igt@gem_create@create-massive.html

  * igt@gem_ctx_isolation@preservation-s3@rcs0:
    - shard-kbl:          NOTRUN -> [DMESG-WARN][21] ([i915#180] / [i915#3457]) +1 similar issue
   [21]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20154/shard-kbl2/igt@gem_ctx_isolation@preservation-s3@rcs0.html

  * igt@gem_ctx_persistence@heartbeat-many:
    - shard-glk:          [PASS][22] -> [FAIL][23] ([i915#3457]) +25 similar issues
   [22]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10102/shard-glk7/igt@gem_ctx_persistence@heartbeat-many.html
   [23]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20154/shard-glk6/igt@gem_ctx_persistence@heartbeat-many.html

  * igt@gem_ctx_persistence@legacy-engines-cleanup:
    - shard-snb:          NOTRUN -> [SKIP][24] ([fdo#109271] / [i915#1099]) +2 similar issues
   [24]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20154/shard-snb7/igt@gem_ctx_persistence@legacy-engines-cleanup.html

  * igt@gem_ctx_ringsize@idle@bcs0:
    - shard-skl:          NOTRUN -> [INCOMPLETE][25] ([i915#3316] / [i915#3457])
   [25]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20154/shard-skl3/igt@gem_ctx_ringsize@idle@bcs0.html

  * igt@gem_eio@in-flight-suspend:
    - shard-kbl:          [PASS][26] -> [DMESG-WARN][27] ([i915#180] / [i915#3457]) +1 similar issue
   [26]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10102/shard-kbl4/igt@gem_eio@in-flight-suspend.html
   [27]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20154/shard-kbl7/igt@gem_eio@in-flight-suspend.html

  * igt@gem_eio@unwedge-stress:
    - shard-apl:          [PASS][28] -> [FAIL][29] ([i915#3457]) +3 similar issues
   [28]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10102/shard-apl7/igt@gem_eio@unwedge-stress.html
   [29]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20154/shard-apl7/igt@gem_eio@unwedge-stress.html

  * igt@gem_exec_fair@basic-deadline:
    - shard-kbl:          [PASS][30] -> [FAIL][31] ([i915#2846] / [i915#3457])
   [30]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10102/shard-kbl2/igt@gem_exec_fair@basic-deadline.html
   [31]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20154/shard-kbl4/igt@gem_exec_fair@basic-deadline.html

  * igt@gem_exec_fair@basic-flow@rcs0:
    - shard-skl:          NOTRUN -> [SKIP][32] ([fdo#109271] / [i915#3457]) +60 similar issues
   [32]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20154/shard-skl3/igt@gem_exec_fair@basic-flow@rcs0.html
    - shard-tglb:         [PASS][33] -> [FAIL][34] ([i915#2842] / [i915#3457])
   [33]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10102/shard-tglb7/igt@gem_exec_fair@basic-flow@rcs0.html
   [34]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20154/shard-tglb2/igt@gem_exec_fair@basic-flow@rcs0.html

  * igt@gem_exec_fair@basic-throttle@rcs0:
    - shard-apl:          NOTRUN -> [INCOMPLETE][35] ([i915#3457])
   [35]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20154/shard-apl7/igt@gem_exec_fair@basic-throttle@rcs0.html

  * igt@gem_exec_reloc@basic-wide-active@bcs0:
    - shard-skl:          NOTRUN -> [FAIL][36] ([i915#2389] / [i915#3457]) +3 similar issues
   [36]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20154/shard-skl1/igt@gem_exec_reloc@basic-wide-active@bcs0.html

  * igt@gem_exec_reloc@basic-wide-active@rcs0:
    - shard-kbl:          NOTRUN -> [FAIL][37] ([i915#2389] / [i915#3457]) +4 similar issues
   [37]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20154/shard-kbl3/igt@gem_exec_reloc@basic-wide-active@rcs0.html

  * igt@gem_gpgpu_fill:
    - shard-glk:          NOTRUN -> [FAIL][38] ([i915#3457]) +1 similar issue
   [38]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20154/shard-glk2/igt@gem_gpgpu_fill.html

  * igt@gem_mmap_gtt@cpuset-basic-small-copy-odd:
    - shard-iclb:         [PASS][39] -> [INCOMPLETE][40] ([i915#2910] / [i915#3468])
   [39]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10102/shard-iclb5/igt@gem_mmap_gtt@cpuset-basic-small-copy-odd.html
   [40]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20154/shard-iclb3/igt@gem_mmap_gtt@cpuset-basic-small-copy-odd.html

  * igt@gem_mmap_gtt@fault-concurrent-x:
    - shard-skl:          NOTRUN -> [INCOMPLETE][41] ([i915#198] / [i915#3468])
   [41]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20154/shard-skl3/igt@gem_mmap_gtt@fault-concurrent-x.html

  * igt@gem_render_copy@mixed-tiled-to-yf-tiled-ccs:
    - shard-kbl:          NOTRUN -> [INCOMPLETE][42] ([i915#3468])
   [42]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20154/shard-kbl4/igt@gem_render_copy@mixed-tiled-to-yf-tiled-ccs.html

  * igt@gem_render_copy@y-tiled-ccs-to-yf-tiled:
    - shard-glk:          [PASS][43] -> [DMESG-WARN][44] ([i915#118] / [i915#95])
   [43]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10102/shard-glk1/igt@gem_render_copy@y-tiled-ccs-to-yf-tiled.html
   [44]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20154/shard-glk1/igt@gem_render_copy@y-tiled-ccs-to-yf-tiled.html

  * igt@gem_render_copy@yf-tiled-ccs-to-y-tiled:
    - shard-skl:          NOTRUN -> [INCOMPLETE][45] ([i915#3468]) +3 similar issues
   [45]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20154/shard-skl3/igt@gem_render_copy@yf-tiled-ccs-to-y-tiled.html

  * igt@gem_render_copy@yf-tiled-ccs-to-y-tiled-ccs:
    - shard-apl:          NOTRUN -> [INCOMPLETE][46] ([i915#3468]) +2 similar issues
   [46]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20154/shard-apl7/igt@gem_render_copy@yf-tiled-ccs-to-y-tiled-ccs.html

  * igt@gem_spin_batch@legacy@default:
    - shard-apl:          NOTRUN -> [FAIL][47] ([i915#2898] / [i915#3457]) +3 similar issues
   [47]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20154/shard-apl7/igt@gem_spin_batch@legacy@default.html

  * igt@gem_userptr_blits@dmabuf-sync:
    - shard-apl:          NOTRUN -> [SKIP][48] ([fdo#109271] / [i915#3323])
   [48]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20154/shard-apl6/igt@gem_userptr_blits@dmabuf-sync.html

  * igt@gem_userptr_blits@input-checking:
    - shard-apl:          NOTRUN -> [DMESG-WARN][49] ([i915#3002])
   [49]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20154/shard-apl7/igt@gem_userptr_blits@input-checking.html

  * igt@gem_userptr_blits@vma-merge:
    - shard-skl:          NOTRUN -> [FAIL][50] ([i915#3318] / [i915#3457])
   [50]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20154/shard-skl1/igt@gem_userptr_blits@vma-merge.html

  * igt@gen9_exec_parse@allowed-all:
    - shard-glk:          NOTRUN -> [DMESG-WARN][51] ([i915#1436] / [i915#716])
   [51]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20154/shard-glk2/igt@gen9_exec_parse@allowed-all.html

  * igt@i915_hangman@error-state-capture@vcs1:
    - shard-iclb:         NOTRUN -> [DMESG-WARN][52] ([i915#3457])
   [52]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20154/shard-iclb1/igt@i915_hangman@error-state-capture@vcs1.html

  * igt@i915_pm_rpm@cursor-dpms:
    - shard-tglb:         [PASS][53] -> [DMESG-WARN][54] ([i915#2411] / [i915#3457])
   [53]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10102/shard-tglb3/igt@i915_pm_rpm@cursor-dpms.html
   [54]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20154/shard-tglb6/igt@i915_pm_rpm@cursor-dpms.html
    - shard-kbl:          [PASS][55] -> [DMESG-WARN][56] ([i915#3457])
   [55]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10102/shard-kbl4/igt@i915_pm_rpm@cursor-dpms.html
   [56]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20154/shard-kbl3/igt@i915_pm_rpm@cursor-dpms.html

  * igt@i915_pm_rps@reset:
    - shard-apl:          NOTRUN -> [DMESG-FAIL][57] ([i915#3457]) +2 similar issues
   [57]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20154/shard-apl7/igt@i915_pm_rps@reset.html
    - shard-skl:          NOTRUN -> [DMESG-WARN][58] ([i915#3457]) +8 similar issues
   [58]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20154/shard-skl3/igt@i915_pm_rps@reset.html
    - shard-snb:          NOTRUN -> [DMESG-WARN][59] ([i915#3457])
   [59]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20154/shard-snb7/igt@i915_pm_rps@reset.html

  * igt@i915_selftest@live@execlists:
    - shard-kbl:          NOTRUN -> [INCOMPLETE][60] ([i915#2782] / [i915#3462] / [i915#794])
   [60]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20154/shard-kbl2/igt@i915_selftest@live@execlists.html

  * igt@i915_selftest@live@mman:
    - shard-kbl:          NOTRUN -> [DMESG-WARN][61] ([i915#3457])
   [61]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20154/shard-kbl2/igt@i915_selftest@live@mman.html

  * igt@kms_big_joiner@basic:
    - shard-skl:          NOTRUN -> [SKIP][62] ([fdo#109271] / [i915#2705])
   [62]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20154/shard-skl1/igt@kms_big_joiner@basic.html

  * igt@kms_ccs@pipe-c-bad-rotation-90:
    - shard-skl:          NOTRUN -> [SKIP][63] ([fdo#109271] / [fdo#111304]) +3 similar issues
   [63]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20154/shard-skl3/igt@kms_ccs@pipe-c-bad-rotation-90.html

  * igt@kms_chamelium@dp-hpd-storm:
    - shard-glk:          NOTRUN -> [SKIP][64] ([fdo#109271] / [fdo#111827])
   [64]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20154/shard-glk2/igt@kms_chamelium@dp-hpd-storm.html

  * igt@kms_chamelium@hdmi-hpd-storm-disable:
    - shard-skl:          NOTRUN -> [SKIP][65] ([fdo#109271] / [fdo#111827]) +39 similar issues
   [65]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20154/shard-skl3/igt@kms_chamelium@hdmi-hpd-storm-disable.html

  * igt@kms_color_chamelium@pipe-a-ctm-0-75:
    - shard-kbl:          NOTRUN -> [SKIP][66] ([fdo#109271] / [fdo#111827]) +4 similar issues
   [66]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20154/shard-kbl2/igt@kms_color_chamelium@pipe-a-ctm-0-75.html

  * igt@kms_color_chamelium@pipe-c-ctm-0-25:
    - shard-apl:          NOTRUN -> [SKIP][67] ([fdo#109271] / [fdo#111827]) +14 similar issues
   [67]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20154/shard-apl6/igt@kms_color_chamelium@pipe-c-ctm-0-25.html

  * igt@kms_color_chamelium@pipe-c-ctm-red-to-blue:
    - shard-snb:          NOTRUN -> [SKIP][68] ([fdo#109271] / [fdo#111827]) +8 similar issues
   [68]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20154/shard-snb7/igt@kms_color_chamelium@pipe-c-ctm-red-to-blue.html

  * igt@kms_content_protection@atomic-dpms:
    - shard-apl:          NOTRUN -> [TIMEOUT][69] ([i915#1319])
   [69]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20154/shard-apl7/igt@kms_content_protection@atomic-dpms.html

  * igt@kms_content_protection@uevent:
    - shard-apl:          NOTRUN -> [FAIL][70] ([i915#2105])
   [70]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20154/shard-apl7/igt@kms_content_protection@uevent.html

  * igt@kms_cursor_crc@pipe-a-cursor-256x256-offscreen:
    - shard-tglb:         [PASS][71] -> [FAIL][72] ([i915#2124] / [i915#3457])
   [71]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10102/shard-tglb2/igt@kms_cursor_crc@pipe-a-cursor-256x256-offscreen.html
   [72]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20154/shard-tglb2/igt@kms_cursor_crc@pipe-a-cursor-256x256-offscreen.html

  * igt@kms_cursor_crc@pipe-a-cursor-256x256-random:
    - shard-snb:          NOTRUN -> [FAIL][73] ([i915#3457]) +3 similar issues
   [73]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20154/shard-snb2/igt@kms_cursor_crc@pipe-a-cursor-256x256-random.html

  * igt@kms_cursor_crc@pipe-a-cursor-256x85-onscreen:
    - shard-skl:          NOTRUN -> [FAIL][74] ([i915#3444] / [i915#3457]) +35 similar issues
   [74]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20154/shard-skl3/igt@kms_cursor_crc@pipe-a-cursor-256x85-onscreen.html

  * igt@kms_cursor_crc@pipe-a-cursor-alpha-opaque:
    - shard-apl:          NOTRUN -> [FAIL][75] ([i915#3444] / [i915#3457]) +1 similar issue
   [75]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20154/shard-apl7/igt@kms_cursor_crc@pipe-a-cursor-alpha-opaque.html

  * igt@kms_cursor_crc@pipe-a-cursor-max-size-sliding:
    - shard-kbl:          NOTRUN -> [SKIP][76] ([fdo#109271] / [i915#3457]) +2 similar issues
   [76]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20154/shard-kbl2/igt@kms_cursor_crc@pipe-a-cursor-max-size-sliding.html

  * igt@kms_cursor_crc@pipe-b-cursor-128x128-onscreen:
    - shard-glk:          NOTRUN -> [FAIL][77] ([i915#3444] / [i915#3457]) +1 similar issue
   [77]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20154/shard-glk2/igt@kms_cursor_crc@pipe-b-cursor-128x128-onscreen.html

  * igt@kms_cursor_crc@pipe-b-cursor-512x170-rapid-movement:
    - shard-glk:          NOTRUN -> [SKIP][78] ([fdo#109271] / [i915#3457])
   [78]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20154/shard-glk2/igt@kms_cursor_crc@pipe-b-cursor-512x170-rapid-movement.html

  * igt@kms_cursor_crc@pipe-b-cursor-alpha-opaque:
    - shard-glk:          [PASS][79] -> [FAIL][80] ([i915#3444] / [i915#3457]) +5 similar issues
   [79]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10102/shard-glk7/igt@kms_cursor_crc@pipe-b-cursor-alpha-opaque.html
   [80]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20154/shard-glk6/igt@kms_cursor_crc@pipe-b-cursor-alpha-opaque.html

  * igt@kms_cursor_crc@pipe-c-cursor-128x42-onscreen:
    - shard-kbl:          NOTRUN -> [FAIL][81] ([i915#3444] / [i915#3457])
   [81]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20154/shard-kbl4/igt@kms_cursor_crc@pipe-c-cursor-128x42-onscreen.html

  * igt@kms_cursor_crc@pipe-c-cursor-512x512-onscreen:
    - shard-snb:          NOTRUN -> [SKIP][82] ([fdo#109271] / [i915#3457]) +20 similar issues
   [82]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20154/shard-snb5/igt@kms_cursor_crc@pipe-c-cursor-512x512-onscreen.html

  * igt@kms_cursor_crc@pipe-c-cursor-64x21-offscreen:
    - shard-kbl:          [PASS][83] -> [FAIL][84] ([i915#3444] / [i915#3457])
   [83]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10102/shard-kbl2/igt@kms_cursor_crc@pipe-c-cursor-64x21-offscreen.html
   [84]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20154/shard-kbl7/igt@kms_cursor_crc@pipe-c-cursor-64x21-offscreen.html

  * igt@kms_cursor_crc@pipe-d-cursor-512x512-random:
    - shard-apl:          NOTRUN -> [SKIP][85] ([fdo#109271] / [i915#3457]) +20 similar issues
   [85]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20154/shard-apl7/igt@kms_cursor_crc@pipe-d-cursor-512x512-random.html

  * igt@kms_cursor_edge_walk@pipe-c-256x256-left-edge:
    - shard-glk:          [PASS][86] -> [FAIL][87] ([i915#70]) +2 similar issues
   [86]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10102/shard-glk8/igt@kms_cursor_edge_walk@pipe-c-256x256-left-edge.html
   [87]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20154/shard-glk7/igt@kms_cursor_edge_walk@pipe-c-256x256-left-edge.html

  * igt@kms_cursor_legacy@basic-busy-flip-before-cursor-atomic:
    - shard-apl:          NOTRUN -> [FAIL][88] ([i915#3457]) +8 similar issues
   [88]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20154/shard-apl6/igt@kms_cursor_legacy@basic-busy-flip-before-cursor-atomic.html

  * igt@kms_cursor_legacy@flip-vs-cursor-atomic-transitions-varying-size:
    - shard-skl:          NOTRUN -> [FAIL][89] ([i915#2346] / [i915#533])
   [89]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20154/shard-skl3/igt@kms_cursor_legacy@flip-vs-cursor-atomic-transitions-varying-size.html

  * igt@kms_cursor_legacy@pipe-d-torture-bo:
    - shard-apl:          NOTRUN -> [SKIP][90] ([fdo#109271] / [i915#533]) +2 similar issues
   [90]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20154/shard-apl7/igt@kms_cursor_legacy@pipe-d-torture-bo.html

  * igt@kms_dp_tiled_display@basic-test-pattern-with-chamelium:
    - shard-skl:          NOTRUN -> [SKIP][91] ([fdo#109271] / [i915#2065])
   [91]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20154/shard-skl3/igt@kms_dp_tiled_display@basic-test-pattern-with-chamelium.html

  * igt@kms_flip_scaled_crc@flip-32bpp-ytile-to-64bpp-ytile:
    - shard-apl:          NOTRUN -> [SKIP][92] ([fdo#109271] / [i915#2642])
   [92]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20154/shard-apl7/igt@kms_flip_scaled_crc@flip-32bpp-ytile-to-64bpp-ytile.html
    - shard-skl:          NOTRUN -> [SKIP][93] ([fdo#109271] / [i915#2642])
   [93]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20154/shard-skl3/igt@kms_flip_scaled_crc@flip-32bpp-ytile-to-64bpp-ytile.html

  * igt@kms_flip_scaled_crc@flip-64bpp-ytile-to-32bpp-ytilercccs:
    - shard-skl:          NOTRUN -> [SKIP][94] ([fdo#109271] / [i915#2672])
   [94]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20154/shard-skl3/igt@kms_flip_scaled_crc@flip-64bpp-ytile-to-32bpp-ytilercccs.html
    - shard-apl:          NOTRUN -> [SKIP][95] ([fdo#109271] / [i915#2672])
   [95]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20154/shard-apl7/igt@kms_flip_scaled_crc@flip-64bpp-ytile-to-32bpp-ytilercccs.html

  * igt@kms_flip_tiling@flip-changes-tiling-yf@edp-1-pipe-c:
    - shard-skl:          NOTRUN -> [FAIL][96] ([i915#699])
   [96]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20154/shard-skl1/igt@kms_flip_tiling@flip-changes-tiling-yf@edp-1-pipe-c.html

  * igt@kms_frontbuffer_tracking@fbcpsr-1p-primscrn-indfb-msflip-blt:
    - shard-skl:          NOTRUN -> [SKIP][97] ([fdo#109271]) +322 similar issues
   [97]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20154/shard-skl3/igt@kms_frontbuffer_tracking@fbcpsr-1p-primscrn-indfb-msflip-blt.html

  * igt@kms_frontbuffer_tracking@psr-1p-primscrn-pri-indfb-draw-mmap-wc:
    - shard-glk:          NOTRUN -> [SKIP][98] ([fdo#109271]) +6 similar issues
   [98]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20154/shard-glk2/igt@kms_frontbuffer_tracking@psr-1p-primscrn-pri-indfb-draw-mmap-wc.html

  * igt@kms_frontbuffer_tracking@psr-1p-primscrn-spr-indfb-onoff:
    - shard-skl:          [PASS][99] -> [FAIL][100] ([i915#49]) +1 similar issue
   [99]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10102/shard-skl1/igt@kms_frontbuffer_tracking@psr-1p-primscrn-spr-indfb-onoff.html
   [100]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20154/shard-skl1/igt@kms_frontbuffer_tracking@psr-1p-primscrn-spr-indfb-onoff.html

  * igt@kms_frontbuffer_tracking@psr-2p-pri-indfb-multidraw:
    - shard-apl:          NOTRUN -> [SKIP][101] ([fdo#109271]) +148 similar issues
   [101]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20154/shard-apl7/igt@kms_frontbuffer_tracking@psr-2p-pri-indfb-multidraw.html

  * igt@kms_hdr@bpc-switch:
    - shard-skl:          NOTRUN -> [FAIL][102] ([i915#1188])
   [102]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20154/shard-skl3/igt@kms_hdr@bpc-switch.html

  * igt@kms_hdr@bpc-switch-suspend:
    - shard-kbl:          [PASS][103] -> [DMESG-WARN][104] ([i915#180]) +2 similar issues
   [103]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10102/shard-kbl7/igt@kms_hdr@bpc-switch-suspend.html
   [104]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20154/shard-kbl7/igt@kms_hdr@bpc-switch-suspend.html

  * igt@kms_pipe_crc_basic@hang-read-crc-pipe-d:
    - shard-skl:          NOTRUN -> [SKIP][105] ([fdo#109271] / [i915#533]) +1 similar issue
   [105]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20154/shard-skl3/igt@kms_pipe_crc_basic@hang-read-crc-pipe-d.html

  * igt@kms_pipe_crc_basic@nonblocking-crc-pipe-b:
    - shard-glk:          [PASS][106] -> [FAIL][107] ([i915#53])
   [106]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10102/shard-glk4/igt@kms_pipe_crc_basic@nonblocking-crc-pipe-b.html
   [107]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20154/shard-glk4/igt@kms_pipe_crc_basic@nonblocking-crc-pipe-b.html

  * igt@kms_pipe_crc_basic@suspend-read-crc-pipe-a:
    - shard-apl:          NOTRUN -> [DMESG-WARN][108] ([i915#180])
   [108]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20154/shard-apl6/igt@kms_pipe_crc_basic@suspend-read-crc-pipe-a.html

  * igt@kms_plane_alpha_blend@pipe-a-alpha-7efc:
    - shard-skl:          NOTRUN -> [FAIL][109] ([fdo#108145] / [i915#265]) +6 similar issues
   [109]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20154/shard-skl1/igt@kms_plane_alpha_blend@pipe-a-alpha-7efc.html

  * igt@kms_plane_alpha_blend@pipe-a-alpha-basic:
    - shard-kbl:          NOTRUN -> [FAIL][110] ([fdo#108145] / [i915#265])
   [110]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20154/shard-kbl3/igt@kms_plane_alpha_blend@pipe-a-alpha-basic.html

  * igt@kms_plane_alpha_blend@pipe-a-coverage-7efc:
    - shard-skl:          [PASS][111] -> [FAIL][112] ([fdo#108145] / [i915#265])
   [111]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10102/shard-skl1/igt@kms_plane_alpha_blend@pipe-a-coverage-7efc.html
   [112]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20154/shard-skl1/igt@kms_plane_alpha_blend@pipe-a-coverage-7efc.html

  * igt@kms_plane_alpha_blend@pipe-c-constant-alpha-max:
    - shard-glk:          NOTRUN -> [FAIL][113] ([fdo#108145] / [i915#265])
   [113]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20154/shard-glk2/igt@kms_plane_alpha_blend@pipe-c-constant-alpha-max.html
    - shard-apl:          NOTRUN -> [FAIL][114] ([fdo#108145] / [i915#265])
   [114]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20154/shard-apl7/igt@kms_plane_alpha_blend@pipe-c-constant-alpha-max.html

  * igt@kms_plane_cursor@pipe-a-overlay-size-256:
    - shard-snb:          NOTRUN -> [FAIL][115] ([i915#2657]) +2 similar issues
   [115]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20154/shard-snb7/igt@kms_plane_cursor@pipe-a-overlay-size-256.html

  * igt@kms_plane_cursor@pipe-a-viewport-size-256:
    - shard-skl:          NOTRUN -> [FAIL][116] ([i915#2657]) +6 similar issues
   [116]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20154/shard-skl3/igt@kms_plane_cursor@pipe-a-viewport-size-256.html

  * igt@kms_plane_cursor@pipe-b-primary-size-128:
    - shard-skl:          [PASS][117] -> [FAIL][118] ([i915#2657])
   [117]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10102/shard-skl1/igt@kms_plane_cursor@pipe-b-primary-size-128.html
   [118]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20154/shard-skl1/igt@kms_plane_cursor@pipe-b-primary-size-128.html

  * igt@kms_plane_cursor@pipe-b-primary-size-256:
    - shard-glk:          [PASS][119] -> [FAIL][120] ([i915#2657]) +1 similar issue
   [119]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10102/shard-glk7/igt@kms_plane_cursor@pipe-b-primary-size-256.html
   [120]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20154/shard-glk3/igt@kms_plane_cursor@pipe-b-primary-size-256.html

  * igt@kms_plane_cursor@pipe-b-viewport-size-64:
    - shard-skl:          NOTRUN -> [FAIL][121] ([i915#2657] / [i915#3457]) +3 similar issues
   [121]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20154/shard-skl3/igt@kms_plane_cursor@pipe-b-viewport-size-64.html

  * igt@kms_plane_cursor@pipe-c-primary-size-64:
    - shard-kbl:          NOTRUN -> [FAIL][122] ([i915#2657] / [i915#3457])
   [122]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20154/shard-kbl2/igt@kms_plane_cursor@pipe-c-primary-size-64.html

  * igt@kms_plane_cursor@pipe-c-viewport-size-128:
    - shard-snb:          NOTRUN -> [SKIP][123] ([fdo#109271]) +144 similar issues
   [123]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20154/shard-snb7/igt@kms_plane_cursor@pipe-c-viewport-size-128.html
    - shard-glk:          NOTRUN -> [FAIL][124] ([i915#2657] / [i915#3457]) +1 similar issue
   [124]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20154/shard-glk2/igt@kms_plane_cursor@pipe-c-viewport-size-128.html

  * igt@kms_psr2_sf@overlay-primary-update-sf-dmg-area-1:
    - shard-apl:          NOTRUN -> [SKIP][125] ([fdo#109271] / [i915#658]) +1 similar issue
   [125]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20154/shard-apl7/igt@kms_psr2_sf@overlay-primary-update-sf-dmg-area-1.html
    - shard-glk:          NOTRUN -> [SKIP][126] ([fdo#109271] / [i915#658])
   [126]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20154/shard-glk2/igt@kms_psr2_sf@overlay-primary-update-sf-dmg-area-1.html

  * igt@kms_psr2_sf@overlay-primary-update-sf-dmg-area-2:
    - shard-skl:          NOTRUN -> [SKIP][127] ([fdo#109271] / [i915#658]) +6 similar issues
   [127]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20154/shard-skl1/igt@kms_psr2_sf@overlay-primary-update-sf-dmg-area-2.html

  * igt@kms_sysfs_edid_timing:
    - shard-apl:          NOTRUN -> [FAIL][128] ([IGT#2])
   [128]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20154/shard-apl7/igt@kms_sysfs_edid_timing.html
    - shard-skl:          NOTRUN -> [FAIL][129] ([IGT#2])
   [129]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20154/shard-skl3/igt@kms_sysfs_edid_timing.html

  * igt@kms_writeback@writeback-fb-id:
    - shard-skl:          NOTRUN -> [SKIP][130] ([fdo#109271] / [i915#2437])
   [130]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20154/shard-skl3/igt@kms_writeback@writeback-fb-id.html

  * igt@perf@gen12-mi-rpc:
    - shard-kbl:          NOTRUN -> [SKIP][131] ([fdo#109271]) +28 similar issues
   [131]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20154/shard-kbl4/igt@perf@gen12-mi-rpc.html

  * igt@perf@polling-small-buf:
    - shard-skl:          NOTRUN -> [FAIL][132] ([i915#1722])
   [132]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20154/shard-skl3/igt@perf@polling-small-buf.html

  * igt@perf_pmu@most-busy-idle-check-all@rcs0:
    - shard-apl:          NOTRUN -> [WARN][133] ([i915#3457]) +3 similar is

== Logs ==

For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20154/index.html

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

Thread overview: 25+ messages
2021-05-19  7:43 [PATCH 1/2] drm/i915/cmdparser: No-op failed batches on all platforms Daniel Vetter
2021-05-19  7:43 ` [Intel-gfx] " Daniel Vetter
2021-05-19  7:43 ` Daniel Vetter
2021-05-19  7:43 ` [PATCH 2/2] Revert "drm/i915: Propagate errors on awaiting already signaled fences" Daniel Vetter
2021-05-19  7:43   ` [Intel-gfx] " Daniel Vetter
2021-05-19  7:43   ` Daniel Vetter
2021-05-19 10:15   ` [PATCH] " Daniel Vetter
2021-05-19 10:15     ` [Intel-gfx] " Daniel Vetter
2021-05-19 10:15     ` Daniel Vetter
2021-05-19 15:06     ` Jason Ekstrand
2021-05-19 15:06       ` [Intel-gfx] " Jason Ekstrand
2021-05-19 15:06       ` Jason Ekstrand
2021-05-19 17:16       ` Daniel Vetter
2021-05-19 17:16         ` [Intel-gfx] " Daniel Vetter
2021-05-19 17:16         ` Daniel Vetter
2021-05-19 19:01         ` Jason Ekstrand
2021-05-19 19:01           ` [Intel-gfx] " Jason Ekstrand
2021-05-19 10:04 ` [Intel-gfx] ✗ Fi.CI.CHECKPATCH: warning for series starting with [1/2] drm/i915/cmdparser: No-op failed batches on all platforms Patchwork
2021-05-19 10:34 ` [Intel-gfx] ✓ Fi.CI.BAT: success " Patchwork
2021-05-19 11:09 ` [Intel-gfx] ✗ Fi.CI.CHECKPATCH: warning for series starting with [1/2] drm/i915/cmdparser: No-op failed batches on all platforms (rev2) Patchwork
2021-05-19 11:40 ` [Intel-gfx] ✓ Fi.CI.BAT: success " Patchwork
2021-05-19 15:04 ` [PATCH 1/2] drm/i915/cmdparser: No-op failed batches on all platforms Jason Ekstrand
2021-05-19 15:04   ` [Intel-gfx] " Jason Ekstrand
2021-05-19 15:04   ` Jason Ekstrand
2021-05-21  1:03 ` [Intel-gfx] ✗ Fi.CI.IGT: failure for series starting with [1/2] drm/i915/cmdparser: No-op failed batches on all platforms (rev2) Patchwork
