All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 0/3] drm/i915: Fix timeout handling when retiring requests
@ 2022-11-16 11:25 ` Janusz Krzysztofik
  0 siblings, 0 replies; 22+ messages in thread
From: Janusz Krzysztofik @ 2022-11-16 11:25 UTC (permalink / raw)
  To: Tvrtko Ursulin, Joonas Lahtinen
  Cc: Matthew Brost, intel-gfx, Chris Wilson, dri-devel,
	Janusz Krzysztofik, John Harrison

Fixes for issues discovered via code review while working on
https://gitlab.freedesktop.org/drm/intel/issues/7349.

Reworked version of a series supposed to fix the same issues, sent before
under the same subject.  Since some solutions are significantly different
than before, I'm not marking this series and individual patches as v2.

Janusz Krzysztofik (3):
  drm/i915: Fix negative remaining time after retire requests
  drm/i915: Never return 0 on timeout when retiring requests
  drm/i915: Never return 0 if request wait succeeds

 drivers/gpu/drm/i915/gt/intel_gt_requests.c | 26 ++++++++++++++++++---
 drivers/gpu/drm/i915/i915_request.c         |  2 ++
 2 files changed, 25 insertions(+), 3 deletions(-)

-- 
2.25.1


^ permalink raw reply	[flat|nested] 22+ messages in thread

* [Intel-gfx] [PATCH 0/3] drm/i915: Fix timeout handling when retiring requests
@ 2022-11-16 11:25 ` Janusz Krzysztofik
  0 siblings, 0 replies; 22+ messages in thread
From: Janusz Krzysztofik @ 2022-11-16 11:25 UTC (permalink / raw)
  To: Tvrtko Ursulin, Joonas Lahtinen
  Cc: intel-gfx, Chris Wilson, dri-devel, Daniel Vetter

Fixes for issues discovered via code review while working on
https://gitlab.freedesktop.org/drm/intel/issues/7349.

Reworked version of a series supposed to fix the same issues, sent before
under the same subject.  Since some solutions are significantly different
than before, I'm not marking this series and individual patches as v2.

Janusz Krzysztofik (3):
  drm/i915: Fix negative remaining time after retire requests
  drm/i915: Never return 0 on timeout when retiring requests
  drm/i915: Never return 0 if request wait succeeds

 drivers/gpu/drm/i915/gt/intel_gt_requests.c | 26 ++++++++++++++++++---
 drivers/gpu/drm/i915/i915_request.c         |  2 ++
 2 files changed, 25 insertions(+), 3 deletions(-)

-- 
2.25.1


^ permalink raw reply	[flat|nested] 22+ messages in thread

* [PATCH 1/3] drm/i915: Fix negative remaining time after retire requests
  2022-11-16 11:25 ` [Intel-gfx] " Janusz Krzysztofik
@ 2022-11-16 11:25   ` Janusz Krzysztofik
  -1 siblings, 0 replies; 22+ messages in thread
From: Janusz Krzysztofik @ 2022-11-16 11:25 UTC (permalink / raw)
  To: Tvrtko Ursulin, Joonas Lahtinen
  Cc: Matthew Brost, intel-gfx, Chris Wilson, dri-devel,
	Janusz Krzysztofik, John Harrison

Commit b97060a99b01 ("drm/i915/guc: Update intel_gt_wait_for_idle to work
with GuC") extended the API of intel_gt_retire_requests_timeout() with an
extra argument 'remaining_timeout', intended for passing back unconsumed
portion of requested timeout when 0 (success) is returned.  However, when
request retirement happens to succeed despite an error returned by
dma_fence_wait_timeout(), the error code (a negative value) is passed back
instead of remaining time.  If a user then passes that negative value
forward as requested timeout to another wait, an explicit WARN or BUG can
be triggered.

Instead of copying the value of timeout variable to *remaining_timeout
before return, update the *remaining_timeout after each DMA fence wait.
Set it to 0 on -ETIME, -EINTR or -ERESTARTSYS, and assume no time has been
consumed on other errors returned from the wait.

Fixes: b97060a99b01 ("drm/i915/guc: Update intel_gt_wait_for_idle to work with GuC")
Signed-off-by: Janusz Krzysztofik <janusz.krzysztofik@linux.intel.com>
Cc: stable@vger.kernel.org # v5.15+
---
 drivers/gpu/drm/i915/gt/intel_gt_requests.c | 23 ++++++++++++++++++---
 1 file changed, 20 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_gt_requests.c b/drivers/gpu/drm/i915/gt/intel_gt_requests.c
index edb881d756309..ccaf2fd80625b 100644
--- a/drivers/gpu/drm/i915/gt/intel_gt_requests.c
+++ b/drivers/gpu/drm/i915/gt/intel_gt_requests.c
@@ -138,6 +138,9 @@ long intel_gt_retire_requests_timeout(struct intel_gt *gt, long timeout,
 	unsigned long active_count = 0;
 	LIST_HEAD(free);
 
+	if (remaining_timeout)
+		*remaining_timeout = timeout;
+
 	flush_submission(gt, timeout); /* kick the ksoftirqd tasklets */
 	spin_lock(&timelines->lock);
 	list_for_each_entry_safe(tl, tn, &timelines->active_list, link) {
@@ -163,6 +166,23 @@ long intel_gt_retire_requests_timeout(struct intel_gt *gt, long timeout,
 								 timeout);
 				dma_fence_put(fence);
 
+				if (remaining_timeout) {
+					/*
+					 * If we get an error here but request
+					 * retirement succeeds anyway
+					 * (!active_count) and we return 0, the
+					 * caller may want to spend remaining
+					 * time on waiting for other events.
+					 */
+					if (timeout == -ETIME ||
+					    timeout == -EINTR ||
+					    timeout == -ERESTARTSYS)
+						*remaining_timeout = 0;
+					else if (timeout >= 0)
+						*remaining_timeout = timeout;
+					/* else assume no time consumed */
+				}
+
 				/* Retirement is best effort */
 				if (!mutex_trylock(&tl->mutex)) {
 					active_count++;
@@ -196,9 +216,6 @@ out_active:	spin_lock(&timelines->lock);
 	if (flush_submission(gt, timeout)) /* Wait, there's more! */
 		active_count++;
 
-	if (remaining_timeout)
-		*remaining_timeout = timeout;
-
 	return active_count ? timeout : 0;
 }
 
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [Intel-gfx] [PATCH 1/3] drm/i915: Fix negative remaining time after retire requests
@ 2022-11-16 11:25   ` Janusz Krzysztofik
  0 siblings, 0 replies; 22+ messages in thread
From: Janusz Krzysztofik @ 2022-11-16 11:25 UTC (permalink / raw)
  To: Tvrtko Ursulin, Joonas Lahtinen
  Cc: intel-gfx, Chris Wilson, dri-devel, Daniel Vetter

Commit b97060a99b01 ("drm/i915/guc: Update intel_gt_wait_for_idle to work
with GuC") extended the API of intel_gt_retire_requests_timeout() with an
extra argument 'remaining_timeout', intended for passing back unconsumed
portion of requested timeout when 0 (success) is returned.  However, when
request retirement happens to succeed despite an error returned by
dma_fence_wait_timeout(), the error code (a negative value) is passed back
instead of remaining time.  If a user then passes that negative value
forward as requested timeout to another wait, an explicit WARN or BUG can
be triggered.

Instead of copying the value of timeout variable to *remaining_timeout
before return, update the *remaining_timeout after each DMA fence wait.
Set it to 0 on -ETIME, -EINTR or -ERESTARTSYS, and assume no time has been
consumed on other errors returned from the wait.

Fixes: b97060a99b01 ("drm/i915/guc: Update intel_gt_wait_for_idle to work with GuC")
Signed-off-by: Janusz Krzysztofik <janusz.krzysztofik@linux.intel.com>
Cc: stable@vger.kernel.org # v5.15+
---
 drivers/gpu/drm/i915/gt/intel_gt_requests.c | 23 ++++++++++++++++++---
 1 file changed, 20 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_gt_requests.c b/drivers/gpu/drm/i915/gt/intel_gt_requests.c
index edb881d756309..ccaf2fd80625b 100644
--- a/drivers/gpu/drm/i915/gt/intel_gt_requests.c
+++ b/drivers/gpu/drm/i915/gt/intel_gt_requests.c
@@ -138,6 +138,9 @@ long intel_gt_retire_requests_timeout(struct intel_gt *gt, long timeout,
 	unsigned long active_count = 0;
 	LIST_HEAD(free);
 
+	if (remaining_timeout)
+		*remaining_timeout = timeout;
+
 	flush_submission(gt, timeout); /* kick the ksoftirqd tasklets */
 	spin_lock(&timelines->lock);
 	list_for_each_entry_safe(tl, tn, &timelines->active_list, link) {
@@ -163,6 +166,23 @@ long intel_gt_retire_requests_timeout(struct intel_gt *gt, long timeout,
 								 timeout);
 				dma_fence_put(fence);
 
+				if (remaining_timeout) {
+					/*
+					 * If we get an error here but request
+					 * retirement succeeds anyway
+					 * (!active_count) and we return 0, the
+					 * caller may want to spend remaining
+					 * time on waiting for other events.
+					 */
+					if (timeout == -ETIME ||
+					    timeout == -EINTR ||
+					    timeout == -ERESTARTSYS)
+						*remaining_timeout = 0;
+					else if (timeout >= 0)
+						*remaining_timeout = timeout;
+					/* else assume no time consumed */
+				}
+
 				/* Retirement is best effort */
 				if (!mutex_trylock(&tl->mutex)) {
 					active_count++;
@@ -196,9 +216,6 @@ out_active:	spin_lock(&timelines->lock);
 	if (flush_submission(gt, timeout)) /* Wait, there's more! */
 		active_count++;
 
-	if (remaining_timeout)
-		*remaining_timeout = timeout;
-
 	return active_count ? timeout : 0;
 }
 
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [PATCH 2/3] drm/i915: Never return 0 on timeout when retiring requests
  2022-11-16 11:25 ` [Intel-gfx] " Janusz Krzysztofik
@ 2022-11-16 11:25   ` Janusz Krzysztofik
  -1 siblings, 0 replies; 22+ messages in thread
From: Janusz Krzysztofik @ 2022-11-16 11:25 UTC (permalink / raw)
  To: Tvrtko Ursulin, Joonas Lahtinen
  Cc: Matthew Brost, intel-gfx, Chris Wilson, dri-devel,
	Janusz Krzysztofik, John Harrison

Users of intel_gt_retire_requests_timeout() expect 0 return value on
success.  However, we have no protection from passing back 0 potentially
returned by dma_fence_wait_timeout() on timeout.

Replace 0 with -ETIME before using timeout as return value.

Fixes: f33a8a51602c ("drm/i915: Merge wait_for_timelines with retire_request")
Signed-off-by: Janusz Krzysztofik <janusz.krzysztofik@linux.intel.com>
Cc: stable@vger.kernel.org # v5.5+
---
 drivers/gpu/drm/i915/gt/intel_gt_requests.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/drivers/gpu/drm/i915/gt/intel_gt_requests.c b/drivers/gpu/drm/i915/gt/intel_gt_requests.c
index ccaf2fd80625b..ac6b2b1861397 100644
--- a/drivers/gpu/drm/i915/gt/intel_gt_requests.c
+++ b/drivers/gpu/drm/i915/gt/intel_gt_requests.c
@@ -213,6 +213,9 @@ out_active:	spin_lock(&timelines->lock);
 	list_for_each_entry_safe(tl, tn, &free, link)
 		__intel_timeline_free(&tl->kref);
 
+	if (!timeout)
+		timeout = -ETIME;
+
 	if (flush_submission(gt, timeout)) /* Wait, there's more! */
 		active_count++;
 
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [Intel-gfx] [PATCH 2/3] drm/i915: Never return 0 on timeout when retiring requests
@ 2022-11-16 11:25   ` Janusz Krzysztofik
  0 siblings, 0 replies; 22+ messages in thread
From: Janusz Krzysztofik @ 2022-11-16 11:25 UTC (permalink / raw)
  To: Tvrtko Ursulin, Joonas Lahtinen
  Cc: intel-gfx, Chris Wilson, dri-devel, Daniel Vetter

Users of intel_gt_retire_requests_timeout() expect 0 return value on
success.  However, we have no protection from passing back 0 potentially
returned by dma_fence_wait_timeout() on timeout.

Replace 0 with -ETIME before using timeout as return value.

Fixes: f33a8a51602c ("drm/i915: Merge wait_for_timelines with retire_request")
Signed-off-by: Janusz Krzysztofik <janusz.krzysztofik@linux.intel.com>
Cc: stable@vger.kernel.org # v5.5+
---
 drivers/gpu/drm/i915/gt/intel_gt_requests.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/drivers/gpu/drm/i915/gt/intel_gt_requests.c b/drivers/gpu/drm/i915/gt/intel_gt_requests.c
index ccaf2fd80625b..ac6b2b1861397 100644
--- a/drivers/gpu/drm/i915/gt/intel_gt_requests.c
+++ b/drivers/gpu/drm/i915/gt/intel_gt_requests.c
@@ -213,6 +213,9 @@ out_active:	spin_lock(&timelines->lock);
 	list_for_each_entry_safe(tl, tn, &free, link)
 		__intel_timeline_free(&tl->kref);
 
+	if (!timeout)
+		timeout = -ETIME;
+
 	if (flush_submission(gt, timeout)) /* Wait, there's more! */
 		active_count++;
 
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [PATCH 3/3] drm/i915: Never return 0 if request wait succeeds
  2022-11-16 11:25 ` [Intel-gfx] " Janusz Krzysztofik
@ 2022-11-16 11:25   ` Janusz Krzysztofik
  -1 siblings, 0 replies; 22+ messages in thread
From: Janusz Krzysztofik @ 2022-11-16 11:25 UTC (permalink / raw)
  To: Tvrtko Ursulin, Joonas Lahtinen
  Cc: Matthew Brost, intel-gfx, Chris Wilson, dri-devel,
	Janusz Krzysztofik, John Harrison

According to the docs of i915_request_wait_timeout(), its return value
"may be zero if the request is unfinished after the timeout expires."
However, 0 is also returned when the request is found finished right
after the timeout has expired.

Since the docs also state: "If the timeout is 0, it will return 1 if the
fence is signaled.", return 1 also when the fence is found signaled after
non-zero timeout has expired.

Fixes: 7e2e69ed4678 ("drm/i915: Fix i915_request fence wait semantics")
Signed-off-by: Janusz Krzysztofik <janusz.krzysztofik@linux.intel.com>
Cc: stable@vger.kernel.org # v5.17
---
 drivers/gpu/drm/i915/i915_request.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/gpu/drm/i915/i915_request.c b/drivers/gpu/drm/i915/i915_request.c
index f949a9495758a..406ddfafbed4d 100644
--- a/drivers/gpu/drm/i915/i915_request.c
+++ b/drivers/gpu/drm/i915/i915_request.c
@@ -2079,6 +2079,8 @@ long i915_request_wait_timeout(struct i915_request *rq,
 
 		timeout = io_schedule_timeout(timeout);
 	}
+	if (!timeout)	/* expired but signaled, we shouldn't return 0 */
+		timeout = 1;
 	__set_current_state(TASK_RUNNING);
 
 	if (READ_ONCE(wait.tsk))
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [Intel-gfx] [PATCH 3/3] drm/i915: Never return 0 if request wait succeeds
@ 2022-11-16 11:25   ` Janusz Krzysztofik
  0 siblings, 0 replies; 22+ messages in thread
From: Janusz Krzysztofik @ 2022-11-16 11:25 UTC (permalink / raw)
  To: Tvrtko Ursulin, Joonas Lahtinen
  Cc: intel-gfx, Chris Wilson, dri-devel, Daniel Vetter

According to the docs of i915_request_wait_timeout(), its return value
"may be zero if the request is unfinished after the timeout expires."
However, 0 is also returned when the request is found finished right
after the timeout has expired.

Since the docs also state: "If the timeout is 0, it will return 1 if the
fence is signaled.", return 1 also when the fence is found signaled after
non-zero timeout has expired.

Fixes: 7e2e69ed4678 ("drm/i915: Fix i915_request fence wait semantics")
Signed-off-by: Janusz Krzysztofik <janusz.krzysztofik@linux.intel.com>
Cc: stable@vger.kernel.org # v5.17
---
 drivers/gpu/drm/i915/i915_request.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/gpu/drm/i915/i915_request.c b/drivers/gpu/drm/i915/i915_request.c
index f949a9495758a..406ddfafbed4d 100644
--- a/drivers/gpu/drm/i915/i915_request.c
+++ b/drivers/gpu/drm/i915/i915_request.c
@@ -2079,6 +2079,8 @@ long i915_request_wait_timeout(struct i915_request *rq,
 
 		timeout = io_schedule_timeout(timeout);
 	}
+	if (!timeout)	/* expired but signaled, we shouldn't return 0 */
+		timeout = 1;
 	__set_current_state(TASK_RUNNING);
 
 	if (READ_ONCE(wait.tsk))
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 22+ messages in thread

* Re: [Intel-gfx] [PATCH 1/3] drm/i915: Fix negative remaining time after retire requests
  2022-11-16 11:25   ` [Intel-gfx] " Janusz Krzysztofik
@ 2022-11-16 13:13     ` Andrzej Hajda
  -1 siblings, 0 replies; 22+ messages in thread
From: Andrzej Hajda @ 2022-11-16 13:13 UTC (permalink / raw)
  To: Janusz Krzysztofik, Tvrtko Ursulin, Joonas Lahtinen
  Cc: intel-gfx, dri-devel, Chris Wilson

On 16.11.2022 12:25, Janusz Krzysztofik wrote:
> Commit b97060a99b01 ("drm/i915/guc: Update intel_gt_wait_for_idle to work
> with GuC") extended the API of intel_gt_retire_requests_timeout() with an
> extra argument 'remaining_timeout', intended for passing back unconsumed
> portion of requested timeout when 0 (success) is returned.  However, when
> request retirement happens to succeed despite an error returned by
> dma_fence_wait_timeout(), the error code (a negative value) is passed back
> instead of remaining time.  If a user then passes that negative value
> forward as requested timeout to another wait, an explicit WARN or BUG can
> be triggered.
> 
> Instead of copying the value of timeout variable to *remaining_timeout
> before return, update the *remaining_timeout after each DMA fence wait.
> Set it to 0 on -ETIME, -EINTR or -ERESTARTSYS, and assume no time has been
> consumed on other errors returned from the wait.
> 
> Fixes: b97060a99b01 ("drm/i915/guc: Update intel_gt_wait_for_idle to work with GuC")
> Signed-off-by: Janusz Krzysztofik <janusz.krzysztofik@linux.intel.com>
> Cc: stable@vger.kernel.org # v5.15+
> ---
>   drivers/gpu/drm/i915/gt/intel_gt_requests.c | 23 ++++++++++++++++++---
>   1 file changed, 20 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/gt/intel_gt_requests.c b/drivers/gpu/drm/i915/gt/intel_gt_requests.c
> index edb881d756309..ccaf2fd80625b 100644
> --- a/drivers/gpu/drm/i915/gt/intel_gt_requests.c
> +++ b/drivers/gpu/drm/i915/gt/intel_gt_requests.c
> @@ -138,6 +138,9 @@ long intel_gt_retire_requests_timeout(struct intel_gt *gt, long timeout,
>   	unsigned long active_count = 0;
>   	LIST_HEAD(free);
>   
> +	if (remaining_timeout)
> +		*remaining_timeout = timeout;
> +
>   	flush_submission(gt, timeout); /* kick the ksoftirqd tasklets */
>   	spin_lock(&timelines->lock);
>   	list_for_each_entry_safe(tl, tn, &timelines->active_list, link) {
> @@ -163,6 +166,23 @@ long intel_gt_retire_requests_timeout(struct intel_gt *gt, long timeout,
>   								 timeout);
>   				dma_fence_put(fence);
>   
> +				if (remaining_timeout) {
> +					/*
> +					 * If we get an error here but request
> +					 * retirement succeeds anyway
> +					 * (!active_count) and we return 0, the
> +					 * caller may want to spend remaining
> +					 * time on waiting for other events.
> +					 */
> +					if (timeout == -ETIME ||
> +					    timeout == -EINTR ||
> +					    timeout == -ERESTARTSYS)
> +						*remaining_timeout = 0;
> +					else if (timeout >= 0)
> +						*remaining_timeout = timeout;
> +					/* else assume no time consumed */

Looks correct, but the crazy semantic of dma_fence_wait_timeout does not 
make it easy to understand.

Reviewed-by: Andrzej Hajda <andrzej.hajda@intel.com>

Regards
Andrzej


> +				}
> +
>   				/* Retirement is best effort */
>   				if (!mutex_trylock(&tl->mutex)) {
>   					active_count++;
> @@ -196,9 +216,6 @@ out_active:	spin_lock(&timelines->lock);
>   	if (flush_submission(gt, timeout)) /* Wait, there's more! */
>   		active_count++;
>   
> -	if (remaining_timeout)
> -		*remaining_timeout = timeout;
> -
>   	return active_count ? timeout : 0;
>   }
>   


^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [Intel-gfx] [PATCH 1/3] drm/i915: Fix negative remaining time after retire requests
@ 2022-11-16 13:13     ` Andrzej Hajda
  0 siblings, 0 replies; 22+ messages in thread
From: Andrzej Hajda @ 2022-11-16 13:13 UTC (permalink / raw)
  To: Janusz Krzysztofik, Tvrtko Ursulin, Joonas Lahtinen
  Cc: Daniel Vetter, intel-gfx, dri-devel, Chris Wilson

On 16.11.2022 12:25, Janusz Krzysztofik wrote:
> Commit b97060a99b01 ("drm/i915/guc: Update intel_gt_wait_for_idle to work
> with GuC") extended the API of intel_gt_retire_requests_timeout() with an
> extra argument 'remaining_timeout', intended for passing back unconsumed
> portion of requested timeout when 0 (success) is returned.  However, when
> request retirement happens to succeed despite an error returned by
> dma_fence_wait_timeout(), the error code (a negative value) is passed back
> instead of remaining time.  If a user then passes that negative value
> forward as requested timeout to another wait, an explicit WARN or BUG can
> be triggered.
> 
> Instead of copying the value of timeout variable to *remaining_timeout
> before return, update the *remaining_timeout after each DMA fence wait.
> Set it to 0 on -ETIME, -EINTR or -ERESTARTSYS, and assume no time has been
> consumed on other errors returned from the wait.
> 
> Fixes: b97060a99b01 ("drm/i915/guc: Update intel_gt_wait_for_idle to work with GuC")
> Signed-off-by: Janusz Krzysztofik <janusz.krzysztofik@linux.intel.com>
> Cc: stable@vger.kernel.org # v5.15+
> ---
>   drivers/gpu/drm/i915/gt/intel_gt_requests.c | 23 ++++++++++++++++++---
>   1 file changed, 20 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/gt/intel_gt_requests.c b/drivers/gpu/drm/i915/gt/intel_gt_requests.c
> index edb881d756309..ccaf2fd80625b 100644
> --- a/drivers/gpu/drm/i915/gt/intel_gt_requests.c
> +++ b/drivers/gpu/drm/i915/gt/intel_gt_requests.c
> @@ -138,6 +138,9 @@ long intel_gt_retire_requests_timeout(struct intel_gt *gt, long timeout,
>   	unsigned long active_count = 0;
>   	LIST_HEAD(free);
>   
> +	if (remaining_timeout)
> +		*remaining_timeout = timeout;
> +
>   	flush_submission(gt, timeout); /* kick the ksoftirqd tasklets */
>   	spin_lock(&timelines->lock);
>   	list_for_each_entry_safe(tl, tn, &timelines->active_list, link) {
> @@ -163,6 +166,23 @@ long intel_gt_retire_requests_timeout(struct intel_gt *gt, long timeout,
>   								 timeout);
>   				dma_fence_put(fence);
>   
> +				if (remaining_timeout) {
> +					/*
> +					 * If we get an error here but request
> +					 * retirement succeeds anyway
> +					 * (!active_count) and we return 0, the
> +					 * caller may want to spend remaining
> +					 * time on waiting for other events.
> +					 */
> +					if (timeout == -ETIME ||
> +					    timeout == -EINTR ||
> +					    timeout == -ERESTARTSYS)
> +						*remaining_timeout = 0;
> +					else if (timeout >= 0)
> +						*remaining_timeout = timeout;
> +					/* else assume no time consumed */

Looks correct, but the crazy semantic of dma_fence_wait_timeout does not 
make it easy to understand.

Reviewed-by: Andrzej Hajda <andrzej.hajda@intel.com>

Regards
Andrzej


> +				}
> +
>   				/* Retirement is best effort */
>   				if (!mutex_trylock(&tl->mutex)) {
>   					active_count++;
> @@ -196,9 +216,6 @@ out_active:	spin_lock(&timelines->lock);
>   	if (flush_submission(gt, timeout)) /* Wait, there's more! */
>   		active_count++;
>   
> -	if (remaining_timeout)
> -		*remaining_timeout = timeout;
> -
>   	return active_count ? timeout : 0;
>   }
>   


^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [Intel-gfx] [PATCH 2/3] drm/i915: Never return 0 on timeout when retiring requests
  2022-11-16 11:25   ` [Intel-gfx] " Janusz Krzysztofik
@ 2022-11-16 14:15     ` Andrzej Hajda
  -1 siblings, 0 replies; 22+ messages in thread
From: Andrzej Hajda @ 2022-11-16 14:15 UTC (permalink / raw)
  To: Janusz Krzysztofik, Tvrtko Ursulin, Joonas Lahtinen
  Cc: intel-gfx, dri-devel, Chris Wilson

On 16.11.2022 12:25, Janusz Krzysztofik wrote:
> Users of intel_gt_retire_requests_timeout() expect 0 return value on
> success.  However, we have no protection from passing back 0 potentially
> returned by dma_fence_wait_timeout() on timeout.
> 
> Replace 0 with -ETIME before using timeout as return value.
> 
> Fixes: f33a8a51602c ("drm/i915: Merge wait_for_timelines with retire_request")
> Signed-off-by: Janusz Krzysztofik <janusz.krzysztofik@linux.intel.com>
> Cc: stable@vger.kernel.org # v5.5+
> ---
>   drivers/gpu/drm/i915/gt/intel_gt_requests.c | 3 +++
>   1 file changed, 3 insertions(+)
> 
> diff --git a/drivers/gpu/drm/i915/gt/intel_gt_requests.c b/drivers/gpu/drm/i915/gt/intel_gt_requests.c
> index ccaf2fd80625b..ac6b2b1861397 100644
> --- a/drivers/gpu/drm/i915/gt/intel_gt_requests.c
> +++ b/drivers/gpu/drm/i915/gt/intel_gt_requests.c
> @@ -213,6 +213,9 @@ out_active:	spin_lock(&timelines->lock);
>   	list_for_each_entry_safe(tl, tn, &free, link)
>   		__intel_timeline_free(&tl->kref);
>   
> +	if (!timeout)
> +		timeout = -ETIME;
> +



Reviewed-by: Andrzej Hajda <andrzej.hajda@intel.com>

Regards
Andrzej

>   	if (flush_submission(gt, timeout)) /* Wait, there's more! */
>   		active_count++;
>   


^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [Intel-gfx] [PATCH 2/3] drm/i915: Never return 0 on timeout when retiring requests
@ 2022-11-16 14:15     ` Andrzej Hajda
  0 siblings, 0 replies; 22+ messages in thread
From: Andrzej Hajda @ 2022-11-16 14:15 UTC (permalink / raw)
  To: Janusz Krzysztofik, Tvrtko Ursulin, Joonas Lahtinen
  Cc: Daniel Vetter, intel-gfx, dri-devel, Chris Wilson

On 16.11.2022 12:25, Janusz Krzysztofik wrote:
> Users of intel_gt_retire_requests_timeout() expect 0 return value on
> success.  However, we have no protection from passing back 0 potentially
> returned by dma_fence_wait_timeout() on timeout.
> 
> Replace 0 with -ETIME before using timeout as return value.
> 
> Fixes: f33a8a51602c ("drm/i915: Merge wait_for_timelines with retire_request")
> Signed-off-by: Janusz Krzysztofik <janusz.krzysztofik@linux.intel.com>
> Cc: stable@vger.kernel.org # v5.5+
> ---
>   drivers/gpu/drm/i915/gt/intel_gt_requests.c | 3 +++
>   1 file changed, 3 insertions(+)
> 
> diff --git a/drivers/gpu/drm/i915/gt/intel_gt_requests.c b/drivers/gpu/drm/i915/gt/intel_gt_requests.c
> index ccaf2fd80625b..ac6b2b1861397 100644
> --- a/drivers/gpu/drm/i915/gt/intel_gt_requests.c
> +++ b/drivers/gpu/drm/i915/gt/intel_gt_requests.c
> @@ -213,6 +213,9 @@ out_active:	spin_lock(&timelines->lock);
>   	list_for_each_entry_safe(tl, tn, &free, link)
>   		__intel_timeline_free(&tl->kref);
>   
> +	if (!timeout)
> +		timeout = -ETIME;
> +



Reviewed-by: Andrzej Hajda <andrzej.hajda@intel.com>

Regards
Andrzej

>   	if (flush_submission(gt, timeout)) /* Wait, there's more! */
>   		active_count++;
>   


^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [Intel-gfx] [PATCH 3/3] drm/i915: Never return 0 if request wait succeeds
  2022-11-16 11:25   ` [Intel-gfx] " Janusz Krzysztofik
@ 2022-11-16 14:42     ` Andrzej Hajda
  -1 siblings, 0 replies; 22+ messages in thread
From: Andrzej Hajda @ 2022-11-16 14:42 UTC (permalink / raw)
  To: Janusz Krzysztofik, Tvrtko Ursulin, Joonas Lahtinen
  Cc: intel-gfx, dri-devel, Chris Wilson

On 16.11.2022 12:25, Janusz Krzysztofik wrote:
> According to the docs of i915_request_wait_timeout(), its return value
> "may be zero if the request is unfinished after the timeout expires."
> However, 0 is also returned when the request is found finished right
> after the timeout has expired.
> 
> Since the docs also state: "If the timeout is 0, it will return 1 if the
> fence is signaled.", return 1 also when the fence is found signaled after
> non-zero timeout has expired.

As I understand the patch "drm/i915: Fix i915_request fence wait 
semantics", and the docs "timeout is 0" means the initial value of 
timeout argument and this is handled already on the beginning of the 
function.
In case initial timeout is greater than zero and then it expires, 
function should return 0 regardless of fence state. This is what I have 
understood from reading docs and implementation of 
dma_fence_default_wait [1], which should be the best source of info 
about "dma_fence wait semantic".

As I said already, mixing remaining time and bool in return value of 
dma_fence_wait* functions is very confusing, but changing it would 
require major rework of the framework.

[1]: 
https://elixir.bootlin.com/linux/latest/source/drivers/dma-buf/dma-fence.c#L753

Regards
Andrzej

> 
> Fixes: 7e2e69ed4678 ("drm/i915: Fix i915_request fence wait semantics")
> Signed-off-by: Janusz Krzysztofik <janusz.krzysztofik@linux.intel.com>
> Cc: stable@vger.kernel.org # v5.17
> ---
>   drivers/gpu/drm/i915/i915_request.c | 2 ++
>   1 file changed, 2 insertions(+)
> 
> diff --git a/drivers/gpu/drm/i915/i915_request.c b/drivers/gpu/drm/i915/i915_request.c
> index f949a9495758a..406ddfafbed4d 100644
> --- a/drivers/gpu/drm/i915/i915_request.c
> +++ b/drivers/gpu/drm/i915/i915_request.c
> @@ -2079,6 +2079,8 @@ long i915_request_wait_timeout(struct i915_request *rq,
>   
>   		timeout = io_schedule_timeout(timeout);
>   	}
> +	if (!timeout)	/* expired but signaled, we shouldn't return 0 */
> +		timeout = 1;
>   	__set_current_state(TASK_RUNNING);
>   
>   	if (READ_ONCE(wait.tsk))


^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [Intel-gfx] [PATCH 3/3] drm/i915: Never return 0 if request wait succeeds
@ 2022-11-16 14:42     ` Andrzej Hajda
  0 siblings, 0 replies; 22+ messages in thread
From: Andrzej Hajda @ 2022-11-16 14:42 UTC (permalink / raw)
  To: Janusz Krzysztofik, Tvrtko Ursulin, Joonas Lahtinen
  Cc: Daniel Vetter, intel-gfx, dri-devel, Chris Wilson

On 16.11.2022 12:25, Janusz Krzysztofik wrote:
> According to the docs of i915_request_wait_timeout(), its return value
> "may be zero if the request is unfinished after the timeout expires."
> However, 0 is also returned when the request is found finished right
> after the timeout has expired.
> 
> Since the docs also state: "If the timeout is 0, it will return 1 if the
> fence is signaled.", return 1 also when the fence is found signaled after
> non-zero timeout has expired.

As I understand the patch "drm/i915: Fix i915_request fence wait 
semantics", and the docs "timeout is 0" means the initial value of 
timeout argument and this is handled already on the beginning of the 
function.
In case initial timeout is greater than zero and then it expires, 
function should return 0 regardless of fence state. This is what I have 
understood from reading docs and implementation of 
dma_fence_default_wait [1], which should be the best source of info 
about "dma_fence wait semantic".

As I said already, mixing remaining time and bool in return value of 
dma_fence_wait* functions is very confusing, but changing it would 
require major rework of the framework.

[1]: 
https://elixir.bootlin.com/linux/latest/source/drivers/dma-buf/dma-fence.c#L753

Regards
Andrzej

> 
> Fixes: 7e2e69ed4678 ("drm/i915: Fix i915_request fence wait semantics")
> Signed-off-by: Janusz Krzysztofik <janusz.krzysztofik@linux.intel.com>
> Cc: stable@vger.kernel.org # v5.17
> ---
>   drivers/gpu/drm/i915/i915_request.c | 2 ++
>   1 file changed, 2 insertions(+)
> 
> diff --git a/drivers/gpu/drm/i915/i915_request.c b/drivers/gpu/drm/i915/i915_request.c
> index f949a9495758a..406ddfafbed4d 100644
> --- a/drivers/gpu/drm/i915/i915_request.c
> +++ b/drivers/gpu/drm/i915/i915_request.c
> @@ -2079,6 +2079,8 @@ long i915_request_wait_timeout(struct i915_request *rq,
>   
>   		timeout = io_schedule_timeout(timeout);
>   	}
> +	if (!timeout)	/* expired but signaled, we shouldn't return 0 */
> +		timeout = 1;
>   	__set_current_state(TASK_RUNNING);
>   
>   	if (READ_ONCE(wait.tsk))


^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [Intel-gfx] [PATCH 3/3] drm/i915: Never return 0 if request wait succeeds
  2022-11-16 14:42     ` Andrzej Hajda
@ 2022-11-16 15:48       ` Janusz Krzysztofik
  -1 siblings, 0 replies; 22+ messages in thread
From: Janusz Krzysztofik @ 2022-11-16 15:48 UTC (permalink / raw)
  To: Tvrtko Ursulin, Joonas Lahtinen, Andrzej Hajda
  Cc: intel-gfx, dri-devel, Chris Wilson

On Wednesday, 16 November 2022 15:42:46 CET Andrzej Hajda wrote:
> On 16.11.2022 12:25, Janusz Krzysztofik wrote:
> > According to the docs of i915_request_wait_timeout(), its return value
> > "may be zero if the request is unfinished after the timeout expires."
> > However, 0 is also returned when the request is found finished right
> > after the timeout has expired.
> > 
> > Since the docs also state: "If the timeout is 0, it will return 1 if the
> > fence is signaled.", return 1 also when the fence is found signaled after
> > non-zero timeout has expired.
> 
> As I understand the patch "drm/i915: Fix i915_request fence wait 
> semantics", and the docs "timeout is 0" means the initial value of 
> timeout argument and this is handled already on the beginning of the 
> function.
> In case initial timeout is greater than zero and then it expires, 
> function should return 0 regardless of fence state. This is what I have 
> understood from reading docs and implementation of 
> dma_fence_default_wait [1], which should be the best source of info 
> about "dma_fence wait semantic".
> 
> As I said already, mixing remaining time and bool in return value of 
> dma_fence_wait* functions is very confusing, but changing it would 
> require major rework of the framework.

OK, let's drop this controversial patch.  The corner case it touches should 
already be handled correctly by intel_gt_retire_requests_timeout(), which this 
series is about, after 1/3 and 2/3 are applied.

Thanks,
Janusz

> 
> [1]: 
> https://elixir.bootlin.com/linux/latest/source/drivers/dma-buf/dma-fence.c#L753
> 
> Regards
> Andrzej
> 
> > 
> > Fixes: 7e2e69ed4678 ("drm/i915: Fix i915_request fence wait semantics")
> > Signed-off-by: Janusz Krzysztofik <janusz.krzysztofik@linux.intel.com>
> > Cc: stable@vger.kernel.org # v5.17
> > ---
> >   drivers/gpu/drm/i915/i915_request.c | 2 ++
> >   1 file changed, 2 insertions(+)
> > 
> > diff --git a/drivers/gpu/drm/i915/i915_request.c b/drivers/gpu/drm/i915/i915_request.c
> > index f949a9495758a..406ddfafbed4d 100644
> > --- a/drivers/gpu/drm/i915/i915_request.c
> > +++ b/drivers/gpu/drm/i915/i915_request.c
> > @@ -2079,6 +2079,8 @@ long i915_request_wait_timeout(struct i915_request *rq,
> >   
> >   		timeout = io_schedule_timeout(timeout);
> >   	}
> > +	if (!timeout)	/* expired but signaled, we shouldn't return 0 */
> > +		timeout = 1;
> >   	__set_current_state(TASK_RUNNING);
> >   
> >   	if (READ_ONCE(wait.tsk))
> 
> 





^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [Intel-gfx] [PATCH 3/3] drm/i915: Never return 0 if request wait succeeds
@ 2022-11-16 15:48       ` Janusz Krzysztofik
  0 siblings, 0 replies; 22+ messages in thread
From: Janusz Krzysztofik @ 2022-11-16 15:48 UTC (permalink / raw)
  To: Tvrtko Ursulin, Joonas Lahtinen, Andrzej Hajda
  Cc: Daniel Vetter, intel-gfx, dri-devel, Chris Wilson

On Wednesday, 16 November 2022 15:42:46 CET Andrzej Hajda wrote:
> On 16.11.2022 12:25, Janusz Krzysztofik wrote:
> > According to the docs of i915_request_wait_timeout(), its return value
> > "may be zero if the request is unfinished after the timeout expires."
> > However, 0 is also returned when the request is found finished right
> > after the timeout has expired.
> > 
> > Since the docs also state: "If the timeout is 0, it will return 1 if the
> > fence is signaled.", return 1 also when the fence is found signaled after
> > non-zero timeout has expired.
> 
> As I understand the patch "drm/i915: Fix i915_request fence wait 
> semantics", and the docs "timeout is 0" means the initial value of 
> timeout argument and this is handled already on the beginning of the 
> function.
> In case initial timeout is greater than zero and then it expires, 
> function should return 0 regardless of fence state. This is what I have 
> understood from reading docs and implementation of 
> dma_fence_default_wait [1], which should be the best source of info 
> about "dma_fence wait semantic".
> 
> As I said already, mixing remaining time and bool in return value of 
> dma_fence_wait* functions is very confusing, but changing it would 
> require major rework of the framework.

OK, let's drop this controversial patch.  The corner case it touches should 
already be handled correctly by intel_gt_retire_requests_timeout(), which this 
series is about, after 1/3 and 2/3 are applied.

Thanks,
Janusz

> 
> [1]: 
> https://elixir.bootlin.com/linux/latest/source/drivers/dma-buf/dma-fence.c#L753
> 
> Regards
> Andrzej
> 
> > 
> > Fixes: 7e2e69ed4678 ("drm/i915: Fix i915_request fence wait semantics")
> > Signed-off-by: Janusz Krzysztofik <janusz.krzysztofik@linux.intel.com>
> > Cc: stable@vger.kernel.org # v5.17
> > ---
> >   drivers/gpu/drm/i915/i915_request.c | 2 ++
> >   1 file changed, 2 insertions(+)
> > 
> > diff --git a/drivers/gpu/drm/i915/i915_request.c b/drivers/gpu/drm/i915/i915_request.c
> > index f949a9495758a..406ddfafbed4d 100644
> > --- a/drivers/gpu/drm/i915/i915_request.c
> > +++ b/drivers/gpu/drm/i915/i915_request.c
> > @@ -2079,6 +2079,8 @@ long i915_request_wait_timeout(struct i915_request *rq,
> >   
> >   		timeout = io_schedule_timeout(timeout);
> >   	}
> > +	if (!timeout)	/* expired but signaled, we shouldn't return 0 */
> > +		timeout = 1;
> >   	__set_current_state(TASK_RUNNING);
> >   
> >   	if (READ_ONCE(wait.tsk))
> 
> 





^ permalink raw reply	[flat|nested] 22+ messages in thread

* [Intel-gfx] ✓ Fi.CI.BAT: success for drm/i915: Fix timeout handling when retiring requests
  2022-11-16 11:25 ` [Intel-gfx] " Janusz Krzysztofik
                   ` (3 preceding siblings ...)
  (?)
@ 2022-11-16 19:06 ` Patchwork
  -1 siblings, 0 replies; 22+ messages in thread
From: Patchwork @ 2022-11-16 19:06 UTC (permalink / raw)
  To: Janusz Krzysztofik; +Cc: intel-gfx

[-- Attachment #1: Type: text/plain, Size: 7095 bytes --]

== Series Details ==

Series: drm/i915: Fix timeout handling when retiring requests
URL   : https://patchwork.freedesktop.org/series/110964/
State : success

== Summary ==

CI Bug Log - changes from CI_DRM_12389 -> Patchwork_110964v1
====================================================

Summary
-------

  **SUCCESS**

  No regressions found.

  External URL: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_110964v1/index.html

Participating hosts (40 -> 39)
------------------------------

  Missing    (1): fi-pnv-d510 

Known issues
------------

  Here are the changes found in Patchwork_110964v1 that come from known issues:

### IGT changes ###

#### Issues hit ####

  * igt@gem_lmem_swapping@basic:
    - fi-apl-guc:         NOTRUN -> [SKIP][1] ([fdo#109271] / [i915#4613]) +3 similar issues
   [1]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_110964v1/fi-apl-guc/igt@gem_lmem_swapping@basic.html

  * igt@i915_pm_rpm@basic-pci-d3-state:
    - bat-adlp-4:         [PASS][2] -> [DMESG-WARN][3] ([i915#7077])
   [2]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12389/bat-adlp-4/igt@i915_pm_rpm@basic-pci-d3-state.html
   [3]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_110964v1/bat-adlp-4/igt@i915_pm_rpm@basic-pci-d3-state.html

  * igt@i915_selftest@live@gt_engines:
    - bat-dg1-6:          [PASS][4] -> [INCOMPLETE][5] ([i915#4418])
   [4]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12389/bat-dg1-6/igt@i915_selftest@live@gt_engines.html
   [5]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_110964v1/bat-dg1-6/igt@i915_selftest@live@gt_engines.html

  * igt@i915_selftest@live@hangcheck:
    - fi-hsw-4770:        [PASS][6] -> [INCOMPLETE][7] ([i915#4785])
   [6]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12389/fi-hsw-4770/igt@i915_selftest@live@hangcheck.html
   [7]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_110964v1/fi-hsw-4770/igt@i915_selftest@live@hangcheck.html

  * igt@kms_chamelium@hdmi-crc-fast:
    - fi-apl-guc:         NOTRUN -> [SKIP][8] ([fdo#109271] / [fdo#111827]) +8 similar issues
   [8]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_110964v1/fi-apl-guc/igt@kms_chamelium@hdmi-crc-fast.html

  * igt@kms_psr@sprite_plane_onoff:
    - fi-apl-guc:         NOTRUN -> [SKIP][9] ([fdo#109271]) +11 similar issues
   [9]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_110964v1/fi-apl-guc/igt@kms_psr@sprite_plane_onoff.html

  * igt@runner@aborted:
    - fi-hsw-4770:        NOTRUN -> [FAIL][10] ([fdo#109271] / [i915#4312] / [i915#5594])
   [10]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_110964v1/fi-hsw-4770/igt@runner@aborted.html
    - bat-adlp-4:         NOTRUN -> [FAIL][11] ([i915#4312])
   [11]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_110964v1/bat-adlp-4/igt@runner@aborted.html
    - bat-dg1-6:          NOTRUN -> [FAIL][12] ([i915#4312])
   [12]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_110964v1/bat-dg1-6/igt@runner@aborted.html

  
#### Possible fixes ####

  * igt@gem_exec_suspend@basic-s0@smem:
    - {bat-rpls-2}:       [DMESG-WARN][13] ([i915#6434]) -> [PASS][14]
   [13]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12389/bat-rpls-2/igt@gem_exec_suspend@basic-s0@smem.html
   [14]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_110964v1/bat-rpls-2/igt@gem_exec_suspend@basic-s0@smem.html

  * igt@gem_render_tiled_blits@basic:
    - fi-apl-guc:         [INCOMPLETE][15] ([i915#7056]) -> [PASS][16]
   [15]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12389/fi-apl-guc/igt@gem_render_tiled_blits@basic.html
   [16]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_110964v1/fi-apl-guc/igt@gem_render_tiled_blits@basic.html

  * igt@i915_module_load@reload:
    - fi-kbl-soraka:      [DMESG-WARN][17] ([i915#1982]) -> [PASS][18]
   [17]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12389/fi-kbl-soraka/igt@i915_module_load@reload.html
   [18]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_110964v1/fi-kbl-soraka/igt@i915_module_load@reload.html

  * igt@i915_pm_rpm@module-reload:
    - {bat-rpls-2}:       [WARN][19] ([i915#7346]) -> [PASS][20]
   [19]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12389/bat-rpls-2/igt@i915_pm_rpm@module-reload.html
   [20]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_110964v1/bat-rpls-2/igt@i915_pm_rpm@module-reload.html

  * igt@i915_selftest@live@gt_engines:
    - {bat-dg1-7}:        [INCOMPLETE][21] ([i915#4418]) -> [PASS][22]
   [21]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12389/bat-dg1-7/igt@i915_selftest@live@gt_engines.html
   [22]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_110964v1/bat-dg1-7/igt@i915_selftest@live@gt_engines.html

  * igt@i915_selftest@live@gt_heartbeat:
    - fi-kbl-soraka:      [DMESG-FAIL][23] ([i915#5334]) -> [PASS][24]
   [23]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12389/fi-kbl-soraka/igt@i915_selftest@live@gt_heartbeat.html
   [24]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_110964v1/fi-kbl-soraka/igt@i915_selftest@live@gt_heartbeat.html

  
  {name}: This element is suppressed. This means it is ignored when computing
          the status of the difference (SUCCESS, WARNING, or FAILURE).

  [fdo#109271]: https://bugs.freedesktop.org/show_bug.cgi?id=109271
  [fdo#111827]: https://bugs.freedesktop.org/show_bug.cgi?id=111827
  [i915#1982]: https://gitlab.freedesktop.org/drm/intel/issues/1982
  [i915#2867]: https://gitlab.freedesktop.org/drm/intel/issues/2867
  [i915#4078]: https://gitlab.freedesktop.org/drm/intel/issues/4078
  [i915#4312]: https://gitlab.freedesktop.org/drm/intel/issues/4312
  [i915#4418]: https://gitlab.freedesktop.org/drm/intel/issues/4418
  [i915#4613]: https://gitlab.freedesktop.org/drm/intel/issues/4613
  [i915#4785]: https://gitlab.freedesktop.org/drm/intel/issues/4785
  [i915#5334]: https://gitlab.freedesktop.org/drm/intel/issues/5334
  [i915#5594]: https://gitlab.freedesktop.org/drm/intel/issues/5594
  [i915#6367]: https://gitlab.freedesktop.org/drm/intel/issues/6367
  [i915#6434]: https://gitlab.freedesktop.org/drm/intel/issues/6434
  [i915#6997]: https://gitlab.freedesktop.org/drm/intel/issues/6997
  [i915#7056]: https://gitlab.freedesktop.org/drm/intel/issues/7056
  [i915#7077]: https://gitlab.freedesktop.org/drm/intel/issues/7077
  [i915#7346]: https://gitlab.freedesktop.org/drm/intel/issues/7346


Build changes
-------------

  * Linux: CI_DRM_12389 -> Patchwork_110964v1

  CI-20190529: 20190529
  CI_DRM_12389: 3eb60fea7cd44530f85bd9362508647aceea1dea @ git://anongit.freedesktop.org/gfx-ci/linux
  IGT_7061: d16e679b99d43643c3a0bcc34ec4c4f001dbbbb4 @ https://gitlab.freedesktop.org/drm/igt-gpu-tools.git
  Patchwork_110964v1: 3eb60fea7cd44530f85bd9362508647aceea1dea @ git://anongit.freedesktop.org/gfx-ci/linux


### Linux commits

c7a3e7c22a6e drm/i915: Never return 0 if request wait succeeds
0a08e4fdefe7 drm/i915: Never return 0 on timeout when retiring requests
36453b32988e drm/i915: Fix negative remaining time after retire requests

== Logs ==

For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_110964v1/index.html

[-- Attachment #2: Type: text/html, Size: 8195 bytes --]

^ permalink raw reply	[flat|nested] 22+ messages in thread

* [Intel-gfx] ✗ Fi.CI.IGT: failure for drm/i915: Fix timeout handling when retiring requests
  2022-11-16 11:25 ` [Intel-gfx] " Janusz Krzysztofik
                   ` (4 preceding siblings ...)
  (?)
@ 2022-11-17  3:27 ` Patchwork
  -1 siblings, 0 replies; 22+ messages in thread
From: Patchwork @ 2022-11-17  3:27 UTC (permalink / raw)
  To: Janusz Krzysztofik; +Cc: intel-gfx

[-- Attachment #1: Type: text/plain, Size: 31155 bytes --]

== Series Details ==

Series: drm/i915: Fix timeout handling when retiring requests
URL   : https://patchwork.freedesktop.org/series/110964/
State : failure

== Summary ==

CI Bug Log - changes from CI_DRM_12389_full -> Patchwork_110964v1_full
====================================================

Summary
-------

  **FAILURE**

  Serious unknown changes coming with Patchwork_110964v1_full absolutely need to be
  verified manually.
  
  If you think the reported changes have nothing to do with the changes
  introduced in Patchwork_110964v1_full, please notify your bug team to allow them
  to document this new failure mode, which will reduce false positives in CI.

  

Participating hosts (10 -> 9)
------------------------------

  Missing    (1): shard-rkl 

Possible new issues
-------------------

  Here are the unknown changes that may have been introduced in Patchwork_110964v1_full:

### IGT changes ###

#### Possible regressions ####

  * igt@gem_eio@reset-stress:
    - shard-skl:          [PASS][1] -> [INCOMPLETE][2]
   [1]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12389/shard-skl4/igt@gem_eio@reset-stress.html
   [2]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_110964v1/shard-skl9/igt@gem_eio@reset-stress.html

  
Known issues
------------

  Here are the changes found in Patchwork_110964v1_full that come from known issues:

### CI changes ###

#### Possible fixes ####

  * boot:
    - shard-glk:          ([PASS][3], [PASS][4], [PASS][5], [PASS][6], [PASS][7], [PASS][8], [PASS][9], [PASS][10], [PASS][11], [PASS][12], [PASS][13], [PASS][14], [PASS][15], [PASS][16], [PASS][17], [PASS][18], [PASS][19], [PASS][20], [PASS][21], [PASS][22], [PASS][23], [FAIL][24], [FAIL][25], [PASS][26], [PASS][27]) ([i915#4392]) -> ([PASS][28], [PASS][29], [PASS][30], [PASS][31], [PASS][32], [PASS][33], [PASS][34], [PASS][35], [PASS][36], [PASS][37], [PASS][38], [PASS][39], [PASS][40], [PASS][41], [PASS][42], [PASS][43], [PASS][44], [PASS][45], [PASS][46], [PASS][47], [PASS][48], [PASS][49], [PASS][50], [PASS][51], [PASS][52])
   [3]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12389/shard-glk1/boot.html
   [4]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12389/shard-glk1/boot.html
   [5]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12389/shard-glk1/boot.html
   [6]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12389/shard-glk2/boot.html
   [7]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12389/shard-glk2/boot.html
   [8]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12389/shard-glk2/boot.html
   [9]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12389/shard-glk3/boot.html
   [10]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12389/shard-glk3/boot.html
   [11]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12389/shard-glk3/boot.html
   [12]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12389/shard-glk5/boot.html
   [13]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12389/shard-glk5/boot.html
   [14]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12389/shard-glk5/boot.html
   [15]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12389/shard-glk5/boot.html
   [16]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12389/shard-glk6/boot.html
   [17]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12389/shard-glk6/boot.html
   [18]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12389/shard-glk6/boot.html
   [19]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12389/shard-glk7/boot.html
   [20]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12389/shard-glk7/boot.html
   [21]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12389/shard-glk7/boot.html
   [22]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12389/shard-glk8/boot.html
   [23]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12389/shard-glk8/boot.html
   [24]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12389/shard-glk8/boot.html
   [25]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12389/shard-glk9/boot.html
   [26]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12389/shard-glk9/boot.html
   [27]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12389/shard-glk9/boot.html
   [28]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_110964v1/shard-glk9/boot.html
   [29]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_110964v1/shard-glk9/boot.html
   [30]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_110964v1/shard-glk9/boot.html
   [31]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_110964v1/shard-glk8/boot.html
   [32]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_110964v1/shard-glk8/boot.html
   [33]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_110964v1/shard-glk8/boot.html
   [34]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_110964v1/shard-glk7/boot.html
   [35]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_110964v1/shard-glk7/boot.html
   [36]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_110964v1/shard-glk7/boot.html
   [37]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_110964v1/shard-glk6/boot.html
   [38]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_110964v1/shard-glk6/boot.html
   [39]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_110964v1/shard-glk6/boot.html
   [40]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_110964v1/shard-glk5/boot.html
   [41]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_110964v1/shard-glk5/boot.html
   [42]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_110964v1/shard-glk5/boot.html
   [43]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_110964v1/shard-glk5/boot.html
   [44]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_110964v1/shard-glk3/boot.html
   [45]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_110964v1/shard-glk3/boot.html
   [46]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_110964v1/shard-glk3/boot.html
   [47]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_110964v1/shard-glk2/boot.html
   [48]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_110964v1/shard-glk2/boot.html
   [49]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_110964v1/shard-glk2/boot.html
   [50]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_110964v1/shard-glk1/boot.html
   [51]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_110964v1/shard-glk1/boot.html
   [52]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_110964v1/shard-glk1/boot.html
    - shard-skl:          ([PASS][53], [PASS][54], [PASS][55], [PASS][56], [PASS][57], [PASS][58], [PASS][59], [PASS][60], [PASS][61], [PASS][62], [PASS][63], [PASS][64], [FAIL][65], [PASS][66], [PASS][67], [PASS][68], [PASS][69], [PASS][70], [PASS][71], [PASS][72], [PASS][73]) ([i915#5032]) -> ([PASS][74], [PASS][75], [PASS][76], [PASS][77], [PASS][78], [PASS][79], [PASS][80], [PASS][81], [PASS][82], [PASS][83], [PASS][84], [PASS][85], [PASS][86], [PASS][87], [PASS][88], [PASS][89], [PASS][90], [PASS][91], [PASS][92], [PASS][93], [PASS][94], [PASS][95], [PASS][96])
   [53]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12389/shard-skl10/boot.html
   [54]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12389/shard-skl10/boot.html
   [55]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12389/shard-skl10/boot.html
   [56]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12389/shard-skl1/boot.html
   [57]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12389/shard-skl1/boot.html
   [58]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12389/shard-skl2/boot.html
   [59]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12389/shard-skl2/boot.html
   [60]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12389/shard-skl3/boot.html
   [61]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12389/shard-skl4/boot.html
   [62]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12389/shard-skl4/boot.html
   [63]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12389/shard-skl4/boot.html
   [64]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12389/shard-skl5/boot.html
   [65]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12389/shard-skl5/boot.html
   [66]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12389/shard-skl5/boot.html
   [67]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12389/shard-skl6/boot.html
   [68]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12389/shard-skl6/boot.html
   [69]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12389/shard-skl7/boot.html
   [70]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12389/shard-skl7/boot.html
   [71]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12389/shard-skl9/boot.html
   [72]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12389/shard-skl9/boot.html
   [73]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12389/shard-skl9/boot.html
   [74]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_110964v1/shard-skl9/boot.html
   [75]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_110964v1/shard-skl9/boot.html
   [76]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_110964v1/shard-skl9/boot.html
   [77]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_110964v1/shard-skl7/boot.html
   [78]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_110964v1/shard-skl7/boot.html
   [79]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_110964v1/shard-skl7/boot.html
   [80]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_110964v1/shard-skl6/boot.html
   [81]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_110964v1/shard-skl6/boot.html
   [82]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_110964v1/shard-skl6/boot.html
   [83]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_110964v1/shard-skl5/boot.html
   [84]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_110964v1/shard-skl4/boot.html
   [85]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_110964v1/shard-skl4/boot.html
   [86]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_110964v1/shard-skl4/boot.html
   [87]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_110964v1/shard-skl3/boot.html
   [88]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_110964v1/shard-skl3/boot.html
   [89]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_110964v1/shard-skl2/boot.html
   [90]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_110964v1/shard-skl2/boot.html
   [91]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_110964v1/shard-skl1/boot.html
   [92]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_110964v1/shard-skl1/boot.html
   [93]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_110964v1/shard-skl1/boot.html
   [94]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_110964v1/shard-skl10/boot.html
   [95]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_110964v1/shard-skl10/boot.html
   [96]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_110964v1/shard-skl10/boot.html

  

### IGT changes ###

#### Issues hit ####

  * igt@dmabuf@all@dma_fence_chain:
    - shard-skl:          NOTRUN -> [INCOMPLETE][97] ([i915#6949])
   [97]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_110964v1/shard-skl2/igt@dmabuf@all@dma_fence_chain.html

  * igt@gem_exec_balancer@parallel-keep-submit-fence:
    - shard-iclb:         [PASS][98] -> [SKIP][99] ([i915#4525])
   [98]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12389/shard-iclb1/igt@gem_exec_balancer@parallel-keep-submit-fence.html
   [99]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_110964v1/shard-iclb7/igt@gem_exec_balancer@parallel-keep-submit-fence.html

  * igt@gem_exec_fair@basic-deadline:
    - shard-glk:          [PASS][100] -> [FAIL][101] ([i915#2846])
   [100]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12389/shard-glk2/igt@gem_exec_fair@basic-deadline.html
   [101]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_110964v1/shard-glk5/igt@gem_exec_fair@basic-deadline.html
    - shard-skl:          NOTRUN -> [FAIL][102] ([i915#2846])
   [102]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_110964v1/shard-skl5/igt@gem_exec_fair@basic-deadline.html

  * igt@gem_exec_fair@basic-none@vcs0:
    - shard-glk:          [PASS][103] -> [FAIL][104] ([i915#2842]) +1 similar issue
   [103]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12389/shard-glk7/igt@gem_exec_fair@basic-none@vcs0.html
   [104]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_110964v1/shard-glk2/igt@gem_exec_fair@basic-none@vcs0.html

  * igt@gem_lmem_swapping@basic:
    - shard-skl:          NOTRUN -> [SKIP][105] ([fdo#109271] / [i915#4613]) +2 similar issues
   [105]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_110964v1/shard-skl1/igt@gem_lmem_swapping@basic.html

  * igt@gem_userptr_blits@dmabuf-sync:
    - shard-skl:          NOTRUN -> [SKIP][106] ([fdo#109271] / [i915#3323])
   [106]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_110964v1/shard-skl2/igt@gem_userptr_blits@dmabuf-sync.html

  * igt@gen9_exec_parse@allowed-all:
    - shard-skl:          [PASS][107] -> [DMESG-WARN][108] ([i915#5566] / [i915#716])
   [107]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12389/shard-skl9/igt@gen9_exec_parse@allowed-all.html
   [108]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_110964v1/shard-skl9/igt@gen9_exec_parse@allowed-all.html

  * igt@i915_pm_dc@dc6-dpms:
    - shard-iclb:         [PASS][109] -> [FAIL][110] ([i915#3989] / [i915#454])
   [109]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12389/shard-iclb2/igt@i915_pm_dc@dc6-dpms.html
   [110]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_110964v1/shard-iclb3/igt@i915_pm_dc@dc6-dpms.html

  * igt@i915_selftest@live@hangcheck:
    - shard-tglb:         [PASS][111] -> [DMESG-WARN][112] ([i915#5591])
   [111]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12389/shard-tglb1/igt@i915_selftest@live@hangcheck.html
   [112]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_110964v1/shard-tglb1/igt@i915_selftest@live@hangcheck.html

  * igt@i915_suspend@forcewake:
    - shard-skl:          [PASS][113] -> [INCOMPLETE][114] ([i915#4817] / [i915#7233])
   [113]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12389/shard-skl2/igt@i915_suspend@forcewake.html
   [114]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_110964v1/shard-skl10/igt@i915_suspend@forcewake.html

  * igt@kms_async_flips@alternate-sync-async-flip@pipe-c-edp-1:
    - shard-skl:          NOTRUN -> [FAIL][115] ([i915#2521])
   [115]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_110964v1/shard-skl7/igt@kms_async_flips@alternate-sync-async-flip@pipe-c-edp-1.html

  * igt@kms_big_fb@y-tiled-max-hw-stride-64bpp-rotate-0-async-flip:
    - shard-skl:          NOTRUN -> [FAIL][116] ([i915#3763])
   [116]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_110964v1/shard-skl1/igt@kms_big_fb@y-tiled-max-hw-stride-64bpp-rotate-0-async-flip.html

  * igt@kms_big_fb@yf-tiled-max-hw-stride-64bpp-rotate-180-hflip-async-flip:
    - shard-glk:          NOTRUN -> [SKIP][117] ([fdo#109271]) +75 similar issues
   [117]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_110964v1/shard-glk7/igt@kms_big_fb@yf-tiled-max-hw-stride-64bpp-rotate-180-hflip-async-flip.html

  * igt@kms_ccs@pipe-a-bad-pixel-format-y_tiled_gen12_mc_ccs:
    - shard-skl:          NOTRUN -> [SKIP][118] ([fdo#109271] / [i915#3886]) +10 similar issues
   [118]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_110964v1/shard-skl7/igt@kms_ccs@pipe-a-bad-pixel-format-y_tiled_gen12_mc_ccs.html

  * igt@kms_ccs@pipe-a-crc-primary-rotation-180-y_tiled_gen12_rc_ccs_cc:
    - shard-glk:          NOTRUN -> [SKIP][119] ([fdo#109271] / [i915#3886]) +2 similar issues
   [119]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_110964v1/shard-glk5/igt@kms_ccs@pipe-a-crc-primary-rotation-180-y_tiled_gen12_rc_ccs_cc.html

  * igt@kms_chamelium@hdmi-crc-multiple:
    - shard-skl:          NOTRUN -> [SKIP][120] ([fdo#109271] / [fdo#111827]) +10 similar issues
   [120]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_110964v1/shard-skl7/igt@kms_chamelium@hdmi-crc-multiple.html

  * igt@kms_chamelium@hdmi-hpd-storm:
    - shard-glk:          NOTRUN -> [SKIP][121] ([fdo#109271] / [fdo#111827]) +4 similar issues
   [121]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_110964v1/shard-glk5/igt@kms_chamelium@hdmi-hpd-storm.html

  * igt@kms_cursor_legacy@flip-vs-cursor@toggle:
    - shard-iclb:         [PASS][122] -> [FAIL][123] ([i915#2346]) +3 similar issues
   [122]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12389/shard-iclb8/igt@kms_cursor_legacy@flip-vs-cursor@toggle.html
   [123]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_110964v1/shard-iclb7/igt@kms_cursor_legacy@flip-vs-cursor@toggle.html

  * igt@kms_flip@flip-vs-expired-vblank-interruptible@b-edp1:
    - shard-skl:          [PASS][124] -> [FAIL][125] ([i915#79]) +1 similar issue
   [124]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12389/shard-skl9/igt@kms_flip@flip-vs-expired-vblank-interruptible@b-edp1.html
   [125]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_110964v1/shard-skl9/igt@kms_flip@flip-vs-expired-vblank-interruptible@b-edp1.html

  * igt@kms_flip@plain-flip-ts-check-interruptible@c-edp1:
    - shard-skl:          NOTRUN -> [FAIL][126] ([i915#2122]) +1 similar issue
   [126]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_110964v1/shard-skl5/igt@kms_flip@plain-flip-ts-check-interruptible@c-edp1.html

  * igt@kms_flip_scaled_crc@flip-32bpp-linear-to-64bpp-linear-downscaling@pipe-a-default-mode:
    - shard-iclb:         NOTRUN -> [SKIP][127] ([i915#3555]) +1 similar issue
   [127]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_110964v1/shard-iclb2/igt@kms_flip_scaled_crc@flip-32bpp-linear-to-64bpp-linear-downscaling@pipe-a-default-mode.html

  * igt@kms_flip_scaled_crc@flip-64bpp-4tile-to-32bpp-4tiledg2rcccs-upscaling@pipe-a-default-mode:
    - shard-skl:          NOTRUN -> [SKIP][128] ([fdo#109271]) +170 similar issues
   [128]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_110964v1/shard-skl10/igt@kms_flip_scaled_crc@flip-64bpp-4tile-to-32bpp-4tiledg2rcccs-upscaling@pipe-a-default-mode.html

  * igt@kms_flip_scaled_crc@flip-64bpp-yftile-to-16bpp-yftile-downscaling@pipe-a-valid-mode:
    - shard-iclb:         NOTRUN -> [SKIP][129] ([i915#2587] / [i915#2672]) +2 similar issues
   [129]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_110964v1/shard-iclb5/igt@kms_flip_scaled_crc@flip-64bpp-yftile-to-16bpp-yftile-downscaling@pipe-a-valid-mode.html

  * igt@kms_flip_scaled_crc@flip-64bpp-yftile-to-32bpp-yftile-upscaling@pipe-a-default-mode:
    - shard-iclb:         NOTRUN -> [SKIP][130] ([i915#2672]) +4 similar issues
   [130]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_110964v1/shard-iclb2/igt@kms_flip_scaled_crc@flip-64bpp-yftile-to-32bpp-yftile-upscaling@pipe-a-default-mode.html

  * igt@kms_flip_scaled_crc@flip-64bpp-ytile-to-32bpp-ytilegen12rcccs-upscaling@pipe-a-default-mode:
    - shard-iclb:         NOTRUN -> [SKIP][131] ([i915#2672] / [i915#3555])
   [131]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_110964v1/shard-iclb2/igt@kms_flip_scaled_crc@flip-64bpp-ytile-to-32bpp-ytilegen12rcccs-upscaling@pipe-a-default-mode.html

  * igt@kms_psr2_su@frontbuffer-xrgb8888:
    - shard-skl:          NOTRUN -> [SKIP][132] ([fdo#109271] / [i915#658]) +2 similar issues
   [132]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_110964v1/shard-skl7/igt@kms_psr2_su@frontbuffer-xrgb8888.html

  * igt@kms_psr@psr2_cursor_blt:
    - shard-iclb:         [PASS][133] -> [SKIP][134] ([fdo#109441]) +2 similar issues
   [133]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12389/shard-iclb2/igt@kms_psr@psr2_cursor_blt.html
   [134]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_110964v1/shard-iclb5/igt@kms_psr@psr2_cursor_blt.html

  * igt@kms_psr_stress_test@flip-primary-invalidate-overlay:
    - shard-tglb:         [PASS][135] -> [SKIP][136] ([i915#5519])
   [135]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12389/shard-tglb8/igt@kms_psr_stress_test@flip-primary-invalidate-overlay.html
   [136]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_110964v1/shard-tglb7/igt@kms_psr_stress_test@flip-primary-invalidate-overlay.html

  * igt@kms_vblank@pipe-d-wait-idle:
    - shard-skl:          NOTRUN -> [SKIP][137] ([fdo#109271] / [i915#533])
   [137]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_110964v1/shard-skl2/igt@kms_vblank@pipe-d-wait-idle.html

  * igt@sysfs_clients@recycle-many:
    - shard-glk:          NOTRUN -> [SKIP][138] ([fdo#109271] / [i915#2994])
   [138]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_110964v1/shard-glk7/igt@sysfs_clients@recycle-many.html

  * igt@sysfs_clients@sema-50:
    - shard-skl:          NOTRUN -> [SKIP][139] ([fdo#109271] / [i915#2994]) +1 similar issue
   [139]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_110964v1/shard-skl5/igt@sysfs_clients@sema-50.html

  
#### Possible fixes ####

  * igt@gem_exec_balancer@parallel-bb-first:
    - shard-iclb:         [SKIP][140] ([i915#4525]) -> [PASS][141]
   [140]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12389/shard-iclb5/igt@gem_exec_balancer@parallel-bb-first.html
   [141]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_110964v1/shard-iclb1/igt@gem_exec_balancer@parallel-bb-first.html

  * igt@gem_exec_fair@basic-pace-share@rcs0:
    - shard-glk:          [FAIL][142] ([i915#2842]) -> [PASS][143]
   [142]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12389/shard-glk1/igt@gem_exec_fair@basic-pace-share@rcs0.html
   [143]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_110964v1/shard-glk3/igt@gem_exec_fair@basic-pace-share@rcs0.html

  * igt@gem_huc_copy@huc-copy:
    - shard-tglb:         [SKIP][144] ([i915#2190]) -> [PASS][145]
   [144]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12389/shard-tglb7/igt@gem_huc_copy@huc-copy.html
   [145]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_110964v1/shard-tglb8/igt@gem_huc_copy@huc-copy.html

  * igt@i915_pm_dc@dc6-psr:
    - shard-iclb:         [FAIL][146] ([i915#3989] / [i915#454]) -> [PASS][147]
   [146]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12389/shard-iclb1/igt@i915_pm_dc@dc6-psr.html
   [147]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_110964v1/shard-iclb7/igt@i915_pm_dc@dc6-psr.html

  * igt@kms_big_fb@y-tiled-max-hw-stride-32bpp-rotate-180:
    - shard-skl:          [DMESG-WARN][148] ([i915#1982]) -> [PASS][149] +1 similar issue
   [148]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12389/shard-skl2/igt@kms_big_fb@y-tiled-max-hw-stride-32bpp-rotate-180.html
   [149]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_110964v1/shard-skl10/igt@kms_big_fb@y-tiled-max-hw-stride-32bpp-rotate-180.html

  * igt@kms_flip@flip-vs-expired-vblank-interruptible@a-edp1:
    - shard-skl:          [FAIL][150] ([i915#79]) -> [PASS][151]
   [150]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12389/shard-skl9/igt@kms_flip@flip-vs-expired-vblank-interruptible@a-edp1.html
   [151]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_110964v1/shard-skl9/igt@kms_flip@flip-vs-expired-vblank-interruptible@a-edp1.html

  * igt@kms_flip_scaled_crc@flip-64bpp-xtile-to-16bpp-xtile-downscaling@pipe-a-default-mode:
    - shard-iclb:         [SKIP][152] ([i915#3555]) -> [PASS][153]
   [152]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12389/shard-iclb2/igt@kms_flip_scaled_crc@flip-64bpp-xtile-to-16bpp-xtile-downscaling@pipe-a-default-mode.html
   [153]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_110964v1/shard-iclb3/igt@kms_flip_scaled_crc@flip-64bpp-xtile-to-16bpp-xtile-downscaling@pipe-a-default-mode.html

  * igt@kms_psr@psr2_sprite_plane_onoff:
    - shard-iclb:         [SKIP][154] ([fdo#109441]) -> [PASS][155] +4 similar issues
   [154]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12389/shard-iclb6/igt@kms_psr@psr2_sprite_plane_onoff.html
   [155]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_110964v1/shard-iclb2/igt@kms_psr@psr2_sprite_plane_onoff.html

  * igt@kms_psr_stress_test@flip-primary-invalidate-overlay:
    - shard-iclb:         [SKIP][156] ([i915#5519]) -> [PASS][157]
   [156]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12389/shard-iclb1/igt@kms_psr_stress_test@flip-primary-invalidate-overlay.html
   [157]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_110964v1/shard-iclb1/igt@kms_psr_stress_test@flip-primary-invalidate-overlay.html

  * igt@kms_psr_stress_test@invalidate-primary-flip-overlay:
    - shard-tglb:         [SKIP][158] ([i915#5519]) -> [PASS][159]
   [158]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12389/shard-tglb5/igt@kms_psr_stress_test@invalidate-primary-flip-overlay.html
   [159]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_110964v1/shard-tglb6/igt@kms_psr_stress_test@invalidate-primary-flip-overlay.html

  * igt@syncobj_timeline@wait-for-submit-snapshot:
    - shard-skl:          [FAIL][160] ([i915#7326]) -> [PASS][161]
   [160]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12389/shard-skl6/igt@syncobj_timeline@wait-for-submit-snapshot.html
   [161]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_110964v1/shard-skl4/igt@syncobj_timeline@wait-for-submit-snapshot.html

  
#### Warnings ####

  * igt@gem_pwrite@basic-exhaustion:
    - shard-apl:          [INCOMPLETE][162] ([i915#7248]) -> [WARN][163] ([i915#2658])
   [162]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12389/shard-apl3/igt@gem_pwrite@basic-exhaustion.html
   [163]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_110964v1/shard-apl1/igt@gem_pwrite@basic-exhaustion.html

  * igt@kms_psr2_sf@cursor-plane-move-continuous-exceed-sf:
    - shard-iclb:         [SKIP][164] ([i915#658]) -> [SKIP][165] ([i915#2920]) +1 similar issue
   [164]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12389/shard-iclb1/igt@kms_psr2_sf@cursor-plane-move-continuous-exceed-sf.html
   [165]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_110964v1/shard-iclb2/igt@kms_psr2_sf@cursor-plane-move-continuous-exceed-sf.html

  * igt@kms_psr2_sf@cursor-plane-update-sf:
    - shard-iclb:         [SKIP][166] ([fdo#111068] / [i915#658]) -> [SKIP][167] ([i915#2920])
   [166]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12389/shard-iclb6/igt@kms_psr2_sf@cursor-plane-update-sf.html
   [167]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_110964v1/shard-iclb2/igt@kms_psr2_sf@cursor-plane-update-sf.html

  * igt@kms_psr2_sf@overlay-plane-move-continuous-sf:
    - shard-iclb:         [SKIP][168] ([i915#2920]) -> [SKIP][169] ([i915#658])
   [168]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12389/shard-iclb2/igt@kms_psr2_sf@overlay-plane-move-continuous-sf.html
   [169]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_110964v1/shard-iclb5/igt@kms_psr2_sf@overlay-plane-move-continuous-sf.html

  * igt@kms_psr2_sf@overlay-plane-update-continuous-sf:
    - shard-iclb:         [SKIP][170] ([i915#2920]) -> [SKIP][171] ([fdo#111068] / [i915#658])
   [170]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12389/shard-iclb2/igt@kms_psr2_sf@overlay-plane-update-continuous-sf.html
   [171]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_110964v1/shard-iclb3/igt@kms_psr2_sf@overlay-plane-update-continuous-sf.html

  * igt@runner@aborted:
    - shard-skl:          [FAIL][172] ([i915#3002] / [i915#4312]) -> ([FAIL][173], [FAIL][174], [FAIL][175], [FAIL][176]) ([i915#3002] / [i915#4312] / [i915#6949])
   [172]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12389/shard-skl9/igt@runner@aborted.html
   [173]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_110964v1/shard-skl2/igt@runner@aborted.html
   [174]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_110964v1/shard-skl2/igt@runner@aborted.html
   [175]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_110964v1/shard-skl7/igt@runner@aborted.html
   [176]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_110964v1/shard-skl9/igt@runner@aborted.html

  
  [fdo#109271]: https://bugs.freedesktop.org/show_bug.cgi?id=109271
  [fdo#109441]: https://bugs.freedesktop.org/show_bug.cgi?id=109441
  [fdo#111068]: https://bugs.freedesktop.org/show_bug.cgi?id=111068
  [fdo#111827]: https://bugs.freedesktop.org/show_bug.cgi?id=111827
  [i915#1982]: https://gitlab.freedesktop.org/drm/intel/issues/1982
  [i915#2122]: https://gitlab.freedesktop.org/drm/intel/issues/2122
  [i915#2190]: https://gitlab.freedesktop.org/drm/intel/issues/2190
  [i915#2346]: https://gitlab.freedesktop.org/drm/intel/issues/2346
  [i915#2521]: https://gitlab.freedesktop.org/drm/intel/issues/2521
  [i915#2587]: https://gitlab.freedesktop.org/drm/intel/issues/2587
  [i915#2658]: https://gitlab.freedesktop.org/drm/intel/issues/2658
  [i915#2672]: https://gitlab.freedesktop.org/drm/intel/issues/2672
  [i915#2842]: https://gitlab.freedesktop.org/drm/intel/issues/2842
  [i915#2846]: https://gitlab.freedesktop.org/drm/intel/issues/2846
  [i915#2920]: https://gitlab.freedesktop.org/drm/intel/issues/2920
  [i915#2994]: https://gitlab.freedesktop.org/drm/intel/issues/2994
  [i915#3002]: https://gitlab.freedesktop.org/drm/intel/issues/3002
  [i915#3323]: https://gitlab.freedesktop.org/drm/intel/issues/3323
  [i915#3555]: https://gitlab.freedesktop.org/drm/intel/issues/3555
  [i915#3763]: https://gitlab.freedesktop.org/drm/intel/issues/3763
  [i915#3886]: https://gitlab.freedesktop.org/drm/intel/issues/3886
  [i915#3989]: https://gitlab.freedesktop.org/drm/intel/issues/3989
  [i915#4312]: https://gitlab.freedesktop.org/drm/intel/issues/4312
  [i915#4392]: https://gitlab.freedesktop.org/drm/intel/issues/4392
  [i915#4525]: https://gitlab.freedesktop.org/drm/intel/issues/4525
  [i915#454]: https://gitlab.freedesktop.org/drm/intel/issues/454
  [i915#4613]: https://gitlab.freedesktop.org/drm/intel/issues/4613
  [i915#4817]: https://gitlab.freedesktop.org/drm/intel/issues/4817
  [i915#5032]: https://gitlab.freedesktop.org/drm/intel/issues/5032
  [i915#533]: https://gitlab.freedesktop.org/drm/intel/issues/533
  [i915#5519]: https://gitlab.freedesktop.org/drm/intel/issues/5519
  [i915#5566]: https://gitlab.freedesktop.org/drm/intel/issues/5566
  [i915#5591]: https://gitlab.freedesktop.org/drm/intel/issues/5591
  [i915#658]: https://gitlab.freedesktop.org/drm/intel/issues/658
  [i915#6949]: https://gitlab.freedesktop.org/drm/intel/issues/6949
  [i915#716]: https://gitlab.freedesktop.org/drm/intel/issues/716
  [i915#7233]: https://gitlab.freedesktop.org/drm/intel/issues/7233
  [i915#7248]: https://gitlab.freedesktop.org/drm/intel/issues/7248
  [i915#7326]: https://gitlab.freedesktop.org/drm/intel/issues/7326
  [i915#79]: https://gitlab.freedesktop.org/drm/intel/issues/79


Build changes
-------------

  * Linux: CI_DRM_12389 -> Patchwork_110964v1

  CI-20190529: 20190529
  CI_DRM_12389: 3eb60fea7cd44530f85bd9362508647aceea1dea @ git://anongit.freedesktop.org/gfx-ci/linux
  IGT_7061: d16e679b99d43643c3a0bcc34ec4c4f001dbbbb4 @ https://gitlab.freedesktop.org/drm/igt-gpu-tools.git
  Patchwork_110964v1: 3eb60fea7cd44530f85bd9362508647aceea1dea @ git://anongit.freedesktop.org/gfx-ci/linux
  piglit_4509: fdc5a4ca11124ab8413c7988896eec4c97336694 @ git://anongit.freedesktop.org/piglit

== Logs ==

For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_110964v1/index.html

[-- Attachment #2: Type: text/html, Size: 35688 bytes --]

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [Intel-gfx] [PATCH 2/3] drm/i915: Never return 0 on timeout when retiring requests
  2022-11-16 11:25   ` [Intel-gfx] " Janusz Krzysztofik
@ 2022-11-17  9:53     ` Das, Nirmoy
  -1 siblings, 0 replies; 22+ messages in thread
From: Das, Nirmoy @ 2022-11-17  9:53 UTC (permalink / raw)
  To: Janusz Krzysztofik, Tvrtko Ursulin, Joonas Lahtinen
  Cc: intel-gfx, dri-devel, Chris Wilson

Looks very relevant to  our recent hangcheck failures.


Acked-by: Nirmoy Das <nirmoy.das@intel.com>

On 11/16/2022 12:25 PM, Janusz Krzysztofik wrote:
> Users of intel_gt_retire_requests_timeout() expect 0 return value on
> success.  However, we have no protection from passing back 0 potentially
> returned by dma_fence_wait_timeout() on timeout.
>
> Replace 0 with -ETIME before using timeout as return value.
>
> Fixes: f33a8a51602c ("drm/i915: Merge wait_for_timelines with retire_request")
> Signed-off-by: Janusz Krzysztofik <janusz.krzysztofik@linux.intel.com>
> Cc: stable@vger.kernel.org # v5.5+
> ---
>   drivers/gpu/drm/i915/gt/intel_gt_requests.c | 3 +++
>   1 file changed, 3 insertions(+)
>
> diff --git a/drivers/gpu/drm/i915/gt/intel_gt_requests.c b/drivers/gpu/drm/i915/gt/intel_gt_requests.c
> index ccaf2fd80625b..ac6b2b1861397 100644
> --- a/drivers/gpu/drm/i915/gt/intel_gt_requests.c
> +++ b/drivers/gpu/drm/i915/gt/intel_gt_requests.c
> @@ -213,6 +213,9 @@ out_active:	spin_lock(&timelines->lock);
>   	list_for_each_entry_safe(tl, tn, &free, link)
>   		__intel_timeline_free(&tl->kref);
>   
> +	if (!timeout)
> +		timeout = -ETIME;
> +
>   	if (flush_submission(gt, timeout)) /* Wait, there's more! */
>   		active_count++;
>   

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [Intel-gfx] [PATCH 2/3] drm/i915: Never return 0 on timeout when retiring requests
@ 2022-11-17  9:53     ` Das, Nirmoy
  0 siblings, 0 replies; 22+ messages in thread
From: Das, Nirmoy @ 2022-11-17  9:53 UTC (permalink / raw)
  To: Janusz Krzysztofik, Tvrtko Ursulin, Joonas Lahtinen
  Cc: Daniel Vetter, intel-gfx, dri-devel, Chris Wilson

Looks very relevant to  our recent hangcheck failures.


Acked-by: Nirmoy Das <nirmoy.das@intel.com>

On 11/16/2022 12:25 PM, Janusz Krzysztofik wrote:
> Users of intel_gt_retire_requests_timeout() expect 0 return value on
> success.  However, we have no protection from passing back 0 potentially
> returned by dma_fence_wait_timeout() on timeout.
>
> Replace 0 with -ETIME before using timeout as return value.
>
> Fixes: f33a8a51602c ("drm/i915: Merge wait_for_timelines with retire_request")
> Signed-off-by: Janusz Krzysztofik <janusz.krzysztofik@linux.intel.com>
> Cc: stable@vger.kernel.org # v5.5+
> ---
>   drivers/gpu/drm/i915/gt/intel_gt_requests.c | 3 +++
>   1 file changed, 3 insertions(+)
>
> diff --git a/drivers/gpu/drm/i915/gt/intel_gt_requests.c b/drivers/gpu/drm/i915/gt/intel_gt_requests.c
> index ccaf2fd80625b..ac6b2b1861397 100644
> --- a/drivers/gpu/drm/i915/gt/intel_gt_requests.c
> +++ b/drivers/gpu/drm/i915/gt/intel_gt_requests.c
> @@ -213,6 +213,9 @@ out_active:	spin_lock(&timelines->lock);
>   	list_for_each_entry_safe(tl, tn, &free, link)
>   		__intel_timeline_free(&tl->kref);
>   
> +	if (!timeout)
> +		timeout = -ETIME;
> +
>   	if (flush_submission(gt, timeout)) /* Wait, there's more! */
>   		active_count++;
>   

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [Intel-gfx] [PATCH 1/3] drm/i915: Fix negative remaining time after retire requests
  2022-11-16 11:25   ` [Intel-gfx] " Janusz Krzysztofik
@ 2022-11-17  9:58     ` Das, Nirmoy
  -1 siblings, 0 replies; 22+ messages in thread
From: Das, Nirmoy @ 2022-11-17  9:58 UTC (permalink / raw)
  To: Janusz Krzysztofik, Tvrtko Ursulin, Joonas Lahtinen
  Cc: intel-gfx, dri-devel, Chris Wilson

On 11/16/2022 12:25 PM, Janusz Krzysztofik wrote:

> Commit b97060a99b01 ("drm/i915/guc: Update intel_gt_wait_for_idle to work
> with GuC") extended the API of intel_gt_retire_requests_timeout() with an
> extra argument 'remaining_timeout', intended for passing back unconsumed
> portion of requested timeout when 0 (success) is returned.  However, when
> request retirement happens to succeed despite an error returned by
> dma_fence_wait_timeout(), the error code (a negative value) is passed back
> instead of remaining time.  If a user then passes that negative value
> forward as requested timeout to another wait, an explicit WARN or BUG can
> be triggered.
>
> Instead of copying the value of timeout variable to *remaining_timeout
> before return, update the *remaining_timeout after each DMA fence wait.


Thanks for the detailed comment, indeed we were not accounting for the 
return value of dma_fence_wait_timeout()

Acked-by: Nirmoy Das <nirmoy.das@intel.com>


Thanks,

Nirmoy


> Set it to 0 on -ETIME, -EINTR or -ERESTARTSYS, and assume no time has been
> consumed on other errors returned from the wait.
>
> Fixes: b97060a99b01 ("drm/i915/guc: Update intel_gt_wait_for_idle to work with GuC")
> Signed-off-by: Janusz Krzysztofik <janusz.krzysztofik@linux.intel.com>
> Cc: stable@vger.kernel.org # v5.15+
> ---
>   drivers/gpu/drm/i915/gt/intel_gt_requests.c | 23 ++++++++++++++++++---
>   1 file changed, 20 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/gt/intel_gt_requests.c b/drivers/gpu/drm/i915/gt/intel_gt_requests.c
> index edb881d756309..ccaf2fd80625b 100644
> --- a/drivers/gpu/drm/i915/gt/intel_gt_requests.c
> +++ b/drivers/gpu/drm/i915/gt/intel_gt_requests.c
> @@ -138,6 +138,9 @@ long intel_gt_retire_requests_timeout(struct intel_gt *gt, long timeout,
>   	unsigned long active_count = 0;
>   	LIST_HEAD(free);
>   
> +	if (remaining_timeout)
> +		*remaining_timeout = timeout;
> +
>   	flush_submission(gt, timeout); /* kick the ksoftirqd tasklets */
>   	spin_lock(&timelines->lock);
>   	list_for_each_entry_safe(tl, tn, &timelines->active_list, link) {
> @@ -163,6 +166,23 @@ long intel_gt_retire_requests_timeout(struct intel_gt *gt, long timeout,
>   								 timeout);
>   				dma_fence_put(fence);
>   
> +				if (remaining_timeout) {
> +					/*
> +					 * If we get an error here but request
> +					 * retirement succeeds anyway
> +					 * (!active_count) and we return 0, the
> +					 * caller may want to spend remaining
> +					 * time on waiting for other events.
> +					 */
> +					if (timeout == -ETIME ||
> +					    timeout == -EINTR ||
> +					    timeout == -ERESTARTSYS)
> +						*remaining_timeout = 0;
> +					else if (timeout >= 0)
> +						*remaining_timeout = timeout;
> +					/* else assume no time consumed */
> +				}
> +
>   				/* Retirement is best effort */
>   				if (!mutex_trylock(&tl->mutex)) {
>   					active_count++;
> @@ -196,9 +216,6 @@ out_active:	spin_lock(&timelines->lock);
>   	if (flush_submission(gt, timeout)) /* Wait, there's more! */
>   		active_count++;
>   
> -	if (remaining_timeout)
> -		*remaining_timeout = timeout;
> -
>   	return active_count ? timeout : 0;
>   }
>   

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [Intel-gfx] [PATCH 1/3] drm/i915: Fix negative remaining time after retire requests
@ 2022-11-17  9:58     ` Das, Nirmoy
  0 siblings, 0 replies; 22+ messages in thread
From: Das, Nirmoy @ 2022-11-17  9:58 UTC (permalink / raw)
  To: Janusz Krzysztofik, Tvrtko Ursulin, Joonas Lahtinen
  Cc: Daniel Vetter, intel-gfx, dri-devel, Chris Wilson

On 11/16/2022 12:25 PM, Janusz Krzysztofik wrote:

> Commit b97060a99b01 ("drm/i915/guc: Update intel_gt_wait_for_idle to work
> with GuC") extended the API of intel_gt_retire_requests_timeout() with an
> extra argument 'remaining_timeout', intended for passing back unconsumed
> portion of requested timeout when 0 (success) is returned.  However, when
> request retirement happens to succeed despite an error returned by
> dma_fence_wait_timeout(), the error code (a negative value) is passed back
> instead of remaining time.  If a user then passes that negative value
> forward as requested timeout to another wait, an explicit WARN or BUG can
> be triggered.
>
> Instead of copying the value of timeout variable to *remaining_timeout
> before return, update the *remaining_timeout after each DMA fence wait.


Thanks for the detailed comment, indeed we were not accounting for the 
return value of dma_fence_wait_timeout()

Acked-by: Nirmoy Das <nirmoy.das@intel.com>


Thanks,

Nirmoy


> Set it to 0 on -ETIME, -EINTR or -ERESTARTSYS, and assume no time has been
> consumed on other errors returned from the wait.
>
> Fixes: b97060a99b01 ("drm/i915/guc: Update intel_gt_wait_for_idle to work with GuC")
> Signed-off-by: Janusz Krzysztofik <janusz.krzysztofik@linux.intel.com>
> Cc: stable@vger.kernel.org # v5.15+
> ---
>   drivers/gpu/drm/i915/gt/intel_gt_requests.c | 23 ++++++++++++++++++---
>   1 file changed, 20 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/gt/intel_gt_requests.c b/drivers/gpu/drm/i915/gt/intel_gt_requests.c
> index edb881d756309..ccaf2fd80625b 100644
> --- a/drivers/gpu/drm/i915/gt/intel_gt_requests.c
> +++ b/drivers/gpu/drm/i915/gt/intel_gt_requests.c
> @@ -138,6 +138,9 @@ long intel_gt_retire_requests_timeout(struct intel_gt *gt, long timeout,
>   	unsigned long active_count = 0;
>   	LIST_HEAD(free);
>   
> +	if (remaining_timeout)
> +		*remaining_timeout = timeout;
> +
>   	flush_submission(gt, timeout); /* kick the ksoftirqd tasklets */
>   	spin_lock(&timelines->lock);
>   	list_for_each_entry_safe(tl, tn, &timelines->active_list, link) {
> @@ -163,6 +166,23 @@ long intel_gt_retire_requests_timeout(struct intel_gt *gt, long timeout,
>   								 timeout);
>   				dma_fence_put(fence);
>   
> +				if (remaining_timeout) {
> +					/*
> +					 * If we get an error here but request
> +					 * retirement succeeds anyway
> +					 * (!active_count) and we return 0, the
> +					 * caller may want to spend remaining
> +					 * time on waiting for other events.
> +					 */
> +					if (timeout == -ETIME ||
> +					    timeout == -EINTR ||
> +					    timeout == -ERESTARTSYS)
> +						*remaining_timeout = 0;
> +					else if (timeout >= 0)
> +						*remaining_timeout = timeout;
> +					/* else assume no time consumed */
> +				}
> +
>   				/* Retirement is best effort */
>   				if (!mutex_trylock(&tl->mutex)) {
>   					active_count++;
> @@ -196,9 +216,6 @@ out_active:	spin_lock(&timelines->lock);
>   	if (flush_submission(gt, timeout)) /* Wait, there's more! */
>   		active_count++;
>   
> -	if (remaining_timeout)
> -		*remaining_timeout = timeout;
> -
>   	return active_count ? timeout : 0;
>   }
>   

^ permalink raw reply	[flat|nested] 22+ messages in thread

end of thread, other threads:[~2022-11-17  9:58 UTC | newest]

Thread overview: 22+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-11-16 11:25 [PATCH 0/3] drm/i915: Fix timeout handling when retiring requests Janusz Krzysztofik
2022-11-16 11:25 ` [Intel-gfx] " Janusz Krzysztofik
2022-11-16 11:25 ` [PATCH 1/3] drm/i915: Fix negative remaining time after retire requests Janusz Krzysztofik
2022-11-16 11:25   ` [Intel-gfx] " Janusz Krzysztofik
2022-11-16 13:13   ` Andrzej Hajda
2022-11-16 13:13     ` Andrzej Hajda
2022-11-17  9:58   ` Das, Nirmoy
2022-11-17  9:58     ` Das, Nirmoy
2022-11-16 11:25 ` [PATCH 2/3] drm/i915: Never return 0 on timeout when retiring requests Janusz Krzysztofik
2022-11-16 11:25   ` [Intel-gfx] " Janusz Krzysztofik
2022-11-16 14:15   ` Andrzej Hajda
2022-11-16 14:15     ` Andrzej Hajda
2022-11-17  9:53   ` Das, Nirmoy
2022-11-17  9:53     ` Das, Nirmoy
2022-11-16 11:25 ` [PATCH 3/3] drm/i915: Never return 0 if request wait succeeds Janusz Krzysztofik
2022-11-16 11:25   ` [Intel-gfx] " Janusz Krzysztofik
2022-11-16 14:42   ` Andrzej Hajda
2022-11-16 14:42     ` Andrzej Hajda
2022-11-16 15:48     ` Janusz Krzysztofik
2022-11-16 15:48       ` Janusz Krzysztofik
2022-11-16 19:06 ` [Intel-gfx] ✓ Fi.CI.BAT: success for drm/i915: Fix timeout handling when retiring requests Patchwork
2022-11-17  3:27 ` [Intel-gfx] ✗ Fi.CI.IGT: failure " Patchwork

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.