[Intel-gfx] [PATCH v5 0/5] Fix error propagation amongst request


* [Intel-gfx] [PATCH v5 0/5] Fix error propagation amongst request
@ 2023-04-12 11:33 Andi Shyti
  2023-04-12 11:33 ` [Intel-gfx] [PATCH v5 1/5] drm/i915/gt: Add intel_context_timeline_is_locked helper Andi Shyti
                   ` (8 more replies)
  0 siblings, 9 replies; 19+ messages in thread
From: Andi Shyti @ 2023-04-12 11:33 UTC (permalink / raw)
  To: intel-gfx, dri-devel, stable
  Cc: Maciej Patelczyk, Andi Shyti, Matthew Auld, Andrzej Hajda,
	Rodrigo Vivi, Chris Wilson, Nirmoy Das

Hi,

This series of five patches fixes the issue introduced in
cf586021642d80 ("drm/i915/gt: Pipelined page migration") where,
as reported by Matt, in a chain of requests an error is reported
only if it happens in the last request.

However, Chris noticed that without ensuring exclusivity in the
locking we might end up in a deadlock. That's why patch 4
throttles for ring space in order to make sure that no one is
holding it.

Version 1 of this patch has been reviewed by Matt and this
version adds Chris's exclusive locking.

Thanks Chris for this work.

Andi

Changelog
=========
v4 -> v5
 - add timeline locking also in the copy operation, which was
   forgotten in v4.
 - rearrange the patches in order to avoid a bisect break.

v3 -> v4
 - In v3 the timeline was being locked, but I forgot that
   request_create() and request_add() also lock the timeline:
   the former does the locking, the latter does the unlocking.
   In order to avoid this extra lock/unlock, we need the
   "_locked" version of the said functions.

v2 -> v3
 - Really lock the timeline before generating all the requests
   until the last.

v1 -> v2
 - Add patch 1 for ensuring exclusive locking of the timeline
 - Reword git commit of patch 2.

Andi Shyti (4):
  drm/i915/gt: Add intel_context_timeline_is_locked helper
  drm/i915: Create the locked version of the request create
  drm/i915: Create the locked version of the request add
  drm/i915/gt: Make sure that errors are propagated through request
    chains

Chris Wilson (1):
  drm/i915: Throttle for ringspace prior to taking the timeline mutex

 drivers/gpu/drm/i915/gt/intel_context.c | 41 ++++++++++++++++++
 drivers/gpu/drm/i915/gt/intel_context.h |  8 ++++
 drivers/gpu/drm/i915/gt/intel_migrate.c | 51 +++++++++++++++++------
 drivers/gpu/drm/i915/i915_request.c     | 55 +++++++++++++++++++------
 drivers/gpu/drm/i915/i915_request.h     |  3 ++
 5 files changed, 133 insertions(+), 25 deletions(-)

-- 
2.39.2



* [Intel-gfx] [PATCH v5 1/5] drm/i915/gt: Add intel_context_timeline_is_locked helper
  2023-04-12 11:33 [Intel-gfx] [PATCH v5 0/5] Fix error propagation amongst request Andi Shyti
@ 2023-04-12 11:33 ` Andi Shyti
  2023-04-12 12:12   ` Andrzej Hajda
  2023-04-12 11:33 ` [Intel-gfx] [PATCH v5 2/5] drm/i915: Create the locked version of the request create Andi Shyti
                   ` (7 subsequent siblings)
  8 siblings, 1 reply; 19+ messages in thread
From: Andi Shyti @ 2023-04-12 11:33 UTC (permalink / raw)
  To: intel-gfx, dri-devel, stable
  Cc: Maciej Patelczyk, Andi Shyti, Matthew Auld, Andrzej Hajda,
	Rodrigo Vivi, Chris Wilson, Nirmoy Das

We have:

 - intel_context_timeline_lock()
 - intel_context_timeline_unlock()

In the next patches we will also need:

 - intel_context_timeline_is_locked()

Add it.

Signed-off-by: Andi Shyti <andi.shyti@linux.intel.com>
Cc: stable@vger.kernel.org
Reviewed-by: Nirmoy Das <nirmoy.das@intel.com>
---
 drivers/gpu/drm/i915/gt/intel_context.h | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/drivers/gpu/drm/i915/gt/intel_context.h b/drivers/gpu/drm/i915/gt/intel_context.h
index 48f888c3da083..f2f79ff0dfd1d 100644
--- a/drivers/gpu/drm/i915/gt/intel_context.h
+++ b/drivers/gpu/drm/i915/gt/intel_context.h
@@ -270,6 +270,12 @@ static inline void intel_context_timeline_unlock(struct intel_timeline *tl)
 	mutex_unlock(&tl->mutex);
 }
 
+static inline void intel_context_assert_timeline_is_locked(struct intel_timeline *tl)
+	__must_hold(&tl->mutex)
+{
+	lockdep_assert_held(&tl->mutex);
+}
+
 int intel_context_prepare_remote_request(struct intel_context *ce,
 					 struct i915_request *rq);
 
-- 
2.39.2
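
For readers outside the kernel tree, the idea behind the helper in patch 1/5 can be sketched in plain C. This is only a single-threaded toy model, not the i915 code: `toy_mutex`, `toy_timeline` and every `toy_*` function are hypothetical names invented for illustration, and a plain `assert()` stands in for `lockdep_assert_held()`.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a kernel mutex: it only records whether it is held. */
struct toy_mutex {
	bool held;
};

struct toy_timeline {
	struct toy_mutex mutex;
};

static void toy_timeline_lock(struct toy_timeline *tl)
{
	assert(!tl->mutex.held);	/* no recursive locking in this model */
	tl->mutex.held = true;
}

static void toy_timeline_unlock(struct toy_timeline *tl)
{
	assert(tl->mutex.held);
	tl->mutex.held = false;
}

/* Query form, present only so that the state can be inspected. */
static bool toy_timeline_is_locked(const struct toy_timeline *tl)
{
	return tl->mutex.held;
}

/* Assertion form: documents and checks the caller's locking contract. */
static void toy_assert_timeline_is_locked(const struct toy_timeline *tl)
{
	assert(toy_timeline_is_locked(tl));
}
```

Note that the helper actually added by the diff above is the assertion form, intel_context_assert_timeline_is_locked(); it returns nothing and only checks the contract.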



* [Intel-gfx] [PATCH v5 2/5] drm/i915: Create the locked version of the request create
  2023-04-12 11:33 [Intel-gfx] [PATCH v5 0/5] Fix error propagation amongst request Andi Shyti
  2023-04-12 11:33 ` [Intel-gfx] [PATCH v5 1/5] drm/i915/gt: Add intel_context_timeline_is_locked helper Andi Shyti
@ 2023-04-12 11:33 ` Andi Shyti
  2023-04-12 13:04   ` Andrzej Hajda
  2023-04-12 11:33 ` [Intel-gfx] [PATCH v5 3/5] drm/i915: Create the locked version of the request add Andi Shyti
                   ` (6 subsequent siblings)
  8 siblings, 1 reply; 19+ messages in thread
From: Andi Shyti @ 2023-04-12 11:33 UTC (permalink / raw)
  To: intel-gfx, dri-devel, stable
  Cc: Maciej Patelczyk, Andi Shyti, Matthew Auld, Andrzej Hajda,
	Rodrigo Vivi, Chris Wilson, Nirmoy Das

Add a version of the request creation that does not take the
timeline lock itself and instead expects the caller to already
hold it.

Signed-off-by: Andi Shyti <andi.shyti@linux.intel.com>
Cc: stable@vger.kernel.org
Reviewed-by: Nirmoy Das <nirmoy.das@intel.com>
---
 drivers/gpu/drm/i915/i915_request.c | 38 +++++++++++++++++++++--------
 drivers/gpu/drm/i915/i915_request.h |  2 ++
 2 files changed, 30 insertions(+), 10 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_request.c b/drivers/gpu/drm/i915/i915_request.c
index 630a732aaecca..58662360ac34e 100644
--- a/drivers/gpu/drm/i915/i915_request.c
+++ b/drivers/gpu/drm/i915/i915_request.c
@@ -1028,15 +1028,11 @@ __i915_request_create(struct intel_context *ce, gfp_t gfp)
 	return ERR_PTR(ret);
 }
 
-struct i915_request *
-i915_request_create(struct intel_context *ce)
+static struct i915_request *
+__i915_request_create_locked(struct intel_context *ce)
 {
+	struct intel_timeline *tl = ce->timeline;
 	struct i915_request *rq;
-	struct intel_timeline *tl;
-
-	tl = intel_context_timeline_lock(ce);
-	if (IS_ERR(tl))
-		return ERR_CAST(tl);
 
 	/* Move our oldest request to the slab-cache (if not in use!) */
 	rq = list_first_entry(&tl->requests, typeof(*rq), link);
@@ -1046,16 +1042,38 @@ i915_request_create(struct intel_context *ce)
 	intel_context_enter(ce);
 	rq = __i915_request_create(ce, GFP_KERNEL);
 	intel_context_exit(ce); /* active reference transferred to request */
+
 	if (IS_ERR(rq))
-		goto err_unlock;
+		return rq;
 
 	/* Check that we do not interrupt ourselves with a new request */
 	rq->cookie = lockdep_pin_lock(&tl->mutex);
 
 	return rq;
+}
+
+struct i915_request *
+i915_request_create_locked(struct intel_context *ce)
+{
+	intel_context_assert_timeline_is_locked(ce->timeline);
+
+	return __i915_request_create_locked(ce);
+}
+
+struct i915_request *
+i915_request_create(struct intel_context *ce)
+{
+	struct i915_request *rq;
+	struct intel_timeline *tl;
+
+	tl = intel_context_timeline_lock(ce);
+	if (IS_ERR(tl))
+		return ERR_CAST(tl);
+
+	rq = __i915_request_create_locked(ce);
+	if (IS_ERR(rq))
+		intel_context_timeline_unlock(tl);
 
-err_unlock:
-	intel_context_timeline_unlock(tl);
 	return rq;
 }
 
diff --git a/drivers/gpu/drm/i915/i915_request.h b/drivers/gpu/drm/i915/i915_request.h
index f5e1bb5e857aa..bb48bd4605c03 100644
--- a/drivers/gpu/drm/i915/i915_request.h
+++ b/drivers/gpu/drm/i915/i915_request.h
@@ -374,6 +374,8 @@ struct i915_request * __must_check
 __i915_request_create(struct intel_context *ce, gfp_t gfp);
 struct i915_request * __must_check
 i915_request_create(struct intel_context *ce);
+struct i915_request * __must_check
+i915_request_create_locked(struct intel_context *ce);
 
 void __i915_request_skip(struct i915_request *rq);
 bool i915_request_set_error_once(struct i915_request *rq, int error);
-- 
2.39.2
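
The split made by patch 2/5 can be illustrated with a small toy model. This is a sketch under simplifying assumptions, not the driver code: the lock is a boolean, failure is reported with NULL instead of ERR_PTR(), requests live in caller-provided storage, and all `toy_*` names (including the `fail_next_create` test knob) are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct toy_context {
	bool timeline_locked;	/* stands in for the timeline mutex */
	bool fail_next_create;	/* test knob: force the next creation to fail */
	int created;
};

struct toy_request {
	struct toy_context *ce;
};

/* Core creation step: never touches the lock. Returns NULL on failure. */
static struct toy_request *
toy_request_create_core(struct toy_context *ce, struct toy_request *slot)
{
	if (ce->fail_next_create) {
		ce->fail_next_create = false;
		return NULL;
	}

	slot->ce = ce;
	ce->created++;
	return slot;
}

/* "_locked" variant: the caller must already hold the lock. */
static struct toy_request *
toy_request_create_locked(struct toy_context *ce, struct toy_request *slot)
{
	assert(ce->timeline_locked);
	return toy_request_create_core(ce, slot);
}

/* Classic variant: takes the lock and keeps it held on success. */
static struct toy_request *
toy_request_create(struct toy_context *ce, struct toy_request *slot)
{
	struct toy_request *rq;

	assert(!ce->timeline_locked);
	ce->timeline_locked = true;

	rq = toy_request_create_core(ce, slot);
	if (!rq)
		ce->timeline_locked = false;	/* error: do not leak the lock */

	return rq;	/* success: lock stays held until the request is added */
}
```

The point of the model is the asymmetry in the classic variant: it returns with the lock held on success and released on failure, which is the behaviour the refactored i915_request_create() in the diff preserves.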



* [Intel-gfx] [PATCH v5 3/5] drm/i915: Create the locked version of the request add
  2023-04-12 11:33 [Intel-gfx] [PATCH v5 0/5] Fix error propagation amongst request Andi Shyti
  2023-04-12 11:33 ` [Intel-gfx] [PATCH v5 1/5] drm/i915/gt: Add intel_context_timeline_is_locked helper Andi Shyti
  2023-04-12 11:33 ` [Intel-gfx] [PATCH v5 2/5] drm/i915: Create the locked version of the request create Andi Shyti
@ 2023-04-12 11:33 ` Andi Shyti
  2023-04-12 13:06   ` Andrzej Hajda
  2023-04-12 11:33 ` [Intel-gfx] [PATCH v5 4/5] drm/i915: Throttle for ringspace prior to taking the timeline mutex Andi Shyti
                   ` (5 subsequent siblings)
  8 siblings, 1 reply; 19+ messages in thread
From: Andi Shyti @ 2023-04-12 11:33 UTC (permalink / raw)
  To: intel-gfx, dri-devel, stable
  Cc: Maciej Patelczyk, Andi Shyti, Matthew Auld, Andrzej Hajda,
	Rodrigo Vivi, Chris Wilson, Nirmoy Das

i915_request_add() assumes that the timeline is locked when the
function is called. Before exiting it releases the lock. But in
the next commit we have one case where releasing the timeline
mutex is not necessary and we don't want that.

Make a new i915_request_add_locked() version of the function
where the lock is not released.

Signed-off-by: Andi Shyti <andi.shyti@linux.intel.com>
Cc: stable@vger.kernel.org
---
 drivers/gpu/drm/i915/i915_request.c | 14 +++++++++++---
 drivers/gpu/drm/i915/i915_request.h |  1 +
 2 files changed, 12 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_request.c b/drivers/gpu/drm/i915/i915_request.c
index 58662360ac34e..21032b3b9d330 100644
--- a/drivers/gpu/drm/i915/i915_request.c
+++ b/drivers/gpu/drm/i915/i915_request.c
@@ -1852,13 +1852,13 @@ void __i915_request_queue(struct i915_request *rq,
 	local_bh_enable(); /* kick tasklets */
 }
 
-void i915_request_add(struct i915_request *rq)
+void i915_request_add_locked(struct i915_request *rq)
 {
 	struct intel_timeline * const tl = i915_request_timeline(rq);
 	struct i915_sched_attr attr = {};
 	struct i915_gem_context *ctx;
 
-	lockdep_assert_held(&tl->mutex);
+	intel_context_assert_timeline_is_locked(tl);
 	lockdep_unpin_lock(&tl->mutex, rq->cookie);
 
 	trace_i915_request_add(rq);
@@ -1873,7 +1873,15 @@ void i915_request_add(struct i915_request *rq)
 
 	__i915_request_queue(rq, &attr);
 
-	mutex_unlock(&tl->mutex);
+}
+
+void i915_request_add(struct i915_request *rq)
+{
+	struct intel_timeline * const tl = i915_request_timeline(rq);
+
+	i915_request_add_locked(rq);
+
+	intel_context_timeline_unlock(tl);
 }
 
 static unsigned long local_clock_ns(unsigned int *cpu)
diff --git a/drivers/gpu/drm/i915/i915_request.h b/drivers/gpu/drm/i915/i915_request.h
index bb48bd4605c03..29e3a37c300a7 100644
--- a/drivers/gpu/drm/i915/i915_request.h
+++ b/drivers/gpu/drm/i915/i915_request.h
@@ -425,6 +425,7 @@ int i915_request_await_deps(struct i915_request *rq, const struct i915_deps *dep
 int i915_request_await_execution(struct i915_request *rq,
 				 struct dma_fence *fence);
 
+void i915_request_add_locked(struct i915_request *rq);
 void i915_request_add(struct i915_request *rq);
 
 bool __i915_request_submit(struct i915_request *request);
-- 
2.39.2
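
Why a non-unlocking add is useful can be shown with a toy model of a caller that queues several requests under one lock hold. This is a hedged sketch, not the driver code: the lock is a boolean, queuing is a counter, and all `toy_*` names and the `unlocks` counter are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

struct toy_timeline {
	bool locked;	/* stands in for the timeline mutex */
	int queued;	/* number of requests queued so far */
	int unlocks;	/* how many times the lock was released */
};

/* Queue one request; the lock is required and is left held. */
static void toy_request_add_locked(struct toy_timeline *tl)
{
	assert(tl->locked);
	tl->queued++;
}

/* Classic behaviour: queue one request, then release the lock. */
static void toy_request_add(struct toy_timeline *tl)
{
	toy_request_add_locked(tl);
	tl->locked = false;
	tl->unlocks++;
}

/* Build a chain of n requests without ever dropping the lock in between. */
static int toy_build_chain(struct toy_timeline *tl, int n)
{
	int i;

	assert(!tl->locked);
	tl->locked = true;

	for (i = 0; i < n; i++)
		toy_request_add_locked(tl);

	tl->locked = false;
	tl->unlocks++;

	return tl->queued;
}
```

With only the classic variant, a chain of n requests would release the lock n times, leaving n windows in which another submitter could slip in; with the "_locked" variant there is a single release at the end.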



* [Intel-gfx] [PATCH v5 4/5] drm/i915: Throttle for ringspace prior to taking the timeline mutex
  2023-04-12 11:33 [Intel-gfx] [PATCH v5 0/5] Fix error propagation amongst request Andi Shyti
                   ` (2 preceding siblings ...)
  2023-04-12 11:33 ` [Intel-gfx] [PATCH v5 3/5] drm/i915: Create the locked version of the request add Andi Shyti
@ 2023-04-12 11:33 ` Andi Shyti
  2023-04-12 11:33 ` [Intel-gfx] [PATCH v5 5/5] drm/i915/gt: Make sure that errors are propagated through request chains Andi Shyti
                   ` (4 subsequent siblings)
  8 siblings, 0 replies; 19+ messages in thread
From: Andi Shyti @ 2023-04-12 11:33 UTC (permalink / raw)
  To: intel-gfx, dri-devel, stable
  Cc: Maciej Patelczyk, Andi Shyti, Matthew Auld, Andrzej Hajda,
	Rodrigo Vivi, Chris Wilson, Nirmoy Das

From: Chris Wilson <chris@chris-wilson.co.uk>

Before taking exclusive ownership of the ring for emitting the request,
wait for space in the ring to become available. This allows others to
take the timeline->mutex to make forward progress while userspace is
blocked.

In particular, this allows regular clients to issue requests on the
kernel context, potentially filling the ring, while still allowing the
higher priority heartbeats and pulses to be submitted without being
blocked by the less critical work.

Signed-off-by: Chris Wilson <chris.p.wilson@linux.intel.com>
Cc: Maciej Patelczyk <maciej.patelczyk@intel.com>
Cc: stable@vger.kernel.org
Signed-off-by: Andi Shyti <andi.shyti@linux.intel.com>
Reviewed-by: Andrzej Hajda <andrzej.hajda@intel.com>
---
 drivers/gpu/drm/i915/gt/intel_context.c | 41 +++++++++++++++++++++++++
 drivers/gpu/drm/i915/gt/intel_context.h |  2 ++
 drivers/gpu/drm/i915/i915_request.c     |  3 ++
 3 files changed, 46 insertions(+)

diff --git a/drivers/gpu/drm/i915/gt/intel_context.c b/drivers/gpu/drm/i915/gt/intel_context.c
index 2aa63ec521b89..59cd612a23561 100644
--- a/drivers/gpu/drm/i915/gt/intel_context.c
+++ b/drivers/gpu/drm/i915/gt/intel_context.c
@@ -626,6 +626,47 @@ bool intel_context_revoke(struct intel_context *ce)
 	return ret;
 }
 
+int intel_context_throttle(const struct intel_context *ce)
+{
+	const struct intel_ring *ring = ce->ring;
+	const struct intel_timeline *tl = ce->timeline;
+	struct i915_request *rq;
+	int err = 0;
+
+	if (READ_ONCE(ring->space) >= SZ_1K)
+		return 0;
+
+	rcu_read_lock();
+	list_for_each_entry_reverse(rq, &tl->requests, link) {
+		if (__i915_request_is_complete(rq))
+			break;
+
+		if (rq->ring != ring)
+			continue;
+
+		/* Wait until there will be enough space following that rq */
+		if (__intel_ring_space(rq->postfix,
+				       ring->emit,
+				       ring->size) < ring->size / 2) {
+			if (i915_request_get_rcu(rq)) {
+				rcu_read_unlock();
+
+				if (i915_request_wait(rq,
+						      I915_WAIT_INTERRUPTIBLE,
+						      MAX_SCHEDULE_TIMEOUT) < 0)
+					err = -EINTR;
+
+				rcu_read_lock();
+				i915_request_put(rq);
+			}
+			break;
+		}
+	}
+	rcu_read_unlock();
+
+	return err;
+}
+
 #if IS_ENABLED(CONFIG_DRM_I915_SELFTEST)
 #include "selftest_context.c"
 #endif
diff --git a/drivers/gpu/drm/i915/gt/intel_context.h b/drivers/gpu/drm/i915/gt/intel_context.h
index f2f79ff0dfd1d..c0db00ac6b950 100644
--- a/drivers/gpu/drm/i915/gt/intel_context.h
+++ b/drivers/gpu/drm/i915/gt/intel_context.h
@@ -233,6 +233,8 @@ static inline void intel_context_exit(struct intel_context *ce)
 	ce->ops->exit(ce);
 }
 
+int intel_context_throttle(const struct intel_context *ce);
+
 static inline struct intel_context *intel_context_get(struct intel_context *ce)
 {
 	kref_get(&ce->ref);
diff --git a/drivers/gpu/drm/i915/i915_request.c b/drivers/gpu/drm/i915/i915_request.c
index 21032b3b9d330..0b7c6aede0c6b 100644
--- a/drivers/gpu/drm/i915/i915_request.c
+++ b/drivers/gpu/drm/i915/i915_request.c
@@ -1057,6 +1057,9 @@ i915_request_create_locked(struct intel_context *ce)
 {
 	intel_context_assert_timeline_is_locked(ce->timeline);
 
+	if (intel_context_throttle(ce))
+		return ERR_PTR(-EINTR);
+
 	return __i915_request_create_locked(ce);
 }
 
-- 
2.39.2
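
The space check in intel_context_throttle() relies on circular-buffer arithmetic. The following is a minimal sketch of that arithmetic, assuming a power-of-two ring size and a small reserved gap between the write and read positions; the 64-byte `TOY_RESERVE` and both `toy_*` functions are assumptions of this example, not values or names taken from the patch.

```c
#include <assert.h>
#include <stdbool.h>

#define TOY_RESERVE 64u	/* assumed gap kept between tail and head */

/*
 * Free bytes in a circular buffer of 'size' bytes (power of two) when
 * the consumer is at 'head' and the producer is at 'tail'. Unsigned
 * wrap-around followed by the mask gives the distance modulo 'size'.
 */
static unsigned int
toy_ring_space(unsigned int head, unsigned int tail, unsigned int size)
{
	assert(size && (size & (size - 1)) == 0);
	return (head - tail - TOY_RESERVE) & (size - 1);
}

/*
 * Mirror of the condition in the patch: is the space that would be free
 * once the request ending at 'postfix' has retired less than half the ring?
 */
static bool
toy_should_wait(unsigned int postfix, unsigned int emit, unsigned int size)
{
	return toy_ring_space(postfix, emit, size) < size / 2;
}
```

For example, with a 4096-byte ring, a request ending at offset 1024 and the emit pointer at 3072, the computed space is 1984 bytes, which is below the 2048-byte threshold, so this model would wait.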



* [Intel-gfx] [PATCH v5 5/5] drm/i915/gt: Make sure that errors are propagated through request chains
  2023-04-12 11:33 [Intel-gfx] [PATCH v5 0/5] Fix error propagation amongst request Andi Shyti
                   ` (3 preceding siblings ...)
  2023-04-12 11:33 ` [Intel-gfx] [PATCH v5 4/5] drm/i915: Throttle for ringspace prior to taking the timeline mutex Andi Shyti
@ 2023-04-12 11:33 ` Andi Shyti
  2023-04-13 11:56   ` Tvrtko Ursulin
  2023-04-12 17:28 ` [Intel-gfx] ✗ Fi.CI.CHECKPATCH: warning for Fix error propagation amongst request (rev3) Patchwork
                   ` (3 subsequent siblings)
  8 siblings, 1 reply; 19+ messages in thread
From: Andi Shyti @ 2023-04-12 11:33 UTC (permalink / raw)
  To: intel-gfx, dri-devel, stable
  Cc: Maciej Patelczyk, Andi Shyti, Matthew Auld, Andrzej Hajda,
	Rodrigo Vivi, Chris Wilson, Nirmoy Das

Currently, when we perform operations such as clearing or copying
large blocks of memory, we generate multiple requests that are
executed in a chain.

However, if one of these requests fails, we may not realize it
unless it happens to be the last request in the chain. This is
because errors are not properly propagated.

To address this issue, we need to ensure that the chain of fence
notifications is always propagated so that we can reach the final
fence associated with the last request. By doing so, we will be
able to detect any memory operation failures and determine
whether the memory is still invalid.

On copy and clear migration, signal fences upon request
completion to ensure that the outcome of the operation is
reliably propagated.

Fixes: cf586021642d80 ("drm/i915/gt: Pipelined page migration")
Reported-by: Matthew Auld <matthew.auld@intel.com>
Suggested-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Andi Shyti <andi.shyti@linux.intel.com>
Cc: stable@vger.kernel.org
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Acked-by: Nirmoy Das <nirmoy.das@intel.com>
---
 drivers/gpu/drm/i915/gt/intel_migrate.c | 51 +++++++++++++++++++------
 1 file changed, 39 insertions(+), 12 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_migrate.c b/drivers/gpu/drm/i915/gt/intel_migrate.c
index 3f638f1987968..668c95af8cbcf 100644
--- a/drivers/gpu/drm/i915/gt/intel_migrate.c
+++ b/drivers/gpu/drm/i915/gt/intel_migrate.c
@@ -742,13 +742,19 @@ intel_context_migrate_copy(struct intel_context *ce,
 			dst_offset = 2 * CHUNK_SZ;
 	}
 
+	/*
+	 * While building the chain of requests, we need to ensure
+	 * that no one can sneak into the timeline unnoticed.
+	 */
+	mutex_lock(&ce->timeline->mutex);
+
 	do {
 		int len;
 
-		rq = i915_request_create(ce);
+		rq = i915_request_create_locked(ce);
 		if (IS_ERR(rq)) {
 			err = PTR_ERR(rq);
-			goto out_ce;
+			break;
 		}
 
 		if (deps) {
@@ -878,10 +884,14 @@ intel_context_migrate_copy(struct intel_context *ce,
 
 		/* Arbitration is re-enabled between requests. */
 out_rq:
-		if (*out)
+		i915_sw_fence_await(&rq->submit);
+		i915_request_get(rq);
+		i915_request_add_locked(rq);
+		if (*out) {
+			i915_sw_fence_complete(&(*out)->submit);
 			i915_request_put(*out);
-		*out = i915_request_get(rq);
-		i915_request_add(rq);
+		}
+		*out = rq;
 
 		if (err)
 			break;
@@ -905,7 +915,10 @@ intel_context_migrate_copy(struct intel_context *ce,
 		cond_resched();
 	} while (1);
 
-out_ce:
+	mutex_unlock(&ce->timeline->mutex);
+
+	if (*out)
+		i915_sw_fence_complete(&(*out)->submit);
 	return err;
 }
 
@@ -999,13 +1012,19 @@ intel_context_migrate_clear(struct intel_context *ce,
 	if (HAS_64K_PAGES(i915) && is_lmem)
 		offset = CHUNK_SZ;
 
+	/*
+	 * While building the chain of requests, we need to ensure
+	 * that no one can sneak into the timeline unnoticed.
+	 */
+	mutex_lock(&ce->timeline->mutex);
+
 	do {
 		int len;
 
-		rq = i915_request_create(ce);
+		rq = i915_request_create_locked(ce);
 		if (IS_ERR(rq)) {
 			err = PTR_ERR(rq);
-			goto out_ce;
+			break;
 		}
 
 		if (deps) {
@@ -1056,17 +1075,25 @@ intel_context_migrate_clear(struct intel_context *ce,
 
 		/* Arbitration is re-enabled between requests. */
 out_rq:
-		if (*out)
+		i915_sw_fence_await(&rq->submit);
+		i915_request_get(rq);
+		i915_request_add_locked(rq);
+		if (*out) {
+			i915_sw_fence_complete(&(*out)->submit);
 			i915_request_put(*out);
-		*out = i915_request_get(rq);
-		i915_request_add(rq);
+		}
+		*out = rq;
+
 		if (err || !it.sg || !sg_dma_len(it.sg))
 			break;
 
 		cond_resched();
 	} while (1);
 
-out_ce:
+	mutex_unlock(&ce->timeline->mutex);
+
+	if (*out)
+		i915_sw_fence_complete(&(*out)->submit);
 	return err;
 }
 
-- 
2.39.2
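
The await/complete pattern used in the two loops of patch 5/5 can be modelled with a simple counted fence. This is a heavily simplified, single-threaded sketch: the real i915_sw_fence is atomic and runs notification callbacks, whereas here the fence is just a pending count, and all `toy_*` names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Counted fence: signals when the number of pending holds drops to zero. */
struct toy_fence {
	int pending;
	bool signaled;
};

struct toy_request {
	struct toy_fence submit;
};

static void toy_fence_init(struct toy_fence *f)
{
	f->pending = 1;		/* initial hold, dropped when the request is added */
	f->signaled = false;
}

static void toy_fence_await(struct toy_fence *f)
{
	assert(!f->signaled);
	f->pending++;
}

static void toy_fence_complete(struct toy_fence *f)
{
	assert(f->pending > 0);
	if (--f->pending == 0)
		f->signaled = true;
}

/*
 * Queue 'rq' while keeping an extra hold on its submit fence, then
 * release the hold on the previous request. Returns the new chain tail.
 */
static struct toy_request *
toy_chain_add(struct toy_request *prev, struct toy_request *rq)
{
	toy_fence_init(&rq->submit);
	toy_fence_await(&rq->submit);		/* extra hold taken before the add */
	toy_fence_complete(&rq->submit);	/* the "add" drops only the initial hold */

	if (prev)
		toy_fence_complete(&prev->submit);	/* successor queued: release prev */

	return rq;
}

/* After the loop: release the hold on the last request of the chain. */
static void toy_chain_finish(struct toy_request *last)
{
	if (last)
		toy_fence_complete(&last->submit);
}
```

In this model a request's submit fence cannot signal until its successor has been queued, so each request is still unsubmitted when the next one is attached behind it; only the last request is released explicitly once the chain is complete.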



* Re: [Intel-gfx] [PATCH v5 1/5] drm/i915/gt: Add intel_context_timeline_is_locked helper
  2023-04-12 11:33 ` [Intel-gfx] [PATCH v5 1/5] drm/i915/gt: Add intel_context_timeline_is_locked helper Andi Shyti
@ 2023-04-12 12:12   ` Andrzej Hajda
  0 siblings, 0 replies; 19+ messages in thread
From: Andrzej Hajda @ 2023-04-12 12:12 UTC (permalink / raw)
  To: Andi Shyti, intel-gfx, dri-devel, stable
  Cc: Andi Shyti, Maciej Patelczyk, Matthew Auld, Rodrigo Vivi,
	Chris Wilson, Nirmoy Das



On 12.04.2023 13:33, Andi Shyti wrote:
> We have:
>
>   - intel_context_timeline_lock()
>   - intel_context_timeline_unlock()
>
> In the next patches we will also need:
>
>   - intel_context_timeline_is_locked()
>
> Add it.
>
> Signed-off-by: Andi Shyti <andi.shyti@linux.intel.com>
> Cc: stable@vger.kernel.org
> Reviewed-by: Nirmoy Das <nirmoy.das@intel.com>

Reviewed-by: Andrzej Hajda <andrzej.hajda@intel.com>

Regards
Andrzej

> ---
>   drivers/gpu/drm/i915/gt/intel_context.h | 6 ++++++
>   1 file changed, 6 insertions(+)
>
> diff --git a/drivers/gpu/drm/i915/gt/intel_context.h b/drivers/gpu/drm/i915/gt/intel_context.h
> index 48f888c3da083..f2f79ff0dfd1d 100644
> --- a/drivers/gpu/drm/i915/gt/intel_context.h
> +++ b/drivers/gpu/drm/i915/gt/intel_context.h
> @@ -270,6 +270,12 @@ static inline void intel_context_timeline_unlock(struct intel_timeline *tl)
>   	mutex_unlock(&tl->mutex);
>   }
>   
> +static inline void intel_context_assert_timeline_is_locked(struct intel_timeline *tl)
> +	__must_hold(&tl->mutex)
> +{
> +	lockdep_assert_held(&tl->mutex);
> +}
> +
>   int intel_context_prepare_remote_request(struct intel_context *ce,
>   					 struct i915_request *rq);
>   



* Re: [Intel-gfx] [PATCH v5 2/5] drm/i915: Create the locked version of the request create
  2023-04-12 11:33 ` [Intel-gfx] [PATCH v5 2/5] drm/i915: Create the locked version of the request create Andi Shyti
@ 2023-04-12 13:04   ` Andrzej Hajda
  2023-04-13  8:53     ` Andi Shyti
  0 siblings, 1 reply; 19+ messages in thread
From: Andrzej Hajda @ 2023-04-12 13:04 UTC (permalink / raw)
  To: Andi Shyti, intel-gfx, dri-devel, stable
  Cc: Andi Shyti, Maciej Patelczyk, Matthew Auld, Rodrigo Vivi,
	Chris Wilson, Nirmoy Das



On 12.04.2023 13:33, Andi Shyti wrote:
> Make version of the request creation that doesn't hold any
> lock.
>
> Signed-off-by: Andi Shyti <andi.shyti@linux.intel.com>
> Cc: stable@vger.kernel.org
> Reviewed-by: Nirmoy Das <nirmoy.das@intel.com>
> ---
>   drivers/gpu/drm/i915/i915_request.c | 38 +++++++++++++++++++++--------
>   drivers/gpu/drm/i915/i915_request.h |  2 ++
>   2 files changed, 30 insertions(+), 10 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/i915_request.c b/drivers/gpu/drm/i915/i915_request.c
> index 630a732aaecca..58662360ac34e 100644
> --- a/drivers/gpu/drm/i915/i915_request.c
> +++ b/drivers/gpu/drm/i915/i915_request.c
> @@ -1028,15 +1028,11 @@ __i915_request_create(struct intel_context *ce, gfp_t gfp)
>   	return ERR_PTR(ret);
>   }
>   
> -struct i915_request *
> -i915_request_create(struct intel_context *ce)
> +static struct i915_request *
> +__i915_request_create_locked(struct intel_context *ce)
>   {
> +	struct intel_timeline *tl = ce->timeline;
>   	struct i915_request *rq;
> -	struct intel_timeline *tl;
> -
> -	tl = intel_context_timeline_lock(ce);
> -	if (IS_ERR(tl))
> -		return ERR_CAST(tl);
>   
>   	/* Move our oldest request to the slab-cache (if not in use!) */
>   	rq = list_first_entry(&tl->requests, typeof(*rq), link);
> @@ -1046,16 +1042,38 @@ i915_request_create(struct intel_context *ce)
>   	intel_context_enter(ce);
>   	rq = __i915_request_create(ce, GFP_KERNEL);
>   	intel_context_exit(ce); /* active reference transferred to request */
> +
>   	if (IS_ERR(rq))
> -		goto err_unlock;
> +		return rq;
>   
>   	/* Check that we do not interrupt ourselves with a new request */
>   	rq->cookie = lockdep_pin_lock(&tl->mutex);
>   
>   	return rq;
> +}
> +
> +struct i915_request *
> +i915_request_create_locked(struct intel_context *ce)
> +{
> +	intel_context_assert_timeline_is_locked(ce->timeline);
> +
> +	return __i915_request_create_locked(ce);
> +}

I wonder if we really need such granularity. Keeping only
i915_request_create_locked and removing __i915_request_create_locked
would simplify the code a little.
I guess the cost of sometimes calling
intel_context_assert_timeline_is_locked twice is not big, or maybe it
can be rearranged; up to you.

Reviewed-by: Andrzej Hajda <andrzej.hajda@intel.com>

Regards
Andrzej


> +
> +struct i915_request *
> +i915_request_create(struct intel_context *ce)
> +{
> +	struct i915_request *rq;
> +	struct intel_timeline *tl;
> +
> +	tl = intel_context_timeline_lock(ce);
> +	if (IS_ERR(tl))
> +		return ERR_CAST(tl);
> +
> +	rq = __i915_request_create_locked(ce);
> +	if (IS_ERR(rq))
> +		intel_context_timeline_unlock(tl);
>   
> -err_unlock:
> -	intel_context_timeline_unlock(tl);
>   	return rq;
>   }
>   
> diff --git a/drivers/gpu/drm/i915/i915_request.h b/drivers/gpu/drm/i915/i915_request.h
> index f5e1bb5e857aa..bb48bd4605c03 100644
> --- a/drivers/gpu/drm/i915/i915_request.h
> +++ b/drivers/gpu/drm/i915/i915_request.h
> @@ -374,6 +374,8 @@ struct i915_request * __must_check
>   __i915_request_create(struct intel_context *ce, gfp_t gfp);
>   struct i915_request * __must_check
>   i915_request_create(struct intel_context *ce);
> +struct i915_request * __must_check
> +i915_request_create_locked(struct intel_context *ce);
>   
>   void __i915_request_skip(struct i915_request *rq);
>   bool i915_request_set_error_once(struct i915_request *rq, int error);



* Re: [Intel-gfx] [PATCH v5 3/5] drm/i915: Create the locked version of the request add
  2023-04-12 11:33 ` [Intel-gfx] [PATCH v5 3/5] drm/i915: Create the locked version of the request add Andi Shyti
@ 2023-04-12 13:06   ` Andrzej Hajda
  2023-04-13  8:57     ` Andi Shyti
  0 siblings, 1 reply; 19+ messages in thread
From: Andrzej Hajda @ 2023-04-12 13:06 UTC (permalink / raw)
  To: Andi Shyti, intel-gfx, dri-devel, stable
  Cc: Andi Shyti, Maciej Patelczyk, Matthew Auld, Rodrigo Vivi,
	Chris Wilson, Nirmoy Das

On 12.04.2023 13:33, Andi Shyti wrote:
> i915_request_add() assumes that the timeline is locked whtn the
*when
> function is called. Before exiting it releases the lock. But in
> the next commit we have one case where releasing the timeline
> mutex is not necessary and we don't want that.
>
> Make a new i915_request_add_locked() version of the function
> where the lock is not released.
>
> Signed-off-by: Andi Shyti <andi.shyti@linux.intel.com>
> Cc: stable@vger.kernel.org

Have you looked for other potential users of these new helpers?

Reviewed-by: Andrzej Hajda <andrzej.hajda@intel.com>

Regards
Andrzej

> ---
>   drivers/gpu/drm/i915/i915_request.c | 14 +++++++++++---
>   drivers/gpu/drm/i915/i915_request.h |  1 +
>   2 files changed, 12 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/i915_request.c b/drivers/gpu/drm/i915/i915_request.c
> index 58662360ac34e..21032b3b9d330 100644
> --- a/drivers/gpu/drm/i915/i915_request.c
> +++ b/drivers/gpu/drm/i915/i915_request.c
> @@ -1852,13 +1852,13 @@ void __i915_request_queue(struct i915_request *rq,
>   	local_bh_enable(); /* kick tasklets */
>   }
>   
> -void i915_request_add(struct i915_request *rq)
> +void i915_request_add_locked(struct i915_request *rq)
>   {
>   	struct intel_timeline * const tl = i915_request_timeline(rq);
>   	struct i915_sched_attr attr = {};
>   	struct i915_gem_context *ctx;
>   
> -	lockdep_assert_held(&tl->mutex);
> +	intel_context_assert_timeline_is_locked(tl);
>   	lockdep_unpin_lock(&tl->mutex, rq->cookie);
>   
>   	trace_i915_request_add(rq);
> @@ -1873,7 +1873,15 @@ void i915_request_add(struct i915_request *rq)
>   
>   	__i915_request_queue(rq, &attr);
>   
> -	mutex_unlock(&tl->mutex);
> +}
> +
> +void i915_request_add(struct i915_request *rq)
> +{
> +	struct intel_timeline * const tl = i915_request_timeline(rq);
> +
> +	i915_request_add_locked(rq);
> +
> +	intel_context_timeline_unlock(tl);
>   }
>   
>   static unsigned long local_clock_ns(unsigned int *cpu)
> diff --git a/drivers/gpu/drm/i915/i915_request.h b/drivers/gpu/drm/i915/i915_request.h
> index bb48bd4605c03..29e3a37c300a7 100644
> --- a/drivers/gpu/drm/i915/i915_request.h
> +++ b/drivers/gpu/drm/i915/i915_request.h
> @@ -425,6 +425,7 @@ int i915_request_await_deps(struct i915_request *rq, const struct i915_deps *dep
>   int i915_request_await_execution(struct i915_request *rq,
>   				 struct dma_fence *fence);
>   
> +void i915_request_add_locked(struct i915_request *rq);
>   void i915_request_add(struct i915_request *rq);
>   
>   bool __i915_request_submit(struct i915_request *request);



* [Intel-gfx] ✗ Fi.CI.CHECKPATCH: warning for Fix error propagation amongst request (rev3)
  2023-04-12 11:33 [Intel-gfx] [PATCH v5 0/5] Fix error propagation amongst request Andi Shyti
                   ` (4 preceding siblings ...)
  2023-04-12 11:33 ` [Intel-gfx] [PATCH v5 5/5] drm/i915/gt: Make sure that errors are propagated through request chains Andi Shyti
@ 2023-04-12 17:28 ` Patchwork
  2023-04-12 17:28 ` [Intel-gfx] ✗ Fi.CI.SPARSE: " Patchwork
                   ` (2 subsequent siblings)
  8 siblings, 0 replies; 19+ messages in thread
From: Patchwork @ 2023-04-12 17:28 UTC (permalink / raw)
  To: Andi Shyti; +Cc: intel-gfx

== Series Details ==

Series: Fix error propagation amongst request (rev3)
URL   : https://patchwork.freedesktop.org/series/114451/
State : warning

== Summary ==

Error: dim checkpatch failed
9e4d4a173f02 drm/i915/gt: Add intel_context_timeline_is_locked helper
5e0b2f6541d9 drm/i915: Create the locked version of the request create
f24c647aa3ba drm/i915: Create the locked version of the request add
-:43: CHECK:BRACES: Blank lines aren't necessary before a close brace '}'
#43: FILE: drivers/gpu/drm/i915/i915_request.c:1876:
 
+}

total: 0 errors, 0 warnings, 1 checks, 38 lines checked
216285b78cf8 drm/i915: Throttle for ringspace prior to taking the timeline mutex
-:101: WARNING:FROM_SIGN_OFF_MISMATCH: From:/Signed-off-by: email address mismatch: 'From: Chris Wilson <chris@chris-wilson.co.uk>' != 'Signed-off-by: Chris Wilson <chris.p.wilson@linux.intel.com>'

total: 0 errors, 1 warnings, 0 checks, 64 lines checked
ebf8f355b629 drm/i915/gt: Make sure that errors are propagated through request chains
-:31: WARNING:BAD_FIXES_TAG: Please use correct Fixes: style 'Fixes: <12 chars of sha1> ("<title line>")' - ie: 'Fixes: cf586021642d ("drm/i915/gt: Pipelined page migration")'
#31: 
Fixes: cf586021642d80 ("drm/i915/gt: Pipelined page migration")

-:32: WARNING:BAD_REPORTED_BY_LINK: Reported-by: should be immediately followed by Link: with a URL to the report
#32: 
Reported-by: Matthew Auld <matthew.auld@intel.com>
Suggested-by: Chris Wilson <chris@chris-wilson.co.uk>

total: 0 errors, 2 warnings, 0 checks, 99 lines checked




* [Intel-gfx] ✗ Fi.CI.SPARSE: warning for Fix error propagation amongst request (rev3)
  2023-04-12 11:33 [Intel-gfx] [PATCH v5 0/5] Fix error propagation amongst request Andi Shyti
                   ` (5 preceding siblings ...)
  2023-04-12 17:28 ` [Intel-gfx] ✗ Fi.CI.CHECKPATCH: warning for Fix error propagation amongst request (rev3) Patchwork
@ 2023-04-12 17:28 ` Patchwork
  2023-04-12 17:41 ` [Intel-gfx] ✓ Fi.CI.BAT: success " Patchwork
  2023-04-13  3:47 ` [Intel-gfx] ✓ Fi.CI.IGT: " Patchwork
  8 siblings, 0 replies; 19+ messages in thread
From: Patchwork @ 2023-04-12 17:28 UTC (permalink / raw)
  To: Andi Shyti; +Cc: intel-gfx

== Series Details ==

Series: Fix error propagation amongst request (rev3)
URL   : https://patchwork.freedesktop.org/series/114451/
State : warning

== Summary ==

Error: dim sparse failed
Sparse version: v0.6.2
Fast mode used, each commit won't be checked separately.



^ permalink raw reply	[flat|nested] 19+ messages in thread

* [Intel-gfx] ✓ Fi.CI.BAT: success for Fix error propagation amongst request (rev3)
  2023-04-12 11:33 [Intel-gfx] [PATCH v5 0/5] Fix error propagation amongst request Andi Shyti
                   ` (6 preceding siblings ...)
  2023-04-12 17:28 ` [Intel-gfx] ✗ Fi.CI.SPARSE: " Patchwork
@ 2023-04-12 17:41 ` Patchwork
  2023-04-13  3:47 ` [Intel-gfx] ✓ Fi.CI.IGT: " Patchwork
  8 siblings, 0 replies; 19+ messages in thread
From: Patchwork @ 2023-04-12 17:41 UTC (permalink / raw)
  To: Andi Shyti; +Cc: intel-gfx

[-- Attachment #1: Type: text/plain, Size: 6618 bytes --]

== Series Details ==

Series: Fix error propagation amongst request (rev3)
URL   : https://patchwork.freedesktop.org/series/114451/
State : success

== Summary ==

CI Bug Log - changes from CI_DRM_12996 -> Patchwork_114451v3
====================================================

Summary
-------

  **SUCCESS**

  No regressions found.

  External URL: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_114451v3/index.html

Participating hosts (36 -> 35)
------------------------------

  Missing    (1): fi-snb-2520m 

Known issues
------------

  Here are the changes found in Patchwork_114451v3 that come from known issues:

### IGT changes ###

#### Issues hit ####

  * igt@gem_exec_suspend@basic-s3@smem:
    - bat-atsm-1:         [PASS][1] -> [INCOMPLETE][2] ([i915#6311])
   [1]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12996/bat-atsm-1/igt@gem_exec_suspend@basic-s3@smem.html
   [2]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_114451v3/bat-atsm-1/igt@gem_exec_suspend@basic-s3@smem.html
    - bat-rpls-1:         [PASS][3] -> [ABORT][4] ([i915#6687] / [i915#7978])
   [3]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12996/bat-rpls-1/igt@gem_exec_suspend@basic-s3@smem.html
   [4]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_114451v3/bat-rpls-1/igt@gem_exec_suspend@basic-s3@smem.html

  * igt@i915_pm_rps@basic-api:
    - bat-dg2-11:         [PASS][5] -> [FAIL][6] ([i915#8308])
   [5]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12996/bat-dg2-11/igt@i915_pm_rps@basic-api.html
   [6]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_114451v3/bat-dg2-11/igt@i915_pm_rps@basic-api.html

  * igt@i915_selftest@live@migrate:
    - bat-dg2-11:         [PASS][7] -> [DMESG-WARN][8] ([i915#7699])
   [7]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12996/bat-dg2-11/igt@i915_selftest@live@migrate.html
   [8]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_114451v3/bat-dg2-11/igt@i915_selftest@live@migrate.html
    - bat-atsm-1:         [PASS][9] -> [DMESG-FAIL][10] ([i915#7699])
   [9]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12996/bat-atsm-1/igt@i915_selftest@live@migrate.html
   [10]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_114451v3/bat-atsm-1/igt@i915_selftest@live@migrate.html
    - bat-rpls-2:         [PASS][11] -> [DMESG-FAIL][12] ([i915#7699])
   [11]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12996/bat-rpls-2/igt@i915_selftest@live@migrate.html
   [12]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_114451v3/bat-rpls-2/igt@i915_selftest@live@migrate.html

  * igt@i915_selftest@live@requests:
    - bat-adlp-6:         [PASS][13] -> [ABORT][14] ([i915#7913] / [i915#7982])
   [13]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12996/bat-adlp-6/igt@i915_selftest@live@requests.html
   [14]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_114451v3/bat-adlp-6/igt@i915_selftest@live@requests.html

  * igt@kms_chamelium_hpd@common-hpd-after-suspend:
    - bat-rpls-2:         NOTRUN -> [SKIP][15] ([i915#7828])
   [15]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_114451v3/bat-rpls-2/igt@kms_chamelium_hpd@common-hpd-after-suspend.html

  * igt@kms_pipe_crc_basic@read-crc:
    - bat-dg2-11:         NOTRUN -> [SKIP][16] ([i915#5354]) +1 similar issue
   [16]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_114451v3/bat-dg2-11/igt@kms_pipe_crc_basic@read-crc.html

  * igt@kms_pipe_crc_basic@suspend-read-crc:
    - bat-rpls-2:         NOTRUN -> [SKIP][17] ([i915#1845])
   [17]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_114451v3/bat-rpls-2/igt@kms_pipe_crc_basic@suspend-read-crc.html

  
#### Possible fixes ####

  * igt@gem_exec_suspend@basic-s3@smem:
    - bat-rpls-2:         [ABORT][18] ([i915#6687] / [i915#7978]) -> [PASS][19]
   [18]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12996/bat-rpls-2/igt@gem_exec_suspend@basic-s3@smem.html
   [19]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_114451v3/bat-rpls-2/igt@gem_exec_suspend@basic-s3@smem.html

  * igt@kms_pipe_crc_basic@nonblocking-crc-frame-sequence@pipe-d-dp-1:
    - bat-dg2-8:          [FAIL][20] ([i915#7932]) -> [PASS][21]
   [20]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12996/bat-dg2-8/igt@kms_pipe_crc_basic@nonblocking-crc-frame-sequence@pipe-d-dp-1.html
   [21]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_114451v3/bat-dg2-8/igt@kms_pipe_crc_basic@nonblocking-crc-frame-sequence@pipe-d-dp-1.html

  
#### Warnings ####

  * igt@i915_selftest@live@slpc:
    - bat-rpls-1:         [DMESG-FAIL][22] ([i915#6367]) -> [DMESG-FAIL][23] ([i915#6367] / [i915#7996])
   [22]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12996/bat-rpls-1/igt@i915_selftest@live@slpc.html
   [23]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_114451v3/bat-rpls-1/igt@i915_selftest@live@slpc.html

  
  [i915#1845]: https://gitlab.freedesktop.org/drm/intel/issues/1845
  [i915#5354]: https://gitlab.freedesktop.org/drm/intel/issues/5354
  [i915#6311]: https://gitlab.freedesktop.org/drm/intel/issues/6311
  [i915#6367]: https://gitlab.freedesktop.org/drm/intel/issues/6367
  [i915#6687]: https://gitlab.freedesktop.org/drm/intel/issues/6687
  [i915#7699]: https://gitlab.freedesktop.org/drm/intel/issues/7699
  [i915#7828]: https://gitlab.freedesktop.org/drm/intel/issues/7828
  [i915#7913]: https://gitlab.freedesktop.org/drm/intel/issues/7913
  [i915#7932]: https://gitlab.freedesktop.org/drm/intel/issues/7932
  [i915#7978]: https://gitlab.freedesktop.org/drm/intel/issues/7978
  [i915#7982]: https://gitlab.freedesktop.org/drm/intel/issues/7982
  [i915#7996]: https://gitlab.freedesktop.org/drm/intel/issues/7996
  [i915#8308]: https://gitlab.freedesktop.org/drm/intel/issues/8308


Build changes
-------------

  * Linux: CI_DRM_12996 -> Patchwork_114451v3

  CI-20190529: 20190529
  CI_DRM_12996: d82f63ad2143079892f2bee4f3e72556c54fac7d @ git://anongit.freedesktop.org/gfx-ci/linux
  IGT_7253: 1a619e8dbc6ca887f2ae24b2d7f1ac536342f58c @ https://gitlab.freedesktop.org/drm/igt-gpu-tools.git
  Patchwork_114451v3: d82f63ad2143079892f2bee4f3e72556c54fac7d @ git://anongit.freedesktop.org/gfx-ci/linux


### Linux commits

f3519c42fc8d drm/i915/gt: Make sure that errors are propagated through request chains
327332d5b34e drm/i915: Throttle for ringspace prior to taking the timeline mutex
37fa43d37a38 drm/i915: Create the locked version of the request add
d0943676553b drm/i915: Create the locked version of the request create
6614e8ff5ba9 drm/i915/gt: Add intel_context_timeline_is_locked helper

== Logs ==

For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_114451v3/index.html

[-- Attachment #2: Type: text/html, Size: 7864 bytes --]

^ permalink raw reply	[flat|nested] 19+ messages in thread

* [Intel-gfx] ✓ Fi.CI.IGT: success for Fix error propagation amongst request (rev3)
  2023-04-12 11:33 [Intel-gfx] [PATCH v5 0/5] Fix error propagation amongst request Andi Shyti
                   ` (7 preceding siblings ...)
  2023-04-12 17:41 ` [Intel-gfx] ✓ Fi.CI.BAT: success " Patchwork
@ 2023-04-13  3:47 ` Patchwork
  8 siblings, 0 replies; 19+ messages in thread
From: Patchwork @ 2023-04-13  3:47 UTC (permalink / raw)
  To: Andi Shyti; +Cc: intel-gfx

[-- Attachment #1: Type: text/plain, Size: 13875 bytes --]

== Series Details ==

Series: Fix error propagation amongst request (rev3)
URL   : https://patchwork.freedesktop.org/series/114451/
State : success

== Summary ==

CI Bug Log - changes from CI_DRM_12996_full -> Patchwork_114451v3_full
====================================================

Summary
-------

  **SUCCESS**

  No regressions found.

  

Participating hosts (7 -> 7)
------------------------------

  No changes in participating hosts

Known issues
------------

  Here are the changes found in Patchwork_114451v3_full that come from known issues:

### IGT changes ###

#### Issues hit ####

  * igt@gem_exec_fair@basic-pace-solo@rcs0:
    - shard-apl:          [PASS][1] -> [FAIL][2] ([i915#2842])
   [1]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12996/shard-apl2/igt@gem_exec_fair@basic-pace-solo@rcs0.html
   [2]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_114451v3/shard-apl6/igt@gem_exec_fair@basic-pace-solo@rcs0.html

  * igt@gem_userptr_blits@vma-merge:
    - shard-snb:          NOTRUN -> [FAIL][3] ([i915#2724])
   [3]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_114451v3/shard-snb5/igt@gem_userptr_blits@vma-merge.html

  * igt@i915_pm_dc@dc9-dpms:
    - shard-apl:          [PASS][4] -> [FAIL][5] ([i915#4275])
   [4]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12996/shard-apl7/igt@i915_pm_dc@dc9-dpms.html
   [5]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_114451v3/shard-apl2/igt@i915_pm_dc@dc9-dpms.html

  * igt@kms_ccs@pipe-b-missing-ccs-buffer-y_tiled_gen12_mc_ccs:
    - shard-apl:          NOTRUN -> [SKIP][6] ([fdo#109271] / [i915#3886])
   [6]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_114451v3/shard-apl6/igt@kms_ccs@pipe-b-missing-ccs-buffer-y_tiled_gen12_mc_ccs.html

  * igt@kms_ccs@pipe-c-bad-pixel-format-y_tiled_gen12_rc_ccs_cc:
    - shard-snb:          NOTRUN -> [SKIP][7] ([fdo#109271]) +22 similar issues
   [7]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_114451v3/shard-snb5/igt@kms_ccs@pipe-c-bad-pixel-format-y_tiled_gen12_rc_ccs_cc.html

  * igt@kms_draw_crc@draw-method-blt@xrgb2101010-ytiled:
    - shard-glk:          [PASS][8] -> [DMESG-WARN][9] ([i915#7936])
   [8]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12996/shard-glk5/igt@kms_draw_crc@draw-method-blt@xrgb2101010-ytiled.html
   [9]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_114451v3/shard-glk7/igt@kms_draw_crc@draw-method-blt@xrgb2101010-ytiled.html

  * igt@kms_frontbuffer_tracking@psr-1p-primscrn-pri-shrfb-draw-mmap-cpu:
    - shard-apl:          NOTRUN -> [SKIP][10] ([fdo#109271]) +24 similar issues
   [10]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_114451v3/shard-apl6/igt@kms_frontbuffer_tracking@psr-1p-primscrn-pri-shrfb-draw-mmap-cpu.html

  * igt@perf@stress-open-close@0-rcs0:
    - shard-glk:          [PASS][11] -> [ABORT][12] ([i915#5213])
   [11]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12996/shard-glk2/igt@perf@stress-open-close@0-rcs0.html
   [12]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_114451v3/shard-glk4/igt@perf@stress-open-close@0-rcs0.html

  
#### Possible fixes ####

  * igt@gem_ctx_exec@basic-nohangcheck:
    - {shard-rkl}:        [FAIL][13] ([i915#6268]) -> [PASS][14]
   [13]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12996/shard-rkl-4/igt@gem_ctx_exec@basic-nohangcheck.html
   [14]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_114451v3/shard-rkl-7/igt@gem_ctx_exec@basic-nohangcheck.html

  * igt@gem_eio@in-flight-contexts-10ms:
    - shard-glk:          [TIMEOUT][15] ([i915#3063]) -> [PASS][16]
   [15]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12996/shard-glk9/igt@gem_eio@in-flight-contexts-10ms.html
   [16]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_114451v3/shard-glk3/igt@gem_eio@in-flight-contexts-10ms.html

  * igt@gem_exec_fair@basic-deadline:
    - {shard-rkl}:        [FAIL][17] ([i915#2846]) -> [PASS][18]
   [17]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12996/shard-rkl-3/igt@gem_exec_fair@basic-deadline.html
   [18]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_114451v3/shard-rkl-1/igt@gem_exec_fair@basic-deadline.html

  * igt@gem_exec_fair@basic-flow@rcs0:
    - {shard-tglu}:       [FAIL][19] ([i915#2842]) -> [PASS][20]
   [19]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12996/shard-tglu-6/igt@gem_exec_fair@basic-flow@rcs0.html
   [20]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_114451v3/shard-tglu-10/igt@gem_exec_fair@basic-flow@rcs0.html

  * igt@gem_exec_fair@basic-pace-share@rcs0:
    - shard-glk:          [FAIL][21] ([i915#2842]) -> [PASS][22]
   [21]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12996/shard-glk8/igt@gem_exec_fair@basic-pace-share@rcs0.html
   [22]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_114451v3/shard-glk3/igt@gem_exec_fair@basic-pace-share@rcs0.html

  * igt@gen9_exec_parse@allowed-single:
    - shard-apl:          [ABORT][23] ([i915#5566]) -> [PASS][24]
   [23]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12996/shard-apl6/igt@gen9_exec_parse@allowed-single.html
   [24]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_114451v3/shard-apl6/igt@gen9_exec_parse@allowed-single.html

  * igt@i915_module_load@reload-no-display:
    - shard-snb:          [ABORT][25] ([i915#4528]) -> [PASS][26]
   [25]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12996/shard-snb7/igt@i915_module_load@reload-no-display.html
   [26]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_114451v3/shard-snb5/igt@i915_module_load@reload-no-display.html

  * igt@i915_pm_rc6_residency@rc6-idle@vcs0:
    - {shard-dg1}:        [FAIL][27] ([i915#3591]) -> [PASS][28]
   [27]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12996/shard-dg1-18/igt@i915_pm_rc6_residency@rc6-idle@vcs0.html
   [28]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_114451v3/shard-dg1-18/igt@i915_pm_rc6_residency@rc6-idle@vcs0.html

  * igt@i915_pm_rpm@dpms-mode-unset-lpsp:
    - {shard-dg1}:        [SKIP][29] ([i915#1397]) -> [PASS][30]
   [29]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12996/shard-dg1-15/igt@i915_pm_rpm@dpms-mode-unset-lpsp.html
   [30]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_114451v3/shard-dg1-14/igt@i915_pm_rpm@dpms-mode-unset-lpsp.html

  * igt@i915_pm_rpm@modeset-lpsp-stress:
    - {shard-rkl}:        [SKIP][31] ([i915#1397]) -> [PASS][32]
   [31]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12996/shard-rkl-4/igt@i915_pm_rpm@modeset-lpsp-stress.html
   [32]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_114451v3/shard-rkl-7/igt@i915_pm_rpm@modeset-lpsp-stress.html

  * igt@kms_big_fb@y-tiled-max-hw-stride-32bpp-rotate-0-hflip-async-flip:
    - {shard-tglu}:       [FAIL][33] ([i915#3743]) -> [PASS][34]
   [33]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12996/shard-tglu-9/igt@kms_big_fb@y-tiled-max-hw-stride-32bpp-rotate-0-hflip-async-flip.html
   [34]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_114451v3/shard-tglu-2/igt@kms_big_fb@y-tiled-max-hw-stride-32bpp-rotate-0-hflip-async-flip.html

  * igt@kms_flip@flip-vs-expired-vblank@a-hdmi-a2:
    - shard-glk:          [FAIL][35] ([i915#79]) -> [PASS][36]
   [35]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12996/shard-glk6/igt@kms_flip@flip-vs-expired-vblank@a-hdmi-a2.html
   [36]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_114451v3/shard-glk2/igt@kms_flip@flip-vs-expired-vblank@a-hdmi-a2.html

  * igt@kms_plane_scaling@i915-max-src-size@pipe-a-hdmi-a-1:
    - {shard-tglu}:       [FAIL][37] ([i915#8292]) -> [PASS][38]
   [37]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12996/shard-tglu-9/igt@kms_plane_scaling@i915-max-src-size@pipe-a-hdmi-a-1.html
   [38]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_114451v3/shard-tglu-9/igt@kms_plane_scaling@i915-max-src-size@pipe-a-hdmi-a-1.html

  
  {name}: This element is suppressed. This means it is ignored when computing
          the status of the difference (SUCCESS, WARNING, or FAILURE).

  [fdo#109271]: https://bugs.freedesktop.org/show_bug.cgi?id=109271
  [fdo#109289]: https://bugs.freedesktop.org/show_bug.cgi?id=109289
  [fdo#109307]: https://bugs.freedesktop.org/show_bug.cgi?id=109307
  [fdo#109506]: https://bugs.freedesktop.org/show_bug.cgi?id=109506
  [fdo#111068]: https://bugs.freedesktop.org/show_bug.cgi?id=111068
  [fdo#111615]: https://bugs.freedesktop.org/show_bug.cgi?id=111615
  [fdo#111825]: https://bugs.freedesktop.org/show_bug.cgi?id=111825
  [fdo#111827]: https://bugs.freedesktop.org/show_bug.cgi?id=111827
  [i915#1072]: https://gitlab.freedesktop.org/drm/intel/issues/1072
  [i915#1397]: https://gitlab.freedesktop.org/drm/intel/issues/1397
  [i915#1839]: https://gitlab.freedesktop.org/drm/intel/issues/1839
  [i915#1937]: https://gitlab.freedesktop.org/drm/intel/issues/1937
  [i915#2433]: https://gitlab.freedesktop.org/drm/intel/issues/2433
  [i915#2437]: https://gitlab.freedesktop.org/drm/intel/issues/2437
  [i915#2527]: https://gitlab.freedesktop.org/drm/intel/issues/2527
  [i915#2575]: https://gitlab.freedesktop.org/drm/intel/issues/2575
  [i915#2587]: https://gitlab.freedesktop.org/drm/intel/issues/2587
  [i915#2672]: https://gitlab.freedesktop.org/drm/intel/issues/2672
  [i915#2705]: https://gitlab.freedesktop.org/drm/intel/issues/2705
  [i915#2724]: https://gitlab.freedesktop.org/drm/intel/issues/2724
  [i915#2842]: https://gitlab.freedesktop.org/drm/intel/issues/2842
  [i915#2846]: https://gitlab.freedesktop.org/drm/intel/issues/2846
  [i915#3063]: https://gitlab.freedesktop.org/drm/intel/issues/3063
  [i915#3281]: https://gitlab.freedesktop.org/drm/intel/issues/3281
  [i915#3282]: https://gitlab.freedesktop.org/drm/intel/issues/3282
  [i915#3299]: https://gitlab.freedesktop.org/drm/intel/issues/3299
  [i915#3359]: https://gitlab.freedesktop.org/drm/intel/issues/3359
  [i915#3458]: https://gitlab.freedesktop.org/drm/intel/issues/3458
  [i915#3469]: https://gitlab.freedesktop.org/drm/intel/issues/3469
  [i915#3539]: https://gitlab.freedesktop.org/drm/intel/issues/3539
  [i915#3555]: https://gitlab.freedesktop.org/drm/intel/issues/3555
  [i915#3591]: https://gitlab.freedesktop.org/drm/intel/issues/3591
  [i915#3638]: https://gitlab.freedesktop.org/drm/intel/issues/3638
  [i915#3689]: https://gitlab.freedesktop.org/drm/intel/issues/3689
  [i915#3708]: https://gitlab.freedesktop.org/drm/intel/issues/3708
  [i915#3743]: https://gitlab.freedesktop.org/drm/intel/issues/3743
  [i915#3886]: https://gitlab.freedesktop.org/drm/intel/issues/3886
  [i915#3952]: https://gitlab.freedesktop.org/drm/intel/issues/3952
  [i915#404]: https://gitlab.freedesktop.org/drm/intel/issues/404
  [i915#4077]: https://gitlab.freedesktop.org/drm/intel/issues/4077
  [i915#4079]: https://gitlab.freedesktop.org/drm/intel/issues/4079
  [i915#4083]: https://gitlab.freedesktop.org/drm/intel/issues/4083
  [i915#4270]: https://gitlab.freedesktop.org/drm/intel/issues/4270
  [i915#4275]: https://gitlab.freedesktop.org/drm/intel/issues/4275
  [i915#4528]: https://gitlab.freedesktop.org/drm/intel/issues/4528
  [i915#4538]: https://gitlab.freedesktop.org/drm/intel/issues/4538
  [i915#4579]: https://gitlab.freedesktop.org/drm/intel/issues/4579
  [i915#4771]: https://gitlab.freedesktop.org/drm/intel/issues/4771
  [i915#4812]: https://gitlab.freedesktop.org/drm/intel/issues/4812
  [i915#4818]: https://gitlab.freedesktop.org/drm/intel/issues/4818
  [i915#4833]: https://gitlab.freedesktop.org/drm/intel/issues/4833
  [i915#4852]: https://gitlab.freedesktop.org/drm/intel/issues/4852
  [i915#4859]: https://gitlab.freedesktop.org/drm/intel/issues/4859
  [i915#4860]: https://gitlab.freedesktop.org/drm/intel/issues/4860
  [i915#5176]: https://gitlab.freedesktop.org/drm/intel/issues/5176
  [i915#5213]: https://gitlab.freedesktop.org/drm/intel/issues/5213
  [i915#5235]: https://gitlab.freedesktop.org/drm/intel/issues/5235
  [i915#5286]: https://gitlab.freedesktop.org/drm/intel/issues/5286
  [i915#5289]: https://gitlab.freedesktop.org/drm/intel/issues/5289
  [i915#5563]: https://gitlab.freedesktop.org/drm/intel/issues/5563
  [i915#5566]: https://gitlab.freedesktop.org/drm/intel/issues/5566
  [i915#5784]: https://gitlab.freedesktop.org/drm/intel/issues/5784
  [i915#6095]: https://gitlab.freedesktop.org/drm/intel/issues/6095
  [i915#6268]: https://gitlab.freedesktop.org/drm/intel/issues/6268
  [i915#6334]: https://gitlab.freedesktop.org/drm/intel/issues/6334
  [i915#6433]: https://gitlab.freedesktop.org/drm/intel/issues/6433
  [i915#658]: https://gitlab.freedesktop.org/drm/intel/issues/658
  [i915#7561]: https://gitlab.freedesktop.org/drm/intel/issues/7561
  [i915#7697]: https://gitlab.freedesktop.org/drm/intel/issues/7697
  [i915#7701]: https://gitlab.freedesktop.org/drm/intel/issues/7701
  [i915#7711]: https://gitlab.freedesktop.org/drm/intel/issues/7711
  [i915#7742]: https://gitlab.freedesktop.org/drm/intel/issues/7742
  [i915#7828]: https://gitlab.freedesktop.org/drm/intel/issues/7828
  [i915#79]: https://gitlab.freedesktop.org/drm/intel/issues/79
  [i915#7936]: https://gitlab.freedesktop.org/drm/intel/issues/7936
  [i915#8011]: https://gitlab.freedesktop.org/drm/intel/issues/8011
  [i915#8253]: https://gitlab.freedesktop.org/drm/intel/issues/8253
  [i915#8292]: https://gitlab.freedesktop.org/drm/intel/issues/8292


Build changes
-------------

  * Linux: CI_DRM_12996 -> Patchwork_114451v3

  CI-20190529: 20190529
  CI_DRM_12996: d82f63ad2143079892f2bee4f3e72556c54fac7d @ git://anongit.freedesktop.org/gfx-ci/linux
  IGT_7253: 1a619e8dbc6ca887f2ae24b2d7f1ac536342f58c @ https://gitlab.freedesktop.org/drm/igt-gpu-tools.git
  Patchwork_114451v3: d82f63ad2143079892f2bee4f3e72556c54fac7d @ git://anongit.freedesktop.org/gfx-ci/linux
  piglit_4509: fdc5a4ca11124ab8413c7988896eec4c97336694 @ git://anongit.freedesktop.org/piglit

== Logs ==

For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_114451v3/index.html

[-- Attachment #2: Type: text/html, Size: 11194 bytes --]

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [Intel-gfx] [PATCH v5 2/5] drm/i915: Create the locked version of the request create
  2023-04-12 13:04   ` Andrzej Hajda
@ 2023-04-13  8:53     ` Andi Shyti
  0 siblings, 0 replies; 19+ messages in thread
From: Andi Shyti @ 2023-04-13  8:53 UTC (permalink / raw)
  To: Andrzej Hajda
  Cc: Maciej Patelczyk, intel-gfx, stable, Andi Shyti, dri-devel,
	Rodrigo Vivi, Nirmoy Das, Chris Wilson, Matthew Auld

Hi Andrzej,

> > Make version of the request creation that doesn't hold any
> > lock.
> > 
> > Signed-off-by: Andi Shyti <andi.shyti@linux.intel.com>
> > Cc: stable@vger.kernel.org
> > Reviewed-by: Nirmoy Das <nirmoy.das@intel.com>
> > ---
> >   drivers/gpu/drm/i915/i915_request.c | 38 +++++++++++++++++++++--------
> >   drivers/gpu/drm/i915/i915_request.h |  2 ++
> >   2 files changed, 30 insertions(+), 10 deletions(-)
> > 
> > diff --git a/drivers/gpu/drm/i915/i915_request.c b/drivers/gpu/drm/i915/i915_request.c
> > index 630a732aaecca..58662360ac34e 100644
> > --- a/drivers/gpu/drm/i915/i915_request.c
> > +++ b/drivers/gpu/drm/i915/i915_request.c
> > @@ -1028,15 +1028,11 @@ __i915_request_create(struct intel_context *ce, gfp_t gfp)
> >   	return ERR_PTR(ret);
> >   }
> > -struct i915_request *
> > -i915_request_create(struct intel_context *ce)
> > +static struct i915_request *
> > +__i915_request_create_locked(struct intel_context *ce)
> >   {
> > +	struct intel_timeline *tl = ce->timeline;
> >   	struct i915_request *rq;
> > -	struct intel_timeline *tl;
> > -
> > -	tl = intel_context_timeline_lock(ce);
> > -	if (IS_ERR(tl))
> > -		return ERR_CAST(tl);
> >   	/* Move our oldest request to the slab-cache (if not in use!) */
> >   	rq = list_first_entry(&tl->requests, typeof(*rq), link);
> > @@ -1046,16 +1042,38 @@ i915_request_create(struct intel_context *ce)
> >   	intel_context_enter(ce);
> >   	rq = __i915_request_create(ce, GFP_KERNEL);
> >   	intel_context_exit(ce); /* active reference transferred to request */
> > +
> >   	if (IS_ERR(rq))
> > -		goto err_unlock;
> > +		return rq;
> >   	/* Check that we do not interrupt ourselves with a new request */
> >   	rq->cookie = lockdep_pin_lock(&tl->mutex);
> >   	return rq;
> > +}
> > +
> > +struct i915_request *
> > +i915_request_create_locked(struct intel_context *ce)
> > +{
> > +	intel_context_assert_timeline_is_locked(ce->timeline);
> > +
> > +	return __i915_request_create_locked(ce);
> > +}
> 
> I wonder if we really need such granularity? Leaving only
> i915_request_create_locked and removing __i915_request_create_locked would
> simplify the code a little bit.
> I guess the cost of calling intel_context_assert_timeline_is_locked twice
> sometimes is not big, or maybe it can be re-arranged, up to you.

Patch 4 makes use of this granularity: it adds the throttle on
the timeline here, in the "_locked" version, to avoid potential
deadlocks coming from selftests (and from real-world use?).

Here I'd love to have some comments from Chris and Matt.

I might still add this in the commit message:

"i915_request_create_locked() is now empty but will be used in
later commits where a throttle on the ringspace will be executed
to ensure exclusive ownership of the timeline."

> Reviewed-by: Andrzej Hajda <andrzej.hajda@intel.com>

Thanks!

Andi
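
For readers following the locked/unlocked split discussed in this reply,
the general pattern can be sketched in user space as below. This is only
an illustration under stated assumptions: struct timeline, struct request,
the "held" flag and every function name here are hypothetical stand-ins,
not the i915 code, and the flag is a toy replacement for the kernel's
lock assertions. As the cover letter's changelog describes, the plain
create takes the lock and the plain add releases it, while the "_locked"
variants expect the caller to hold it throughout.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-ins; none of these are i915 types. */
struct timeline {
	pthread_mutex_t mutex;
	bool held;		/* toy replacement for a lock-held assertion */
	unsigned int seqno;
};

struct request {
	unsigned int seqno;
};

static void timeline_lock(struct timeline *tl)
{
	pthread_mutex_lock(&tl->mutex);
	tl->held = true;
}

static void timeline_unlock(struct timeline *tl)
{
	tl->held = false;
	pthread_mutex_unlock(&tl->mutex);
}

/* Core worker: performs no locking and no checks of its own. */
static struct request *request_create_unchecked(struct timeline *tl)
{
	struct request *rq = malloc(sizeof(*rq));

	if (rq)
		rq->seqno = ++tl->seqno;
	return rq;
}

/* The caller already holds the lock; it is still held on return. */
struct request *request_create_locked(struct timeline *tl)
{
	assert(tl->held);
	return request_create_unchecked(tl);
}

/* Takes the lock; on success returns with the lock still held. */
struct request *request_create(struct timeline *tl)
{
	struct request *rq;

	timeline_lock(tl);
	rq = request_create_unchecked(tl);
	if (!rq)
		timeline_unlock(tl);
	return rq;
}

/* "Submits" the request (here: just frees it) and keeps the lock. */
void request_add_locked(struct timeline *tl, struct request *rq)
{
	assert(tl->held);
	free(rq);
}

/* Submits the request and drops the lock taken by request_create(). */
void request_add(struct timeline *tl, struct request *rq)
{
	request_add_locked(tl, rq);
	timeline_unlock(tl);
}
```

A caller that builds several requests under one lock would take the lock
once, use only the "_locked" variants inside the loop, and unlock at the
end; the plain variants remain for the single-request case.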

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [Intel-gfx] [PATCH v5 3/5] drm/i915: Create the locked version of the request add
  2023-04-12 13:06   ` Andrzej Hajda
@ 2023-04-13  8:57     ` Andi Shyti
  0 siblings, 0 replies; 19+ messages in thread
From: Andi Shyti @ 2023-04-13  8:57 UTC (permalink / raw)
  To: Andrzej Hajda
  Cc: Maciej Patelczyk, intel-gfx, stable, Andi Shyti, dri-devel,
	Rodrigo Vivi, Nirmoy Das, Chris Wilson, Matthew Auld

Hi Andrzej,

On Wed, Apr 12, 2023 at 03:06:42PM +0200, Andrzej Hajda wrote:
> On 12.04.2023 13:33, Andi Shyti wrote:
> > i915_request_add() assumes that the timeline is locked whtn the
> *when
> > function is called. Before exiting it releases the lock. But in
> > the next commit we have one case where releasing the timeline
> > mutex is not necessary and we don't want that.
> > 
> > Make a new i915_request_add_locked() version of the function
> > where the lock is not released.
> > 
> > Signed-off-by: Andi Shyti <andi.shyti@linux.intel.com>
> > Cc: stable@vger.kernel.org
> 
> Have you looked for other potential users of these new helpers?

Not yet, will do!

> Reviewed-by: Andrzej Hajda <andrzej.hajda@intel.com>

Thanks!

Andi

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [Intel-gfx] [PATCH v5 5/5] drm/i915/gt: Make sure that errors are propagated through request chains
  2023-04-12 11:33 ` [Intel-gfx] [PATCH v5 5/5] drm/i915/gt: Make sure that errors are propagated through request chains Andi Shyti
@ 2023-04-13 11:56   ` Tvrtko Ursulin
  2023-04-13 12:43     ` Tvrtko Ursulin
  2023-05-04  9:37     ` Andi Shyti
  0 siblings, 2 replies; 19+ messages in thread
From: Tvrtko Ursulin @ 2023-04-13 11:56 UTC (permalink / raw)
  To: Andi Shyti, intel-gfx, dri-devel, stable
  Cc: Maciej Patelczyk, Matthew Auld, Andrzej Hajda, Rodrigo Vivi,
	Nirmoy Das, Chris Wilson, Andi Shyti


On 12/04/2023 12:33, Andi Shyti wrote:
> Currently, when we perform operations such as clearing or copying
> large blocks of memory, we generate multiple requests that are
> executed in a chain.
> 
> However, if one of these requests fails, we may not realize it
> unless it happens to be the last request in the chain. This is
> because errors are not properly propagated.
> 
> For this we need to keep propagating the chain of fence
> notification in order to always reach the final fence associated
> to the final request.
> 
> To address this issue, we need to ensure that the chain of fence
> notifications is always propagated so that we can reach the final
> fence associated with the last request. By doing so, we will be
> able to detect any memory operation  failures and determine
> whether the memory is still invalid.

The above two paragraphs seem to be redundant in the message they convey.

> On copy and clear migration signal fences upon completion.
> 
> On copy and clear migration, signal fences upon request
> completion to ensure that we have a reliable perpetuation of the
> operation outcome.

These two too. So I think the commit message can be polished a bit.

> Fixes: cf586021642d80 ("drm/i915/gt: Pipelined page migration")
> Reported-by: Matthew Auld <matthew.auld@intel.com>
> Suggested-by: Chris Wilson <chris@chris-wilson.co.uk>
> Signed-off-by: Andi Shyti <andi.shyti@linux.intel.com>
> Cc: stable@vger.kernel.org
> Reviewed-by: Matthew Auld <matthew.auld@intel.com>
> Acked-by: Nirmoy Das <nirmoy.das@intel.com>
> ---
>   drivers/gpu/drm/i915/gt/intel_migrate.c | 51 +++++++++++++++++++------
>   1 file changed, 39 insertions(+), 12 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/gt/intel_migrate.c b/drivers/gpu/drm/i915/gt/intel_migrate.c
> index 3f638f1987968..668c95af8cbcf 100644
> --- a/drivers/gpu/drm/i915/gt/intel_migrate.c
> +++ b/drivers/gpu/drm/i915/gt/intel_migrate.c
> @@ -742,13 +742,19 @@ intel_context_migrate_copy(struct intel_context *ce,
>   			dst_offset = 2 * CHUNK_SZ;
>   	}
>   
> +	/*
> +	 * While building the chain of requests, we need to ensure
> +	 * that no one can sneak into the timeline unnoticed.
> +	 */
> +	mutex_lock(&ce->timeline->mutex);
> +
>   	do {
>   		int len;
>   
> -		rq = i915_request_create(ce);
> +		rq = i915_request_create_locked(ce);
>   		if (IS_ERR(rq)) {
>   			err = PTR_ERR(rq);
> -			goto out_ce;
> +			break;
>   		}
>   
>   		if (deps) {
> @@ -878,10 +884,14 @@ intel_context_migrate_copy(struct intel_context *ce,
>   
>   		/* Arbitration is re-enabled between requests. */
>   out_rq:
> -		if (*out)
> +		i915_sw_fence_await(&rq->submit);
> +		i915_request_get(rq);
> +		i915_request_add_locked(rq);
> +		if (*out) {
> +			i915_sw_fence_complete(&(*out)->submit);
>   			i915_request_put(*out);

Could you help me understand this, please? I have a few questions.
First, what are the actual mechanics of fence error transfer here? I see
the submit fence is being blocked until the next request is submitted;
effectively, the previous request is only allowed to get on the hardware
after the next one has been queued up. But I don't immediately see what
that does in practice.

Second question relates to the need to hold the timeline mutex 
throughout. Presumably this is so two copy or migrate operations on the 
same context do not interleave, which can otherwise happen?

Would the error propagation be doable without holding the lock, by
chaining on the previous request's _completion_ fence? If so, I am sure
that would have a performance impact, because each chunk would need a
GPU<->CPU round trip to schedule. How much of an impact I don't know.
Maybe enlarging CHUNK_SZ to compensate is an option?

Or perhaps the perf hit would be bearable for stable backports only (much
smaller patch), and then for tip we can do this full-speed solution.

But yes, I would first want to understand the actual error propagation 
mechanism because sadly my working knowledge is a bit rusty.

> -		*out = i915_request_get(rq);
> -		i915_request_add(rq);
> +		}
> +		*out = rq;
>   
>   		if (err)
>   			break;
> @@ -905,7 +915,10 @@ intel_context_migrate_copy(struct intel_context *ce,
>   		cond_resched();
>   	} while (1);
>   
> -out_ce:
> +	mutex_unlock(&ce->timeline->mutex);
> +
> +	if (*out)
> +		i915_sw_fence_complete(&(*out)->submit);
>   	return err;
>   }
>   
> @@ -999,13 +1012,19 @@ intel_context_migrate_clear(struct intel_context *ce,
>   	if (HAS_64K_PAGES(i915) && is_lmem)
>   		offset = CHUNK_SZ;
>   
> +	/*
> +	 * While building the chain of requests, we need to ensure
> +	 * that no one can sneak into the timeline unnoticed.
> +	 */
> +	mutex_lock(&ce->timeline->mutex);
> +
>   	do {
>   		int len;
>   
> -		rq = i915_request_create(ce);
> +		rq = i915_request_create_locked(ce);
>   		if (IS_ERR(rq)) {
>   			err = PTR_ERR(rq);
> -			goto out_ce;
> +			break;
>   		}
>   
>   		if (deps) {
> @@ -1056,17 +1075,25 @@ intel_context_migrate_clear(struct intel_context *ce,
>   
>   		/* Arbitration is re-enabled between requests. */
>   out_rq:
> -		if (*out)
> +		i915_sw_fence_await(&rq->submit);
> +		i915_request_get(rq);
> +		i915_request_add_locked(rq);
> +		if (*out) {
> +			i915_sw_fence_complete(&(*out)->submit);
>   			i915_request_put(*out);
> -		*out = i915_request_get(rq);
> -		i915_request_add(rq);
> +		}
> +		*out = rq;

Btw, if all else fails, perhaps these two blocks can be consolidated into 
something like __chain_requests(rq, out) containing all these operations. 
Not sure how much that would save in the grand total.
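
For illustration only, such a helper might look roughly like the user-space
mock below. Everything in it is a stand-in: the mock_* types and functions
are hypothetical and only mimic the calls made in the out_rq: block of the
patch, and the helper is called chain_requests here because a
double-underscore name is reserved outside the kernel.

```c
/* User-space mock: stand-ins for the i915 types, NOT the driver's definitions. */
#include <assert.h>
#include <stddef.h>

struct mock_fence {
	int pending;			/* models outstanding holds on rq->submit */
};

struct mock_request {
	struct mock_fence submit;
	int refcount;
	int added;			/* set once queued on the timeline */
};

static void mock_fence_await(struct mock_fence *f)
{
	f->pending++;
}

static void mock_fence_complete(struct mock_fence *f)
{
	assert(f->pending > 0);
	f->pending--;
}

static void mock_request_get(struct mock_request *rq)
{
	rq->refcount++;
}

static void mock_request_put(struct mock_request *rq)
{
	assert(rq->refcount > 0);
	rq->refcount--;
}

static void mock_request_add_locked(struct mock_request *rq)
{
	rq->added = 1;
}

/*
 * Hypothetical consolidation of the out_rq: block: hold the new request's
 * submit fence, queue it, then release and drop the previous request.
 */
static void chain_requests(struct mock_request *rq, struct mock_request **out)
{
	mock_fence_await(&rq->submit);
	mock_request_get(rq);
	mock_request_add_locked(rq);
	if (*out) {
		mock_fence_complete(&(*out)->submit);
		mock_request_put(*out);
	}
	*out = rq;
}

/* Builds a three-request chain and returns how many submit fences are still held. */
static int demo_chain(void)
{
	struct mock_request rq[3] = { 0 };
	struct mock_request *out = NULL;
	int i, held = 0;

	for (i = 0; i < 3; i++)
		chain_requests(&rq[i], &out);

	for (i = 0; i < 3; i++)
		held += rq[i].submit.pending;

	return held;
}
```

In this mock only the last request of the chain is left with its submit
fence held when the loop ends, which corresponds to the final
i915_sw_fence_complete() the patch issues after dropping the mutex.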

Regards,

Tvrtko

> +
>   		if (err || !it.sg || !sg_dma_len(it.sg))
>   			break;
>   
>   		cond_resched();
>   	} while (1);
>   
> -out_ce:
> +	mutex_unlock(&ce->timeline->mutex);
> +
> +	if (*out)
> +		i915_sw_fence_complete(&(*out)->submit);
>   	return err;
>   }
>   


* Re: [Intel-gfx] [PATCH v5 5/5] drm/i915/gt: Make sure that errors are propagated through request chains
  2023-04-13 11:56   ` Tvrtko Ursulin
@ 2023-04-13 12:43     ` Tvrtko Ursulin
  2023-05-04  9:45       ` Andi Shyti
  2023-05-04  9:37     ` Andi Shyti
  1 sibling, 1 reply; 19+ messages in thread
From: Tvrtko Ursulin @ 2023-04-13 12:43 UTC (permalink / raw)
  To: Andi Shyti, intel-gfx, dri-devel, stable
  Cc: Maciej Patelczyk, Matthew Auld, Andrzej Hajda, Rodrigo Vivi,
	Nirmoy Das, Chris Wilson, Andi Shyti


On 13/04/2023 12:56, Tvrtko Ursulin wrote:
> 
> On 12/04/2023 12:33, Andi Shyti wrote:
>> Currently, when we perform operations such as clearing or copying
>> large blocks of memory, we generate multiple requests that are
>> executed in a chain.
>>
>> However, if one of these requests fails, we may not realize it
>> unless it happens to be the last request in the chain. This is
>> because errors are not properly propagated.
>>
>> For this we need to keep propagating the chain of fence
>> notification in order to always reach the final fence associated
>> to the final request.
>>
>> To address this issue, we need to ensure that the chain of fence
>> notifications is always propagated so that we can reach the final
>> fence associated with the last request. By doing so, we will be
>> able to detect any memory operation  failures and determine
>> whether the memory is still invalid.
> 
> Above two paragraphs seems to have redundancy in the message they convey.
> 
>> On copy and clear migration signal fences upon completion.
>>
>> On copy and clear migration, signal fences upon request
>> completion to ensure that we have a reliable perpetuation of the
>> operation outcome.
> 
> These two too. So I think commit message can be a bit polished.
> 
>> Fixes: cf586021642d80 ("drm/i915/gt: Pipelined page migration")
>> Reported-by: Matthew Auld <matthew.auld@intel.com>
>> Suggested-by: Chris Wilson <chris@chris-wilson.co.uk>
>> Signed-off-by: Andi Shyti <andi.shyti@linux.intel.com>
>> Cc: stable@vger.kernel.org
>> Reviewed-by: Matthew Auld <matthew.auld@intel.com>
>> Acked-by: Nirmoy Das <nirmoy.das@intel.com>
>> ---
>>   drivers/gpu/drm/i915/gt/intel_migrate.c | 51 +++++++++++++++++++------
>>   1 file changed, 39 insertions(+), 12 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/i915/gt/intel_migrate.c b/drivers/gpu/drm/i915/gt/intel_migrate.c
>> index 3f638f1987968..668c95af8cbcf 100644
>> --- a/drivers/gpu/drm/i915/gt/intel_migrate.c
>> +++ b/drivers/gpu/drm/i915/gt/intel_migrate.c
>> @@ -742,13 +742,19 @@ intel_context_migrate_copy(struct intel_context *ce,
>>               dst_offset = 2 * CHUNK_SZ;
>>       }
>> +    /*
>> +     * While building the chain of requests, we need to ensure
>> +     * that no one can sneak into the timeline unnoticed.
>> +     */
>> +    mutex_lock(&ce->timeline->mutex);
>> +
>>       do {
>>           int len;
>> -        rq = i915_request_create(ce);
>> +        rq = i915_request_create_locked(ce);
>>           if (IS_ERR(rq)) {
>>               err = PTR_ERR(rq);
>> -            goto out_ce;
>> +            break;
>>           }
>>           if (deps) {
>> @@ -878,10 +884,14 @@ intel_context_migrate_copy(struct intel_context *ce,
>>           /* Arbitration is re-enabled between requests. */
>>   out_rq:
>> -        if (*out)
>> +        i915_sw_fence_await(&rq->submit);
>> +        i915_request_get(rq);
>> +        i915_request_add_locked(rq);
>> +        if (*out) {
>> +            i915_sw_fence_complete(&(*out)->submit);
>>               i915_request_put(*out);
> 
> Could you help me understand this please. I have a few questions - 
> first, what are the actual mechanics of fence error transfer here? I see 
> the submit fence is being blocked until the next request is submitted - 
> effectively previous request is only allowed to get on the hardware 
> after the next one has been queued up. But I don't immediately see what 
> that does in practice.
> 
> Second question relates to the need to hold the timeline mutex 
> throughout. Presumably this is so two copy or migrate operations on the 
> same context do not interleave, which can otherwise happen?
> 
> Would the error propagation be doable without the lock held by chaining 
> on the previous request _completion_ fence? If so I am sure that would 
> have a performance impact, because chunk by chunk would need a GPU<->CPU 
> round trip to schedule. How much of an impact I don't know. Maybe 
> enlarging CHUNK_SZ to compensate is an option?
> 
> Or if the perf hit would be bearable for stable backports only (much 
> smaller patch) and then for tip we can do this full speed solution.
> 
> But yes, I would first want to understand the actual error propagation 
> mechanism because sadly my working knowledge is a bit rusty.

Another option, maybe: is this related to the revert of fence error 
propagation? If it is, and having that would avoid the need for this 
invasive fix, maybe we could unrevert 
3761baae908a7b5012be08d70fa553cc2eb82305 with edits to limit it to 
special contexts? If doable.

Regards,

Tvrtko

> 
>> -        *out = i915_request_get(rq);
>> -        i915_request_add(rq);
>> +        }
>> +        *out = rq;
>>           if (err)
>>               break;
>> @@ -905,7 +915,10 @@ intel_context_migrate_copy(struct intel_context *ce,
>>           cond_resched();
>>       } while (1);
>> -out_ce:
>> +    mutex_unlock(&ce->timeline->mutex);
>> +
>> +    if (*out)
>> +        i915_sw_fence_complete(&(*out)->submit);
>>       return err;
>>   }
>> @@ -999,13 +1012,19 @@ intel_context_migrate_clear(struct intel_context *ce,
>>       if (HAS_64K_PAGES(i915) && is_lmem)
>>           offset = CHUNK_SZ;
>> +    /*
>> +     * While building the chain of requests, we need to ensure
>> +     * that no one can sneak into the timeline unnoticed.
>> +     */
>> +    mutex_lock(&ce->timeline->mutex);
>> +
>>       do {
>>           int len;
>> -        rq = i915_request_create(ce);
>> +        rq = i915_request_create_locked(ce);
>>           if (IS_ERR(rq)) {
>>               err = PTR_ERR(rq);
>> -            goto out_ce;
>> +            break;
>>           }
>>           if (deps) {
>> @@ -1056,17 +1075,25 @@ intel_context_migrate_clear(struct intel_context *ce,
>>           /* Arbitration is re-enabled between requests. */
>>   out_rq:
>> -        if (*out)
>> +        i915_sw_fence_await(&rq->submit);
>> +        i915_request_get(rq);
>> +        i915_request_add_locked(rq);
>> +        if (*out) {
>> +            i915_sw_fence_complete(&(*out)->submit);
>>               i915_request_put(*out);
>> -        *out = i915_request_get(rq);
>> -        i915_request_add(rq);
>> +        }
>> +        *out = rq;
> 
> Btw if all else fails perhaps these two blocks can be consolidated by 
> something like __chain_requests(rq, out) and all these operations in it. 
> Not sure how much would that save in the grand total.
> 
> Regards,
> 
> Tvrtko
> 
>> +
>>           if (err || !it.sg || !sg_dma_len(it.sg))
>>               break;
>>           cond_resched();
>>       } while (1);
>> -out_ce:
>> +    mutex_unlock(&ce->timeline->mutex);
>> +
>> +    if (*out)
>> +        i915_sw_fence_complete(&(*out)->submit);
>>       return err;
>>   }


* Re: [Intel-gfx] [PATCH v5 5/5] drm/i915/gt: Make sure that errors are propagated through request chains
  2023-04-13 11:56   ` Tvrtko Ursulin
  2023-04-13 12:43     ` Tvrtko Ursulin
@ 2023-05-04  9:37     ` Andi Shyti
  1 sibling, 0 replies; 19+ messages in thread
From: Andi Shyti @ 2023-05-04  9:37 UTC (permalink / raw)
  To: Tvrtko Ursulin
  Cc: Andi Shyti, intel-gfx, Matthew Auld, stable, Andrzej Hajda,
	dri-devel, Rodrigo Vivi, Nirmoy Das, Chris Wilson,
	Maciej Patelczyk

Hi Tvrtko,

sorry for the very late reply, it's about time to bring this
patch up.

On Thu, Apr 13, 2023 at 12:56:00PM +0100, Tvrtko Ursulin wrote:
> 
> On 12/04/2023 12:33, Andi Shyti wrote:
> > Currently, when we perform operations such as clearing or copying
> > large blocks of memory, we generate multiple requests that are
> > executed in a chain.
> > 
> > However, if one of these requests fails, we may not realize it
> > unless it happens to be the last request in the chain. This is
> > because errors are not properly propagated.
> > 
> > For this we need to keep propagating the chain of fence
> > notification in order to always reach the final fence associated
> > to the final request.
> > 
> > To address this issue, we need to ensure that the chain of fence
> > notifications is always propagated so that we can reach the final
> > fence associated with the last request. By doing so, we will be
> > able to detect any memory operation  failures and determine
> > whether the memory is still invalid.
> 
> Above two paragraphs seems to have redundancy in the message they convey.
> 
> > On copy and clear migration signal fences upon completion.
> > 
> > On copy and clear migration, signal fences upon request
> > completion to ensure that we have a reliable perpetuation of the
> > operation outcome.
> 
> These two too. So I think commit message can be a bit polished.

In my attempt to be very explicit I may have overdone it.
I know that this kind of patch can bring some controversy.

I will rework the commit message.

> > Fixes: cf586021642d80 ("drm/i915/gt: Pipelined page migration")
> > Reported-by: Matthew Auld <matthew.auld@intel.com>
> > Suggested-by: Chris Wilson <chris@chris-wilson.co.uk>
> > Signed-off-by: Andi Shyti <andi.shyti@linux.intel.com>
> > Cc: stable@vger.kernel.org
> > Reviewed-by: Matthew Auld <matthew.auld@intel.com>
> > Acked-by: Nirmoy Das <nirmoy.das@intel.com>
> > ---
> >   drivers/gpu/drm/i915/gt/intel_migrate.c | 51 +++++++++++++++++++------
> >   1 file changed, 39 insertions(+), 12 deletions(-)
> > 
> > diff --git a/drivers/gpu/drm/i915/gt/intel_migrate.c b/drivers/gpu/drm/i915/gt/intel_migrate.c
> > index 3f638f1987968..668c95af8cbcf 100644
> > --- a/drivers/gpu/drm/i915/gt/intel_migrate.c
> > +++ b/drivers/gpu/drm/i915/gt/intel_migrate.c
> > @@ -742,13 +742,19 @@ intel_context_migrate_copy(struct intel_context *ce,
> >   			dst_offset = 2 * CHUNK_SZ;
> >   	}
> > +	/*
> > +	 * While building the chain of requests, we need to ensure
> > +	 * that no one can sneak into the timeline unnoticed.
> > +	 */
> > +	mutex_lock(&ce->timeline->mutex);
> > +
> >   	do {
> >   		int len;
> > -		rq = i915_request_create(ce);
> > +		rq = i915_request_create_locked(ce);
> >   		if (IS_ERR(rq)) {
> >   			err = PTR_ERR(rq);
> > -			goto out_ce;
> > +			break;
> >   		}
> >   		if (deps) {
> > @@ -878,10 +884,14 @@ intel_context_migrate_copy(struct intel_context *ce,
> >   		/* Arbitration is re-enabled between requests. */
> >   out_rq:
> > -		if (*out)
> > +		i915_sw_fence_await(&rq->submit);
> > +		i915_request_get(rq);
> > +		i915_request_add_locked(rq);
> > +		if (*out) {
> > +			i915_sw_fence_complete(&(*out)->submit);
> >   			i915_request_put(*out);
> 
> Could you help me understand this please. I have a few questions - first,
> what are the actual mechanics of fence error transfer here? I see the submit
> fence is being blocked until the next request is submitted - effectively
> previous request is only allowed to get on the hardware after the next one
> has been queued up. But I don't immediately see what that does in practice.

This is the basis of the error propagation. Without this
serialization, for big operations like migrate and copy, we would
only catch the error in the last rq.
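
As a rough illustration of why the ordering matters, here is a toy
user-space model, not driver code. It assumes (this is not stated in the
thread) that a request queued behind an already-completed predecessor
does not await it and so cannot inherit its error; holding the
predecessor's submit fence until the next request is queued rules that
case out. All toy_* names are hypothetical, and -5 merely stands in for
an error code.

```c
/* Toy model of error propagation along a request chain; NOT i915 code. */
#include <assert.h>
#include <stddef.h>

struct toy_request {
	int own_error;			/* error raised by this request itself */
	int error;			/* error visible on its fence once done */
	int held;			/* submit fence held: may not run yet */
	int done;
	struct toy_request *dep;	/* previous request we wait on, if any */
};

/* Run a request if permitted; it inherits the error of its dependency. */
static void toy_run(struct toy_request *rq)
{
	if (rq->held || rq->done)
		return;
	if (rq->dep && !rq->dep->done)
		return;			/* dependency has not completed yet */

	rq->error = rq->own_error;
	if (!rq->error && rq->dep)
		rq->error = rq->dep->error;
	rq->done = 1;
}

/* Queue rq behind prev. ASSUMPTION: a completed prev is not awaited. */
static void toy_add(struct toy_request *rq, struct toy_request *prev)
{
	rq->dep = (prev && !prev->done) ? prev : NULL;
}

/*
 * Two-request chain in which the first request fails.
 * Returns the error seen on the last request.
 */
static int toy_chain(int hold_submit)
{
	struct toy_request a = { .own_error = -5 };
	struct toy_request b = { 0 };

	a.held = hold_submit;
	toy_add(&a, NULL);
	toy_run(&a);		/* the "GPU" races ahead unless a is held */

	toy_add(&b, &a);	/* the next link in the chain is queued */
	a.held = 0;		/* release a's submit fence (no-op if not held) */
	toy_run(&a);
	toy_run(&b);

	return b.error;
}
```

Under these assumptions, toy_chain(0) loses the error of the first
request, while toy_chain(1) carries it through to the last one.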

> Second question relates to the need to hold the timeline mutex throughout.
> Presumably this is so two copy or migrate operations on the same context do
> not interleave, which can otherwise happen?
> 
> Would the error propagation be doable without the lock held by chaining on
> the previous request _completion_ fence? If so I am sure that would have a
> performance impact, because chunk by chunk would need a GPU<->CPU round trip
> to schedule. How much of an impact I don't know. Maybe enlarging CHUNK_SZ to
> compensate is an option?

The need for a mutex lock comes from adding the throttle during
request creation, which ensures no pending requests are being
served.

I will copy and paste from Chris's review, which did not make it
to the mailing list:

Adding a large throttle before the mutex makes the race less
likely, but to overcome that just increase the number of
simultaneous clients fighting for ring space.

If we hold the lock while constructing the chain, no one else may
inject themselves between links in our chain. If we do not, we
may end up with

ABCDEFGHI
^head   ^tail

Then in order for A to submit its next request it has to wait
upon its previous request. But since we are holding the submit
fence for A, it will not be executed until after we complete our
submission. Boom.

Andi

> Or if the perf hit would be bearable for stable backports only (much smaller
> patch) and then for tip we can do this full speed solution.
> 
> But yes, I would first want to understand the actual error propagation
> mechanism because sadly my working knowledge is a bit rusty.
> 
> > -		*out = i915_request_get(rq);
> > -		i915_request_add(rq);
> > +		}
> > +		*out = rq;
> >   		if (err)
> >   			break;
> > @@ -905,7 +915,10 @@ intel_context_migrate_copy(struct intel_context *ce,
> >   		cond_resched();
> >   	} while (1);
> > -out_ce:
> > +	mutex_unlock(&ce->timeline->mutex);
> > +
> > +	if (*out)
> > +		i915_sw_fence_complete(&(*out)->submit);
> >   	return err;
> >   }
> > @@ -999,13 +1012,19 @@ intel_context_migrate_clear(struct intel_context *ce,
> >   	if (HAS_64K_PAGES(i915) && is_lmem)
> >   		offset = CHUNK_SZ;
> > +	/*
> > +	 * While building the chain of requests, we need to ensure
> > +	 * that no one can sneak into the timeline unnoticed.
> > +	 */
> > +	mutex_lock(&ce->timeline->mutex);
> > +
> >   	do {
> >   		int len;
> > -		rq = i915_request_create(ce);
> > +		rq = i915_request_create_locked(ce);
> >   		if (IS_ERR(rq)) {
> >   			err = PTR_ERR(rq);
> > -			goto out_ce;
> > +			break;
> >   		}
> >   		if (deps) {
> > @@ -1056,17 +1075,25 @@ intel_context_migrate_clear(struct intel_context *ce,
> >   		/* Arbitration is re-enabled between requests. */
> >   out_rq:
> > -		if (*out)
> > +		i915_sw_fence_await(&rq->submit);
> > +		i915_request_get(rq);
> > +		i915_request_add_locked(rq);
> > +		if (*out) {
> > +			i915_sw_fence_complete(&(*out)->submit);
> >   			i915_request_put(*out);
> > -		*out = i915_request_get(rq);
> > -		i915_request_add(rq);
> > +		}
> > +		*out = rq;
> 
> Btw if all else fails perhaps these two blocks can be consolidated by
> something like __chain_requests(rq, out) and all these operations in it. Not
> sure how much would that save in the grand total.
> 
> Regards,
> 
> Tvrtko
> 
> > +
> >   		if (err || !it.sg || !sg_dma_len(it.sg))
> >   			break;
> >   		cond_resched();
> >   	} while (1);
> > -out_ce:
> > +	mutex_unlock(&ce->timeline->mutex);
> > +
> > +	if (*out)
> > +		i915_sw_fence_complete(&(*out)->submit);
> >   	return err;
> >   }


* Re: [Intel-gfx] [PATCH v5 5/5] drm/i915/gt: Make sure that errors are propagated through request chains
  2023-04-13 12:43     ` Tvrtko Ursulin
@ 2023-05-04  9:45       ` Andi Shyti
  0 siblings, 0 replies; 19+ messages in thread
From: Andi Shyti @ 2023-05-04  9:45 UTC (permalink / raw)
  To: Tvrtko Ursulin
  Cc: Andi Shyti, intel-gfx, Matthew Auld, stable, Andrzej Hajda,
	dri-devel, Rodrigo Vivi, Nirmoy Das, Chris Wilson,
	Maciej Patelczyk

Hi Tvrtko,

> Another option - maybe - is this related to revert of fence error
> propagation? If it is and having that would avoid the need for this invasive
> fix, maybe we unrevert 3761baae908a7b5012be08d70fa553cc2eb82305 with edits
> to limit to special contexts? If doable..

I think that is not enough, as we want to reach the last request
and fence submitted anyway. Right?

I guess this commit should be reverted anyway.

Andi


Thread overview: 19+ messages
2023-04-12 11:33 [Intel-gfx] [PATCH v5 0/5] Fix error propagation amongst request Andi Shyti
2023-04-12 11:33 ` [Intel-gfx] [PATCH v5 1/5] drm/i915/gt: Add intel_context_timeline_is_locked helper Andi Shyti
2023-04-12 12:12   ` Andrzej Hajda
2023-04-12 11:33 ` [Intel-gfx] [PATCH v5 2/5] drm/i915: Create the locked version of the request create Andi Shyti
2023-04-12 13:04   ` Andrzej Hajda
2023-04-13  8:53     ` Andi Shyti
2023-04-12 11:33 ` [Intel-gfx] [PATCH v5 3/5] drm/i915: Create the locked version of the request add Andi Shyti
2023-04-12 13:06   ` Andrzej Hajda
2023-04-13  8:57     ` Andi Shyti
2023-04-12 11:33 ` [Intel-gfx] [PATCH v5 4/5] drm/i915: Throttle for ringspace prior to taking the timeline mutex Andi Shyti
2023-04-12 11:33 ` [Intel-gfx] [PATCH v5 5/5] drm/i915/gt: Make sure that errors are propagated through request chains Andi Shyti
2023-04-13 11:56   ` Tvrtko Ursulin
2023-04-13 12:43     ` Tvrtko Ursulin
2023-05-04  9:45       ` Andi Shyti
2023-05-04  9:37     ` Andi Shyti
2023-04-12 17:28 ` [Intel-gfx] ✗ Fi.CI.CHECKPATCH: warning for Fix error propagation amongst request (rev3) Patchwork
2023-04-12 17:28 ` [Intel-gfx] ✗ Fi.CI.SPARSE: " Patchwork
2023-04-12 17:41 ` [Intel-gfx] ✓ Fi.CI.BAT: success " Patchwork
2023-04-13  3:47 ` [Intel-gfx] ✓ Fi.CI.IGT: " Patchwork
