* [PATCH 0/3] Use FAST_REQUEST mechanism for non-blocking H2G calls
@ 2023-05-26 23:55 ` John.C.Harrison
  0 siblings, 0 replies; 17+ messages in thread
From: John.C.Harrison @ 2023-05-26 23:55 UTC (permalink / raw)
  To: Intel-GFX; +Cc: John Harrison, DRI-Devel

From: John Harrison <John.C.Harrison@Intel.com>

The GuC interface supports a mechanism for returning errors against
non-blocking H2G calls. This is called FAST_REQUEST. Because the call
is asynchronous, matching a returned error to the request that caused
it is difficult. However, getting any error back at all is better than
getting none.

If any such errors are reported, then extra tracking support can be
compiled in for manual debugging.

Signed-off-by: John Harrison <John.C.Harrison@Intel.com>


Michal Wajdeczko (3):
  drm/i915/guc: Use FAST_REQUEST for non-blocking H2G calls
  drm/i915/guc: Update log for unsolicited CTB response
  drm/i915/guc: Track all sent actions to GuC

 drivers/gpu/drm/i915/Kconfig.debug            |  1 +
 .../gpu/drm/i915/gt/uc/abi/guc_messages_abi.h | 30 +++++++
 drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c     | 79 ++++++++++++++++---
 drivers/gpu/drm/i915/gt/uc/intel_guc_ct.h     | 11 +++
 4 files changed, 112 insertions(+), 9 deletions(-)

-- 
2.39.1


^ permalink raw reply	[flat|nested] 17+ messages in thread

* [PATCH 1/3] drm/i915/guc: Use FAST_REQUEST for non-blocking H2G calls
  2023-05-26 23:55 ` [Intel-gfx] " John.C.Harrison
@ 2023-05-26 23:55   ` John.C.Harrison
  -1 siblings, 0 replies; 17+ messages in thread
From: John.C.Harrison @ 2023-05-26 23:55 UTC (permalink / raw)
  To: Intel-GFX; +Cc: John Harrison, DRI-Devel, Michal Wajdeczko

From: Michal Wajdeczko <michal.wajdeczko@intel.com>

In addition to the already defined REQUEST HXG message format,
which is used when the sender expects some confirmation or data,
the HXG protocol includes a definition of the FAST REQUEST message,
which may be used when the sender does not expect any useful data
to be returned.

Using this instead of GUC_HXG_TYPE_EVENT for non-blocking CTB requests
will allow GuC to send back GUC_HXG_TYPE_RESPONSE_FAILURE in case of
errors.

Note that it is not possible to return such errors to the caller,
since this is for non-blocking calls and the related fence is not
stored. Instead, such messages are treated as unexpected, which
indicates potential GuC misprogramming that warrants extra
debugging effort.

Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
---
 .../gpu/drm/i915/gt/uc/abi/guc_messages_abi.h | 30 +++++++++++++++++++
 drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c     |  6 ++--
 2 files changed, 33 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/uc/abi/guc_messages_abi.h b/drivers/gpu/drm/i915/gt/uc/abi/guc_messages_abi.h
index 7d5ba4d97d708..98eb4f46572b9 100644
--- a/drivers/gpu/drm/i915/gt/uc/abi/guc_messages_abi.h
+++ b/drivers/gpu/drm/i915/gt/uc/abi/guc_messages_abi.h
@@ -24,6 +24,7 @@
  *  |   | 30:28 | **TYPE** - message type                                      |
  *  |   |       |   - _`GUC_HXG_TYPE_REQUEST` = 0                              |
  *  |   |       |   - _`GUC_HXG_TYPE_EVENT` = 1                                |
+ *  |   |       |   - _`GUC_HXG_TYPE_FAST_REQUEST` = 2                         |
  *  |   |       |   - _`GUC_HXG_TYPE_NO_RESPONSE_BUSY` = 3                     |
  *  |   |       |   - _`GUC_HXG_TYPE_NO_RESPONSE_RETRY` = 5                    |
  *  |   |       |   - _`GUC_HXG_TYPE_RESPONSE_FAILURE` = 6                     |
@@ -46,6 +47,7 @@
 #define GUC_HXG_MSG_0_TYPE			(0x7 << 28)
 #define   GUC_HXG_TYPE_REQUEST			0u
 #define   GUC_HXG_TYPE_EVENT			1u
+#define   GUC_HXG_TYPE_FAST_REQUEST		2u
 #define   GUC_HXG_TYPE_NO_RESPONSE_BUSY		3u
 #define   GUC_HXG_TYPE_NO_RESPONSE_RETRY	5u
 #define   GUC_HXG_TYPE_RESPONSE_FAILURE		6u
@@ -89,6 +91,34 @@
 #define GUC_HXG_REQUEST_MSG_0_ACTION		(0xffff << 0)
 #define GUC_HXG_REQUEST_MSG_n_DATAn		GUC_HXG_MSG_n_PAYLOAD
 
+/**
+ * DOC: HXG Fast Request
+ *
+ * The `HXG Fast Request`_ message should be used to initiate asynchronous
+ * activity for which confirmation or return data is not expected.
+ *
+ * If confirmation is required then `HXG Request`_ shall be used instead.
+ *
+ * The recipient of this message may only use `HXG Failure`_ message if it was
+ * unable to accept this request (like invalid data).
+ *
+ * Format of `HXG Fast Request`_ message is same as `HXG Request`_ except @TYPE.
+ *
+ *  +---+-------+--------------------------------------------------------------+
+ *  |   | Bits  | Description                                                  |
+ *  +===+=======+==============================================================+
+ *  | 0 |    31 | ORIGIN - see `HXG Message`_                                  |
+ *  |   +-------+--------------------------------------------------------------+
+ *  |   | 30:28 | TYPE = `GUC_HXG_TYPE_FAST_REQUEST`_                          |
+ *  |   +-------+--------------------------------------------------------------+
+ *  |   | 27:16 | DATA0 - see `HXG Request`_                                   |
+ *  |   +-------+--------------------------------------------------------------+
+ *  |   |  15:0 | ACTION - see `HXG Request`_                                  |
+ *  +---+-------+--------------------------------------------------------------+
+ *  |...|       | DATAn - see `HXG Request`_                                   |
+ *  +---+-------+--------------------------------------------------------------+
+ */
+
 /**
  * DOC: HXG Event
  *
diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c b/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c
index a22e33f37cae6..af52ed4ffc7fb 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c
+++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c
@@ -426,11 +426,11 @@ static int ct_write(struct intel_guc_ct *ct,
 		 FIELD_PREP(GUC_CTB_MSG_0_NUM_DWORDS, len) |
 		 FIELD_PREP(GUC_CTB_MSG_0_FENCE, fence);
 
-	type = (flags & INTEL_GUC_CT_SEND_NB) ? GUC_HXG_TYPE_EVENT :
+	type = (flags & INTEL_GUC_CT_SEND_NB) ? GUC_HXG_TYPE_FAST_REQUEST :
 		GUC_HXG_TYPE_REQUEST;
 	hxg = FIELD_PREP(GUC_HXG_MSG_0_TYPE, type) |
-		FIELD_PREP(GUC_HXG_EVENT_MSG_0_ACTION |
-			   GUC_HXG_EVENT_MSG_0_DATA0, action[0]);
+		FIELD_PREP(GUC_HXG_REQUEST_MSG_0_ACTION |
+			   GUC_HXG_REQUEST_MSG_0_DATA0, action[0]);
 
 	CT_DEBUG(ct, "writing (tail %u) %*ph %*ph %*ph\n",
 		 tail, 4, &header, 4, &hxg, 4 * (len - 1), &action[1]);
-- 
2.39.1


^ permalink raw reply related	[flat|nested] 17+ messages in thread
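
The header selection in the ct_write() hunk above can be tried outside the kernel. The following is a minimal user-space sketch, not the driver code: the bit positions and type values are taken from the ABI comment in the patch, while every `SKETCH_`/`sketch_` name, the flag value, and the open-coded masking (standing in for the kernel's FIELD_PREP()) are assumptions made for illustration.

```c
#include <stdint.h>

/* Bit layout and type values as documented in the patch above. */
#define SKETCH_HXG_MSG_0_TYPE_SHIFT	28
#define SKETCH_HXG_MSG_0_TYPE		(0x7u << SKETCH_HXG_MSG_0_TYPE_SHIFT)
#define SKETCH_HXG_MSG_0_DATA0		(0xfffu << 16)	/* bits 27:16 */
#define SKETCH_HXG_MSG_0_ACTION		(0xffffu << 0)	/* bits 15:0 */
#define SKETCH_HXG_TYPE_REQUEST		0u
#define SKETCH_HXG_TYPE_FAST_REQUEST	2u

/* Hypothetical stand-in for the driver's INTEL_GUC_CT_SEND_NB flag. */
#define SKETCH_SEND_NB			(1u << 0)

/*
 * Build the first HXG dword: non-blocking sends are marked FAST_REQUEST,
 * blocking sends stay REQUEST. The low 28 bits (DATA0 | ACTION) are copied
 * from action0 unchanged.
 */
static uint32_t sketch_hxg_header(uint32_t flags, uint32_t action0)
{
	uint32_t type = (flags & SKETCH_SEND_NB) ?
			SKETCH_HXG_TYPE_FAST_REQUEST : SKETCH_HXG_TYPE_REQUEST;

	return ((type << SKETCH_HXG_MSG_0_TYPE_SHIFT) & SKETCH_HXG_MSG_0_TYPE) |
	       (action0 & (SKETCH_HXG_MSG_0_DATA0 | SKETCH_HXG_MSG_0_ACTION));
}
```

With a made-up action code of 0x1000, sketch_hxg_header(SKETCH_SEND_NB, 0x1000) yields 0x20001000 and sketch_hxg_header(0, 0x1000) yields 0x00001000, so the two message kinds differ only in the TYPE field.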

* [PATCH 2/3] drm/i915/guc: Update log for unsolicited CTB response
  2023-05-26 23:55 ` [Intel-gfx] " John.C.Harrison
@ 2023-05-26 23:55   ` John.C.Harrison
  -1 siblings, 0 replies; 17+ messages in thread
From: John.C.Harrison @ 2023-05-26 23:55 UTC (permalink / raw)
  To: Intel-GFX; +Cc: John Harrison, DRI-Devel, Michal Wajdeczko

From: Michal Wajdeczko <michal.wajdeczko@intel.com>

Instead of printing the message fence twice, include the HXG header of
the unexpected message and its length.

Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
---
 drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c b/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c
index af52ed4ffc7fb..3a71bb582089e 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c
+++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c
@@ -994,9 +994,8 @@ static int ct_handle_response(struct intel_guc_ct *ct, struct ct_incoming_msg *r
 		break;
 	}
 	if (!found) {
-		CT_ERROR(ct, "Unsolicited response (fence %u)\n", fence);
-		CT_ERROR(ct, "Could not find fence=%u, last_fence=%u\n", fence,
-			 ct->requests.last_fence);
+		CT_ERROR(ct, "Unsolicited response message: len %u, data %#x (fence %u, last %u)\n",
+			 len, hxg[0], fence, ct->requests.last_fence);
 		list_for_each_entry(req, &ct->requests.pending, link)
 			CT_ERROR(ct, "request %u awaits response\n",
 				 req->fence);
-- 
2.39.1


^ permalink raw reply related	[flat|nested] 17+ messages in thread
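
The "data %#x" value in the new log line is the first HXG dword, so its TYPE field can be read back when triaging a report. Below is a small user-space sketch of that decoding, assuming only the bit positions (ORIGIN in bit 31, TYPE in bits 30:28) and the type values listed in patch 1/3; the `sketch_` helpers are hypothetical and are not part of the driver.

```c
#include <stdint.h>

#define SKETCH_HXG_MSG_0_ORIGIN	(0x1u << 31)
#define SKETCH_HXG_MSG_0_TYPE	(0x7u << 28)

/* Extract the raw ORIGIN bit (31) from a logged HXG header dword. */
static unsigned int sketch_hxg_origin(uint32_t hxg0)
{
	return (hxg0 & SKETCH_HXG_MSG_0_ORIGIN) >> 31;
}

/* Extract the TYPE field (30:28) from a logged HXG header dword. */
static unsigned int sketch_hxg_type(uint32_t hxg0)
{
	return (hxg0 & SKETCH_HXG_MSG_0_TYPE) >> 28;
}

/* Map a TYPE value to its name; only values shown in patch 1/3 are listed. */
static const char *sketch_hxg_type_name(unsigned int type)
{
	switch (type) {
	case 0: return "REQUEST";
	case 1: return "EVENT";
	case 2: return "FAST_REQUEST";
	case 3: return "NO_RESPONSE_BUSY";
	case 5: return "NO_RESPONSE_RETRY";
	case 6: return "RESPONSE_FAILURE";
	default: return "not listed in this series";
	}
}
```

For example, a logged value whose top nibble is 0xe decodes to ORIGIN bit 1 and TYPE 6 (RESPONSE_FAILURE), which is the failure reply that patch 1/3 makes possible for non-blocking sends.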

* [PATCH 3/3] drm/i915/guc: Track all sent actions to GuC
  2023-05-26 23:55 ` [Intel-gfx] " John.C.Harrison
@ 2023-05-26 23:55   ` John.C.Harrison
  -1 siblings, 0 replies; 17+ messages in thread
From: John.C.Harrison @ 2023-05-26 23:55 UTC (permalink / raw)
  To: Intel-GFX; +Cc: John Harrison, DRI-Devel, Michal Wajdeczko

From: Michal Wajdeczko <michal.wajdeczko@intel.com>

To make it easier to debug unexpected error responses from GuC that
might be related to non-blocking fast requests, track the action code
(and the call stack, if built with the DEBUG_GUC config) for every H2G
request.

Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
---
 drivers/gpu/drm/i915/Kconfig.debug        |  1 +
 drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c | 68 ++++++++++++++++++++++-
 drivers/gpu/drm/i915/gt/uc/intel_guc_ct.h | 11 ++++
 3 files changed, 77 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/i915/Kconfig.debug b/drivers/gpu/drm/i915/Kconfig.debug
index 47e845353ffad..2d21930d55015 100644
--- a/drivers/gpu/drm/i915/Kconfig.debug
+++ b/drivers/gpu/drm/i915/Kconfig.debug
@@ -157,6 +157,7 @@ config DRM_I915_SW_FENCE_CHECK_DAG
 config DRM_I915_DEBUG_GUC
 	bool "Enable additional driver debugging for GuC"
 	depends on DRM_I915
+	select STACKDEPOT
 	default n
 	help
 	  Choose this option to turn on extra driver debugging that may affect
diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c b/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c
index 3a71bb582089e..4aa903be1317b 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c
+++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c
@@ -376,6 +376,24 @@ void intel_guc_ct_disable(struct intel_guc_ct *ct)
 	}
 }
 
+#if IS_ENABLED(CONFIG_DRM_I915_DEBUG_GEM)
+static void ct_track_lost_and_found(struct intel_guc_ct *ct, u32 fence, u32 action)
+{
+	unsigned int lost = fence % ARRAY_SIZE(ct->requests.lost_and_found);
+#if IS_ENABLED(CONFIG_DRM_I915_DEBUG_GUC)
+	unsigned long entries[SZ_32];
+	unsigned int n;
+
+	n = stack_trace_save(entries, ARRAY_SIZE(entries), 1);
+
+	/* May be called under spinlock, so avoid sleeping */
+	ct->requests.lost_and_found[lost].stack = stack_depot_save(entries, n, GFP_NOWAIT);
+#endif
+	ct->requests.lost_and_found[lost].fence = fence;
+	ct->requests.lost_and_found[lost].action = action;
+}
+#endif
+
 static u32 ct_get_next_fence(struct intel_guc_ct *ct)
 {
 	/* For now it's trivial */
@@ -447,6 +465,11 @@ static int ct_write(struct intel_guc_ct *ct,
 	}
 	GEM_BUG_ON(tail > size);
 
+#if IS_ENABLED(CONFIG_DRM_I915_DEBUG_GEM)
+	ct_track_lost_and_found(ct, fence,
+				FIELD_GET(GUC_HXG_EVENT_MSG_0_ACTION, action[0]));
+#endif
+
 	/*
 	 * make sure H2G buffer update and LRC tail update (if this triggering a
 	 * submission) are visible before updating the descriptor tail
@@ -953,6 +976,43 @@ static int ct_read(struct intel_guc_ct *ct, struct ct_incoming_msg **msg)
 	return -EPIPE;
 }
 
+#if IS_ENABLED(CONFIG_DRM_I915_DEBUG_GEM)
+static bool ct_check_lost_and_found(struct intel_guc_ct *ct, u32 fence)
+{
+	unsigned int n;
+	char *buf = NULL;
+	bool found = false;
+
+	lockdep_assert_held(&ct->requests.lock);
+
+	for (n = 0; n < ARRAY_SIZE(ct->requests.lost_and_found); n++) {
+		if (ct->requests.lost_and_found[n].fence != fence)
+			continue;
+		found = true;
+
+#if IS_ENABLED(CONFIG_DRM_I915_DEBUG_GUC)
+		buf = kmalloc(SZ_4K, GFP_NOWAIT);
+		if (buf && stack_depot_snprint(ct->requests.lost_and_found[n].stack,
+					       buf, SZ_4K, 0)) {
+			CT_ERROR(ct, "Fence %u was used by action %#04x sent at\n%s",
+				 fence, ct->requests.lost_and_found[n].action, buf);
+			break;
+		}
+#endif
+		CT_ERROR(ct, "Fence %u was used by action %#04x\n",
+			 fence, ct->requests.lost_and_found[n].action);
+		break;
+	}
+	kfree(buf);
+	return found;
+}
+#else
+static bool ct_check_lost_and_found(struct intel_guc_ct *ct, u32 fence)
+{
+	return false;
+}
+#endif
+
 static int ct_handle_response(struct intel_guc_ct *ct, struct ct_incoming_msg *response)
 {
 	u32 len = FIELD_GET(GUC_CTB_MSG_0_NUM_DWORDS, response->msg[0]);
@@ -996,9 +1056,11 @@ static int ct_handle_response(struct intel_guc_ct *ct, struct ct_incoming_msg *r
 	if (!found) {
 		CT_ERROR(ct, "Unsolicited response message: len %u, data %#x (fence %u, last %u)\n",
 			 len, hxg[0], fence, ct->requests.last_fence);
-		list_for_each_entry(req, &ct->requests.pending, link)
-			CT_ERROR(ct, "request %u awaits response\n",
-				 req->fence);
+		if (!ct_check_lost_and_found(ct, fence)) {
+			list_for_each_entry(req, &ct->requests.pending, link)
+				CT_ERROR(ct, "request %u awaits response\n",
+					 req->fence);
+		}
 		err = -ENOKEY;
 	}
 	spin_unlock_irqrestore(&ct->requests.lock, flags);
diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.h b/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.h
index 818415b64f4d1..58e42901ff498 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.h
+++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.h
@@ -8,6 +8,7 @@
 
 #include <linux/interrupt.h>
 #include <linux/spinlock.h>
+#include <linux/stackdepot.h>
 #include <linux/workqueue.h>
 #include <linux/ktime.h>
 #include <linux/wait.h>
@@ -81,6 +82,16 @@ struct intel_guc_ct {
 
 		struct list_head incoming; /* incoming requests */
 		struct work_struct worker; /* handler for incoming requests */
+
+#if IS_ENABLED(CONFIG_DRM_I915_DEBUG_GEM)
+		struct {
+			u16 fence;
+			u16 action;
+#if IS_ENABLED(CONFIG_DRM_I915_DEBUG_GUC)
+			depot_stack_handle_t stack;
+#endif
+		} lost_and_found[SZ_16];
+#endif
 	} requests;
 
 	/** @stall_time: time of first time a CTB submission is stalled */
-- 
2.39.1


^ permalink raw reply related	[flat|nested] 17+ messages in thread
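
The lost-and-found bookkeeping in this patch is a small fixed-size table indexed by fence modulo the table size, so only the most recent sends can be identified. The sketch below models that behaviour in user space under stated assumptions: 16 slots and 16-bit fence/action fields as in the struct added to intel_guc_ct.h, with locking and the stack depot handle left out. All `sketch_` names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_SLOTS 16	/* mirrors lost_and_found[SZ_16] in the patch */

struct sketch_entry {
	uint16_t fence;
	uint16_t action;
};

struct sketch_table {
	struct sketch_entry slot[SKETCH_SLOTS];
};

/* Record a sent request; an older entry in the same slot is overwritten. */
static void sketch_track(struct sketch_table *t, uint32_t fence, uint32_t action)
{
	unsigned int i = fence % SKETCH_SLOTS;

	t->slot[i].fence = (uint16_t)fence;
	t->slot[i].action = (uint16_t)action;
}

/* Linear scan for a fence; on a hit, report the action that used it. */
static bool sketch_lookup(const struct sketch_table *t, uint32_t fence,
			  uint16_t *action)
{
	unsigned int n;

	for (n = 0; n < SKETCH_SLOTS; n++) {
		if (t->slot[n].fence != fence)
			continue;
		*action = t->slot[n].action;
		return true;
	}
	return false;
}
```

After tracking fences 1 through 20 in order, a lookup of fence 20 succeeds, while fence 3 is no longer found because fence 19 landed in the same slot. In this sketch a zero-initialised table also matches fence 0 before anything has been tracked, which is worth keeping in mind when reading its results.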

* [Intel-gfx] ✗ Fi.CI.SPARSE: warning for Use FAST_REQUEST mechanism for non-blocking H2G calls
  2023-05-26 23:55 ` [Intel-gfx] " John.C.Harrison
                   ` (3 preceding siblings ...)
  (?)
@ 2023-05-27  0:22 ` Patchwork
  -1 siblings, 0 replies; 17+ messages in thread
From: Patchwork @ 2023-05-27  0:22 UTC (permalink / raw)
  To: john.c.harrison; +Cc: intel-gfx

== Series Details ==

Series: Use FAST_REQUEST mechanism for non-blocking H2G calls
URL   : https://patchwork.freedesktop.org/series/118450/
State : warning

== Summary ==

Error: dim sparse failed
Sparse version: v0.6.2
Fast mode used, each commit won't be checked separately.



^ permalink raw reply	[flat|nested] 17+ messages in thread

* [Intel-gfx] ✓ Fi.CI.BAT: success for Use FAST_REQUEST mechanism for non-blocking H2G calls
  2023-05-26 23:55 ` [Intel-gfx] " John.C.Harrison
                   ` (4 preceding siblings ...)
  (?)
@ 2023-05-27  0:35 ` Patchwork
  -1 siblings, 0 replies; 17+ messages in thread
From: Patchwork @ 2023-05-27  0:35 UTC (permalink / raw)
  To: john.c.harrison; +Cc: intel-gfx

== Series Details ==

Series: Use FAST_REQUEST mechanism for non-blocking H2G calls
URL   : https://patchwork.freedesktop.org/series/118450/
State : success

== Summary ==

CI Bug Log - changes from CI_DRM_13196 -> Patchwork_118450v1
====================================================

Summary
-------

  **SUCCESS**

  No regressions found.

  External URL: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_118450v1/index.html

Participating hosts (39 -> 38)
------------------------------

  Additional (1): bat-mtlp-6 
  Missing    (2): bat-atsm-1 fi-snb-2520m 

Known issues
------------

  Here are the changes found in Patchwork_118450v1 that come from known issues:

### IGT changes ###

#### Issues hit ####

  * igt@i915_module_load@reload:
    - bat-jsl-3:          [PASS][1] -> [INCOMPLETE][2] ([i915#8321])
   [1]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_13196/bat-jsl-3/igt@i915_module_load@reload.html
   [2]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_118450v1/bat-jsl-3/igt@i915_module_load@reload.html

  * igt@i915_selftest@live@requests:
    - bat-rpls-2:         [PASS][3] -> [ABORT][4] ([i915#4983] / [i915#7913])
   [3]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_13196/bat-rpls-2/igt@i915_selftest@live@requests.html
   [4]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_118450v1/bat-rpls-2/igt@i915_selftest@live@requests.html

  * igt@kms_chamelium_hpd@common-hpd-after-suspend:
    - fi-skl-6600u:       NOTRUN -> [SKIP][5] ([fdo#109271])
   [5]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_118450v1/fi-skl-6600u/igt@kms_chamelium_hpd@common-hpd-after-suspend.html

  * igt@kms_pipe_crc_basic@nonblocking-crc-frame-sequence:
    - bat-dg2-11:         NOTRUN -> [SKIP][6] ([i915#1845] / [i915#5354])
   [6]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_118450v1/bat-dg2-11/igt@kms_pipe_crc_basic@nonblocking-crc-frame-sequence.html

  
#### Possible fixes ####

  * igt@gem_exec_suspend@basic-s3@smem:
    - fi-skl-6600u:       [ABORT][7] -> [PASS][8]
   [7]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_13196/fi-skl-6600u/igt@gem_exec_suspend@basic-s3@smem.html
   [8]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_118450v1/fi-skl-6600u/igt@gem_exec_suspend@basic-s3@smem.html

  * igt@i915_selftest@live@gt_heartbeat:
    - fi-kbl-soraka:      [DMESG-FAIL][9] ([i915#5334] / [i915#7872]) -> [PASS][10]
   [9]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_13196/fi-kbl-soraka/igt@i915_selftest@live@gt_heartbeat.html
   [10]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_118450v1/fi-kbl-soraka/igt@i915_selftest@live@gt_heartbeat.html

  * igt@i915_selftest@live@gt_mocs:
    - {bat-mtlp-8}:       [DMESG-FAIL][11] ([i915#7059]) -> [PASS][12]
   [11]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_13196/bat-mtlp-8/igt@i915_selftest@live@gt_mocs.html
   [12]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_118450v1/bat-mtlp-8/igt@i915_selftest@live@gt_mocs.html

  * igt@i915_selftest@live@slpc:
    - {bat-mtlp-8}:       [DMESG-WARN][13] ([i915#6367]) -> [PASS][14]
   [13]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_13196/bat-mtlp-8/igt@i915_selftest@live@slpc.html
   [14]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_118450v1/bat-mtlp-8/igt@i915_selftest@live@slpc.html

  * igt@kms_pipe_crc_basic@nonblocking-crc-frame-sequence@pipe-c-dp-1:
    - bat-dg2-8:          [FAIL][15] ([i915#7932]) -> [PASS][16] +1 similar issue
   [15]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_13196/bat-dg2-8/igt@kms_pipe_crc_basic@nonblocking-crc-frame-sequence@pipe-c-dp-1.html
   [16]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_118450v1/bat-dg2-8/igt@kms_pipe_crc_basic@nonblocking-crc-frame-sequence@pipe-c-dp-1.html

  
  {name}: This element is suppressed. This means it is ignored when computing
          the status of the difference (SUCCESS, WARNING, or FAILURE).

  [IGT#6]: https://gitlab.freedesktop.org/drm/igt-gpu-tools/issues/6
  [fdo#109271]: https://bugs.freedesktop.org/show_bug.cgi?id=109271
  [fdo#109285]: https://bugs.freedesktop.org/show_bug.cgi?id=109285
  [i915#1072]: https://gitlab.freedesktop.org/drm/intel/issues/1072
  [i915#1845]: https://gitlab.freedesktop.org/drm/intel/issues/1845
  [i915#2582]: https://gitlab.freedesktop.org/drm/intel/issues/2582
  [i915#3546]: https://gitlab.freedesktop.org/drm/intel/issues/3546
  [i915#3595]: https://gitlab.freedesktop.org/drm/intel/issues/3595
  [i915#3637]: https://gitlab.freedesktop.org/drm/intel/issues/3637
  [i915#3708]: https://gitlab.freedesktop.org/drm/intel/issues/3708
  [i915#4077]: https://gitlab.freedesktop.org/drm/intel/issues/4077
  [i915#4078]: https://gitlab.freedesktop.org/drm/intel/issues/4078
  [i915#4079]: https://gitlab.freedesktop.org/drm/intel/issues/4079
  [i915#4083]: https://gitlab.freedesktop.org/drm/intel/issues/4083
  [i915#4212]: https://gitlab.freedesktop.org/drm/intel/issues/4212
  [i915#4342]: https://gitlab.freedesktop.org/drm/intel/issues/4342
  [i915#4423]: https://gitlab.freedesktop.org/drm/intel/issues/4423
  [i915#4579]: https://gitlab.freedesktop.org/drm/intel/issues/4579
  [i915#4613]: https://gitlab.freedesktop.org/drm/intel/issues/4613
  [i915#4983]: https://gitlab.freedesktop.org/drm/intel/issues/4983
  [i915#5190]: https://gitlab.freedesktop.org/drm/intel/issues/5190
  [i915#5274]: https://gitlab.freedesktop.org/drm/intel/issues/5274
  [i915#5334]: https://gitlab.freedesktop.org/drm/intel/issues/5334
  [i915#5354]: https://gitlab.freedesktop.org/drm/intel/issues/5354
  [i915#6367]: https://gitlab.freedesktop.org/drm/intel/issues/6367
  [i915#6621]: https://gitlab.freedesktop.org/drm/intel/issues/6621
  [i915#6645]: https://gitlab.freedesktop.org/drm/intel/issues/6645
  [i915#7059]: https://gitlab.freedesktop.org/drm/intel/issues/7059
  [i915#7456]: https://gitlab.freedesktop.org/drm/intel/issues/7456
  [i915#7828]: https://gitlab.freedesktop.org/drm/intel/issues/7828
  [i915#7872]: https://gitlab.freedesktop.org/drm/intel/issues/7872
  [i915#7913]: https://gitlab.freedesktop.org/drm/intel/issues/7913
  [i915#7932]: https://gitlab.freedesktop.org/drm/intel/issues/7932
  [i915#8189]: https://gitlab.freedesktop.org/drm/intel/issues/8189
  [i915#8321]: https://gitlab.freedesktop.org/drm/intel/issues/8321
  [i915#8497]: https://gitlab.freedesktop.org/drm/intel/issues/8497


Build changes
-------------

  * Linux: CI_DRM_13196 -> Patchwork_118450v1

  CI-20190529: 20190529
  CI_DRM_13196: 9e0c716f834ec17dbf96c47c8b5a2b32c4f483cd @ git://anongit.freedesktop.org/gfx-ci/linux
  IGT_7307: f0714273cd896c637759b3790f485308c4c97008 @ https://gitlab.freedesktop.org/drm/igt-gpu-tools.git
  Patchwork_118450v1: 9e0c716f834ec17dbf96c47c8b5a2b32c4f483cd @ git://anongit.freedesktop.org/gfx-ci/linux


### Linux commits

e3a1c312f7b0 drm/i915/guc: Track all sent actions to GuC
f80bc2b6dc51 drm/i915/guc: Update log for unsolicited CTB response
40befae90b6a drm/i915/guc: Use FAST_REQUEST for non-blocking H2G calls

== Logs ==

For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_118450v1/index.html

* [Intel-gfx] ✓ Fi.CI.IGT: success for Use FAST_REQUEST mechanism for non-blocking H2G calls
  2023-05-26 23:55 ` [Intel-gfx] " John.C.Harrison
                   ` (5 preceding siblings ...)
  (?)
@ 2023-05-27 23:41 ` Patchwork
  -1 siblings, 0 replies; 17+ messages in thread
From: Patchwork @ 2023-05-27 23:41 UTC (permalink / raw)
  To: john.c.harrison; +Cc: intel-gfx

== Series Details ==

Series: Use FAST_REQUEST mechanism for non-blocking H2G calls
URL   : https://patchwork.freedesktop.org/series/118450/
State : success

== Summary ==

CI Bug Log - changes from CI_DRM_13196_full -> Patchwork_118450v1_full
====================================================

Summary
-------

  **SUCCESS**

  No regressions found.

  

Participating hosts (7 -> 7)
------------------------------

  No changes in participating hosts

Known issues
------------

  Here are the changes found in Patchwork_118450v1_full that come from known issues:

### IGT changes ###

#### Issues hit ####

  * igt@gem_userptr_blits@vma-merge:
    - shard-snb:          NOTRUN -> [FAIL][1] ([i915#2724])
   [1]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_118450v1/shard-snb4/igt@gem_userptr_blits@vma-merge.html

  * igt@kms_atomic_transition@plane-all-modeset-transition-fencing-internal-panels:
    - shard-snb:          NOTRUN -> [SKIP][2] ([fdo#109271] / [i915#4579]) +16 similar issues
   [2]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_118450v1/shard-snb4/igt@kms_atomic_transition@plane-all-modeset-transition-fencing-internal-panels.html

  * igt@kms_ccs@pipe-c-crc-primary-rotation-180-4_tiled_mtl_mc_ccs:
    - shard-snb:          NOTRUN -> [SKIP][3] ([fdo#109271]) +76 similar issues
   [3]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_118450v1/shard-snb4/igt@kms_ccs@pipe-c-crc-primary-rotation-180-4_tiled_mtl_mc_ccs.html

  * igt@kms_cursor_legacy@flip-vs-cursor-atomic-transitions:
    - shard-glk:          [PASS][4] -> [FAIL][5] ([IGT#6] / [i915#2346])
   [4]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_13196/shard-glk4/igt@kms_cursor_legacy@flip-vs-cursor-atomic-transitions.html
   [5]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_118450v1/shard-glk6/igt@kms_cursor_legacy@flip-vs-cursor-atomic-transitions.html

  * igt@kms_dither@fb-8bpc-vs-panel-6bpc@pipe-a-hdmi-a-1:
    - shard-glk:          NOTRUN -> [SKIP][6] ([fdo#109271])
   [6]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_118450v1/shard-glk7/igt@kms_dither@fb-8bpc-vs-panel-6bpc@pipe-a-hdmi-a-1.html

  * igt@kms_flip@2x-flip-vs-expired-vblank@bc-hdmi-a1-hdmi-a2:
    - shard-glk:          [PASS][7] -> [FAIL][8] ([i915#79]) +1 similar issue
   [7]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_13196/shard-glk1/igt@kms_flip@2x-flip-vs-expired-vblank@bc-hdmi-a1-hdmi-a2.html
   [8]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_118450v1/shard-glk3/igt@kms_flip@2x-flip-vs-expired-vblank@bc-hdmi-a1-hdmi-a2.html

  * igt@kms_plane_scaling@2x-scaler-multi-pipe:
    - shard-glk:          [PASS][9] -> [DMESG-WARN][10] ([IGT#6] / [i915#118])
   [9]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_13196/shard-glk2/igt@kms_plane_scaling@2x-scaler-multi-pipe.html
   [10]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_118450v1/shard-glk6/igt@kms_plane_scaling@2x-scaler-multi-pipe.html

  
#### Possible fixes ####

  * igt@drm_fdinfo@most-busy-idle-check-all@rcs0:
    - {shard-rkl}:        [FAIL][11] ([i915#7742]) -> [PASS][12]
   [11]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_13196/shard-rkl-3/igt@drm_fdinfo@most-busy-idle-check-all@rcs0.html
   [12]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_118450v1/shard-rkl-7/igt@drm_fdinfo@most-busy-idle-check-all@rcs0.html

  * igt@gem_eio@unwedge-stress:
    - {shard-dg1}:        [FAIL][13] ([i915#5784]) -> [PASS][14]
   [13]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_13196/shard-dg1-15/igt@gem_eio@unwedge-stress.html
   [14]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_118450v1/shard-dg1-14/igt@gem_eio@unwedge-stress.html

  * igt@gem_exec_fair@basic-deadline:
    - shard-apl:          [FAIL][15] ([i915#2846]) -> [PASS][16]
   [15]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_13196/shard-apl4/igt@gem_exec_fair@basic-deadline.html
   [16]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_118450v1/shard-apl1/igt@gem_exec_fair@basic-deadline.html

  * igt@i915_pm_rc6_residency@rc6-idle@rcs0:
    - {shard-dg1}:        [FAIL][17] ([i915#3591]) -> [PASS][18] +1 similar issue
   [17]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_13196/shard-dg1-14/igt@i915_pm_rc6_residency@rc6-idle@rcs0.html
   [18]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_118450v1/shard-dg1-14/igt@i915_pm_rc6_residency@rc6-idle@rcs0.html

  * igt@i915_pm_rpm@dpms-mode-unset-lpsp:
    - {shard-rkl}:        [SKIP][19] ([i915#1397]) -> [PASS][20] +1 similar issue
   [19]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_13196/shard-rkl-3/igt@i915_pm_rpm@dpms-mode-unset-lpsp.html
   [20]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_118450v1/shard-rkl-7/igt@i915_pm_rpm@dpms-mode-unset-lpsp.html

  * igt@i915_pm_rps@reset:
    - shard-snb:          [INCOMPLETE][21] ([i915#7790]) -> [PASS][22]
   [21]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_13196/shard-snb1/igt@i915_pm_rps@reset.html
   [22]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_118450v1/shard-snb4/igt@i915_pm_rps@reset.html

  * igt@kms_cursor_legacy@flip-vs-cursor-atomic-transitions-varying-size:
    - shard-apl:          [FAIL][23] ([IGT#6] / [i915#2346]) -> [PASS][24] +1 similar issue
   [23]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_13196/shard-apl3/igt@kms_cursor_legacy@flip-vs-cursor-atomic-transitions-varying-size.html
   [24]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_118450v1/shard-apl7/igt@kms_cursor_legacy@flip-vs-cursor-atomic-transitions-varying-size.html
    - shard-glk:          [FAIL][25] ([IGT#6] / [i915#2346]) -> [PASS][26]
   [25]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_13196/shard-glk7/igt@kms_cursor_legacy@flip-vs-cursor-atomic-transitions-varying-size.html
   [26]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_118450v1/shard-glk6/igt@kms_cursor_legacy@flip-vs-cursor-atomic-transitions-varying-size.html

  * igt@kms_cursor_legacy@single-bo@pipe-b:
    - {shard-rkl}:        [INCOMPLETE][27] ([i915#8011]) -> [PASS][28]
   [27]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_13196/shard-rkl-7/igt@kms_cursor_legacy@single-bo@pipe-b.html
   [28]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_118450v1/shard-rkl-3/igt@kms_cursor_legacy@single-bo@pipe-b.html

  * igt@kms_flip@flip-vs-expired-vblank-interruptible@c-hdmi-a2:
    - shard-glk:          [FAIL][29] ([i915#79]) -> [PASS][30]
   [29]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_13196/shard-glk8/igt@kms_flip@flip-vs-expired-vblank-interruptible@c-hdmi-a2.html
   [30]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_118450v1/shard-glk5/igt@kms_flip@flip-vs-expired-vblank-interruptible@c-hdmi-a2.html

  
  {name}: This element is suppressed. This means it is ignored when computing
          the status of the difference (SUCCESS, WARNING, or FAILURE).

  [IGT#6]: https://gitlab.freedesktop.org/drm/igt-gpu-tools/issues/6
  [fdo#103375]: https://bugs.freedesktop.org/show_bug.cgi?id=103375
  [fdo#109271]: https://bugs.freedesktop.org/show_bug.cgi?id=109271
  [fdo#109285]: https://bugs.freedesktop.org/show_bug.cgi?id=109285
  [fdo#110189]: https://bugs.freedesktop.org/show_bug.cgi?id=110189
  [fdo#111825]: https://bugs.freedesktop.org/show_bug.cgi?id=111825
  [i915#118]: https://gitlab.freedesktop.org/drm/intel/issues/118
  [i915#1397]: https://gitlab.freedesktop.org/drm/intel/issues/1397
  [i915#2346]: https://gitlab.freedesktop.org/drm/intel/issues/2346
  [i915#2724]: https://gitlab.freedesktop.org/drm/intel/issues/2724
  [i915#2842]: https://gitlab.freedesktop.org/drm/intel/issues/2842
  [i915#2846]: https://gitlab.freedesktop.org/drm/intel/issues/2846
  [i915#3591]: https://gitlab.freedesktop.org/drm/intel/issues/3591
  [i915#3955]: https://gitlab.freedesktop.org/drm/intel/issues/3955
  [i915#4070]: https://gitlab.freedesktop.org/drm/intel/issues/4070
  [i915#4098]: https://gitlab.freedesktop.org/drm/intel/issues/4098
  [i915#4281]: https://gitlab.freedesktop.org/drm/intel/issues/4281
  [i915#4579]: https://gitlab.freedesktop.org/drm/intel/issues/4579
  [i915#4816]: https://gitlab.freedesktop.org/drm/intel/issues/4816
  [i915#4936]: https://gitlab.freedesktop.org/drm/intel/issues/4936
  [i915#5176]: https://gitlab.freedesktop.org/drm/intel/issues/5176
  [i915#5235]: https://gitlab.freedesktop.org/drm/intel/issues/5235
  [i915#5493]: https://gitlab.freedesktop.org/drm/intel/issues/5493
  [i915#5784]: https://gitlab.freedesktop.org/drm/intel/issues/5784
  [i915#7461]: https://gitlab.freedesktop.org/drm/intel/issues/7461
  [i915#7742]: https://gitlab.freedesktop.org/drm/intel/issues/7742
  [i915#7790]: https://gitlab.freedesktop.org/drm/intel/issues/7790
  [i915#79]: https://gitlab.freedesktop.org/drm/intel/issues/79
  [i915#8011]: https://gitlab.freedesktop.org/drm/intel/issues/8011
  [i915#8178]: https://gitlab.freedesktop.org/drm/intel/issues/8178
  [i915#8311]: https://gitlab.freedesktop.org/drm/intel/issues/8311


Build changes
-------------

  * Linux: CI_DRM_13196 -> Patchwork_118450v1

  CI-20190529: 20190529
  CI_DRM_13196: 9e0c716f834ec17dbf96c47c8b5a2b32c4f483cd @ git://anongit.freedesktop.org/gfx-ci/linux
  IGT_7307: f0714273cd896c637759b3790f485308c4c97008 @ https://gitlab.freedesktop.org/drm/igt-gpu-tools.git
  Patchwork_118450v1: 9e0c716f834ec17dbf96c47c8b5a2b32c4f483cd @ git://anongit.freedesktop.org/gfx-ci/linux
  piglit_4509: fdc5a4ca11124ab8413c7988896eec4c97336694 @ git://anongit.freedesktop.org/piglit

== Logs ==

For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_118450v1/index.html

* Re: [PATCH 1/3] drm/i915/guc: Use FAST_REQUEST for non-blocking H2G calls
  2023-05-26 23:55   ` [Intel-gfx] " John.C.Harrison
@ 2023-05-30 21:02     ` John Harrison
  -1 siblings, 0 replies; 17+ messages in thread
From: John Harrison @ 2023-05-30 21:02 UTC (permalink / raw)
  To: Intel-GFX; +Cc: DRI-Devel, Michal Wajdeczko

On 5/26/2023 16:55, John.C.Harrison@Intel.com wrote:
> From: Michal Wajdeczko <michal.wajdeczko@intel.com>
>
> In addition to the already defined REQUEST HXG message format,
> which is used when sender expects some confirmation or data,
> HXG protocol includes definition of the FAST REQUEST message,
> that may be used when sender does not expect any useful data
> to be returned.
>
> Using this instead of GUC_HXG_TYPE_EVENT for non-blocking CTB requests
> will allow GuC to send back GUC_HXG_TYPE_RESPONSE_FAILURE in case of
> errors.
>
> Note that it is not possible to return such errors to the caller,
> since this is for non-blocking calls and the related fence is not
> stored. Instead such messages are treated as unexpected, which will
> give an indication of potential GuC misprogramming that warrants extra
> debugging effort.
>
> Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
> Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Reviewed-by: John Harrison <John.C.Harrison@Intel.com>

> ---
>   .../gpu/drm/i915/gt/uc/abi/guc_messages_abi.h | 30 +++++++++++++++++++
>   drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c     |  6 ++--
>   2 files changed, 33 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/gt/uc/abi/guc_messages_abi.h b/drivers/gpu/drm/i915/gt/uc/abi/guc_messages_abi.h
> index 7d5ba4d97d708..98eb4f46572b9 100644
> --- a/drivers/gpu/drm/i915/gt/uc/abi/guc_messages_abi.h
> +++ b/drivers/gpu/drm/i915/gt/uc/abi/guc_messages_abi.h
> @@ -24,6 +24,7 @@
>    *  |   | 30:28 | **TYPE** - message type                                      |
>    *  |   |       |   - _`GUC_HXG_TYPE_REQUEST` = 0                              |
>    *  |   |       |   - _`GUC_HXG_TYPE_EVENT` = 1                                |
> + *  |   |       |   - _`GUC_HXG_TYPE_FAST_REQUEST` = 2                         |
>    *  |   |       |   - _`GUC_HXG_TYPE_NO_RESPONSE_BUSY` = 3                     |
>    *  |   |       |   - _`GUC_HXG_TYPE_NO_RESPONSE_RETRY` = 5                    |
>    *  |   |       |   - _`GUC_HXG_TYPE_RESPONSE_FAILURE` = 6                     |
> @@ -46,6 +47,7 @@
>   #define GUC_HXG_MSG_0_TYPE			(0x7 << 28)
>   #define   GUC_HXG_TYPE_REQUEST			0u
>   #define   GUC_HXG_TYPE_EVENT			1u
> +#define   GUC_HXG_TYPE_FAST_REQUEST		2u
>   #define   GUC_HXG_TYPE_NO_RESPONSE_BUSY		3u
>   #define   GUC_HXG_TYPE_NO_RESPONSE_RETRY	5u
>   #define   GUC_HXG_TYPE_RESPONSE_FAILURE		6u
> @@ -89,6 +91,34 @@
>   #define GUC_HXG_REQUEST_MSG_0_ACTION		(0xffff << 0)
>   #define GUC_HXG_REQUEST_MSG_n_DATAn		GUC_HXG_MSG_n_PAYLOAD
>   
> +/**
> + * DOC: HXG Fast Request
> + *
> + * The `HXG Fast Request`_ message should be used to initiate asynchronous activity
> + * for which confirmation or return data is not expected.
> + *
> + * If confirmation is required then `HXG Request`_ shall be used instead.
> + *
> + * The recipient of this message may only use `HXG Failure`_ message if it was
> + * unable to accept this request (like invalid data).
> + *
> + * Format of `HXG Fast Request`_ message is same as `HXG Request`_ except @TYPE.
> + *
> + *  +---+-------+--------------------------------------------------------------+
> + *  |   | Bits  | Description                                                  |
> + *  +===+=======+==============================================================+
> + *  | 0 |    31 | ORIGIN - see `HXG Message`_                                  |
> + *  |   +-------+--------------------------------------------------------------+
> + *  |   | 30:28 | TYPE = `GUC_HXG_TYPE_FAST_REQUEST`_                          |
> + *  |   +-------+--------------------------------------------------------------+
> + *  |   | 27:16 | DATA0 - see `HXG Request`_                                   |
> + *  |   +-------+--------------------------------------------------------------+
> + *  |   |  15:0 | ACTION - see `HXG Request`_                                  |
> + *  +---+-------+--------------------------------------------------------------+
> + *  |...|       | DATAn - see `HXG Request`_                                   |
> + *  +---+-------+--------------------------------------------------------------+
> + */
> +
>   /**
>    * DOC: HXG Event
>    *
> diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c b/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c
> index a22e33f37cae6..af52ed4ffc7fb 100644
> --- a/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c
> +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c
> @@ -426,11 +426,11 @@ static int ct_write(struct intel_guc_ct *ct,
>   		 FIELD_PREP(GUC_CTB_MSG_0_NUM_DWORDS, len) |
>   		 FIELD_PREP(GUC_CTB_MSG_0_FENCE, fence);
>   
> -	type = (flags & INTEL_GUC_CT_SEND_NB) ? GUC_HXG_TYPE_EVENT :
> +	type = (flags & INTEL_GUC_CT_SEND_NB) ? GUC_HXG_TYPE_FAST_REQUEST :
>   		GUC_HXG_TYPE_REQUEST;
>   	hxg = FIELD_PREP(GUC_HXG_MSG_0_TYPE, type) |
> -		FIELD_PREP(GUC_HXG_EVENT_MSG_0_ACTION |
> -			   GUC_HXG_EVENT_MSG_0_DATA0, action[0]);
> +		FIELD_PREP(GUC_HXG_REQUEST_MSG_0_ACTION |
> +			   GUC_HXG_REQUEST_MSG_0_DATA0, action[0]);
>   
>   	CT_DEBUG(ct, "writing (tail %u) %*ph %*ph %*ph\n",
>   		 tail, 4, &header, 4, &hxg, 4 * (len - 1), &action[1]);
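
For readers following along, here is a rough userspace sketch of how dword 0
of the H2G message ends up being composed after this change. The type values
are taken from the quoted header hunk; the shift/mask macro names and the
hxg_header() helper are my own stand-ins for FIELD_PREP() and are not part of
the patch.

```c
#include <stdint.h>

/* Values copied from the quoted guc_messages_abi.h hunk. */
#define GUC_HXG_TYPE_REQUEST		0u
#define GUC_HXG_TYPE_FAST_REQUEST	2u

/* Hypothetical stand-ins for the kernel masks used with FIELD_PREP(). */
#define HXG_TYPE_SHIFT			28
#define HXG_TYPE_MASK			(0x7u << HXG_TYPE_SHIFT)	/* bits 30:28 */
#define HXG_DATA0_ACTION_MASK		0x0fffffffu			/* bits 27:0 */

/*
 * Build dword 0 of an H2G message. 'nonblocking' mirrors the
 * INTEL_GUC_CT_SEND_NB flag test in ct_write(); 'action0' carries
 * DATA0 (27:16) and ACTION (15:0) packed together, as action[0] does.
 */
static uint32_t hxg_header(int nonblocking, uint32_t action0)
{
	uint32_t type = nonblocking ? GUC_HXG_TYPE_FAST_REQUEST :
				      GUC_HXG_TYPE_REQUEST;

	return ((type << HXG_TYPE_SHIFT) & HXG_TYPE_MASK) |
	       (action0 & HXG_DATA0_ACTION_MASK);
}
```

With this sketch, a non-blocking send of action 0x1234 yields 0x20001234,
while the blocking path still yields 0x00001234.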


* Re: [PATCH 2/3] drm/i915/guc: Update log for unsolicited CTB response
  2023-05-26 23:55   ` [Intel-gfx] " John.C.Harrison
@ 2023-05-30 21:02     ` John Harrison
  -1 siblings, 0 replies; 17+ messages in thread
From: John Harrison @ 2023-05-30 21:02 UTC (permalink / raw)
  To: Intel-GFX; +Cc: DRI-Devel, Michal Wajdeczko

On 5/26/2023 16:55, John.C.Harrison@Intel.com wrote:
> From: Michal Wajdeczko <michal.wajdeczko@intel.com>
>
> Instead of printing message fence twice, include HXG header of the
> unexpected message and its len.
>
> Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
> Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Reviewed-by: John Harrison <John.C.Harrison@Intel.com>

> ---
>   drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c | 5 ++---
>   1 file changed, 2 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c b/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c
> index af52ed4ffc7fb..3a71bb582089e 100644
> --- a/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c
> +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c
> @@ -994,9 +994,8 @@ static int ct_handle_response(struct intel_guc_ct *ct, struct ct_incoming_msg *r
>   		break;
>   	}
>   	if (!found) {
> -		CT_ERROR(ct, "Unsolicited response (fence %u)\n", fence);
> -		CT_ERROR(ct, "Could not find fence=%u, last_fence=%u\n", fence,
> -			 ct->requests.last_fence);
> +		CT_ERROR(ct, "Unsolicited response message: len %u, data %#x (fence %u, last %u)\n",
> +			 len, hxg[0], fence, ct->requests.last_fence);
>   		list_for_each_entry(req, &ct->requests.pending, link)
>   			CT_ERROR(ct, "request %u awaits response\n",
>   				 req->fence);
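
Since the new message prints hxg[0] as "data %#x", it may help to show how
that value could be picked apart when reading a log. This is only a sketch:
the bit positions come from the HXG table quoted in patch 1, the helper names
are made up for illustration, and the sample value in the note below is
invented, not taken from a real log.

```c
#include <stdint.h>

/* Value copied from the quoted guc_messages_abi.h hunk. */
#define GUC_HXG_TYPE_RESPONSE_FAILURE	6u

/* ORIGIN is bit 31 of dword 0, per the quoted HXG table. */
static unsigned int hxg_origin(uint32_t hxg0)
{
	return hxg0 >> 31;
}

/* TYPE is bits 30:28 of dword 0, per the quoted HXG table. */
static unsigned int hxg_type(uint32_t hxg0)
{
	return (hxg0 >> 28) & 0x7u;
}

/* True when the logged dword is a RESPONSE_FAILURE message. */
static int hxg_is_failure(uint32_t hxg0)
{
	return hxg_type(hxg0) == GUC_HXG_TYPE_RESPONSE_FAILURE;
}
```

For example, a logged "data 0xe0000030" would decode to origin 1 and type 6,
i.e. the failure response that patch 1 makes possible for fast requests.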


* Re: [PATCH 3/3] drm/i915/guc: Track all sent actions to GuC
  2023-05-26 23:55   ` [Intel-gfx] " John.C.Harrison
@ 2023-05-30 21:06     ` John Harrison
  -1 siblings, 0 replies; 17+ messages in thread
From: John Harrison @ 2023-05-30 21:06 UTC (permalink / raw)
  To: Intel-GFX; +Cc: DRI-Devel, Michal Wajdeczko

On 5/26/2023 16:55, John.C.Harrison@Intel.com wrote:
> From: Michal Wajdeczko <michal.wajdeczko@intel.com>
>
> For easier debug of any unexpected error responses from GuC that
> might be related to non-blocking fast requests, track action code (and
> stack if under DEBUG_GUC config) for every H2G request.
>
> Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
> Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Reviewed-by: John Harrison <John.C.Harrison@Intel.com>
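
As a reading aid, the tracking scheme can be modelled in userspace roughly as
below: each sent fence overwrites the slot at fence % table size, and an
unsolicited response is looked up by scanning for a matching fence. The table
size, struct name and function names here are assumptions for illustration
only, and the stack depot handling is left out.

```c
#include <stdbool.h>
#include <stdint.h>

#define LOST_AND_FOUND_SLOTS	16u	/* hypothetical size, chosen for the sketch */

struct lost_and_found_entry {
	uint32_t fence;
	uint32_t action;
};

static struct lost_and_found_entry lost_and_found[LOST_AND_FOUND_SLOTS];

/* Record the action sent with 'fence'; older entries in the slot are lost. */
static void track_lost_and_found(uint32_t fence, uint32_t action)
{
	unsigned int slot = fence % LOST_AND_FOUND_SLOTS;

	lost_and_found[slot].fence = fence;
	lost_and_found[slot].action = action;
}

/*
 * Look for 'fence' in the table and report the action that used it.
 * Note that never-written slots hold fence 0, so a lookup of fence 0
 * matches an empty slot in this sketch.
 */
static bool check_lost_and_found(uint32_t fence, uint32_t *action)
{
	unsigned int n;

	for (n = 0; n < LOST_AND_FOUND_SLOTS; n++) {
		if (lost_and_found[n].fence != fence)
			continue;
		*action = lost_and_found[n].action;
		return true;
	}
	return false;
}
```

The modulo indexing keeps the write path cheap, at the cost of only
remembering the most recent requests; a response to a fence that has already
been overwritten simply falls back to the existing pending-request dump.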

> ---
>   drivers/gpu/drm/i915/Kconfig.debug        |  1 +
>   drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c | 68 ++++++++++++++++++++++-
>   drivers/gpu/drm/i915/gt/uc/intel_guc_ct.h | 11 ++++
>   3 files changed, 77 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/Kconfig.debug b/drivers/gpu/drm/i915/Kconfig.debug
> index 47e845353ffad..2d21930d55015 100644
> --- a/drivers/gpu/drm/i915/Kconfig.debug
> +++ b/drivers/gpu/drm/i915/Kconfig.debug
> @@ -157,6 +157,7 @@ config DRM_I915_SW_FENCE_CHECK_DAG
>   config DRM_I915_DEBUG_GUC
>   	bool "Enable additional driver debugging for GuC"
>   	depends on DRM_I915
> +	select STACKDEPOT
>   	default n
>   	help
>   	  Choose this option to turn on extra driver debugging that may affect
> diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c b/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c
> index 3a71bb582089e..4aa903be1317b 100644
> --- a/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c
> +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c
> @@ -376,6 +376,24 @@ void intel_guc_ct_disable(struct intel_guc_ct *ct)
>   	}
>   }
>   
> +#if IS_ENABLED(CONFIG_DRM_I915_DEBUG_GEM)
> +static void ct_track_lost_and_found(struct intel_guc_ct *ct, u32 fence, u32 action)
> +{
> +	unsigned int lost = fence % ARRAY_SIZE(ct->requests.lost_and_found);
> +#if IS_ENABLED(CONFIG_DRM_I915_DEBUG_GUC)
> +	unsigned long entries[SZ_32];
> +	unsigned int n;
> +
> +	n = stack_trace_save(entries, ARRAY_SIZE(entries), 1);
> +
> +	/* May be called under spinlock, so avoid sleeping */
> +	ct->requests.lost_and_found[lost].stack = stack_depot_save(entries, n, GFP_NOWAIT);
> +#endif
> +	ct->requests.lost_and_found[lost].fence = fence;
> +	ct->requests.lost_and_found[lost].action = action;
> +}
> +#endif
> +
>   static u32 ct_get_next_fence(struct intel_guc_ct *ct)
>   {
>   	/* For now it's trivial */
> @@ -447,6 +465,11 @@ static int ct_write(struct intel_guc_ct *ct,
>   	}
>   	GEM_BUG_ON(tail > size);
>   
> +#if IS_ENABLED(CONFIG_DRM_I915_DEBUG_GEM)
> +	ct_track_lost_and_found(ct, fence,
> +				FIELD_GET(GUC_HXG_EVENT_MSG_0_ACTION, action[0]));
> +#endif
> +
>   	/*
>   	 * make sure H2G buffer update and LRC tail update (if this triggering a
>   	 * submission) are visible before updating the descriptor tail
> @@ -953,6 +976,43 @@ static int ct_read(struct intel_guc_ct *ct, struct ct_incoming_msg **msg)
>   	return -EPIPE;
>   }
>   
> +#if IS_ENABLED(CONFIG_DRM_I915_DEBUG_GEM)
> +static bool ct_check_lost_and_found(struct intel_guc_ct *ct, u32 fence)
> +{
> +	unsigned int n;
> +	char *buf = NULL;
> +	bool found = false;
> +
> +	lockdep_assert_held(&ct->requests.lock);
> +
> +	for (n = 0; n < ARRAY_SIZE(ct->requests.lost_and_found); n++) {
> +		if (ct->requests.lost_and_found[n].fence != fence)
> +			continue;
> +		found = true;
> +
> +#if IS_ENABLED(CONFIG_DRM_I915_DEBUG_GUC)
> +		buf = kmalloc(SZ_4K, GFP_NOWAIT);
> +		if (buf && stack_depot_snprint(ct->requests.lost_and_found[n].stack,
> +					       buf, SZ_4K, 0)) {
> +			CT_ERROR(ct, "Fence %u was used by action %#04x sent at\n%s",
> +				 fence, ct->requests.lost_and_found[n].action, buf);
> +			break;
> +		}
> +#endif
> +		CT_ERROR(ct, "Fence %u was used by action %#04x\n",
> +			 fence, ct->requests.lost_and_found[n].action);
> +		break;
> +	}
> +	kfree(buf);
> +	return found;
> +}
> +#else
> +static bool ct_check_lost_and_found(struct intel_guc_ct *ct, u32 fence)
> +{
> +	return false;
> +}
> +#endif
> +
>   static int ct_handle_response(struct intel_guc_ct *ct, struct ct_incoming_msg *response)
>   {
>   	u32 len = FIELD_GET(GUC_CTB_MSG_0_NUM_DWORDS, response->msg[0]);
> @@ -996,9 +1056,11 @@ static int ct_handle_response(struct intel_guc_ct *ct, struct ct_incoming_msg *r
>   	if (!found) {
>   		CT_ERROR(ct, "Unsolicited response message: len %u, data %#x (fence %u, last %u)\n",
>   			 len, hxg[0], fence, ct->requests.last_fence);
> -		list_for_each_entry(req, &ct->requests.pending, link)
> -			CT_ERROR(ct, "request %u awaits response\n",
> -				 req->fence);
> +		if (!ct_check_lost_and_found(ct, fence)) {
> +			list_for_each_entry(req, &ct->requests.pending, link)
> +				CT_ERROR(ct, "request %u awaits response\n",
> +					 req->fence);
> +		}
>   		err = -ENOKEY;
>   	}
>   	spin_unlock_irqrestore(&ct->requests.lock, flags);
> diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.h b/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.h
> index 818415b64f4d1..58e42901ff498 100644
> --- a/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.h
> +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.h
> @@ -8,6 +8,7 @@
>   
>   #include <linux/interrupt.h>
>   #include <linux/spinlock.h>
> +#include <linux/stackdepot.h>
>   #include <linux/workqueue.h>
>   #include <linux/ktime.h>
>   #include <linux/wait.h>
> @@ -81,6 +82,16 @@ struct intel_guc_ct {
>   
>   		struct list_head incoming; /* incoming requests */
>   		struct work_struct worker; /* handler for incoming requests */
> +
> +#if IS_ENABLED(CONFIG_DRM_I915_DEBUG_GEM)
> +		struct {
> +			u16 fence;
> +			u16 action;
> +#if IS_ENABLED(CONFIG_DRM_I915_DEBUG_GUC)
> +			depot_stack_handle_t stack;
> +#endif
> +		} lost_and_found[SZ_16];
> +#endif
>   	} requests;
>   
>   	/** @stall_time: time of first time a CTB submission is stalled */
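
The tracking scheme in the quoted patch can be sketched in user space as
follows. This is a minimal illustration under stated assumptions, not the
driver code: `struct tracker`, `track()` and `lookup()` are hypothetical
stand-ins for `ct->requests.lost_and_found`, `ct_track_lost_and_found()`
and `ct_check_lost_and_found()`, and locking, logging and the stack depot
handle are left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the 16-entry lost_and_found[] array. */
#define LOST_AND_FOUND_SLOTS 16u

struct lost_and_found_entry {
	uint16_t fence;		/* same widths as the members in the patch */
	uint16_t action;
};

struct tracker {
	struct lost_and_found_entry slots[LOST_AND_FOUND_SLOTS];
};

/*
 * Remember which action was sent with a fence. The slot is chosen by
 * fence modulo table size, so an older entry in that slot is overwritten.
 */
static void track(struct tracker *t, uint32_t fence, uint32_t action)
{
	unsigned int slot = fence % LOST_AND_FOUND_SLOTS;

	assert(t != NULL);
	t->slots[slot].fence = (uint16_t)fence;
	t->slots[slot].action = (uint16_t)action;
}

/*
 * Linear scan for a fence. On a hit, store the recorded action in *action
 * and return true; on a miss, leave *action untouched and return false.
 */
static bool lookup(const struct tracker *t, uint32_t fence, uint16_t *action)
{
	for (size_t n = 0; n < LOST_AND_FOUND_SLOTS; n++) {
		if (t->slots[n].fence != fence)
			continue;
		*action = t->slots[n].action;
		return true;
	}
	return false;
}
```

With 16 slots, only the 16 most recently sent fences can be identified;
fence 21 lands in the same slot as fence 5 and replaces it.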



* Re: [Intel-gfx] [PATCH 3/3] drm/i915/guc: Track all sent actions to GuC
@ 2023-05-30 21:06     ` John Harrison
  0 siblings, 0 replies; 17+ messages in thread
From: John Harrison @ 2023-05-30 21:06 UTC (permalink / raw)
  To: Intel-GFX; +Cc: DRI-Devel

On 5/26/2023 16:55, John.C.Harrison@Intel.com wrote:
> From: Michal Wajdeczko <michal.wajdeczko@intel.com>
>
> For easier debug of any unexpected error responses from GuC that
> might be related to non-blocking fast requests, track action code (and
> stack if under DEBUG_GUC config) for every H2G request.
>
> Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
> Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Reviewed-by: John Harrison <John.C.Harrison@Intel.com>

> ---
>   drivers/gpu/drm/i915/Kconfig.debug        |  1 +
>   drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c | 68 ++++++++++++++++++++++-
>   drivers/gpu/drm/i915/gt/uc/intel_guc_ct.h | 11 ++++
>   3 files changed, 77 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/Kconfig.debug b/drivers/gpu/drm/i915/Kconfig.debug
> index 47e845353ffad..2d21930d55015 100644
> --- a/drivers/gpu/drm/i915/Kconfig.debug
> +++ b/drivers/gpu/drm/i915/Kconfig.debug
> @@ -157,6 +157,7 @@ config DRM_I915_SW_FENCE_CHECK_DAG
>   config DRM_I915_DEBUG_GUC
>   	bool "Enable additional driver debugging for GuC"
>   	depends on DRM_I915
> +	select STACKDEPOT
>   	default n
>   	help
>   	  Choose this option to turn on extra driver debugging that may affect
> diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c b/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c
> index 3a71bb582089e..4aa903be1317b 100644
> --- a/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c
> +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c
> @@ -376,6 +376,24 @@ void intel_guc_ct_disable(struct intel_guc_ct *ct)
>   	}
>   }
>   
> +#if IS_ENABLED(CONFIG_DRM_I915_DEBUG_GEM)
> +static void ct_track_lost_and_found(struct intel_guc_ct *ct, u32 fence, u32 action)
> +{
> +	unsigned int lost = fence % ARRAY_SIZE(ct->requests.lost_and_found);
> +#if IS_ENABLED(CONFIG_DRM_I915_DEBUG_GUC)
> +	unsigned long entries[SZ_32];
> +	unsigned int n;
> +
> +	n = stack_trace_save(entries, ARRAY_SIZE(entries), 1);
> +
> +	/* May be called under spinlock, so avoid sleeping */
> +	ct->requests.lost_and_found[lost].stack = stack_depot_save(entries, n, GFP_NOWAIT);
> +#endif
> +	ct->requests.lost_and_found[lost].fence = fence;
> +	ct->requests.lost_and_found[lost].action = action;
> +}
> +#endif
> +
>   static u32 ct_get_next_fence(struct intel_guc_ct *ct)
>   {
>   	/* For now it's trivial */
> @@ -447,6 +465,11 @@ static int ct_write(struct intel_guc_ct *ct,
>   	}
>   	GEM_BUG_ON(tail > size);
>   
> +#if IS_ENABLED(CONFIG_DRM_I915_DEBUG_GEM)
> +	ct_track_lost_and_found(ct, fence,
> +				FIELD_GET(GUC_HXG_EVENT_MSG_0_ACTION, action[0]));
> +#endif
> +
>   	/*
>   	 * make sure H2G buffer update and LRC tail update (if this triggering a
>   	 * submission) are visible before updating the descriptor tail
> @@ -953,6 +976,43 @@ static int ct_read(struct intel_guc_ct *ct, struct ct_incoming_msg **msg)
>   	return -EPIPE;
>   }
>   
> +#if IS_ENABLED(CONFIG_DRM_I915_DEBUG_GEM)
> +static bool ct_check_lost_and_found(struct intel_guc_ct *ct, u32 fence)
> +{
> +	unsigned int n;
> +	char *buf = NULL;
> +	bool found = false;
> +
> +	lockdep_assert_held(&ct->requests.lock);
> +
> +	for (n = 0; n < ARRAY_SIZE(ct->requests.lost_and_found); n++) {
> +		if (ct->requests.lost_and_found[n].fence != fence)
> +			continue;
> +		found = true;
> +
> +#if IS_ENABLED(CONFIG_DRM_I915_DEBUG_GUC)
> +		buf = kmalloc(SZ_4K, GFP_NOWAIT);
> +		if (buf && stack_depot_snprint(ct->requests.lost_and_found[n].stack,
> +					       buf, SZ_4K, 0)) {
> +			CT_ERROR(ct, "Fence %u was used by action %#04x sent at\n%s",
> +				 fence, ct->requests.lost_and_found[n].action, buf);
> +			break;
> +		}
> +#endif
> +		CT_ERROR(ct, "Fence %u was used by action %#04x\n",
> +			 fence, ct->requests.lost_and_found[n].action);
> +		break;
> +	}
> +	kfree(buf);
> +	return found;
> +}
> +#else
> +static bool ct_check_lost_and_found(struct intel_guc_ct *ct, u32 fence)
> +{
> +	return false;
> +}
> +#endif
> +
>   static int ct_handle_response(struct intel_guc_ct *ct, struct ct_incoming_msg *response)
>   {
>   	u32 len = FIELD_GET(GUC_CTB_MSG_0_NUM_DWORDS, response->msg[0]);
> @@ -996,9 +1056,11 @@ static int ct_handle_response(struct intel_guc_ct *ct, struct ct_incoming_msg *r
>   	if (!found) {
>   		CT_ERROR(ct, "Unsolicited response message: len %u, data %#x (fence %u, last %u)\n",
>   			 len, hxg[0], fence, ct->requests.last_fence);
> -		list_for_each_entry(req, &ct->requests.pending, link)
> -			CT_ERROR(ct, "request %u awaits response\n",
> -				 req->fence);
> +		if (!ct_check_lost_and_found(ct, fence)) {
> +			list_for_each_entry(req, &ct->requests.pending, link)
> +				CT_ERROR(ct, "request %u awaits response\n",
> +					 req->fence);
> +		}
>   		err = -ENOKEY;
>   	}
>   	spin_unlock_irqrestore(&ct->requests.lock, flags);
> diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.h b/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.h
> index 818415b64f4d1..58e42901ff498 100644
> --- a/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.h
> +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.h
> @@ -8,6 +8,7 @@
>   
>   #include <linux/interrupt.h>
>   #include <linux/spinlock.h>
> +#include <linux/stackdepot.h>
>   #include <linux/workqueue.h>
>   #include <linux/ktime.h>
>   #include <linux/wait.h>
> @@ -81,6 +82,16 @@ struct intel_guc_ct {
>   
>   		struct list_head incoming; /* incoming requests */
>   		struct work_struct worker; /* handler for incoming requests */
> +
> +#if IS_ENABLED(CONFIG_DRM_I915_DEBUG_GEM)
> +		struct {
> +			u16 fence;
> +			u16 action;
> +#if IS_ENABLED(CONFIG_DRM_I915_DEBUG_GUC)
> +			depot_stack_handle_t stack;
> +#endif
> +		} lost_and_found[SZ_16];
> +#endif
>   	} requests;
>   
>   	/** @stall_time: time of first time a CTB submission is stalled */
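
One detail of the quoted hunks worth seeing in isolation: the table stores
the fence in a `u16` member, while `ct_check_lost_and_found()` compares it
against a `u32` argument. The sketch below only reproduces that comparison
in user space; `stored_fence_matches()` is a hypothetical helper, not a
driver function. Whether this matters in practice depends on how far the
fence counter can advance, which the quoted hunks do not show.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Store a 32-bit fence in a 16-bit variable, then compare it against the
 * 32-bit fence taken from a response, as the lookup loop in the patch does.
 */
static bool stored_fence_matches(uint32_t sent_fence, uint32_t response_fence)
{
	/* Truncated on store, as with a u16 struct member. */
	uint16_t stored = (uint16_t)sent_fence;

	/*
	 * For the comparison, stored is widened to 32 bits with its upper
	 * 16 bits zero, so it can only equal values up to 0xFFFF.
	 */
	return stored == response_fence;
}
```

Under these assumptions, a fence above 0xFFFF never matches its own
entry, and a response fence equal to the low 16 bits of an unrelated
tracked fence would match instead.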



Thread overview: 17+ messages
2023-05-26 23:55 [PATCH 0/3] Use FAST_REQUEST mechanism for non-blocking H2G calls John.C.Harrison
2023-05-26 23:55 ` [Intel-gfx] " John.C.Harrison
2023-05-26 23:55 ` [PATCH 1/3] drm/i915/guc: Use FAST_REQUEST " John.C.Harrison
2023-05-26 23:55   ` [Intel-gfx] " John.C.Harrison
2023-05-30 21:02   ` John Harrison
2023-05-30 21:02     ` [Intel-gfx] " John Harrison
2023-05-26 23:55 ` [PATCH 2/3] drm/i915/guc: Update log for unsolicited CTB response John.C.Harrison
2023-05-26 23:55   ` [Intel-gfx] " John.C.Harrison
2023-05-30 21:02   ` John Harrison
2023-05-30 21:02     ` [Intel-gfx] " John Harrison
2023-05-26 23:55 ` [PATCH 3/3] drm/i915/guc: Track all sent actions to GuC John.C.Harrison
2023-05-26 23:55   ` [Intel-gfx] " John.C.Harrison
2023-05-30 21:06   ` John Harrison
2023-05-30 21:06     ` [Intel-gfx] " John Harrison
2023-05-27  0:22 ` [Intel-gfx] ✗ Fi.CI.SPARSE: warning for Use FAST_REQUEST mechanism for non-blocking H2G calls Patchwork
2023-05-27  0:35 ` [Intel-gfx] ✓ Fi.CI.BAT: success " Patchwork
2023-05-27 23:41 ` [Intel-gfx] ✓ Fi.CI.IGT: " Patchwork
