* [PATCH 0/7] CT changes required for GuC submission
@ 2021-07-01 17:15 ` Matthew Brost
  0 siblings, 0 replies; 38+ messages in thread
From: Matthew Brost @ 2021-07-01 17:15 UTC (permalink / raw)
  To: intel-gfx, dri-devel; +Cc: john.c.harrison, Michal.Wajdeczko

As part of enabling GuC submission discussed in [1], [2], and [3] we
need to optimize and update the CT code as this is now in the critical
path of submission. This series includes the patches to do that, which
are the first 7 patches from [3]. The patches should have addressed all
the feedback in [3] and should be ready to merge once CI returns and we
get a few more RBs.

v2: Fix checkpatch warning, address a couple of Michal's comments

Signed-off-by: Matthew Brost <matthew.brost@intel.com>

[1] https://patchwork.freedesktop.org/series/89844/
[2] https://patchwork.freedesktop.org/series/91417/
[3] https://patchwork.freedesktop.org/series/91840/

John Harrison (1):
  drm/i915/guc: Module load failure test for CT buffer creation

Matthew Brost (6):
  drm/i915/guc: Relax CTB response timeout
  drm/i915/guc: Improve error message for unsolicited CT response
  drm/i915/guc: Increase size of CTB buffers
  drm/i915/guc: Add non blocking CTB send function
  drm/i915/guc: Add stall timer to non blocking CTB send function
  drm/i915/guc: Optimize CTB writes and reads

 .../gt/uc/abi/guc_communication_ctb_abi.h     |   3 +-
 drivers/gpu/drm/i915/gt/uc/intel_guc.h        |  11 +-
 drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c     | 250 +++++++++++++++---
 drivers/gpu/drm/i915/gt/uc/intel_guc_ct.h     |  14 +-
 4 files changed, 232 insertions(+), 46 deletions(-)

-- 
2.28.0



* [PATCH 1/7] drm/i915/guc: Relax CTB response timeout
  2021-07-01 17:15 ` [Intel-gfx] " Matthew Brost
@ 2021-07-01 17:15   ` Matthew Brost
  -1 siblings, 0 replies; 38+ messages in thread
From: Matthew Brost @ 2021-07-01 17:15 UTC (permalink / raw)
  To: intel-gfx, dri-devel; +Cc: john.c.harrison, Michal.Wajdeczko

In an upcoming patch we will allow more CTB requests to be sent in
parallel to the GuC for processing, so we should no longer assume
that the GuC will always reply within 10ms.

Use a bigger hardcoded value of 1s instead.

v2: Add CONFIG_DRM_I915_GUC_CTB_TIMEOUT config option
v3:
 (Daniel Vetter)
  - Use hardcoded value of 1s rather than config option
v4:
 (Michal)
  - Use defines for timeout values

Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Cc: Michal Wajdeczko <michal.wajdeczko@intel.com>
Reviewed-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
---
 drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c | 10 +++++++---
 1 file changed, 7 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c b/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c
index 43409044528e..b86575b99537 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c
+++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c
@@ -474,14 +474,18 @@ static int wait_for_ct_request_update(struct ct_request *req, u32 *status)
 	/*
 	 * Fast commands should complete in less than 10us, so sample quickly
 	 * up to that length of time, then switch to a slower sleep-wait loop.
-	 * No GuC command should ever take longer than 10ms.
+	 * No GuC command should ever take longer than 10ms but many GuC
+	 * commands can be inflight at time, so use a 1s timeout on the slower
+	 * sleep-wait loop.
 	 */
+#define GUC_CTB_RESPONSE_TIMEOUT_SHORT_MS 10
+#define GUC_CTB_RESPONSE_TIMEOUT_LONG_MS 1000
 #define done \
 	(FIELD_GET(GUC_HXG_MSG_0_ORIGIN, READ_ONCE(req->status)) == \
 	 GUC_HXG_ORIGIN_GUC)
-	err = wait_for_us(done, 10);
+	err = wait_for_us(done, GUC_CTB_RESPONSE_TIMEOUT_SHORT_MS);
 	if (err)
-		err = wait_for(done, 10);
+		err = wait_for(done, GUC_CTB_RESPONSE_TIMEOUT_LONG_MS);
 #undef done
 
 	if (unlikely(err))
-- 
2.28.0
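
For readers following along, here is a minimal userspace sketch of the
two-stage wait this patch tunes: a short busy poll for fast replies, then
a sleeping poll with the relaxed timeout. It is not the i915 code. The
names poll_until(), two_stage_wait() and the countdown predicate are
hypothetical, and clock_gettime()/nanosleep() stand in for the kernel's
wait_for_us()/wait_for() helpers.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <time.h>

#define NSEC_PER_USEC 1000ull
#define NSEC_PER_MSEC 1000000ull
#define NSEC_PER_SEC  1000000000ull

static uint64_t now_ns(void)
{
	struct timespec ts;

	clock_gettime(CLOCK_MONOTONIC, &ts);
	return (uint64_t)ts.tv_sec * NSEC_PER_SEC + (uint64_t)ts.tv_nsec;
}

/*
 * Poll done() until it returns true or timeout_ns has elapsed.
 * If sleep_ns is non-zero, sleep that long between polls.
 */
static int poll_until(bool (*done)(void *), void *arg,
		      uint64_t timeout_ns, uint64_t sleep_ns)
{
	uint64_t deadline = now_ns() + timeout_ns;

	assert(done != NULL);

	for (;;) {
		/* Sample expiry first so done() gets one last check. */
		bool expired = now_ns() >= deadline;

		if (done(arg))
			return 0;
		if (expired)
			return -ETIMEDOUT;
		if (sleep_ns) {
			struct timespec ts = {
				.tv_sec = (time_t)(sleep_ns / NSEC_PER_SEC),
				.tv_nsec = (long)(sleep_ns % NSEC_PER_SEC),
			};

			nanosleep(&ts, NULL);
		}
	}
}

/* Fast busy poll first, then a slower sleep-wait loop (1ms naps). */
static int two_stage_wait(bool (*done)(void *), void *arg,
			  uint64_t fast_us, uint64_t slow_ms)
{
	int err = poll_until(done, arg, fast_us * NSEC_PER_USEC, 0);

	if (err)
		err = poll_until(done, arg, slow_ms * NSEC_PER_MSEC,
				 NSEC_PER_MSEC);
	return err;
}

/* Hypothetical predicate: reports completion after N failed polls. */
struct countdown {
	unsigned int polls_left;
};

static bool countdown_done(void *arg)
{
	struct countdown *c = arg;

	if (c->polls_left == 0)
		return true;
	c->polls_left--;
	return false;
}
```

Calling two_stage_wait(countdown_done, &c, 10, 1000) mirrors the values
used in the patch: replies that arrive quickly never leave the busy poll,
and only slow replies pay for the sleeping loop.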



* [PATCH 2/7] drm/i915/guc: Improve error message for unsolicited CT response
  2021-07-01 17:15 ` [Intel-gfx] " Matthew Brost
@ 2021-07-01 17:15   ` Matthew Brost
  -1 siblings, 0 replies; 38+ messages in thread
From: Matthew Brost @ 2021-07-01 17:15 UTC (permalink / raw)
  To: intel-gfx, dri-devel; +Cc: john.c.harrison, Michal.Wajdeczko

Improve the error message when an unsolicited CT response is received by
printing the fence that couldn't be found, the last fence, and all
requests with a response outstanding.

Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
---
 drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c | 10 +++++++---
 1 file changed, 7 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c b/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c
index b86575b99537..80db59b45c45 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c
+++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c
@@ -732,12 +732,16 @@ static int ct_handle_response(struct intel_guc_ct *ct, struct ct_incoming_msg *r
 		found = true;
 		break;
 	}
-	spin_unlock_irqrestore(&ct->requests.lock, flags);
-
 	if (!found) {
 		CT_ERROR(ct, "Unsolicited response (fence %u)\n", fence);
-		return -ENOKEY;
+		CT_ERROR(ct, "Could not find fence=%u, last_fence=%u\n", fence,
+			 ct->requests.last_fence);
+		list_for_each_entry(req, &ct->requests.pending, link)
+			CT_ERROR(ct, "request %u awaits response\n",
+				 req->fence);
+		err = -ENOKEY;
 	}
+	spin_unlock_irqrestore(&ct->requests.lock, flags);
 
 	if (unlikely(err))
 		return err;
-- 
2.28.0
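
As an illustration of the reporting added here, a small userspace sketch
of the lookup and the extra diagnostics follows. It is an assumption-laden
model, not the driver code: struct pending_req, struct req_tracker and
handle_response() are hypothetical names, a singly linked list replaces
the kernel list, and the locking is left out (in the patch the pending
list is walked before ct->requests.lock is dropped).

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdio.h>

#ifndef ENOKEY
#define ENOKEY 126 /* Linux value, defined here for other platforms */
#endif

/* Hypothetical stand-in for a request waiting on a GuC response. */
struct pending_req {
	unsigned int fence;
	struct pending_req *next;
};

/* Hypothetical stand-in for the ct->requests bookkeeping. */
struct req_tracker {
	struct pending_req *pending;
	unsigned int last_fence;
};

/*
 * Look up the fence of an incoming response. If nothing is waiting
 * for it, log the missing fence, the last fence handed out and every
 * request still awaiting a response, then return -ENOKEY.
 */
static int handle_response(const struct req_tracker *t, unsigned int fence,
			   FILE *log)
{
	const struct pending_req *req;

	assert(t != NULL && log != NULL);

	for (req = t->pending; req; req = req->next)
		if (req->fence == fence)
			return 0;

	fprintf(log, "Could not find fence=%u, last_fence=%u\n",
		fence, t->last_fence);
	for (req = t->pending; req; req = req->next)
		fprintf(log, "request %u awaits response\n", req->fence);

	return -ENOKEY;
}
```

With requests 5 and 7 pending, a response carrying fence 9 produces the
"Could not find" line followed by one "awaits response" line per pending
request, which is the information the commit message describes.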



* [PATCH 3/7] drm/i915/guc: Increase size of CTB buffers
  2021-07-01 17:15 ` [Intel-gfx] " Matthew Brost
@ 2021-07-01 17:15   ` Matthew Brost
  -1 siblings, 0 replies; 38+ messages in thread
From: Matthew Brost @ 2021-07-01 17:15 UTC (permalink / raw)
  To: intel-gfx, dri-devel; +Cc: john.c.harrison, Michal.Wajdeczko

With the introduction of non-blocking CTBs more than one CTB can be in
flight at a time. Increasing the size of the CTBs should reduce how
often software hits the case where no space is available in the CTB
buffer.

Cc: John Harrison <john.c.harrison@intel.com>
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
---
 drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c | 11 ++++++++---
 1 file changed, 8 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c b/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c
index 80db59b45c45..43e03aa2dde8 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c
+++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c
@@ -58,11 +58,16 @@ static inline struct drm_device *ct_to_drm(struct intel_guc_ct *ct)
  *      +--------+-----------------------------------------------+------+
  *
  * Size of each `CT Buffer`_ must be multiple of 4K.
- * As we don't expect too many messages, for now use minimum sizes.
+ * We don't expect too many messages in flight at any time, unless we are
+ * using the GuC submission. In that case each request requires a minimum
+ * 2 dwords which gives us a maximum 256 queue'd requests. Hopefully this
+ * enough space to avoid backpressure on the driver. We increase the size
+ * of the receive buffer (relative to the send) to ensure a G2H response
+ * CTB has a landing spot.
  */
 #define CTB_DESC_SIZE		ALIGN(sizeof(struct guc_ct_buffer_desc), SZ_2K)
 #define CTB_H2G_BUFFER_SIZE	(SZ_4K)
-#define CTB_G2H_BUFFER_SIZE	(SZ_4K)
+#define CTB_G2H_BUFFER_SIZE	(4 * CTB_H2G_BUFFER_SIZE)
 
 struct ct_request {
 	struct list_head link;
@@ -643,7 +648,7 @@ static int ct_read(struct intel_guc_ct *ct, struct ct_incoming_msg **msg)
 	/* beware of buffer wrap case */
 	if (unlikely(available < 0))
 		available += size;
-	CT_DEBUG(ct, "available %d (%u:%u)\n", available, head, tail);
+	CT_DEBUG(ct, "available %d (%u:%u:%u)\n", available, head, tail, size);
 	GEM_BUG_ON(available < 0);
 
 	header = cmds[head];
-- 
2.28.0
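
To make the size change and the extended debug line concrete, here is a
small sketch of the wrap-aware "available" computation visible in the
ct_read() context above, together with the new buffer sizes. It assumes
head, tail and size are counted in dwords (cmds is indexed by head in the
hunk); ctb_size_dw() and ctb_available() are hypothetical helper names.

```c
#include <assert.h>
#include <stdint.h>

/* Sizes from the patch: G2H is now four times the H2G size. */
#define CTB_H2G_BUFFER_SIZE	4096u
#define CTB_G2H_BUFFER_SIZE	(4 * CTB_H2G_BUFFER_SIZE)

/* Convert a buffer size in bytes to a size in dwords. */
static uint32_t ctb_size_dw(uint32_t bytes)
{
	return bytes / (uint32_t)sizeof(uint32_t);
}

/*
 * Number of dwords ready to be read, given the reader's head, the
 * writer's tail and the ring size, all in dwords.
 */
static int32_t ctb_available(uint32_t head, uint32_t tail, uint32_t size)
{
	int32_t available;

	assert(head < size && tail < size);

	available = (int32_t)tail - (int32_t)head;
	/* beware of buffer wrap case */
	if (available < 0)
		available += (int32_t)size;

	return available;
}
```

For example, with a 4096-dword ring, head 4090 and tail 6 give 12
available dwords. Printing size alongside head and tail, as the patch now
does, lets a reader of the log redo this arithmetic.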



* [PATCH 4/7] drm/i915/guc: Add non blocking CTB send function
  2021-07-01 17:15 ` [Intel-gfx] " Matthew Brost
@ 2021-07-01 17:15   ` Matthew Brost
  -1 siblings, 0 replies; 38+ messages in thread
From: Matthew Brost @ 2021-07-01 17:15 UTC (permalink / raw)
  To: intel-gfx, dri-devel; +Cc: john.c.harrison, Michal.Wajdeczko

Add a non-blocking CTB send function, intel_guc_send_nb. GuC submission
will send CTBs in the critical path and does not need to wait for these
CTBs to complete before moving on, hence the need for this new function.

The non-blocking CTB now must have a flow control mechanism to ensure
the buffer isn't overrun. A lazy spin wait is used as we believe the
flow control condition should be rare with a properly sized buffer.
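
A rough sketch of that lazy wait, for illustration only: the blocking
path checks for room, and if there is none it sleeps and retries with a
sleep period that doubles each time. The names wait_for_room(),
fake_has_room() and fake_sleep() are hypothetical. The sleep hook is
assumed to return non-zero when interrupted, in the manner of
msleep_interruptible(), and the locking around the room check is omitted.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Returns non-zero (remaining ms) if the sleep was interrupted. */
typedef unsigned int (*sleep_fn)(unsigned int ms, void *ctx);

/*
 * Retry until has_room() reports space, sleeping 1ms, 2ms, 4ms, ...
 * between attempts. Returns -EINTR if a sleep is interrupted.
 */
static int wait_for_room(bool (*has_room)(void *ctx), sleep_fn sleep_ms,
			 void *ctx)
{
	unsigned int sleep_period_ms = 1;

	assert(has_room != NULL && sleep_ms != NULL);

	while (!has_room(ctx)) {
		if (sleep_ms(sleep_period_ms, ctx))
			return -EINTR;
		sleep_period_ms <<= 1;
	}

	return 0;
}

/* Hypothetical test double: no room for the first busy_polls checks. */
struct fake_ctb {
	unsigned int busy_polls;
	unsigned int slept_ms;
	bool interrupt;
};

static bool fake_has_room(void *ctx)
{
	struct fake_ctb *f = ctx;

	if (f->busy_polls == 0)
		return true;
	f->busy_polls--;
	return false;
}

/* Records the requested sleep instead of actually sleeping. */
static unsigned int fake_sleep(unsigned int ms, void *ctx)
{
	struct fake_ctb *f = ctx;

	if (f->interrupt)
		return ms;
	f->slept_ms += ms;
	return 0;
}
```

Three failed checks cost 1 + 2 + 4 = 7ms of sleep before the fourth check
succeeds. Note that the loop in this sketch has no upper bound on the
total wait.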

The function, intel_guc_send_nb, is exported in this patch but unused.
Several patches later in the series make use of this function.

v2:
 (Michal)
  - Use define for H2G room calculations
  - Move INTEL_GUC_SEND_NB define
 (Daniel Vetter)
  - Use msleep_interruptible rather than cond_resched
v3:
 (Michal)
  - Move includes to following patch
  - s/INTEL_GUC_SEND_NB/INTEL_GUC_CT_SEND_NB/g

Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
---
 .../gt/uc/abi/guc_communication_ctb_abi.h     |  3 +-
 drivers/gpu/drm/i915/gt/uc/intel_guc.h        | 11 ++-
 drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c     | 87 +++++++++++++++++--
 drivers/gpu/drm/i915/gt/uc/intel_guc_ct.h     |  4 +-
 4 files changed, 91 insertions(+), 14 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/uc/abi/guc_communication_ctb_abi.h b/drivers/gpu/drm/i915/gt/uc/abi/guc_communication_ctb_abi.h
index e933ca02d0eb..99e1fad5ca20 100644
--- a/drivers/gpu/drm/i915/gt/uc/abi/guc_communication_ctb_abi.h
+++ b/drivers/gpu/drm/i915/gt/uc/abi/guc_communication_ctb_abi.h
@@ -79,7 +79,8 @@ static_assert(sizeof(struct guc_ct_buffer_desc) == 64);
  *  +---+-------+--------------------------------------------------------------+
  */
 
-#define GUC_CTB_MSG_MIN_LEN			1u
+#define GUC_CTB_HDR_LEN				1u
+#define GUC_CTB_MSG_MIN_LEN			GUC_CTB_HDR_LEN
 #define GUC_CTB_MSG_MAX_LEN			256u
 #define GUC_CTB_MSG_0_FENCE			(0xffff << 16)
 #define GUC_CTB_MSG_0_FORMAT			(0xf << 12)
diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc.h b/drivers/gpu/drm/i915/gt/uc/intel_guc.h
index 4abc59f6f3cd..72e4653222e2 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_guc.h
+++ b/drivers/gpu/drm/i915/gt/uc/intel_guc.h
@@ -74,7 +74,14 @@ static inline struct intel_guc *log_to_guc(struct intel_guc_log *log)
 static
 inline int intel_guc_send(struct intel_guc *guc, const u32 *action, u32 len)
 {
-	return intel_guc_ct_send(&guc->ct, action, len, NULL, 0);
+	return intel_guc_ct_send(&guc->ct, action, len, NULL, 0, 0);
+}
+
+static
+inline int intel_guc_send_nb(struct intel_guc *guc, const u32 *action, u32 len)
+{
+	return intel_guc_ct_send(&guc->ct, action, len, NULL, 0,
+				 INTEL_GUC_CT_SEND_NB);
 }
 
 static inline int
@@ -82,7 +89,7 @@ intel_guc_send_and_receive(struct intel_guc *guc, const u32 *action, u32 len,
 			   u32 *response_buf, u32 response_buf_size)
 {
 	return intel_guc_ct_send(&guc->ct, action, len,
-				 response_buf, response_buf_size);
+				 response_buf, response_buf_size, 0);
 }
 
 static inline void intel_guc_to_host_event_handler(struct intel_guc *guc)
diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c b/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c
index 43e03aa2dde8..fb825cc1d090 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c
+++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c
@@ -3,6 +3,8 @@
  * Copyright © 2016-2019 Intel Corporation
  */
 
+#include <linux/circ_buf.h>
+
 #include "i915_drv.h"
 #include "intel_guc_ct.h"
 #include "gt/intel_gt.h"
@@ -373,7 +375,7 @@ static void write_barrier(struct intel_guc_ct *ct)
 static int ct_write(struct intel_guc_ct *ct,
 		    const u32 *action,
 		    u32 len /* in dwords */,
-		    u32 fence)
+		    u32 fence, u32 flags)
 {
 	struct intel_guc_ct_buffer *ctb = &ct->ctbs.send;
 	struct guc_ct_buffer_desc *desc = ctb->desc;
@@ -409,7 +411,7 @@ static int ct_write(struct intel_guc_ct *ct,
 		used = tail - head;
 
 	/* make sure there is a space including extra dw for the fence */
-	if (unlikely(used + len + 1 >= size))
+	if (unlikely(used + len + GUC_CTB_HDR_LEN >= size))
 		return -ENOSPC;
 
 	/*
@@ -421,9 +423,13 @@ static int ct_write(struct intel_guc_ct *ct,
 		 FIELD_PREP(GUC_CTB_MSG_0_NUM_DWORDS, len) |
 		 FIELD_PREP(GUC_CTB_MSG_0_FENCE, fence);
 
-	hxg = FIELD_PREP(GUC_HXG_MSG_0_TYPE, GUC_HXG_TYPE_REQUEST) |
-	      FIELD_PREP(GUC_HXG_REQUEST_MSG_0_ACTION |
-			 GUC_HXG_REQUEST_MSG_0_DATA0, action[0]);
+	hxg = (flags & INTEL_GUC_CT_SEND_NB) ?
+		(FIELD_PREP(GUC_HXG_MSG_0_TYPE, GUC_HXG_TYPE_EVENT) |
+		 FIELD_PREP(GUC_HXG_EVENT_MSG_0_ACTION |
+			    GUC_HXG_EVENT_MSG_0_DATA0, action[0])) :
+		(FIELD_PREP(GUC_HXG_MSG_0_TYPE, GUC_HXG_TYPE_REQUEST) |
+		 FIELD_PREP(GUC_HXG_REQUEST_MSG_0_ACTION |
+			    GUC_HXG_REQUEST_MSG_0_DATA0, action[0]));
 
 	CT_DEBUG(ct, "writing (tail %u) %*ph %*ph %*ph\n",
 		 tail, 4, &header, 4, &hxg, 4 * (len - 1), &action[1]);
@@ -500,6 +506,48 @@ static int wait_for_ct_request_update(struct ct_request *req, u32 *status)
 	return err;
 }
 
+static inline bool h2g_has_room(struct intel_guc_ct_buffer *ctb, u32 len_dw)
+{
+	struct guc_ct_buffer_desc *desc = ctb->desc;
+	u32 head = READ_ONCE(desc->head);
+	u32 space;
+
+	space = CIRC_SPACE(desc->tail, head, ctb->size);
+
+	return space >= len_dw;
+}
+
+static int ct_send_nb(struct intel_guc_ct *ct,
+		      const u32 *action,
+		      u32 len,
+		      u32 flags)
+{
+	struct intel_guc_ct_buffer *ctb = &ct->ctbs.send;
+	unsigned long spin_flags;
+	u32 fence;
+	int ret;
+
+	spin_lock_irqsave(&ctb->lock, spin_flags);
+
+	ret = h2g_has_room(ctb, len + GUC_CTB_HDR_LEN);
+	if (unlikely(!ret)) {
+		ret = -EBUSY;
+		goto out;
+	}
+
+	fence = ct_get_next_fence(ct);
+	ret = ct_write(ct, action, len, fence, flags);
+	if (unlikely(ret))
+		goto out;
+
+	intel_guc_notify(ct_to_guc(ct));
+
+out:
+	spin_unlock_irqrestore(&ctb->lock, spin_flags);
+
+	return ret;
+}
+
 static int ct_send(struct intel_guc_ct *ct,
 		   const u32 *action,
 		   u32 len,
@@ -507,8 +555,10 @@ static int ct_send(struct intel_guc_ct *ct,
 		   u32 response_buf_size,
 		   u32 *status)
 {
+	struct intel_guc_ct_buffer *ctb = &ct->ctbs.send;
 	struct ct_request request;
 	unsigned long flags;
+	unsigned int sleep_period_ms = 1;
 	u32 fence;
 	int err;
 
@@ -516,8 +566,24 @@ static int ct_send(struct intel_guc_ct *ct,
 	GEM_BUG_ON(!len);
 	GEM_BUG_ON(len & ~GUC_CT_MSG_LEN_MASK);
 	GEM_BUG_ON(!response_buf && response_buf_size);
+	might_sleep();
+
+	/*
+	 * We use a lazy spin wait loop here as we believe that if the CT
+	 * buffers are sized correctly the flow control condition should be
+	 * rare.
+	 */
+retry:
+	spin_lock_irqsave(&ctb->lock, flags);
+	if (unlikely(!h2g_has_room(ctb, len + GUC_CTB_HDR_LEN))) {
+		spin_unlock_irqrestore(&ctb->lock, flags);
 
-	spin_lock_irqsave(&ct->ctbs.send.lock, flags);
+		if (msleep_interruptible(sleep_period_ms))
+			return -EINTR;
+		sleep_period_ms = sleep_period_ms << 1;
+
+		goto retry;
+	}
 
 	fence = ct_get_next_fence(ct);
 	request.fence = fence;
@@ -529,9 +595,9 @@ static int ct_send(struct intel_guc_ct *ct,
 	list_add_tail(&request.link, &ct->requests.pending);
 	spin_unlock(&ct->requests.lock);
 
-	err = ct_write(ct, action, len, fence);
+	err = ct_write(ct, action, len, fence, 0);
 
-	spin_unlock_irqrestore(&ct->ctbs.send.lock, flags);
+	spin_unlock_irqrestore(&ctb->lock, flags);
 
 	if (unlikely(err))
 		goto unlink;
@@ -571,7 +637,7 @@ static int ct_send(struct intel_guc_ct *ct,
  * Command Transport (CT) buffer based GuC send function.
  */
 int intel_guc_ct_send(struct intel_guc_ct *ct, const u32 *action, u32 len,
-		      u32 *response_buf, u32 response_buf_size)
+		      u32 *response_buf, u32 response_buf_size, u32 flags)
 {
 	u32 status = ~0; /* undefined */
 	int ret;
@@ -581,6 +647,9 @@ int intel_guc_ct_send(struct intel_guc_ct *ct, const u32 *action, u32 len,
 		return -ENODEV;
 	}
 
+	if (flags & INTEL_GUC_CT_SEND_NB)
+		return ct_send_nb(ct, action, len, flags);
+
 	ret = ct_send(ct, action, len, response_buf, response_buf_size, &status);
 	if (unlikely(ret < 0)) {
 		CT_ERROR(ct, "Sending action %#x failed (err=%d status=%#X)\n",
diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.h b/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.h
index 1ae2dde6db93..5bb8bef024c8 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.h
+++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.h
@@ -42,7 +42,6 @@ struct intel_guc_ct_buffer {
 	bool broken;
 };
 
-
 /** Top-level structure for Command Transport related data
  *
  * Includes a pair of CT buffers for bi-directional communication and tracking
@@ -87,8 +86,9 @@ static inline bool intel_guc_ct_enabled(struct intel_guc_ct *ct)
 	return ct->enabled;
 }
 
+#define INTEL_GUC_CT_SEND_NB		BIT(31)
 int intel_guc_ct_send(struct intel_guc_ct *ct, const u32 *action, u32 len,
-		      u32 *response_buf, u32 response_buf_size);
+		      u32 *response_buf, u32 response_buf_size, u32 flags);
 void intel_guc_ct_event_handler(struct intel_guc_ct *ct);
 
 #endif /* _INTEL_GUC_CT_H_ */
-- 
2.28.0
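
For anyone unfamiliar with CIRC_SPACE(), the following userspace sketch
shows the room check used by h2g_has_room() and the -EBUSY behaviour of
the non-blocking path. The two macros are copied from the definitions in
<linux/circ_buf.h>. Everything else is hypothetical: struct ring,
ring_has_room() and ring_try_reserve() are illustrative names, and the
sketch assumes sizes in dwords and a power-of-two ring size, which the
macros require.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* As defined in <linux/circ_buf.h>; size must be a power of two. */
#define CIRC_CNT(head, tail, size)   (((head) - (tail)) & ((size) - 1))
#define CIRC_SPACE(head, tail, size) CIRC_CNT((tail), ((head) + 1), (size))

#define GUC_CTB_HDR_LEN 1u

struct ring {
	uint32_t head;	/* consumer (read) index, in dwords */
	uint32_t tail;	/* producer (write) index, in dwords */
	uint32_t size;	/* ring size in dwords, power of two */
};

/*
 * The CTB tail is the producer index, so it goes in CIRC_SPACE()'s
 * first ("head") slot and the CTB head goes in the second.
 */
static bool ring_has_room(const struct ring *r, uint32_t len_dw)
{
	assert(r->size && (r->size & (r->size - 1)) == 0);

	return CIRC_SPACE(r->tail, r->head, r->size) >= len_dw;
}

/* Non-blocking reserve: fail with -EBUSY instead of waiting. */
static int ring_try_reserve(struct ring *r, uint32_t payload_dw)
{
	uint32_t need = payload_dw + GUC_CTB_HDR_LEN;

	if (!ring_has_room(r, need))
		return -EBUSY;

	r->tail = (r->tail + need) & (r->size - 1);
	return 0;
}
```

CIRC_SPACE() always leaves one slot unused so that a full ring can be
told apart from an empty one: an empty 1024-dword ring reports 1023
dwords of space, not 1024.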



* [Intel-gfx] [PATCH 4/7] drm/i915/guc: Add non blocking CTB send function
@ 2021-07-01 17:15   ` Matthew Brost
  0 siblings, 0 replies; 38+ messages in thread
From: Matthew Brost @ 2021-07-01 17:15 UTC (permalink / raw)
  To: intel-gfx, dri-devel

Add a non-blocking CTB send function, intel_guc_send_nb. GuC submission
will send CTBs in the critical path and does not need to wait for these
CTBs to complete before moving on, hence the need for this new function.

The non-blocking CTB now must have a flow control mechanism to ensure
the buffer isn't overrun. A lazy spin wait is used as we believe the
flow control condition should be rare with a properly sized buffer.

The function, intel_guc_send_nb, is exported in this patch but unused.
Several patches later in the series make use of this function.

v2:
 (Michal)
  - Use define for H2G room calculations
  - Move INTEL_GUC_SEND_NB define
 (Daniel Vetter)
  - Use msleep_interruptible rather than cond_resched
v3:
 (Michal)
  - Move includes to following patch
  - s/INTEL_GUC_SEND_NB/INTEL_GUC_CT_SEND_NB/g

Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
---
 .../gt/uc/abi/guc_communication_ctb_abi.h     |  3 +-
 drivers/gpu/drm/i915/gt/uc/intel_guc.h        | 11 ++-
 drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c     | 87 +++++++++++++++++--
 drivers/gpu/drm/i915/gt/uc/intel_guc_ct.h     |  4 +-
 4 files changed, 91 insertions(+), 14 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/uc/abi/guc_communication_ctb_abi.h b/drivers/gpu/drm/i915/gt/uc/abi/guc_communication_ctb_abi.h
index e933ca02d0eb..99e1fad5ca20 100644
--- a/drivers/gpu/drm/i915/gt/uc/abi/guc_communication_ctb_abi.h
+++ b/drivers/gpu/drm/i915/gt/uc/abi/guc_communication_ctb_abi.h
@@ -79,7 +79,8 @@ static_assert(sizeof(struct guc_ct_buffer_desc) == 64);
  *  +---+-------+--------------------------------------------------------------+
  */
 
-#define GUC_CTB_MSG_MIN_LEN			1u
+#define GUC_CTB_HDR_LEN				1u
+#define GUC_CTB_MSG_MIN_LEN			GUC_CTB_HDR_LEN
 #define GUC_CTB_MSG_MAX_LEN			256u
 #define GUC_CTB_MSG_0_FENCE			(0xffff << 16)
 #define GUC_CTB_MSG_0_FORMAT			(0xf << 12)
diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc.h b/drivers/gpu/drm/i915/gt/uc/intel_guc.h
index 4abc59f6f3cd..72e4653222e2 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_guc.h
+++ b/drivers/gpu/drm/i915/gt/uc/intel_guc.h
@@ -74,7 +74,14 @@ static inline struct intel_guc *log_to_guc(struct intel_guc_log *log)
 static
 inline int intel_guc_send(struct intel_guc *guc, const u32 *action, u32 len)
 {
-	return intel_guc_ct_send(&guc->ct, action, len, NULL, 0);
+	return intel_guc_ct_send(&guc->ct, action, len, NULL, 0, 0);
+}
+
+static
+inline int intel_guc_send_nb(struct intel_guc *guc, const u32 *action, u32 len)
+{
+	return intel_guc_ct_send(&guc->ct, action, len, NULL, 0,
+				 INTEL_GUC_CT_SEND_NB);
 }
 
 static inline int
@@ -82,7 +89,7 @@ intel_guc_send_and_receive(struct intel_guc *guc, const u32 *action, u32 len,
 			   u32 *response_buf, u32 response_buf_size)
 {
 	return intel_guc_ct_send(&guc->ct, action, len,
-				 response_buf, response_buf_size);
+				 response_buf, response_buf_size, 0);
 }
 
 static inline void intel_guc_to_host_event_handler(struct intel_guc *guc)
diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c b/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c
index 43e03aa2dde8..fb825cc1d090 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c
+++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c
@@ -3,6 +3,8 @@
  * Copyright © 2016-2019 Intel Corporation
  */
 
+#include <linux/circ_buf.h>
+
 #include "i915_drv.h"
 #include "intel_guc_ct.h"
 #include "gt/intel_gt.h"
@@ -373,7 +375,7 @@ static void write_barrier(struct intel_guc_ct *ct)
 static int ct_write(struct intel_guc_ct *ct,
 		    const u32 *action,
 		    u32 len /* in dwords */,
-		    u32 fence)
+		    u32 fence, u32 flags)
 {
 	struct intel_guc_ct_buffer *ctb = &ct->ctbs.send;
 	struct guc_ct_buffer_desc *desc = ctb->desc;
@@ -409,7 +411,7 @@ static int ct_write(struct intel_guc_ct *ct,
 		used = tail - head;
 
 	/* make sure there is a space including extra dw for the fence */
-	if (unlikely(used + len + 1 >= size))
+	if (unlikely(used + len + GUC_CTB_HDR_LEN >= size))
 		return -ENOSPC;
 
 	/*
@@ -421,9 +423,13 @@ static int ct_write(struct intel_guc_ct *ct,
 		 FIELD_PREP(GUC_CTB_MSG_0_NUM_DWORDS, len) |
 		 FIELD_PREP(GUC_CTB_MSG_0_FENCE, fence);
 
-	hxg = FIELD_PREP(GUC_HXG_MSG_0_TYPE, GUC_HXG_TYPE_REQUEST) |
-	      FIELD_PREP(GUC_HXG_REQUEST_MSG_0_ACTION |
-			 GUC_HXG_REQUEST_MSG_0_DATA0, action[0]);
+	hxg = (flags & INTEL_GUC_CT_SEND_NB) ?
+		(FIELD_PREP(GUC_HXG_MSG_0_TYPE, GUC_HXG_TYPE_EVENT) |
+		 FIELD_PREP(GUC_HXG_EVENT_MSG_0_ACTION |
+			    GUC_HXG_EVENT_MSG_0_DATA0, action[0])) :
+		(FIELD_PREP(GUC_HXG_MSG_0_TYPE, GUC_HXG_TYPE_REQUEST) |
+		 FIELD_PREP(GUC_HXG_REQUEST_MSG_0_ACTION |
+			    GUC_HXG_REQUEST_MSG_0_DATA0, action[0]));
 
 	CT_DEBUG(ct, "writing (tail %u) %*ph %*ph %*ph\n",
 		 tail, 4, &header, 4, &hxg, 4 * (len - 1), &action[1]);
@@ -500,6 +506,48 @@ static int wait_for_ct_request_update(struct ct_request *req, u32 *status)
 	return err;
 }
 
+static inline bool h2g_has_room(struct intel_guc_ct_buffer *ctb, u32 len_dw)
+{
+	struct guc_ct_buffer_desc *desc = ctb->desc;
+	u32 head = READ_ONCE(desc->head);
+	u32 space;
+
+	space = CIRC_SPACE(desc->tail, head, ctb->size);
+
+	return space >= len_dw;
+}
+
+static int ct_send_nb(struct intel_guc_ct *ct,
+		      const u32 *action,
+		      u32 len,
+		      u32 flags)
+{
+	struct intel_guc_ct_buffer *ctb = &ct->ctbs.send;
+	unsigned long spin_flags;
+	u32 fence;
+	int ret;
+
+	spin_lock_irqsave(&ctb->lock, spin_flags);
+
+	ret = h2g_has_room(ctb, len + GUC_CTB_HDR_LEN);
+	if (unlikely(!ret)) {
+		ret = -EBUSY;
+		goto out;
+	}
+
+	fence = ct_get_next_fence(ct);
+	ret = ct_write(ct, action, len, fence, flags);
+	if (unlikely(ret))
+		goto out;
+
+	intel_guc_notify(ct_to_guc(ct));
+
+out:
+	spin_unlock_irqrestore(&ctb->lock, spin_flags);
+
+	return ret;
+}
+
 static int ct_send(struct intel_guc_ct *ct,
 		   const u32 *action,
 		   u32 len,
@@ -507,8 +555,10 @@ static int ct_send(struct intel_guc_ct *ct,
 		   u32 response_buf_size,
 		   u32 *status)
 {
+	struct intel_guc_ct_buffer *ctb = &ct->ctbs.send;
 	struct ct_request request;
 	unsigned long flags;
+	unsigned int sleep_period_ms = 1;
 	u32 fence;
 	int err;
 
@@ -516,8 +566,24 @@ static int ct_send(struct intel_guc_ct *ct,
 	GEM_BUG_ON(!len);
 	GEM_BUG_ON(len & ~GUC_CT_MSG_LEN_MASK);
 	GEM_BUG_ON(!response_buf && response_buf_size);
+	might_sleep();
+
+	/*
+	 * We use a lazy spin wait loop here as we believe that if the CT
+	 * buffers are sized correctly the flow control condition should be
+	 * rare.
+	 */
+retry:
+	spin_lock_irqsave(&ctb->lock, flags);
+	if (unlikely(!h2g_has_room(ctb, len + GUC_CTB_HDR_LEN))) {
+		spin_unlock_irqrestore(&ctb->lock, flags);
 
-	spin_lock_irqsave(&ct->ctbs.send.lock, flags);
+		if (msleep_interruptible(sleep_period_ms))
+			return -EINTR;
+		sleep_period_ms = sleep_period_ms << 1;
+
+		goto retry;
+	}
 
 	fence = ct_get_next_fence(ct);
 	request.fence = fence;
@@ -529,9 +595,9 @@ static int ct_send(struct intel_guc_ct *ct,
 	list_add_tail(&request.link, &ct->requests.pending);
 	spin_unlock(&ct->requests.lock);
 
-	err = ct_write(ct, action, len, fence);
+	err = ct_write(ct, action, len, fence, 0);
 
-	spin_unlock_irqrestore(&ct->ctbs.send.lock, flags);
+	spin_unlock_irqrestore(&ctb->lock, flags);
 
 	if (unlikely(err))
 		goto unlink;
@@ -571,7 +637,7 @@ static int ct_send(struct intel_guc_ct *ct,
  * Command Transport (CT) buffer based GuC send function.
  */
 int intel_guc_ct_send(struct intel_guc_ct *ct, const u32 *action, u32 len,
-		      u32 *response_buf, u32 response_buf_size)
+		      u32 *response_buf, u32 response_buf_size, u32 flags)
 {
 	u32 status = ~0; /* undefined */
 	int ret;
@@ -581,6 +647,9 @@ int intel_guc_ct_send(struct intel_guc_ct *ct, const u32 *action, u32 len,
 		return -ENODEV;
 	}
 
+	if (flags & INTEL_GUC_CT_SEND_NB)
+		return ct_send_nb(ct, action, len, flags);
+
 	ret = ct_send(ct, action, len, response_buf, response_buf_size, &status);
 	if (unlikely(ret < 0)) {
 		CT_ERROR(ct, "Sending action %#x failed (err=%d status=%#X)\n",
diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.h b/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.h
index 1ae2dde6db93..5bb8bef024c8 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.h
+++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.h
@@ -42,7 +42,6 @@ struct intel_guc_ct_buffer {
 	bool broken;
 };
 
-
 /** Top-level structure for Command Transport related data
  *
  * Includes a pair of CT buffers for bi-directional communication and tracking
@@ -87,8 +86,9 @@ static inline bool intel_guc_ct_enabled(struct intel_guc_ct *ct)
 	return ct->enabled;
 }
 
+#define INTEL_GUC_CT_SEND_NB		BIT(31)
 int intel_guc_ct_send(struct intel_guc_ct *ct, const u32 *action, u32 len,
-		      u32 *response_buf, u32 response_buf_size);
+		      u32 *response_buf, u32 response_buf_size, u32 flags);
 void intel_guc_ct_event_handler(struct intel_guc_ct *ct);
 
 #endif /* _INTEL_GUC_CT_H_ */
-- 
2.28.0

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 38+ messages in thread

* [PATCH 5/7] drm/i915/guc: Add stall timer to non blocking CTB send function
  2021-07-01 17:15 ` [Intel-gfx] " Matthew Brost
@ 2021-07-01 17:15   ` Matthew Brost
  -1 siblings, 0 replies; 38+ messages in thread
From: Matthew Brost @ 2021-07-01 17:15 UTC (permalink / raw)
  To: intel-gfx, dri-devel; +Cc: john.c.harrison, Michal.Wajdeczko

Implement a stall timer that fails H2G CTB sends once no forward progress
has been made for GUC_CTB_TIMEOUT_MS (1500 ms), in order to prevent deadlock.

v2:
 (Michal)
  - Improve error message in ct_deadlocked()
  - Set broken when ct_deadlocked() returns true
  - Return -EPIPE on ct_deadlocked()
v3:
 (Michal)
  - Add ms to stall timer comment
 (Matthew)
  - Move broken check to intel_guc_ct_send()

Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
---
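For readers skimming the thread, the stall timer described above can be
sketched in plain userspace C. This is only an illustration under stated
assumptions, not the driver code: struct channel, channel_init() and
channel_check_room() are hypothetical names, time is passed in as a
millisecond counter instead of being read with ktime_get(), and INT64_MAX
stands in for KTIME_MAX as the "not stalled" marker.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

#define STALL_NONE       INT64_MAX /* stands in for KTIME_MAX: timer disarmed */
#define STALL_TIMEOUT_MS 1500      /* mirrors GUC_CTB_TIMEOUT_MS in the patch */

/* Hypothetical stand-in for the send side of the CT channel. */
struct channel {
	int64_t stall_start_ms; /* time of the first failed attempt, or STALL_NONE */
	bool broken;            /* latched once the stall timeout has expired */
};

static void channel_init(struct channel *ch)
{
	ch->stall_start_ms = STALL_NONE;
	ch->broken = false;
}

/*
 * Returns 0 if the send may proceed, -EBUSY if the caller should retry
 * later, or -EPIPE once the channel made no progress for too long.
 */
static int channel_check_room(struct channel *ch, bool has_room, int64_t now_ms)
{
	assert(now_ms >= 0);

	if (ch->broken)
		return -EPIPE;

	if (has_room) {
		ch->stall_start_ms = STALL_NONE; /* forward progress: disarm */
		return 0;
	}

	if (ch->stall_start_ms == STALL_NONE)
		ch->stall_start_ms = now_ms; /* first failure: arm the timer */

	if (now_ms - ch->stall_start_ms > STALL_TIMEOUT_MS) {
		ch->broken = true; /* every later send fails fast */
		return -EPIPE;
	}

	return -EBUSY;
}
```

The point of the design is that -EBUSY is a transient condition the
caller may retry, while -EPIPE is sticky: once the timeout expires the
channel stays broken until it is reset.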
 drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c | 62 ++++++++++++++++++++---
 drivers/gpu/drm/i915/gt/uc/intel_guc_ct.h |  4 ++
 2 files changed, 59 insertions(+), 7 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c b/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c
index fb825cc1d090..a9cb7b608520 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c
+++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c
@@ -4,6 +4,9 @@
  */
 
 #include <linux/circ_buf.h>
+#include <linux/ktime.h>
+#include <linux/time64.h>
+#include <linux/timekeeping.h>
 
 #include "i915_drv.h"
 #include "intel_guc_ct.h"
@@ -316,6 +319,7 @@ int intel_guc_ct_enable(struct intel_guc_ct *ct)
 		goto err_deregister;
 
 	ct->enabled = true;
+	ct->stall_time = KTIME_MAX;
 
 	return 0;
 
@@ -388,9 +392,6 @@ static int ct_write(struct intel_guc_ct *ct,
 	u32 *cmds = ctb->cmds;
 	unsigned int i;
 
-	if (unlikely(ctb->broken))
-		return -EPIPE;
-
 	if (unlikely(desc->status))
 		goto corrupted;
 
@@ -506,6 +507,25 @@ static int wait_for_ct_request_update(struct ct_request *req, u32 *status)
 	return err;
 }
 
+#define GUC_CTB_TIMEOUT_MS	1500
+static inline bool ct_deadlocked(struct intel_guc_ct *ct)
+{
+	long timeout = GUC_CTB_TIMEOUT_MS;
+	bool ret = ktime_ms_delta(ktime_get(), ct->stall_time) > timeout;
+
+	if (unlikely(ret)) {
+		struct guc_ct_buffer_desc *send = ct->ctbs.send.desc;
+		struct guc_ct_buffer_desc *recv = ct->ctbs.recv.desc;
+
+		CT_ERROR(ct, "Communication stalled for %lld ms, desc status=%#x,%#x\n",
+			 ktime_ms_delta(ktime_get(), ct->stall_time),
+			 send->status, recv->status);
+		ct->ctbs.send.broken = true;
+	}
+
+	return ret;
+}
+
 static inline bool h2g_has_room(struct intel_guc_ct_buffer *ctb, u32 len_dw)
 {
 	struct guc_ct_buffer_desc *desc = ctb->desc;
@@ -517,6 +537,26 @@ static inline bool h2g_has_room(struct intel_guc_ct_buffer *ctb, u32 len_dw)
 	return space >= len_dw;
 }
 
+static int has_room_nb(struct intel_guc_ct *ct, u32 len_dw)
+{
+	struct intel_guc_ct_buffer *ctb = &ct->ctbs.send;
+
+	lockdep_assert_held(&ct->ctbs.send.lock);
+
+	if (unlikely(!h2g_has_room(ctb, len_dw))) {
+		if (ct->stall_time == KTIME_MAX)
+			ct->stall_time = ktime_get();
+
+		if (unlikely(ct_deadlocked(ct)))
+			return -EPIPE;
+		else
+			return -EBUSY;
+	}
+
+	ct->stall_time = KTIME_MAX;
+	return 0;
+}
+
 static int ct_send_nb(struct intel_guc_ct *ct,
 		      const u32 *action,
 		      u32 len,
@@ -529,11 +569,9 @@ static int ct_send_nb(struct intel_guc_ct *ct,
 
 	spin_lock_irqsave(&ctb->lock, spin_flags);
 
-	ret = h2g_has_room(ctb, len + GUC_CTB_HDR_LEN);
-	if (unlikely(!ret)) {
-		ret = -EBUSY;
+	ret = has_room_nb(ct, len + GUC_CTB_HDR_LEN);
+	if (unlikely(ret))
 		goto out;
-	}
 
 	fence = ct_get_next_fence(ct);
 	ret = ct_write(ct, action, len, fence, flags);
@@ -576,8 +614,13 @@ static int ct_send(struct intel_guc_ct *ct,
 retry:
 	spin_lock_irqsave(&ctb->lock, flags);
 	if (unlikely(!h2g_has_room(ctb, len + GUC_CTB_HDR_LEN))) {
+		if (ct->stall_time == KTIME_MAX)
+			ct->stall_time = ktime_get();
 		spin_unlock_irqrestore(&ctb->lock, flags);
 
+		if (unlikely(ct_deadlocked(ct)))
+			return -EPIPE;
+
 		if (msleep_interruptible(sleep_period_ms))
 			return -EINTR;
 		sleep_period_ms = sleep_period_ms << 1;
@@ -585,6 +628,8 @@ static int ct_send(struct intel_guc_ct *ct,
 		goto retry;
 	}
 
+	ct->stall_time = KTIME_MAX;
+
 	fence = ct_get_next_fence(ct);
 	request.fence = fence;
 	request.status = 0;
@@ -647,6 +692,9 @@ int intel_guc_ct_send(struct intel_guc_ct *ct, const u32 *action, u32 len,
 		return -ENODEV;
 	}
 
+	if (unlikely(ct->ctbs.send.broken))
+		return -EPIPE;
+
 	if (flags & INTEL_GUC_CT_SEND_NB)
 		return ct_send_nb(ct, action, len, flags);
 
diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.h b/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.h
index 5bb8bef024c8..bee03794c1eb 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.h
+++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.h
@@ -9,6 +9,7 @@
 #include <linux/interrupt.h>
 #include <linux/spinlock.h>
 #include <linux/workqueue.h>
+#include <linux/ktime.h>
 
 #include "intel_guc_fwif.h"
 
@@ -68,6 +69,9 @@ struct intel_guc_ct {
 		struct list_head incoming; /* incoming requests */
 		struct work_struct worker; /* handler for incoming requests */
 	} requests;
+
+	/** @stall_time: time at which a CTB submission first stalled */
+	ktime_t stall_time;
 };
 
 void intel_guc_ct_init_early(struct intel_guc_ct *ct);
-- 
2.28.0


^ permalink raw reply related	[flat|nested] 38+ messages in thread

* [PATCH 6/7] drm/i915/guc: Optimize CTB writes and reads
  2021-07-01 17:15 ` [Intel-gfx] " Matthew Brost
@ 2021-07-01 17:15   ` Matthew Brost
  -1 siblings, 0 replies; 38+ messages in thread
From: Matthew Brost @ 2021-07-01 17:15 UTC (permalink / raw)
  To: intel-gfx, dri-devel; +Cc: john.c.harrison, Michal.Wajdeczko

CTB writes are now in the path of command submission and should be
optimized for performance. Rather than reading CTB descriptor values
(e.g. head, tail) which could result in accesses across the PCIe bus,
store shadow local copies and only read/write the descriptor values when
absolutely necessary. Also store the current space in each channel
locally.

v2:
 (Michal)
  - Add additional sanity checks for head / tail pointers
  - Use GUC_CTB_HDR_LEN rather than magic 1

Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
---
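The shadow-copy idea described above can be sketched in standalone C as
follows. This is a simplified illustration, not the driver code: struct
shared_desc, struct producer and the producer_*() helpers are hypothetical
names, the ring size is assumed to be a power of two, and the desc_reads
counter exists only to make the saved descriptor reads visible.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Free entries in a ring of `size` entries; one entry is always left unused. */
static uint32_t ring_space(uint32_t tail, uint32_t head, uint32_t size)
{
	return (head - tail - 1) & (size - 1);
}

/* Hypothetical shared descriptor; reading it is assumed to be expensive. */
struct shared_desc {
	volatile uint32_t head; /* advanced by the consumer */
	volatile uint32_t tail; /* advanced by the producer */
};

/* Hypothetical producer-side state holding the local shadow copies. */
struct producer {
	struct shared_desc *desc;
	uint32_t size;           /* ring size in entries, power of two */
	uint32_t tail;           /* shadow copy of desc->tail */
	uint32_t space;          /* cached lower bound on the free entries */
	unsigned int desc_reads; /* slow-path reads, for illustration only */
};

static void producer_init(struct producer *p, struct shared_desc *desc,
			  uint32_t size)
{
	assert(size != 0 && (size & (size - 1)) == 0);

	desc->head = 0;
	desc->tail = 0;
	p->desc = desc;
	p->size = size;
	p->tail = 0;
	p->space = ring_space(0, 0, size);
	p->desc_reads = 0;
}

static bool producer_has_room(struct producer *p, uint32_t len)
{
	uint32_t head;

	/* Fast path: the cached value can only underestimate the real space. */
	if (p->space >= len)
		return true;

	/* Slow path: refresh the cache from the shared descriptor. */
	head = p->desc->head;
	p->desc_reads++;
	p->space = ring_space(p->tail, head, p->size);

	return p->space >= len;
}

static void producer_commit(struct producer *p, uint32_t len)
{
	assert(p->space >= len);

	p->tail = (p->tail + len) & (p->size - 1);
	p->desc->tail = p->tail; /* publish the new tail to the consumer */
	p->space -= len;
}
```

Because the consumer only ever frees entries, the cached space is a safe
lower bound, so the descriptor head needs to be re-read only when the
cached value is too small for the message being written.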
 drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c | 88 +++++++++++++++--------
 drivers/gpu/drm/i915/gt/uc/intel_guc_ct.h |  6 ++
 2 files changed, 65 insertions(+), 29 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c b/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c
index a9cb7b608520..5b8b4ff609e2 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c
+++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c
@@ -130,6 +130,10 @@ static void guc_ct_buffer_desc_init(struct guc_ct_buffer_desc *desc)
 static void guc_ct_buffer_reset(struct intel_guc_ct_buffer *ctb)
 {
 	ctb->broken = false;
+	ctb->tail = 0;
+	ctb->head = 0;
+	ctb->space = CIRC_SPACE(ctb->tail, ctb->head, ctb->size);
+
 	guc_ct_buffer_desc_init(ctb->desc);
 }
 
@@ -383,10 +387,8 @@ static int ct_write(struct intel_guc_ct *ct,
 {
 	struct intel_guc_ct_buffer *ctb = &ct->ctbs.send;
 	struct guc_ct_buffer_desc *desc = ctb->desc;
-	u32 head = desc->head;
-	u32 tail = desc->tail;
+	u32 tail = ctb->tail;
 	u32 size = ctb->size;
-	u32 used;
 	u32 header;
 	u32 hxg;
 	u32 *cmds = ctb->cmds;
@@ -395,25 +397,22 @@ static int ct_write(struct intel_guc_ct *ct,
 	if (unlikely(desc->status))
 		goto corrupted;
 
-	if (unlikely((tail | head) >= size)) {
+	GEM_BUG_ON(tail > size);
+
+#ifdef CONFIG_DRM_I915_DEBUG_GUC
+	if (unlikely(tail != READ_ONCE(desc->tail))) {
+		CT_ERROR(ct, "Tail was modified %u != %u\n",
+			 desc->tail, ctb->tail);
+		desc->status |= GUC_CTB_STATUS_MISMATCH;
+		goto corrupted;
+	}
+	if (unlikely((desc->tail | desc->head) >= size)) {
 		CT_ERROR(ct, "Invalid offsets head=%u tail=%u (size=%u)\n",
-			 head, tail, size);
+			 desc->head, desc->tail, size);
 		desc->status |= GUC_CTB_STATUS_OVERFLOW;
 		goto corrupted;
 	}
-
-	/*
-	 * tail == head condition indicates empty. GuC FW does not support
-	 * using up the entire buffer to get tail == head meaning full.
-	 */
-	if (tail < head)
-		used = (size - head) + tail;
-	else
-		used = tail - head;
-
-	/* make sure there is a space including extra dw for the fence */
-	if (unlikely(used + len + GUC_CTB_HDR_LEN >= size))
-		return -ENOSPC;
+#endif
 
 	/*
 	 * dw0: CT header (including fence)
@@ -454,7 +453,9 @@ static int ct_write(struct intel_guc_ct *ct,
 	write_barrier(ct);
 
 	/* now update descriptor */
+	ctb->tail = tail;
 	WRITE_ONCE(desc->tail, tail);
+	ctb->space -= len + GUC_CTB_HDR_LEN;
 
 	return 0;
 
@@ -470,7 +471,7 @@ static int ct_write(struct intel_guc_ct *ct,
  * @req:	pointer to pending request
  * @status:	placeholder for status
  *
- * For each sent request, Guc shall send bac CT response message.
+ * For each sent request, GuC shall send back CT response message.
  * Our message handler will update status of tracked request once
  * response message with given fence is received. Wait here and
  * check for valid response status value.
@@ -526,24 +527,35 @@ static inline bool ct_deadlocked(struct intel_guc_ct *ct)
 	return ret;
 }
 
-static inline bool h2g_has_room(struct intel_guc_ct_buffer *ctb, u32 len_dw)
+static inline bool h2g_has_room(struct intel_guc_ct *ct, u32 len_dw)
 {
-	struct guc_ct_buffer_desc *desc = ctb->desc;
-	u32 head = READ_ONCE(desc->head);
+	struct intel_guc_ct_buffer *ctb = &ct->ctbs.send;
+	u32 head;
 	u32 space;
 
-	space = CIRC_SPACE(desc->tail, head, ctb->size);
+	if (ctb->space >= len_dw)
+		return true;
+
+	head = READ_ONCE(ctb->desc->head);
+	if (unlikely(head > ctb->size)) {
+		CT_ERROR(ct, "Corrupted descriptor head=%u tail=%u size=%u\n",
+			 ctb->desc->head, ctb->desc->tail, ctb->size);
+		ctb->desc->status |= GUC_CTB_STATUS_OVERFLOW;
+		ctb->broken = true;
+		return false;
+	}
+
+	space = CIRC_SPACE(ctb->tail, head, ctb->size);
+	ctb->space = space;
 
 	return space >= len_dw;
 }
 
 static int has_room_nb(struct intel_guc_ct *ct, u32 len_dw)
 {
-	struct intel_guc_ct_buffer *ctb = &ct->ctbs.send;
-
 	lockdep_assert_held(&ct->ctbs.send.lock);
 
-	if (unlikely(!h2g_has_room(ctb, len_dw))) {
+	if (unlikely(!h2g_has_room(ct, len_dw))) {
 		if (ct->stall_time == KTIME_MAX)
 			ct->stall_time = ktime_get();
 
@@ -613,7 +625,7 @@ static int ct_send(struct intel_guc_ct *ct,
 	 */
 retry:
 	spin_lock_irqsave(&ctb->lock, flags);
-	if (unlikely(!h2g_has_room(ctb, len + GUC_CTB_HDR_LEN))) {
+	if (unlikely(!h2g_has_room(ct, len + GUC_CTB_HDR_LEN))) {
 		if (ct->stall_time == KTIME_MAX)
 			ct->stall_time = ktime_get();
 		spin_unlock_irqrestore(&ctb->lock, flags);
@@ -733,7 +745,7 @@ static int ct_read(struct intel_guc_ct *ct, struct ct_incoming_msg **msg)
 {
 	struct intel_guc_ct_buffer *ctb = &ct->ctbs.recv;
 	struct guc_ct_buffer_desc *desc = ctb->desc;
-	u32 head = desc->head;
+	u32 head = ctb->head;
 	u32 tail = desc->tail;
 	u32 size = ctb->size;
 	u32 *cmds = ctb->cmds;
@@ -748,12 +760,29 @@ static int ct_read(struct intel_guc_ct *ct, struct ct_incoming_msg **msg)
 	if (unlikely(desc->status))
 		goto corrupted;
 
-	if (unlikely((tail | head) >= size)) {
+	GEM_BUG_ON(head > size);
+
+#ifdef CONFIG_DRM_I915_DEBUG_GUC
+	if (unlikely(head != READ_ONCE(desc->head))) {
+		CT_ERROR(ct, "Head was modified %u != %u\n",
+			 desc->head, ctb->head);
+		desc->status |= GUC_CTB_STATUS_MISMATCH;
+		goto corrupted;
+	}
+	if (unlikely((desc->tail | desc->head) >= size)) {
+		CT_ERROR(ct, "Invalid offsets head=%u tail=%u (size=%u)\n",
+			 head, tail, size);
+		desc->status |= GUC_CTB_STATUS_OVERFLOW;
+		goto corrupted;
+	}
+#else
+	if (unlikely((tail | ctb->head) >= size)) {
 		CT_ERROR(ct, "Invalid offsets head=%u tail=%u (size=%u)\n",
 			 head, tail, size);
 		desc->status |= GUC_CTB_STATUS_OVERFLOW;
 		goto corrupted;
 	}
+#endif
 
 	/* tail == head condition indicates empty */
 	available = tail - head;
@@ -803,6 +832,7 @@ static int ct_read(struct intel_guc_ct *ct, struct ct_incoming_msg **msg)
 	}
 	CT_DEBUG(ct, "received %*ph\n", 4 * len, (*msg)->msg);
 
+	ctb->head = head;
 	/* now update descriptor */
 	WRITE_ONCE(desc->head, head);
 
diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.h b/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.h
index bee03794c1eb..edd1bba0445d 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.h
+++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.h
@@ -33,6 +33,9 @@ struct intel_guc;
  * @desc: pointer to the buffer descriptor
  * @cmds: pointer to the commands buffer
  * @size: size of the commands buffer in dwords
+ * @head: local shadow copy of head in dwords
+ * @tail: local shadow copy of tail in dwords
+ * @space: local shadow copy of space in dwords
  * @broken: flag to indicate if descriptor data is broken
  */
 struct intel_guc_ct_buffer {
@@ -40,6 +43,9 @@ struct intel_guc_ct_buffer {
 	struct guc_ct_buffer_desc *desc;
 	u32 *cmds;
 	u32 size;
+	u32 tail;
+	u32 head;
+	u32 space;
 	bool broken;
 };
 
-- 
2.28.0


^ permalink raw reply related	[flat|nested] 38+ messages in thread

* [PATCH 7/7] drm/i915/guc: Module load failure test for CT buffer creation
  2021-07-01 17:15 ` [Intel-gfx] " Matthew Brost
@ 2021-07-01 17:15   ` Matthew Brost
  -1 siblings, 0 replies; 38+ messages in thread
From: Matthew Brost @ 2021-07-01 17:15 UTC (permalink / raw)
  To: intel-gfx, dri-devel; +Cc: john.c.harrison, Michal.Wajdeczko

From: John Harrison <John.C.Harrison@Intel.com>

Add several module load failure injection points to the CT buffer creation
code path.

Signed-off-by: John Harrison <john.c.harrison@intel.com>
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
---
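As a rough illustration of what an injection point does, the sketch below
models a counting checkpoint scheme in standalone C. It is an assumption
about the general technique, not a copy of i915_inject_probe_error(): the
inject_failure_at knob, the checkpoint counter and the *_sketch() functions
are all hypothetical.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical knob: fail at the Nth checkpoint reached, 0 disables. */
static int inject_failure_at;
static int checkpoints_seen;

/* Returns err at the selected checkpoint, 0 everywhere else. */
static int inject_probe_error(int err)
{
	assert(err < 0);

	if (inject_failure_at <= 0)
		return 0;
	if (++checkpoints_seen < inject_failure_at)
		return 0;

	inject_failure_at = 0; /* one-shot: later checkpoints pass */
	return err;
}

static int buffers_allocated; /* stands in for state that must be unwound */

static int ct_init_sketch(void)
{
	int err = inject_probe_error(-ENXIO);

	if (err)
		return err;

	buffers_allocated = 1;
	return 0;
}

static int ct_register_sketch(void)
{
	return inject_probe_error(-ENXIO);
}

static int probe_sketch(void)
{
	int err;

	err = ct_init_sketch();
	if (err)
		return err;

	err = ct_register_sketch();
	if (err) {
		buffers_allocated = 0; /* unwind what ct_init_sketch() did */
		return err;
	}

	return 0;
}
```

Sweeping the knob over 1, 2, ... exercises each error path in turn, which
is what makes additional checkpoints in the creation path useful for
testing the unwind code.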
 drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c b/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c
index 5b8b4ff609e2..d2a55521ef25 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c
+++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c
@@ -175,6 +175,10 @@ static int ct_register_buffer(struct intel_guc_ct *ct, u32 type,
 {
 	int err;
 
+	err = i915_inject_probe_error(guc_to_gt(ct_to_guc(ct))->i915, -ENXIO);
+	if (unlikely(err))
+		return err;
+
 	err = guc_action_register_ct_buffer(ct_to_guc(ct), type,
 					    desc_addr, buff_addr, size);
 	if (unlikely(err))
@@ -226,6 +230,10 @@ int intel_guc_ct_init(struct intel_guc_ct *ct)
 	u32 *cmds;
 	int err;
 
+	err = i915_inject_probe_error(guc_to_gt(guc)->i915, -ENXIO);
+	if (err)
+		return err;
+
 	GEM_BUG_ON(ct->vma);
 
 	blob_size = 2 * CTB_DESC_SIZE + CTB_H2G_BUFFER_SIZE + CTB_G2H_BUFFER_SIZE;
-- 
2.28.0


^ permalink raw reply related	[flat|nested] 38+ messages in thread

* [Intel-gfx] ✗ Fi.CI.CHECKPATCH: warning for CT changes required for GuC submission (rev2)
  2021-07-01 17:15 ` [Intel-gfx] " Matthew Brost
                   ` (7 preceding siblings ...)
  (?)
@ 2021-07-01 23:20 ` Patchwork
  -1 siblings, 0 replies; 38+ messages in thread
From: Patchwork @ 2021-07-01 23:20 UTC (permalink / raw)
  To: Matthew Brost; +Cc: intel-gfx

== Series Details ==

Series: CT changes required for GuC submission (rev2)
URL   : https://patchwork.freedesktop.org/series/91943/
State : warning

== Summary ==

$ dim checkpatch origin/drm-tip
1bcca481ebc1 drm/i915/guc: Relax CTB response timeout
ee04968a1998 drm/i915/guc: Improve error message for unsolicited CT response
8cab0a3b4751 drm/i915/guc: Increase size of CTB buffers
239769e0c636 drm/i915/guc: Add non blocking CTB send function
2668763f4f82 drm/i915/guc: Add stall timer to non blocking CTB send function
5e72f3dc91ad drm/i915/guc: Optimize CTB writes and reads
b25f638c19a5 drm/i915/guc: Module load failure test for CT buffer creation
-:38: WARNING:FROM_SIGN_OFF_MISMATCH: From:/Signed-off-by: email address mismatch: 'From: John Harrison <John.C.Harrison@Intel.com>' != 'Signed-off-by: John Harrison <john.c.harrison@intel.com>'

total: 0 errors, 1 warnings, 0 checks, 20 lines checked


_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 38+ messages in thread

* [Intel-gfx] ✗ Fi.CI.SPARSE: warning for CT changes required for GuC submission (rev2)
  2021-07-01 17:15 ` [Intel-gfx] " Matthew Brost
                   ` (8 preceding siblings ...)
  (?)
@ 2021-07-01 23:21 ` Patchwork
  -1 siblings, 0 replies; 38+ messages in thread
From: Patchwork @ 2021-07-01 23:21 UTC (permalink / raw)
  To: Matthew Brost; +Cc: intel-gfx

== Series Details ==

Series: CT changes required for GuC submission (rev2)
URL   : https://patchwork.freedesktop.org/series/91943/
State : warning

== Summary ==

$ dim sparse --fast origin/drm-tip
Sparse version: v0.6.2
Fast mode used, each commit won't be checked separately.
-
+drivers/gpu/drm/i915/display/intel_display.c:1896:21:    expected struct i915_vma *[assigned] vma
+drivers/gpu/drm/i915/display/intel_display.c:1896:21:    got void [noderef] __iomem *[assigned] iomem
+drivers/gpu/drm/i915/display/intel_display.c:1896:21: warning: incorrect type in assignment (different address spaces)
+drivers/gpu/drm/i915/gt/intel_engine_stats.h:27:9: warning: trying to copy expression type 31
+drivers/gpu/drm/i915/gt/intel_engine_stats.h:27:9: warning: trying to copy expression type 31
+drivers/gpu/drm/i915/gt/intel_engine_stats.h:27:9: warning: trying to copy expression type 31
+drivers/gpu/drm/i915/gt/intel_engine_stats.h:32:9: warning: trying to copy expression type 31
+drivers/gpu/drm/i915/gt/intel_engine_stats.h:32:9: warning: trying to copy expression type 31
+drivers/gpu/drm/i915/gt/intel_engine_stats.h:49:9: warning: trying to copy expression type 31
+drivers/gpu/drm/i915/gt/intel_engine_stats.h:49:9: warning: trying to copy expression type 31
+drivers/gpu/drm/i915/gt/intel_engine_stats.h:49:9: warning: trying to copy expression type 31
+drivers/gpu/drm/i915/gt/intel_engine_stats.h:56:9: warning: trying to copy expression type 31
+drivers/gpu/drm/i915/gt/intel_engine_stats.h:56:9: warning: trying to copy expression type 31
+drivers/gpu/drm/i915/gt/intel_reset.c:1396:5: warning: context imbalance in 'intel_gt_reset_trylock' - different lock contexts for basic block
+drivers/gpu/drm/i915/gt/intel_ring_submission.c:1207:24: warning: Using plain integer as NULL pointer
+drivers/gpu/drm/i915/i915_perf.c:1434:15: warning: memset with byte count of 16777216
+drivers/gpu/drm/i915/i915_perf.c:1488:15: warning: memset with byte count of 16777216
+./include/asm-generic/bitops/find.h:112:45: warning: shift count is negative (-262080)
+./include/asm-generic/bitops/find.h:32:31: warning: shift count is negative (-262080)
+./include/linux/spinlock.h:409:9: warning: context imbalance in 'fwtable_read16' - different lock contexts for basic block
+./include/linux/spinlock.h:409:9: warning: context imbalance in 'fwtable_read32' - different lock contexts for basic block
+./include/linux/spinlock.h:409:9: warning: context imbalance in 'fwtable_read64' - different lock contexts for basic block
+./include/linux/spinlock.h:409:9: warning: context imbalance in 'fwtable_read8' - different lock contexts for basic block
+./include/linux/spinlock.h:409:9: warning: context imbalance in 'fwtable_write16' - different lock contexts for basic block
+./include/linux/spinlock.h:409:9: warning: context imbalance in 'fwtable_write32' - different lock contexts for basic block
+./include/linux/spinlock.h:409:9: warning: context imbalance in 'fwtable_write8' - different lock contexts for basic block
+./include/linux/spinlock.h:409:9: warning: context imbalance in 'gen11_fwtable_read16' - different lock contexts for basic block
+./include/linux/spinlock.h:409:9: warning: context imbalance in 'gen11_fwtable_read32' - different lock contexts for basic block
+./include/linux/spinlock.h:409:9: warning: context imbalance in 'gen11_fwtable_read64' - different lock contexts for basic block
+./include/linux/spinlock.h:409:9: warning: context imbalance in 'gen11_fwtable_read8' - different lock contexts for basic block
+./include/linux/spinlock.h:409:9: warning: context imbalance in 'gen11_fwtable_write16' - different lock contexts for basic block
+./include/linux/spinlock.h:409:9: warning: context imbalance in 'gen11_fwtable_write32' - different lock contexts for basic block
+./include/linux/spinlock.h:409:9: warning: context imbalance in 'gen11_fwtable_write8' - different lock contexts for basic block
+./include/linux/spinlock.h:409:9: warning: context imbalance in 'gen12_fwtable_read16' - different lock contexts for basic block
+./include/linux/spinlock.h:409:9: warning: context imbalance in 'gen12_fwtable_read32' - different lock contexts for basic block
+./include/linux/spinlock.h:409:9: warning: context imbalance in 'gen12_fwtable_read64' - different lock contexts for basic block
+./include/linux/spinlock.h:409:9: warning: context imbalance in 'gen12_fwtable_read8' - different lock contexts for basic block
+./include/linux/spinlock.h:409:9: warning: context imbalance in 'gen12_fwtable_write16' - different lock contexts for basic block
+./include/linux/spinlock.h:409:9: warning: context imbalance in 'gen12_fwtable_write32' - different lock contexts for basic block
+./include/linux/spinlock.h:409:9: warning: context imbalance in 'gen12_fwtable_write8' - different lock contexts for basic block
+./include/linux/spinlock.h:409:9: warning: context imbalance in 'gen6_read16' - different lock contexts for basic block
+./include/linux/spinlock.h:409:9: warning: context imbalance in 'gen6_read32' - different lock contexts for basic block
+./include/linux/spinlock.h:409:9: warning: context imbalance in 'gen6_read64' - different lock contexts for basic block
+./include/linux/spinlock.h:409:9: warning: context imbalance in 'gen6_read8' - different lock contexts for basic block
+./include/linux/spinlock.h:409:9: warning: context imbalance in 'gen6_write16' - different lock contexts for basic block
+./include/linux/spinlock.h:409:9: warning: context imbalance in 'gen6_write32' - different lock contexts for basic block
+./include/linux/spinlock.h:409:9: warning: context imbalance in 'gen6_write8' - different lock contexts for basic block
+./include/linux/spinlock.h:409:9: warning: context imbalance in 'gen8_write16' - different lock contexts for basic block
+./include/linux/spinlock.h:409:9: warning: context imbalance in 'gen8_write32' - different lock contexts for basic block
+./include/linux/spinlock.h:409:9: warning: context imbalance in 'gen8_write8' - different lock contexts for basic block


^ permalink raw reply	[flat|nested] 38+ messages in thread

* [Intel-gfx] ✓ Fi.CI.BAT: success for CT changes required for GuC submission (rev2)
  2021-07-01 17:15 ` [Intel-gfx] " Matthew Brost
                   ` (9 preceding siblings ...)
  (?)
@ 2021-07-01 23:49 ` Patchwork
  -1 siblings, 0 replies; 38+ messages in thread
From: Patchwork @ 2021-07-01 23:49 UTC (permalink / raw)
  To: Matthew Brost; +Cc: intel-gfx


== Series Details ==

Series: CT changes required for GuC submission (rev2)
URL   : https://patchwork.freedesktop.org/series/91943/
State : success

== Summary ==

CI Bug Log - changes from CI_DRM_10301 -> Patchwork_20513
====================================================

Summary
-------

  **SUCCESS**

  No regressions found.

  External URL: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20513/index.html

Known issues
------------

  Here are the changes found in Patchwork_20513 that come from known issues:

### IGT changes ###

#### Issues hit ####

  * igt@amdgpu/amd_basic@query-info:
    - fi-bsw-kefka:       NOTRUN -> [SKIP][1] ([fdo#109271]) +17 similar issues
   [1]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20513/fi-bsw-kefka/igt@amdgpu/amd_basic@query-info.html

  * igt@amdgpu/amd_basic@semaphore:
    - fi-bsw-nick:        NOTRUN -> [SKIP][2] ([fdo#109271]) +17 similar issues
   [2]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20513/fi-bsw-nick/igt@amdgpu/amd_basic@semaphore.html

  
#### Possible fixes ####

  * igt@i915_selftest@live@execlists:
    - fi-bsw-nick:        [INCOMPLETE][3] ([i915#2782] / [i915#2940]) -> [PASS][4]
   [3]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10301/fi-bsw-nick/igt@i915_selftest@live@execlists.html
   [4]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20513/fi-bsw-nick/igt@i915_selftest@live@execlists.html
    - fi-bsw-kefka:       [INCOMPLETE][5] ([i915#2782] / [i915#2940]) -> [PASS][6]
   [5]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10301/fi-bsw-kefka/igt@i915_selftest@live@execlists.html
   [6]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20513/fi-bsw-kefka/igt@i915_selftest@live@execlists.html

  
  [fdo#109271]: https://bugs.freedesktop.org/show_bug.cgi?id=109271
  [i915#2782]: https://gitlab.freedesktop.org/drm/intel/issues/2782
  [i915#2940]: https://gitlab.freedesktop.org/drm/intel/issues/2940


Participating hosts (37 -> 35)
------------------------------

  Missing    (2): fi-bsw-cyan fi-bdw-samus 


Build changes
-------------

  * Linux: CI_DRM_10301 -> Patchwork_20513

  CI-20190529: 20190529
  CI_DRM_10301: 3d3ff5917ce204b783f4847c14e8839fde358a3a @ git://anongit.freedesktop.org/gfx-ci/linux
  IGT_6128: b24e5949af7e51f0af484d2ce4cb4c5a41ac5358 @ https://gitlab.freedesktop.org/drm/igt-gpu-tools.git
  Patchwork_20513: b25f638c19a501484ef4ed53f1971453acd87195 @ git://anongit.freedesktop.org/gfx-ci/linux


== Linux commits ==

b25f638c19a5 drm/i915/guc: Module load failure test for CT buffer creation
5e72f3dc91ad drm/i915/guc: Optimize CTB writes and reads
2668763f4f82 drm/i915/guc: Add stall timer to non blocking CTB send function
239769e0c636 drm/i915/guc: Add non blocking CTB send function
8cab0a3b4751 drm/i915/guc: Increase size of CTB buffers
ee04968a1998 drm/i915/guc: Improve error message for unsolicited CT response
1bcca481ebc1 drm/i915/guc: Relax CTB response timeout

== Logs ==

For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20513/index.html

^ permalink raw reply	[flat|nested] 38+ messages in thread

* [Intel-gfx] ✗ Fi.CI.IGT: failure for CT changes required for GuC submission (rev2)
  2021-07-01 17:15 ` [Intel-gfx] " Matthew Brost
                   ` (10 preceding siblings ...)
  (?)
@ 2021-07-02  6:55 ` Patchwork
  -1 siblings, 0 replies; 38+ messages in thread
From: Patchwork @ 2021-07-02  6:55 UTC (permalink / raw)
  To: Matthew Brost; +Cc: intel-gfx


== Series Details ==

Series: CT changes required for GuC submission (rev2)
URL   : https://patchwork.freedesktop.org/series/91943/
State : failure

== Summary ==

CI Bug Log - changes from CI_DRM_10301_full -> Patchwork_20513_full
====================================================

Summary
-------

  **FAILURE**

  Serious unknown changes coming with Patchwork_20513_full absolutely need to be
  verified manually.
  
  If you think the reported changes have nothing to do with the changes
  introduced in Patchwork_20513_full, please notify your bug team to allow them
  to document this new failure mode, which will reduce false positives in CI.

  

Possible new issues
-------------------

  Here are the unknown changes that may have been introduced in Patchwork_20513_full:

### CI changes ###

#### Possible regressions ####

  * boot:
    - shard-tglb:         ([PASS][1], [PASS][2], [PASS][3], [PASS][4], [PASS][5], [PASS][6], [PASS][7], [PASS][8], [PASS][9], [PASS][10], [PASS][11], [PASS][12], [PASS][13], [PASS][14], [PASS][15], [PASS][16], [PASS][17], [PASS][18], [PASS][19], [PASS][20], [PASS][21], [PASS][22], [PASS][23], [PASS][24], [PASS][25]) -> ([PASS][26], [PASS][27], [PASS][28], [PASS][29], [PASS][30], [PASS][31], [PASS][32], [PASS][33], [PASS][34], [PASS][35], [PASS][36], [PASS][37], [PASS][38], [PASS][39], [PASS][40], [PASS][41], [PASS][42], [PASS][43], [FAIL][44], [PASS][45], [PASS][46], [PASS][47], [PASS][48], [PASS][49])
   [1]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10301/shard-tglb7/boot.html
   [2]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10301/shard-tglb7/boot.html
   [3]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10301/shard-tglb7/boot.html
   [4]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10301/shard-tglb7/boot.html
   [5]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10301/shard-tglb7/boot.html
   [6]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10301/shard-tglb6/boot.html
   [7]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10301/shard-tglb6/boot.html
   [8]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10301/shard-tglb6/boot.html
   [9]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10301/shard-tglb6/boot.html
   [10]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10301/shard-tglb5/boot.html
   [11]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10301/shard-tglb5/boot.html
   [12]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10301/shard-tglb5/boot.html
   [13]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10301/shard-tglb5/boot.html
   [14]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10301/shard-tglb3/boot.html
   [15]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10301/shard-tglb3/boot.html
   [16]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10301/shard-tglb3/boot.html
   [17]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10301/shard-tglb3/boot.html
   [18]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10301/shard-tglb2/boot.html
   [19]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10301/shard-tglb2/boot.html
   [20]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10301/shard-tglb2/boot.html
   [21]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10301/shard-tglb2/boot.html
   [22]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10301/shard-tglb1/boot.html
   [23]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10301/shard-tglb1/boot.html
   [24]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10301/shard-tglb1/boot.html
   [25]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10301/shard-tglb1/boot.html
   [26]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20513/shard-tglb7/boot.html
   [27]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20513/shard-tglb7/boot.html
   [28]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20513/shard-tglb7/boot.html
   [29]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20513/shard-tglb7/boot.html
   [30]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20513/shard-tglb6/boot.html
   [31]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20513/shard-tglb6/boot.html
   [32]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20513/shard-tglb6/boot.html
   [33]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20513/shard-tglb5/boot.html
   [34]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20513/shard-tglb5/boot.html
   [35]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20513/shard-tglb5/boot.html
   [36]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20513/shard-tglb5/boot.html
   [37]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20513/shard-tglb3/boot.html
   [38]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20513/shard-tglb3/boot.html
   [39]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20513/shard-tglb3/boot.html
   [40]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20513/shard-tglb3/boot.html
   [41]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20513/shard-tglb3/boot.html
   [42]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20513/shard-tglb2/boot.html
   [43]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20513/shard-tglb2/boot.html
   [44]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20513/shard-tglb2/boot.html
   [45]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20513/shard-tglb2/boot.html
   [46]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20513/shard-tglb1/boot.html
   [47]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20513/shard-tglb1/boot.html
   [48]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20513/shard-tglb1/boot.html
   [49]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20513/shard-tglb1/boot.html

  

### IGT changes ###

#### Possible regressions ####

  * igt@kms_draw_crc@draw-method-xrgb8888-render-ytiled:
    - shard-glk:          [PASS][50] -> [FAIL][51]
   [50]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10301/shard-glk6/igt@kms_draw_crc@draw-method-xrgb8888-render-ytiled.html
   [51]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20513/shard-glk4/igt@kms_draw_crc@draw-method-xrgb8888-render-ytiled.html

  
Known issues
------------

  Here are the changes found in Patchwork_20513_full that come from known issues:

### IGT changes ###

#### Issues hit ####

  * igt@gem_ctx_persistence@engines-hostile:
    - shard-snb:          NOTRUN -> [SKIP][52] ([fdo#109271] / [i915#1099])
   [52]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20513/shard-snb7/igt@gem_ctx_persistence@engines-hostile.html

  * igt@gem_ctx_persistence@many-contexts:
    - shard-tglb:         [PASS][53] -> [FAIL][54] ([i915#2410])
   [53]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10301/shard-tglb5/igt@gem_ctx_persistence@many-contexts.html
   [54]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20513/shard-tglb3/igt@gem_ctx_persistence@many-contexts.html

  * igt@gem_eio@unwedge-stress:
    - shard-tglb:         [PASS][55] -> [TIMEOUT][56] ([i915#2369] / [i915#3063] / [i915#3648])
   [55]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10301/shard-tglb7/igt@gem_eio@unwedge-stress.html
   [56]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20513/shard-tglb1/igt@gem_eio@unwedge-stress.html

  * igt@gem_exec_fair@basic-deadline:
    - shard-kbl:          [PASS][57] -> [FAIL][58] ([i915#2846])
   [57]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10301/shard-kbl2/igt@gem_exec_fair@basic-deadline.html
   [58]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20513/shard-kbl4/igt@gem_exec_fair@basic-deadline.html

  * igt@gem_exec_fair@basic-none@vcs1:
    - shard-iclb:         NOTRUN -> [FAIL][59] ([i915#2842])
   [59]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20513/shard-iclb4/igt@gem_exec_fair@basic-none@vcs1.html

  * igt@gem_exec_fair@basic-pace-solo@rcs0:
    - shard-iclb:         [PASS][60] -> [FAIL][61] ([i915#2842])
   [60]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10301/shard-iclb4/igt@gem_exec_fair@basic-pace-solo@rcs0.html
   [61]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20513/shard-iclb7/igt@gem_exec_fair@basic-pace-solo@rcs0.html
    - shard-glk:          [PASS][62] -> [FAIL][63] ([i915#2842])
   [62]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10301/shard-glk7/igt@gem_exec_fair@basic-pace-solo@rcs0.html
   [63]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20513/shard-glk8/igt@gem_exec_fair@basic-pace-solo@rcs0.html

  * igt@gem_exec_fair@basic-pace@vecs0:
    - shard-kbl:          [PASS][64] -> [FAIL][65] ([i915#2842])
   [64]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10301/shard-kbl7/igt@gem_exec_fair@basic-pace@vecs0.html
   [65]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20513/shard-kbl3/igt@gem_exec_fair@basic-pace@vecs0.html

  * igt@gem_fenced_exec_thrash@too-many-fences:
    - shard-snb:          [PASS][66] -> [INCOMPLETE][67] ([i915#2055])
   [66]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10301/shard-snb7/igt@gem_fenced_exec_thrash@too-many-fences.html
   [67]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20513/shard-snb5/igt@gem_fenced_exec_thrash@too-many-fences.html

  * igt@gem_mmap_gtt@cpuset-big-copy-odd:
    - shard-iclb:         [PASS][68] -> [FAIL][69] ([i915#307])
   [68]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10301/shard-iclb2/igt@gem_mmap_gtt@cpuset-big-copy-odd.html
   [69]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20513/shard-iclb7/igt@gem_mmap_gtt@cpuset-big-copy-odd.html

  * igt@gen9_exec_parse@bb-large:
    - shard-apl:          NOTRUN -> [FAIL][70] ([i915#3296])
   [70]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20513/shard-apl2/igt@gen9_exec_parse@bb-large.html

  * igt@kms_big_fb@x-tiled-16bpp-rotate-270:
    - shard-iclb:         NOTRUN -> [SKIP][71] ([fdo#110725] / [fdo#111614])
   [71]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20513/shard-iclb3/igt@kms_big_fb@x-tiled-16bpp-rotate-270.html

  * igt@kms_big_fb@yf-tiled-8bpp-rotate-90:
    - shard-snb:          NOTRUN -> [SKIP][72] ([fdo#109271]) +49 similar issues
   [72]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20513/shard-snb7/igt@kms_big_fb@yf-tiled-8bpp-rotate-90.html

  * igt@kms_color_chamelium@pipe-b-ctm-red-to-blue:
    - shard-apl:          NOTRUN -> [SKIP][73] ([fdo#109271] / [fdo#111827]) +11 similar issues
   [73]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20513/shard-apl2/igt@kms_color_chamelium@pipe-b-ctm-red-to-blue.html

  * igt@kms_color_chamelium@pipe-c-gamma:
    - shard-glk:          NOTRUN -> [SKIP][74] ([fdo#109271] / [fdo#111827]) +2 similar issues
   [74]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20513/shard-glk5/igt@kms_color_chamelium@pipe-c-gamma.html

  * igt@kms_color_chamelium@pipe-invalid-ctm-matrix-sizes:
    - shard-snb:          NOTRUN -> [SKIP][75] ([fdo#109271] / [fdo#111827]) +3 similar issues
   [75]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20513/shard-snb7/igt@kms_color_chamelium@pipe-invalid-ctm-matrix-sizes.html

  * igt@kms_content_protection@atomic-dpms:
    - shard-apl:          NOTRUN -> [TIMEOUT][76] ([i915#1319])
   [76]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20513/shard-apl6/igt@kms_content_protection@atomic-dpms.html

  * igt@kms_cursor_legacy@flip-vs-cursor-atomic:
    - shard-skl:          [PASS][77] -> [FAIL][78] ([i915#2346]) +1 similar issue
   [77]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10301/shard-skl4/igt@kms_cursor_legacy@flip-vs-cursor-atomic.html
   [78]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20513/shard-skl9/igt@kms_cursor_legacy@flip-vs-cursor-atomic.html

  * igt@kms_draw_crc@draw-method-rgb565-blt-ytiled:
    - shard-skl:          [PASS][79] -> [DMESG-WARN][80] ([i915#1982]) +2 similar issues
   [79]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10301/shard-skl9/igt@kms_draw_crc@draw-method-rgb565-blt-ytiled.html
   [80]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20513/shard-skl3/igt@kms_draw_crc@draw-method-rgb565-blt-ytiled.html

  * igt@kms_flip@flip-vs-suspend-interruptible@c-dp1:
    - shard-apl:          NOTRUN -> [DMESG-WARN][81] ([i915#180]) +2 similar issues
   [81]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20513/shard-apl3/igt@kms_flip@flip-vs-suspend-interruptible@c-dp1.html

  * igt@kms_flip@plain-flip-ts-check@a-edp1:
    - shard-skl:          [PASS][82] -> [FAIL][83] ([i915#2122])
   [82]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10301/shard-skl1/igt@kms_flip@plain-flip-ts-check@a-edp1.html
   [83]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20513/shard-skl8/igt@kms_flip@plain-flip-ts-check@a-edp1.html

  * igt@kms_frontbuffer_tracking@fbc-2p-primscrn-spr-indfb-draw-pwrite:
    - shard-iclb:         NOTRUN -> [SKIP][84] ([fdo#109280])
   [84]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20513/shard-iclb3/igt@kms_frontbuffer_tracking@fbc-2p-primscrn-spr-indfb-draw-pwrite.html

  * igt@kms_frontbuffer_tracking@fbcpsr-rgb565-draw-mmap-gtt:
    - shard-glk:          NOTRUN -> [SKIP][85] ([fdo#109271]) +39 similar issues
   [85]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20513/shard-glk5/igt@kms_frontbuffer_tracking@fbcpsr-rgb565-draw-mmap-gtt.html

  * igt@kms_hdr@bpc-switch-suspend:
    - shard-skl:          [PASS][86] -> [FAIL][87] ([i915#1188])
   [86]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10301/shard-skl9/igt@kms_hdr@bpc-switch-suspend.html
   [87]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20513/shard-skl10/igt@kms_hdr@bpc-switch-suspend.html

  * igt@kms_plane_alpha_blend@pipe-a-alpha-basic:
    - shard-apl:          NOTRUN -> [FAIL][88] ([fdo#108145] / [i915#265]) +2 similar issues
   [88]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20513/shard-apl7/igt@kms_plane_alpha_blend@pipe-a-alpha-basic.html

  * igt@kms_plane_alpha_blend@pipe-b-constant-alpha-max:
    - shard-glk:          NOTRUN -> [FAIL][89] ([fdo#108145] / [i915#265])
   [89]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20513/shard-glk5/igt@kms_plane_alpha_blend@pipe-b-constant-alpha-max.html

  * igt@kms_plane_scaling@scaler-with-clipping-clamping@pipe-c-scaler-with-clipping-clamping:
    - shard-apl:          NOTRUN -> [SKIP][90] ([fdo#109271] / [i915#2733])
   [90]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20513/shard-apl2/igt@kms_plane_scaling@scaler-with-clipping-clamping@pipe-c-scaler-with-clipping-clamping.html

  * igt@kms_psr2_sf@overlay-plane-update-sf-dmg-area-4:
    - shard-glk:          NOTRUN -> [SKIP][91] ([fdo#109271] / [i915#658])
   [91]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20513/shard-glk5/igt@kms_psr2_sf@overlay-plane-update-sf-dmg-area-4.html

  * igt@kms_psr2_sf@overlay-primary-update-sf-dmg-area-4:
    - shard-apl:          NOTRUN -> [SKIP][92] ([fdo#109271] / [i915#658]) +3 similar issues
   [92]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20513/shard-apl7/igt@kms_psr2_sf@overlay-primary-update-sf-dmg-area-4.html

  * igt@kms_psr@psr2_sprite_mmap_gtt:
    - shard-iclb:         [PASS][93] -> [SKIP][94] ([fdo#109441]) +2 similar issues
   [93]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10301/shard-iclb2/igt@kms_psr@psr2_sprite_mmap_gtt.html
   [94]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20513/shard-iclb7/igt@kms_psr@psr2_sprite_mmap_gtt.html

  * igt@nouveau_crc@pipe-b-ctx-flip-skip-current-frame:
    - shard-apl:          NOTRUN -> [SKIP][95] ([fdo#109271]) +175 similar issues
   [95]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20513/shard-apl2/igt@nouveau_crc@pipe-b-ctx-flip-skip-current-frame.html

  * igt@sysfs_clients@split-50:
    - shard-apl:          NOTRUN -> [SKIP][96] ([fdo#109271] / [i915#2994])
   [96]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20513/shard-apl2/igt@sysfs_clients@split-50.html

  
#### Possible fixes ####

  * igt@gem_exec_fair@basic-none-solo@rcs0:
    - shard-kbl:          [FAIL][97] ([i915#2842]) -> [PASS][98]
   [97]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10301/shard-kbl2/igt@gem_exec_fair@basic-none-solo@rcs0.html
   [98]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20513/shard-kbl4/igt@gem_exec_fair@basic-none-solo@rcs0.html

  * igt@gem_exec_fair@basic-pace@vcs1:
    - shard-tglb:         [FAIL][99] ([i915#2842]) -> [PASS][100]
   [99]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10301/shard-tglb3/igt@gem_exec_fair@basic-pace@vcs1.html
   [100]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20513/shard-tglb2/igt@gem_exec_fair@basic-pace@vcs1.html

  * igt@gem_vm_create@destroy-race:
    - shard-tglb:         [TIMEOUT][101] ([i915#2795]) -> [PASS][102]
   [101]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10301/shard-tglb6/igt@gem_vm_create@destroy-race.html
   [102]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20513/shard-tglb7/igt@gem_vm_create@destroy-race.html

  * igt@i915_pm_dc@dc6-psr:
    - shard-iclb:         [FAIL][103] ([i915#454]) -> [PASS][104]
   [103]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10301/shard-iclb6/igt@i915_pm_dc@dc6-psr.html
   [104]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20513/shard-iclb6/igt@i915_pm_dc@dc6-psr.html

  * igt@i915_selftest@live@hangcheck:
    - shard-snb:          [INCOMPLETE][105] ([i915#2782]) -> [PASS][106]
   [105]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10301/shard-snb5/igt@i915_selftest@live@hangcheck.html
   [106]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20513/shard-snb7/igt@i915_selftest@live@hangcheck.html

  * igt@kms_draw_crc@draw-method-rgb565-blt-untiled:
    - shard-snb:          [SKIP][107] ([fdo#109271]) -> [PASS][108] +1 similar issue
   [107]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10301/shard-snb5/igt@kms_draw_crc@draw-method-rgb565-blt-untiled.html
   [108]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20513/shard-snb6/igt@kms_draw_crc@draw-method-rgb565-blt-untiled.html

  * igt@kms_draw_crc@draw-method-xrgb8888-mmap-cpu-ytiled:
    - shard-skl:          [FAIL][109] -> [PASS][110]
   [109]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10301/shard-skl5/igt@kms_draw_crc@draw-method-xrgb8888-mmap-cpu-ytiled.html
   [110]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20513/shard-skl1/igt@kms_draw_crc@draw-method-xrgb8888-mmap-cpu-ytiled.html

  * igt@kms_flip@2x-plain-flip-fb-recreate-interruptible@bc-hdmi-a1-hdmi-a2:
    - shard-glk:          [FAIL][111] ([i915#2122]) -> [PASS][112]
   [111]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10301/shard-glk7/igt@kms_flip@2x-plain-flip-fb-recreate-interruptible@bc-hdmi-a1-hdmi-a2.html
   [112]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20513/shard-glk8/igt@kms_flip@2x-plain-flip-fb-recreate-interruptible@bc-hdmi-a1-hdmi-a2.html

  * igt@kms_flip@plain-flip-fb-recreate-interruptible@c-edp1:
    - shard-skl:          [FAIL][113] ([i915#2122]) -> [PASS][114] +1 similar issue
   [113]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10301/shard-skl5/igt@kms_flip@plain-flip-fb-recreate-interruptible@c-edp1.html
   [114]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20513/shard-skl1/igt@kms_flip@plain-flip-fb-recreate-interruptible@c-edp1.html

  * igt@kms_flip_tiling@flip-changes-tiling@edp-1-pipe-a:
    - shard-skl:          [FAIL][115] ([i915#699]) -> [PASS][116]
   [115]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10301/shard-skl5/igt@kms_flip_tiling@flip-changes-tiling@edp-1-pipe-a.html
   [116]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20513/shard-skl1/igt@kms_flip_tiling@flip-changes-tiling@edp-1-pipe-a.html

  * igt@kms_plane_alpha_blend@pipe-c-coverage-7efc:
    - shard-skl:          [FAIL][117] ([fdo#108145] / [i915#265]) -> [PASS][118] +1 similar issue
   [117]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10301/shard-skl7/igt@kms_plane_alpha_blend@pipe-c-coverage-7efc.html
   [118]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20513/shard-skl5/igt@kms_plane_alpha_blend@pipe-c-coverage-7efc.html

  * igt@kms_psr@psr2_no_drrs:
    - shard-iclb:         [SKIP][119] ([fdo#109441]) -> [PASS][120] +1 similar issue
   [119]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10301/shard-iclb3/igt@kms_psr@psr2_no_drrs.html
   [120]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20513/shard-iclb2/igt@kms_psr@psr2_no_drrs.html

  * igt@perf@polling-small-buf:
    - shard-skl:          [FAIL][121] ([i915#1722]) -> [PASS][122]
   [121]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10301/shard-skl5/igt@perf@polling-small-buf.html
   [122]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20513/shard-skl1/igt@perf@polling-small-buf.html

  
#### Warnings ####

  * igt@i915_pm_rc6_residency@rc6-fence:
    - shard-iclb:         [WARN][123] ([i915#2684]) -> [WARN][124] ([i915#1804] / [i915#2684])
   [123]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10301/shard-iclb1/igt@i915_pm_rc6_residency@rc6-fence.html
   [124]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20513/shard-iclb4/igt@i915_pm_rc6_residency@rc6-fence.html

  * igt@i915_pm_rc6_residency@rc6-idle:
    - shard-iclb:         [WARN][125] ([i915#1804] / [i915#2684]) -> [WARN][126] ([i915#2684])
   [125]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10301/shard-iclb3/igt@i915_pm_rc6_residency@rc6-idle.html
   [126]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20513/shard-iclb2/igt@i915_pm_rc6_residency@rc6-idle.html

  * igt@kms_psr2_sf@primary-plane-update-sf-dmg-area-3:
    - shard-iclb:         [SKIP][127] ([i915#658]) -> [SKIP][128] ([i915#2920]) +1 similar issue
   [127]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10301/shard-iclb1/igt@kms_psr2_sf@primary-plane-update-sf-dmg-area-3.html
   [128]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20513/shard-iclb2/igt@kms_psr2_sf@primary-plane-update-sf-dmg-area-3.html

  * igt@kms_psr2_sf@primary-plane-update-sf-dmg-area-4:
    - shard-iclb:         [SKIP][129] ([i915#2920]) -> [SKIP][130] ([i915#658]) +1 similar issue
   [129]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10301/shard-iclb2/igt@kms_psr2_sf@primary-plane-update-sf-dmg-area-4.html
   [130]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20513/shard-iclb7/igt@kms_psr2_sf@primary-plane-update-sf-dmg-area-4.html

  * igt@runner@aborted:
    - shard-iclb:         ([FAIL][131], [FAIL][132], [FAIL][133]) ([i915#1814] / [i915#3002]) -> ([FAIL][134], [FAIL][135]) ([i915#3002])
   [131]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10301/shard-iclb5/igt@runner@aborted.html
   [132]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10301/shard-iclb6/igt@runner@aborted.html
   [133]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10301/shard-iclb6/igt@runner@aborted.html
   [134]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20513/shard-iclb4/igt@runner@aborted.html
   [135]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20513/shard-iclb6/igt@runner@aborted.html
    - shard-apl:          ([FAIL][136], [FAIL][137]) ([i915#3002] / [i915#3363]) -> ([FAIL][138], [FAIL][139], [FAIL][140]) ([i915#180] / [i915#3002] / [i915#3363])
   [136]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10301/shard-apl2/igt@runner@aborted.html
   [137]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10301/shard-apl8/igt@runner@aborted.html
   [138]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20513/shard-apl3/igt@runner@aborted.html
   [139]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20513/shard-apl6/igt@runner@aborted.html
   [140]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20513/shard-apl7/igt@runner@aborted.html

  
  [fdo#108145]: https://bugs.freedesktop.org/show_bug.cgi?id=108145
  [fdo#109271]: https://bugs.freedesktop.org/show_bug.cgi?id=109271
  [fdo#109280]: https://bugs.freedesktop.org/show_bug.cgi?id=109280
  [fdo#109441]: https://bugs.freedesktop.org/show_bug.cgi?id=109441
  [fdo#110725]: https://bugs.freedesktop.org/show_bug.cgi?id=110725
  [fdo#111614]: https://bugs.freedesktop.org/show_bug.cgi?id=111614
  [fdo#111827]: https://bugs.freedesktop.org/show_bug.cgi?id=111827
  [i915#1099]: https://gitlab.freedesktop.org/drm/intel/issues/1099
  [i915#1188]: https://gitlab.freedesktop.org/drm/intel/issues/1188
  [i915#1319]: https://gitlab.freedesktop.org/drm/intel/issues/1319
  [i915#1722]: https://gitlab.freedesktop.org/drm/intel/issues/1722
  [i915#180]: https://gitlab.freedesktop.org/drm/intel/issues/180
  [i915#1804]: https://gitlab.freedesktop.org/drm/intel/issues/1804
  [i915#1814]: https://gitlab.freedesktop.org/drm/intel/issues/1814
  [i915#1982]: https://gitlab.freedesktop.org/drm/intel/issues/1982
  [i915#2055]: https://gitlab.freedesktop.org/drm/intel/issues/2055
  [i915#2122]: https://gitlab.freedesktop.org/drm/intel/issues/2122
  [i915#2346]: https://gitlab.freedesktop.org/drm/intel/issues/2346
  [i915#2369]: https://gitlab.freedesktop.org/drm/intel/issues/2369
  [i915#2410]: https://gitlab.freedesktop.org/drm/intel/issues/2410
  [i915#265]: https://gitlab.freedesktop.org/drm/intel/issues/265
  [i915#2684]: https://gitlab.freedesktop.org/drm/intel/issues/2684
  [i915#2733]: https://gitlab.freedesktop.org/drm/intel/issues/2733
  [i915#2782]: https://gitlab.freedesktop.org/drm/intel/issues/2782
  [i915#2795]: https://gitlab.freedesktop.org/drm/intel/issues/2795
  [i915#2842]: https://gitlab.freedesktop.org/drm/intel/issues/2842
  [i915#2846]: https://gitlab.freedesktop.org/drm/intel/issues/2846
  [i915#2920]: https://gitlab.freedesktop.org/drm/intel/issues/2920
  [i915#2994]: https://gitlab.freedesktop.org/drm/intel/issues/2994
  [i915#3002]: https://gitlab.freedesktop.org/drm/intel/issues/3002
  [i915#3063]: https://gitlab.freedesktop.org/drm/intel/issues/3063
  [i915#307]: https://gitlab.freedesktop.org/drm/intel/issues/307
  [i915#3296]: https://gitlab.freedesktop.org/drm/intel/issues/3296
  [i915#3363]: https://gitlab.freedesktop.org/drm/intel/issues/3363
  [i915#3648]: https://gitlab.freedesktop.org/drm/intel/issues/3648
  [i915#454]: https://gitlab.freedesktop.org/drm/intel/issues/454
  [i915#658]: https://gitlab.freedesktop.org/drm/intel/issues/658
  [i915#699]: https://gitlab.freedesktop.org/drm/intel/issues/699


Participating hosts (10 -> 10)
------------------------------

  No changes in participating hosts


Build changes
-------------

  * Linux: CI_DRM_10301 -> Patchwork_20513

  CI-20190529: 20190529
  CI_DRM_10301: 3d3ff5917ce204b783f4847c14e8839fde358a3a @ git://anongit.freedesktop.org/gfx-ci/linux
  IGT_6128: b24e5949af7e51f0af484d2ce4cb4c5a41ac5358 @ https://gitlab.freedesktop.org/drm/igt-gpu-tools.git
  Patchwork_20513: b25f638c19a501484ef4ed53f1971453acd87195 @ git://anongit.freedesktop.org/gfx-ci/linux
  piglit_4509: fdc5a4ca11124ab8413c7988896eec4c97336694 @ git://anongit.freedesktop.org/piglit

== Logs ==

For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20513/index.html

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH 4/7] drm/i915/guc: Add non blocking CTB send function
  2021-07-01 17:15   ` [Intel-gfx] " Matthew Brost
@ 2021-07-06 18:12     ` John Harrison
  -1 siblings, 0 replies; 38+ messages in thread
From: John Harrison @ 2021-07-06 18:12 UTC (permalink / raw)
  To: Matthew Brost, intel-gfx, dri-devel; +Cc: Michal.Wajdeczko

[-- Attachment #1: Type: text/plain, Size: 10127 bytes --]

On 7/1/2021 10:15, Matthew Brost wrote:
> Add non blocking CTB send function, intel_guc_send_nb. GuC submission
> will send CTBs in the critical path and does not need to wait for these
> CTBs to complete before moving on, hence the need for this new function.
>
> The non-blocking CTB now must have a flow control mechanism to ensure
> the buffer isn't overrun. A lazy spin wait is used as we believe the
> flow control condition should be rare with a properly sized buffer.
>
> The function, intel_guc_send_nb, is exported in this patch but unused.
> Several patches later in the series make use of this function.
>
> v2:
>   (Michal)
>    - Use define for H2G room calculations
>    - Move INTEL_GUC_SEND_NB define
>   (Daniel Vetter)
>    - Use msleep_interruptible rather than cond_resched
> v3:
>   (Michal)
>    - Move includes to following patch
>    - s/INTEL_GUC_SEND_NB/INTEL_GUC_CT_SEND_NB/g
>
> Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
> Signed-off-by: Matthew Brost <matthew.brost@intel.com>
> ---
>   .../gt/uc/abi/guc_communication_ctb_abi.h     |  3 +-
>   drivers/gpu/drm/i915/gt/uc/intel_guc.h        | 11 ++-
>   drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c     | 87 +++++++++++++++++--
>   drivers/gpu/drm/i915/gt/uc/intel_guc_ct.h     |  4 +-
>   4 files changed, 91 insertions(+), 14 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/gt/uc/abi/guc_communication_ctb_abi.h b/drivers/gpu/drm/i915/gt/uc/abi/guc_communication_ctb_abi.h
> index e933ca02d0eb..99e1fad5ca20 100644
> --- a/drivers/gpu/drm/i915/gt/uc/abi/guc_communication_ctb_abi.h
> +++ b/drivers/gpu/drm/i915/gt/uc/abi/guc_communication_ctb_abi.h
> @@ -79,7 +79,8 @@ static_assert(sizeof(struct guc_ct_buffer_desc) == 64);
>    *  +---+-------+--------------------------------------------------------------+
>    */
>   
> -#define GUC_CTB_MSG_MIN_LEN			1u
> +#define GUC_CTB_HDR_LEN				1u
> +#define GUC_CTB_MSG_MIN_LEN			GUC_CTB_HDR_LEN
>   #define GUC_CTB_MSG_MAX_LEN			256u
>   #define GUC_CTB_MSG_0_FENCE			(0xffff << 16)
>   #define GUC_CTB_MSG_0_FORMAT			(0xf << 12)
> diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc.h b/drivers/gpu/drm/i915/gt/uc/intel_guc.h
> index 4abc59f6f3cd..72e4653222e2 100644
> --- a/drivers/gpu/drm/i915/gt/uc/intel_guc.h
> +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc.h
> @@ -74,7 +74,14 @@ static inline struct intel_guc *log_to_guc(struct intel_guc_log *log)
>   static
>   inline int intel_guc_send(struct intel_guc *guc, const u32 *action, u32 len)
>   {
> -	return intel_guc_ct_send(&guc->ct, action, len, NULL, 0);
> +	return intel_guc_ct_send(&guc->ct, action, len, NULL, 0, 0);
> +}
> +
> +static
> +inline int intel_guc_send_nb(struct intel_guc *guc, const u32 *action, u32 len)
> +{
> +	return intel_guc_ct_send(&guc->ct, action, len, NULL, 0,
> +				 INTEL_GUC_CT_SEND_NB);
>   }
>   
>   static inline int
> @@ -82,7 +89,7 @@ intel_guc_send_and_receive(struct intel_guc *guc, const u32 *action, u32 len,
>   			   u32 *response_buf, u32 response_buf_size)
>   {
>   	return intel_guc_ct_send(&guc->ct, action, len,
> -				 response_buf, response_buf_size);
> +				 response_buf, response_buf_size, 0);
>   }
>   
>   static inline void intel_guc_to_host_event_handler(struct intel_guc *guc)
> diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c b/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c
> index 43e03aa2dde8..fb825cc1d090 100644
> --- a/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c
> +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c
> @@ -3,6 +3,8 @@
>    * Copyright © 2016-2019 Intel Corporation
>    */
>   
> +#include <linux/circ_buf.h>
> +
>   #include "i915_drv.h"
>   #include "intel_guc_ct.h"
>   #include "gt/intel_gt.h"
> @@ -373,7 +375,7 @@ static void write_barrier(struct intel_guc_ct *ct)
>   static int ct_write(struct intel_guc_ct *ct,
>   		    const u32 *action,
>   		    u32 len /* in dwords */,
> -		    u32 fence)
> +		    u32 fence, u32 flags)
>   {
>   	struct intel_guc_ct_buffer *ctb = &ct->ctbs.send;
>   	struct guc_ct_buffer_desc *desc = ctb->desc;
> @@ -409,7 +411,7 @@ static int ct_write(struct intel_guc_ct *ct,
>   		used = tail - head;
>   
>   	/* make sure there is a space including extra dw for the fence */
> -	if (unlikely(used + len + 1 >= size))
> +	if (unlikely(used + len + GUC_CTB_HDR_LEN >= size))
I thought the plan was to update the comment? Given that the '+1' is now 
'HDR_LEN' it would be good to update the comment to say 'header' instead 
of 'fence'.
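
For reference, the check with the reworded comment might read roughly as in the sketch below. This is a standalone approximation rather than the driver code: ctb_has_room_sketch and CTB_HDR_LEN are hypothetical names, and the wrap-around handling of 'used' is assumed from the rest of ct_write(), since this hunk only shows the non-wrapped branch.

```c
#include <stdbool.h>
#include <stdint.h>

#define CTB_HDR_LEN 1u /* hypothetical stand-in for GUC_CTB_HDR_LEN */

/* All quantities are in dwords; head and tail are indices into the ring. */
static bool ctb_has_room_sketch(uint32_t head, uint32_t tail,
				uint32_t size, uint32_t len)
{
	uint32_t used;

	if (tail < head)
		used = (size - head) + tail; /* tail has wrapped (assumed) */
	else
		used = tail - head;

	/* make sure there is space, including the extra dw for the header */
	return used + len + CTB_HDR_LEN < size;
}
```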

>   		return -ENOSPC;
>   
>   	/*
> @@ -421,9 +423,13 @@ static int ct_write(struct intel_guc_ct *ct,
>   		 FIELD_PREP(GUC_CTB_MSG_0_NUM_DWORDS, len) |
>   		 FIELD_PREP(GUC_CTB_MSG_0_FENCE, fence);
>   
> -	hxg = FIELD_PREP(GUC_HXG_MSG_0_TYPE, GUC_HXG_TYPE_REQUEST) |
> -	      FIELD_PREP(GUC_HXG_REQUEST_MSG_0_ACTION |
> -			 GUC_HXG_REQUEST_MSG_0_DATA0, action[0]);
> +	hxg = (flags & INTEL_GUC_CT_SEND_NB) ?
> +		(FIELD_PREP(GUC_HXG_MSG_0_TYPE, GUC_HXG_TYPE_EVENT) |
> +		 FIELD_PREP(GUC_HXG_EVENT_MSG_0_ACTION |
> +			    GUC_HXG_EVENT_MSG_0_DATA0, action[0])) :
> +		(FIELD_PREP(GUC_HXG_MSG_0_TYPE, GUC_HXG_TYPE_REQUEST) |
> +		 FIELD_PREP(GUC_HXG_REQUEST_MSG_0_ACTION |
> +			    GUC_HXG_REQUEST_MSG_0_DATA0, action[0]));
It would be much easier to read if this used a proper 'if' rather than
an inline '?'. Or maybe have something like:

    type = SEND_NB ? TYPE_EVENT : TYPE_REQUEST;
    hxg = PREP(type) | ACTION ...;
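
Spelled out a little more, the second form above could look roughly like the following sketch. It is standalone user-space C, not driver code: build_hxg, the HXG_* masks and SEND_NB_FLAG are illustrative stand-ins whose values are assumptions, not copied from the ABI headers. It also assumes the EVENT and REQUEST layouts place ACTION/DATA0 in the same bits, which would need checking against the real definitions.

```c
#include <stdint.h>

/* Illustrative stand-ins; the values are assumptions, not the real ABI. */
#define HXG_MSG_0_TYPE_SHIFT	28
#define HXG_MSG_0_TYPE_MASK	(0x7u << HXG_MSG_0_TYPE_SHIFT)
#define HXG_MSG_0_PAYLOAD_MASK	0x0fffffffu /* ACTION | DATA0 */
#define HXG_TYPE_REQUEST	0u
#define HXG_TYPE_EVENT		1u
#define SEND_NB_FLAG		(1u << 31)

/* Pick the message type once, then build the dword the same way for both. */
static uint32_t build_hxg(uint32_t action0, uint32_t flags)
{
	uint32_t type = (flags & SEND_NB_FLAG) ? HXG_TYPE_EVENT
					       : HXG_TYPE_REQUEST;

	return ((type << HXG_MSG_0_TYPE_SHIFT) & HXG_MSG_0_TYPE_MASK) |
	       (action0 & HXG_MSG_0_PAYLOAD_MASK);
}
```

The point is only that the flag test collapses to a single line and the field packing is written once.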


Neither issue above is a blocker but I think the comment at least should 
be fixed up when merging. With that:

Reviewed-by: John Harrison <John.C.Harrison@Intel.com>


>   
>   	CT_DEBUG(ct, "writing (tail %u) %*ph %*ph %*ph\n",
>   		 tail, 4, &header, 4, &hxg, 4 * (len - 1), &action[1]);
> @@ -500,6 +506,48 @@ static int wait_for_ct_request_update(struct ct_request *req, u32 *status)
>   	return err;
>   }
>   
> +static inline bool h2g_has_room(struct intel_guc_ct_buffer *ctb, u32 len_dw)
> +{
> +	struct guc_ct_buffer_desc *desc = ctb->desc;
> +	u32 head = READ_ONCE(desc->head);
> +	u32 space;
> +
> +	space = CIRC_SPACE(desc->tail, head, ctb->size);
> +
> +	return space >= len_dw;
> +}
> +
> +static int ct_send_nb(struct intel_guc_ct *ct,
> +		      const u32 *action,
> +		      u32 len,
> +		      u32 flags)
> +{
> +	struct intel_guc_ct_buffer *ctb = &ct->ctbs.send;
> +	unsigned long spin_flags;
> +	u32 fence;
> +	int ret;
> +
> +	spin_lock_irqsave(&ctb->lock, spin_flags);
> +
> +	ret = h2g_has_room(ctb, len + GUC_CTB_HDR_LEN);
> +	if (unlikely(!ret)) {
> +		ret = -EBUSY;
> +		goto out;
> +	}
> +
> +	fence = ct_get_next_fence(ct);
> +	ret = ct_write(ct, action, len, fence, flags);
> +	if (unlikely(ret))
> +		goto out;
> +
> +	intel_guc_notify(ct_to_guc(ct));
> +
> +out:
> +	spin_unlock_irqrestore(&ctb->lock, spin_flags);
> +
> +	return ret;
> +}
> +
>   static int ct_send(struct intel_guc_ct *ct,
>   		   const u32 *action,
>   		   u32 len,
> @@ -507,8 +555,10 @@ static int ct_send(struct intel_guc_ct *ct,
>   		   u32 response_buf_size,
>   		   u32 *status)
>   {
> +	struct intel_guc_ct_buffer *ctb = &ct->ctbs.send;
>   	struct ct_request request;
>   	unsigned long flags;
> +	unsigned int sleep_period_ms = 1;
>   	u32 fence;
>   	int err;
>   
> @@ -516,8 +566,24 @@ static int ct_send(struct intel_guc_ct *ct,
>   	GEM_BUG_ON(!len);
>   	GEM_BUG_ON(len & ~GUC_CT_MSG_LEN_MASK);
>   	GEM_BUG_ON(!response_buf && response_buf_size);
> +	might_sleep();
> +
> +	/*
> +	 * We use a lazy spin wait loop here as we believe that if the CT
> +	 * buffers are sized correctly the flow control condition should be
> +	 * rare.
> +	 */
> +retry:
> +	spin_lock_irqsave(&ctb->lock, flags);
> +	if (unlikely(!h2g_has_room(ctb, len + GUC_CTB_HDR_LEN))) {
> +		spin_unlock_irqrestore(&ctb->lock, flags);
>   
> -	spin_lock_irqsave(&ct->ctbs.send.lock, flags);
> +		if (msleep_interruptible(sleep_period_ms))
> +			return -EINTR;
> +		sleep_period_ms = sleep_period_ms << 1;
> +
> +		goto retry;
> +	}
>   
>   	fence = ct_get_next_fence(ct);
>   	request.fence = fence;
> @@ -529,9 +595,9 @@ static int ct_send(struct intel_guc_ct *ct,
>   	list_add_tail(&request.link, &ct->requests.pending);
>   	spin_unlock(&ct->requests.lock);
>   
> -	err = ct_write(ct, action, len, fence);
> +	err = ct_write(ct, action, len, fence, 0);
>   
> -	spin_unlock_irqrestore(&ct->ctbs.send.lock, flags);
> +	spin_unlock_irqrestore(&ctb->lock, flags);
>   
>   	if (unlikely(err))
>   		goto unlink;
> @@ -571,7 +637,7 @@ static int ct_send(struct intel_guc_ct *ct,
>    * Command Transport (CT) buffer based GuC send function.
>    */
>   int intel_guc_ct_send(struct intel_guc_ct *ct, const u32 *action, u32 len,
> -		      u32 *response_buf, u32 response_buf_size)
> +		      u32 *response_buf, u32 response_buf_size, u32 flags)
>   {
>   	u32 status = ~0; /* undefined */
>   	int ret;
> @@ -581,6 +647,9 @@ int intel_guc_ct_send(struct intel_guc_ct *ct, const u32 *action, u32 len,
>   		return -ENODEV;
>   	}
>   
> +	if (flags & INTEL_GUC_CT_SEND_NB)
> +		return ct_send_nb(ct, action, len, flags);
> +
>   	ret = ct_send(ct, action, len, response_buf, response_buf_size, &status);
>   	if (unlikely(ret < 0)) {
>   		CT_ERROR(ct, "Sending action %#x failed (err=%d status=%#X)\n",
> diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.h b/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.h
> index 1ae2dde6db93..5bb8bef024c8 100644
> --- a/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.h
> +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.h
> @@ -42,7 +42,6 @@ struct intel_guc_ct_buffer {
>   	bool broken;
>   };
>   
> -
>   /** Top-level structure for Command Transport related data
>    *
>    * Includes a pair of CT buffers for bi-directional communication and tracking
> @@ -87,8 +86,9 @@ static inline bool intel_guc_ct_enabled(struct intel_guc_ct *ct)
>   	return ct->enabled;
>   }
>   
> +#define INTEL_GUC_CT_SEND_NB		BIT(31)
>   int intel_guc_ct_send(struct intel_guc_ct *ct, const u32 *action, u32 len,
> -		      u32 *response_buf, u32 response_buf_size);
> +		      u32 *response_buf, u32 response_buf_size, u32 flags);
>   void intel_guc_ct_event_handler(struct intel_guc_ct *ct);
>   
>   #endif /* _INTEL_GUC_CT_H_ */
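
As an illustration of the flow control in this patch, here is a user-space sketch of the room check and the doubling back-off used on the blocking path. The CIRC_* macros are written in the same form as include/linux/circ_buf.h, which assumes a power-of-two size. Everything else (struct ring_sketch, has_room, wait_for_room, fake_consumer_wait and the wait_fn callback standing in for msleep_interruptible()) is hypothetical and only there to make the example self-contained.

```c
#include <stdbool.h>
#include <stdint.h>

/* Same shape as include/linux/circ_buf.h; size must be a power of two. */
#define CIRC_CNT(head, tail, size)	(((head) - (tail)) & ((size) - 1))
#define CIRC_SPACE(head, tail, size)	CIRC_CNT((tail), ((head) + 1), (size))

struct ring_sketch {
	uint32_t head; /* consumer index (the GuC side in the patch) */
	uint32_t tail; /* producer index (the host side in the patch) */
	uint32_t size; /* in dwords, power of two */
};

static bool has_room(const struct ring_sketch *r, uint32_t len_dw)
{
	/* The producer index is the "head" argument in circ_buf terms. */
	return CIRC_SPACE(r->tail, r->head, r->size) >= len_dw;
}

/*
 * Wait until len_dw dwords fit, doubling the sleep period on each retry.
 * wait_fn returns nonzero if the sleep was interrupted.
 * Returns 0 once there is room, -1 if interrupted.
 */
static int wait_for_room(struct ring_sketch *r, uint32_t len_dw,
			 int (*wait_fn)(struct ring_sketch *r, unsigned int ms))
{
	unsigned int sleep_period_ms = 1;

	while (!has_room(r, len_dw)) {
		if (wait_fn(r, sleep_period_ms))
			return -1;
		sleep_period_ms <<= 1;
	}
	return 0;
}

/* Example wait_fn: pretend the consumer retires 4 dwords per sleep. */
static unsigned int total_slept_ms;

static int fake_consumer_wait(struct ring_sketch *r, unsigned int ms)
{
	total_slept_ms += ms;
	r->head = (r->head + 4) & (r->size - 1);
	return 0;
}
```

Note that CIRC_SPACE always leaves one slot unused, so a 16-dword ring reports at most 15 dwords of space.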


^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH 4/7] drm/i915/guc: Add non blocking CTB send function
  2021-07-06 18:12     ` [Intel-gfx] " John Harrison
@ 2021-07-06 18:17       ` Matthew Brost
  -1 siblings, 0 replies; 38+ messages in thread
From: Matthew Brost @ 2021-07-06 18:17 UTC (permalink / raw)
  To: John Harrison; +Cc: intel-gfx, dri-devel, Michal.Wajdeczko

On Tue, Jul 06, 2021 at 11:12:21AM -0700, John Harrison wrote:
>    On 7/1/2021 10:15, Matthew Brost wrote:
> 
> Add non blocking CTB send function, intel_guc_send_nb. GuC submission
> will send CTBs in the critical path and does not need to wait for these
> CTBs to complete before moving on, hence the need for this new function.
> 
> The non-blocking CTB now must have a flow control mechanism to ensure
> the buffer isn't overrun. A lazy spin wait is used as we believe the
> flow control condition should be rare with a properly sized buffer.
> 
> The function, intel_guc_send_nb, is exported in this patch but unused.
> Several patches later in the series make use of this function.
> 
> v2:
>  (Michal)
>   - Use define for H2G room calculations
>   - Move INTEL_GUC_SEND_NB define
>  (Daniel Vetter)
>   - Use msleep_interruptible rather than cond_resched
> v3:
>  (Michal)
>   - Move includes to following patch
>   - s/INTEL_GUC_SEND_NB/INTEL_GUC_CT_SEND_NB/g
> 
> Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
> Signed-off-by: Matthew Brost <matthew.brost@intel.com>
> ---
>  .../gt/uc/abi/guc_communication_ctb_abi.h     |  3 +-
>  drivers/gpu/drm/i915/gt/uc/intel_guc.h        | 11 ++-
>  drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c     | 87 +++++++++++++++++--
>  drivers/gpu/drm/i915/gt/uc/intel_guc_ct.h     |  4 +-
>  4 files changed, 91 insertions(+), 14 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/gt/uc/abi/guc_communication_ctb_abi.h b/drivers/gpu/drm/i915/gt/uc/abi/guc_communication_ctb_abi.h
> index e933ca02d0eb..99e1fad5ca20 100644
> --- a/drivers/gpu/drm/i915/gt/uc/abi/guc_communication_ctb_abi.h
> +++ b/drivers/gpu/drm/i915/gt/uc/abi/guc_communication_ctb_abi.h
> @@ -79,7 +79,8 @@ static_assert(sizeof(struct guc_ct_buffer_desc) == 64);
>   *  +---+-------+--------------------------------------------------------------+
>   */
> 
> -#define GUC_CTB_MSG_MIN_LEN                    1u
> +#define GUC_CTB_HDR_LEN                                1u
> +#define GUC_CTB_MSG_MIN_LEN                    GUC_CTB_HDR_LEN
>  #define GUC_CTB_MSG_MAX_LEN                    256u
>  #define GUC_CTB_MSG_0_FENCE                    (0xffff << 16)
>  #define GUC_CTB_MSG_0_FORMAT                   (0xf << 12)
> diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc.h b/drivers/gpu/drm/i915/gt/uc/intel_guc.h
> index 4abc59f6f3cd..72e4653222e2 100644
> --- a/drivers/gpu/drm/i915/gt/uc/intel_guc.h
> +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc.h
> @@ -74,7 +74,14 @@ static inline struct intel_guc *log_to_guc(struct intel_guc_log *log)
>  static
>  inline int intel_guc_send(struct intel_guc *guc, const u32 *action, u32 len)
>  {
> -       return intel_guc_ct_send(&guc->ct, action, len, NULL, 0);
> +       return intel_guc_ct_send(&guc->ct, action, len, NULL, 0, 0);
> +}
> +
> +static
> +inline int intel_guc_send_nb(struct intel_guc *guc, const u32 *action, u32 len)
> +{
> +       return intel_guc_ct_send(&guc->ct, action, len, NULL, 0,
> +                                INTEL_GUC_CT_SEND_NB);
>  }
> 
>  static inline int
> @@ -82,7 +89,7 @@ intel_guc_send_and_receive(struct intel_guc *guc, const u32 *action, u32 len,
>                            u32 *response_buf, u32 response_buf_size)
>  {
>         return intel_guc_ct_send(&guc->ct, action, len,
> -                                response_buf, response_buf_size);
> +                                response_buf, response_buf_size, 0);
>  }
> 
>  static inline void intel_guc_to_host_event_handler(struct intel_guc *guc)
> diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c b/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c
> index 43e03aa2dde8..fb825cc1d090 100644
> --- a/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c
> +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c
> @@ -3,6 +3,8 @@
>   * Copyright © 2016-2019 Intel Corporation
>   */
> 
> +#include <linux/circ_buf.h>
> +
>  #include "i915_drv.h"
>  #include "intel_guc_ct.h"
>  #include "gt/intel_gt.h"
> @@ -373,7 +375,7 @@ static void write_barrier(struct intel_guc_ct *ct)
>  static int ct_write(struct intel_guc_ct *ct,
>                     const u32 *action,
>                     u32 len /* in dwords */,
> -                   u32 fence)
> +                   u32 fence, u32 flags)
>  {
>         struct intel_guc_ct_buffer *ctb = &ct->ctbs.send;
>         struct guc_ct_buffer_desc *desc = ctb->desc;
> @@ -409,7 +411,7 @@ static int ct_write(struct intel_guc_ct *ct,
>                 used = tail - head;
> 
>         /* make sure there is a space including extra dw for the fence */
> -       if (unlikely(used + len + 1 >= size))
> +       if (unlikely(used + len + GUC_CTB_HDR_LEN >= size))
> 
>    I thought the plan was to update the comment? Given that the '+1' is
>    now 'HDR_LEN' it would be good to update the comment to say 'header'
>    instead of 'fence'.
> 

Yep, will fix.

>                 return -ENOSPC;
> 
>         /*
> @@ -421,9 +423,13 @@ static int ct_write(struct intel_guc_ct *ct,
>                  FIELD_PREP(GUC_CTB_MSG_0_NUM_DWORDS, len) |
>                  FIELD_PREP(GUC_CTB_MSG_0_FENCE, fence);
> 
> -       hxg = FIELD_PREP(GUC_HXG_MSG_0_TYPE, GUC_HXG_TYPE_REQUEST) |
> -             FIELD_PREP(GUC_HXG_REQUEST_MSG_0_ACTION |
> -                        GUC_HXG_REQUEST_MSG_0_DATA0, action[0]);
> +       hxg = (flags & INTEL_GUC_CT_SEND_NB) ?
> +               (FIELD_PREP(GUC_HXG_MSG_0_TYPE, GUC_HXG_TYPE_EVENT) |
> +                FIELD_PREP(GUC_HXG_EVENT_MSG_0_ACTION |
> +                           GUC_HXG_EVENT_MSG_0_DATA0, action[0])) :
> +               (FIELD_PREP(GUC_HXG_MSG_0_TYPE, GUC_HXG_TYPE_REQUEST) |
> +                FIELD_PREP(GUC_HXG_REQUEST_MSG_0_ACTION |
> +                           GUC_HXG_REQUEST_MSG_0_DATA0, action[0]));
> 
>    It would be much easier to read if this used a proper 'if' rather than
>    an inline '?'. Or maybe have something like:
> 
>      type = SEND_NB ? TYPE_EVENT : TYPE_REQUEST;
>      hxg = PREP(type) | ACTION ...;
> 
>    Neither issue above is a blocker but I think the comment at least
>    should be fixed up when merging. With that:

Probably a bit cleaner. Will fix.
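
A standalone userspace sketch of that shape, not the actual respin. The
SKETCH_* names and bit values are stand-ins made up for this example, and
it assumes the EVENT and REQUEST variants share one ACTION/DATA0 layout,
which is what allows the type to be picked first:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins for the GUC_HXG_* masks; the values are assumed. */
#define SKETCH_HXG_TYPE_SHIFT	28
#define SKETCH_HXG_TYPE_MASK	(0x7u << SKETCH_HXG_TYPE_SHIFT)
#define SKETCH_HXG_PAYLOAD_MASK	0x0fffffffu	/* ACTION | DATA0 */
#define SKETCH_HXG_TYPE_REQUEST	0u
#define SKETCH_HXG_TYPE_EVENT	1u
#define SKETCH_CT_SEND_NB	(1u << 31)

/* Pick the message type once, then build the header in a single expression. */
static uint32_t sketch_hxg_header(uint32_t flags, uint32_t action0)
{
	uint32_t type = (flags & SKETCH_CT_SEND_NB) ?
			SKETCH_HXG_TYPE_EVENT : SKETCH_HXG_TYPE_REQUEST;

	/* Sketch-only check: the payload must fit below the type field. */
	assert(action0 <= SKETCH_HXG_PAYLOAD_MASK);

	return ((type << SKETCH_HXG_TYPE_SHIFT) & SKETCH_HXG_TYPE_MASK) |
	       (action0 & SKETCH_HXG_PAYLOAD_MASK);
}
```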

Matt

>    Reviewed-by: John Harrison [3]<John.C.Harrison@Intel.com>
> 
> 
>         CT_DEBUG(ct, "writing (tail %u) %*ph %*ph %*ph\n",
>                  tail, 4, &header, 4, &hxg, 4 * (len - 1), &action[1]);
> @@ -500,6 +506,48 @@ static int wait_for_ct_request_update(struct ct_request *req, u32 *status)
>         return err;
>  }
> 
> +static inline bool h2g_has_room(struct intel_guc_ct_buffer *ctb, u32 len_dw)
> +{
> +       struct guc_ct_buffer_desc *desc = ctb->desc;
> +       u32 head = READ_ONCE(desc->head);
> +       u32 space;
> +
> +       space = CIRC_SPACE(desc->tail, head, ctb->size);
> +
> +       return space >= len_dw;
> +}
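
For readers not familiar with CIRC_SPACE(), the room check above can be
modelled in userspace roughly as below. This is only a sketch: the
sketch_* names are invented here, the arithmetic mirrors what
linux/circ_buf.h does for a power-of-two ring, and the producer index
corresponds to desc->tail while the consumer index corresponds to
desc->head.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Free slots in a ring of 'size' entries (size must be a power of two).
 * One slot is always left unused so that "full" and "empty" differ.
 */
static uint32_t sketch_circ_space(uint32_t prod, uint32_t cons, uint32_t size)
{
	assert(size && (size & (size - 1)) == 0);

	return (cons - (prod + 1)) & (size - 1);
}

/* True if a message of len_dw dwords (header included) fits right now. */
static bool sketch_h2g_has_room(uint32_t prod, uint32_t cons,
				uint32_t size, uint32_t len_dw)
{
	return sketch_circ_space(prod, cons, size) >= len_dw;
}
```

With an empty 8-entry ring the check admits at most 7 dwords, since one
slot stays reserved.
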
> +
> +static int ct_send_nb(struct intel_guc_ct *ct,
> +                     const u32 *action,
> +                     u32 len,
> +                     u32 flags)
> +{
> +       struct intel_guc_ct_buffer *ctb = &ct->ctbs.send;
> +       unsigned long spin_flags;
> +       u32 fence;
> +       int ret;
> +
> +       spin_lock_irqsave(&ctb->lock, spin_flags);
> +
> +       ret = h2g_has_room(ctb, len + GUC_CTB_HDR_LEN);
> +       if (unlikely(!ret)) {
> +               ret = -EBUSY;
> +               goto out;
> +       }
> +
> +       fence = ct_get_next_fence(ct);
> +       ret = ct_write(ct, action, len, fence, flags);
> +       if (unlikely(ret))
> +               goto out;
> +
> +       intel_guc_notify(ct_to_guc(ct));
> +
> +out:
> +       spin_unlock_irqrestore(&ctb->lock, spin_flags);
> +
> +       return ret;
> +}
> +
>  static int ct_send(struct intel_guc_ct *ct,
>                    const u32 *action,
>                    u32 len,
> @@ -507,8 +555,10 @@ static int ct_send(struct intel_guc_ct *ct,
>                    u32 response_buf_size,
>                    u32 *status)
>  {
> +       struct intel_guc_ct_buffer *ctb = &ct->ctbs.send;
>         struct ct_request request;
>         unsigned long flags;
> +       unsigned int sleep_period_ms = 1;
>         u32 fence;
>         int err;
> 
> @@ -516,8 +566,24 @@ static int ct_send(struct intel_guc_ct *ct,
>         GEM_BUG_ON(!len);
>         GEM_BUG_ON(len & ~GUC_CT_MSG_LEN_MASK);
>         GEM_BUG_ON(!response_buf && response_buf_size);
> +       might_sleep();
> +
> +       /*
> +        * We use a lazy spin wait loop here as we believe that if the CT
> +        * buffers are sized correctly the flow control condition should be
> +        * rare.
> +        */
> +retry:
> +       spin_lock_irqsave(&ctb->lock, flags);
> +       if (unlikely(!h2g_has_room(ctb, len + GUC_CTB_HDR_LEN))) {
> +               spin_unlock_irqrestore(&ctb->lock, flags);
> 
> -       spin_lock_irqsave(&ct->ctbs.send.lock, flags);
> +               if (msleep_interruptible(sleep_period_ms))
> +                       return -EINTR;
> +               sleep_period_ms = sleep_period_ms << 1;
> +
> +               goto retry;
> +       }
> 
>         fence = ct_get_next_fence(ct);
>         request.fence = fence;
> @@ -529,9 +595,9 @@ static int ct_send(struct intel_guc_ct *ct,
>         list_add_tail(&request.link, &ct->requests.pending);
>         spin_unlock(&ct->requests.lock);
> 
> -       err = ct_write(ct, action, len, fence);
> +       err = ct_write(ct, action, len, fence, 0);
> 
> -       spin_unlock_irqrestore(&ct->ctbs.send.lock, flags);
> +       spin_unlock_irqrestore(&ctb->lock, flags);
> 
>         if (unlikely(err))
>                 goto unlink;
> @@ -571,7 +637,7 @@ static int ct_send(struct intel_guc_ct *ct,
>   * Command Transport (CT) buffer based GuC send function.
>   */
>  int intel_guc_ct_send(struct intel_guc_ct *ct, const u32 *action, u32 len,
> -                     u32 *response_buf, u32 response_buf_size)
> +                     u32 *response_buf, u32 response_buf_size, u32 flags)
>  {
>         u32 status = ~0; /* undefined */
>         int ret;
> @@ -581,6 +647,9 @@ int intel_guc_ct_send(struct intel_guc_ct *ct, const u32 *action, u32 len,
>                 return -ENODEV;
>         }
> 
> +       if (flags & INTEL_GUC_CT_SEND_NB)
> +               return ct_send_nb(ct, action, len, flags);
> +
>         ret = ct_send(ct, action, len, response_buf, response_buf_size, &status);
>         if (unlikely(ret < 0)) {
>                 CT_ERROR(ct, "Sending action %#x failed (err=%d status=%#X)\n",
> diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.h b/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.h
> index 1ae2dde6db93..5bb8bef024c8 100644
> --- a/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.h
> +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.h
> @@ -42,7 +42,6 @@ struct intel_guc_ct_buffer {
>         bool broken;
>  };
> 
> -
>  /** Top-level structure for Command Transport related data
>   *
>   * Includes a pair of CT buffers for bi-directional communication and tracking
> @@ -87,8 +86,9 @@ static inline bool intel_guc_ct_enabled(struct intel_guc_ct *ct)
>         return ct->enabled;
>  }
> 
> +#define INTEL_GUC_CT_SEND_NB           BIT(31)
>  int intel_guc_ct_send(struct intel_guc_ct *ct, const u32 *action, u32 len,
> -                     u32 *response_buf, u32 response_buf_size);
> +                     u32 *response_buf, u32 response_buf_size, u32 flags);
>  void intel_guc_ct_event_handler(struct intel_guc_ct *ct);
> 
>  #endif /* _INTEL_GUC_CT_H_ */
> 
> References
> 
>    1. mailto:John.C.Harrison@Intel.com
>    2. mailto:matthew.brost@intel.com
>    3. mailto:John.C.Harrison@Intel.com

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH 5/7] drm/i915/guc: Add stall timer to non blocking CTB send function
  2021-07-01 17:15   ` [Intel-gfx] " Matthew Brost
@ 2021-07-06 18:27     ` John Harrison
  -1 siblings, 0 replies; 38+ messages in thread
From: John Harrison @ 2021-07-06 18:27 UTC (permalink / raw)
  To: Matthew Brost, intel-gfx, dri-devel; +Cc: Michal.Wajdeczko

On 7/1/2021 10:15, Matthew Brost wrote:
> Implement a stall timer which fails H2G CTBs once a period of time
> with no forward progress is reached to prevent deadlock.
>
> v2:
>   (Michal)
>    - Improve error message in ct_deadlock()
>    - Set broken when ct_deadlock() returns true
>    - Return -EPIPE on ct_deadlock()
> v3:
>   (Michal)
>    - Add ms to stall timer comment
>   (Matthew)
>    - Move broken check to intel_guc_ct_send()
>
> Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
> Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
> Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Looks plausible to me.

Reviewed-by: John Harrison <John.C.Harrison@Intel.com>
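
For reference, the stall-timer behaviour described in the commit message
can be modelled in userspace roughly as follows. It is a sketch under
assumptions: the sketch_* names are invented, time is passed in as plain
milliseconds instead of ktime_t, and INT64_MAX plays the role of the
"not stalled" sentinel. The first failed room check records a timestamp,
later failures return -EBUSY until the timeout is exceeded, after which
the channel is marked broken and -EPIPE is returned. Any successful
check clears the timestamp.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_STALL_NONE	INT64_MAX	/* no stall in progress */
#define SKETCH_CTB_TIMEOUT_MS	1500

struct sketch_ct {
	int64_t stall_time_ms;	/* time of the first failed room check */
	bool broken;		/* set once the stall exceeds the timeout */
};

/* Returns 0 if there is room, -EBUSY while stalled, -EPIPE on timeout. */
static int sketch_has_room_nb(struct sketch_ct *ct, bool has_room,
			      int64_t now_ms)
{
	assert(ct);

	if (!has_room) {
		if (ct->stall_time_ms == SKETCH_STALL_NONE)
			ct->stall_time_ms = now_ms;

		if (now_ms - ct->stall_time_ms > SKETCH_CTB_TIMEOUT_MS) {
			ct->broken = true;
			return -EPIPE;
		}

		return -EBUSY;
	}

	/* Forward progress was made, so forget any earlier stall. */
	ct->stall_time_ms = SKETCH_STALL_NONE;
	return 0;
}
```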

> ---
>   drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c | 62 ++++++++++++++++++++---
>   drivers/gpu/drm/i915/gt/uc/intel_guc_ct.h |  4 ++
>   2 files changed, 59 insertions(+), 7 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c b/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c
> index fb825cc1d090..a9cb7b608520 100644
> --- a/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c
> +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c
> @@ -4,6 +4,9 @@
>    */
>   
>   #include <linux/circ_buf.h>
> +#include <linux/ktime.h>
> +#include <linux/time64.h>
> +#include <linux/timekeeping.h>
>   
>   #include "i915_drv.h"
>   #include "intel_guc_ct.h"
> @@ -316,6 +319,7 @@ int intel_guc_ct_enable(struct intel_guc_ct *ct)
>   		goto err_deregister;
>   
>   	ct->enabled = true;
> +	ct->stall_time = KTIME_MAX;
>   
>   	return 0;
>   
> @@ -388,9 +392,6 @@ static int ct_write(struct intel_guc_ct *ct,
>   	u32 *cmds = ctb->cmds;
>   	unsigned int i;
>   
> -	if (unlikely(ctb->broken))
> -		return -EPIPE;
> -
>   	if (unlikely(desc->status))
>   		goto corrupted;
>   
> @@ -506,6 +507,25 @@ static int wait_for_ct_request_update(struct ct_request *req, u32 *status)
>   	return err;
>   }
>   
> +#define GUC_CTB_TIMEOUT_MS	1500
> +static inline bool ct_deadlocked(struct intel_guc_ct *ct)
> +{
> +	long timeout = GUC_CTB_TIMEOUT_MS;
> +	bool ret = ktime_ms_delta(ktime_get(), ct->stall_time) > timeout;
> +
> +	if (unlikely(ret)) {
> +		struct guc_ct_buffer_desc *send = ct->ctbs.send.desc;
> +		struct guc_ct_buffer_desc *recv = ct->ctbs.recv.desc;
> +
> +		CT_ERROR(ct, "Communication stalled for %lld ms, desc status=%#x,%#x\n",
> +			 ktime_ms_delta(ktime_get(), ct->stall_time),
> +			 send->status, recv->status);
> +		ct->ctbs.send.broken = true;
> +	}
> +
> +	return ret;
> +}
> +
>   static inline bool h2g_has_room(struct intel_guc_ct_buffer *ctb, u32 len_dw)
>   {
>   	struct guc_ct_buffer_desc *desc = ctb->desc;
> @@ -517,6 +537,26 @@ static inline bool h2g_has_room(struct intel_guc_ct_buffer *ctb, u32 len_dw)
>   	return space >= len_dw;
>   }
>   
> +static int has_room_nb(struct intel_guc_ct *ct, u32 len_dw)
> +{
> +	struct intel_guc_ct_buffer *ctb = &ct->ctbs.send;
> +
> +	lockdep_assert_held(&ct->ctbs.send.lock);
> +
> +	if (unlikely(!h2g_has_room(ctb, len_dw))) {
> +		if (ct->stall_time == KTIME_MAX)
> +			ct->stall_time = ktime_get();
> +
> +		if (unlikely(ct_deadlocked(ct)))
> +			return -EPIPE;
> +		else
> +			return -EBUSY;
> +	}
> +
> +	ct->stall_time = KTIME_MAX;
> +	return 0;
> +}
> +
>   static int ct_send_nb(struct intel_guc_ct *ct,
>   		      const u32 *action,
>   		      u32 len,
> @@ -529,11 +569,9 @@ static int ct_send_nb(struct intel_guc_ct *ct,
>   
>   	spin_lock_irqsave(&ctb->lock, spin_flags);
>   
> -	ret = h2g_has_room(ctb, len + GUC_CTB_HDR_LEN);
> -	if (unlikely(!ret)) {
> -		ret = -EBUSY;
> +	ret = has_room_nb(ct, len + GUC_CTB_HDR_LEN);
> +	if (unlikely(ret))
>   		goto out;
> -	}
>   
>   	fence = ct_get_next_fence(ct);
>   	ret = ct_write(ct, action, len, fence, flags);
> @@ -576,8 +614,13 @@ static int ct_send(struct intel_guc_ct *ct,
>   retry:
>   	spin_lock_irqsave(&ctb->lock, flags);
>   	if (unlikely(!h2g_has_room(ctb, len + GUC_CTB_HDR_LEN))) {
> +		if (ct->stall_time == KTIME_MAX)
> +			ct->stall_time = ktime_get();
>   		spin_unlock_irqrestore(&ctb->lock, flags);
>   
> +		if (unlikely(ct_deadlocked(ct)))
> +			return -EPIPE;
> +
>   		if (msleep_interruptible(sleep_period_ms))
>   			return -EINTR;
>   		sleep_period_ms = sleep_period_ms << 1;
> @@ -585,6 +628,8 @@ static int ct_send(struct intel_guc_ct *ct,
>   		goto retry;
>   	}
>   
> +	ct->stall_time = KTIME_MAX;
> +
>   	fence = ct_get_next_fence(ct);
>   	request.fence = fence;
>   	request.status = 0;
> @@ -647,6 +692,9 @@ int intel_guc_ct_send(struct intel_guc_ct *ct, const u32 *action, u32 len,
>   		return -ENODEV;
>   	}
>   
> +	if (unlikely(ct->ctbs.send.broken))
> +		return -EPIPE;
> +
>   	if (flags & INTEL_GUC_CT_SEND_NB)
>   		return ct_send_nb(ct, action, len, flags);
>   
> diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.h b/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.h
> index 5bb8bef024c8..bee03794c1eb 100644
> --- a/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.h
> +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.h
> @@ -9,6 +9,7 @@
>   #include <linux/interrupt.h>
>   #include <linux/spinlock.h>
>   #include <linux/workqueue.h>
> +#include <linux/ktime.h>
>   
>   #include "intel_guc_fwif.h"
>   
> @@ -68,6 +69,9 @@ struct intel_guc_ct {
>   		struct list_head incoming; /* incoming requests */
>   		struct work_struct worker; /* handler for incoming requests */
>   	} requests;
> +
> +	/** @stall_time: time of first time a CTB submission is stalled */
> +	ktime_t stall_time;
>   };
>   
>   void intel_guc_ct_init_early(struct intel_guc_ct *ct);


^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [Intel-gfx] [PATCH 5/7] drm/i915/guc: Add stall timer to non blocking CTB send function
@ 2021-07-06 18:27     ` John Harrison
  0 siblings, 0 replies; 38+ messages in thread
From: John Harrison @ 2021-07-06 18:27 UTC (permalink / raw)
  To: Matthew Brost, intel-gfx, dri-devel

On 7/1/2021 10:15, Matthew Brost wrote:
> Implement a stall timer which fails H2G CTBs once a period of time
> with no forward progress is reached to prevent deadlock.
>
> v2:
>   (Michal)
>    - Improve error message in ct_deadlock()
>    - Set broken when ct_deadlock() returns true
>    - Return -EPIPE on ct_deadlock()
> v3:
>   (Michal)
>    - Add ms to stall timer comment
>   (Matthew)
>    - Move broken check to intel_guc_ct_send()
>
> Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
> Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
> Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Looks plausible to me.

Reviewed-by: John Harrison <John.C.Harrison@Intel.com>

> ---
>   drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c | 62 ++++++++++++++++++++---
>   drivers/gpu/drm/i915/gt/uc/intel_guc_ct.h |  4 ++
>   2 files changed, 59 insertions(+), 7 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c b/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c
> index fb825cc1d090..a9cb7b608520 100644
> --- a/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c
> +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c
> @@ -4,6 +4,9 @@
>    */
>   
>   #include <linux/circ_buf.h>
> +#include <linux/ktime.h>
> +#include <linux/time64.h>
> +#include <linux/timekeeping.h>
>   
>   #include "i915_drv.h"
>   #include "intel_guc_ct.h"
> @@ -316,6 +319,7 @@ int intel_guc_ct_enable(struct intel_guc_ct *ct)
>   		goto err_deregister;
>   
>   	ct->enabled = true;
> +	ct->stall_time = KTIME_MAX;
>   
>   	return 0;
>   
> @@ -388,9 +392,6 @@ static int ct_write(struct intel_guc_ct *ct,
>   	u32 *cmds = ctb->cmds;
>   	unsigned int i;
>   
> -	if (unlikely(ctb->broken))
> -		return -EPIPE;
> -
>   	if (unlikely(desc->status))
>   		goto corrupted;
>   
> @@ -506,6 +507,25 @@ static int wait_for_ct_request_update(struct ct_request *req, u32 *status)
>   	return err;
>   }
>   
> +#define GUC_CTB_TIMEOUT_MS	1500
> +static inline bool ct_deadlocked(struct intel_guc_ct *ct)
> +{
> +	long timeout = GUC_CTB_TIMEOUT_MS;
> +	bool ret = ktime_ms_delta(ktime_get(), ct->stall_time) > timeout;
> +
> +	if (unlikely(ret)) {
> +		struct guc_ct_buffer_desc *send = ct->ctbs.send.desc;
> +		struct guc_ct_buffer_desc *recv = ct->ctbs.send.desc;
> +
> +		CT_ERROR(ct, "Communication stalled for %lld ms, desc status=%#x,%#x\n",
> +			 ktime_ms_delta(ktime_get(), ct->stall_time),
> +			 send->status, recv->status);
> +		ct->ctbs.send.broken = true;
> +	}
> +
> +	return ret;
> +}
> +
>   static inline bool h2g_has_room(struct intel_guc_ct_buffer *ctb, u32 len_dw)
>   {
>   	struct guc_ct_buffer_desc *desc = ctb->desc;
> @@ -517,6 +537,26 @@ static inline bool h2g_has_room(struct intel_guc_ct_buffer *ctb, u32 len_dw)
>   	return space >= len_dw;
>   }
>   
> +static int has_room_nb(struct intel_guc_ct *ct, u32 len_dw)
> +{
> +	struct intel_guc_ct_buffer *ctb = &ct->ctbs.send;
> +
> +	lockdep_assert_held(&ct->ctbs.send.lock);
> +
> +	if (unlikely(!h2g_has_room(ctb, len_dw))) {
> +		if (ct->stall_time == KTIME_MAX)
> +			ct->stall_time = ktime_get();
> +
> +		if (unlikely(ct_deadlocked(ct)))
> +			return -EPIPE;
> +		else
> +			return -EBUSY;
> +	}
> +
> +	ct->stall_time = KTIME_MAX;
> +	return 0;
> +}
> +
>   static int ct_send_nb(struct intel_guc_ct *ct,
>   		      const u32 *action,
>   		      u32 len,
> @@ -529,11 +569,9 @@ static int ct_send_nb(struct intel_guc_ct *ct,
>   
>   	spin_lock_irqsave(&ctb->lock, spin_flags);
>   
> -	ret = h2g_has_room(ctb, len + GUC_CTB_HDR_LEN);
> -	if (unlikely(!ret)) {
> -		ret = -EBUSY;
> +	ret = has_room_nb(ct, len + GUC_CTB_HDR_LEN);
> +	if (unlikely(ret))
>   		goto out;
> -	}
>   
>   	fence = ct_get_next_fence(ct);
>   	ret = ct_write(ct, action, len, fence, flags);
> @@ -576,8 +614,13 @@ static int ct_send(struct intel_guc_ct *ct,
>   retry:
>   	spin_lock_irqsave(&ctb->lock, flags);
>   	if (unlikely(!h2g_has_room(ctb, len + GUC_CTB_HDR_LEN))) {
> +		if (ct->stall_time == KTIME_MAX)
> +			ct->stall_time = ktime_get();
>   		spin_unlock_irqrestore(&ctb->lock, flags);
>   
> +		if (unlikely(ct_deadlocked(ct)))
> +			return -EPIPE;
> +
>   		if (msleep_interruptible(sleep_period_ms))
>   			return -EINTR;
>   		sleep_period_ms = sleep_period_ms << 1;
> @@ -585,6 +628,8 @@ static int ct_send(struct intel_guc_ct *ct,
>   		goto retry;
>   	}
>   
> +	ct->stall_time = KTIME_MAX;
> +
>   	fence = ct_get_next_fence(ct);
>   	request.fence = fence;
>   	request.status = 0;
> @@ -647,6 +692,9 @@ int intel_guc_ct_send(struct intel_guc_ct *ct, const u32 *action, u32 len,
>   		return -ENODEV;
>   	}
>   
> +	if (unlikely(ct->ctbs.send.broken))
> +		return -EPIPE;
> +
>   	if (flags & INTEL_GUC_CT_SEND_NB)
>   		return ct_send_nb(ct, action, len, flags);
>   
> diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.h b/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.h
> index 5bb8bef024c8..bee03794c1eb 100644
> --- a/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.h
> +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.h
> @@ -9,6 +9,7 @@
>   #include <linux/interrupt.h>
>   #include <linux/spinlock.h>
>   #include <linux/workqueue.h>
> +#include <linux/ktime.h>
>   
>   #include "intel_guc_fwif.h"
>   
> @@ -68,6 +69,9 @@ struct intel_guc_ct {
>   		struct list_head incoming; /* incoming requests */
>   		struct work_struct worker; /* handler for incoming requests */
>   	} requests;
> +
> +	/** @stall_time: time at which a CTB submission first stalled */
> +	ktime_t stall_time;
>   };
>   
>   void intel_guc_ct_init_early(struct intel_guc_ct *ct);
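
The stall-timer behaviour in has_room_nb() above can be modelled outside the kernel. The following is only a minimal userspace sketch, not the driver code: struct channel, try_reserve(), STALL_NONE and the caller-supplied millisecond clock are hypothetical names introduced for illustration, and the sketch folds the "broken" check from intel_guc_ct_send() into the same function for brevity.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical userspace model; names are illustrative, not the i915 API. */
#define STALL_TIMEOUT_MS 1500        /* same value as GUC_CTB_TIMEOUT_MS above */
#define STALL_NONE       INT64_MAX   /* stands in for KTIME_MAX: "not stalled" */

struct channel {
	int64_t stall_start_ms;  /* time of the first failed attempt, or STALL_NONE */
	bool broken;             /* latched once a stall exceeds the timeout */
};

/*
 * Decide the outcome of one non-blocking send attempt.
 * has_room is the result of the space check, now_ms a monotonic clock
 * reading supplied by the caller (an assumption of this sketch).
 * Returns 0, -EBUSY (retry later) or -EPIPE (channel considered dead).
 */
static int try_reserve(struct channel *ch, bool has_room, int64_t now_ms)
{
	if (ch->broken)
		return -EPIPE;

	if (!has_room) {
		/* Start the stall clock only on the first failure in a row. */
		if (ch->stall_start_ms == STALL_NONE)
			ch->stall_start_ms = now_ms;

		if (now_ms - ch->stall_start_ms > STALL_TIMEOUT_MS) {
			ch->broken = true;
			return -EPIPE;
		}
		return -EBUSY;
	}

	/* Space was found, so any stall in progress is over. */
	ch->stall_start_ms = STALL_NONE;
	return 0;
}
```

A caller would treat -EBUSY as "back off and retry" and -EPIPE as "stop sending and wait for a reset"; a single successful attempt clears the stall clock, so only an uninterrupted run of failures can trip the timeout.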

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


* Re: [PATCH 6/7] drm/i915/guc: Optimize CTB writes and reads
  2021-07-01 17:15   ` [Intel-gfx] " Matthew Brost
@ 2021-07-06 19:00     ` John Harrison
  -1 siblings, 0 replies; 38+ messages in thread
From: John Harrison @ 2021-07-06 19:00 UTC (permalink / raw)
  To: Matthew Brost, intel-gfx, dri-devel; +Cc: Michal.Wajdeczko

On 7/1/2021 10:15, Matthew Brost wrote:
> CTB writes are now in the path of command submission and should be
> optimized for performance. Rather than reading CTB descriptor values
> (e.g. head, tail) which could result in accesses across the PCIe bus,
> store shadow local copies and only read/write the descriptor values when
> absolutely necessary. Also store the current space in each channel
> locally.
>
> v2:
>   (Michal)
>    - Add additional sanity checks for head / tail pointers
>    - Use GUC_CTB_HDR_LEN rather than magic 1
>
> Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
> Signed-off-by: Matthew Brost <matthew.brost@intel.com>
> ---
>   drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c | 88 +++++++++++++++--------
>   drivers/gpu/drm/i915/gt/uc/intel_guc_ct.h |  6 ++
>   2 files changed, 65 insertions(+), 29 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c b/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c
> index a9cb7b608520..5b8b4ff609e2 100644
> --- a/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c
> +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c
> @@ -130,6 +130,10 @@ static void guc_ct_buffer_desc_init(struct guc_ct_buffer_desc *desc)
>   static void guc_ct_buffer_reset(struct intel_guc_ct_buffer *ctb)
>   {
>   	ctb->broken = false;
> +	ctb->tail = 0;
> +	ctb->head = 0;
> +	ctb->space = CIRC_SPACE(ctb->tail, ctb->head, ctb->size);
> +
>   	guc_ct_buffer_desc_init(ctb->desc);
>   }
>   
> @@ -383,10 +387,8 @@ static int ct_write(struct intel_guc_ct *ct,
>   {
>   	struct intel_guc_ct_buffer *ctb = &ct->ctbs.send;
>   	struct guc_ct_buffer_desc *desc = ctb->desc;
> -	u32 head = desc->head;
> -	u32 tail = desc->tail;
> +	u32 tail = ctb->tail;
>   	u32 size = ctb->size;
> -	u32 used;
>   	u32 header;
>   	u32 hxg;
>   	u32 *cmds = ctb->cmds;
> @@ -395,25 +397,22 @@ static int ct_write(struct intel_guc_ct *ct,
>   	if (unlikely(desc->status))
>   		goto corrupted;
>   
> -	if (unlikely((tail | head) >= size)) {
> +	GEM_BUG_ON(tail > size);
> +
> +#ifdef CONFIG_DRM_I915_DEBUG_GUC
> +	if (unlikely(tail != READ_ONCE(desc->tail))) {
> +		CT_ERROR(ct, "Tail was modified %u != %u\n",
> +			 desc->tail, ctb->tail);
> +		desc->status |= GUC_CTB_STATUS_MISMATCH;
> +		goto corrupted;
> +	}
> +	if (unlikely((desc->tail | desc->head) >= size)) {
>   		CT_ERROR(ct, "Invalid offsets head=%u tail=%u (size=%u)\n",
> -			 head, tail, size);
> +			 desc->head, desc->tail, size);
>   		desc->status |= GUC_CTB_STATUS_OVERFLOW;
>   		goto corrupted;
>   	}
> -
> -	/*
> -	 * tail == head condition indicates empty. GuC FW does not support
> -	 * using up the entire buffer to get tail == head meaning full.
> -	 */
> -	if (tail < head)
> -		used = (size - head) + tail;
> -	else
> -		used = tail - head;
> -
> -	/* make sure there is a space including extra dw for the fence */
> -	if (unlikely(used + len + GUC_CTB_HDR_LEN >= size))
> -		return -ENOSPC;
> +#endif
>   
>   	/*
>   	 * dw0: CT header (including fence)
> @@ -454,7 +453,9 @@ static int ct_write(struct intel_guc_ct *ct,
>   	write_barrier(ct);
>   
>   	/* now update descriptor */
> +	ctb->tail = tail;
>   	WRITE_ONCE(desc->tail, tail);
> +	ctb->space -= len + GUC_CTB_HDR_LEN;
>   
>   	return 0;
>   
> @@ -470,7 +471,7 @@ static int ct_write(struct intel_guc_ct *ct,
>    * @req:	pointer to pending request
>    * @status:	placeholder for status
>    *
> - * For each sent request, Guc shall send bac CT response message.
> + * For each sent request, GuC shall send back CT response message.
>    * Our message handler will update status of tracked request once
>    * response message with given fence is received. Wait here and
>    * check for valid response status value.
> @@ -526,24 +527,35 @@ static inline bool ct_deadlocked(struct intel_guc_ct *ct)
>   	return ret;
>   }
>   
> -static inline bool h2g_has_room(struct intel_guc_ct_buffer *ctb, u32 len_dw)
> +static inline bool h2g_has_room(struct intel_guc_ct *ct, u32 len_dw)
>   {
> -	struct guc_ct_buffer_desc *desc = ctb->desc;
> -	u32 head = READ_ONCE(desc->head);
> +	struct intel_guc_ct_buffer *ctb = &ct->ctbs.send;
> +	u32 head;
>   	u32 space;
>   
> -	space = CIRC_SPACE(desc->tail, head, ctb->size);
> +	if (ctb->space >= len_dw)
> +		return true;
> +
> +	head = READ_ONCE(ctb->desc->head);
> +	if (unlikely(head > ctb->size)) {
> +		CT_ERROR(ct, "Corrupted descriptor head=%u tail=%u size=%u\n",
> +			 ctb->desc->head, ctb->desc->tail, ctb->size);
> +		ctb->desc->status |= GUC_CTB_STATUS_OVERFLOW;
> +		ctb->broken = true;
> +		return false;
> +	}
> +
> +	space = CIRC_SPACE(ctb->tail, head, ctb->size);
> +	ctb->space = space;
>   
>   	return space >= len_dw;
>   }
>   
>   static int has_room_nb(struct intel_guc_ct *ct, u32 len_dw)
>   {
> -	struct intel_guc_ct_buffer *ctb = &ct->ctbs.send;
> -
>   	lockdep_assert_held(&ct->ctbs.send.lock);
>   
> -	if (unlikely(!h2g_has_room(ctb, len_dw))) {
> +	if (unlikely(!h2g_has_room(ct, len_dw))) {
>   		if (ct->stall_time == KTIME_MAX)
>   			ct->stall_time = ktime_get();
>   
> @@ -613,7 +625,7 @@ static int ct_send(struct intel_guc_ct *ct,
>   	 */
>   retry:
>   	spin_lock_irqsave(&ctb->lock, flags);
> -	if (unlikely(!h2g_has_room(ctb, len + GUC_CTB_HDR_LEN))) {
> +	if (unlikely(!h2g_has_room(ct, len + GUC_CTB_HDR_LEN))) {
>   		if (ct->stall_time == KTIME_MAX)
>   			ct->stall_time = ktime_get();
>   		spin_unlock_irqrestore(&ctb->lock, flags);
> @@ -733,7 +745,7 @@ static int ct_read(struct intel_guc_ct *ct, struct ct_incoming_msg **msg)
>   {
>   	struct intel_guc_ct_buffer *ctb = &ct->ctbs.recv;
>   	struct guc_ct_buffer_desc *desc = ctb->desc;
> -	u32 head = desc->head;
> +	u32 head = ctb->head;
>   	u32 tail = desc->tail;
>   	u32 size = ctb->size;
>   	u32 *cmds = ctb->cmds;
> @@ -748,12 +760,29 @@ static int ct_read(struct intel_guc_ct *ct, struct ct_incoming_msg **msg)
>   	if (unlikely(desc->status))
>   		goto corrupted;
>   
> -	if (unlikely((tail | head) >= size)) {
> +	GEM_BUG_ON(head > size);
Is the BUG_ON necessary? Both options below do the same check, but as a
corrupted-buffer test (with subsequent recovery by GT reset?) rather than
killing the driver.

> +
> +#ifdef CONFIG_DRM_I915_DEBUG_GUC
> +	if (unlikely(head != READ_ONCE(desc->head))) {
> +		CT_ERROR(ct, "Head was modified %u != %u\n",
> +			 desc->head, ctb->head);
> +		desc->status |= GUC_CTB_STATUS_MISMATCH;
> +		goto corrupted;
> +	}
> +	if (unlikely((desc->tail | desc->head) >= size)) {
> +		CT_ERROR(ct, "Invalid offsets head=%u tail=%u (size=%u)\n",
> +			 head, tail, size);
> +		desc->status |= GUC_CTB_STATUS_OVERFLOW;
> +		goto corrupted;
> +	}
> +#else
> +	if (unlikely((tail | ctb->head) >= size)) {
Could just be 'head' rather than 'ctb->head'.

John.

>   		CT_ERROR(ct, "Invalid offsets head=%u tail=%u (size=%u)\n",
>   			 head, tail, size);
>   		desc->status |= GUC_CTB_STATUS_OVERFLOW;
>   		goto corrupted;
>   	}
> +#endif
>   
>   	/* tail == head condition indicates empty */
>   	available = tail - head;
> @@ -803,6 +832,7 @@ static int ct_read(struct intel_guc_ct *ct, struct ct_incoming_msg **msg)
>   	}
>   	CT_DEBUG(ct, "received %*ph\n", 4 * len, (*msg)->msg);
>   
> +	ctb->head = head;
>   	/* now update descriptor */
>   	WRITE_ONCE(desc->head, head);
>   
> diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.h b/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.h
> index bee03794c1eb..edd1bba0445d 100644
> --- a/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.h
> +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.h
> @@ -33,6 +33,9 @@ struct intel_guc;
>    * @desc: pointer to the buffer descriptor
>    * @cmds: pointer to the commands buffer
>    * @size: size of the commands buffer in dwords
> + * @head: local shadow copy of head in dwords
> + * @tail: local shadow copy of tail in dwords
> + * @space: local shadow copy of space in dwords
>    * @broken: flag to indicate if descriptor data is broken
>    */
>   struct intel_guc_ct_buffer {
> @@ -40,6 +43,9 @@ struct intel_guc_ct_buffer {
>   	struct guc_ct_buffer_desc *desc;
>   	u32 *cmds;
>   	u32 size;
> +	u32 tail;
> +	u32 head;
> +	u32 space;
>   	bool broken;
>   };
>   
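
For readers following the shadow-copy idea from the commit message, here is a minimal userspace sketch of the fast path / slow path split that h2g_has_room() implements. It is an illustration under stated assumptions, not the driver code: struct shared_desc, struct producer, ring_space(), has_room(), commit() and the desc_reads counter are hypothetical, the ring size is assumed to be a power of two, and the head check uses >= where the patch uses >.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Free slots in a ring of `size` entries (size must be a power of two).
 * Same arithmetic as the kernel's CIRC_SPACE(): one slot always stays
 * empty so that tail == head can only mean "empty".
 */
static uint32_t ring_space(uint32_t tail, uint32_t head, uint32_t size)
{
	return (head - (tail + 1)) & (size - 1);
}

/* Hypothetical stand-in for the descriptor shared with the consumer. */
struct shared_desc {
	volatile uint32_t head;   /* advanced by the consumer */
	volatile uint32_t tail;   /* published by the producer */
};

struct producer {
	struct shared_desc *desc;
	uint32_t size;            /* ring size in dwords, power of two */
	uint32_t tail;            /* producer-owned shadow of desc->tail */
	uint32_t space;           /* cached lower bound on the free space */
	unsigned int desc_reads;  /* slow-path reads, counted for illustration */
};

static bool has_room(struct producer *p, uint32_t len_dw)
{
	uint32_t head;

	/* Fast path: the cached value can only under-estimate the space. */
	if (p->space >= len_dw)
		return true;

	/* Slow path: one read of the shared descriptor refreshes the cache. */
	head = p->desc->head;
	p->desc_reads++;
	if (head >= p->size)
		return false;     /* value written by the peer is out of range */

	p->space = ring_space(p->tail, head, p->size);
	return p->space >= len_dw;
}

/* Account for len_dw dwords written at the shadow tail and publish it. */
static void commit(struct producer *p, uint32_t len_dw)
{
	p->tail = (p->tail + len_dw) & (p->size - 1);
	p->desc->tail = p->tail;
	p->space -= len_dw;
}
```

The cached space is safe to trust because the consumer only ever frees slots, so the real free space is never smaller than the cached value; the shared descriptor needs to be read only when the cache says the message does not fit.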




* Re: [PATCH 6/7] drm/i915/guc: Optimize CTB writes and reads
  2021-07-06 19:00     ` [Intel-gfx] " John Harrison
@ 2021-07-06 19:12       ` Michal Wajdeczko
  -1 siblings, 0 replies; 38+ messages in thread
From: Michal Wajdeczko @ 2021-07-06 19:12 UTC (permalink / raw)
  To: John Harrison, Matthew Brost, intel-gfx, dri-devel



On 06.07.2021 21:00, John Harrison wrote:
> On 7/1/2021 10:15, Matthew Brost wrote:
>> CTB writes are now in the path of command submission and should be
>> optimized for performance. Rather than reading CTB descriptor values
>> (e.g. head, tail) which could result in accesses across the PCIe bus,
>> store shadow local copies and only read/write the descriptor values when
>> absolutely necessary. Also store the current space in each channel
>> locally.
>>
>> v2:
>>   (Michal)
>>    - Add additional sanity checks for head / tail pointers
>>    - Use GUC_CTB_HDR_LEN rather than magic 1
>>
>> Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
>> Signed-off-by: Matthew Brost <matthew.brost@intel.com>
>> ---
>>   drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c | 88 +++++++++++++++--------
>>   drivers/gpu/drm/i915/gt/uc/intel_guc_ct.h |  6 ++
>>   2 files changed, 65 insertions(+), 29 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c
>> b/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c
>> index a9cb7b608520..5b8b4ff609e2 100644
>> --- a/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c
>> +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c
>> @@ -130,6 +130,10 @@ static void guc_ct_buffer_desc_init(struct
>> guc_ct_buffer_desc *desc)
>>   static void guc_ct_buffer_reset(struct intel_guc_ct_buffer *ctb)
>>   {
>>       ctb->broken = false;
>> +    ctb->tail = 0;
>> +    ctb->head = 0;
>> +    ctb->space = CIRC_SPACE(ctb->tail, ctb->head, ctb->size);
>> +
>>       guc_ct_buffer_desc_init(ctb->desc);
>>   }
>>   @@ -383,10 +387,8 @@ static int ct_write(struct intel_guc_ct *ct,
>>   {
>>       struct intel_guc_ct_buffer *ctb = &ct->ctbs.send;
>>       struct guc_ct_buffer_desc *desc = ctb->desc;
>> -    u32 head = desc->head;
>> -    u32 tail = desc->tail;
>> +    u32 tail = ctb->tail;
>>       u32 size = ctb->size;
>> -    u32 used;
>>       u32 header;
>>       u32 hxg;
>>       u32 *cmds = ctb->cmds;
>> @@ -395,25 +397,22 @@ static int ct_write(struct intel_guc_ct *ct,
>>       if (unlikely(desc->status))
>>           goto corrupted;
>>   -    if (unlikely((tail | head) >= size)) {
>> +    GEM_BUG_ON(tail > size);
>> +
>> +#ifdef CONFIG_DRM_I915_DEBUG_GUC
>> +    if (unlikely(tail != READ_ONCE(desc->tail))) {
>> +        CT_ERROR(ct, "Tail was modified %u != %u\n",
>> +             desc->tail, ctb->tail);
>> +        desc->status |= GUC_CTB_STATUS_MISMATCH;
>> +        goto corrupted;
>> +    }
>> +    if (unlikely((desc->tail | desc->head) >= size)) {
>>           CT_ERROR(ct, "Invalid offsets head=%u tail=%u (size=%u)\n",
>> -             head, tail, size);
>> +             desc->head, desc->tail, size);
>>           desc->status |= GUC_CTB_STATUS_OVERFLOW;
>>           goto corrupted;
>>       }
>> -
>> -    /*
>> -     * tail == head condition indicates empty. GuC FW does not support
>> -     * using up the entire buffer to get tail == head meaning full.
>> -     */
>> -    if (tail < head)
>> -        used = (size - head) + tail;
>> -    else
>> -        used = tail - head;
>> -
>> -    /* make sure there is a space including extra dw for the fence */
>> -    if (unlikely(used + len + GUC_CTB_HDR_LEN >= size))
>> -        return -ENOSPC;
>> +#endif
>>         /*
>>        * dw0: CT header (including fence)
>> @@ -454,7 +453,9 @@ static int ct_write(struct intel_guc_ct *ct,
>>       write_barrier(ct);
>>         /* now update descriptor */
>> +    ctb->tail = tail;
>>       WRITE_ONCE(desc->tail, tail);
>> +    ctb->space -= len + GUC_CTB_HDR_LEN;
>>         return 0;
>>   @@ -470,7 +471,7 @@ static int ct_write(struct intel_guc_ct *ct,
>>    * @req:    pointer to pending request
>>    * @status:    placeholder for status
>>    *
>> - * For each sent request, Guc shall send bac CT response message.
>> + * For each sent request, GuC shall send back CT response message.
>>    * Our message handler will update status of tracked request once
>>    * response message with given fence is received. Wait here and
>>    * check for valid response status value.
>> @@ -526,24 +527,35 @@ static inline bool ct_deadlocked(struct
>> intel_guc_ct *ct)
>>       return ret;
>>   }
>>   -static inline bool h2g_has_room(struct intel_guc_ct_buffer *ctb,
>> u32 len_dw)
>> +static inline bool h2g_has_room(struct intel_guc_ct *ct, u32 len_dw)
>>   {
>> -    struct guc_ct_buffer_desc *desc = ctb->desc;
>> -    u32 head = READ_ONCE(desc->head);
>> +    struct intel_guc_ct_buffer *ctb = &ct->ctbs.send;
>> +    u32 head;
>>       u32 space;
>>   -    space = CIRC_SPACE(desc->tail, head, ctb->size);
>> +    if (ctb->space >= len_dw)
>> +        return true;
>> +
>> +    head = READ_ONCE(ctb->desc->head);
>> +    if (unlikely(head > ctb->size)) {
>> +        CT_ERROR(ct, "Corrupted descriptor head=%u tail=%u size=%u\n",
>> +             ctb->desc->head, ctb->desc->tail, ctb->size);
>> +        ctb->desc->status |= GUC_CTB_STATUS_OVERFLOW;
>> +        ctb->broken = true;
>> +        return false;
>> +    }
>> +
>> +    space = CIRC_SPACE(ctb->tail, head, ctb->size);
>> +    ctb->space = space;
>>         return space >= len_dw;
>>   }
>>     static int has_room_nb(struct intel_guc_ct *ct, u32 len_dw)
>>   {
>> -    struct intel_guc_ct_buffer *ctb = &ct->ctbs.send;
>> -
>>       lockdep_assert_held(&ct->ctbs.send.lock);
>>   -    if (unlikely(!h2g_has_room(ctb, len_dw))) {
>> +    if (unlikely(!h2g_has_room(ct, len_dw))) {
>>           if (ct->stall_time == KTIME_MAX)
>>               ct->stall_time = ktime_get();
>>   @@ -613,7 +625,7 @@ static int ct_send(struct intel_guc_ct *ct,
>>        */
>>   retry:
>>       spin_lock_irqsave(&ctb->lock, flags);
>> -    if (unlikely(!h2g_has_room(ctb, len + GUC_CTB_HDR_LEN))) {
>> +    if (unlikely(!h2g_has_room(ct, len + GUC_CTB_HDR_LEN))) {
>>           if (ct->stall_time == KTIME_MAX)
>>               ct->stall_time = ktime_get();
>>           spin_unlock_irqrestore(&ctb->lock, flags);
>> @@ -733,7 +745,7 @@ static int ct_read(struct intel_guc_ct *ct, struct
>> ct_incoming_msg **msg)
>>   {
>>       struct intel_guc_ct_buffer *ctb = &ct->ctbs.recv;
>>       struct guc_ct_buffer_desc *desc = ctb->desc;
>> -    u32 head = desc->head;
>> +    u32 head = ctb->head;
>>       u32 tail = desc->tail;
>>       u32 size = ctb->size;
>>       u32 *cmds = ctb->cmds;
>> @@ -748,12 +760,29 @@ static int ct_read(struct intel_guc_ct *ct,
>> struct ct_incoming_msg **msg)
>>       if (unlikely(desc->status))
>>           goto corrupted;
>>   -    if (unlikely((tail | head) >= size)) {
>> +    GEM_BUG_ON(head > size);
> Is the BUG_ON necessary? Both options below do the same check, but as a
> corrupted-buffer test (with subsequent recovery by GT reset?) rather than
> killing the driver.

"head" and "size" are now fully owned by the driver.
The BUG_ON here is to make sure the driver is coded correctly.

> 
>> +
>> +#ifdef CONFIG_DRM_I915_DEBUG_GUC
>> +    if (unlikely(head != READ_ONCE(desc->head))) {
>> +        CT_ERROR(ct, "Head was modified %u != %u\n",
>> +             desc->head, ctb->head);
>> +        desc->status |= GUC_CTB_STATUS_MISMATCH;
>> +        goto corrupted;
>> +    }
>> +    if (unlikely((desc->tail | desc->head) >= size)) {
>> +        CT_ERROR(ct, "Invalid offsets head=%u tail=%u (size=%u)\n",
>> +             head, tail, size);
>> +        desc->status |= GUC_CTB_STATUS_OVERFLOW;
>> +        goto corrupted;
>> +    }
>> +#else
>> +    if (unlikely((tail | ctb->head) >= size)) {
> Could just be 'head' rather than 'ctb->head'.

Or drop "ctb->head" completely, since this is a driver-owned field and
the BUG_ON above already tests it.

Michal

> 
> John.
> 
>>           CT_ERROR(ct, "Invalid offsets head=%u tail=%u (size=%u)\n",
>>                head, tail, size);
>>           desc->status |= GUC_CTB_STATUS_OVERFLOW;
>>           goto corrupted;
>>       }
>> +#endif
>>         /* tail == head condition indicates empty */
>>       available = tail - head;
>> @@ -803,6 +832,7 @@ static int ct_read(struct intel_guc_ct *ct, struct
>> ct_incoming_msg **msg)
>>       }
>>       CT_DEBUG(ct, "received %*ph\n", 4 * len, (*msg)->msg);
>>   +    ctb->head = head;
>>       /* now update descriptor */
>>       WRITE_ONCE(desc->head, head);
>>   diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.h
>> b/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.h
>> index bee03794c1eb..edd1bba0445d 100644
>> --- a/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.h
>> +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.h
>> @@ -33,6 +33,9 @@ struct intel_guc;
>>    * @desc: pointer to the buffer descriptor
>>    * @cmds: pointer to the commands buffer
>>    * @size: size of the commands buffer in dwords
>> + * @head: local shadow copy of head in dwords
>> + * @tail: local shadow copy of tail in dwords
>> + * @space: local shadow copy of space in dwords
>>    * @broken: flag to indicate if descriptor data is broken
>>    */
>>   struct intel_guc_ct_buffer {
>> @@ -40,6 +43,9 @@ struct intel_guc_ct_buffer {
>>       struct guc_ct_buffer_desc *desc;
>>       u32 *cmds;
>>       u32 size;
>> +    u32 tail;
>> +    u32 head;
>> +    u32 space;
>>       bool broken;
>>   };
>>   
> 
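
The distinction drawn in this reply (an assertion for values only the driver writes, a recoverable error for values the other side writes) can be sketched in plain C. This is a hypothetical userspace illustration, not the driver code: struct reader and available_dw() are invented names, assert() stands in for GEM_BUG_ON(), and -EPIPE stands in for the corrupted-buffer path.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Hypothetical reader state; both fields are written only by this code. */
struct reader {
	uint32_t head;   /* shadow head, in dwords */
	uint32_t size;   /* ring size, in dwords */
};

/*
 * Returns the number of dwords available to read, or -EPIPE if the tail
 * supplied by the peer is out of range.
 */
static int available_dw(const struct reader *r, uint32_t peer_tail)
{
	int32_t available;

	/* Only our own code writes head, so a bad value is a coding bug. */
	assert(r->head < r->size);

	/* The tail comes from the other side: validate it and recover. */
	if (peer_tail >= r->size)
		return -EPIPE;

	/* tail == head means empty; a negative difference means wrap-around. */
	available = (int32_t)peer_tail - (int32_t)r->head;
	if (available < 0)
		available += (int32_t)r->size;

	return available;
}
```

With this split, a single runtime check on the peer-supplied tail is enough in the non-debug path, and the shadow head never needs to appear in the recoverable check.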


* Re: [Intel-gfx] [PATCH 6/7] drm/i915/guc: Optimize CTB writes and reads
@ 2021-07-06 19:12       ` Michal Wajdeczko
  0 siblings, 0 replies; 38+ messages in thread
From: Michal Wajdeczko @ 2021-07-06 19:12 UTC (permalink / raw)
  To: John Harrison, Matthew Brost, intel-gfx, dri-devel



On 06.07.2021 21:00, John Harrison wrote:
> On 7/1/2021 10:15, Matthew Brost wrote:
>> CTB writes are now in the path of command submission and should be
>> optimized for performance. Rather than reading CTB descriptor values
>> (e.g. head, tail) which could result in accesses across the PCIe bus,
>> store shadow local copies and only read/write the descriptor values when
>> absolutely necessary. Also store the current space in each channel
>> locally.
>>
>> v2:
>>   (Michal)
>>    - Add additional sanity checks for head / tail pointers
>>    - Use GUC_CTB_HDR_LEN rather than magic 1
>>
>> Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
>> Signed-off-by: Matthew Brost <matthew.brost@intel.com>
>> ---
>>   drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c | 88 +++++++++++++++--------
>>   drivers/gpu/drm/i915/gt/uc/intel_guc_ct.h |  6 ++
>>   2 files changed, 65 insertions(+), 29 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c
>> b/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c
>> index a9cb7b608520..5b8b4ff609e2 100644
>> --- a/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c
>> +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c
>> @@ -130,6 +130,10 @@ static void guc_ct_buffer_desc_init(struct
>> guc_ct_buffer_desc *desc)
>>   static void guc_ct_buffer_reset(struct intel_guc_ct_buffer *ctb)
>>   {
>>       ctb->broken = false;
>> +    ctb->tail = 0;
>> +    ctb->head = 0;
>> +    ctb->space = CIRC_SPACE(ctb->tail, ctb->head, ctb->size);
>> +
>>       guc_ct_buffer_desc_init(ctb->desc);
>>   }
>>   @@ -383,10 +387,8 @@ static int ct_write(struct intel_guc_ct *ct,
>>   {
>>       struct intel_guc_ct_buffer *ctb = &ct->ctbs.send;
>>       struct guc_ct_buffer_desc *desc = ctb->desc;
>> -    u32 head = desc->head;
>> -    u32 tail = desc->tail;
>> +    u32 tail = ctb->tail;
>>       u32 size = ctb->size;
>> -    u32 used;
>>       u32 header;
>>       u32 hxg;
>>       u32 *cmds = ctb->cmds;
>> @@ -395,25 +397,22 @@ static int ct_write(struct intel_guc_ct *ct,
>>       if (unlikely(desc->status))
>>           goto corrupted;
>>   -    if (unlikely((tail | head) >= size)) {
>> +    GEM_BUG_ON(tail > size);
>> +
>> +#ifdef CONFIG_DRM_I915_DEBUG_GUC
>> +    if (unlikely(tail != READ_ONCE(desc->tail))) {
>> +        CT_ERROR(ct, "Tail was modified %u != %u\n",
>> +             desc->tail, ctb->tail);
>> +        desc->status |= GUC_CTB_STATUS_MISMATCH;
>> +        goto corrupted;
>> +    }
>> +    if (unlikely((desc->tail | desc->head) >= size)) {
>>           CT_ERROR(ct, "Invalid offsets head=%u tail=%u (size=%u)\n",
>> -             head, tail, size);
>> +             desc->head, desc->tail, size);
>>           desc->status |= GUC_CTB_STATUS_OVERFLOW;
>>           goto corrupted;
>>       }
>> -
>> -    /*
>> -     * tail == head condition indicates empty. GuC FW does not support
>> -     * using up the entire buffer to get tail == head meaning full.
>> -     */
>> -    if (tail < head)
>> -        used = (size - head) + tail;
>> -    else
>> -        used = tail - head;
>> -
>> -    /* make sure there is a space including extra dw for the fence */
>> -    if (unlikely(used + len + GUC_CTB_HDR_LEN >= size))
>> -        return -ENOSPC;
>> +#endif
>>         /*
>>        * dw0: CT header (including fence)
>> @@ -454,7 +453,9 @@ static int ct_write(struct intel_guc_ct *ct,
>>       write_barrier(ct);
>>         /* now update descriptor */
>> +    ctb->tail = tail;
>>       WRITE_ONCE(desc->tail, tail);
>> +    ctb->space -= len + GUC_CTB_HDR_LEN;
>>         return 0;
>>   @@ -470,7 +471,7 @@ static int ct_write(struct intel_guc_ct *ct,
>>    * @req:    pointer to pending request
>>    * @status:    placeholder for status
>>    *
>> - * For each sent request, Guc shall send bac CT response message.
>> + * For each sent request, GuC shall send back CT response message.
>>    * Our message handler will update status of tracked request once
>>    * response message with given fence is received. Wait here and
>>    * check for valid response status value.
>> @@ -526,24 +527,35 @@ static inline bool ct_deadlocked(struct
>> intel_guc_ct *ct)
>>       return ret;
>>   }
>>   -static inline bool h2g_has_room(struct intel_guc_ct_buffer *ctb,
>> u32 len_dw)
>> +static inline bool h2g_has_room(struct intel_guc_ct *ct, u32 len_dw)
>>   {
>> -    struct guc_ct_buffer_desc *desc = ctb->desc;
>> -    u32 head = READ_ONCE(desc->head);
>> +    struct intel_guc_ct_buffer *ctb = &ct->ctbs.send;
>> +    u32 head;
>>       u32 space;
>>   -    space = CIRC_SPACE(desc->tail, head, ctb->size);
>> +    if (ctb->space >= len_dw)
>> +        return true;
>> +
>> +    head = READ_ONCE(ctb->desc->head);
>> +    if (unlikely(head > ctb->size)) {
>> +        CT_ERROR(ct, "Corrupted descriptor head=%u tail=%u size=%u\n",
>> +             ctb->desc->head, ctb->desc->tail, ctb->size);
>> +        ctb->desc->status |= GUC_CTB_STATUS_OVERFLOW;
>> +        ctb->broken = true;
>> +        return false;
>> +    }
>> +
>> +    space = CIRC_SPACE(ctb->tail, head, ctb->size);
>> +    ctb->space = space;
>>         return space >= len_dw;
>>   }
>>     static int has_room_nb(struct intel_guc_ct *ct, u32 len_dw)
>>   {
>> -    struct intel_guc_ct_buffer *ctb = &ct->ctbs.send;
>> -
>>       lockdep_assert_held(&ct->ctbs.send.lock);
>>   -    if (unlikely(!h2g_has_room(ctb, len_dw))) {
>> +    if (unlikely(!h2g_has_room(ct, len_dw))) {
>>           if (ct->stall_time == KTIME_MAX)
>>               ct->stall_time = ktime_get();
>>   @@ -613,7 +625,7 @@ static int ct_send(struct intel_guc_ct *ct,
>>        */
>>   retry:
>>       spin_lock_irqsave(&ctb->lock, flags);
>> -    if (unlikely(!h2g_has_room(ctb, len + GUC_CTB_HDR_LEN))) {
>> +    if (unlikely(!h2g_has_room(ct, len + GUC_CTB_HDR_LEN))) {
>>           if (ct->stall_time == KTIME_MAX)
>>               ct->stall_time = ktime_get();
>>           spin_unlock_irqrestore(&ctb->lock, flags);
>> @@ -733,7 +745,7 @@ static int ct_read(struct intel_guc_ct *ct, struct
>> ct_incoming_msg **msg)
>>   {
>>       struct intel_guc_ct_buffer *ctb = &ct->ctbs.recv;
>>       struct guc_ct_buffer_desc *desc = ctb->desc;
>> -    u32 head = desc->head;
>> +    u32 head = ctb->head;
>>       u32 tail = desc->tail;
>>       u32 size = ctb->size;
>>       u32 *cmds = ctb->cmds;
>> @@ -748,12 +760,29 @@ static int ct_read(struct intel_guc_ct *ct,
>> struct ct_incoming_msg **msg)
>>       if (unlikely(desc->status))
>>           goto corrupted;
>>   -    if (unlikely((tail | head) >= size)) {
>> +    GEM_BUG_ON(head > size);
> Is the BUG_ON necessary, given that both options below do the same check
> but as a corrupted-buffer test (with subsequent recovery by GT reset?)
> rather than killing the driver?

"head" and "size" are now fully owned by the driver.
The BUG_ON here is to make sure the driver is coded correctly.

> 
>> +
>> +#ifdef CONFIG_DRM_I915_DEBUG_GUC
>> +    if (unlikely(head != READ_ONCE(desc->head))) {
>> +        CT_ERROR(ct, "Head was modified %u != %u\n",
>> +             desc->head, ctb->head);
>> +        desc->status |= GUC_CTB_STATUS_MISMATCH;
>> +        goto corrupted;
>> +    }
>> +    if (unlikely((desc->tail | desc->head) >= size)) {
>> +        CT_ERROR(ct, "Invalid offsets head=%u tail=%u (size=%u)\n",
>> +             head, tail, size);
>> +        desc->status |= GUC_CTB_STATUS_OVERFLOW;
>> +        goto corrupted;
>> +    }
>> +#else
>> +    if (unlikely((tail | ctb->head) >= size)) {
> Could just be 'head' rather than 'ctb->head'.

Or drop "ctb->head" completely, since this is a driver-owned field and
you already have a BUG_ON above to test it.

Michal

> 
> John.
> 
>>           CT_ERROR(ct, "Invalid offsets head=%u tail=%u (size=%u)\n",
>>                head, tail, size);
>>           desc->status |= GUC_CTB_STATUS_OVERFLOW;
>>           goto corrupted;
>>       }
>> +#endif
>>         /* tail == head condition indicates empty */
>>       available = tail - head;
>> @@ -803,6 +832,7 @@ static int ct_read(struct intel_guc_ct *ct, struct
>> ct_incoming_msg **msg)
>>       }
>>       CT_DEBUG(ct, "received %*ph\n", 4 * len, (*msg)->msg);
>>   +    ctb->head = head;
>>       /* now update descriptor */
>>       WRITE_ONCE(desc->head, head);
>>   diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.h
>> b/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.h
>> index bee03794c1eb..edd1bba0445d 100644
>> --- a/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.h
>> +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.h
>> @@ -33,6 +33,9 @@ struct intel_guc;
>>    * @desc: pointer to the buffer descriptor
>>    * @cmds: pointer to the commands buffer
>>    * @size: size of the commands buffer in dwords
>> + * @head: local shadow copy of head in dwords
>> + * @tail: local shadow copy of tail in dwords
>> + * @space: local shadow copy of space in dwords
>>    * @broken: flag to indicate if descriptor data is broken
>>    */
>>   struct intel_guc_ct_buffer {
>> @@ -40,6 +43,9 @@ struct intel_guc_ct_buffer {
>>       struct guc_ct_buffer_desc *desc;
>>       u32 *cmds;
>>       u32 size;
>> +    u32 tail;
>> +    u32 head;
>> +    u32 space;
>>       bool broken;
>>   };
>>   
> 

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH 6/7] drm/i915/guc: Optimize CTB writes and reads
  2021-07-06 19:12       ` [Intel-gfx] " Michal Wajdeczko
@ 2021-07-06 19:19         ` John Harrison
  -1 siblings, 0 replies; 38+ messages in thread
From: John Harrison @ 2021-07-06 19:19 UTC (permalink / raw)
  To: Michal Wajdeczko, Matthew Brost, intel-gfx, dri-devel

On 7/6/2021 12:12, Michal Wajdeczko wrote:
> On 06.07.2021 21:00, John Harrison wrote:
>> On 7/1/2021 10:15, Matthew Brost wrote:
>>> CTB writes are now in the path of command submission and should be
>>> optimized for performance. Rather than reading CTB descriptor values
>>> (e.g. head, tail) which could result in accesses across the PCIe bus,
>>> store shadow local copies and only read/write the descriptor values when
>>> absolutely necessary. Also store the current space in each channel
>>> locally.
>>>
>>> v2:
>>>    (Michal)
>>>     - Add additional sanity checks for head / tail pointers
>>>     - Use GUC_CTB_HDR_LEN rather than magic 1
>>>
>>> Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
>>> Signed-off-by: Matthew Brost <matthew.brost@intel.com>
>>> ---
>>>    drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c | 88 +++++++++++++++--------
>>>    drivers/gpu/drm/i915/gt/uc/intel_guc_ct.h |  6 ++
>>>    2 files changed, 65 insertions(+), 29 deletions(-)
>>>
>>> diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c
>>> b/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c
>>> index a9cb7b608520..5b8b4ff609e2 100644
>>> --- a/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c
>>> +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c
>>> @@ -130,6 +130,10 @@ static void guc_ct_buffer_desc_init(struct
>>> guc_ct_buffer_desc *desc)
>>>    static void guc_ct_buffer_reset(struct intel_guc_ct_buffer *ctb)
>>>    {
>>>        ctb->broken = false;
>>> +    ctb->tail = 0;
>>> +    ctb->head = 0;
>>> +    ctb->space = CIRC_SPACE(ctb->tail, ctb->head, ctb->size);
>>> +
>>>        guc_ct_buffer_desc_init(ctb->desc);
>>>    }
>>>    @@ -383,10 +387,8 @@ static int ct_write(struct intel_guc_ct *ct,
>>>    {
>>>        struct intel_guc_ct_buffer *ctb = &ct->ctbs.send;
>>>        struct guc_ct_buffer_desc *desc = ctb->desc;
>>> -    u32 head = desc->head;
>>> -    u32 tail = desc->tail;
>>> +    u32 tail = ctb->tail;
>>>        u32 size = ctb->size;
>>> -    u32 used;
>>>        u32 header;
>>>        u32 hxg;
>>>        u32 *cmds = ctb->cmds;
>>> @@ -395,25 +397,22 @@ static int ct_write(struct intel_guc_ct *ct,
>>>        if (unlikely(desc->status))
>>>            goto corrupted;
>>>    -    if (unlikely((tail | head) >= size)) {
>>> +    GEM_BUG_ON(tail > size);
>>> +
>>> +#ifdef CONFIG_DRM_I915_DEBUG_GUC
>>> +    if (unlikely(tail != READ_ONCE(desc->tail))) {
>>> +        CT_ERROR(ct, "Tail was modified %u != %u\n",
>>> +             desc->tail, ctb->tail);
>>> +        desc->status |= GUC_CTB_STATUS_MISMATCH;
>>> +        goto corrupted;
>>> +    }
>>> +    if (unlikely((desc->tail | desc->head) >= size)) {
>>>            CT_ERROR(ct, "Invalid offsets head=%u tail=%u (size=%u)\n",
>>> -             head, tail, size);
>>> +             desc->head, desc->tail, size);
>>>            desc->status |= GUC_CTB_STATUS_OVERFLOW;
>>>            goto corrupted;
>>>        }
>>> -
>>> -    /*
>>> -     * tail == head condition indicates empty. GuC FW does not support
>>> -     * using up the entire buffer to get tail == head meaning full.
>>> -     */
>>> -    if (tail < head)
>>> -        used = (size - head) + tail;
>>> -    else
>>> -        used = tail - head;
>>> -
>>> -    /* make sure there is a space including extra dw for the fence */
>>> -    if (unlikely(used + len + GUC_CTB_HDR_LEN >= size))
>>> -        return -ENOSPC;
>>> +#endif
>>>          /*
>>>         * dw0: CT header (including fence)
>>> @@ -454,7 +453,9 @@ static int ct_write(struct intel_guc_ct *ct,
>>>        write_barrier(ct);
>>>          /* now update descriptor */
>>> +    ctb->tail = tail;
>>>        WRITE_ONCE(desc->tail, tail);
>>> +    ctb->space -= len + GUC_CTB_HDR_LEN;
>>>          return 0;
>>>    @@ -470,7 +471,7 @@ static int ct_write(struct intel_guc_ct *ct,
>>>     * @req:    pointer to pending request
>>>     * @status:    placeholder for status
>>>     *
>>> - * For each sent request, Guc shall send bac CT response message.
>>> + * For each sent request, GuC shall send back CT response message.
>>>     * Our message handler will update status of tracked request once
>>>     * response message with given fence is received. Wait here and
>>>     * check for valid response status value.
>>> @@ -526,24 +527,35 @@ static inline bool ct_deadlocked(struct
>>> intel_guc_ct *ct)
>>>        return ret;
>>>    }
>>>    -static inline bool h2g_has_room(struct intel_guc_ct_buffer *ctb,
>>> u32 len_dw)
>>> +static inline bool h2g_has_room(struct intel_guc_ct *ct, u32 len_dw)
>>>    {
>>> -    struct guc_ct_buffer_desc *desc = ctb->desc;
>>> -    u32 head = READ_ONCE(desc->head);
>>> +    struct intel_guc_ct_buffer *ctb = &ct->ctbs.send;
>>> +    u32 head;
>>>        u32 space;
>>>    -    space = CIRC_SPACE(desc->tail, head, ctb->size);
>>> +    if (ctb->space >= len_dw)
>>> +        return true;
>>> +
>>> +    head = READ_ONCE(ctb->desc->head);
>>> +    if (unlikely(head > ctb->size)) {
>>> +        CT_ERROR(ct, "Corrupted descriptor head=%u tail=%u size=%u\n",
>>> +             ctb->desc->head, ctb->desc->tail, ctb->size);
>>> +        ctb->desc->status |= GUC_CTB_STATUS_OVERFLOW;
>>> +        ctb->broken = true;
>>> +        return false;
>>> +    }
>>> +
>>> +    space = CIRC_SPACE(ctb->tail, head, ctb->size);
>>> +    ctb->space = space;
>>>          return space >= len_dw;
>>>    }
>>>      static int has_room_nb(struct intel_guc_ct *ct, u32 len_dw)
>>>    {
>>> -    struct intel_guc_ct_buffer *ctb = &ct->ctbs.send;
>>> -
>>>        lockdep_assert_held(&ct->ctbs.send.lock);
>>>    -    if (unlikely(!h2g_has_room(ctb, len_dw))) {
>>> +    if (unlikely(!h2g_has_room(ct, len_dw))) {
>>>            if (ct->stall_time == KTIME_MAX)
>>>                ct->stall_time = ktime_get();
>>>    @@ -613,7 +625,7 @@ static int ct_send(struct intel_guc_ct *ct,
>>>         */
>>>    retry:
>>>        spin_lock_irqsave(&ctb->lock, flags);
>>> -    if (unlikely(!h2g_has_room(ctb, len + GUC_CTB_HDR_LEN))) {
>>> +    if (unlikely(!h2g_has_room(ct, len + GUC_CTB_HDR_LEN))) {
>>>            if (ct->stall_time == KTIME_MAX)
>>>                ct->stall_time = ktime_get();
>>>            spin_unlock_irqrestore(&ctb->lock, flags);
>>> @@ -733,7 +745,7 @@ static int ct_read(struct intel_guc_ct *ct, struct
>>> ct_incoming_msg **msg)
>>>    {
>>>        struct intel_guc_ct_buffer *ctb = &ct->ctbs.recv;
>>>        struct guc_ct_buffer_desc *desc = ctb->desc;
>>> -    u32 head = desc->head;
>>> +    u32 head = ctb->head;
>>>        u32 tail = desc->tail;
>>>        u32 size = ctb->size;
>>>        u32 *cmds = ctb->cmds;
>>> @@ -748,12 +760,29 @@ static int ct_read(struct intel_guc_ct *ct,
>>> struct ct_incoming_msg **msg)
>>>        if (unlikely(desc->status))
>>>            goto corrupted;
>>>    -    if (unlikely((tail | head) >= size)) {
>>> +    GEM_BUG_ON(head > size);
>> Is the BUG_ON necessary, given that both options below do the same check
>> but as a corrupted-buffer test (with subsequent recovery by GT reset?)
>> rather than killing the driver?
> "head" and "size" are now fully owned by the driver.
> The BUG_ON here is to make sure the driver is coded correctly.
The point is that both sides of the #if below also validate head. So
first there is a BUG_ON, and then there is the same test again but
without blowing the driver apart. One of the two is redundant. My vote
would be to keep the recoverable test rather than the BUG_ON.

John.
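
To make the two options concrete, here is a minimal userspace sketch of
the recoverable variant, not the driver code. The struct ring type and
ring_check_offsets() are hypothetical names, and the broken flag only
loosely mirrors ctb->broken from the patch. A failed check marks the ring
broken and returns an error so that the caller can trigger recovery,
instead of asserting:

```c
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical stand-in for the driver's per-buffer state. */
struct ring {
	uint32_t size;   /* buffer size in dwords */
	uint32_t head;   /* consumer-owned shadow copy of head */
	bool broken;     /* set once the offsets are found to be invalid */
};

/*
 * Validate both offsets before consuming anything.
 *
 * remote_tail is the value read from the shared descriptor and is written
 * by the other side, so it cannot be trusted. r->head is locally owned,
 * but checking it here as well makes a separate assertion unnecessary.
 *
 * Returns 0 when both offsets are in range. Otherwise the ring is marked
 * broken and -1 is returned so that the caller can start recovery.
 */
static int ring_check_offsets(struct ring *r, uint32_t remote_tail)
{
	if (remote_tail >= r->size || r->head >= r->size) {
		fprintf(stderr, "Invalid offsets head=%u tail=%u (size=%u)\n",
			(unsigned int)r->head, (unsigned int)remote_tail,
			(unsigned int)r->size);
		r->broken = true;
		return -1;
	}

	return 0;
}
```

With size = 16 and head = 4, a tail of 8 passes, while a tail of 16 marks
the ring broken and returns -1. A BUG_ON-style assertion placed in front
of this check would not catch anything that the check misses.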

>
>>> +
>>> +#ifdef CONFIG_DRM_I915_DEBUG_GUC
>>> +    if (unlikely(head != READ_ONCE(desc->head))) {
>>> +        CT_ERROR(ct, "Head was modified %u != %u\n",
>>> +             desc->head, ctb->head);
>>> +        desc->status |= GUC_CTB_STATUS_MISMATCH;
>>> +        goto corrupted;
>>> +    }
>>> +    if (unlikely((desc->tail | desc->head) >= size)) {
>>> +        CT_ERROR(ct, "Invalid offsets head=%u tail=%u (size=%u)\n",
>>> +             head, tail, size);
>>> +        desc->status |= GUC_CTB_STATUS_OVERFLOW;
>>> +        goto corrupted;
>>> +    }
>>> +#else
>>> +    if (unlikely((tail | ctb->head) >= size)) {
>> Could just be 'head' rather than 'ctb->head'.
> Or drop "ctb->head" completely, since this is a driver-owned field and
> you already have a BUG_ON above to test it.
>
> Michal
>
>> John.
>>
>>>            CT_ERROR(ct, "Invalid offsets head=%u tail=%u (size=%u)\n",
>>>                 head, tail, size);
>>>            desc->status |= GUC_CTB_STATUS_OVERFLOW;
>>>            goto corrupted;
>>>        }
>>> +#endif
>>>          /* tail == head condition indicates empty */
>>>        available = tail - head;
>>> @@ -803,6 +832,7 @@ static int ct_read(struct intel_guc_ct *ct, struct
>>> ct_incoming_msg **msg)
>>>        }
>>>        CT_DEBUG(ct, "received %*ph\n", 4 * len, (*msg)->msg);
>>>    +    ctb->head = head;
>>>        /* now update descriptor */
>>>        WRITE_ONCE(desc->head, head);
>>>    diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.h
>>> b/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.h
>>> index bee03794c1eb..edd1bba0445d 100644
>>> --- a/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.h
>>> +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.h
>>> @@ -33,6 +33,9 @@ struct intel_guc;
>>>     * @desc: pointer to the buffer descriptor
>>>     * @cmds: pointer to the commands buffer
>>>     * @size: size of the commands buffer in dwords
>>> + * @head: local shadow copy of head in dwords
>>> + * @tail: local shadow copy of tail in dwords
>>> + * @space: local shadow copy of space in dwords
>>>     * @broken: flag to indicate if descriptor data is broken
>>>     */
>>>    struct intel_guc_ct_buffer {
>>> @@ -40,6 +43,9 @@ struct intel_guc_ct_buffer {
>>>        struct guc_ct_buffer_desc *desc;
>>>        u32 *cmds;
>>>        u32 size;
>>> +    u32 tail;
>>> +    u32 head;
>>> +    u32 space;
>>>        bool broken;
>>>    };
>>>    
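
For illustration, the shadow-copy scheme described in the commit message
can be sketched in plain C roughly as follows. This is a simplified
userspace model under stated assumptions, not the i915 code: struct
shared_desc, struct producer, producer_has_room() and producer_commit()
are hypothetical names, ring_space() reimplements the arithmetic of the
kernel's CIRC_SPACE() for a power-of-two size, and the slow_reads counter
exists only to show how often the shared descriptor is read.

```c
#include <stdbool.h>
#include <stdint.h>

/* Free entries in a power-of-two ring; one entry always stays unused. */
static uint32_t ring_space(uint32_t tail, uint32_t head, uint32_t size)
{
	return (head - tail - 1) & (size - 1);
}

/* Hypothetical descriptor in shared memory; reading it may be slow. */
struct shared_desc {
	volatile uint32_t head;  /* advanced by the consumer */
	volatile uint32_t tail;  /* advanced by the producer */
};

/* Hypothetical producer-side state kept in ordinary local memory. */
struct producer {
	struct shared_desc *desc;
	uint32_t size;           /* in dwords, must be a power of two */
	uint32_t tail;           /* shadow copy of desc->tail */
	uint32_t space;          /* cached lower bound on the free dwords */
	unsigned int slow_reads; /* descriptor reads, for illustration only */
};

static void producer_init(struct producer *p, struct shared_desc *d,
			  uint32_t size)
{
	d->head = 0;
	d->tail = 0;
	p->desc = d;
	p->size = size;
	p->tail = 0;
	p->space = ring_space(0, 0, size);
	p->slow_reads = 0;
}

/* Fast path uses the cached space; only a miss reads the shared head. */
static bool producer_has_room(struct producer *p, uint32_t len)
{
	uint32_t head;

	if (p->space >= len)
		return true;

	head = p->desc->head;
	p->slow_reads++;
	if (head >= p->size)
		return false;	/* corrupted descriptor, caller must recover */

	p->space = ring_space(p->tail, head, p->size);
	return p->space >= len;
}

/*
 * Publish len dwords. The caller must have checked producer_has_room()
 * first. A real driver also needs a write barrier before updating tail.
 */
static void producer_commit(struct producer *p, uint32_t len)
{
	p->tail = (p->tail + len) & (p->size - 1);
	p->desc->tail = p->tail;
	p->space -= len;
}
```

In this model a burst of writes touches the shared head only when the
cached space runs out, which is the effect the patch is after. The cached
value can only underestimate the free space, because the consumer only
ever frees entries.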


^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH 6/7] drm/i915/guc: Optimize CTB writes and reads
  2021-07-06 19:19         ` [Intel-gfx] " John Harrison
@ 2021-07-06 19:33           ` Michal Wajdeczko
  -1 siblings, 0 replies; 38+ messages in thread
From: Michal Wajdeczko @ 2021-07-06 19:33 UTC (permalink / raw)
  To: John Harrison, Matthew Brost, intel-gfx, dri-devel



On 06.07.2021 21:19, John Harrison wrote:
> On 7/6/2021 12:12, Michal Wajdeczko wrote:
>> On 06.07.2021 21:00, John Harrison wrote:
>>> On 7/1/2021 10:15, Matthew Brost wrote:
>>>> CTB writes are now in the path of command submission and should be
>>>> optimized for performance. Rather than reading CTB descriptor values
>>>> (e.g. head, tail) which could result in accesses across the PCIe bus,
>>>> store shadow local copies and only read/write the descriptor values when
>>>> absolutely necessary. Also store the current space in each channel
>>>> locally.
>>>>
>>>> v2:
>>>>    (Michal)
>>>>     - Add additional sanity checks for head / tail pointers
>>>>     - Use GUC_CTB_HDR_LEN rather than magic 1
>>>>
>>>> Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
>>>> Signed-off-by: Matthew Brost <matthew.brost@intel.com>
>>>> ---
>>>>    drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c | 88 +++++++++++++++--------
>>>>    drivers/gpu/drm/i915/gt/uc/intel_guc_ct.h |  6 ++
>>>>    2 files changed, 65 insertions(+), 29 deletions(-)
>>>>
>>>> diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c
>>>> b/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c
>>>> index a9cb7b608520..5b8b4ff609e2 100644
>>>> --- a/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c
>>>> +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c
>>>> @@ -130,6 +130,10 @@ static void guc_ct_buffer_desc_init(struct
>>>> guc_ct_buffer_desc *desc)
>>>>    static void guc_ct_buffer_reset(struct intel_guc_ct_buffer *ctb)
>>>>    {
>>>>        ctb->broken = false;
>>>> +    ctb->tail = 0;
>>>> +    ctb->head = 0;
>>>> +    ctb->space = CIRC_SPACE(ctb->tail, ctb->head, ctb->size);
>>>> +
>>>>        guc_ct_buffer_desc_init(ctb->desc);
>>>>    }
>>>>    @@ -383,10 +387,8 @@ static int ct_write(struct intel_guc_ct *ct,
>>>>    {
>>>>        struct intel_guc_ct_buffer *ctb = &ct->ctbs.send;
>>>>        struct guc_ct_buffer_desc *desc = ctb->desc;
>>>> -    u32 head = desc->head;
>>>> -    u32 tail = desc->tail;
>>>> +    u32 tail = ctb->tail;
>>>>        u32 size = ctb->size;
>>>> -    u32 used;
>>>>        u32 header;
>>>>        u32 hxg;
>>>>        u32 *cmds = ctb->cmds;
>>>> @@ -395,25 +397,22 @@ static int ct_write(struct intel_guc_ct *ct,
>>>>        if (unlikely(desc->status))
>>>>            goto corrupted;
>>>>    -    if (unlikely((tail | head) >= size)) {
>>>> +    GEM_BUG_ON(tail > size);
>>>> +
>>>> +#ifdef CONFIG_DRM_I915_DEBUG_GUC
>>>> +    if (unlikely(tail != READ_ONCE(desc->tail))) {
>>>> +        CT_ERROR(ct, "Tail was modified %u != %u\n",
>>>> +             desc->tail, ctb->tail);
>>>> +        desc->status |= GUC_CTB_STATUS_MISMATCH;
>>>> +        goto corrupted;
>>>> +    }
>>>> +    if (unlikely((desc->tail | desc->head) >= size)) {
>>>>            CT_ERROR(ct, "Invalid offsets head=%u tail=%u (size=%u)\n",
>>>> -             head, tail, size);
>>>> +             desc->head, desc->tail, size);
>>>>            desc->status |= GUC_CTB_STATUS_OVERFLOW;
>>>>            goto corrupted;
>>>>        }
>>>> -
>>>> -    /*
>>>> -     * tail == head condition indicates empty. GuC FW does not support
>>>> -     * using up the entire buffer to get tail == head meaning full.
>>>> -     */
>>>> -    if (tail < head)
>>>> -        used = (size - head) + tail;
>>>> -    else
>>>> -        used = tail - head;
>>>> -
>>>> -    /* make sure there is a space including extra dw for the fence */
>>>> -    if (unlikely(used + len + GUC_CTB_HDR_LEN >= size))
>>>> -        return -ENOSPC;
>>>> +#endif
>>>>          /*
>>>>         * dw0: CT header (including fence)
>>>> @@ -454,7 +453,9 @@ static int ct_write(struct intel_guc_ct *ct,
>>>>        write_barrier(ct);
>>>>          /* now update descriptor */
>>>> +    ctb->tail = tail;
>>>>        WRITE_ONCE(desc->tail, tail);
>>>> +    ctb->space -= len + GUC_CTB_HDR_LEN;
>>>>          return 0;
>>>>    @@ -470,7 +471,7 @@ static int ct_write(struct intel_guc_ct *ct,
>>>>     * @req:    pointer to pending request
>>>>     * @status:    placeholder for status
>>>>     *
>>>> - * For each sent request, Guc shall send bac CT response message.
>>>> + * For each sent request, GuC shall send back CT response message.
>>>>     * Our message handler will update status of tracked request once
>>>>     * response message with given fence is received. Wait here and
>>>>     * check for valid response status value.
>>>> @@ -526,24 +527,35 @@ static inline bool ct_deadlocked(struct
>>>> intel_guc_ct *ct)
>>>>        return ret;
>>>>    }
>>>>    -static inline bool h2g_has_room(struct intel_guc_ct_buffer *ctb,
>>>> u32 len_dw)
>>>> +static inline bool h2g_has_room(struct intel_guc_ct *ct, u32 len_dw)
>>>>    {
>>>> -    struct guc_ct_buffer_desc *desc = ctb->desc;
>>>> -    u32 head = READ_ONCE(desc->head);
>>>> +    struct intel_guc_ct_buffer *ctb = &ct->ctbs.send;
>>>> +    u32 head;
>>>>        u32 space;
>>>>    -    space = CIRC_SPACE(desc->tail, head, ctb->size);
>>>> +    if (ctb->space >= len_dw)
>>>> +        return true;
>>>> +
>>>> +    head = READ_ONCE(ctb->desc->head);
>>>> +    if (unlikely(head > ctb->size)) {
>>>> +        CT_ERROR(ct, "Corrupted descriptor head=%u tail=%u size=%u\n",
>>>> +             ctb->desc->head, ctb->desc->tail, ctb->size);
>>>> +        ctb->desc->status |= GUC_CTB_STATUS_OVERFLOW;
>>>> +        ctb->broken = true;
>>>> +        return false;
>>>> +    }
>>>> +
>>>> +    space = CIRC_SPACE(ctb->tail, head, ctb->size);
>>>> +    ctb->space = space;
>>>>          return space >= len_dw;
>>>>    }
>>>>      static int has_room_nb(struct intel_guc_ct *ct, u32 len_dw)
>>>>    {
>>>> -    struct intel_guc_ct_buffer *ctb = &ct->ctbs.send;
>>>> -
>>>>        lockdep_assert_held(&ct->ctbs.send.lock);
>>>>    -    if (unlikely(!h2g_has_room(ctb, len_dw))) {
>>>> +    if (unlikely(!h2g_has_room(ct, len_dw))) {
>>>>            if (ct->stall_time == KTIME_MAX)
>>>>                ct->stall_time = ktime_get();
>>>>    @@ -613,7 +625,7 @@ static int ct_send(struct intel_guc_ct *ct,
>>>>         */
>>>>    retry:
>>>>        spin_lock_irqsave(&ctb->lock, flags);
>>>> -    if (unlikely(!h2g_has_room(ctb, len + GUC_CTB_HDR_LEN))) {
>>>> +    if (unlikely(!h2g_has_room(ct, len + GUC_CTB_HDR_LEN))) {
>>>>            if (ct->stall_time == KTIME_MAX)
>>>>                ct->stall_time = ktime_get();
>>>>            spin_unlock_irqrestore(&ctb->lock, flags);
>>>> @@ -733,7 +745,7 @@ static int ct_read(struct intel_guc_ct *ct, struct
>>>> ct_incoming_msg **msg)
>>>>    {
>>>>        struct intel_guc_ct_buffer *ctb = &ct->ctbs.recv;
>>>>        struct guc_ct_buffer_desc *desc = ctb->desc;
>>>> -    u32 head = desc->head;
>>>> +    u32 head = ctb->head;
>>>>        u32 tail = desc->tail;
>>>>        u32 size = ctb->size;
>>>>        u32 *cmds = ctb->cmds;
>>>> @@ -748,12 +760,29 @@ static int ct_read(struct intel_guc_ct *ct,
>>>> struct ct_incoming_msg **msg)
>>>>        if (unlikely(desc->status))
>>>>            goto corrupted;
>>>>    -    if (unlikely((tail | head) >= size)) {
>>>> +    GEM_BUG_ON(head > size);
>>> Is the BUG_ON necessary given that both options below do the same check
>>> but as a corrupted buffer test (with subsequent recovery by GT reset?)
>>> rather than killing the driver.
>> "head" and "size" are now fully owned by the driver.
>> BUGON here is to make sure driver is coded correctly.
> The point is that both sides of the #if below also validate head. So

But not the same "head".

Under DEBUG we validate the one from the descriptor (together with
tail), and that should be recoverable: if the check fails, it was
clearly not our fault.

But under non-DEBUG we were validating the local one again, pretending
that this is recoverable. It is not, as a failure there is our fault
(elsewhere in i915 we don't attempt to recover from obvious coding errors).
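
To make the distinction concrete, here is a minimal standalone sketch. It
does not use the real i915 structures: demo_desc, demo_ctb,
demo_validate_read() and the DEMO_STATUS_* values are hypothetical
stand-ins, and assert() stands in for GEM_BUG_ON. The driver-owned shadow
head is asserted on, while values read back from the shared descriptor
take the recoverable "corrupted" path.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical status bits, not the real GUC_CTB_STATUS_* values. */
#define DEMO_STATUS_OVERFLOW 0x1u
#define DEMO_STATUS_MISMATCH 0x2u

/* Shared with the firmware, so its contents are not trusted. */
struct demo_desc {
	uint32_t head;
	uint32_t tail;
	uint32_t status;
};

/* Driver-side bookkeeping for the receive buffer. */
struct demo_ctb {
	struct demo_desc *desc;
	uint32_t size;	/* in dwords, driver owned */
	uint32_t head;	/* shadow copy of head, driver owned */
	bool broken;
};

/* Returns true when the receive-side offsets are usable. */
static bool demo_validate_read(struct demo_ctb *ctb)
{
	struct demo_desc *desc = ctb->desc;

	/* Driver-owned value: a failure here is a coding error. */
	assert(ctb->head <= ctb->size);

	/* Descriptor disagrees with our shadow copy: recoverable. */
	if (ctb->head != desc->head) {
		desc->status |= DEMO_STATUS_MISMATCH;
		ctb->broken = true;
		return false;
	}

	/* Descriptor offsets out of range: also recoverable. */
	if (desc->tail >= ctb->size || desc->head >= ctb->size) {
		desc->status |= DEMO_STATUS_OVERFLOW;
		ctb->broken = true;
		return false;
	}

	return true;
}
```

The sketch spells out the two range comparisons instead of using the
(tail | head) >= size form from the patch, so it does not depend on the
buffer size being a power of two.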

> first there is a BUG_ON, then there is the same test but without blowing
> the driver apart. One or the other is not required. My vote would be to
> keep the recoverable test rather than the BUG_ON.

IMHO we should keep the GEM_BUG_ON and drop the redundant check in non-DEBUG.

But let Matt decide.

Michal

> 
> John.
> 
>>
>>>> +
>>>> +#ifdef CONFIG_DRM_I915_DEBUG_GUC
>>>> +    if (unlikely(head != READ_ONCE(desc->head))) {
>>>> +        CT_ERROR(ct, "Head was modified %u != %u\n",
>>>> +             desc->head, ctb->head);
>>>> +        desc->status |= GUC_CTB_STATUS_MISMATCH;
>>>> +        goto corrupted;
>>>> +    }
>>>> +    if (unlikely((desc->tail | desc->head) >= size)) {
>>>> +        CT_ERROR(ct, "Invalid offsets head=%u tail=%u (size=%u)\n",
>>>> +             head, tail, size);
>>>> +        desc->status |= GUC_CTB_STATUS_OVERFLOW;
>>>> +        goto corrupted;
>>>> +    }
>>>> +#else
>>>> +    if (unlikely((tail | ctb->head) >= size)) {
>>> Could just be 'head' rather than 'ctb->head'.
>> or drop "ctb->head" completely since this is driver owned field and
>> above you already have BUGON to test it
>>
>> Michal
>>
>>> John.
>>>
>>>>            CT_ERROR(ct, "Invalid offsets head=%u tail=%u (size=%u)\n",
>>>>                 head, tail, size);
>>>>            desc->status |= GUC_CTB_STATUS_OVERFLOW;
>>>>            goto corrupted;
>>>>        }
>>>> +#endif
>>>>          /* tail == head condition indicates empty */
>>>>        available = tail - head;
>>>> @@ -803,6 +832,7 @@ static int ct_read(struct intel_guc_ct *ct, struct
>>>> ct_incoming_msg **msg)
>>>>        }
>>>>        CT_DEBUG(ct, "received %*ph\n", 4 * len, (*msg)->msg);
>>>>    +    ctb->head = head;
>>>>        /* now update descriptor */
>>>>        WRITE_ONCE(desc->head, head);
>>>>    diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.h
>>>> b/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.h
>>>> index bee03794c1eb..edd1bba0445d 100644
>>>> --- a/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.h
>>>> +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.h
>>>> @@ -33,6 +33,9 @@ struct intel_guc;
>>>>     * @desc: pointer to the buffer descriptor
>>>>     * @cmds: pointer to the commands buffer
>>>>     * @size: size of the commands buffer in dwords
>>>> + * @head: local shadow copy of head in dwords
>>>> + * @tail: local shadow copy of tail in dwords
>>>> + * @space: local shadow copy of space in dwords
>>>>     * @broken: flag to indicate if descriptor data is broken
>>>>     */
>>>>    struct intel_guc_ct_buffer {
>>>> @@ -40,6 +43,9 @@ struct intel_guc_ct_buffer {
>>>>        struct guc_ct_buffer_desc *desc;
>>>>        u32 *cmds;
>>>>        u32 size;
>>>> +    u32 tail;
>>>> +    u32 head;
>>>> +    u32 space;
>>>>        bool broken;
>>>>    };
>>>>    
> 

* Re: [Intel-gfx] [PATCH 6/7] drm/i915/guc: Optimize CTB writes and reads
@ 2021-07-06 19:33           ` Michal Wajdeczko
  0 siblings, 0 replies; 38+ messages in thread
From: Michal Wajdeczko @ 2021-07-06 19:33 UTC (permalink / raw)
  To: John Harrison, Matthew Brost, intel-gfx, dri-devel



On 06.07.2021 21:19, John Harrison wrote:
> On 7/6/2021 12:12, Michal Wajdeczko wrote:
>> On 06.07.2021 21:00, John Harrison wrote:
>>> On 7/1/2021 10:15, Matthew Brost wrote:
>>>> CTB writes are now in the path of command submission and should be
>>>> optimized for performance. Rather than reading CTB descriptor values
>>>> (e.g. head, tail) which could result in accesses across the PCIe bus,
>>>> store shadow local copies and only read/write the descriptor values
>>>> when
>>>> absolutely necessary. Also store the current space in each channel
>>>> locally.
>>>>
>>>> v2:
>>>>    (Michal)
>>>>     - Add additional sanity checks for head / tail pointers
>>>>     - Use GUC_CTB_HDR_LEN rather than magic 1
>>>>
>>>> Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
>>>> Signed-off-by: Matthew Brost <matthew.brost@intel.com>
>>>> ---
>>>>    drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c | 88
>>>> +++++++++++++++--------
>>>>    drivers/gpu/drm/i915/gt/uc/intel_guc_ct.h |  6 ++
>>>>    2 files changed, 65 insertions(+), 29 deletions(-)
>>>>
>>>> diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c
>>>> b/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c
>>>> index a9cb7b608520..5b8b4ff609e2 100644
>>>> --- a/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c
>>>> +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c
>>>> @@ -130,6 +130,10 @@ static void guc_ct_buffer_desc_init(struct
>>>> guc_ct_buffer_desc *desc)
>>>>    static void guc_ct_buffer_reset(struct intel_guc_ct_buffer *ctb)
>>>>    {
>>>>        ctb->broken = false;
>>>> +    ctb->tail = 0;
>>>> +    ctb->head = 0;
>>>> +    ctb->space = CIRC_SPACE(ctb->tail, ctb->head, ctb->size);
>>>> +
>>>>        guc_ct_buffer_desc_init(ctb->desc);
>>>>    }
>>>>    @@ -383,10 +387,8 @@ static int ct_write(struct intel_guc_ct *ct,
>>>>    {
>>>>        struct intel_guc_ct_buffer *ctb = &ct->ctbs.send;
>>>>        struct guc_ct_buffer_desc *desc = ctb->desc;
>>>> -    u32 head = desc->head;
>>>> -    u32 tail = desc->tail;
>>>> +    u32 tail = ctb->tail;
>>>>        u32 size = ctb->size;
>>>> -    u32 used;
>>>>        u32 header;
>>>>        u32 hxg;
>>>>        u32 *cmds = ctb->cmds;
>>>> @@ -395,25 +397,22 @@ static int ct_write(struct intel_guc_ct *ct,
>>>>        if (unlikely(desc->status))
>>>>            goto corrupted;
>>>>    -    if (unlikely((tail | head) >= size)) {
>>>> +    GEM_BUG_ON(tail > size);
>>>> +
>>>> +#ifdef CONFIG_DRM_I915_DEBUG_GUC
>>>> +    if (unlikely(tail != READ_ONCE(desc->tail))) {
>>>> +        CT_ERROR(ct, "Tail was modified %u != %u\n",
>>>> +             desc->tail, ctb->tail);
>>>> +        desc->status |= GUC_CTB_STATUS_MISMATCH;
>>>> +        goto corrupted;
>>>> +    }
>>>> +    if (unlikely((desc->tail | desc->head) >= size)) {
>>>>            CT_ERROR(ct, "Invalid offsets head=%u tail=%u (size=%u)\n",
>>>> -             head, tail, size);
>>>> +             desc->head, desc->tail, size);
>>>>            desc->status |= GUC_CTB_STATUS_OVERFLOW;
>>>>            goto corrupted;
>>>>        }
>>>> -
>>>> -    /*
>>>> -     * tail == head condition indicates empty. GuC FW does not support
>>>> -     * using up the entire buffer to get tail == head meaning full.
>>>> -     */
>>>> -    if (tail < head)
>>>> -        used = (size - head) + tail;
>>>> -    else
>>>> -        used = tail - head;
>>>> -
>>>> -    /* make sure there is a space including extra dw for the fence */
>>>> -    if (unlikely(used + len + GUC_CTB_HDR_LEN >= size))
>>>> -        return -ENOSPC;
>>>> +#endif
>>>>          /*
>>>>         * dw0: CT header (including fence)
>>>> @@ -454,7 +453,9 @@ static int ct_write(struct intel_guc_ct *ct,
>>>>        write_barrier(ct);
>>>>          /* now update descriptor */
>>>> +    ctb->tail = tail;
>>>>        WRITE_ONCE(desc->tail, tail);
>>>> +    ctb->space -= len + GUC_CTB_HDR_LEN;
>>>>          return 0;
>>>>    @@ -470,7 +471,7 @@ static int ct_write(struct intel_guc_ct *ct,
>>>>     * @req:    pointer to pending request
>>>>     * @status:    placeholder for status
>>>>     *
>>>> - * For each sent request, Guc shall send bac CT response message.
>>>> + * For each sent request, GuC shall send back CT response message.
>>>>     * Our message handler will update status of tracked request once
>>>>     * response message with given fence is received. Wait here and
>>>>     * check for valid response status value.
>>>> @@ -526,24 +527,35 @@ static inline bool ct_deadlocked(struct
>>>> intel_guc_ct *ct)
>>>>        return ret;
>>>>    }
>>>>    -static inline bool h2g_has_room(struct intel_guc_ct_buffer *ctb,
>>>> u32 len_dw)
>>>> +static inline bool h2g_has_room(struct intel_guc_ct *ct, u32 len_dw)
>>>>    {
>>>> -    struct guc_ct_buffer_desc *desc = ctb->desc;
>>>> -    u32 head = READ_ONCE(desc->head);
>>>> +    struct intel_guc_ct_buffer *ctb = &ct->ctbs.send;
>>>> +    u32 head;
>>>>        u32 space;
>>>>    -    space = CIRC_SPACE(desc->tail, head, ctb->size);
>>>> +    if (ctb->space >= len_dw)
>>>> +        return true;
>>>> +
>>>> +    head = READ_ONCE(ctb->desc->head);
>>>> +    if (unlikely(head > ctb->size)) {
>>>> +        CT_ERROR(ct, "Corrupted descriptor head=%u tail=%u size=%u\n",
>>>> +             ctb->desc->head, ctb->desc->tail, ctb->size);
>>>> +        ctb->desc->status |= GUC_CTB_STATUS_OVERFLOW;
>>>> +        ctb->broken = true;
>>>> +        return false;
>>>> +    }
>>>> +
>>>> +    space = CIRC_SPACE(ctb->tail, head, ctb->size);
>>>> +    ctb->space = space;
>>>>          return space >= len_dw;
>>>>    }
>>>>      static int has_room_nb(struct intel_guc_ct *ct, u32 len_dw)
>>>>    {
>>>> -    struct intel_guc_ct_buffer *ctb = &ct->ctbs.send;
>>>> -
>>>>        lockdep_assert_held(&ct->ctbs.send.lock);
>>>>    -    if (unlikely(!h2g_has_room(ctb, len_dw))) {
>>>> +    if (unlikely(!h2g_has_room(ct, len_dw))) {
>>>>            if (ct->stall_time == KTIME_MAX)
>>>>                ct->stall_time = ktime_get();
>>>>    @@ -613,7 +625,7 @@ static int ct_send(struct intel_guc_ct *ct,
>>>>         */
>>>>    retry:
>>>>        spin_lock_irqsave(&ctb->lock, flags);
>>>> -    if (unlikely(!h2g_has_room(ctb, len + GUC_CTB_HDR_LEN))) {
>>>> +    if (unlikely(!h2g_has_room(ct, len + GUC_CTB_HDR_LEN))) {
>>>>            if (ct->stall_time == KTIME_MAX)
>>>>                ct->stall_time = ktime_get();
>>>>            spin_unlock_irqrestore(&ctb->lock, flags);
>>>> @@ -733,7 +745,7 @@ static int ct_read(struct intel_guc_ct *ct, struct
>>>> ct_incoming_msg **msg)
>>>>    {
>>>>        struct intel_guc_ct_buffer *ctb = &ct->ctbs.recv;
>>>>        struct guc_ct_buffer_desc *desc = ctb->desc;
>>>> -    u32 head = desc->head;
>>>> +    u32 head = ctb->head;
>>>>        u32 tail = desc->tail;
>>>>        u32 size = ctb->size;
>>>>        u32 *cmds = ctb->cmds;
>>>> @@ -748,12 +760,29 @@ static int ct_read(struct intel_guc_ct *ct,
>>>> struct ct_incoming_msg **msg)
>>>>        if (unlikely(desc->status))
>>>>            goto corrupted;
>>>>    -    if (unlikely((tail | head) >= size)) {
>>>> +    GEM_BUG_ON(head > size);
>>> Is the BUG_ON necessary given that both options below do the same check
>>> but as a corrupted buffer test (with subsequent recovery by GT reset?)
>>> rather than killing the driver.
>> "head" and "size" are now fully owned by the driver.
>> BUGON here is to make sure driver is coded correctly.
> The point is that both sides of the #if below also validate head. So

But not the same "head".

Under DEBUG we validate the one from the descriptor (together with
tail), and that should be recoverable: if the check fails, it was
clearly not our fault.

But under non-DEBUG we were validating the local one again, pretending
that this is recoverable. It is not, as a failure there is our fault
(elsewhere in i915 we don't attempt to recover from obvious coding errors).

> first there is a BUG_ON, then there is the same test but without blowing
> the driver apart. One or the other is not required. My vote would be to
> keep the recoverable test rather than the BUG_ON.

IMHO we should keep the GEM_BUG_ON and drop the redundant check in non-DEBUG.

But let Matt decide.

Michal

> 
> John.
> 
>>
>>>> +
>>>> +#ifdef CONFIG_DRM_I915_DEBUG_GUC
>>>> +    if (unlikely(head != READ_ONCE(desc->head))) {
>>>> +        CT_ERROR(ct, "Head was modified %u != %u\n",
>>>> +             desc->head, ctb->head);
>>>> +        desc->status |= GUC_CTB_STATUS_MISMATCH;
>>>> +        goto corrupted;
>>>> +    }
>>>> +    if (unlikely((desc->tail | desc->head) >= size)) {
>>>> +        CT_ERROR(ct, "Invalid offsets head=%u tail=%u (size=%u)\n",
>>>> +             head, tail, size);
>>>> +        desc->status |= GUC_CTB_STATUS_OVERFLOW;
>>>> +        goto corrupted;
>>>> +    }
>>>> +#else
>>>> +    if (unlikely((tail | ctb->head) >= size)) {
>>> Could just be 'head' rather than 'ctb->head'.
>> or drop "ctb->head" completely since this is driver owned field and
>> above you already have BUGON to test it
>>
>> Michal
>>
>>> John.
>>>
>>>>            CT_ERROR(ct, "Invalid offsets head=%u tail=%u (size=%u)\n",
>>>>                 head, tail, size);
>>>>            desc->status |= GUC_CTB_STATUS_OVERFLOW;
>>>>            goto corrupted;
>>>>        }
>>>> +#endif
>>>>          /* tail == head condition indicates empty */
>>>>        available = tail - head;
>>>> @@ -803,6 +832,7 @@ static int ct_read(struct intel_guc_ct *ct, struct
>>>> ct_incoming_msg **msg)
>>>>        }
>>>>        CT_DEBUG(ct, "received %*ph\n", 4 * len, (*msg)->msg);
>>>>    +    ctb->head = head;
>>>>        /* now update descriptor */
>>>>        WRITE_ONCE(desc->head, head);
>>>>    diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.h
>>>> b/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.h
>>>> index bee03794c1eb..edd1bba0445d 100644
>>>> --- a/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.h
>>>> +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.h
>>>> @@ -33,6 +33,9 @@ struct intel_guc;
>>>>     * @desc: pointer to the buffer descriptor
>>>>     * @cmds: pointer to the commands buffer
>>>>     * @size: size of the commands buffer in dwords
>>>> + * @head: local shadow copy of head in dwords
>>>> + * @tail: local shadow copy of tail in dwords
>>>> + * @space: local shadow copy of space in dwords
>>>>     * @broken: flag to indicate if descriptor data is broken
>>>>     */
>>>>    struct intel_guc_ct_buffer {
>>>> @@ -40,6 +43,9 @@ struct intel_guc_ct_buffer {
>>>>        struct guc_ct_buffer_desc *desc;
>>>>        u32 *cmds;
>>>>        u32 size;
>>>> +    u32 tail;
>>>> +    u32 head;
>>>> +    u32 space;
>>>>        bool broken;
>>>>    };
>>>>    
> 

* Re: [PATCH 6/7] drm/i915/guc: Optimize CTB writes and reads
  2021-07-06 19:33           ` [Intel-gfx] " Michal Wajdeczko
@ 2021-07-06 21:22             ` Matthew Brost
  -1 siblings, 0 replies; 38+ messages in thread
From: Matthew Brost @ 2021-07-06 21:22 UTC (permalink / raw)
  To: Michal Wajdeczko; +Cc: intel-gfx, dri-devel, John Harrison

On Tue, Jul 06, 2021 at 09:33:23PM +0200, Michal Wajdeczko wrote:
> 
> 
> On 06.07.2021 21:19, John Harrison wrote:
> > On 7/6/2021 12:12, Michal Wajdeczko wrote:
> >> On 06.07.2021 21:00, John Harrison wrote:
> >>> On 7/1/2021 10:15, Matthew Brost wrote:
> >>>> CTB writes are now in the path of command submission and should be
> >>>> optimized for performance. Rather than reading CTB descriptor values
> >>>> (e.g. head, tail) which could result in accesses across the PCIe bus,
> >>>> store shadow local copies and only read/write the descriptor values
> >>>> when
> >>>> absolutely necessary. Also store the current space in each channel
> >>>> locally.
> >>>>
> >>>> v2:
> >>>>    (Michal)
> >>>>     - Add additional sanity checks for head / tail pointers
> >>>>     - Use GUC_CTB_HDR_LEN rather than magic 1
> >>>>
> >>>> Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
> >>>> Signed-off-by: Matthew Brost <matthew.brost@intel.com>
> >>>> ---
> >>>>    drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c | 88
> >>>> +++++++++++++++--------
> >>>>    drivers/gpu/drm/i915/gt/uc/intel_guc_ct.h |  6 ++
> >>>>    2 files changed, 65 insertions(+), 29 deletions(-)
> >>>>
> >>>> diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c
> >>>> b/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c
> >>>> index a9cb7b608520..5b8b4ff609e2 100644
> >>>> --- a/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c
> >>>> +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c
> >>>> @@ -130,6 +130,10 @@ static void guc_ct_buffer_desc_init(struct
> >>>> guc_ct_buffer_desc *desc)
> >>>>    static void guc_ct_buffer_reset(struct intel_guc_ct_buffer *ctb)
> >>>>    {
> >>>>        ctb->broken = false;
> >>>> +    ctb->tail = 0;
> >>>> +    ctb->head = 0;
> >>>> +    ctb->space = CIRC_SPACE(ctb->tail, ctb->head, ctb->size);
> >>>> +
> >>>>        guc_ct_buffer_desc_init(ctb->desc);
> >>>>    }
> >>>>    @@ -383,10 +387,8 @@ static int ct_write(struct intel_guc_ct *ct,
> >>>>    {
> >>>>        struct intel_guc_ct_buffer *ctb = &ct->ctbs.send;
> >>>>        struct guc_ct_buffer_desc *desc = ctb->desc;
> >>>> -    u32 head = desc->head;
> >>>> -    u32 tail = desc->tail;
> >>>> +    u32 tail = ctb->tail;
> >>>>        u32 size = ctb->size;
> >>>> -    u32 used;
> >>>>        u32 header;
> >>>>        u32 hxg;
> >>>>        u32 *cmds = ctb->cmds;
> >>>> @@ -395,25 +397,22 @@ static int ct_write(struct intel_guc_ct *ct,
> >>>>        if (unlikely(desc->status))
> >>>>            goto corrupted;
> >>>>    -    if (unlikely((tail | head) >= size)) {
> >>>> +    GEM_BUG_ON(tail > size);
> >>>> +
> >>>> +#ifdef CONFIG_DRM_I915_DEBUG_GUC
> >>>> +    if (unlikely(tail != READ_ONCE(desc->tail))) {
> >>>> +        CT_ERROR(ct, "Tail was modified %u != %u\n",
> >>>> +             desc->tail, ctb->tail);
> >>>> +        desc->status |= GUC_CTB_STATUS_MISMATCH;
> >>>> +        goto corrupted;
> >>>> +    }
> >>>> +    if (unlikely((desc->tail | desc->head) >= size)) {
> >>>>            CT_ERROR(ct, "Invalid offsets head=%u tail=%u (size=%u)\n",
> >>>> -             head, tail, size);
> >>>> +             desc->head, desc->tail, size);
> >>>>            desc->status |= GUC_CTB_STATUS_OVERFLOW;
> >>>>            goto corrupted;
> >>>>        }
> >>>> -
> >>>> -    /*
> >>>> -     * tail == head condition indicates empty. GuC FW does not support
> >>>> -     * using up the entire buffer to get tail == head meaning full.
> >>>> -     */
> >>>> -    if (tail < head)
> >>>> -        used = (size - head) + tail;
> >>>> -    else
> >>>> -        used = tail - head;
> >>>> -
> >>>> -    /* make sure there is a space including extra dw for the fence */
> >>>> -    if (unlikely(used + len + GUC_CTB_HDR_LEN >= size))
> >>>> -        return -ENOSPC;
> >>>> +#endif
> >>>>          /*
> >>>>         * dw0: CT header (including fence)
> >>>> @@ -454,7 +453,9 @@ static int ct_write(struct intel_guc_ct *ct,
> >>>>        write_barrier(ct);
> >>>>          /* now update descriptor */
> >>>> +    ctb->tail = tail;
> >>>>        WRITE_ONCE(desc->tail, tail);
> >>>> +    ctb->space -= len + GUC_CTB_HDR_LEN;
> >>>>          return 0;
> >>>>    @@ -470,7 +471,7 @@ static int ct_write(struct intel_guc_ct *ct,
> >>>>     * @req:    pointer to pending request
> >>>>     * @status:    placeholder for status
> >>>>     *
> >>>> - * For each sent request, Guc shall send bac CT response message.
> >>>> + * For each sent request, GuC shall send back CT response message.
> >>>>     * Our message handler will update status of tracked request once
> >>>>     * response message with given fence is received. Wait here and
> >>>>     * check for valid response status value.
> >>>> @@ -526,24 +527,35 @@ static inline bool ct_deadlocked(struct
> >>>> intel_guc_ct *ct)
> >>>>        return ret;
> >>>>    }
> >>>>    -static inline bool h2g_has_room(struct intel_guc_ct_buffer *ctb,
> >>>> u32 len_dw)
> >>>> +static inline bool h2g_has_room(struct intel_guc_ct *ct, u32 len_dw)
> >>>>    {
> >>>> -    struct guc_ct_buffer_desc *desc = ctb->desc;
> >>>> -    u32 head = READ_ONCE(desc->head);
> >>>> +    struct intel_guc_ct_buffer *ctb = &ct->ctbs.send;
> >>>> +    u32 head;
> >>>>        u32 space;
> >>>>    -    space = CIRC_SPACE(desc->tail, head, ctb->size);
> >>>> +    if (ctb->space >= len_dw)
> >>>> +        return true;
> >>>> +
> >>>> +    head = READ_ONCE(ctb->desc->head);
> >>>> +    if (unlikely(head > ctb->size)) {
> >>>> +        CT_ERROR(ct, "Corrupted descriptor head=%u tail=%u size=%u\n",
> >>>> +             ctb->desc->head, ctb->desc->tail, ctb->size);
> >>>> +        ctb->desc->status |= GUC_CTB_STATUS_OVERFLOW;
> >>>> +        ctb->broken = true;
> >>>> +        return false;
> >>>> +    }
> >>>> +
> >>>> +    space = CIRC_SPACE(ctb->tail, head, ctb->size);
> >>>> +    ctb->space = space;
> >>>>          return space >= len_dw;
> >>>>    }
> >>>>      static int has_room_nb(struct intel_guc_ct *ct, u32 len_dw)
> >>>>    {
> >>>> -    struct intel_guc_ct_buffer *ctb = &ct->ctbs.send;
> >>>> -
> >>>>        lockdep_assert_held(&ct->ctbs.send.lock);
> >>>>    -    if (unlikely(!h2g_has_room(ctb, len_dw))) {
> >>>> +    if (unlikely(!h2g_has_room(ct, len_dw))) {
> >>>>            if (ct->stall_time == KTIME_MAX)
> >>>>                ct->stall_time = ktime_get();
> >>>>    @@ -613,7 +625,7 @@ static int ct_send(struct intel_guc_ct *ct,
> >>>>         */
> >>>>    retry:
> >>>>        spin_lock_irqsave(&ctb->lock, flags);
> >>>> -    if (unlikely(!h2g_has_room(ctb, len + GUC_CTB_HDR_LEN))) {
> >>>> +    if (unlikely(!h2g_has_room(ct, len + GUC_CTB_HDR_LEN))) {
> >>>>            if (ct->stall_time == KTIME_MAX)
> >>>>                ct->stall_time = ktime_get();
> >>>>            spin_unlock_irqrestore(&ctb->lock, flags);
> >>>> @@ -733,7 +745,7 @@ static int ct_read(struct intel_guc_ct *ct, struct
> >>>> ct_incoming_msg **msg)
> >>>>    {
> >>>>        struct intel_guc_ct_buffer *ctb = &ct->ctbs.recv;
> >>>>        struct guc_ct_buffer_desc *desc = ctb->desc;
> >>>> -    u32 head = desc->head;
> >>>> +    u32 head = ctb->head;
> >>>>        u32 tail = desc->tail;
> >>>>        u32 size = ctb->size;
> >>>>        u32 *cmds = ctb->cmds;
> >>>> @@ -748,12 +760,29 @@ static int ct_read(struct intel_guc_ct *ct,
> >>>> struct ct_incoming_msg **msg)
> >>>>        if (unlikely(desc->status))
> >>>>            goto corrupted;
> >>>>    -    if (unlikely((tail | head) >= size)) {
> >>>> +    GEM_BUG_ON(head > size);

This is a driver-owned field, so I think a GEM_BUG_ON is correct: if this
blows the driver apart, we have a bug in i915.

> >>> Is the BUG_ON necessary given that both options below do the same check
> >>> but as a corrupted buffer test (with subsequent recovery by GT reset?)
> >>> rather than killing the driver.
> >> "head" and "size" are now fully owned by the driver.
> >> BUGON here is to make sure driver is coded correctly.
> > The point is that both sides of the #if below also validate head. So
> 
> But not the same "head".
> 
> Under DEBUG we validate the one from the descriptor (together with
> tail), and that should be recoverable: if the check fails, it was
> clearly not our fault.
> 
> But under non-DEBUG we were validating the local one again, pretending
> that this is recoverable. It is not, as a failure there is our fault
> (elsewhere in i915 we don't attempt to recover from obvious coding errors).
>
> > first there is a BUG_ON, then there is the same test but without blowing
> > the driver apart. One or the other is not required. My vote would be to
> > keep the recoverable test rather than the BUG_ON.
> 
> IMHO we should keep the GEM_BUG_ON and drop the redundant check in non-DEBUG.
> 
> But let Matt decide.
>

I think I'll drop the testing of the head value and keep the BUG_ON.
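
Roughly, and only as a standalone sketch rather than the actual follow-up
patch, the non-DEBUG read path would then trust the shadow head (guarded by
the assertion) and range-check only the tail read from the descriptor. The
function name demo_available() and its -1 error return are made up for
illustration, assert() stands in for GEM_BUG_ON, and the wrap handling after
"available = tail - head" is an assumption about the code that follows the
quoted hunk.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Returns the number of dwords available to read, or -1 when the tail
 * taken from the shared descriptor is out of range (recoverable case).
 */
static int32_t demo_available(uint32_t shadow_head, uint32_t desc_tail,
			      uint32_t size)
{
	int32_t available;

	/* Driver-owned value: a failure here is a driver bug. */
	assert(shadow_head <= size);

	/* Only the firmware-written tail needs a recoverable check. */
	if (desc_tail >= size)
		return -1;

	/* tail == head means empty; a negative result means wrap-around. */
	available = (int32_t)desc_tail - (int32_t)shadow_head;
	if (available < 0)
		available += (int32_t)size;

	return available;
}
```

With a 16 dword buffer, a shadow head of 12 and a descriptor tail of 4
describe 8 dwords of wrapped data, while a tail of 16 is reported as
corrupted instead of tripping the assertion.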

Matt
 
> Michal
> 
> > 
> > John.
> > 
> >>
> >>>> +
> >>>> +#ifdef CONFIG_DRM_I915_DEBUG_GUC
> >>>> +    if (unlikely(head != READ_ONCE(desc->head))) {
> >>>> +        CT_ERROR(ct, "Head was modified %u != %u\n",
> >>>> +             desc->head, ctb->head);
> >>>> +        desc->status |= GUC_CTB_STATUS_MISMATCH;
> >>>> +        goto corrupted;
> >>>> +    }
> >>>> +    if (unlikely((desc->tail | desc->head) >= size)) {
> >>>> +        CT_ERROR(ct, "Invalid offsets head=%u tail=%u (size=%u)\n",
> >>>> +             head, tail, size);
> >>>> +        desc->status |= GUC_CTB_STATUS_OVERFLOW;
> >>>> +        goto corrupted;
> >>>> +    }
> >>>> +#else
> >>>> +    if (unlikely((tail | ctb->head) >= size)) {
> >>> Could just be 'head' rather than 'ctb->head'.
> >> or drop "ctb->head" completely since this is driver owned field and
> >> above you already have BUGON to test it
> >>
> >> Michal
> >>
> >>> John.
> >>>
> >>>>            CT_ERROR(ct, "Invalid offsets head=%u tail=%u (size=%u)\n",
> >>>>                 head, tail, size);
> >>>>            desc->status |= GUC_CTB_STATUS_OVERFLOW;
> >>>>            goto corrupted;
> >>>>        }
> >>>> +#endif
> >>>>          /* tail == head condition indicates empty */
> >>>>        available = tail - head;
> >>>> @@ -803,6 +832,7 @@ static int ct_read(struct intel_guc_ct *ct, struct
> >>>> ct_incoming_msg **msg)
> >>>>        }
> >>>>        CT_DEBUG(ct, "received %*ph\n", 4 * len, (*msg)->msg);
> >>>>    +    ctb->head = head;
> >>>>        /* now update descriptor */
> >>>>        WRITE_ONCE(desc->head, head);
> >>>>    diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.h
> >>>> b/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.h
> >>>> index bee03794c1eb..edd1bba0445d 100644
> >>>> --- a/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.h
> >>>> +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.h
> >>>> @@ -33,6 +33,9 @@ struct intel_guc;
> >>>>     * @desc: pointer to the buffer descriptor
> >>>>     * @cmds: pointer to the commands buffer
> >>>>     * @size: size of the commands buffer in dwords
> >>>> + * @head: local shadow copy of head in dwords
> >>>> + * @tail: local shadow copy of tail in dwords
> >>>> + * @space: local shadow copy of space in dwords
> >>>>     * @broken: flag to indicate if descriptor data is broken
> >>>>     */
> >>>>    struct intel_guc_ct_buffer {
> >>>> @@ -40,6 +43,9 @@ struct intel_guc_ct_buffer {
> >>>>        struct guc_ct_buffer_desc *desc;
> >>>>        u32 *cmds;
> >>>>        u32 size;
> >>>> +    u32 tail;
> >>>> +    u32 head;
> >>>> +    u32 space;
> >>>>        bool broken;
> >>>>    };
> >>>>    
> > 

* Re: [Intel-gfx] [PATCH 6/7] drm/i915/guc: Optimize CTB writes and reads
@ 2021-07-06 21:22             ` Matthew Brost
  0 siblings, 0 replies; 38+ messages in thread
From: Matthew Brost @ 2021-07-06 21:22 UTC (permalink / raw)
  To: Michal Wajdeczko; +Cc: intel-gfx, dri-devel

On Tue, Jul 06, 2021 at 09:33:23PM +0200, Michal Wajdeczko wrote:
> 
> 
> On 06.07.2021 21:19, John Harrison wrote:
> > On 7/6/2021 12:12, Michal Wajdeczko wrote:
> >> On 06.07.2021 21:00, John Harrison wrote:
> >>> On 7/1/2021 10:15, Matthew Brost wrote:
> >>>> CTB writes are now in the path of command submission and should be
> >>>> optimized for performance. Rather than reading CTB descriptor values
> >>>> (e.g. head, tail) which could result in accesses across the PCIe bus,
> >>>> store shadow local copies and only read/write the descriptor values
> >>>> when
> >>>> absolutely necessary. Also store the current space in each channel
> >>>> locally.
> >>>>
> >>>> v2:
> >>>>    (Michal)
> >>>>     - Add additional sanity checks for head / tail pointers
> >>>>     - Use GUC_CTB_HDR_LEN rather than magic 1
> >>>>
> >>>> Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
> >>>> Signed-off-by: Matthew Brost <matthew.brost@intel.com>
> >>>> ---
> >>>>    drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c | 88
> >>>> +++++++++++++++--------
> >>>>    drivers/gpu/drm/i915/gt/uc/intel_guc_ct.h |  6 ++
> >>>>    2 files changed, 65 insertions(+), 29 deletions(-)
> >>>>
> >>>> diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c
> >>>> b/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c
> >>>> index a9cb7b608520..5b8b4ff609e2 100644
> >>>> --- a/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c
> >>>> +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c
> >>>> @@ -130,6 +130,10 @@ static void guc_ct_buffer_desc_init(struct
> >>>> guc_ct_buffer_desc *desc)
> >>>>    static void guc_ct_buffer_reset(struct intel_guc_ct_buffer *ctb)
> >>>>    {
> >>>>        ctb->broken = false;
> >>>> +    ctb->tail = 0;
> >>>> +    ctb->head = 0;
> >>>> +    ctb->space = CIRC_SPACE(ctb->tail, ctb->head, ctb->size);
> >>>> +
> >>>>        guc_ct_buffer_desc_init(ctb->desc);
> >>>>    }
> >>>>    @@ -383,10 +387,8 @@ static int ct_write(struct intel_guc_ct *ct,
> >>>>    {
> >>>>        struct intel_guc_ct_buffer *ctb = &ct->ctbs.send;
> >>>>        struct guc_ct_buffer_desc *desc = ctb->desc;
> >>>> -    u32 head = desc->head;
> >>>> -    u32 tail = desc->tail;
> >>>> +    u32 tail = ctb->tail;
> >>>>        u32 size = ctb->size;
> >>>> -    u32 used;
> >>>>        u32 header;
> >>>>        u32 hxg;
> >>>>        u32 *cmds = ctb->cmds;
> >>>> @@ -395,25 +397,22 @@ static int ct_write(struct intel_guc_ct *ct,
> >>>>        if (unlikely(desc->status))
> >>>>            goto corrupted;
> >>>>    -    if (unlikely((tail | head) >= size)) {
> >>>> +    GEM_BUG_ON(tail > size);
> >>>> +
> >>>> +#ifdef CONFIG_DRM_I915_DEBUG_GUC
> >>>> +    if (unlikely(tail != READ_ONCE(desc->tail))) {
> >>>> +        CT_ERROR(ct, "Tail was modified %u != %u\n",
> >>>> +             desc->tail, ctb->tail);
> >>>> +        desc->status |= GUC_CTB_STATUS_MISMATCH;
> >>>> +        goto corrupted;
> >>>> +    }
> >>>> +    if (unlikely((desc->tail | desc->head) >= size)) {
> >>>>            CT_ERROR(ct, "Invalid offsets head=%u tail=%u (size=%u)\n",
> >>>> -             head, tail, size);
> >>>> +             desc->head, desc->tail, size);
> >>>>            desc->status |= GUC_CTB_STATUS_OVERFLOW;
> >>>>            goto corrupted;
> >>>>        }
> >>>> -
> >>>> -    /*
> >>>> -     * tail == head condition indicates empty. GuC FW does not support
> >>>> -     * using up the entire buffer to get tail == head meaning full.
> >>>> -     */
> >>>> -    if (tail < head)
> >>>> -        used = (size - head) + tail;
> >>>> -    else
> >>>> -        used = tail - head;
> >>>> -
> >>>> -    /* make sure there is a space including extra dw for the fence */
> >>>> -    if (unlikely(used + len + GUC_CTB_HDR_LEN >= size))
> >>>> -        return -ENOSPC;
> >>>> +#endif
> >>>>          /*
> >>>>         * dw0: CT header (including fence)
> >>>> @@ -454,7 +453,9 @@ static int ct_write(struct intel_guc_ct *ct,
> >>>>        write_barrier(ct);
> >>>>          /* now update descriptor */
> >>>> +    ctb->tail = tail;
> >>>>        WRITE_ONCE(desc->tail, tail);
> >>>> +    ctb->space -= len + GUC_CTB_HDR_LEN;
> >>>>          return 0;
> >>>>    @@ -470,7 +471,7 @@ static int ct_write(struct intel_guc_ct *ct,
> >>>>     * @req:    pointer to pending request
> >>>>     * @status:    placeholder for status
> >>>>     *
> >>>> - * For each sent request, Guc shall send bac CT response message.
> >>>> + * For each sent request, GuC shall send back CT response message.
> >>>>     * Our message handler will update status of tracked request once
> >>>>     * response message with given fence is received. Wait here and
> >>>>     * check for valid response status value.
> >>>> @@ -526,24 +527,35 @@ static inline bool ct_deadlocked(struct
> >>>> intel_guc_ct *ct)
> >>>>        return ret;
> >>>>    }
> >>>>    -static inline bool h2g_has_room(struct intel_guc_ct_buffer *ctb,
> >>>> u32 len_dw)
> >>>> +static inline bool h2g_has_room(struct intel_guc_ct *ct, u32 len_dw)
> >>>>    {
> >>>> -    struct guc_ct_buffer_desc *desc = ctb->desc;
> >>>> -    u32 head = READ_ONCE(desc->head);
> >>>> +    struct intel_guc_ct_buffer *ctb = &ct->ctbs.send;
> >>>> +    u32 head;
> >>>>        u32 space;
> >>>>    -    space = CIRC_SPACE(desc->tail, head, ctb->size);
> >>>> +    if (ctb->space >= len_dw)
> >>>> +        return true;
> >>>> +
> >>>> +    head = READ_ONCE(ctb->desc->head);
> >>>> +    if (unlikely(head > ctb->size)) {
> >>>> +        CT_ERROR(ct, "Corrupted descriptor head=%u tail=%u size=%u\n",
> >>>> +             ctb->desc->head, ctb->desc->tail, ctb->size);
> >>>> +        ctb->desc->status |= GUC_CTB_STATUS_OVERFLOW;
> >>>> +        ctb->broken = true;
> >>>> +        return false;
> >>>> +    }
> >>>> +
> >>>> +    space = CIRC_SPACE(ctb->tail, head, ctb->size);
> >>>> +    ctb->space = space;
> >>>>          return space >= len_dw;
> >>>>    }
> >>>>      static int has_room_nb(struct intel_guc_ct *ct, u32 len_dw)
> >>>>    {
> >>>> -    struct intel_guc_ct_buffer *ctb = &ct->ctbs.send;
> >>>> -
> >>>>        lockdep_assert_held(&ct->ctbs.send.lock);
> >>>>    -    if (unlikely(!h2g_has_room(ctb, len_dw))) {
> >>>> +    if (unlikely(!h2g_has_room(ct, len_dw))) {
> >>>>            if (ct->stall_time == KTIME_MAX)
> >>>>                ct->stall_time = ktime_get();
> >>>>    @@ -613,7 +625,7 @@ static int ct_send(struct intel_guc_ct *ct,
> >>>>         */
> >>>>    retry:
> >>>>        spin_lock_irqsave(&ctb->lock, flags);
> >>>> -    if (unlikely(!h2g_has_room(ctb, len + GUC_CTB_HDR_LEN))) {
> >>>> +    if (unlikely(!h2g_has_room(ct, len + GUC_CTB_HDR_LEN))) {
> >>>>            if (ct->stall_time == KTIME_MAX)
> >>>>                ct->stall_time = ktime_get();
> >>>>            spin_unlock_irqrestore(&ctb->lock, flags);
> >>>> @@ -733,7 +745,7 @@ static int ct_read(struct intel_guc_ct *ct, struct
> >>>> ct_incoming_msg **msg)
> >>>>    {
> >>>>        struct intel_guc_ct_buffer *ctb = &ct->ctbs.recv;
> >>>>        struct guc_ct_buffer_desc *desc = ctb->desc;
> >>>> -    u32 head = desc->head;
> >>>> +    u32 head = ctb->head;
> >>>>        u32 tail = desc->tail;
> >>>>        u32 size = ctb->size;
> >>>>        u32 *cmds = ctb->cmds;
> >>>> @@ -748,12 +760,29 @@ static int ct_read(struct intel_guc_ct *ct,
> >>>> struct ct_incoming_msg **msg)
> >>>>        if (unlikely(desc->status))
> >>>>            goto corrupted;
> >>>>    -    if (unlikely((tail | head) >= size)) {
> >>>> +    GEM_BUG_ON(head > size);

This is a driver-owned field, so I think a GEM_BUG_ON is correct: if this
blows the driver apart, we have a bug in i915.

> >>> Is the BUG_ON necessary given that both options below do the same check
> >>> but as a corrupted buffer test (with subsequent recovery by GT reset?)
> >>> rather than killing the driver.
> >> "head" and "size" are now fully owned by the driver.
> >> BUGON here is to make sure driver is coded correctly.
> > The point is that both sides of the #if below also validate head. So
> 
> but not the same "head"
> 
> under DEBUG we are validating the one from descriptor (together with
> tail) - and that should be recoverable as if this fails it was clearly
> not our fault.
> 
> but under non-DEBUG we were attempting to validate again the local one,
> pretending that this is recoverable, but it is not, as this is our fault
> (elsewhere in i915 we don't attempt to recover from obvious coding errors).
>
> > first there is a BUG_ON, then there is the same test but without blowing
> > the driver apart. One or the other is not required. My vote would be to
> > keep the recoverable test rather than the BUG_ON.
> 
> IMHO we should keep GEMBUGON and drop redundant check in non-DEBUG.
> 
> But let Matt decide.
>

I think I'll drop the testing of the head value and keep the BUG_ON.
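
A minimal sketch of that split, assuming the non-DEBUG path ends up
validating only the firmware-written tail. Every name here
(sketch_recv_desc, sketch_recv_ctb, SKETCH_STATUS_OVERFLOW,
sketch_recv_offsets_ok) is a hypothetical stand-in, plain assert() stands
in for GEM_BUG_ON, and this is not the follow-up patch itself:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for illustration; not the i915 definitions. */
#define SKETCH_STATUS_OVERFLOW (1u << 0)

struct sketch_recv_desc {               /* memory shared with the firmware */
	uint32_t head;
	uint32_t tail;
	uint32_t status;
};

struct sketch_recv_ctb {                /* driver-private bookkeeping */
	struct sketch_recv_desc *desc;
	uint32_t size;                  /* buffer size in dwords */
	uint32_t head;                  /* driver-owned shadow of desc->head */
	bool broken;
};

/* Returns true when the receive offsets look usable. */
static bool sketch_recv_offsets_ok(struct sketch_recv_ctb *ctb)
{
	uint32_t tail = ctb->desc->tail;

	/*
	 * Driver-owned value: a bad head can only come from a driver bug,
	 * so it is asserted (same bound as the GEM_BUG_ON in the patch).
	 */
	assert(ctb->head <= ctb->size);

	/*
	 * Firmware-written value: a bad tail is not the driver's fault, so
	 * it is flagged and left to the recovery path instead of asserted.
	 */
	if (tail >= ctb->size) {
		ctb->desc->status |= SKETCH_STATUS_OVERFLOW;
		ctb->broken = true;
		return false;
	}

	return true;
}
```

The only point of the sketch is the ownership split: the shadow head is
asserted, the descriptor tail is reported and recovered from.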

Matt
 
> Michal
> 
> > 
> > John.
> > 
> >>
> >>>> +
> >>>> +#ifdef CONFIG_DRM_I915_DEBUG_GUC
> >>>> +    if (unlikely(head != READ_ONCE(desc->head))) {
> >>>> +        CT_ERROR(ct, "Head was modified %u != %u\n",
> >>>> +             desc->head, ctb->head);
> >>>> +        desc->status |= GUC_CTB_STATUS_MISMATCH;
> >>>> +        goto corrupted;
> >>>> +    }
> >>>> +    if (unlikely((desc->tail | desc->head) >= size)) {
> >>>> +        CT_ERROR(ct, "Invalid offsets head=%u tail=%u (size=%u)\n",
> >>>> +             head, tail, size);
> >>>> +        desc->status |= GUC_CTB_STATUS_OVERFLOW;
> >>>> +        goto corrupted;
> >>>> +    }
> >>>> +#else
> >>>> +    if (unlikely((tail | ctb->head) >= size)) {
> >>> Could just be 'head' rather than 'ctb->head'.
> >> or drop "ctb->head" completely since this is driver owned field and
> >> above you already have BUGON to test it
> >>
> >> Michal
> >>
> >>> John.
> >>>
> >>>>            CT_ERROR(ct, "Invalid offsets head=%u tail=%u (size=%u)\n",
> >>>>                 head, tail, size);
> >>>>            desc->status |= GUC_CTB_STATUS_OVERFLOW;
> >>>>            goto corrupted;
> >>>>        }
> >>>> +#endif
> >>>>          /* tail == head condition indicates empty */
> >>>>        available = tail - head;
> >>>> @@ -803,6 +832,7 @@ static int ct_read(struct intel_guc_ct *ct, struct
> >>>> ct_incoming_msg **msg)
> >>>>        }
> >>>>        CT_DEBUG(ct, "received %*ph\n", 4 * len, (*msg)->msg);
> >>>>    +    ctb->head = head;
> >>>>        /* now update descriptor */
> >>>>        WRITE_ONCE(desc->head, head);
> >>>>    diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.h
> >>>> b/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.h
> >>>> index bee03794c1eb..edd1bba0445d 100644
> >>>> --- a/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.h
> >>>> +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.h
> >>>> @@ -33,6 +33,9 @@ struct intel_guc;
> >>>>     * @desc: pointer to the buffer descriptor
> >>>>     * @cmds: pointer to the commands buffer
> >>>>     * @size: size of the commands buffer in dwords
> >>>> + * @head: local shadow copy of head in dwords
> >>>> + * @tail: local shadow copy of tail in dwords
> >>>> + * @space: local shadow copy of space in dwords
> >>>>     * @broken: flag to indicate if descriptor data is broken
> >>>>     */
> >>>>    struct intel_guc_ct_buffer {
> >>>> @@ -40,6 +43,9 @@ struct intel_guc_ct_buffer {
> >>>>        struct guc_ct_buffer_desc *desc;
> >>>>        u32 *cmds;
> >>>>        u32 size;
> >>>> +    u32 tail;
> >>>> +    u32 head;
> >>>> +    u32 space;
> >>>>        bool broken;
> >>>>    };
> >>>>    
> > 
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH 6/7] drm/i915/guc: Optimize CTB writes and reads
  2021-07-06 19:33           ` [Intel-gfx] " Michal Wajdeczko
@ 2021-07-06 22:41             ` John Harrison
  -1 siblings, 0 replies; 38+ messages in thread
From: John Harrison @ 2021-07-06 22:41 UTC (permalink / raw)
  To: Michal Wajdeczko, Matthew Brost, intel-gfx, dri-devel

On 7/6/2021 12:33, Michal Wajdeczko wrote:
> On 06.07.2021 21:19, John Harrison wrote:
>> On 7/6/2021 12:12, Michal Wajdeczko wrote:
>>> On 06.07.2021 21:00, John Harrison wrote:
>>>> On 7/1/2021 10:15, Matthew Brost wrote:
>>>>> CTB writes are now in the path of command submission and should be
>>>>> optimized for performance. Rather than reading CTB descriptor values
>>>>> (e.g. head, tail) which could result in accesses across the PCIe bus,
>>>>> store shadow local copies and only read/write the descriptor values when
>>>>> absolutely necessary. Also store the current space in each channel
>>>>> locally.
>>>>>
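
The scheme described above, sketched outside the driver under stated
assumptions: the struct and function names are hypothetical,
SKETCH_CIRC_SPACE repeats the arithmetic of CIRC_SPACE() from
linux/circ_buf.h (so the size has to be a power of two), and desc_reads
exists only to make the saved descriptor reads visible. It illustrates the
idea and is not the i915 code:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Free slots for a producer at index wr and a consumer at index rd. */
#define SKETCH_CIRC_SPACE(wr, rd, size) (((rd) - ((wr) + 1)) & ((size) - 1))

struct sketch_send_desc {               /* shared memory, costly to read */
	volatile uint32_t head;         /* consumer index, moved by the other side */
	volatile uint32_t tail;         /* producer index, published by the driver */
};

struct sketch_send_ctb {
	struct sketch_send_desc *desc;
	uint32_t size;                  /* in dwords, power of two */
	uint32_t tail;                  /* local shadow of desc->tail */
	uint32_t space;                 /* cached free space in dwords */
	unsigned int desc_reads;        /* instrumentation for this sketch only */
};

static void sketch_send_reset(struct sketch_send_ctb *ctb)
{
	assert(ctb->size && !(ctb->size & (ctb->size - 1)));
	ctb->tail = 0;
	ctb->desc_reads = 0;
	ctb->desc->head = 0;
	ctb->desc->tail = 0;
	ctb->space = SKETCH_CIRC_SPACE(ctb->tail, 0u, ctb->size);
}

static bool sketch_send_has_room(struct sketch_send_ctb *ctb, uint32_t len_dw)
{
	uint32_t head;

	/* Fast path: the cached value is a safe lower bound, no shared read. */
	if (ctb->space >= len_dw)
		return true;

	/* Slow path: refresh the cache from the shared descriptor. */
	head = ctb->desc->head;
	ctb->desc_reads++;
	if (head >= ctb->size)
		return false;           /* corrupted descriptor, caller recovers */

	ctb->space = SKETCH_CIRC_SPACE(ctb->tail, head, ctb->size);
	return ctb->space >= len_dw;
}

/* Call only after sketch_send_has_room() returned true for len_dw. */
static void sketch_send_commit(struct sketch_send_ctb *ctb, uint32_t len_dw)
{
	ctb->tail = (ctb->tail + len_dw) & (ctb->size - 1);
	ctb->desc->tail = ctb->tail;    /* publish the new tail */
	ctb->space -= len_dw;           /* keep the cache in step */
}
```

The cached space can only underestimate what is really free, because the
consumer only ever advances head; that is why the fast path is safe and the
descriptor needs to be read only when the cached value is too small.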
>>>>> v2:
>>>>>     (Michal)
>>>>>      - Add additional sanity checks for head / tail pointers
>>>>>      - Use GUC_CTB_HDR_LEN rather than magic 1
>>>>>
>>>>> Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
>>>>> Signed-off-by: Matthew Brost <matthew.brost@intel.com>
>>>>> ---
>>>>>     drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c | 88
>>>>> +++++++++++++++--------
>>>>>     drivers/gpu/drm/i915/gt/uc/intel_guc_ct.h |  6 ++
>>>>>     2 files changed, 65 insertions(+), 29 deletions(-)
>>>>>
>>>>> diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c
>>>>> b/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c
>>>>> index a9cb7b608520..5b8b4ff609e2 100644
>>>>> --- a/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c
>>>>> +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c
>>>>> @@ -130,6 +130,10 @@ static void guc_ct_buffer_desc_init(struct
>>>>> guc_ct_buffer_desc *desc)
>>>>>     static void guc_ct_buffer_reset(struct intel_guc_ct_buffer *ctb)
>>>>>     {
>>>>>         ctb->broken = false;
>>>>> +    ctb->tail = 0;
>>>>> +    ctb->head = 0;
>>>>> +    ctb->space = CIRC_SPACE(ctb->tail, ctb->head, ctb->size);
>>>>> +
>>>>>         guc_ct_buffer_desc_init(ctb->desc);
>>>>>     }
>>>>>     @@ -383,10 +387,8 @@ static int ct_write(struct intel_guc_ct *ct,
>>>>>     {
>>>>>         struct intel_guc_ct_buffer *ctb = &ct->ctbs.send;
>>>>>         struct guc_ct_buffer_desc *desc = ctb->desc;
>>>>> -    u32 head = desc->head;
>>>>> -    u32 tail = desc->tail;
>>>>> +    u32 tail = ctb->tail;
>>>>>         u32 size = ctb->size;
>>>>> -    u32 used;
>>>>>         u32 header;
>>>>>         u32 hxg;
>>>>>         u32 *cmds = ctb->cmds;
>>>>> @@ -395,25 +397,22 @@ static int ct_write(struct intel_guc_ct *ct,
>>>>>         if (unlikely(desc->status))
>>>>>             goto corrupted;
>>>>>     -    if (unlikely((tail | head) >= size)) {
>>>>> +    GEM_BUG_ON(tail > size);
>>>>> +
>>>>> +#ifdef CONFIG_DRM_I915_DEBUG_GUC
>>>>> +    if (unlikely(tail != READ_ONCE(desc->tail))) {
>>>>> +        CT_ERROR(ct, "Tail was modified %u != %u\n",
>>>>> +             desc->tail, ctb->tail);
>>>>> +        desc->status |= GUC_CTB_STATUS_MISMATCH;
>>>>> +        goto corrupted;
>>>>> +    }
>>>>> +    if (unlikely((desc->tail | desc->head) >= size)) {
>>>>>             CT_ERROR(ct, "Invalid offsets head=%u tail=%u (size=%u)\n",
>>>>> -             head, tail, size);
>>>>> +             desc->head, desc->tail, size);
>>>>>             desc->status |= GUC_CTB_STATUS_OVERFLOW;
>>>>>             goto corrupted;
>>>>>         }
>>>>> -
>>>>> -    /*
>>>>> -     * tail == head condition indicates empty. GuC FW does not support
>>>>> -     * using up the entire buffer to get tail == head meaning full.
>>>>> -     */
>>>>> -    if (tail < head)
>>>>> -        used = (size - head) + tail;
>>>>> -    else
>>>>> -        used = tail - head;
>>>>> -
>>>>> -    /* make sure there is a space including extra dw for the fence */
>>>>> -    if (unlikely(used + len + GUC_CTB_HDR_LEN >= size))
>>>>> -        return -ENOSPC;
>>>>> +#endif
>>>>>           /*
>>>>>          * dw0: CT header (including fence)
>>>>> @@ -454,7 +453,9 @@ static int ct_write(struct intel_guc_ct *ct,
>>>>>         write_barrier(ct);
>>>>>           /* now update descriptor */
>>>>> +    ctb->tail = tail;
>>>>>         WRITE_ONCE(desc->tail, tail);
>>>>> +    ctb->space -= len + GUC_CTB_HDR_LEN;
>>>>>           return 0;
>>>>>     @@ -470,7 +471,7 @@ static int ct_write(struct intel_guc_ct *ct,
>>>>>      * @req:    pointer to pending request
>>>>>      * @status:    placeholder for status
>>>>>      *
>>>>> - * For each sent request, Guc shall send bac CT response message.
>>>>> + * For each sent request, GuC shall send back CT response message.
>>>>>      * Our message handler will update status of tracked request once
>>>>>      * response message with given fence is received. Wait here and
>>>>>      * check for valid response status value.
>>>>> @@ -526,24 +527,35 @@ static inline bool ct_deadlocked(struct
>>>>> intel_guc_ct *ct)
>>>>>         return ret;
>>>>>     }
>>>>>     -static inline bool h2g_has_room(struct intel_guc_ct_buffer *ctb,
>>>>> u32 len_dw)
>>>>> +static inline bool h2g_has_room(struct intel_guc_ct *ct, u32 len_dw)
>>>>>     {
>>>>> -    struct guc_ct_buffer_desc *desc = ctb->desc;
>>>>> -    u32 head = READ_ONCE(desc->head);
>>>>> +    struct intel_guc_ct_buffer *ctb = &ct->ctbs.send;
>>>>> +    u32 head;
>>>>>         u32 space;
>>>>>     -    space = CIRC_SPACE(desc->tail, head, ctb->size);
>>>>> +    if (ctb->space >= len_dw)
>>>>> +        return true;
>>>>> +
>>>>> +    head = READ_ONCE(ctb->desc->head);
>>>>> +    if (unlikely(head > ctb->size)) {
>>>>> +        CT_ERROR(ct, "Corrupted descriptor head=%u tail=%u size=%u\n",
>>>>> +             ctb->desc->head, ctb->desc->tail, ctb->size);
>>>>> +        ctb->desc->status |= GUC_CTB_STATUS_OVERFLOW;
>>>>> +        ctb->broken = true;
>>>>> +        return false;
>>>>> +    }
>>>>> +
>>>>> +    space = CIRC_SPACE(ctb->tail, head, ctb->size);
>>>>> +    ctb->space = space;
>>>>>           return space >= len_dw;
>>>>>     }
>>>>>       static int has_room_nb(struct intel_guc_ct *ct, u32 len_dw)
>>>>>     {
>>>>> -    struct intel_guc_ct_buffer *ctb = &ct->ctbs.send;
>>>>> -
>>>>>         lockdep_assert_held(&ct->ctbs.send.lock);
>>>>>     -    if (unlikely(!h2g_has_room(ctb, len_dw))) {
>>>>> +    if (unlikely(!h2g_has_room(ct, len_dw))) {
>>>>>             if (ct->stall_time == KTIME_MAX)
>>>>>                 ct->stall_time = ktime_get();
>>>>>     @@ -613,7 +625,7 @@ static int ct_send(struct intel_guc_ct *ct,
>>>>>          */
>>>>>     retry:
>>>>>         spin_lock_irqsave(&ctb->lock, flags);
>>>>> -    if (unlikely(!h2g_has_room(ctb, len + GUC_CTB_HDR_LEN))) {
>>>>> +    if (unlikely(!h2g_has_room(ct, len + GUC_CTB_HDR_LEN))) {
>>>>>             if (ct->stall_time == KTIME_MAX)
>>>>>                 ct->stall_time = ktime_get();
>>>>>             spin_unlock_irqrestore(&ctb->lock, flags);
>>>>> @@ -733,7 +745,7 @@ static int ct_read(struct intel_guc_ct *ct, struct
>>>>> ct_incoming_msg **msg)
>>>>>     {
>>>>>         struct intel_guc_ct_buffer *ctb = &ct->ctbs.recv;
>>>>>         struct guc_ct_buffer_desc *desc = ctb->desc;
>>>>> -    u32 head = desc->head;
>>>>> +    u32 head = ctb->head;
>>>>>         u32 tail = desc->tail;
>>>>>         u32 size = ctb->size;
>>>>>         u32 *cmds = ctb->cmds;
>>>>> @@ -748,12 +760,29 @@ static int ct_read(struct intel_guc_ct *ct,
>>>>> struct ct_incoming_msg **msg)
>>>>>         if (unlikely(desc->status))
>>>>>             goto corrupted;
>>>>>     -    if (unlikely((tail | head) >= size)) {
>>>>> +    GEM_BUG_ON(head > size);
>>>> Is the BUG_ON necessary given that both options below do the same check
>>>> but as a corrupted buffer test (with subsequent recovery by GT reset?)
>>>> rather than killing the driver.
>>> "head" and "size" are now fully owned by the driver.
>>> BUGON here is to make sure driver is coded correctly.
>> The point is that both sides of the #if below also validate head. So
> but not the same "head"
It is, from the point of view of whether the BUG_ON is a redundant duplicate:

+    if (unlikely(head != READ_ONCE(desc->head))) {

>
> under DEBUG we are validating the one from descriptor (together with
> tail) - and that should be recoverable as if this fails it was clearly
> not our fault.
>
> but under non-DEBUG we were attempting to validate again the local one,
> pretending that this is recoverable, but it is not, as this is our fault
> (elsewhere in i915 we don't attempt to recover from obvious coding errors).
If it happens during driver development when someone is editing code,
then it might be an obvious coding error. If it happens once in a blue
moon on some customer's system, then it is a very hard-to-track-down race
condition that is not obvious at all. The former won't make it
past CI whether it is a BUG_ON or a CT_ERROR. The latter has just killed
the end user's kernel rather than attempting to recover (and the recovery
would likely have succeeded).

John.

>
>> first there is a BUG_ON, then there is the same test but without blowing
>> the driver apart. One or the other is not required. My vote would be to
>> keep the recoverable test rather than the BUG_ON.
> IMHO we should keep GEMBUGON and drop redundant check in non-DEBUG.
>
> But let Matt decide.
>
> Michal
>
>> John.
>>
>>>>> +
>>>>> +#ifdef CONFIG_DRM_I915_DEBUG_GUC
>>>>> +    if (unlikely(head != READ_ONCE(desc->head))) {
>>>>> +        CT_ERROR(ct, "Head was modified %u != %u\n",
>>>>> +             desc->head, ctb->head);
>>>>> +        desc->status |= GUC_CTB_STATUS_MISMATCH;
>>>>> +        goto corrupted;
>>>>> +    }
>>>>> +    if (unlikely((desc->tail | desc->head) >= size)) {
>>>>> +        CT_ERROR(ct, "Invalid offsets head=%u tail=%u (size=%u)\n",
>>>>> +             head, tail, size);
>>>>> +        desc->status |= GUC_CTB_STATUS_OVERFLOW;
>>>>> +        goto corrupted;
>>>>> +    }
>>>>> +#else
>>>>> +    if (unlikely((tail | ctb->head) >= size)) {
>>>> Could just be 'head' rather than 'ctb->head'.
>>> or drop "ctb->head" completely since this is driver owned field and
>>> above you already have BUGON to test it
>>>
>>> Michal
>>>
>>>> John.
>>>>
>>>>>             CT_ERROR(ct, "Invalid offsets head=%u tail=%u (size=%u)\n",
>>>>>                  head, tail, size);
>>>>>             desc->status |= GUC_CTB_STATUS_OVERFLOW;
>>>>>             goto corrupted;
>>>>>         }
>>>>> +#endif
>>>>>           /* tail == head condition indicates empty */
>>>>>         available = tail - head;
>>>>> @@ -803,6 +832,7 @@ static int ct_read(struct intel_guc_ct *ct, struct
>>>>> ct_incoming_msg **msg)
>>>>>         }
>>>>>         CT_DEBUG(ct, "received %*ph\n", 4 * len, (*msg)->msg);
>>>>>     +    ctb->head = head;
>>>>>         /* now update descriptor */
>>>>>         WRITE_ONCE(desc->head, head);
>>>>>     diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.h
>>>>> b/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.h
>>>>> index bee03794c1eb..edd1bba0445d 100644
>>>>> --- a/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.h
>>>>> +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.h
>>>>> @@ -33,6 +33,9 @@ struct intel_guc;
>>>>>      * @desc: pointer to the buffer descriptor
>>>>>      * @cmds: pointer to the commands buffer
>>>>>      * @size: size of the commands buffer in dwords
>>>>> + * @head: local shadow copy of head in dwords
>>>>> + * @tail: local shadow copy of tail in dwords
>>>>> + * @space: local shadow copy of space in dwords
>>>>>      * @broken: flag to indicate if descriptor data is broken
>>>>>      */
>>>>>     struct intel_guc_ct_buffer {
>>>>> @@ -40,6 +43,9 @@ struct intel_guc_ct_buffer {
>>>>>         struct guc_ct_buffer_desc *desc;
>>>>>         u32 *cmds;
>>>>>         u32 size;
>>>>> +    u32 tail;
>>>>> +    u32 head;
>>>>> +    u32 space;
>>>>>         bool broken;
>>>>>     };
>>>>>     


^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [Intel-gfx] [PATCH 6/7] drm/i915/guc: Optimize CTB writes and reads
@ 2021-07-06 22:41             ` John Harrison
  0 siblings, 0 replies; 38+ messages in thread
From: John Harrison @ 2021-07-06 22:41 UTC (permalink / raw)
  To: Michal Wajdeczko, Matthew Brost, intel-gfx, dri-devel

On 7/6/2021 12:33, Michal Wajdeczko wrote:
> On 06.07.2021 21:19, John Harrison wrote:
>> On 7/6/2021 12:12, Michal Wajdeczko wrote:
>>> On 06.07.2021 21:00, John Harrison wrote:
>>>> On 7/1/2021 10:15, Matthew Brost wrote:
>>>>> CTB writes are now in the path of command submission and should be
>>>>> optimized for performance. Rather than reading CTB descriptor values
>>>>> (e.g. head, tail) which could result in accesses across the PCIe bus,
>>>>> store shadow local copies and only read/write the descriptor values
>>>>> when
>>>>> absolutely necessary. Also store the current space in the each channel
>>>>> locally.
>>>>>
>>>>> v2:
>>>>>     (Michel)
>>>>>      - Add additional sanity checks for head / tail pointers
>>>>>      - Use GUC_CTB_HDR_LEN rather than magic 1
>>>>>
>>>>> Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
>>>>> Signed-off-by: Matthew Brost <matthew.brost@intel.com>
>>>>> ---
>>>>>     drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c | 88
>>>>> +++++++++++++++--------
>>>>>     drivers/gpu/drm/i915/gt/uc/intel_guc_ct.h |  6 ++
>>>>>     2 files changed, 65 insertions(+), 29 deletions(-)
>>>>>
>>>>> diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c
>>>>> b/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c
>>>>> index a9cb7b608520..5b8b4ff609e2 100644
>>>>> --- a/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c
>>>>> +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c
>>>>> @@ -130,6 +130,10 @@ static void guc_ct_buffer_desc_init(struct
>>>>> guc_ct_buffer_desc *desc)
>>>>>     static void guc_ct_buffer_reset(struct intel_guc_ct_buffer *ctb)
>>>>>     {
>>>>>         ctb->broken = false;
>>>>> +    ctb->tail = 0;
>>>>> +    ctb->head = 0;
>>>>> +    ctb->space = CIRC_SPACE(ctb->tail, ctb->head, ctb->size);
>>>>> +
>>>>>         guc_ct_buffer_desc_init(ctb->desc);
>>>>>     }
>>>>>     @@ -383,10 +387,8 @@ static int ct_write(struct intel_guc_ct *ct,
>>>>>     {
>>>>>         struct intel_guc_ct_buffer *ctb = &ct->ctbs.send;
>>>>>         struct guc_ct_buffer_desc *desc = ctb->desc;
>>>>> -    u32 head = desc->head;
>>>>> -    u32 tail = desc->tail;
>>>>> +    u32 tail = ctb->tail;
>>>>>         u32 size = ctb->size;
>>>>> -    u32 used;
>>>>>         u32 header;
>>>>>         u32 hxg;
>>>>>         u32 *cmds = ctb->cmds;
>>>>> @@ -395,25 +397,22 @@ static int ct_write(struct intel_guc_ct *ct,
>>>>>         if (unlikely(desc->status))
>>>>>             goto corrupted;
>>>>>     -    if (unlikely((tail | head) >= size)) {
>>>>> +    GEM_BUG_ON(tail > size);
>>>>> +
>>>>> +#ifdef CONFIG_DRM_I915_DEBUG_GUC
>>>>> +    if (unlikely(tail != READ_ONCE(desc->tail))) {
>>>>> +        CT_ERROR(ct, "Tail was modified %u != %u\n",
>>>>> +             desc->tail, ctb->tail);
>>>>> +        desc->status |= GUC_CTB_STATUS_MISMATCH;
>>>>> +        goto corrupted;
>>>>> +    }
>>>>> +    if (unlikely((desc->tail | desc->head) >= size)) {
>>>>>             CT_ERROR(ct, "Invalid offsets head=%u tail=%u (size=%u)\n",
>>>>> -             head, tail, size);
>>>>> +             desc->head, desc->tail, size);
>>>>>             desc->status |= GUC_CTB_STATUS_OVERFLOW;
>>>>>             goto corrupted;
>>>>>         }
>>>>> -
>>>>> -    /*
>>>>> -     * tail == head condition indicates empty. GuC FW does not support
>>>>> -     * using up the entire buffer to get tail == head meaning full.
>>>>> -     */
>>>>> -    if (tail < head)
>>>>> -        used = (size - head) + tail;
>>>>> -    else
>>>>> -        used = tail - head;
>>>>> -
>>>>> -    /* make sure there is a space including extra dw for the fence */
>>>>> -    if (unlikely(used + len + GUC_CTB_HDR_LEN >= size))
>>>>> -        return -ENOSPC;
>>>>> +#endif
>>>>>           /*
>>>>>          * dw0: CT header (including fence)
>>>>> @@ -454,7 +453,9 @@ static int ct_write(struct intel_guc_ct *ct,
>>>>>         write_barrier(ct);
>>>>>           /* now update descriptor */
>>>>> +    ctb->tail = tail;
>>>>>         WRITE_ONCE(desc->tail, tail);
>>>>> +    ctb->space -= len + GUC_CTB_HDR_LEN;
>>>>>           return 0;
>>>>>     @@ -470,7 +471,7 @@ static int ct_write(struct intel_guc_ct *ct,
>>>>>      * @req:    pointer to pending request
>>>>>      * @status:    placeholder for status
>>>>>      *
>>>>> - * For each sent request, Guc shall send bac CT response message.
>>>>> + * For each sent request, GuC shall send back CT response message.
>>>>>      * Our message handler will update status of tracked request once
>>>>>      * response message with given fence is received. Wait here and
>>>>>      * check for valid response status value.
>>>>> @@ -526,24 +527,35 @@ static inline bool ct_deadlocked(struct
>>>>> intel_guc_ct *ct)
>>>>>         return ret;
>>>>>     }
>>>>>     -static inline bool h2g_has_room(struct intel_guc_ct_buffer *ctb,
>>>>> u32 len_dw)
>>>>> +static inline bool h2g_has_room(struct intel_guc_ct *ct, u32 len_dw)
>>>>>     {
>>>>> -    struct guc_ct_buffer_desc *desc = ctb->desc;
>>>>> -    u32 head = READ_ONCE(desc->head);
>>>>> +    struct intel_guc_ct_buffer *ctb = &ct->ctbs.send;
>>>>> +    u32 head;
>>>>>         u32 space;
>>>>>     -    space = CIRC_SPACE(desc->tail, head, ctb->size);
>>>>> +    if (ctb->space >= len_dw)
>>>>> +        return true;
>>>>> +
>>>>> +    head = READ_ONCE(ctb->desc->head);
>>>>> +    if (unlikely(head > ctb->size)) {
>>>>> +        CT_ERROR(ct, "Corrupted descriptor head=%u tail=%u size=%u\n",
>>>>> +             ctb->desc->head, ctb->desc->tail, ctb->size);
>>>>> +        ctb->desc->status |= GUC_CTB_STATUS_OVERFLOW;
>>>>> +        ctb->broken = true;
>>>>> +        return false;
>>>>> +    }
>>>>> +
>>>>> +    space = CIRC_SPACE(ctb->tail, head, ctb->size);
>>>>> +    ctb->space = space;
>>>>>           return space >= len_dw;
>>>>>     }
>>>>>       static int has_room_nb(struct intel_guc_ct *ct, u32 len_dw)
>>>>>     {
>>>>> -    struct intel_guc_ct_buffer *ctb = &ct->ctbs.send;
>>>>> -
>>>>>         lockdep_assert_held(&ct->ctbs.send.lock);
>>>>>     -    if (unlikely(!h2g_has_room(ctb, len_dw))) {
>>>>> +    if (unlikely(!h2g_has_room(ct, len_dw))) {
>>>>>             if (ct->stall_time == KTIME_MAX)
>>>>>                 ct->stall_time = ktime_get();
>>>>>     @@ -613,7 +625,7 @@ static int ct_send(struct intel_guc_ct *ct,
>>>>>          */
>>>>>     retry:
>>>>>         spin_lock_irqsave(&ctb->lock, flags);
>>>>> -    if (unlikely(!h2g_has_room(ctb, len + GUC_CTB_HDR_LEN))) {
>>>>> +    if (unlikely(!h2g_has_room(ct, len + GUC_CTB_HDR_LEN))) {
>>>>>             if (ct->stall_time == KTIME_MAX)
>>>>>                 ct->stall_time = ktime_get();
>>>>>             spin_unlock_irqrestore(&ctb->lock, flags);
>>>>> @@ -733,7 +745,7 @@ static int ct_read(struct intel_guc_ct *ct, struct
>>>>> ct_incoming_msg **msg)
>>>>>     {
>>>>>         struct intel_guc_ct_buffer *ctb = &ct->ctbs.recv;
>>>>>         struct guc_ct_buffer_desc *desc = ctb->desc;
>>>>> -    u32 head = desc->head;
>>>>> +    u32 head = ctb->head;
>>>>>         u32 tail = desc->tail;
>>>>>         u32 size = ctb->size;
>>>>>         u32 *cmds = ctb->cmds;
>>>>> @@ -748,12 +760,29 @@ static int ct_read(struct intel_guc_ct *ct,
>>>>> struct ct_incoming_msg **msg)
>>>>>         if (unlikely(desc->status))
>>>>>             goto corrupted;
>>>>>     -    if (unlikely((tail | head) >= size)) {
>>>>> +    GEM_BUG_ON(head > size);
>>>> Is the BUG_ON necessary given that both options below do the same check
>>>> but as a corrupted buffer test (with subsequent recovery by GT reset?)
>>>> rather than killing the driver.
>>> "head" and "size" are now fully owned by the driver.
>>> BUGON here is to make sure driver is coded correctly.
>> The point is that both sides of the #if below also validate head. So
> but not the same "head"
It is from the point of view of is the BUG_ON redundant duplication:

+    if (unlikely(head != READ_ONCE(desc->head))) {



>
> under DEBUG we are validating the one from descriptor (together with
> tail) - and that should be recoverable as if this fails it was clearly
> not our fault.
>
> but under non-DEBUG we were attempting to validate again the local one,
> pretending that this is recoverable, but it is not, as this is our fault
> (elsewhere in i915 we don't attempt to recover from obvious coding errors).
If it happens during driver development when someone is editing code 
then it might be an obvious coding error. If it happens once in a blue 
moon on some customer's system then it is a very hard to track down race 
condition that is totally not obvious at all. The former won't make it 
past CI whether it is a BUG_ON or a CT_ERROR. The latter has just killed 
the end user's kernel rather than attempted to recover (and would have 
been likely to succeed the recovery).

John.

>
>> first there is a BUG_ON, then there is the same test but without blowing
>> the driver apart. One or the other is not required. My vote would be to
>> keep the recoverable test rather than the BUG_ON.
> IMHO we should keep GEMBUGON and drop redundant check in non-DEBUG.
>
> But let Matt decide.
>
> Michal
>
>> John.
>>
>>>>> +
>>>>> +#ifdef CONFIG_DRM_I915_DEBUG_GUC
>>>>> +    if (unlikely(head != READ_ONCE(desc->head))) {
>>>>> +        CT_ERROR(ct, "Head was modified %u != %u\n",
>>>>> +             desc->head, ctb->head);
>>>>> +        desc->status |= GUC_CTB_STATUS_MISMATCH;
>>>>> +        goto corrupted;
>>>>> +    }
>>>>> +    if (unlikely((desc->tail | desc->head) >= size)) {
>>>>> +        CT_ERROR(ct, "Invalid offsets head=%u tail=%u (size=%u)\n",
>>>>> +             head, tail, size);
>>>>> +        desc->status |= GUC_CTB_STATUS_OVERFLOW;
>>>>> +        goto corrupted;
>>>>> +    }
>>>>> +#else
>>>>> +    if (unlikely((tail | ctb->head) >= size)) {
>>>> Could just be 'head' rather than 'ctb->head'.
>>> or drop "ctb->head" completely since this is driver owned field and
>>> above you already have BUGON to test it
>>>
>>> Michal
>>>
>>>> John.
>>>>
>>>>>             CT_ERROR(ct, "Invalid offsets head=%u tail=%u (size=%u)\n",
>>>>>                  head, tail, size);
>>>>>             desc->status |= GUC_CTB_STATUS_OVERFLOW;
>>>>>             goto corrupted;
>>>>>         }
>>>>> +#endif
>>>>>           /* tail == head condition indicates empty */
>>>>>         available = tail - head;
>>>>> @@ -803,6 +832,7 @@ static int ct_read(struct intel_guc_ct *ct, struct
>>>>> ct_incoming_msg **msg)
>>>>>         }
>>>>>         CT_DEBUG(ct, "received %*ph\n", 4 * len, (*msg)->msg);
>>>>>     +    ctb->head = head;
>>>>>         /* now update descriptor */
>>>>>         WRITE_ONCE(desc->head, head);
>>>>>     diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.h
>>>>> b/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.h
>>>>> index bee03794c1eb..edd1bba0445d 100644
>>>>> --- a/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.h
>>>>> +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.h
>>>>> @@ -33,6 +33,9 @@ struct intel_guc;
>>>>>      * @desc: pointer to the buffer descriptor
>>>>>      * @cmds: pointer to the commands buffer
>>>>>      * @size: size of the commands buffer in dwords
>>>>> + * @head: local shadow copy of head in dwords
>>>>> + * @tail: local shadow copy of tail in dwords
>>>>> + * @space: local shadow copy of space in dwords
>>>>>      * @broken: flag to indicate if descriptor data is broken
>>>>>      */
>>>>>     struct intel_guc_ct_buffer {
>>>>> @@ -40,6 +43,9 @@ struct intel_guc_ct_buffer {
>>>>>         struct guc_ct_buffer_desc *desc;
>>>>>         u32 *cmds;
>>>>>         u32 size;
>>>>> +    u32 tail;
>>>>> +    u32 head;
>>>>> +    u32 space;
>>>>>         bool broken;
>>>>>     };
>>>>>     
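For readers following the review comments on the `(tail | ctb->head) >= size` check, here is a minimal standalone sketch of the idea, not the i915 code itself. The struct `ring_shadow` and both helper names are hypothetical, and the sketch assumes the ring size is a power of two. It shows why OR-ing the two offsets catches any out-of-range value in one branch, and how the available dword count is derived from a driver-owned shadow head.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the driver-side buffer state; not the i915 struct. */
struct ring_shadow {
	uint32_t size;	/* ring size in dwords, assumed to be a power of two */
	uint32_t head;	/* driver-owned shadow copy of the read offset, in dwords */
};

/*
 * Single-branch range check. (tail | head) is never smaller than either
 * operand, so an out-of-range offset is always caught. With a power-of-two
 * size the check is also exact: two in-range offsets never OR to >= size.
 */
static bool offsets_valid(const struct ring_shadow *rb, uint32_t tail)
{
	return (tail | rb->head) < rb->size;
}

/* Dwords available to read; tail == head means the ring is empty. */
static uint32_t dwords_available(const struct ring_shadow *rb, uint32_t tail)
{
	int32_t available = (int32_t)tail - (int32_t)rb->head;

	if (available < 0)	/* tail has wrapped around behind head */
		available += (int32_t)rb->size;
	return (uint32_t)available;
}
```

If the shadow head has already been validated elsewhere, as Michal suggests, the check reduces to `tail < rb->size`; the sketch keeps both operands only to mirror the line under review.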

end of thread, other threads:[~2021-07-06 22:41 UTC | newest]

Thread overview: 38+ messages
2021-07-01 17:15 [PATCH 0/7] CT changes required for GuC submission Matthew Brost
2021-07-01 17:15 ` [Intel-gfx] " Matthew Brost
2021-07-01 17:15 ` [PATCH 1/7] drm/i915/guc: Relax CTB response timeout Matthew Brost
2021-07-01 17:15   ` [Intel-gfx] " Matthew Brost
2021-07-01 17:15 ` [PATCH 2/7] drm/i915/guc: Improve error message for unsolicited CT response Matthew Brost
2021-07-01 17:15   ` [Intel-gfx] " Matthew Brost
2021-07-01 17:15 ` [PATCH 3/7] drm/i915/guc: Increase size of CTB buffers Matthew Brost
2021-07-01 17:15   ` [Intel-gfx] " Matthew Brost
2021-07-01 17:15 ` [PATCH 4/7] drm/i915/guc: Add non blocking CTB send function Matthew Brost
2021-07-01 17:15   ` [Intel-gfx] " Matthew Brost
2021-07-06 18:12   ` John Harrison
2021-07-06 18:12     ` [Intel-gfx] " John Harrison
2021-07-06 18:17     ` Matthew Brost
2021-07-06 18:17       ` [Intel-gfx] " Matthew Brost
2021-07-01 17:15 ` [PATCH 5/7] drm/i915/guc: Add stall timer to " Matthew Brost
2021-07-01 17:15   ` [Intel-gfx] " Matthew Brost
2021-07-06 18:27   ` John Harrison
2021-07-06 18:27     ` [Intel-gfx] " John Harrison
2021-07-01 17:15 ` [PATCH 6/7] drm/i915/guc: Optimize CTB writes and reads Matthew Brost
2021-07-01 17:15   ` [Intel-gfx] " Matthew Brost
2021-07-06 19:00   ` John Harrison
2021-07-06 19:00     ` [Intel-gfx] " John Harrison
2021-07-06 19:12     ` Michal Wajdeczko
2021-07-06 19:12       ` [Intel-gfx] " Michal Wajdeczko
2021-07-06 19:19       ` John Harrison
2021-07-06 19:19         ` [Intel-gfx] " John Harrison
2021-07-06 19:33         ` Michal Wajdeczko
2021-07-06 19:33           ` [Intel-gfx] " Michal Wajdeczko
2021-07-06 21:22           ` Matthew Brost
2021-07-06 21:22             ` [Intel-gfx] " Matthew Brost
2021-07-06 22:41           ` John Harrison
2021-07-06 22:41             ` [Intel-gfx] " John Harrison
2021-07-01 17:15 ` [PATCH 7/7] drm/i915/guc: Module load failure test for CT buffer creation Matthew Brost
2021-07-01 17:15   ` [Intel-gfx] " Matthew Brost
2021-07-01 23:20 ` [Intel-gfx] ✗ Fi.CI.CHECKPATCH: warning for CT changes required for GuC submission (rev2) Patchwork
2021-07-01 23:21 ` [Intel-gfx] ✗ Fi.CI.SPARSE: " Patchwork
2021-07-01 23:49 ` [Intel-gfx] ✓ Fi.CI.BAT: success " Patchwork
2021-07-02  6:55 ` [Intel-gfx] ✗ Fi.CI.IGT: failure " Patchwork
