linux-kernel.vger.kernel.org archive mirror
* [PATCH] habanalabs: count dropped CS because max CS in-flight
@ 2020-09-04 15:11 Oded Gabbay
  2020-09-07 15:15 ` Tomer Tayar
  0 siblings, 1 reply; 2+ messages in thread
From: Oded Gabbay @ 2020-09-04 15:11 UTC (permalink / raw)
  To: linux-kernel, SW_Drivers

There is a case where the user reaches the maximum number of CS in-flight.
In that case, the driver rejects the new CS of the user with EAGAIN. Count
that event so the user can query the driver later to see if it happened.

Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
---
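Note (not part of the commit): the behavior described above can be
sketched as a small userspace model. This is only an illustration under
stated assumptions, not the driver code. All model_* names and the ring
size of 4 are hypothetical; the real driver takes the size from
asic_prop.max_pending_cs and checks a completion object rather than a
boolean flag. Only the two counter names come from the patch.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical ring size; the driver reads it from asic_prop.max_pending_cs. */
#define MODEL_MAX_PENDING_CS 4u

/* Subset of struct hl_cs_counters, using the field names from the patch. */
struct model_cs_counters {
	uint64_t queue_full_drop_cnt;
	uint64_t max_cs_in_flight_drop_cnt;
};

/* Simplified stand-in for struct hl_ctx (assumption, not the real layout). */
struct model_ctx {
	uint64_t cs_sequence;                 /* next sequence number to hand out */
	bool in_flight[MODEL_MAX_PENDING_CS]; /* true while that slot's CS is pending */
	struct model_cs_counters cs_counters;
};

/* Simplified stand-in for struct hl_device. */
struct model_dev {
	struct model_cs_counters aggregated_cs_counters;
};

static void model_ctx_init(struct model_ctx *ctx)
{
	/* The index mask used below only works for a power-of-two ring size. */
	assert((MODEL_MAX_PENDING_CS & (MODEL_MAX_PENDING_CS - 1)) == 0);
	memset(ctx, 0, sizeof(*ctx));
}

/* Returns 0 and claims a slot, or -EAGAIN after counting the drop. */
static int model_submit_cs(struct model_ctx *ctx)
{
	size_t slot = (size_t)(ctx->cs_sequence & (MODEL_MAX_PENDING_CS - 1));

	if (ctx->in_flight[slot]) {
		/* The CS that owns this slot has not completed: reject and count. */
		ctx->cs_counters.max_cs_in_flight_drop_cnt++;
		return -EAGAIN;
	}

	ctx->in_flight[slot] = true;
	ctx->cs_sequence++;
	return 0;
}

/* Marks the CS with sequence number seq as completed, freeing its slot. */
static void model_complete_cs(struct model_ctx *ctx, uint64_t seq)
{
	ctx->in_flight[seq & (MODEL_MAX_PENDING_CS - 1)] = false;
}

/* Folds the per-context counters into the device totals. */
static void model_counters_aggregate(struct model_dev *dev,
				     const struct model_ctx *ctx)
{
	dev->aggregated_cs_counters.queue_full_drop_cnt +=
			ctx->cs_counters.queue_full_drop_cnt;
	dev->aggregated_cs_counters.max_cs_in_flight_drop_cnt +=
			ctx->cs_counters.max_cs_in_flight_drop_cnt;
}
```

In this model, a caller that fills every slot and submits again gets
-EAGAIN, and each rejection bumps the per-context counter. The count
survives the context because it is added to the device totals, which is
the role cs_counters_aggregate() plays in the patch.
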
 drivers/misc/habanalabs/common/command_submission.c | 5 ++++-
 include/uapi/misc/habanalabs.h                      | 2 ++
 2 files changed, 6 insertions(+), 1 deletion(-)

diff --git a/drivers/misc/habanalabs/common/command_submission.c b/drivers/misc/habanalabs/common/command_submission.c
index a811a9fdf13b..470bffbe9bdc 100644
--- a/drivers/misc/habanalabs/common/command_submission.c
+++ b/drivers/misc/habanalabs/common/command_submission.c
@@ -252,6 +252,8 @@ static void cs_counters_aggregate(struct hl_device *hdev, struct hl_ctx *ctx)
 			ctx->cs_counters.parsing_drop_cnt;
 	hdev->aggregated_cs_counters.queue_full_drop_cnt +=
 			ctx->cs_counters.queue_full_drop_cnt;
+	hdev->aggregated_cs_counters.max_cs_in_flight_drop_cnt +=
+			ctx->cs_counters.max_cs_in_flight_drop_cnt;
 }
 
 static void cs_do_release(struct kref *ref)
@@ -431,8 +433,9 @@ static int allocate_cs(struct hl_device *hdev, struct hl_ctx *ctx,
 				(hdev->asic_prop.max_pending_cs - 1)];
 
 	if (other && !completion_done(&other->completion)) {
-		dev_dbg(hdev->dev,
+		dev_dbg_ratelimited(hdev->dev,
 			"Rejecting CS because of too many in-flights CS\n");
+		ctx->cs_counters.max_cs_in_flight_drop_cnt++;
 		rc = -EAGAIN;
 		goto free_fence;
 	}
diff --git a/include/uapi/misc/habanalabs.h b/include/uapi/misc/habanalabs.h
index a2dcad29340f..69fb44d35292 100644
--- a/include/uapi/misc/habanalabs.h
+++ b/include/uapi/misc/habanalabs.h
@@ -401,12 +401,14 @@ struct hl_info_sync_manager {
  * @parsing_drop_cnt: dropped due to error in packet parsing
  * @queue_full_drop_cnt: dropped due to queue full
  * @device_in_reset_drop_cnt: dropped due to device in reset
+ * @max_cs_in_flight_drop_cnt: dropped due to maximum CS in-flight
  */
 struct hl_cs_counters {
 	__u64 out_of_mem_drop_cnt;
 	__u64 parsing_drop_cnt;
 	__u64 queue_full_drop_cnt;
 	__u64 device_in_reset_drop_cnt;
+	__u64 max_cs_in_flight_drop_cnt;
 };
 
 struct hl_info_cs_counters {
-- 
2.17.1



* RE: [PATCH] habanalabs: count dropped CS because max CS in-flight
  2020-09-04 15:11 [PATCH] habanalabs: count dropped CS because max CS in-flight Oded Gabbay
@ 2020-09-07 15:15 ` Tomer Tayar
  0 siblings, 0 replies; 2+ messages in thread
From: Tomer Tayar @ 2020-09-07 15:15 UTC (permalink / raw)
  To: Oded Gabbay, linux-kernel, SW_Drivers

On Fri, Sep 4, 2020 at 18:11 Oded Gabbay <oded.gabbay@gmail.com> wrote:
> There is a case where the user reaches the maximum number of CS in-flight.
> In that case, the driver rejects the new CS of the user with EAGAIN. Count
> that event so the user can query the driver later to see if it happened.
> 
> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>

Reviewed-by: Tomer Tayar <ttayar@habana.ai>

