From: Oded Gabbay <ogabbay@kernel.org>
To: linux-kernel@vger.kernel.org
Cc: Tomer Tayar <ttayar@habana.ai>
Subject: [PATCH 10/15] habanalabs: use graceful hard reset for CS timeouts
Date: Thu, 27 Oct 2022 12:10:02 +0300
Message-ID: <20221027091007.664797-10-ogabbay@kernel.org>
In-Reply-To: <20221027091007.664797-1-ogabbay@kernel.org>

From: Tomer Tayar <ttayar@habana.ai>

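Use a graceful hard reset when a command submission (CS) timeout is
detected that requires a device reset.

The notifier event mask is now accumulated in cs_timedout() and passed
to hl_device_cond_reset() together with the reset flags. When no reset
is needed, the accumulated events are still sent directly through
hl_notifier_event_send_all().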

Signed-off-by: Tomer Tayar <ttayar@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
---
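The decision flow that cs_timedout() follows after this patch can be
modeled in isolation. The block below is only a sketch under stated
assumptions, not driver code: the flag values, the enum, the struct and
the function name cs_timeout_outcome() are hypothetical stand-ins, and
record_timeout stands for the `if (rc)` branch whose condition is not
visible in the hunk.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in values; the real HL_NOTIFIER_EVENT_* flags
 * are defined by the driver and are not shown in this patch. */
#define EVENT_CS_TIMEOUT   (1ULL << 0)
#define EVENT_DEVICE_RESET (1ULL << 1)

/* Hypothetical model of what the function ends up calling. */
enum action {
	ACTION_NONE,        /* nothing is sent */
	ACTION_NOTIFY_ONLY, /* models hl_notifier_event_send_all() */
	ACTION_COND_RESET,  /* models hl_device_cond_reset() */
};

struct outcome {
	enum action action;
	uint64_t event_mask;
};

/* record_timeout models the `if (rc)` branch in the diff. */
static struct outcome cs_timeout_outcome(bool record_timeout,
					 bool device_reset)
{
	/* The mask starts at zero, as in the patched code. */
	struct outcome out = { ACTION_NONE, 0 };

	if (record_timeout)
		out.event_mask |= EVENT_CS_TIMEOUT;

	if (device_reset) {
		/* The reset event joins the mask, which is then handed
		 * to the reset path instead of being sent right away. */
		out.event_mask |= EVENT_DEVICE_RESET;
		out.action = ACTION_COND_RESET;
	} else if (out.event_mask) {
		/* No reset needed: send the pending events directly. */
		out.action = ACTION_NOTIFY_ONLY;
	}

	return out;
}
```

In this model a call with record_timeout false and device_reset true
still carries the device-reset event into the reset path, and a call
with both false sends nothing.
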
 .../misc/habanalabs/common/command_submission.c  | 16 ++++++++--------
 1 file changed, 8 insertions(+), 8 deletions(-)

diff --git a/drivers/misc/habanalabs/common/command_submission.c b/drivers/misc/habanalabs/common/command_submission.c
index fa05770865c6..f1c69c8ed74a 100644
--- a/drivers/misc/habanalabs/common/command_submission.c
+++ b/drivers/misc/habanalabs/common/command_submission.c
@@ -798,7 +798,7 @@ static void cs_do_release(struct kref *ref)
 static void cs_timedout(struct work_struct *work)
 {
 	struct hl_device *hdev;
-	u64 event_mask;
+	u64 event_mask = 0x0;
 	int rc;
 	struct hl_cs *cs = container_of(work, struct hl_cs,
 						 work_tdr.work);
@@ -830,11 +830,7 @@ static void cs_timedout(struct work_struct *work)
 	if (rc) {
 		hdev->captured_err_info.cs_timeout.timestamp = ktime_get();
 		hdev->captured_err_info.cs_timeout.seq = cs->sequence;
-
-		event_mask = device_reset ? (HL_NOTIFIER_EVENT_CS_TIMEOUT |
-				HL_NOTIFIER_EVENT_DEVICE_RESET) : HL_NOTIFIER_EVENT_CS_TIMEOUT;
-
-		hl_notifier_event_send_all(hdev, event_mask);
+		event_mask |= HL_NOTIFIER_EVENT_CS_TIMEOUT;
 	}
 
 	switch (cs->type) {
@@ -869,8 +865,12 @@ static void cs_timedout(struct work_struct *work)
 
 	cs_put(cs);
 
-	if (device_reset)
-		hl_device_reset(hdev, HL_DRV_RESET_TDR);
+	if (device_reset) {
+		event_mask |= HL_NOTIFIER_EVENT_DEVICE_RESET;
+		hl_device_cond_reset(hdev, HL_DRV_RESET_TDR, event_mask);
+	} else if (event_mask) {
+		hl_notifier_event_send_all(hdev, event_mask);
+	}
 }
 
 static int allocate_cs(struct hl_device *hdev, struct hl_ctx *ctx,
-- 
2.25.1


