From: Oded Gabbay <ogabbay@kernel.org>
To: linux-kernel@vger.kernel.org
Cc: Tomer Tayar <ttayar@habana.ai>
Subject: [PATCH 09/15] habanalabs/gaudi2: use graceful hard reset for F/W events
Date: Thu, 27 Oct 2022 12:10:01 +0300
Message-ID: <20221027091007.664797-9-ogabbay@kernel.org>
In-Reply-To: <20221027091007.664797-1-ogabbay@kernel.org>

From: Tomer Tayar <ttayar@habana.ai>

On Gaudi2 devices, use a graceful hard reset for F/W events that
require a device reset.

While at it, do a small refactor of the checks and function calls
to simplify them and to avoid code duplication.

Signed-off-by: Tomer Tayar <ttayar@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
---
 drivers/misc/habanalabs/gaudi2/gaudi2.c | 27 +++++++++----------------
 1 file changed, 10 insertions(+), 17 deletions(-)

diff --git a/drivers/misc/habanalabs/gaudi2/gaudi2.c b/drivers/misc/habanalabs/gaudi2/gaudi2.c
index 9208f69dd7f8..22f5445fe71c 100644
--- a/drivers/misc/habanalabs/gaudi2/gaudi2.c
+++ b/drivers/misc/habanalabs/gaudi2/gaudi2.c
@@ -8768,9 +8768,9 @@ static void hl_arc_event_handle(struct hl_device *hdev,
 
 static void gaudi2_handle_eqe(struct hl_device *hdev, struct hl_eq_entry *eq_entry)
 {
-	u32 ctl, reset_flags = HL_DRV_RESET_HARD | HL_DRV_RESET_DELAY;
-	struct gaudi2_device *gaudi2 = hdev->asic_specific;
 	bool reset_required = false, skip_reset = false, is_critical = false;
+	struct gaudi2_device *gaudi2 = hdev->asic_specific;
+	u32 ctl, reset_flags = HL_DRV_RESET_HARD;
 	int index, sbte_index;
 	u64 event_mask = 0;
 	u16 event_type;
@@ -9158,7 +9158,9 @@ static void gaudi2_handle_eqe(struct hl_device *hdev, struct hl_eq_entry *eq_ent
 						event_type);
 	}
 
-	if ((gaudi2_irq_map_table[event_type].reset || reset_required) && !skip_reset)
+	if ((gaudi2_irq_map_table[event_type].reset || reset_required) && !skip_reset &&
+			(hdev->hard_reset_on_fw_events ||
+			(hdev->asic_prop.fw_security_enabled && is_critical)))
 		goto reset_device;
 
 	/* Send unmask irq only for interrupts not classified as MSG */
@@ -9172,22 +9174,13 @@ static void gaudi2_handle_eqe(struct hl_device *hdev, struct hl_eq_entry *eq_ent
 
 reset_device:
 	if (hdev->asic_prop.fw_security_enabled && is_critical) {
-		reset_flags = HL_DRV_RESET_HARD | HL_DRV_RESET_BYPASS_REQ_TO_FW;
-
-		/* notify on device unavailable while the reset triggered by fw */
-		event_mask |= (HL_NOTIFIER_EVENT_DEVICE_RESET |
-					HL_NOTIFIER_EVENT_DEVICE_UNAVAILABLE);
-		hl_device_reset(hdev, reset_flags);
-	} else if (hdev->hard_reset_on_fw_events) {
-		event_mask |= HL_NOTIFIER_EVENT_DEVICE_RESET;
-		hl_device_reset(hdev, reset_flags);
+		reset_flags |= HL_DRV_RESET_BYPASS_REQ_TO_FW;
+		event_mask |= HL_NOTIFIER_EVENT_DEVICE_UNAVAILABLE;
 	} else {
-		if (!gaudi2_irq_map_table[event_type].msg)
-			hl_fw_unmask_irq(hdev, event_type);
+		reset_flags |= HL_DRV_RESET_DELAY;
 	}
-
-	if (event_mask)
-		hl_notifier_event_send_all(hdev, event_mask);
+	event_mask |= HL_NOTIFIER_EVENT_DEVICE_RESET;
+	hl_device_cond_reset(hdev, reset_flags, event_mask);
 }
 
 static int gaudi2_memset_device_memory(struct hl_device *hdev, u64 addr, u64 size, u64 val)
-- 
2.25.1
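
For readers following the refactor, the post-patch decision flow can be
modelled outside the driver. The sketch below is only an illustration
under stated assumptions: the sk_* names, the two structs and the bit
values are hypothetical stand-ins and are not the driver's definitions.
It also ignores any event_mask bits set earlier in gaudi2_handle_eqe()
and does not call hl_device_cond_reset(); it only computes what would be
passed to it.

```c
#include <stdbool.h>
#include <stdint.h>

/* Placeholder bit values chosen for this sketch only; the real
 * HL_DRV_RESET_* and HL_NOTIFIER_EVENT_* values live in the driver. */
#define SK_RESET_HARD			(1u << 0)
#define SK_RESET_DELAY			(1u << 1)
#define SK_RESET_BYPASS_REQ_TO_FW	(1u << 2)

#define SK_EVENT_DEVICE_RESET		(1ull << 0)
#define SK_EVENT_DEVICE_UNAVAILABLE	(1ull << 1)

/* Hypothetical snapshot of the inputs the handler looks at. */
struct sk_event_state {
	bool table_reset;		/* gaudi2_irq_map_table[event_type].reset */
	bool reset_required;
	bool skip_reset;
	bool hard_reset_on_fw_events;
	bool fw_security_enabled;
	bool is_critical;
};

/* Hypothetical result: whether to reset, and with which arguments. */
struct sk_reset_decision {
	bool do_reset;
	uint32_t reset_flags;
	uint64_t event_mask;
};

static struct sk_reset_decision sk_decide_reset(const struct sk_event_state *s)
{
	struct sk_reset_decision d = { .do_reset = false, .reset_flags = 0, .event_mask = 0 };
	bool fw_critical = s->fw_security_enabled && s->is_critical;

	/* Mirrors the widened condition in front of "goto reset_device". */
	if (!((s->table_reset || s->reset_required) && !s->skip_reset &&
	      (s->hard_reset_on_fw_events || fw_critical)))
		return d;	/* no reset; the handler continues to the unmask path */

	d.do_reset = true;
	d.reset_flags = SK_RESET_HARD;

	if (fw_critical) {
		/* The reset is triggered by the F/W, so skip the request to it. */
		d.reset_flags |= SK_RESET_BYPASS_REQ_TO_FW;
		d.event_mask |= SK_EVENT_DEVICE_UNAVAILABLE;
	} else {
		d.reset_flags |= SK_RESET_DELAY;
	}

	/* Every reset path now reports the device-reset event. */
	d.event_mask |= SK_EVENT_DEVICE_RESET;
	return d;
}
```

In this model an event whose table entry asks for a reset, but with
hard_reset_on_fw_events clear and no critical event under F/W security,
yields do_reset == false. That matches the patch moving the check in
front of the goto instead of handling the case in a trailing else branch.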

