All of lore.kernel.org
 help / color / mirror / Atom feed
From: Oded Gabbay <ogabbay@kernel.org>
To: linux-kernel@vger.kernel.org
Cc: farah kassabri <fkassabri@habana.ai>
Subject: [PATCH 2/2] habanalabs: fix kernel OOPs related to staged cs
Date: Wed,  1 Sep 2021 19:33:09 +0300	[thread overview]
Message-ID: <20210901163309.112126-2-ogabbay@kernel.org> (raw)
In-Reply-To: <20210901163309.112126-1-ogabbay@kernel.org>

From: farah kassabri <fkassabri@habana.ai>

In case of single staged cs with both first/last indications
set, we reach a scenario where in cs_release function flow
we don't cancel the TDR work before freeing the cs memory,
this lead to kernel OOPs since when the timer expires
the work pointer will be freed already.
In addition treat wait encaps cs "not found" handle
as "OK" for the user in order to keep the user interface
for both legacy and encpas signal/wait features the same.

Signed-off-by: farah kassabri <fkassabri@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
---
 .../habanalabs/common/command_submission.c     | 18 +++++++++++++-----
 1 file changed, 13 insertions(+), 5 deletions(-)

diff --git a/drivers/misc/habanalabs/common/command_submission.c b/drivers/misc/habanalabs/common/command_submission.c
index 9a8b9191c28c..deb080830ecb 100644
--- a/drivers/misc/habanalabs/common/command_submission.c
+++ b/drivers/misc/habanalabs/common/command_submission.c
@@ -405,7 +405,7 @@ static void staged_cs_put(struct hl_device *hdev, struct hl_cs *cs)
 static void cs_handle_tdr(struct hl_device *hdev, struct hl_cs *cs)
 {
 	bool next_entry_found = false;
-	struct hl_cs *next;
+	struct hl_cs *next, *first_cs;
 
 	if (!cs_needs_timeout(cs))
 		return;
@@ -415,9 +415,16 @@ static void cs_handle_tdr(struct hl_device *hdev, struct hl_cs *cs)
 	/* We need to handle tdr only once for the complete staged submission.
 	 * Hence, we choose the CS that reaches this function first which is
 	 * the CS marked as 'staged_last'.
+	 * In case single staged cs was submitted which has both first and last
+	 * indications, then "cs_find_first" below will return NULL, since we
+	 * removed the cs node from the list before getting here,
+	 * in such cases just continue with the cs to cancel it's TDR work.
 	 */
-	if (cs->staged_cs && cs->staged_last)
-		cs = hl_staged_cs_find_first(hdev, cs->staged_sequence);
+	if (cs->staged_cs && cs->staged_last) {
+		first_cs = hl_staged_cs_find_first(hdev, cs->staged_sequence);
+		if (first_cs)
+			cs = first_cs;
+	}
 
 	spin_unlock(&hdev->cs_mirror_lock);
 
@@ -2026,9 +2033,10 @@ static int cs_ioctl_signal_wait(struct hl_fpriv *hpriv, enum hl_cs_type cs_type,
 			spin_unlock(&ctx->sig_mgr.lock);
 
 			if (!handle_found) {
-				dev_err(hdev->dev, "Cannot find encapsulated signals handle for seq 0x%llx\n",
+				/* treat as signal CS already finished */
+				dev_dbg(hdev->dev, "Cannot find encapsulated signals handle for seq 0x%llx\n",
 						signal_seq);
-				rc = -EINVAL;
+				rc = 0;
 				goto free_cs_chunk_array;
 			}
 
-- 
2.17.1


      reply	other threads:[~2021-09-01 16:33 UTC|newest]

Thread overview: 2+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-09-01 16:33 [PATCH 1/2] habanalabs: fix potential race in interrupt wait ioctl Oded Gabbay
2021-09-01 16:33 ` Oded Gabbay [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20210901163309.112126-2-ogabbay@kernel.org \
    --to=ogabbay@kernel.org \
    --cc=fkassabri@habana.ai \
    --cc=linux-kernel@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.