From: Oded Gabbay <oded.gabbay@gmail.com>
To: linux-kernel@vger.kernel.org, oshpigelman@habana.ai,
ttayar@habana.ai, gregkh@linuxfoundation.org
Subject: [PATCH 7/9] habanalabs: protect only pointer dereference in hard-reset
Date: Sun, 28 Jul 2019 14:28:16 +0300
Message-ID: <20190728112818.30397-8-oded.gabbay@gmail.com>
In-Reply-To: <20190728112818.30397-1-oded.gabbay@gmail.com>
This patch changes where the fpriv_list_lock mutex is taken and released
during the hard-reset process of the ASIC.
The only place that needs protection is where we dereference pointers that
may go away if the user process aborts or closes the FD.
That way, the user process can actually close its FD after we tell it that
an error occurred.
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
---
drivers/misc/habanalabs/device.c | 26 ++++++++++++++------------
1 file changed, 14 insertions(+), 12 deletions(-)
diff --git a/drivers/misc/habanalabs/device.c b/drivers/misc/habanalabs/device.c
index 5400e65ba5fa..471506b54217 100644
--- a/drivers/misc/habanalabs/device.c
+++ b/drivers/misc/habanalabs/device.c
@@ -574,20 +574,21 @@ static void device_kill_open_processes(struct hl_device *hdev)
else
pending_total = HL_PENDING_RESET_PER_SEC;
- pending_cnt = pending_total;
-
/* Flush all processes that are inside hl_open */
mutex_lock(&hdev->fpriv_list_lock);
+ mutex_unlock(&hdev->fpriv_list_lock);
- while ((!list_empty(&hdev->fpriv_list)) && (pending_cnt)) {
-
- pending_cnt--;
-
- dev_info(hdev->dev,
- "Can't HARD reset, waiting for user to close FD\n");
+ /* Giving time for user to close FD, and for processes that are inside
+ * hl_device_open to finish
+ */
+ if (!list_empty(&hdev->fpriv_list))
ssleep(1);
- }
+ mutex_lock(&hdev->fpriv_list_lock);
+
+ /* This section must be protected because we are dereferencing
+ * pointers that are freed if the process exits
+ */
if (!list_empty(&hdev->fpriv_list)) {
task = get_pid_task(hdev->compute_ctx->hpriv->taskpid,
PIDTYPE_PID);
@@ -600,6 +601,8 @@ static void device_kill_open_processes(struct hl_device *hdev)
}
}
+ mutex_unlock(&hdev->fpriv_list_lock);
+
/* We killed the open users, but because the driver cleans up after the
* user contexts are closed (e.g. mmu mappings), we need to wait again
* to make sure the cleaning phase is finished before continuing with
@@ -609,6 +612,8 @@ static void device_kill_open_processes(struct hl_device *hdev)
pending_cnt = pending_total;
while ((!list_empty(&hdev->fpriv_list)) && (pending_cnt)) {
+ dev_info(hdev->dev,
+ "Waiting for all unmap operations to finish before hard reset\n");
pending_cnt--;
@@ -618,9 +623,6 @@ static void device_kill_open_processes(struct hl_device *hdev)
if (!list_empty(&hdev->fpriv_list))
dev_crit(hdev->dev,
"Going to hard reset with open user contexts\n");
-
- mutex_unlock(&hdev->fpriv_list_lock);
-
}
static void device_hard_reset_pending(struct work_struct *work)
--
2.17.1
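The locking change described above can be illustrated with a small userspace
sketch. This is not the driver code: it uses pthreads in place of the kernel
mutex API, and struct client, struct device, client_close() and
pick_pid_to_kill() are hypothetical names invented for the illustration. It
only shows the idea of dropping the lock while waiting and retaking it just
for the pointer dereference, so that the close path is never blocked by the
reset path.

```c
/* Userspace analogue with pthreads; all names here are hypothetical. */
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Stand-in for a per-open-file object that goes away on close. */
struct client {
	int pid;
};

/* Stand-in for the device: a single slot instead of a list. */
struct device {
	pthread_mutex_t lock;	/* plays the role of fpriv_list_lock */
	struct client *active;	/* NULL once the client closed its FD */
};

/* Close path: it needs the lock, so the reset path must not hold it long. */
static void client_close(struct device *dev)
{
	pthread_mutex_lock(&dev->lock);
	dev->active = NULL;
	pthread_mutex_unlock(&dev->lock);
}

/*
 * Reset path: returns the pid that would be killed, or -1 if the client
 * already went away. wait_fn stands in for the sleep and runs unlocked.
 */
static int pick_pid_to_kill(struct device *dev,
			    void (*wait_fn)(struct device *))
{
	int pid = -1;

	assert(dev != NULL);

	/* Lock and unlock once to flush anyone inside the open path. */
	pthread_mutex_lock(&dev->lock);
	pthread_mutex_unlock(&dev->lock);

	/* Wait with the lock dropped so the close path can make progress. */
	if (wait_fn)
		wait_fn(dev);

	/* Hold the lock only while dereferencing the pointer. */
	pthread_mutex_lock(&dev->lock);
	if (dev->active)
		pid = dev->active->pid;
	pthread_mutex_unlock(&dev->lock);

	return pid;
}
```

Passing client_close as wait_fn simulates a user that closes its FD during
the wait. With the lock held across the wait, as before this patch, that
close would block until the reset path released the lock.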
Thread overview: 16+ messages
2019-07-28 11:28 [PATCH 0/9] habanalabs: support open device by multiple processes Oded Gabbay
2019-07-28 11:28 ` [PATCH 1/9] habanalabs: add handle field to context structure Oded Gabbay
2019-07-28 11:28 ` [PATCH 2/9] habanalabs: verify context is valid in IOCTLs Oded Gabbay
2019-07-28 11:28 ` [PATCH 3/9] habanalabs: create context in lazy mode Oded Gabbay
2019-07-28 11:28 ` [PATCH 4/9] habanalabs: don't change frequency if user context is valid Oded Gabbay
2019-07-28 11:28 ` [PATCH 5/9] habanalabs: maintain a list of file private data objects Oded Gabbay
2019-07-28 11:28 ` [PATCH 6/9] habanalabs: define user context as compute context Oded Gabbay
2019-07-28 11:28 ` Oded Gabbay [this message]
2019-07-28 11:28 ` [PATCH 8/9] habanalabs: kill user process after CS rollback Oded Gabbay
2019-07-28 11:28 ` [PATCH 9/9] habanalabs: allow multiple processes to open FD Oded Gabbay
2019-07-28 11:44 ` Greg KH
2019-07-28 11:56 ` Oded Gabbay
2019-07-28 12:04 ` Greg KH
2019-07-28 12:06 ` Oded Gabbay
2019-07-28 12:12 ` Greg KH
2019-07-28 12:18 ` Oded Gabbay