[PATCH 1/7] habanalabs: add terminating NULL to attrs arrays


* [PATCH 1/7] habanalabs: add terminating NULL to attrs arrays
@ 2022-06-05 10:33 Oded Gabbay
  2022-06-05 10:33 ` [PATCH 2/7] habanalabs: align ioctl uapi structures to 64-bit Oded Gabbay
                   ` (5 more replies)
  0 siblings, 6 replies; 7+ messages in thread
From: Oded Gabbay @ 2022-06-05 10:33 UTC (permalink / raw)
  To: linux-kernel; +Cc: Dafna Hirschfeld

From: Dafna Hirschfeld <dhirschfeld@habana.ai>

Arrays of struct attribute are expected to be NULL terminated.
This is required by API functions such as device_add_groups().
This fixes a crash when loading the driver for a Goya device.

Signed-off-by: Dafna Hirschfeld <dhirschfeld@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
---
 drivers/misc/habanalabs/common/sysfs.c    | 2 ++
 drivers/misc/habanalabs/gaudi/gaudi.c     | 1 +
 drivers/misc/habanalabs/goya/goya_hwmgr.c | 2 ++
 3 files changed, 5 insertions(+)

diff --git a/drivers/misc/habanalabs/common/sysfs.c b/drivers/misc/habanalabs/common/sysfs.c
index 9ebeb18ab85e..da8181068895 100644
--- a/drivers/misc/habanalabs/common/sysfs.c
+++ b/drivers/misc/habanalabs/common/sysfs.c
@@ -73,6 +73,7 @@ static DEVICE_ATTR_RO(clk_cur_freq_mhz);
 static struct attribute *hl_dev_clk_attrs[] = {
 	&dev_attr_clk_max_freq_mhz.attr,
 	&dev_attr_clk_cur_freq_mhz.attr,
+	NULL,
 };
 
 static ssize_t vrm_ver_show(struct device *dev, struct device_attribute *attr, char *buf)
@@ -93,6 +94,7 @@ static DEVICE_ATTR_RO(vrm_ver);
 
 static struct attribute *hl_dev_vrm_attrs[] = {
 	&dev_attr_vrm_ver.attr,
+	NULL,
 };
 
 static ssize_t uboot_ver_show(struct device *dev, struct device_attribute *attr,
diff --git a/drivers/misc/habanalabs/gaudi/gaudi.c b/drivers/misc/habanalabs/gaudi/gaudi.c
index fba322241096..25d735aee6a3 100644
--- a/drivers/misc/habanalabs/gaudi/gaudi.c
+++ b/drivers/misc/habanalabs/gaudi/gaudi.c
@@ -9187,6 +9187,7 @@ static DEVICE_ATTR_RO(infineon_ver);
 
 static struct attribute *gaudi_vrm_dev_attrs[] = {
 	&dev_attr_infineon_ver.attr,
+	NULL,
 };
 
 static void gaudi_add_device_attr(struct hl_device *hdev, struct attribute_group *dev_clk_attr_grp,
diff --git a/drivers/misc/habanalabs/goya/goya_hwmgr.c b/drivers/misc/habanalabs/goya/goya_hwmgr.c
index 6580fc6a486a..b595721751c1 100644
--- a/drivers/misc/habanalabs/goya/goya_hwmgr.c
+++ b/drivers/misc/habanalabs/goya/goya_hwmgr.c
@@ -359,6 +359,7 @@ static struct attribute *goya_clk_dev_attrs[] = {
 	&dev_attr_pm_mng_profile.attr,
 	&dev_attr_tpc_clk.attr,
 	&dev_attr_tpc_clk_curr.attr,
+	NULL,
 };
 
 static ssize_t infineon_ver_show(struct device *dev, struct device_attribute *attr, char *buf)
@@ -375,6 +376,7 @@ static DEVICE_ATTR_RO(infineon_ver);
 
 static struct attribute *goya_vrm_dev_attrs[] = {
 	&dev_attr_infineon_ver.attr,
+	NULL,
 };
 
 void goya_add_device_attr(struct hl_device *hdev, struct attribute_group *dev_clk_attr_grp,
-- 
2.25.1
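
For readers unfamiliar with the convention, the following is a minimal userspace sketch of why the sentinel matters. It is not kernel code: struct attr, count_attrs() and clk_attrs[] are hypothetical stand-ins for struct attribute, a sysfs consumer, and hl_dev_clk_attrs[].

```c
#include <stddef.h>

/* Hypothetical stand-in for the kernel's struct attribute. */
struct attr {
	const char *name;
};

/*
 * Count entries by walking until the NULL sentinel, the way a consumer of
 * a NULL-terminated pointer array has to. Without the sentinel this loop
 * would read past the end of the array.
 */
static size_t count_attrs(struct attr *const *attrs)
{
	size_t n = 0;

	while (attrs[n] != NULL)
		n++;
	return n;
}

static struct attr clk_max = { "clk_max_freq_mhz" };
static struct attr clk_cur = { "clk_cur_freq_mhz" };

/* Terminated array, mirroring the fixed hl_dev_clk_attrs[]. */
static struct attr *const clk_attrs[] = {
	&clk_max,
	&clk_cur,
	NULL,
};
```

The consumer receives only a pointer, so it has no array length to rely on; the trailing NULL is the only end marker it can see.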



* [PATCH 2/7] habanalabs: align ioctl uapi structures to 64-bit
  2022-06-05 10:33 [PATCH 1/7] habanalabs: add terminating NULL to attrs arrays Oded Gabbay
@ 2022-06-05 10:33 ` Oded Gabbay
  2022-06-05 10:33 ` [PATCH 3/7] habanalabs/gaudi: move tpc assert raise into internal func Oded Gabbay
                   ` (4 subsequent siblings)
  5 siblings, 0 replies; 7+ messages in thread
From: Oded Gabbay @ 2022-06-05 10:33 UTC (permalink / raw)
  To: linux-kernel; +Cc: Dan Rapaport

From: Dan Rapaport <drapaport@habana.ai>

The compiler pads the struct so that its size is aligned to 64 bits.
The contents of the padding bytes are not zeroed explicitly, so
undefined data might be copied. Add an explicit padding member to the
struct to get a zeroed, 64-bit aligned struct.

Signed-off-by: Dan Rapaport <drapaport@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
---
 include/uapi/misc/habanalabs.h | 1 +
 1 file changed, 1 insertion(+)

diff --git a/include/uapi/misc/habanalabs.h b/include/uapi/misc/habanalabs.h
index 52540d5b4fc9..6d2ccc09dcf2 100644
--- a/include/uapi/misc/habanalabs.h
+++ b/include/uapi/misc/habanalabs.h
@@ -949,6 +949,7 @@ struct hl_cs_in {
 
 	/* Context ID - Currently not in use */
 	__u32 ctx_id;
+	__u8 pad[4];
 };
 
 struct hl_cs_out {
-- 
2.25.1
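
The layout issue can be illustrated with a small userspace sketch. The two structs below are hypothetical, reduced layouts (the real struct hl_cs_in has more members), and the helper names are made up for this example. The 4 hidden bytes in the implicit variant are what one would expect on common 64-bit ABIs, not something the C standard guarantees.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical reduced layout: tail padding is left to the compiler. */
struct cs_in_implicit {
	uint64_t seq;
	uint32_t ctx_id;
	/* typical 64-bit ABIs add 4 unnamed tail bytes here */
};

/* Same members plus named padding, as the patch does with pad[4]. */
struct cs_in_explicit {
	uint64_t seq;
	uint32_t ctx_id;
	uint8_t pad[4];	/* named, so it can be zeroed and checked */
};

/* Number of trailing bytes that no named member covers. */
static size_t hidden_bytes_implicit(void)
{
	return sizeof(struct cs_in_implicit) -
	       (offsetof(struct cs_in_implicit, ctx_id) + sizeof(uint32_t));
}

static size_t hidden_bytes_explicit(void)
{
	return sizeof(struct cs_in_explicit) -
	       (offsetof(struct cs_in_explicit, pad) + sizeof(uint8_t[4]));
}
```

With the named member, every byte of the struct belongs to a field, so member-wise initialization leaves nothing undefined.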



* [PATCH 3/7] habanalabs/gaudi: move tpc assert raise into internal func
  2022-06-05 10:33 [PATCH 1/7] habanalabs: add terminating NULL to attrs arrays Oded Gabbay
  2022-06-05 10:33 ` [PATCH 2/7] habanalabs: align ioctl uapi structures to 64-bit Oded Gabbay
@ 2022-06-05 10:33 ` Oded Gabbay
  2022-06-05 10:33 ` [PATCH 4/7] habanalabs: change the write flag name of error info structs Oded Gabbay
                   ` (3 subsequent siblings)
  5 siblings, 0 replies; 7+ messages in thread
From: Oded Gabbay @ 2022-06-05 10:33 UTC (permalink / raw)
  To: linux-kernel; +Cc: Tal Cohen

From: Tal Cohen <talcohen@habana.ai>

Raising the TPC assert event in an internal function will make
the code cleaner, as we are going to add more events.

Signed-off-by: Tal Cohen <talcohen@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
---
 drivers/misc/habanalabs/gaudi/gaudi.c | 27 ++++++++++++---------------
 1 file changed, 12 insertions(+), 15 deletions(-)

diff --git a/drivers/misc/habanalabs/gaudi/gaudi.c b/drivers/misc/habanalabs/gaudi/gaudi.c
index 25d735aee6a3..4db5f6ef96f1 100644
--- a/drivers/misc/habanalabs/gaudi/gaudi.c
+++ b/drivers/misc/habanalabs/gaudi/gaudi.c
@@ -7290,7 +7290,7 @@ static void gaudi_handle_ecc_event(struct hl_device *hdev, u16 event_type,
 		ecc_address, ecc_syndrom, memory_wrapper_idx);
 }
 
-static void gaudi_handle_qman_err(struct hl_device *hdev, u16 event_type)
+static void gaudi_handle_qman_err(struct hl_device *hdev, u16 event_type, u64 *event_mask)
 {
 	u64 qman_base;
 	char desc[32];
@@ -7299,6 +7299,12 @@ static void gaudi_handle_qman_err(struct hl_device *hdev, u16 event_type)
 
 	switch (event_type) {
 	case GAUDI_EVENT_TPC0_QM ... GAUDI_EVENT_TPC7_QM:
+		/* In TPC QM event, notify on TPC assertion. While there isn't
+		 * a specific event for assertion yet, the FW generates QM event.
+		 * The SW upper layer will inspect an internal mapped area to indicate
+		 * if the event is a tpc assertion or tpc QM.
+		 */
+		*event_mask |= HL_NOTIFIER_EVENT_TPC_ASSERT;
 		index = event_type - GAUDI_EVENT_TPC0_QM;
 		qid_base = GAUDI_QUEUE_ID_TPC_0_0 + index * QMAN_STREAMS;
 		qman_base = mmTPC0_QM_BASE + index * TPC_QMAN_OFFSET;
@@ -7715,7 +7721,7 @@ static void gaudi_handle_eqe(struct hl_device *hdev,
 				struct hl_eq_entry *eq_entry)
 {
 	struct gaudi_device *gaudi = hdev->asic_specific;
-	u64 data = le64_to_cpu(eq_entry->data[0]);
+	u64 data = le64_to_cpu(eq_entry->data[0]), event_mask = 0;
 	u32 ctl = le32_to_cpu(eq_entry->hdr.ctl);
 	u32 fw_fatal_err_flag = 0;
 	u16 event_type = ((ctl & EQ_CTL_EVENT_TYPE_MASK)
@@ -7892,22 +7898,10 @@ static void gaudi_handle_eqe(struct hl_device *hdev,
 	case GAUDI_EVENT_NIC4_QM0:
 	case GAUDI_EVENT_NIC4_QM1:
 	case GAUDI_EVENT_DMA0_CORE ... GAUDI_EVENT_DMA7_CORE:
-		gaudi_print_irq_info(hdev, event_type, true);
-		gaudi_handle_qman_err(hdev, event_type);
-		hl_fw_unmask_irq(hdev, event_type);
-		break;
-
 	case GAUDI_EVENT_TPC0_QM ... GAUDI_EVENT_TPC7_QM:
 		gaudi_print_irq_info(hdev, event_type, true);
-		gaudi_handle_qman_err(hdev, event_type);
+		gaudi_handle_qman_err(hdev, event_type, &event_mask);
 		hl_fw_unmask_irq(hdev, event_type);
-
-		/* In TPC QM event, notify on TPC assertion. While there isn't
-		 * a specific event for assertion yet, the FW generates QM event.
-		 * The SW upper layer will inspect an internal mapped area to indicate
-		 * if the event is a tpc assertion or tpc QM.
-		 */
-		hl_notifier_event_send_all(hdev, HL_NOTIFIER_EVENT_TPC_ASSERT);
 		break;
 
 	case GAUDI_EVENT_RAZWI_OR_ADC_SW:
@@ -7978,6 +7972,9 @@ static void gaudi_handle_eqe(struct hl_device *hdev,
 		break;
 	}
 
+	if (event_mask)
+		hl_notifier_event_send_all(hdev, event_mask);
+
 	return;
 
 reset_device:
-- 
2.25.1
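
The pattern this patch introduces (inner handlers OR bits into a mask, the top-level handler sends a single notification) can be sketched in userspace as follows. All names here (EV_TPC_ASSERT, enum ev_type, notify_all() and the counters) are hypothetical and exist only for the illustration; they are not the driver's symbols.

```c
#include <stdint.h>

/* Hypothetical event bit and event types, for illustration only. */
#define EV_TPC_ASSERT	(1ULL << 0)

enum ev_type { EV_TYPE_TPC_QM, EV_TYPE_DMA_CORE };

static unsigned int sends;	/* how many notifications went out */
static uint64_t last_sent;	/* mask carried by the last notification */

/* Stand-in for the notifier; it only records what was sent. */
static void notify_all(uint64_t mask)
{
	sends++;
	last_sent = mask;
}

/* Inner handler: only records which notifications are needed. */
static void handle_qman_err(enum ev_type type, uint64_t *event_mask)
{
	if (type == EV_TYPE_TPC_QM)
		*event_mask |= EV_TPC_ASSERT;
}

/* Top-level handler: sends at most one notification per event. */
static void handle_event(enum ev_type type)
{
	uint64_t event_mask = 0;

	handle_qman_err(type, &event_mask);

	if (event_mask)
		notify_all(event_mask);
}
```

Adding another event later would then only need one more bit ORed in by whichever inner handler detects it; the send site stays unchanged.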



* [PATCH 4/7] habanalabs: change the write flag name of error info structs
  2022-06-05 10:33 [PATCH 1/7] habanalabs: add terminating NULL to attrs arrays Oded Gabbay
  2022-06-05 10:33 ` [PATCH 2/7] habanalabs: align ioctl uapi structures to 64-bit Oded Gabbay
  2022-06-05 10:33 ` [PATCH 3/7] habanalabs/gaudi: move tpc assert raise into internal func Oded Gabbay
@ 2022-06-05 10:33 ` Oded Gabbay
  2022-06-05 10:33 ` [PATCH 5/7] habanalabs/gaudi: fix comment to reflect current code Oded Gabbay
                   ` (2 subsequent siblings)
  5 siblings, 0 replies; 7+ messages in thread
From: Oded Gabbay @ 2022-06-05 10:33 UTC (permalink / raw)
  To: linux-kernel; +Cc: Tal Cohen

From: Tal Cohen <talcohen@habana.ai>

Positive flag naming will make the code clearer as more
'error info' structures are added.

Signed-off-by: Tal Cohen <talcohen@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
---
 drivers/misc/habanalabs/common/command_submission.c |  4 ++--
 drivers/misc/habanalabs/common/habanalabs.h         | 12 ++++++------
 drivers/misc/habanalabs/common/habanalabs_drv.c     |  4 ++--
 drivers/misc/habanalabs/gaudi/gaudi.c               |  4 ++--
 4 files changed, 12 insertions(+), 12 deletions(-)

diff --git a/drivers/misc/habanalabs/common/command_submission.c b/drivers/misc/habanalabs/common/command_submission.c
index fb30b7de4aab..22109be06139 100644
--- a/drivers/misc/habanalabs/common/command_submission.c
+++ b/drivers/misc/habanalabs/common/command_submission.c
@@ -735,8 +735,8 @@ static void cs_timedout(struct work_struct *work)
 	hdev = cs->ctx->hdev;
 
 	/* Save only the first CS timeout parameters */
-	rc = atomic_cmpxchg(&hdev->last_error.cs_timeout.write_disable, 0, 1);
-	if (!rc) {
+	rc = atomic_cmpxchg(&hdev->last_error.cs_timeout.write_enable, 1, 0);
+	if (rc) {
 		hdev->last_error.cs_timeout.timestamp = ktime_get();
 		hdev->last_error.cs_timeout.seq = cs->sequence;
 	}
diff --git a/drivers/misc/habanalabs/common/habanalabs.h b/drivers/misc/habanalabs/common/habanalabs.h
index b0b0f3f89865..7a46f36518fe 100644
--- a/drivers/misc/habanalabs/common/habanalabs.h
+++ b/drivers/misc/habanalabs/common/habanalabs.h
@@ -2570,21 +2570,21 @@ struct hl_clk_throttle {
 /**
  * struct cs_timeout_info - info of last CS timeout occurred.
  * @timestamp: CS timeout timestamp.
- * @write_disable: if set writing to CS parameters in the structure is disabled so,
- *                 the first (root cause) CS timeout will not be overwritten.
+ * @write_enable: if set writing to CS parameters in the structure is enabled. otherwise - disabled,
+ *                so the first (root cause) CS timeout will not be overwritten.
  * @seq: CS timeout sequence number.
  */
 struct cs_timeout_info {
 	ktime_t		timestamp;
-	atomic_t	write_disable;
+	atomic_t	write_enable;
 	u64		seq;
 };
 
 /**
  * struct razwi_info - info about last razwi error occurred.
  * @timestamp: razwi timestamp.
- * @write_disable: if set writing to razwi parameters in the structure is disabled so the
- *                 first (root cause) razwi will not be overwritten.
+ * @write_enable: if set writing to razwi parameters in the structure is enabled.
+ *                otherwise - disabled, so the first (root cause) razwi will not be overwritten.
  * @addr: address that caused razwi.
  * @engine_id_1: engine id of the razwi initiator, if it was initiated by engine that does
  *               not have engine id it will be set to U16_MAX.
@@ -2596,7 +2596,7 @@ struct cs_timeout_info {
  */
 struct razwi_info {
 	ktime_t		timestamp;
-	atomic_t	write_disable;
+	atomic_t	write_enable;
 	u64		addr;
 	u16		engine_id_1;
 	u16		engine_id_2;
diff --git a/drivers/misc/habanalabs/common/habanalabs_drv.c b/drivers/misc/habanalabs/common/habanalabs_drv.c
index 37edb69a7255..e182637c2d93 100644
--- a/drivers/misc/habanalabs/common/habanalabs_drv.c
+++ b/drivers/misc/habanalabs/common/habanalabs_drv.c
@@ -195,8 +195,8 @@ int hl_device_open(struct inode *inode, struct file *filp)
 
 	hl_debugfs_add_file(hpriv);
 
-	atomic_set(&hdev->last_error.cs_timeout.write_disable, 0);
-	atomic_set(&hdev->last_error.razwi.write_disable, 0);
+	atomic_set(&hdev->last_error.cs_timeout.write_enable, 1);
+	atomic_set(&hdev->last_error.razwi.write_enable, 1);
 
 	hdev->open_counter++;
 	hdev->last_successful_open_jif = jiffies;
diff --git a/drivers/misc/habanalabs/gaudi/gaudi.c b/drivers/misc/habanalabs/gaudi/gaudi.c
index 4db5f6ef96f1..c16c0f9fe202 100644
--- a/drivers/misc/habanalabs/gaudi/gaudi.c
+++ b/drivers/misc/habanalabs/gaudi/gaudi.c
@@ -7410,8 +7410,8 @@ static void gaudi_print_irq_info(struct hl_device *hdev, u16 event_type,
 		gaudi_print_and_get_mmu_error_info(hdev, &razwi_addr, &razwi_type);
 
 		/* In case it's the first razwi, save its parameters*/
-		rc = atomic_cmpxchg(&hdev->last_error.razwi.write_disable, 0, 1);
-		if (!rc) {
+		rc = atomic_cmpxchg(&hdev->last_error.razwi.write_enable, 1, 0);
+		if (rc) {
 			hdev->last_error.razwi.timestamp = ktime_get();
 			hdev->last_error.razwi.addr = razwi_addr;
 			hdev->last_error.razwi.engine_id_1 = engine_id_1;
-- 
2.25.1
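
The "first writer wins" logic behind write_enable can be sketched with C11 atomics. This is a userspace approximation under stated assumptions: struct err_info and both helpers are hypothetical, and atomic_compare_exchange_strong() reports success as a boolean, whereas the kernel's atomic_cmpxchg() returns the old value (hence the "if (rc)" test in the patch).

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical reduced version of an 'error info' record. */
struct err_info {
	atomic_int write_enable;	/* 1: next writer may record, 0: locked */
	uint64_t seq;
};

/* Re-arm the record, comparable to the atomic_set() in hl_device_open(). */
static void err_info_arm(struct err_info *info)
{
	atomic_store(&info->write_enable, 1);
}

/* Returns true only for the caller that won the 1 -> 0 transition. */
static bool err_info_record(struct err_info *info, uint64_t seq)
{
	int expected = 1;

	if (!atomic_compare_exchange_strong(&info->write_enable, &expected, 0))
		return false;	/* an earlier error is already stored */

	info->seq = seq;
	return true;
}
```

Only the first error after arming is stored, so the root cause is not overwritten by follow-up errors.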



* [PATCH 5/7] habanalabs/gaudi: fix comment to reflect current code
  2022-06-05 10:33 [PATCH 1/7] habanalabs: add terminating NULL to attrs arrays Oded Gabbay
                   ` (2 preceding siblings ...)
  2022-06-05 10:33 ` [PATCH 4/7] habanalabs: change the write flag name of error info structs Oded Gabbay
@ 2022-06-05 10:33 ` Oded Gabbay
  2022-06-05 10:33 ` [PATCH 6/7] habanalabs: keep a record of completed CS outcomes Oded Gabbay
  2022-06-05 10:33 ` [PATCH 7/7] habanalabs: fix race between hl_get_compute_ctx() and hl_ctx_put() Oded Gabbay
  5 siblings, 0 replies; 7+ messages in thread
From: Oded Gabbay @ 2022-06-05 10:33 UTC (permalink / raw)
  To: linux-kernel; +Cc: Dan Carpenter

Due to code changes in the past few years, the original comment of
how parser->user_cb_size is checked was not correct anymore.

Fix it to reflect current code and add more explanation as the code
is more complex now.

Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
---
 drivers/misc/habanalabs/gaudi/gaudi.c | 10 ++++++++--
 1 file changed, 8 insertions(+), 2 deletions(-)

diff --git a/drivers/misc/habanalabs/gaudi/gaudi.c b/drivers/misc/habanalabs/gaudi/gaudi.c
index c16c0f9fe202..72b0d145e853 100644
--- a/drivers/misc/habanalabs/gaudi/gaudi.c
+++ b/drivers/misc/habanalabs/gaudi/gaudi.c
@@ -5562,8 +5562,14 @@ static int gaudi_parse_cb_mmu(struct hl_device *hdev,
 	}
 
 	/*
-	 * The check that parser->user_cb_size <= parser->user_cb->size was done
-	 * in validate_queue_index().
+	 * We are protected from overflow because the check
+	 * "parser->user_cb_size <= parser->user_cb->size" was done in get_cb_from_cs_chunk()
+	 * in the common code. That check is done only if is_kernel_allocated_cb is true.
+	 *
+	 * There is no option to reach here without going through that check because:
+	 * 1. validate_queue_index() assigns true to is_kernel_allocated_cb for any submission to
+	 *    an external queue.
+	 * 2. For Gaudi, we only parse CBs that were submitted to the external queues.
 	 */
 	memcpy(parser->patched_cb->kernel_address,
 		parser->user_cb->kernel_address,
-- 
2.25.1



* [PATCH 6/7] habanalabs: keep a record of completed CS outcomes
  2022-06-05 10:33 [PATCH 1/7] habanalabs: add terminating NULL to attrs arrays Oded Gabbay
                   ` (3 preceding siblings ...)
  2022-06-05 10:33 ` [PATCH 5/7] habanalabs/gaudi: fix comment to reflect current code Oded Gabbay
@ 2022-06-05 10:33 ` Oded Gabbay
  2022-06-05 10:33 ` [PATCH 7/7] habanalabs: fix race between hl_get_compute_ctx() and hl_ctx_put() Oded Gabbay
  5 siblings, 0 replies; 7+ messages in thread
From: Oded Gabbay @ 2022-06-05 10:33 UTC (permalink / raw)
  To: linux-kernel; +Cc: Yuri Nudelman

From: Yuri Nudelman <ynudelman@habana.ai>

Often, the user is not interested in the completion timestamp of all
command submissions.
A common situation is, for example, when the user submits a burst of,
possibly, several thousand commands, then requests the completion
timestamp of only a couple of specific key commands from the whole burst.
The problem is that currently, the outcome of the early commands may be
lost due to a large number of later commands that the user does not
really care about.

This patch creates a separate store with the outcomes of commands the
user has explicitly marked as being of interest. This store does not mix
the marked commands with the unmarked ones, hence the data there will
survive for much longer.

Signed-off-by: Yuri Nudelman <ynudelman@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
---
 .../habanalabs/common/command_submission.c    | 113 ++++++++++++++++--
 drivers/misc/habanalabs/common/context.c      |   9 +-
 drivers/misc/habanalabs/common/habanalabs.h   |  37 ++++++
 3 files changed, 147 insertions(+), 12 deletions(-)

diff --git a/drivers/misc/habanalabs/common/command_submission.c b/drivers/misc/habanalabs/common/command_submission.c
index 22109be06139..47b49cbf67ab 100644
--- a/drivers/misc/habanalabs/common/command_submission.c
+++ b/drivers/misc/habanalabs/common/command_submission.c
@@ -34,6 +34,84 @@ static int _hl_cs_wait_ioctl(struct hl_device *hdev, struct hl_ctx *ctx,
 				enum hl_cs_wait_status *status, s64 *timestamp);
 static void cs_do_release(struct kref *ref);
 
+static void hl_push_cs_outcome(struct hl_device *hdev,
+			       struct hl_cs_outcome_store *outcome_store,
+			       u64 seq, ktime_t ts, int error)
+{
+	struct hl_cs_outcome *node;
+	unsigned long flags;
+
+	/*
+	 * CS outcome store supports the following operations:
+	 * push outcome - store a recent CS outcome in the store
+	 * pop outcome - retrieve a SPECIFIC (by seq) CS outcome from the store
+	 * It uses 2 lists: used list and free list.
+	 * It has a pre-allocated amount of nodes, each node stores
+	 * a single CS outcome.
+	 * Initially, all the nodes are in the free list.
+	 * On push outcome, a node (any) is taken from the free list, its
+	 * information is filled in, and the node is moved to the used list.
+	 * It is possible, that there are no nodes left in the free list.
+	 * In this case, we will lose some information about old outcomes. We
+	 * will pop the OLDEST node from the used list, and make it free.
+	 * On pop, the node is searched for in the used list (using a search
+	 * index).
+	 * If found, the node is then removed from the used list, and moved
+	 * back to the free list. The outcome data that the node contained is
+	 * returned back to the user.
+	 */
+
+	spin_lock_irqsave(&outcome_store->db_lock, flags);
+
+	if (list_empty(&outcome_store->free_list)) {
+		node = list_last_entry(&outcome_store->used_list,
+				       struct hl_cs_outcome, list_link);
+		hash_del(&node->map_link);
+		dev_dbg(hdev->dev, "CS %llu outcome was lost\n", node->seq);
+	} else {
+		node = list_last_entry(&outcome_store->free_list,
+				       struct hl_cs_outcome, list_link);
+	}
+
+	list_del_init(&node->list_link);
+
+	node->seq = seq;
+	node->ts = ts;
+	node->error = error;
+
+	list_add(&node->list_link, &outcome_store->used_list);
+	hash_add(outcome_store->outcome_map, &node->map_link, node->seq);
+
+	spin_unlock_irqrestore(&outcome_store->db_lock, flags);
+}
+
+static bool hl_pop_cs_outcome(struct hl_cs_outcome_store *outcome_store,
+			       u64 seq, ktime_t *ts, int *error)
+{
+	struct hl_cs_outcome *node;
+	unsigned long flags;
+
+	spin_lock_irqsave(&outcome_store->db_lock, flags);
+
+	hash_for_each_possible(outcome_store->outcome_map, node, map_link, seq)
+		if (node->seq == seq) {
+			*ts = node->ts;
+			*error = node->error;
+
+			hash_del(&node->map_link);
+			list_del_init(&node->list_link);
+			list_add(&node->list_link, &outcome_store->free_list);
+
+			spin_unlock_irqrestore(&outcome_store->db_lock, flags);
+
+			return true;
+		}
+
+	spin_unlock_irqrestore(&outcome_store->db_lock, flags);
+
+	return false;
+}
+
 static void hl_sob_reset(struct kref *ref)
 {
 	struct hl_hw_sob *hw_sob = container_of(ref, struct hl_hw_sob,
@@ -678,7 +756,6 @@ static void cs_do_release(struct kref *ref)
 	 */
 	hl_debugfs_remove_cs(cs);
 
-	hl_ctx_put(cs->ctx);
 
 	/* We need to mark an error for not submitted because in that case
 	 * the hl fence release flow is different. Mainly, we don't need
@@ -698,8 +775,14 @@ static void cs_do_release(struct kref *ref)
 			div_u64(jiffies - cs->submission_time_jiffies, HZ));
 	}
 
-	if (cs->timestamp)
+	if (cs->timestamp) {
 		cs->fence->timestamp = ktime_get();
+		hl_push_cs_outcome(hdev, &cs->ctx->outcome_store, cs->sequence,
+				   cs->fence->timestamp, cs->fence->error);
+	}
+
+	hl_ctx_put(cs->ctx);
+
 	complete_all(&cs->fence->completion);
 	complete_multi_cs(hdev, cs);
 
@@ -2325,8 +2408,9 @@ static int hl_wait_for_fence(struct hl_ctx *ctx, u64 seq, struct hl_fence *fence
 				s64 *timestamp)
 {
 	struct hl_device *hdev = ctx->hdev;
+	ktime_t timestamp_kt;
 	long completion_rc;
-	int rc = 0;
+	int rc = 0, error;
 
 	if (IS_ERR(fence)) {
 		rc = PTR_ERR(fence);
@@ -2338,12 +2422,17 @@ static int hl_wait_for_fence(struct hl_ctx *ctx, u64 seq, struct hl_fence *fence
 	}
 
 	if (!fence) {
-		dev_dbg(hdev->dev,
+		if (!hl_pop_cs_outcome(&ctx->outcome_store, seq, &timestamp_kt, &error)) {
+			dev_dbg(hdev->dev,
 			"Can't wait on seq %llu because current CS is at seq %llu (Fence is gone)\n",
 				seq, ctx->cs_sequence);
 
-		*status = CS_WAIT_STATUS_GONE;
-		return 0;
+			*status = CS_WAIT_STATUS_GONE;
+			return 0;
+		}
+
+		completion_rc = 1;
+		goto report_results;
 	}
 
 	if (!timeout_us) {
@@ -2358,18 +2447,20 @@ static int hl_wait_for_fence(struct hl_ctx *ctx, u64 seq, struct hl_fence *fence
 				&fence->completion, timeout);
 	}
 
+	error = fence->error;
+	timestamp_kt = fence->timestamp;
+
+report_results:
 	if (completion_rc > 0) {
 		*status = CS_WAIT_STATUS_COMPLETED;
 		if (timestamp)
-			*timestamp = ktime_to_ns(fence->timestamp);
+			*timestamp = ktime_to_ns(timestamp_kt);
 	} else {
 		*status = CS_WAIT_STATUS_BUSY;
 	}
 
-	if (fence->error == -ETIMEDOUT)
-		rc = -ETIMEDOUT;
-	else if (fence->error == -EIO)
-		rc = -EIO;
+	if (error == -ETIMEDOUT || error == -EIO)
+		rc = error;
 
 	return rc;
 }
diff --git a/drivers/misc/habanalabs/common/context.c b/drivers/misc/habanalabs/common/context.c
index ed2cfd0c6e99..6d033aecc8fc 100644
--- a/drivers/misc/habanalabs/common/context.c
+++ b/drivers/misc/habanalabs/common/context.c
@@ -181,7 +181,7 @@ int hl_ctx_create(struct hl_device *hdev, struct hl_fpriv *hpriv)
 
 int hl_ctx_init(struct hl_device *hdev, struct hl_ctx *ctx, bool is_kernel_ctx)
 {
-	int rc = 0;
+	int rc = 0, i;
 
 	ctx->hdev = hdev;
 
@@ -197,6 +197,13 @@ int hl_ctx_init(struct hl_device *hdev, struct hl_ctx *ctx, bool is_kernel_ctx)
 	if (!ctx->cs_pending)
 		return -ENOMEM;
 
+	INIT_LIST_HEAD(&ctx->outcome_store.used_list);
+	INIT_LIST_HEAD(&ctx->outcome_store.free_list);
+	hash_init(ctx->outcome_store.outcome_map);
+	for (i = 0; i < ARRAY_SIZE(ctx->outcome_store.nodes_pool); ++i)
+		list_add(&ctx->outcome_store.nodes_pool[i].list_link,
+			 &ctx->outcome_store.free_list);
+
 	hl_hw_block_mem_init(ctx);
 
 	if (is_kernel_ctx) {
diff --git a/drivers/misc/habanalabs/common/habanalabs.h b/drivers/misc/habanalabs/common/habanalabs.h
index 7a46f36518fe..3023ecfc19c9 100644
--- a/drivers/misc/habanalabs/common/habanalabs.h
+++ b/drivers/misc/habanalabs/common/habanalabs.h
@@ -1535,6 +1535,40 @@ struct hl_dmabuf_priv {
 	uint64_t			device_address;
 };
 
+#define HL_CS_OUTCOME_HISTORY_LEN 256
+
+/**
+ * struct hl_cs_outcome - represents a single completed CS outcome
+ * @list_link: link to either container's used list or free list
+ * @map_link: link to the container hash map
+ * @ts: completion ts
+ * @seq: the original cs sequence
+ * @error: error code cs completed with, if any
+ */
+struct hl_cs_outcome {
+	struct list_head list_link;
+	struct hlist_node map_link;
+	ktime_t ts;
+	u64 seq;
+	int error;
+};
+
+/**
+ * struct hl_cs_outcome_store - represents a limited store of completed CS outcomes
+ * @outcome_map: index of completed CS searchable by sequence number
+ * @used_list: list of outcome objects currently in use
+ * @free_list: list of outcome objects currently not in use
+ * @nodes_pool: a static pool of preallocated outcome objects
+ * @db_lock: any operation on the store must take this lock
+ */
+struct hl_cs_outcome_store {
+	DECLARE_HASHTABLE(outcome_map, 8);
+	struct list_head used_list;
+	struct list_head free_list;
+	struct hl_cs_outcome nodes_pool[HL_CS_OUTCOME_HISTORY_LEN];
+	spinlock_t db_lock;
+};
+
 /**
  * struct hl_ctx - user/kernel context.
  * @mem_hash: holds mapping from virtual address to virtual memory area
@@ -1545,6 +1579,8 @@ struct hl_dmabuf_priv {
  * @refcount: reference counter for the context. Context is released only when
  *		this hits 0l. It is incremented on CS and CS_WAIT.
  * @cs_pending: array of hl fence objects representing pending CS.
+ * @outcome_store: storage data structure used to remember outcomes of completed
+ *                 command submissions for a long time after CS id wraparound.
  * @va_range: holds available virtual addresses for host and dram mappings.
  * @mem_hash_lock: protects the mem_hash.
  * @mmu_lock: protects the MMU page tables. Any change to the PGT, modifying the
@@ -1580,6 +1616,7 @@ struct hl_ctx {
 	struct hl_device		*hdev;
 	struct kref			refcount;
 	struct hl_fence			**cs_pending;
+	struct hl_cs_outcome_store	outcome_store;
 	struct hl_va_range		*va_range[HL_VA_RANGE_TYPE_MAX];
 	struct mutex			mem_hash_lock;
 	struct mutex			mmu_lock;
-- 
2.25.1
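
The push/pop semantics described in the comment above hl_push_cs_outcome() can be modelled in userspace as follows. This sketch deliberately swaps the patch's free/used lists and hash table for a linear scan over a small fixed pool with an age counter, and omits locking entirely; STORE_LEN, struct outcome and both functions are hypothetical names chosen for the example.

```c
#include <stdbool.h>
#include <stdint.h>

#define STORE_LEN 4	/* the patch uses HL_CS_OUTCOME_HISTORY_LEN (256) */

struct outcome {
	bool used;
	uint64_t age;	/* push order, smaller is older */
	uint64_t seq;
	int64_t ts;
	int error;
};

struct outcome_store {
	struct outcome nodes[STORE_LEN];
	uint64_t next_age;
};

/* Store an outcome; evict the oldest used node when the pool is full. */
static void push_outcome(struct outcome_store *s, uint64_t seq,
			 int64_t ts, int error)
{
	struct outcome *victim = &s->nodes[0];
	int i;

	for (i = 0; i < STORE_LEN; i++) {
		struct outcome *n = &s->nodes[i];

		if (!n->used) {			/* free node available */
			victim = n;
			break;
		}
		if (n->age < victim->age)	/* track the oldest used node */
			victim = n;
	}

	victim->used = true;
	victim->age = s->next_age++;
	victim->seq = seq;
	victim->ts = ts;
	victim->error = error;
}

/* Retrieve one specific outcome by seq and return its node to the pool. */
static bool pop_outcome(struct outcome_store *s, uint64_t seq,
			int64_t *ts, int *error)
{
	int i;

	for (i = 0; i < STORE_LEN; i++) {
		struct outcome *n = &s->nodes[i];

		if (n->used && n->seq == seq) {
			*ts = n->ts;
			*error = n->error;
			n->used = false;
			return true;
		}
	}
	return false;
}
```

A zero-initialized store starts with every node free. Both operations here cost O(STORE_LEN); the lists and hash table in the patch avoid that scan, which presumably matters with 256 nodes and the lock held with interrupts disabled.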



* [PATCH 7/7] habanalabs: fix race between hl_get_compute_ctx() and hl_ctx_put()
  2022-06-05 10:33 [PATCH 1/7] habanalabs: add terminating NULL to attrs arrays Oded Gabbay
                   ` (4 preceding siblings ...)
  2022-06-05 10:33 ` [PATCH 6/7] habanalabs: keep a record of completed CS outcomes Oded Gabbay
@ 2022-06-05 10:33 ` Oded Gabbay
  5 siblings, 0 replies; 7+ messages in thread
From: Oded Gabbay @ 2022-06-05 10:33 UTC (permalink / raw)
  To: linux-kernel; +Cc: Tomer Tayar

From: Tomer Tayar <ttayar@habana.ai>

hl_get_compute_ctx() is used to get the pointer to the compute context
from the hpriv object.
The function is called in code paths that are not necessarily initiated
by the user, so a context release may happen in parallel.
This can lead to a race condition in which hl_get_compute_ctx()
retrieves the context pointer, and just before it increments the context
refcount, the context object is released and freed memory is accessed.

To avoid this race, add a mutex to protect the context pointer in hpriv.
With this lock, hl_get_compute_ctx() will be able to detect if the
context has been released or is about to be released.

struct hl_ctx_mgr has a mutex for the contexts IDR with a similar
"ctx_lock" name, so rename it to just "lock" to avoid confusion with the
new lock.

Signed-off-by: Tomer Tayar <ttayar@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
---
 drivers/misc/habanalabs/common/context.c      | 58 ++++++++++++-------
 drivers/misc/habanalabs/common/device.c       |  1 +
 drivers/misc/habanalabs/common/habanalabs.h   | 11 ++--
 .../misc/habanalabs/common/habanalabs_drv.c   |  2 +
 4 files changed, 47 insertions(+), 25 deletions(-)

diff --git a/drivers/misc/habanalabs/common/context.c b/drivers/misc/habanalabs/common/context.c
index 6d033aecc8fc..64ac65d9268b 100644
--- a/drivers/misc/habanalabs/common/context.c
+++ b/drivers/misc/habanalabs/common/context.c
@@ -125,15 +125,22 @@ void hl_ctx_do_release(struct kref *ref)
 
 	hl_ctx_fini(ctx);
 
-	if (ctx->hpriv)
-		hl_hpriv_put(ctx->hpriv);
+	if (ctx->hpriv) {
+		struct hl_fpriv *hpriv = ctx->hpriv;
+
+		mutex_lock(&hpriv->ctx_lock);
+		hpriv->ctx = NULL;
+		mutex_unlock(&hpriv->ctx_lock);
+
+		hl_hpriv_put(hpriv);
+	}
 
 	kfree(ctx);
 }
 
 int hl_ctx_create(struct hl_device *hdev, struct hl_fpriv *hpriv)
 {
-	struct hl_ctx_mgr *mgr = &hpriv->ctx_mgr;
+	struct hl_ctx_mgr *ctx_mgr = &hpriv->ctx_mgr;
 	struct hl_ctx *ctx;
 	int rc;
 
@@ -143,9 +150,9 @@ int hl_ctx_create(struct hl_device *hdev, struct hl_fpriv *hpriv)
 		goto out_err;
 	}
 
-	mutex_lock(&mgr->ctx_lock);
-	rc = idr_alloc(&mgr->ctx_handles, ctx, 1, 0, GFP_KERNEL);
-	mutex_unlock(&mgr->ctx_lock);
+	mutex_lock(&ctx_mgr->lock);
+	rc = idr_alloc(&ctx_mgr->handles, ctx, 1, 0, GFP_KERNEL);
+	mutex_unlock(&ctx_mgr->lock);
 
 	if (rc < 0) {
 		dev_err(hdev->dev, "Failed to allocate IDR for a new CTX\n");
@@ -170,9 +177,9 @@ int hl_ctx_create(struct hl_device *hdev, struct hl_fpriv *hpriv)
 	return 0;
 
 remove_from_idr:
-	mutex_lock(&mgr->ctx_lock);
-	idr_remove(&mgr->ctx_handles, ctx->handle);
-	mutex_unlock(&mgr->ctx_lock);
+	mutex_lock(&ctx_mgr->lock);
+	idr_remove(&ctx_mgr->handles, ctx->handle);
+	mutex_unlock(&ctx_mgr->lock);
 free_ctx:
 	kfree(ctx);
 out_err:
@@ -269,6 +276,11 @@ int hl_ctx_init(struct hl_device *hdev, struct hl_ctx *ctx, bool is_kernel_ctx)
 	return rc;
 }
 
+static int hl_ctx_get_unless_zero(struct hl_ctx *ctx)
+{
+	return kref_get_unless_zero(&ctx->refcount);
+}
+
 void hl_ctx_get(struct hl_ctx *ctx)
 {
 	kref_get(&ctx->refcount);
@@ -287,11 +299,15 @@ struct hl_ctx *hl_get_compute_ctx(struct hl_device *hdev)
 	mutex_lock(&hdev->fpriv_list_lock);
 
 	list_for_each_entry(hpriv, &hdev->fpriv_list, dev_node) {
+		mutex_lock(&hpriv->ctx_lock);
+		ctx = hpriv->ctx;
+		if (ctx && !hl_ctx_get_unless_zero(ctx))
+			ctx = NULL;
+		mutex_unlock(&hpriv->ctx_lock);
+
 		/* There can only be a single user which has opened the compute device, so exit
-		 * immediately once we find him
+		 * immediately once we find its context or if we see that it has been released
 		 */
-		ctx = hpriv->ctx;
-		hl_ctx_get(ctx);
 		break;
 	}
 
@@ -383,37 +399,37 @@ int hl_ctx_get_fences(struct hl_ctx *ctx, u64 *seq_arr,
 /*
  * hl_ctx_mgr_init - initialize the context manager
  *
- * @mgr: pointer to context manager structure
+ * @ctx_mgr: pointer to context manager structure
  *
  * This manager is an object inside the hpriv object of the user process.
  * The function is called when a user process opens the FD.
  */
-void hl_ctx_mgr_init(struct hl_ctx_mgr *mgr)
+void hl_ctx_mgr_init(struct hl_ctx_mgr *ctx_mgr)
 {
-	mutex_init(&mgr->ctx_lock);
-	idr_init(&mgr->ctx_handles);
+	mutex_init(&ctx_mgr->lock);
+	idr_init(&ctx_mgr->handles);
 }
 
 /*
  * hl_ctx_mgr_fini - finalize the context manager
  *
  * @hdev: pointer to device structure
- * @mgr: pointer to context manager structure
+ * @ctx_mgr: pointer to context manager structure
  *
  * This function goes over all the contexts in the manager and frees them.
  * It is called when a process closes the FD.
  */
-void hl_ctx_mgr_fini(struct hl_device *hdev, struct hl_ctx_mgr *mgr)
+void hl_ctx_mgr_fini(struct hl_device *hdev, struct hl_ctx_mgr *ctx_mgr)
 {
 	struct hl_ctx *ctx;
 	struct idr *idp;
 	u32 id;
 
-	idp = &mgr->ctx_handles;
+	idp = &ctx_mgr->handles;
 
 	idr_for_each_entry(idp, ctx, id)
 		kref_put(&ctx->refcount, hl_ctx_do_release);
 
-	idr_destroy(&mgr->ctx_handles);
-	mutex_destroy(&mgr->ctx_lock);
+	idr_destroy(&ctx_mgr->handles);
+	mutex_destroy(&ctx_mgr->lock);
 }
diff --git a/drivers/misc/habanalabs/common/device.c b/drivers/misc/habanalabs/common/device.c
index b4f14c6d3970..38e1ad432e51 100644
--- a/drivers/misc/habanalabs/common/device.c
+++ b/drivers/misc/habanalabs/common/device.c
@@ -245,6 +245,7 @@ static void hpriv_release(struct kref *ref)
 
 	hl_debugfs_remove_file(hpriv);
 
+	mutex_destroy(&hpriv->ctx_lock);
 	mutex_destroy(&hpriv->restore_phase_mutex);
 
 	if ((!hdev->pldm) && (hdev->pdev) &&
diff --git a/drivers/misc/habanalabs/common/habanalabs.h b/drivers/misc/habanalabs/common/habanalabs.h
index 3023ecfc19c9..1ab64e8a05c6 100644
--- a/drivers/misc/habanalabs/common/habanalabs.h
+++ b/drivers/misc/habanalabs/common/habanalabs.h
@@ -1638,12 +1638,12 @@ struct hl_ctx {
 
 /**
  * struct hl_ctx_mgr - for handling multiple contexts.
- * @ctx_lock: protects ctx_handles.
- * @ctx_handles: idr to hold all ctx handles.
+ * @lock: protects handles.
+ * @handles: idr to hold all ctx handles.
  */
 struct hl_ctx_mgr {
-	struct mutex		ctx_lock;
-	struct idr		ctx_handles;
+	struct mutex	lock;
+	struct idr	handles;
 };
 
 
@@ -1998,6 +1998,8 @@ struct hl_notifier_event {
  * @dev_node: node in the device list of file private data
  * @refcount: number of related contexts.
  * @restore_phase_mutex: lock for context switch and restore phase.
+ * @ctx_lock: protects the pointer to the currently executing context. TODO: remove for multiple
+ *            ctx per process.
  */
 struct hl_fpriv {
 	struct hl_device		*hdev;
@@ -2011,6 +2013,7 @@ struct hl_fpriv {
 	struct list_head		dev_node;
 	struct kref			refcount;
 	struct mutex			restore_phase_mutex;
+	struct mutex			ctx_lock;
 };
 
 
diff --git a/drivers/misc/habanalabs/common/habanalabs_drv.c b/drivers/misc/habanalabs/common/habanalabs_drv.c
index e182637c2d93..e617cc394ff7 100644
--- a/drivers/misc/habanalabs/common/habanalabs_drv.c
+++ b/drivers/misc/habanalabs/common/habanalabs_drv.c
@@ -137,6 +137,7 @@ int hl_device_open(struct inode *inode, struct file *filp)
 
 	mutex_init(&hpriv->notifier_event.lock);
 	mutex_init(&hpriv->restore_phase_mutex);
+	mutex_init(&hpriv->ctx_lock);
 	kref_init(&hpriv->refcount);
 	nonseekable_open(inode, filp);
 
@@ -209,6 +210,7 @@ int hl_device_open(struct inode *inode, struct file *filp)
 	hl_mem_mgr_fini(&hpriv->mem_mgr);
 	hl_ctx_mgr_fini(hpriv->hdev, &hpriv->ctx_mgr);
 	filp->private_data = NULL;
+	mutex_destroy(&hpriv->ctx_lock);
 	mutex_destroy(&hpriv->restore_phase_mutex);
 	mutex_destroy(&hpriv->notifier_event.lock);
 	put_pid(hpriv->taskpid);
-- 
2.25.1
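
The "get unless zero" half of this fix can be sketched with C11 atomics. This is a userspace approximation, not the kernel's kref implementation: struct obj and both helpers are hypothetical. It also covers only the refcount step; in the patch, hpriv->ctx_lock additionally keeps the pointer itself from being freed while it is examined.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical refcounted object. */
struct obj {
	atomic_uint refcount;
};

/* Take a reference only if the object is still alive (count > 0). */
static bool obj_get_unless_zero(struct obj *o)
{
	unsigned int old = atomic_load(&o->refcount);

	while (old != 0) {
		/* On failure, old is reloaded with the current value. */
		if (atomic_compare_exchange_weak(&o->refcount, &old, old + 1))
			return true;
	}
	return false;	/* release has already started */
}

/* Drop a reference; returns true when the caller dropped the last one. */
static bool obj_put(struct obj *o)
{
	return atomic_fetch_sub(&o->refcount, 1) == 1;
}
```

A plain increment, as hl_ctx_get() does, would revive a count that has already reached zero; the conditional form lets the caller see that release is in progress and back off.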



Thread overview: 7+ messages
2022-06-05 10:33 [PATCH 1/7] habanalabs: add terminating NULL to attrs arrays Oded Gabbay
2022-06-05 10:33 ` [PATCH 2/7] habanalabs: align ioctl uapi structures to 64-bit Oded Gabbay
2022-06-05 10:33 ` [PATCH 3/7] habanalabs/gaudi: move tpc assert raise into internal func Oded Gabbay
2022-06-05 10:33 ` [PATCH 4/7] habanalabs: change the write flag name of error info structs Oded Gabbay
2022-06-05 10:33 ` [PATCH 5/7] habanalabs/gaudi: fix comment to reflect current code Oded Gabbay
2022-06-05 10:33 ` [PATCH 6/7] habanalabs: keep a record of completed CS outcomes Oded Gabbay
2022-06-05 10:33 ` [PATCH 7/7] habanalabs: fix race between hl_get_compute_ctx() and hl_ctx_put() Oded Gabbay