[PATCH 01/11] habanalabs: return correct clock throttling period

All of lore.kernel.org
 help / color / mirror / Atom feed

* [PATCH 01/11] habanalabs: return correct clock throttling period
@ 2021-12-14 15:05 Oded Gabbay
  2021-12-14 15:05 ` [PATCH 02/11] habanalabs: remove in_debug check in device open Oded Gabbay
                   ` (9 more replies)
  0 siblings, 10 replies; 11+ messages in thread
From: Oded Gabbay @ 2021-12-14 15:05 UTC (permalink / raw)
  To: linux-kernel; +Cc: Ofir Bitton

From: Ofir Bitton <obitton@habana.ai>

Current clock throttling period returned from driver was wrong due
to wrong time comparison.

Signed-off-by: Ofir Bitton <obitton@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
---
 drivers/misc/habanalabs/common/habanalabs_ioctl.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/misc/habanalabs/common/habanalabs_ioctl.c b/drivers/misc/habanalabs/common/habanalabs_ioctl.c
index 9210114beefe..f571641c19ae 100644
--- a/drivers/misc/habanalabs/common/habanalabs_ioctl.c
+++ b/drivers/misc/habanalabs/common/habanalabs_ioctl.c
@@ -335,9 +335,9 @@ static int clk_throttle_info(struct hl_fpriv *hpriv, struct hl_info_args *args)
 			ktime_to_us(hdev->clk_throttling.timestamp[i].start);
 
 		if (ktime_compare(hdev->clk_throttling.timestamp[i].end, zero_time))
-			end_time = ktime_get();
-		else
 			end_time = hdev->clk_throttling.timestamp[i].end;
+		else
+			end_time = ktime_get();
 
 		clk_throttle.clk_throttling_duration_ns[i] =
 			ktime_to_ns(ktime_sub(end_time,
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [PATCH 02/11] habanalabs: remove in_debug check in device open
  2021-12-14 15:05 [PATCH 01/11] habanalabs: return correct clock throttling period Oded Gabbay
@ 2021-12-14 15:05 ` Oded Gabbay
  2021-12-14 15:05 ` [PATCH 03/11] habanalabs: add current PI value to cpu packets Oded Gabbay
                   ` (8 subsequent siblings)
  9 siblings, 0 replies; 11+ messages in thread
From: Oded Gabbay @ 2021-12-14 15:05 UTC (permalink / raw)
  To: linux-kernel

The driver supports only a single user anyway, so there is no point
in checking whether we are in_debug state when a user tries to open
the device, because if we are in_debug, it means a user is already
using the device.

Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
---
 drivers/misc/habanalabs/common/habanalabs.h     | 5 +++--
 drivers/misc/habanalabs/common/habanalabs_drv.c | 8 --------
 2 files changed, 3 insertions(+), 10 deletions(-)

diff --git a/drivers/misc/habanalabs/common/habanalabs.h b/drivers/misc/habanalabs/common/habanalabs.h
index eda1c70f6966..362eee3f028c 100644
--- a/drivers/misc/habanalabs/common/habanalabs.h
+++ b/drivers/misc/habanalabs/common/habanalabs.h
@@ -2561,8 +2561,9 @@ struct last_error_session_info {
  * @init_done: is the initialization of the device done.
  * @device_cpu_disabled: is the device CPU disabled (due to timeouts)
  * @dma_mask: the dma mask that was set for this device
- * @in_debug: is device under debug. This, together with fpriv_list, enforces
- *            that only a single user is configuring the debug infrastructure.
+ * @in_debug: whether the device is in a state where the profiling/tracing infrastructure
+ *            can be used. This indication is needed because in some ASICs we need to do
+ *            specific operations to enable that infrastructure.
  * @power9_64bit_dma_enable: true to enable 64-bit DMA mask support. Relevant
  *                           only to POWER9 machines.
  * @cdev_sysfs_created: were char devices and sysfs nodes created.
diff --git a/drivers/misc/habanalabs/common/habanalabs_drv.c b/drivers/misc/habanalabs/common/habanalabs_drv.c
index 62a02ef43bb7..d59201f93de9 100644
--- a/drivers/misc/habanalabs/common/habanalabs_drv.c
+++ b/drivers/misc/habanalabs/common/habanalabs_drv.c
@@ -153,14 +153,6 @@ int hl_device_open(struct inode *inode, struct file *filp)
 		goto out_err;
 	}
 
-	if (hdev->in_debug) {
-		dev_err_ratelimited(hdev->dev,
-			"Can't open %s because it is being debugged by another user\n",
-			dev_name(hdev->dev));
-		rc = -EPERM;
-		goto out_err;
-	}
-
 	if (hdev->is_compute_ctx_active) {
 		dev_dbg_ratelimited(hdev->dev,
 			"Can't open %s because another user is working on it\n",
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [PATCH 03/11] habanalabs: add current PI value to cpu packets
  2021-12-14 15:05 [PATCH 01/11] habanalabs: return correct clock throttling period Oded Gabbay
  2021-12-14 15:05 ` [PATCH 02/11] habanalabs: remove in_debug check in device open Oded Gabbay
@ 2021-12-14 15:05 ` Oded Gabbay
  2021-12-14 15:05 ` [PATCH 04/11] habanalabs: fix hwmon handling for legacy f/w Oded Gabbay
                   ` (7 subsequent siblings)
  9 siblings, 0 replies; 11+ messages in thread
From: Oded Gabbay @ 2021-12-14 15:05 UTC (permalink / raw)
  To: linux-kernel; +Cc: Ofir Bitton

From: Ofir Bitton <obitton@habana.ai>

In order to increase cpucp messaging reliability we will add
the current PI value to the descriptor sent to F/W.
F/W will wait for the PI value as an indication of a valid packet.

Signed-off-by: Ofir Bitton <obitton@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
---
 drivers/misc/habanalabs/common/firmware_if.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/misc/habanalabs/common/firmware_if.c b/drivers/misc/habanalabs/common/firmware_if.c
index 76741898d922..ba9a17ddc3a4 100644
--- a/drivers/misc/habanalabs/common/firmware_if.c
+++ b/drivers/misc/habanalabs/common/firmware_if.c
@@ -246,7 +246,7 @@ int hl_fw_send_cpu_message(struct hl_device *hdev, u32 hw_queue_id, u32 *msg,
 	 * Which means that we don't need to lock the access to the entire H/W
 	 * queues module when submitting a JOB to the CPU queue.
 	 */
-	hl_hw_queue_submit_bd(hdev, queue, 0, len, pkt_dma_addr);
+	hl_hw_queue_submit_bd(hdev, queue, hl_queue_inc_ptr(queue->pi), len, pkt_dma_addr);
 
 	if (prop->fw_app_cpu_boot_dev_sts0 & CPU_BOOT_DEV_STS0_PKT_PI_ACK_EN)
 		expected_ack_val = queue->pi;
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [PATCH 04/11] habanalabs: fix hwmon handling for legacy f/w
  2021-12-14 15:05 [PATCH 01/11] habanalabs: return correct clock throttling period Oded Gabbay
  2021-12-14 15:05 ` [PATCH 02/11] habanalabs: remove in_debug check in device open Oded Gabbay
  2021-12-14 15:05 ` [PATCH 03/11] habanalabs: add current PI value to cpu packets Oded Gabbay
@ 2021-12-14 15:05 ` Oded Gabbay
  2021-12-14 15:05 ` [PATCH 05/11] habanalabs: keep control device alive during hard reset Oded Gabbay
                   ` (6 subsequent siblings)
  9 siblings, 0 replies; 11+ messages in thread
From: Oded Gabbay @ 2021-12-14 15:05 UTC (permalink / raw)
  To: linux-kernel

In legacy f/w that use old hwmon.h file, the values of the hwmon
enums are different than the values that are in newer kernels (5.6
and above).

Therefore, to support working with those f/w, we need to do some
fixup before registering with the hwmon subsystem and also when
calling the functions that communicate with the f/w to retrieve
sensors information.

Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
---
 drivers/misc/habanalabs/common/hwmon.c | 201 +++++++++++++++++++++----
 1 file changed, 169 insertions(+), 32 deletions(-)

diff --git a/drivers/misc/habanalabs/common/hwmon.c b/drivers/misc/habanalabs/common/hwmon.c
index 70182b42940d..57f5d2c48330 100644
--- a/drivers/misc/habanalabs/common/hwmon.c
+++ b/drivers/misc/habanalabs/common/hwmon.c
@@ -10,17 +10,148 @@
 #include <linux/pci.h>
 #include <linux/hwmon.h>
 
-#define HWMON_NR_SENSOR_TYPES		(hwmon_pwm + 1)
+#define HWMON_NR_SENSOR_TYPES		(hwmon_max)
 
-int hl_build_hwmon_channel_info(struct hl_device *hdev,
-				struct cpucp_sensor *sensors_arr)
+#ifdef _HAS_HWMON_HWMON_T_ENABLE
+
+static u32 fixup_flags_legacy_fw(struct hl_device *hdev, enum hwmon_sensor_types type,
+					u32 cpucp_flags)
 {
-	u32 counts[HWMON_NR_SENSOR_TYPES] = {0};
-	u32 *sensors_by_type[HWMON_NR_SENSOR_TYPES] = {NULL};
+	u32 flags;
+
+	switch (type) {
+	case hwmon_temp:
+		flags = (cpucp_flags << 1) | HWMON_T_ENABLE;
+		break;
+
+	case hwmon_in:
+		flags = (cpucp_flags << 1) | HWMON_I_ENABLE;
+		break;
+
+	case hwmon_curr:
+		flags = (cpucp_flags << 1) | HWMON_C_ENABLE;
+		break;
+
+	case hwmon_fan:
+		flags = (cpucp_flags << 1) | HWMON_F_ENABLE;
+		break;
+
+	case hwmon_power:
+		flags = (cpucp_flags << 1) | HWMON_P_ENABLE;
+		break;
+
+	case hwmon_pwm:
+		/* enable bit was here from day 1, so no need to adjust */
+		flags = cpucp_flags;
+		break;
+
+	default:
+		dev_err(hdev->dev, "unsupported h/w sensor type %d\n", type);
+		flags = cpucp_flags;
+		break;
+	}
+
+	return flags;
+}
+
+static u32 fixup_attr_legacy_fw(u32 attr)
+{
+	return (attr - 1);
+}
+
+#else
+
+static u32 fixup_flags_legacy_fw(struct hl_device *hdev, enum hwmon_sensor_types type,
+						u32 cpucp_flags)
+{
+	return cpucp_flags;
+}
+
+static u32 fixup_attr_legacy_fw(u32 attr)
+{
+	return attr;
+}
+
+#endif /* !_HAS_HWMON_HWMON_T_ENABLE */
+
+static u32 adjust_hwmon_flags(struct hl_device *hdev, enum hwmon_sensor_types type, u32 cpucp_flags)
+{
+	u32 flags, cpucp_input_val;
+	bool use_cpucp_enum;
+
+	use_cpucp_enum = (hdev->asic_prop.fw_app_cpu_boot_dev_sts0 &
+					CPU_BOOT_DEV_STS0_MAP_HWMON_EN) ? true : false;
+
+	/* If f/w is using it's own enum, we need to check if the properties values are aligned.
+	 * If not, it means we need to adjust the values to the new format that is used in the
+	 * kernel since 5.6 (enum values were incremented by 1 by adding a new enable value).
+	 */
+	if (use_cpucp_enum) {
+		switch (type) {
+		case hwmon_temp:
+			cpucp_input_val = cpucp_temp_input;
+			if (cpucp_input_val == hwmon_temp_input)
+				flags = cpucp_flags;
+			else
+				flags = (cpucp_flags << 1) | HWMON_T_ENABLE;
+			break;
+
+		case hwmon_in:
+			cpucp_input_val = cpucp_in_input;
+			if (cpucp_input_val == hwmon_in_input)
+				flags = cpucp_flags;
+			else
+				flags = (cpucp_flags << 1) | HWMON_I_ENABLE;
+			break;
+
+		case hwmon_curr:
+			cpucp_input_val = cpucp_curr_input;
+			if (cpucp_input_val == hwmon_curr_input)
+				flags = cpucp_flags;
+			else
+				flags = (cpucp_flags << 1) | HWMON_C_ENABLE;
+			break;
+
+		case hwmon_fan:
+			cpucp_input_val = cpucp_fan_input;
+			if (cpucp_input_val == hwmon_fan_input)
+				flags = cpucp_flags;
+			else
+				flags = (cpucp_flags << 1) | HWMON_F_ENABLE;
+			break;
+
+		case hwmon_pwm:
+			/* enable bit was here from day 1, so no need to adjust */
+			flags = cpucp_flags;
+			break;
+
+		case hwmon_power:
+			cpucp_input_val = CPUCP_POWER_INPUT;
+			if (cpucp_input_val == hwmon_power_input)
+				flags = cpucp_flags;
+			else
+				flags = (cpucp_flags << 1) | HWMON_P_ENABLE;
+			break;
+
+		default:
+			dev_err(hdev->dev, "unsupported h/w sensor type %d\n", type);
+			flags = cpucp_flags;
+			break;
+		}
+	} else {
+		flags = fixup_flags_legacy_fw(hdev, type, cpucp_flags);
+	}
+
+	return flags;
+}
+
+int hl_build_hwmon_channel_info(struct hl_device *hdev, struct cpucp_sensor *sensors_arr)
+{
+	u32 num_sensors_for_type, flags, num_active_sensor_types = 0, arr_size = 0, *curr_arr;
 	u32 sensors_by_type_next_index[HWMON_NR_SENSOR_TYPES] = {0};
+	u32 *sensors_by_type[HWMON_NR_SENSOR_TYPES] = {NULL};
 	struct hwmon_channel_info **channels_info;
-	u32 num_sensors_for_type, num_active_sensor_types = 0,
-			arr_size = 0, *curr_arr;
+	u32 counts[HWMON_NR_SENSOR_TYPES] = {0};
 	enum hwmon_sensor_types type;
 	int rc, i, j;
 
@@ -31,8 +162,7 @@ int hl_build_hwmon_channel_info(struct hl_device *hdev,
 			break;
 
 		if (type >= HWMON_NR_SENSOR_TYPES) {
-			dev_err(hdev->dev,
-				"Got wrong sensor type %d from device\n", type);
+			dev_err(hdev->dev, "Got wrong sensor type %d from device\n", type);
 			return -EINVAL;
 		}
 
@@ -45,8 +175,9 @@ int hl_build_hwmon_channel_info(struct hl_device *hdev,
 			continue;
 
 		num_sensors_for_type = counts[i] + 1;
-		curr_arr = kcalloc(num_sensors_for_type, sizeof(*curr_arr),
-				GFP_KERNEL);
+		dev_dbg(hdev->dev, "num_sensors_for_type %d = %d\n", i, num_sensors_for_type);
+
+		curr_arr = kcalloc(num_sensors_for_type, sizeof(*curr_arr), GFP_KERNEL);
 		if (!curr_arr) {
 			rc = -ENOMEM;
 			goto sensors_type_err;
@@ -59,20 +190,18 @@ int hl_build_hwmon_channel_info(struct hl_device *hdev,
 	for (i = 0 ; i < arr_size ; i++) {
 		type = le32_to_cpu(sensors_arr[i].type);
 		curr_arr = sensors_by_type[type];
-		curr_arr[sensors_by_type_next_index[type]++] =
-				le32_to_cpu(sensors_arr[i].flags);
+		flags = adjust_hwmon_flags(hdev, type, le32_to_cpu(sensors_arr[i].flags));
+		curr_arr[sensors_by_type_next_index[type]++] = flags;
 	}
 
-	channels_info = kcalloc(num_active_sensor_types + 1,
-			sizeof(*channels_info), GFP_KERNEL);
+	channels_info = kcalloc(num_active_sensor_types + 1, sizeof(*channels_info), GFP_KERNEL);
 	if (!channels_info) {
 		rc = -ENOMEM;
 		goto channels_info_array_err;
 	}
 
 	for (i = 0 ; i < num_active_sensor_types ; i++) {
-		channels_info[i] = kzalloc(sizeof(*channels_info[i]),
-				GFP_KERNEL);
+		channels_info[i] = kzalloc(sizeof(*channels_info[i]), GFP_KERNEL);
 		if (!channels_info[i]) {
 			rc = -ENOMEM;
 			goto channel_info_err;
@@ -88,18 +217,19 @@ int hl_build_hwmon_channel_info(struct hl_device *hdev,
 		j++;
 	}
 
-	hdev->hl_chip_info->info =
-			(const struct hwmon_channel_info **)channels_info;
+	hdev->hl_chip_info->info = (const struct hwmon_channel_info **)channels_info;
 
 	return 0;
 
 channel_info_err:
-	for (i = 0 ; i < num_active_sensor_types ; i++)
+	for (i = 0 ; i < num_active_sensor_types ; i++) {
 		if (channels_info[i]) {
 			kfree(channels_info[i]->config);
 			kfree(channels_info[i]);
 		}
+	}
 	kfree(channels_info);
+
 channels_info_array_err:
 sensors_type_err:
 	for (i = 0 ; i < HWMON_NR_SENSOR_TYPES ; i++)
@@ -112,14 +242,16 @@ static int hl_read(struct device *dev, enum hwmon_sensor_types type,
 			u32 attr, int channel, long *val)
 {
 	struct hl_device *hdev = dev_get_drvdata(dev);
-	int rc;
+	bool use_cpucp_enum;
 	u32 cpucp_attr;
-	bool use_cpucp_enum = (hdev->asic_prop.fw_app_cpu_boot_dev_sts0 &
-				CPU_BOOT_DEV_STS0_MAP_HWMON_EN) ? true : false;
+	int rc;
 
 	if (!hl_device_operational(hdev, NULL))
 		return -ENODEV;
 
+	use_cpucp_enum = (hdev->asic_prop.fw_app_cpu_boot_dev_sts0 &
+					CPU_BOOT_DEV_STS0_MAP_HWMON_EN) ? true : false;
+
 	switch (type) {
 	case hwmon_temp:
 		switch (attr) {
@@ -151,7 +283,7 @@ static int hl_read(struct device *dev, enum hwmon_sensor_types type,
 		if (use_cpucp_enum)
 			rc = hl_get_temperature(hdev, channel, cpucp_attr, val);
 		else
-			rc = hl_get_temperature(hdev, channel, attr, val);
+			rc = hl_get_temperature(hdev, channel, fixup_attr_legacy_fw(attr), val);
 		break;
 	case hwmon_in:
 		switch (attr) {
@@ -174,7 +306,7 @@ static int hl_read(struct device *dev, enum hwmon_sensor_types type,
 		if (use_cpucp_enum)
 			rc = hl_get_voltage(hdev, channel, cpucp_attr, val);
 		else
-			rc = hl_get_voltage(hdev, channel, attr, val);
+			rc = hl_get_voltage(hdev, channel, fixup_attr_legacy_fw(attr), val);
 		break;
 	case hwmon_curr:
 		switch (attr) {
@@ -197,7 +329,7 @@ static int hl_read(struct device *dev, enum hwmon_sensor_types type,
 		if (use_cpucp_enum)
 			rc = hl_get_current(hdev, channel, cpucp_attr, val);
 		else
-			rc = hl_get_current(hdev, channel, attr, val);
+			rc = hl_get_current(hdev, channel, fixup_attr_legacy_fw(attr), val);
 		break;
 	case hwmon_fan:
 		switch (attr) {
@@ -217,7 +349,7 @@ static int hl_read(struct device *dev, enum hwmon_sensor_types type,
 		if (use_cpucp_enum)
 			rc = hl_get_fan_speed(hdev, channel, cpucp_attr, val);
 		else
-			rc = hl_get_fan_speed(hdev, channel, attr, val);
+			rc = hl_get_fan_speed(hdev, channel, fixup_attr_legacy_fw(attr), val);
 		break;
 	case hwmon_pwm:
 		switch (attr) {
@@ -234,6 +366,7 @@ static int hl_read(struct device *dev, enum hwmon_sensor_types type,
 		if (use_cpucp_enum)
 			rc = hl_get_pwm_info(hdev, channel, cpucp_attr, val);
 		else
+			/* no need for fixup as pwm was aligned from day 1 */
 			rc = hl_get_pwm_info(hdev, channel, attr, val);
 		break;
 	case hwmon_power:
@@ -251,7 +384,7 @@ static int hl_read(struct device *dev, enum hwmon_sensor_types type,
 		if (use_cpucp_enum)
 			rc = hl_get_power(hdev, channel, cpucp_attr, val);
 		else
-			rc = hl_get_power(hdev, channel, attr, val);
+			rc = hl_get_power(hdev, channel, fixup_attr_legacy_fw(attr), val);
 		break;
 	default:
 		return -EINVAL;
@@ -286,7 +419,7 @@ static int hl_write(struct device *dev, enum hwmon_sensor_types type,
 		if (use_cpucp_enum)
 			hl_set_temperature(hdev, channel, cpucp_attr, val);
 		else
-			hl_set_temperature(hdev, channel, attr, val);
+			hl_set_temperature(hdev, channel, fixup_attr_legacy_fw(attr), val);
 		break;
 	case hwmon_pwm:
 		switch (attr) {
@@ -303,6 +436,7 @@ static int hl_write(struct device *dev, enum hwmon_sensor_types type,
 		if (use_cpucp_enum)
 			hl_set_pwm_info(hdev, channel, cpucp_attr, val);
 		else
+			/* no need for fixup as pwm was aligned from day 1 */
 			hl_set_pwm_info(hdev, channel, attr, val);
 		break;
 	case hwmon_in:
@@ -317,7 +451,7 @@ static int hl_write(struct device *dev, enum hwmon_sensor_types type,
 		if (use_cpucp_enum)
 			hl_set_voltage(hdev, channel, cpucp_attr, val);
 		else
-			hl_set_voltage(hdev, channel, attr, val);
+			hl_set_voltage(hdev, channel, fixup_attr_legacy_fw(attr), val);
 		break;
 	case hwmon_curr:
 		switch (attr) {
@@ -331,7 +465,7 @@ static int hl_write(struct device *dev, enum hwmon_sensor_types type,
 		if (use_cpucp_enum)
 			hl_set_current(hdev, channel, cpucp_attr, val);
 		else
-			hl_set_current(hdev, channel, attr, val);
+			hl_set_current(hdev, channel, fixup_attr_legacy_fw(attr), val);
 		break;
 	case hwmon_power:
 		switch (attr) {
@@ -345,7 +479,7 @@ static int hl_write(struct device *dev, enum hwmon_sensor_types type,
 		if (use_cpucp_enum)
 			hl_set_power(hdev, channel, cpucp_attr, val);
 		else
-			hl_set_power(hdev, channel, attr, val);
+			hl_set_power(hdev, channel, fixup_attr_legacy_fw(attr), val);
 		break;
 	default:
 		return -EINVAL;
@@ -444,6 +578,9 @@ int hl_get_temperature(struct hl_device *hdev,
 	pkt.sensor_index = __cpu_to_le16(sensor_index);
 	pkt.type = __cpu_to_le16(attr);
 
+	dev_dbg(hdev->dev, "get temp, ctl 0x%x, sensor %d, type %d\n",
+		pkt.ctl, pkt.sensor_index, pkt.type);
+
 	rc = hdev->asic_funcs->send_cpu_message(hdev, (u32 *) &pkt, sizeof(pkt),
 						0, &result);
 
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [PATCH 05/11] habanalabs: keep control device alive during hard reset
  2021-12-14 15:05 [PATCH 01/11] habanalabs: return correct clock throttling period Oded Gabbay
                   ` (2 preceding siblings ...)
  2021-12-14 15:05 ` [PATCH 04/11] habanalabs: fix hwmon handling for legacy f/w Oded Gabbay
@ 2021-12-14 15:05 ` Oded Gabbay
  2021-12-14 15:05 ` [PATCH 06/11] habanalabs: sysfs support for two infineon versions Oded Gabbay
                   ` (5 subsequent siblings)
  9 siblings, 0 replies; 11+ messages in thread
From: Oded Gabbay @ 2021-12-14 15:05 UTC (permalink / raw)
  To: linux-kernel; +Cc: Dani Liberman

From: Dani Liberman <dliberman@habana.ai>

Need to allow user retrieve data during reset and afterwards without
the need to reopen the device.
Did it by seperating the user peocesses list into two lists:
1. fpriv_list which contains list of user processes that opened
   the device (currently only one).
2. fpriv_ctrl_list which contains list of user processes that opened
   the control device. This processes in this list shall not be
   killed during reset, only when the device is suddenly removed from
   PCI chain.

Signed-off-by: Dani Liberman <dliberman@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
---
 drivers/misc/habanalabs/common/context.c      |  8 +--
 drivers/misc/habanalabs/common/device.c       | 56 +++++++++++++------
 drivers/misc/habanalabs/common/habanalabs.h   |  7 ++-
 .../misc/habanalabs/common/habanalabs_drv.c   |  9 ++-
 4 files changed, 50 insertions(+), 30 deletions(-)

diff --git a/drivers/misc/habanalabs/common/context.c b/drivers/misc/habanalabs/common/context.c
index 49e6f1172d18..c6360e33bce8 100644
--- a/drivers/misc/habanalabs/common/context.c
+++ b/drivers/misc/habanalabs/common/context.c
@@ -283,11 +283,9 @@ struct hl_ctx *hl_get_compute_ctx(struct hl_device *hdev)
 		/* There can only be a single user which has opened the compute device, so exit
 		 * immediately once we find him
 		 */
-		if (!hpriv->is_control) {
-			ctx = hpriv->ctx;
-			hl_ctx_get(hdev, ctx);
-			break;
-		}
+		ctx = hpriv->ctx;
+		hl_ctx_get(hdev, ctx);
+		break;
 	}
 
 	mutex_unlock(&hdev->fpriv_list_lock);
diff --git a/drivers/misc/habanalabs/common/device.c b/drivers/misc/habanalabs/common/device.c
index bea05a59425f..f1f482c5cdcb 100644
--- a/drivers/misc/habanalabs/common/device.c
+++ b/drivers/misc/habanalabs/common/device.c
@@ -169,9 +169,9 @@ static int hl_device_release_ctrl(struct inode *inode, struct file *filp)
 		goto out;
 	}
 
-	mutex_lock(&hdev->fpriv_list_lock);
+	mutex_lock(&hdev->fpriv_ctrl_list_lock);
 	list_del(&hpriv->dev_node);
-	mutex_unlock(&hdev->fpriv_list_lock);
+	mutex_unlock(&hdev->fpriv_ctrl_list_lock);
 out:
 	put_pid(hpriv->taskpid);
 
@@ -449,7 +449,9 @@ static int device_early_init(struct hl_device *hdev)
 	INIT_LIST_HEAD(&hdev->cs_mirror_list);
 	spin_lock_init(&hdev->cs_mirror_lock);
 	INIT_LIST_HEAD(&hdev->fpriv_list);
+	INIT_LIST_HEAD(&hdev->fpriv_ctrl_list);
 	mutex_init(&hdev->fpriv_list_lock);
+	mutex_init(&hdev->fpriv_ctrl_list_lock);
 	atomic_set(&hdev->in_reset, 0);
 	mutex_init(&hdev->clk_throttling.lock);
 
@@ -491,6 +493,7 @@ static void device_early_fini(struct hl_device *hdev)
 	mutex_destroy(&hdev->send_cpu_message_lock);
 
 	mutex_destroy(&hdev->fpriv_list_lock);
+	mutex_destroy(&hdev->fpriv_ctrl_list_lock);
 
 	mutex_destroy(&hdev->clk_throttling.lock);
 
@@ -678,6 +681,8 @@ static void take_release_locks(struct hl_device *hdev)
 	/* Flush anyone that is inside device open */
 	mutex_lock(&hdev->fpriv_list_lock);
 	mutex_unlock(&hdev->fpriv_list_lock);
+	mutex_lock(&hdev->fpriv_ctrl_list_lock);
+	mutex_unlock(&hdev->fpriv_ctrl_list_lock);
 }
 
 static void cleanup_resources(struct hl_device *hdev, bool hard_reset, bool fw_reset)
@@ -789,17 +794,21 @@ int hl_device_resume(struct hl_device *hdev)
 	return rc;
 }
 
-static int device_kill_open_processes(struct hl_device *hdev, u32 timeout)
+static int device_kill_open_processes(struct hl_device *hdev, u32 timeout, bool control_dev)
 {
-	struct hl_fpriv	*hpriv;
 	struct task_struct *task = NULL;
+	struct list_head *fd_list;
+	struct hl_fpriv	*hpriv;
+	struct mutex *fd_lock;
 	u32 pending_cnt;
 
+	fd_lock = control_dev ? &hdev->fpriv_ctrl_list_lock : &hdev->fpriv_list_lock;
+	fd_list = control_dev ? &hdev->fpriv_ctrl_list : &hdev->fpriv_list;
 
 	/* Giving time for user to close FD, and for processes that are inside
 	 * hl_device_open to finish
 	 */
-	if (!list_empty(&hdev->fpriv_list))
+	if (!list_empty(fd_list))
 		ssleep(1);
 
 	if (timeout) {
@@ -815,12 +824,12 @@ static int device_kill_open_processes(struct hl_device *hdev, u32 timeout)
 		}
 	}
 
-	mutex_lock(&hdev->fpriv_list_lock);
+	mutex_lock(fd_lock);
 
 	/* This section must be protected because we are dereferencing
 	 * pointers that are freed if the process exits
 	 */
-	list_for_each_entry(hpriv, &hdev->fpriv_list, dev_node) {
+	list_for_each_entry(hpriv, fd_list, dev_node) {
 		task = get_pid_task(hpriv->taskpid, PIDTYPE_PID);
 		if (task) {
 			dev_info(hdev->dev, "Killing user process pid=%d\n",
@@ -832,12 +841,12 @@ static int device_kill_open_processes(struct hl_device *hdev, u32 timeout)
 		} else {
 			dev_warn(hdev->dev,
 				"Can't get task struct for PID so giving up on killing process\n");
-			mutex_unlock(&hdev->fpriv_list_lock);
+			mutex_unlock(fd_lock);
 			return -ETIME;
 		}
 	}
 
-	mutex_unlock(&hdev->fpriv_list_lock);
+	mutex_unlock(fd_lock);
 
 	/*
 	 * We killed the open users, but that doesn't mean they are closed.
@@ -849,7 +858,7 @@ static int device_kill_open_processes(struct hl_device *hdev, u32 timeout)
 	 */
 
 wait_for_processes:
-	while ((!list_empty(&hdev->fpriv_list)) && (pending_cnt)) {
+	while ((!list_empty(fd_list)) && (pending_cnt)) {
 		dev_dbg(hdev->dev,
 			"Waiting for all unmap operations to finish before hard reset\n");
 
@@ -859,7 +868,7 @@ static int device_kill_open_processes(struct hl_device *hdev, u32 timeout)
 	}
 
 	/* All processes exited successfully */
-	if (list_empty(&hdev->fpriv_list))
+	if (list_empty(fd_list))
 		return 0;
 
 	/* Give up waiting for processes to exit */
@@ -871,14 +880,19 @@ static int device_kill_open_processes(struct hl_device *hdev, u32 timeout)
 	return -EBUSY;
 }
 
-static void device_disable_open_processes(struct hl_device *hdev)
+static void device_disable_open_processes(struct hl_device *hdev, bool control_dev)
 {
+	struct list_head *fd_list;
 	struct hl_fpriv *hpriv;
+	struct mutex *fd_lock;
 
-	mutex_lock(&hdev->fpriv_list_lock);
-	list_for_each_entry(hpriv, &hdev->fpriv_list, dev_node)
+	fd_lock = control_dev ? &hdev->fpriv_ctrl_list_lock : &hdev->fpriv_list_lock;
+	fd_list = control_dev ? &hdev->fpriv_ctrl_list : &hdev->fpriv_list;
+
+	mutex_lock(fd_lock);
+	list_for_each_entry(hpriv, fd_list, dev_node)
 		hpriv->hdev = NULL;
-	mutex_unlock(&hdev->fpriv_list_lock);
+	mutex_unlock(fd_lock);
 }
 
 static void handle_reset_trigger(struct hl_device *hdev, u32 flags)
@@ -1057,7 +1071,7 @@ int hl_device_reset(struct hl_device *hdev, u32 flags)
 		 * process can't really exit until all its CSs are done, which
 		 * is what we do in cs rollback
 		 */
-		rc = device_kill_open_processes(hdev, 0);
+		rc = device_kill_open_processes(hdev, 0, false);
 
 		if (rc == -EBUSY) {
 			if (hdev->device_fini_pending) {
@@ -1629,10 +1643,16 @@ void hl_device_fini(struct hl_device *hdev)
 		"Waiting for all processes to exit (timeout of %u seconds)",
 		HL_PENDING_RESET_LONG_SEC);
 
-	rc = device_kill_open_processes(hdev, HL_PENDING_RESET_LONG_SEC);
+	rc = device_kill_open_processes(hdev, HL_PENDING_RESET_LONG_SEC, false);
 	if (rc) {
 		dev_crit(hdev->dev, "Failed to kill all open processes\n");
-		device_disable_open_processes(hdev);
+		device_disable_open_processes(hdev, false);
+	}
+
+	rc = device_kill_open_processes(hdev, 0, true);
+	if (rc) {
+		dev_crit(hdev->dev, "Failed to kill all control device open processes\n");
+		device_disable_open_processes(hdev, true);
 	}
 
 	hl_cb_pool_fini(hdev);
diff --git a/drivers/misc/habanalabs/common/habanalabs.h b/drivers/misc/habanalabs/common/habanalabs.h
index 362eee3f028c..015aa1ee8ce0 100644
--- a/drivers/misc/habanalabs/common/habanalabs.h
+++ b/drivers/misc/habanalabs/common/habanalabs.h
@@ -1824,7 +1824,6 @@ struct hl_debug_params {
  * @dev_node: node in the device list of file private data
  * @refcount: number of related contexts.
  * @restore_phase_mutex: lock for context switch and restore phase.
- * @is_control: true for control device, false otherwise
  */
 struct hl_fpriv {
 	struct hl_device	*hdev;
@@ -1837,7 +1836,6 @@ struct hl_fpriv {
 	struct list_head	dev_node;
 	struct kref		refcount;
 	struct mutex		restore_phase_mutex;
-	u8			is_control;
 };
 
 
@@ -2502,7 +2500,10 @@ struct last_error_session_info {
  * @internal_cb_va_base: internal cb pool mmu virtual address base
  * @fpriv_list: list of file private data structures. Each structure is created
  *              when a user opens the device
+ * @fpriv_ctrl_list: list of file private data structures. Each structure is created
+ *              when a user opens the control device
  * @fpriv_list_lock: protects the fpriv_list
+ * @fpriv_ctrl_list_lock: protects the fpriv_ctrl_list
  * @aggregated_cs_counters: aggregated cs counters among all contexts
  * @mmu_priv: device-specific MMU data.
  * @mmu_func: device-related MMU functions.
@@ -2655,7 +2656,9 @@ struct hl_device {
 	u64				internal_cb_va_base;
 
 	struct list_head		fpriv_list;
+	struct list_head		fpriv_ctrl_list;
 	struct mutex			fpriv_list_lock;
+	struct mutex			fpriv_ctrl_list_lock;
 
 	struct hl_cs_counters_atomic	aggregated_cs_counters;
 
diff --git a/drivers/misc/habanalabs/common/habanalabs_drv.c b/drivers/misc/habanalabs/common/habanalabs_drv.c
index d59201f93de9..aa4e07b1f839 100644
--- a/drivers/misc/habanalabs/common/habanalabs_drv.c
+++ b/drivers/misc/habanalabs/common/habanalabs_drv.c
@@ -220,12 +220,11 @@ int hl_device_open_ctrl(struct inode *inode, struct file *filp)
 	hpriv->hdev = hdev;
 	filp->private_data = hpriv;
 	hpriv->filp = filp;
-	hpriv->is_control = true;
 	nonseekable_open(inode, filp);
 
 	hpriv->taskpid = find_get_pid(current->pid);
 
-	mutex_lock(&hdev->fpriv_list_lock);
+	mutex_lock(&hdev->fpriv_ctrl_list_lock);
 
 	if (!hl_device_operational(hdev, NULL)) {
 		dev_err_ratelimited(hdev->dev_ctrl,
@@ -235,13 +234,13 @@ int hl_device_open_ctrl(struct inode *inode, struct file *filp)
 		goto out_err;
 	}
 
-	list_add(&hpriv->dev_node, &hdev->fpriv_list);
-	mutex_unlock(&hdev->fpriv_list_lock);
+	list_add(&hpriv->dev_node, &hdev->fpriv_ctrl_list);
+	mutex_unlock(&hdev->fpriv_ctrl_list_lock);
 
 	return 0;
 
 out_err:
-	mutex_unlock(&hdev->fpriv_list_lock);
+	mutex_unlock(&hdev->fpriv_ctrl_list_lock);
 	filp->private_data = NULL;
 	put_pid(hpriv->taskpid);
 
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [PATCH 06/11] habanalabs: sysfs support for two infineon versions
  2021-12-14 15:05 [PATCH 01/11] habanalabs: return correct clock throttling period Oded Gabbay
                   ` (3 preceding siblings ...)
  2021-12-14 15:05 ` [PATCH 05/11] habanalabs: keep control device alive during hard reset Oded Gabbay
@ 2021-12-14 15:05 ` Oded Gabbay
  2021-12-14 15:05 ` [PATCH 07/11] habanalabs: expose soft reset sysfs nodes for inference ASIC Oded Gabbay
                   ` (4 subsequent siblings)
  9 siblings, 0 replies; 11+ messages in thread
From: Oded Gabbay @ 2021-12-14 15:05 UTC (permalink / raw)
  To: linux-kernel; +Cc: Ofir Bitton

From: Ofir Bitton <obitton@habana.ai>

Currently sysfs support dumping a single infineon version, in
future asics we will have two infineon versions.

Signed-off-by: Ofir Bitton <obitton@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
---
 drivers/misc/habanalabs/common/sysfs.c            |  9 +++++++--
 drivers/misc/habanalabs/include/common/cpucp_if.h | 13 ++++++++++---
 2 files changed, 17 insertions(+), 5 deletions(-)

diff --git a/drivers/misc/habanalabs/common/sysfs.c b/drivers/misc/habanalabs/common/sysfs.c
index 15e4ae65e515..39d449c712b6 100644
--- a/drivers/misc/habanalabs/common/sysfs.c
+++ b/drivers/misc/habanalabs/common/sysfs.c
@@ -163,8 +163,13 @@ static ssize_t infineon_ver_show(struct device *dev,
 {
 	struct hl_device *hdev = dev_get_drvdata(dev);
 
-	return sprintf(buf, "0x%04x\n",
-			hdev->asic_prop.cpucp_info.infineon_version);
+	if (hdev->asic_prop.cpucp_info.infineon_version2)
+		return sprintf(buf, "%#04x %#04x\n",
+			le32_to_cpu(hdev->asic_prop.cpucp_info.infineon_version),
+			le32_to_cpu(hdev->asic_prop.cpucp_info.infineon_version2));
+	else
+		return sprintf(buf, "%#04x\n",
+			le32_to_cpu(hdev->asic_prop.cpucp_info.infineon_version));
 }
 
 static ssize_t fuse_ver_show(struct device *dev, struct device_attribute *attr,
diff --git a/drivers/misc/habanalabs/include/common/cpucp_if.h b/drivers/misc/habanalabs/include/common/cpucp_if.h
index 078fb4bd0316..e4f44c4d20b5 100644
--- a/drivers/misc/habanalabs/include/common/cpucp_if.h
+++ b/drivers/misc/habanalabs/include/common/cpucp_if.h
@@ -1,6 +1,6 @@
 /* SPDX-License-Identifier: GPL-2.0
  *
- * Copyright 2021 HabanaLabs, Ltd.
+ * Copyright 2020-2021 HabanaLabs, Ltd.
  * All Rights Reserved.
  *
  */
@@ -761,6 +761,7 @@ struct cpucp_security_info {
  * @fuse_version: silicon production FUSE information.
  * @thermal_version: thermald S/W version.
  * @cpucp_version: CpuCP S/W version.
+ * @infineon_version2: Infineon 2nd stage DC-DC version.
  * @dram_size: available DRAM size.
  * @card_name: card name that will be displayed in HWMON subsystem on the host
  * @sec_info: security information
@@ -770,6 +771,10 @@ struct cpucp_security_info {
  * @dram_binning_mask: DRAM binning mask, 1 bit per dram instance
  *                     (0 = functional 1 = binned)
  * @memory_repair_flag: eFuse flag indicating memory repair
+ * @edma_binning_mask: EDMA binning mask, 1 bit per EDMA instance
+ *                     (0 = functional 1 = binned)
+ * @xbar_binning_mask: Xbar binning mask, 1 bit per Xbar instance
+ *                     (0 = functional 1 = binned)
  */
 struct cpucp_info {
 	struct cpucp_sensor sensors[CPUCP_MAX_SENSORS];
@@ -782,7 +787,7 @@ struct cpucp_info {
 	__u8 fuse_version[VERSION_MAX_LEN];
 	__u8 thermal_version[VERSION_MAX_LEN];
 	__u8 cpucp_version[VERSION_MAX_LEN];
-	__le32 reserved2;
+	__le32 infineon_version2;
 	__le64 dram_size;
 	char card_name[CARD_NAME_MAX_LEN];
 	__le64 reserved3;
@@ -790,7 +795,9 @@ struct cpucp_info {
 	__u8 reserved5;
 	__u8 dram_binning_mask;
 	__u8 memory_repair_flag;
-	__u8 pad[5];
+	__u8 edma_binning_mask;
+	__u8 xbar_binning_mask;
+	__u8 pad[3];
 	struct cpucp_security_info sec_info;
 	__le32 reserved6;
 	__u8 pll_map[PLL_MAP_LEN];
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [PATCH 07/11] habanalabs: expose soft reset sysfs nodes for inference ASIC
  2021-12-14 15:05 [PATCH 01/11] habanalabs: return correct clock throttling period Oded Gabbay
                   ` (4 preceding siblings ...)
  2021-12-14 15:05 ` [PATCH 06/11] habanalabs: sysfs support for two infineon versions Oded Gabbay
@ 2021-12-14 15:05 ` Oded Gabbay
  2021-12-14 15:05 ` [PATCH 08/11] habanalabs: clean MMU headers definitions Oded Gabbay
                   ` (3 subsequent siblings)
  9 siblings, 0 replies; 11+ messages in thread
From: Oded Gabbay @ 2021-12-14 15:05 UTC (permalink / raw)
  To: linux-kernel; +Cc: Ofir Bitton

From: Ofir Bitton <obitton@habana.ai>

As we allow soft-reset to be performed only on inference devices,
having the sysfs nodes may cause a confusion. Hence, we remove those
nodes on training ASICs.

Signed-off-by: Ofir Bitton <obitton@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
---
 drivers/misc/habanalabs/common/sysfs.c | 32 ++++++++++++++++++++++++--
 1 file changed, 30 insertions(+), 2 deletions(-)

diff --git a/drivers/misc/habanalabs/common/sysfs.c b/drivers/misc/habanalabs/common/sysfs.c
index 39d449c712b6..8ce9884efe76 100644
--- a/drivers/misc/habanalabs/common/sysfs.c
+++ b/drivers/misc/habanalabs/common/sysfs.c
@@ -424,8 +424,6 @@ static struct attribute *hl_dev_attrs[] = {
 	&dev_attr_max_power.attr,
 	&dev_attr_pci_addr.attr,
 	&dev_attr_preboot_btl_ver.attr,
-	&dev_attr_soft_reset.attr,
-	&dev_attr_soft_reset_cnt.attr,
 	&dev_attr_status.attr,
 	&dev_attr_thermal_ver.attr,
 	&dev_attr_uboot_ver.attr,
@@ -450,6 +448,21 @@ static const struct attribute_group *hl_dev_attr_groups[] = {
 	NULL,
 };
 
+static struct attribute *hl_dev_inference_attrs[] = {
+	&dev_attr_soft_reset.attr,
+	&dev_attr_soft_reset_cnt.attr,
+	NULL,
+};
+
+static struct attribute_group hl_dev_inference_attr_group = {
+	.attrs = hl_dev_inference_attrs,
+};
+
+static const struct attribute_group *hl_dev_inference_attr_groups[] = {
+	&hl_dev_inference_attr_group,
+	NULL,
+};
+
 int hl_sysfs_init(struct hl_device *hdev)
 {
 	int rc;
@@ -465,10 +478,25 @@ int hl_sysfs_init(struct hl_device *hdev)
 		return rc;
 	}
 
+	if (!hdev->allow_inference_soft_reset)
+		return 0;
+
+	rc = device_add_groups(hdev->dev, hl_dev_inference_attr_groups);
+	if (rc) {
+		dev_err(hdev->dev,
+			"Failed to add groups to device, error %d\n", rc);
+		return rc;
+	}
+
 	return 0;
 }
 
 void hl_sysfs_fini(struct hl_device *hdev)
 {
 	device_remove_groups(hdev->dev, hl_dev_attr_groups);
+
+	if (!hdev->allow_inference_soft_reset)
+		return;
+
+	device_remove_groups(hdev->dev, hl_dev_inference_attr_groups);
 }
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [PATCH 08/11] habanalabs: clean MMU headers definitions
  2021-12-14 15:05 [PATCH 01/11] habanalabs: return correct clock throttling period Oded Gabbay
                   ` (5 preceding siblings ...)
  2021-12-14 15:05 ` [PATCH 07/11] habanalabs: expose soft reset sysfs nodes for inference ASIC Oded Gabbay
@ 2021-12-14 15:05 ` Oded Gabbay
  2021-12-14 15:05 ` [PATCH 09/11] habanalabs: modify cpu boot status error print Oded Gabbay
                   ` (2 subsequent siblings)
  9 siblings, 0 replies; 11+ messages in thread
From: Oded Gabbay @ 2021-12-14 15:05 UTC (permalink / raw)
  To: linux-kernel; +Cc: Ohad Sharabi

From: Ohad Sharabi <osharabi@habana.ai>

During the MMU development the MMU header files were left with unclean
definitions:

- MMU "version specific" definitions that were left in the mmu_general
  file
- unused definitions

This patch attempts, where possible, to keep definitions that can serve
multiple MMU versions (but that are not tightly bound with specific MMU
arch) in the mmu_general header file (e.g. different definitions for
number of HOPs).

Otherwise, move MMU version specific definitions (e.g. HOPs masks and
shifts) to the specific MMU version file.

Signed-off-by: Ohad Sharabi <osharabi@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
---
 drivers/misc/habanalabs/common/mmu/mmu_v1.c   |  8 +++----
 drivers/misc/habanalabs/gaudi/gaudi.c         | 24 +++++++++----------
 drivers/misc/habanalabs/goya/goya.c           | 24 +++++++++----------
 .../include/hw_ip/mmu/mmu_general.h           | 19 ++++-----------
 .../habanalabs/include/hw_ip/mmu/mmu_v1_0.h   | 18 +++++++++++---
 .../habanalabs/include/hw_ip/mmu/mmu_v1_1.h   | 20 ++++++++++++----
 6 files changed, 64 insertions(+), 49 deletions(-)

diff --git a/drivers/misc/habanalabs/common/mmu/mmu_v1.c b/drivers/misc/habanalabs/common/mmu/mmu_v1.c
index 159da2fafd79..6134b6ae7615 100644
--- a/drivers/misc/habanalabs/common/mmu/mmu_v1.c
+++ b/drivers/misc/habanalabs/common/mmu/mmu_v1.c
@@ -269,7 +269,7 @@ static int dram_default_mapping_init(struct hl_ctx *ctx)
 
 	num_of_hop3 = prop->dram_size_for_default_page_mapping;
 	do_div(num_of_hop3, prop->dram_page_size);
-	do_div(num_of_hop3, PTE_ENTRIES_IN_HOP);
+	do_div(num_of_hop3, HOP_PTE_ENTRIES_512);
 
 	/* add hop1 and hop2 */
 	total_hops = num_of_hop3 + 2;
@@ -330,7 +330,7 @@ static int dram_default_mapping_init(struct hl_ctx *ctx)
 
 	for (i = 0 ; i < num_of_hop3 ; i++) {
 		hop3_pte_addr = ctx->dram_default_hops[i];
-		for (j = 0 ; j < PTE_ENTRIES_IN_HOP ; j++) {
+		for (j = 0 ; j < HOP_PTE_ENTRIES_512 ; j++) {
 			write_final_pte(ctx, hop3_pte_addr, pte_val);
 			get_pte(ctx, ctx->dram_default_hops[i]);
 			hop3_pte_addr += HL_PTE_SIZE;
@@ -369,7 +369,7 @@ static void dram_default_mapping_fini(struct hl_ctx *ctx)
 
 	num_of_hop3 = prop->dram_size_for_default_page_mapping;
 	do_div(num_of_hop3, prop->dram_page_size);
-	do_div(num_of_hop3, PTE_ENTRIES_IN_HOP);
+	do_div(num_of_hop3, HOP_PTE_ENTRIES_512);
 
 	hop0_addr = get_hop0_addr(ctx);
 	/* add hop1 and hop2 */
@@ -379,7 +379,7 @@ static void dram_default_mapping_fini(struct hl_ctx *ctx)
 
 	for (i = 0 ; i < num_of_hop3 ; i++) {
 		hop3_pte_addr = ctx->dram_default_hops[i];
-		for (j = 0 ; j < PTE_ENTRIES_IN_HOP ; j++) {
+		for (j = 0 ; j < HOP_PTE_ENTRIES_512 ; j++) {
 			clear_pte(ctx, hop3_pte_addr);
 			put_pte(ctx, ctx->dram_default_hops[i]);
 			hop3_pte_addr += HL_PTE_SIZE;
diff --git a/drivers/misc/habanalabs/gaudi/gaudi.c b/drivers/misc/habanalabs/gaudi/gaudi.c
index 07e03d44930e..b3431eac4f04 100644
--- a/drivers/misc/habanalabs/gaudi/gaudi.c
+++ b/drivers/misc/habanalabs/gaudi/gaudi.c
@@ -593,21 +593,21 @@ static int gaudi_set_fixed_properties(struct hl_device *hdev)
 	else
 		prop->mmu_pgt_size = MMU_PAGE_TABLES_SIZE;
 	prop->mmu_pte_size = HL_PTE_SIZE;
-	prop->mmu_hop_table_size = HOP_TABLE_SIZE;
-	prop->mmu_hop0_tables_total_size = HOP0_TABLES_TOTAL_SIZE;
+	prop->mmu_hop_table_size = HOP_TABLE_SIZE_512_PTE;
+	prop->mmu_hop0_tables_total_size = HOP0_512_PTE_TABLES_TOTAL_SIZE;
 	prop->dram_page_size = PAGE_SIZE_2MB;
 	prop->dram_supports_virtual_memory = false;
 
-	prop->pmmu.hop0_shift = HOP0_SHIFT;
-	prop->pmmu.hop1_shift = HOP1_SHIFT;
-	prop->pmmu.hop2_shift = HOP2_SHIFT;
-	prop->pmmu.hop3_shift = HOP3_SHIFT;
-	prop->pmmu.hop4_shift = HOP4_SHIFT;
-	prop->pmmu.hop0_mask = HOP0_MASK;
-	prop->pmmu.hop1_mask = HOP1_MASK;
-	prop->pmmu.hop2_mask = HOP2_MASK;
-	prop->pmmu.hop3_mask = HOP3_MASK;
-	prop->pmmu.hop4_mask = HOP4_MASK;
+	prop->pmmu.hop0_shift = MMU_V1_1_HOP0_SHIFT;
+	prop->pmmu.hop1_shift = MMU_V1_1_HOP1_SHIFT;
+	prop->pmmu.hop2_shift = MMU_V1_1_HOP2_SHIFT;
+	prop->pmmu.hop3_shift = MMU_V1_1_HOP3_SHIFT;
+	prop->pmmu.hop4_shift = MMU_V1_1_HOP4_SHIFT;
+	prop->pmmu.hop0_mask = MMU_V1_1_HOP0_MASK;
+	prop->pmmu.hop1_mask = MMU_V1_1_HOP1_MASK;
+	prop->pmmu.hop2_mask = MMU_V1_1_HOP2_MASK;
+	prop->pmmu.hop3_mask = MMU_V1_1_HOP3_MASK;
+	prop->pmmu.hop4_mask = MMU_V1_1_HOP4_MASK;
 	prop->pmmu.start_addr = VA_HOST_SPACE_START;
 	prop->pmmu.end_addr =
 			(VA_HOST_SPACE_START + VA_HOST_SPACE_SIZE / 2) - 1;
diff --git a/drivers/misc/habanalabs/goya/goya.c b/drivers/misc/habanalabs/goya/goya.c
index 8d0f2cd608fc..f4473013f1ee 100644
--- a/drivers/misc/habanalabs/goya/goya.c
+++ b/drivers/misc/habanalabs/goya/goya.c
@@ -410,21 +410,21 @@ int goya_set_fixed_properties(struct hl_device *hdev)
 	else
 		prop->mmu_pgt_size = MMU_PAGE_TABLES_SIZE;
 	prop->mmu_pte_size = HL_PTE_SIZE;
-	prop->mmu_hop_table_size = HOP_TABLE_SIZE;
-	prop->mmu_hop0_tables_total_size = HOP0_TABLES_TOTAL_SIZE;
+	prop->mmu_hop_table_size = HOP_TABLE_SIZE_512_PTE;
+	prop->mmu_hop0_tables_total_size = HOP0_512_PTE_TABLES_TOTAL_SIZE;
 	prop->dram_page_size = PAGE_SIZE_2MB;
 	prop->dram_supports_virtual_memory = true;
 
-	prop->dmmu.hop0_shift = HOP0_SHIFT;
-	prop->dmmu.hop1_shift = HOP1_SHIFT;
-	prop->dmmu.hop2_shift = HOP2_SHIFT;
-	prop->dmmu.hop3_shift = HOP3_SHIFT;
-	prop->dmmu.hop4_shift = HOP4_SHIFT;
-	prop->dmmu.hop0_mask = HOP0_MASK;
-	prop->dmmu.hop1_mask = HOP1_MASK;
-	prop->dmmu.hop2_mask = HOP2_MASK;
-	prop->dmmu.hop3_mask = HOP3_MASK;
-	prop->dmmu.hop4_mask = HOP4_MASK;
+	prop->dmmu.hop0_shift = MMU_V1_0_HOP0_SHIFT;
+	prop->dmmu.hop1_shift = MMU_V1_0_HOP1_SHIFT;
+	prop->dmmu.hop2_shift = MMU_V1_0_HOP2_SHIFT;
+	prop->dmmu.hop3_shift = MMU_V1_0_HOP3_SHIFT;
+	prop->dmmu.hop4_shift = MMU_V1_0_HOP4_SHIFT;
+	prop->dmmu.hop0_mask = MMU_V1_0_HOP0_MASK;
+	prop->dmmu.hop1_mask = MMU_V1_0_HOP1_MASK;
+	prop->dmmu.hop2_mask = MMU_V1_0_HOP2_MASK;
+	prop->dmmu.hop3_mask = MMU_V1_0_HOP3_MASK;
+	prop->dmmu.hop4_mask = MMU_V1_0_HOP4_MASK;
 	prop->dmmu.start_addr = VA_DDR_SPACE_START;
 	prop->dmmu.end_addr = VA_DDR_SPACE_END;
 	prop->dmmu.page_size = PAGE_SIZE_2MB;
diff --git a/drivers/misc/habanalabs/include/hw_ip/mmu/mmu_general.h b/drivers/misc/habanalabs/include/hw_ip/mmu/mmu_general.h
index dedf20e8f956..758f246627f8 100644
--- a/drivers/misc/habanalabs/include/hw_ip/mmu/mmu_general.h
+++ b/drivers/misc/habanalabs/include/hw_ip/mmu/mmu_general.h
@@ -16,27 +16,18 @@
 #define PAGE_PRESENT_MASK		0x0000000000001ull
 #define SWAP_OUT_MASK			0x0000000000004ull
 #define LAST_MASK			0x0000000000800ull
-#define HOP0_MASK			0x3000000000000ull
-#define HOP1_MASK			0x0FF8000000000ull
-#define HOP2_MASK			0x0007FC0000000ull
-#define HOP3_MASK			0x000003FE00000ull
-#define HOP4_MASK			0x00000001FF000ull
 #define FLAGS_MASK			0x0000000000FFFull
 
-#define HOP0_SHIFT			48
-#define HOP1_SHIFT			39
-#define HOP2_SHIFT			30
-#define HOP3_SHIFT			21
-#define HOP4_SHIFT			12
-
 #define MMU_ARCH_5_HOPS			5
 
 #define HOP_PHYS_ADDR_MASK		(~FLAGS_MASK)
 
 #define HL_PTE_SIZE			sizeof(u64)
-#define HOP_TABLE_SIZE			PAGE_SIZE_4KB
-#define PTE_ENTRIES_IN_HOP		(HOP_TABLE_SIZE / HL_PTE_SIZE)
-#define HOP0_TABLES_TOTAL_SIZE		(HOP_TABLE_SIZE * MAX_ASID)
+
+/* definitions for HOP with 512 PTE entries */
+#define HOP_PTE_ENTRIES_512		512
+#define HOP_TABLE_SIZE_512_PTE		(HOP_PTE_ENTRIES_512 * HL_PTE_SIZE)
+#define HOP0_512_PTE_TABLES_TOTAL_SIZE	(HOP_TABLE_SIZE_512_PTE * MAX_ASID)
 
 #define MMU_HOP0_PA43_12_SHIFT		12
 #define MMU_HOP0_PA49_44_SHIFT		(12 + 32)
diff --git a/drivers/misc/habanalabs/include/hw_ip/mmu/mmu_v1_0.h b/drivers/misc/habanalabs/include/hw_ip/mmu/mmu_v1_0.h
index 8539dd041f2c..86511002e367 100644
--- a/drivers/misc/habanalabs/include/hw_ip/mmu/mmu_v1_0.h
+++ b/drivers/misc/habanalabs/include/hw_ip/mmu/mmu_v1_0.h
@@ -8,8 +8,20 @@
 #ifndef INCLUDE_MMU_V1_0_H_
 #define INCLUDE_MMU_V1_0_H_
 
-#define MMU_HOP0_PA43_12	0x490004
-#define MMU_HOP0_PA49_44	0x490008
-#define MMU_ASID_BUSY		0x490000
+#define MMU_V1_0_HOP0_MASK		0x3000000000000ull
+#define MMU_V1_0_HOP1_MASK		0x0FF8000000000ull
+#define MMU_V1_0_HOP2_MASK		0x0007FC0000000ull
+#define MMU_V1_0_HOP3_MASK		0x000003FE00000ull
+#define MMU_V1_0_HOP4_MASK		0x00000001FF000ull
+
+#define MMU_V1_0_HOP0_SHIFT		48
+#define MMU_V1_0_HOP1_SHIFT		39
+#define MMU_V1_0_HOP2_SHIFT		30
+#define MMU_V1_0_HOP3_SHIFT		21
+#define MMU_V1_0_HOP4_SHIFT		12
+
+#define MMU_HOP0_PA43_12		0x490004
+#define MMU_HOP0_PA49_44		0x490008
+#define MMU_ASID_BUSY			0x490000
 
 #endif /* INCLUDE_MMU_V1_0_H_ */
diff --git a/drivers/misc/habanalabs/include/hw_ip/mmu/mmu_v1_1.h b/drivers/misc/habanalabs/include/hw_ip/mmu/mmu_v1_1.h
index b2a9570583ac..9c727a5d47b4 100644
--- a/drivers/misc/habanalabs/include/hw_ip/mmu/mmu_v1_1.h
+++ b/drivers/misc/habanalabs/include/hw_ip/mmu/mmu_v1_1.h
@@ -8,9 +8,21 @@
 #ifndef INCLUDE_MMU_V1_1_H_
 #define INCLUDE_MMU_V1_1_H_
 
-#define MMU_ASID		0xC12004
-#define MMU_HOP0_PA43_12	0xC12008
-#define MMU_HOP0_PA49_44	0xC1200C
-#define MMU_BUSY		0xC12000
+#define MMU_V1_1_HOP0_MASK		0x3000000000000ull
+#define MMU_V1_1_HOP1_MASK		0x0FF8000000000ull
+#define MMU_V1_1_HOP2_MASK		0x0007FC0000000ull
+#define MMU_V1_1_HOP3_MASK		0x000003FE00000ull
+#define MMU_V1_1_HOP4_MASK		0x00000001FF000ull
+
+#define MMU_V1_1_HOP0_SHIFT		48
+#define MMU_V1_1_HOP1_SHIFT		39
+#define MMU_V1_1_HOP2_SHIFT		30
+#define MMU_V1_1_HOP3_SHIFT		21
+#define MMU_V1_1_HOP4_SHIFT		12
+
+#define MMU_ASID			0xC12004
+#define MMU_HOP0_PA43_12		0xC12008
+#define MMU_HOP0_PA49_44		0xC1200C
+#define MMU_BUSY			0xC12000
 
 #endif /* INCLUDE_MMU_V1_1_H_ */
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [PATCH 09/11] habanalabs: modify cpu boot status error print
  2021-12-14 15:05 [PATCH 01/11] habanalabs: return correct clock throttling period Oded Gabbay
                   ` (6 preceding siblings ...)
  2021-12-14 15:05 ` [PATCH 08/11] habanalabs: clean MMU headers definitions Oded Gabbay
@ 2021-12-14 15:05 ` Oded Gabbay
  2021-12-14 15:05 ` [PATCH 10/11] habanalabs: prevent wait if CS in multi-CS list completed Oded Gabbay
  2021-12-14 15:05 ` [PATCH 11/11] habanalabs: change wait_for_interrupt implementation Oded Gabbay
  9 siblings, 0 replies; 11+ messages in thread
From: Oded Gabbay @ 2021-12-14 15:05 UTC (permalink / raw)
  To: linux-kernel; +Cc: Ofir Bitton

From: Ofir Bitton <obitton@habana.ai>

As BTL can be replaced by ROM we should modify relevant error print.

Signed-off-by: Ofir Bitton <obitton@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
---
 drivers/misc/habanalabs/common/firmware_if.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/misc/habanalabs/common/firmware_if.c b/drivers/misc/habanalabs/common/firmware_if.c
index ba9a17ddc3a4..7c0f56a73399 100644
--- a/drivers/misc/habanalabs/common/firmware_if.c
+++ b/drivers/misc/habanalabs/common/firmware_if.c
@@ -1103,7 +1103,7 @@ static void detect_cpu_boot_status(struct hl_device *hdev, u32 status)
 	switch (status) {
 	case CPU_BOOT_STATUS_NA:
 		dev_err(hdev->dev,
-			"Device boot progress - BTL did NOT run\n");
+			"Device boot progress - BTL/ROM did NOT run\n");
 		break;
 	case CPU_BOOT_STATUS_IN_WFE:
 		dev_err(hdev->dev,
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [PATCH 10/11] habanalabs: prevent wait if CS in multi-CS list completed
  2021-12-14 15:05 [PATCH 01/11] habanalabs: return correct clock throttling period Oded Gabbay
                   ` (7 preceding siblings ...)
  2021-12-14 15:05 ` [PATCH 09/11] habanalabs: modify cpu boot status error print Oded Gabbay
@ 2021-12-14 15:05 ` Oded Gabbay
  2021-12-14 15:05 ` [PATCH 11/11] habanalabs: change wait_for_interrupt implementation Oded Gabbay
  9 siblings, 0 replies; 11+ messages in thread
From: Oded Gabbay @ 2021-12-14 15:05 UTC (permalink / raw)
  To: linux-kernel; +Cc: Ohad Sharabi

From: Ohad Sharabi <osharabi@habana.ai>

By the original design we assumed that if we "miss" multi CS completion
it is of no severe consequence as we'll just call wait_for_multi_cs
again.

Sequence of events for such scenario:
1. user submit CS with sequence N
2. user calls wait for multi-CS with only CS #N in the list
3. the multi CS call starts with poll of the CSs but find that none
   completed (while CS #N did not completed yet)
4. now, multi CS #N complete but multi CS CTX was not yet created for
   the above multi-CS. so, attempt to complete multi-CS fails (as no
   multi CS CTX exist)
5. wait_for_multi_cs call now does init_wait_multi_cs_completion (and
   for this create the multi-CS CTX)
6. wait_for_multi_cs wits on completion but will not get one as CS #N
   already completed

To fix the issue we initialize the multi-CS CTX prior polling the
fences.

Signed-off-by: Ohad Sharabi <osharabi@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
---
 .../habanalabs/common/command_submission.c    | 85 ++++++++++++-------
 drivers/misc/habanalabs/common/habanalabs.h   |  3 -
 2 files changed, 54 insertions(+), 34 deletions(-)

diff --git a/drivers/misc/habanalabs/common/command_submission.c b/drivers/misc/habanalabs/common/command_submission.c
index f58fff3671d6..b9fed6b6d1ab 100644
--- a/drivers/misc/habanalabs/common/command_submission.c
+++ b/drivers/misc/habanalabs/common/command_submission.c
@@ -533,8 +533,8 @@ static void complete_multi_cs(struct hl_device *hdev, struct hl_cs *cs)
 					mcs_compl->stream_master_qid_map)) {
 			/* extract the timestamp only of first completed CS */
 			if (!mcs_compl->timestamp)
-				mcs_compl->timestamp =
-						ktime_to_ns(fence->timestamp);
+				mcs_compl->timestamp = ktime_to_ns(fence->timestamp);
+
 			complete_all(&mcs_compl->completion);
 
 			/*
@@ -2369,16 +2369,18 @@ static int hl_wait_for_fence(struct hl_ctx *ctx, u64 seq, struct hl_fence *fence
  * hl_cs_poll_fences - iterate CS fences to check for CS completion
  *
  * @mcs_data: multi-CS internal data
+ * @mcs_compl: multi-CS completion structure
  *
  * @return 0 on success, otherwise non 0 error code
  *
  * The function iterates on all CS sequence in the list and set bit in
  * completion_bitmap for each completed CS.
- * while iterating, the function can extracts the stream map to be later
- * used by the waiting function.
- * this function shall be called after taking context ref
+ * While iterating, the function sets the stream map of each fence in the fence
+ * array in the completion QID stream map to be used by CSs to perform
+ * completion to the multi-CS context.
+ * This function shall be called after taking context ref
  */
-static int hl_cs_poll_fences(struct multi_cs_data *mcs_data)
+static int hl_cs_poll_fences(struct multi_cs_data *mcs_data, struct multi_cs_completion *mcs_compl)
 {
 	struct hl_fence **fence_ptr = mcs_data->fence_arr;
 	struct hl_device *hdev = mcs_data->ctx->hdev;
@@ -2394,6 +2396,15 @@ static int hl_cs_poll_fences(struct multi_cs_data *mcs_data)
 	if (rc)
 		return rc;
 
+	/*
+	 * re-initialize the completion here to handle 2 possible cases:
+	 * 1. CS will complete the multi-CS prior clearing the completion. in which
+	 *    case the fence iteration is guaranteed to catch the CS completion.
+	 * 2. the completion will occur after re-init of the completion.
+	 *    in which case we will wake up immediately in wait_for_completion.
+	 */
+	reinit_completion(&mcs_compl->completion);
+
 	/*
 	 * set to maximum time to verify timestamp is valid: if at the end
 	 * this value is maintained- no timestamp was updated
@@ -2404,6 +2415,21 @@ static int hl_cs_poll_fences(struct multi_cs_data *mcs_data)
 	for (i = 0; i < arr_len; i++, fence_ptr++) {
 		struct hl_fence *fence = *fence_ptr;
 
+		/*
+		 * In order to prevent case where we wait until timeout even though a CS associated
+		 * with the multi-CS actually completed we do things in the below order:
+		 * 1. for each fence set it's QID map in the multi-CS completion QID map. This way
+		 *    any CS can, potentially, complete the multi CS for the specific QID (note
+		 *    that once completion is initialized, calling complete* and then wait on the
+		 *    completion will cause it to return at once)
+		 * 2. only after allowing multi-CS completion for the specific QID we check whether
+		 *    the specific CS already completed (and thus the wait for completion part will
+		 *    be skipped). if the CS not completed it is guaranteed that completing CS will
+		 *    wake up the completion.
+		 */
+		if (fence)
+			mcs_compl->stream_master_qid_map |= fence->stream_master_qid_map;
+
 		/*
 		 * function won't sleep as it is called with timeout 0 (i.e.
 		 * poll the fence)
@@ -2419,9 +2445,7 @@ static int hl_cs_poll_fences(struct multi_cs_data *mcs_data)
 
 		switch (status) {
 		case CS_WAIT_STATUS_BUSY:
-			/* CS did not finished, keep waiting on its QID*/
-			mcs_data->stream_master_qid_map |=
-					fence->stream_master_qid_map;
+			/* CS did not finished, QID to wait on already stored */
 			break;
 		case CS_WAIT_STATUS_COMPLETED:
 			/*
@@ -2519,9 +2543,7 @@ static inline unsigned long hl_usecs64_to_jiffies(const u64 usecs)
  * the function gets the first available completion (by marking it "used")
  * and initialize its values.
  */
-static struct multi_cs_completion *hl_wait_multi_cs_completion_init(
-							struct hl_device *hdev,
-							u8 stream_master_bitmap)
+static struct multi_cs_completion *hl_wait_multi_cs_completion_init(struct hl_device *hdev)
 {
 	struct multi_cs_completion *mcs_compl;
 	int i;
@@ -2533,8 +2555,11 @@ static struct multi_cs_completion *hl_wait_multi_cs_completion_init(
 		if (!mcs_compl->used) {
 			mcs_compl->used = 1;
 			mcs_compl->timestamp = 0;
-			mcs_compl->stream_master_qid_map = stream_master_bitmap;
-			reinit_completion(&mcs_compl->completion);
+			/*
+			 * init QID map to 0 to avoid completion by CSs. the actual QID map
+			 * to multi-CS CSs will be set incrementally at a later stage
+			 */
+			mcs_compl->stream_master_qid_map = 0;
 			spin_unlock(&mcs_compl->lock);
 			break;
 		}
@@ -2672,9 +2697,17 @@ static int hl_multi_cs_wait_ioctl(struct hl_fpriv *hpriv, void *data)
 
 	hl_ctx_get(hdev, ctx);
 
+	/* wait (with timeout) for the first CS to be completed */
+	mcs_data.timeout_jiffies = hl_usecs64_to_jiffies(args->in.timeout_us);
+	mcs_compl = hl_wait_multi_cs_completion_init(hdev);
+	if (IS_ERR(mcs_compl)) {
+		rc = PTR_ERR(mcs_compl);
+		goto put_ctx;
+	}
+
 	/* poll all CS fences, extract timestamp */
 	mcs_data.update_ts = true;
-	rc = hl_cs_poll_fences(&mcs_data);
+	rc = hl_cs_poll_fences(&mcs_data, mcs_compl);
 	/*
 	 * skip wait for CS completion when one of the below is true:
 	 * - an error on the poll function
@@ -2682,16 +2715,7 @@ static int hl_multi_cs_wait_ioctl(struct hl_fpriv *hpriv, void *data)
 	 * - the user called ioctl with timeout 0
 	 */
 	if (rc || mcs_data.completion_bitmap || !args->in.timeout_us)
-		goto put_ctx;
-
-	/* wait (with timeout) for the first CS to be completed */
-	mcs_data.timeout_jiffies = hl_usecs64_to_jiffies(args->in.timeout_us);
-
-	mcs_compl = hl_wait_multi_cs_completion_init(hdev, mcs_data.stream_master_qid_map);
-	if (IS_ERR(mcs_compl)) {
-		rc = PTR_ERR(mcs_compl);
-		goto put_ctx;
-	}
+		goto completion_fini;
 
 	while (true) {
 		rc = hl_wait_multi_cs_completion(&mcs_data, mcs_compl);
@@ -2703,7 +2727,7 @@ static int hl_multi_cs_wait_ioctl(struct hl_fpriv *hpriv, void *data)
 		 * no timestamp should be updated this time.
 		 */
 		mcs_data.update_ts = false;
-		rc = hl_cs_poll_fences(&mcs_data);
+		rc = hl_cs_poll_fences(&mcs_data, mcs_compl);
 
 		if (mcs_data.completion_bitmap)
 			break;
@@ -2713,16 +2737,15 @@ static int hl_multi_cs_wait_ioctl(struct hl_fpriv *hpriv, void *data)
 		 * it got a completion) it either got completed by CS in the multi CS list
 		 * (in which case the indication will be non empty completion_bitmap) or it
 		 * got completed by CS submitted to one of the shared stream master but
-		 * not in the multi CS list (in which case we should wait again but reinit
-		 * the completion, modify the timeout and set timestamp as zero to let a CS
-		 * related to the current multi-CS set a new, relevant, timestamp)
+		 * not in the multi CS list (in which case we should wait again but modify
+		 * the timeout and set timestamp as zero to let a CS related to the current
+		 * multi-CS set a new, relevant, timestamp)
 		 */
-		/* wait again with modified timeout */
 		mcs_data.timeout_jiffies = mcs_data.wait_status;
-		reinit_completion(&mcs_compl->completion);
 		mcs_compl->timestamp = 0;
 	}
 
+completion_fini:
 	hl_wait_multi_cs_completion_fini(mcs_compl);
 
 put_ctx:
diff --git a/drivers/misc/habanalabs/common/habanalabs.h b/drivers/misc/habanalabs/common/habanalabs.h
index 015aa1ee8ce0..4d4986177776 100644
--- a/drivers/misc/habanalabs/common/habanalabs.h
+++ b/drivers/misc/habanalabs/common/habanalabs.h
@@ -2364,8 +2364,6 @@ struct multi_cs_completion {
  * @timestamp: timestamp of first completed CS
  * @wait_status: wait for CS status
  * @completion_bitmap: bitmap of completed CSs (1- completed, otherwise 0)
- * @stream_master_qid_map: bitmap of all stream master QIDs on which the
- *                         multi-CS is waiting
  * @arr_len: fence_arr and seq_arr array length
  * @gone_cs: indication of gone CS (1- there was gone CS, otherwise 0)
  * @update_ts: update timestamp. 1- update the timestamp, otherwise 0.
@@ -2378,7 +2376,6 @@ struct multi_cs_data {
 	s64		timestamp;
 	long		wait_status;
 	u32		completion_bitmap;
-	u32		stream_master_qid_map;
 	u8		arr_len;
 	u8		gone_cs;
 	u8		update_ts;
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [PATCH 11/11] habanalabs: change wait_for_interrupt implementation
  2021-12-14 15:05 [PATCH 01/11] habanalabs: return correct clock throttling period Oded Gabbay
                   ` (8 preceding siblings ...)
  2021-12-14 15:05 ` [PATCH 10/11] habanalabs: prevent wait if CS in multi-CS list completed Oded Gabbay
@ 2021-12-14 15:05 ` Oded Gabbay
  9 siblings, 0 replies; 11+ messages in thread
From: Oded Gabbay @ 2021-12-14 15:05 UTC (permalink / raw)
  To: linux-kernel; +Cc: farah kassabri

From: farah kassabri <fkassabri@habana.ai>

Currently the cq counters are allocated in userspace memory,
and mapped by the driver to the device address space.

A new requirement that is part of new future API related to this one,
requires that cq counters will be allocated in kernel memory.

We leverage the existing cb_create API with KERNEL_MAPPED flag set to
allocate this memory.

That way we gain two things:
1. The memory cannot be freed while in use since it's protected
by refcount in driver.

2. No need to wake up the user thread upon each interrupt from CQ,
because the kernel has direct access to the counter. Therefore,
it can make comparison with the target value in the interrupt
handler and wake up the user thread only if the counter reaches the
target value. This is instead of waking the thread up to copy counter
value from user then go sleep again if target value wasn't reached.

Signed-off-by: farah kassabri <fkassabri@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
---
 .../misc/habanalabs/common/command_buffer.c   |  31 ++++-
 .../habanalabs/common/command_submission.c    | 111 +++++++++++++++++-
 drivers/misc/habanalabs/common/habanalabs.h   |   5 +
 drivers/misc/habanalabs/common/irq.c          |   8 +-
 include/uapi/misc/habanalabs.h                |  61 +++++++---
 5 files changed, 189 insertions(+), 27 deletions(-)

diff --git a/drivers/misc/habanalabs/common/command_buffer.c b/drivers/misc/habanalabs/common/command_buffer.c
index c591f0487272..d4eb9fb9ea12 100644
--- a/drivers/misc/habanalabs/common/command_buffer.c
+++ b/drivers/misc/habanalabs/common/command_buffer.c
@@ -380,8 +380,9 @@ int hl_cb_destroy(struct hl_device *hdev, struct hl_cb_mgr *mgr, u64 cb_handle)
 }
 
 static int hl_cb_info(struct hl_device *hdev, struct hl_cb_mgr *mgr,
-			u64 cb_handle, u32 *usage_cnt)
+			u64 cb_handle, u32 flags, u32 *usage_cnt, u64 *device_va)
 {
+	struct hl_vm_va_block *va_block;
 	struct hl_cb *cb;
 	u32 handle;
 	int rc = 0;
@@ -402,7 +403,18 @@ static int hl_cb_info(struct hl_device *hdev, struct hl_cb_mgr *mgr,
 		goto out;
 	}
 
-	*usage_cnt = atomic_read(&cb->cs_cnt);
+	if (flags & HL_CB_FLAGS_GET_DEVICE_VA) {
+		va_block = list_first_entry(&cb->va_block_list, struct hl_vm_va_block, node);
+		if (va_block) {
+			*device_va = va_block->start;
+		} else {
+			dev_err(hdev->dev, "CB is not mapped to the device's MMU\n");
+			rc = -EINVAL;
+			goto out;
+		}
+	} else {
+		*usage_cnt = atomic_read(&cb->cs_cnt);
+	}
 
 out:
 	spin_unlock(&mgr->cb_lock);
@@ -414,7 +426,7 @@ int hl_cb_ioctl(struct hl_fpriv *hpriv, void *data)
 	union hl_cb_args *args = data;
 	struct hl_device *hdev = hpriv->hdev;
 	enum hl_device_status status;
-	u64 handle = 0;
+	u64 handle = 0, device_va;
 	u32 usage_cnt = 0;
 	int rc;
 
@@ -450,9 +462,16 @@ int hl_cb_ioctl(struct hl_fpriv *hpriv, void *data)
 
 	case HL_CB_OP_INFO:
 		rc = hl_cb_info(hdev, &hpriv->cb_mgr, args->in.cb_handle,
-				&usage_cnt);
-		memset(args, 0, sizeof(*args));
-		args->out.usage_cnt = usage_cnt;
+				args->in.flags,
+				&usage_cnt,
+				&device_va);
+
+		memset(&args->out, 0, sizeof(args->out));
+
+		if (args->in.flags & HL_CB_FLAGS_GET_DEVICE_VA)
+			args->out.device_va = device_va;
+		else
+			args->out.usage_cnt = usage_cnt;
 		break;
 
 	default:
diff --git a/drivers/misc/habanalabs/common/command_submission.c b/drivers/misc/habanalabs/common/command_submission.c
index b9fed6b6d1ab..7073fa6b9f0f 100644
--- a/drivers/misc/habanalabs/common/command_submission.c
+++ b/drivers/misc/habanalabs/common/command_submission.c
@@ -2845,6 +2845,106 @@ static int hl_cs_wait_ioctl(struct hl_fpriv *hpriv, void *data)
 }
 
 static int _hl_interrupt_wait_ioctl(struct hl_device *hdev, struct hl_ctx *ctx,
+				struct hl_cb_mgr *cb_mgr, u64 timeout_us,
+				u64 cq_counters_handle,	u64 cq_counters_offset,
+				u64 target_value, struct hl_user_interrupt *interrupt,
+				u32 *status,
+				u64 *timestamp)
+{
+	struct hl_user_pending_interrupt *pend;
+	unsigned long timeout, flags;
+	long completion_rc;
+	struct hl_cb *cb;
+	int rc = 0;
+	u32 handle;
+
+	timeout = hl_usecs64_to_jiffies(timeout_us);
+
+	hl_ctx_get(hdev, ctx);
+
+	cq_counters_handle >>= PAGE_SHIFT;
+	handle = (u32) cq_counters_handle;
+
+	cb = hl_cb_get(hdev, cb_mgr, handle);
+	if (!cb) {
+		hl_ctx_put(ctx);
+		return -EINVAL;
+	}
+
+	pend = kzalloc(sizeof(*pend), GFP_KERNEL);
+	if (!pend) {
+		hl_cb_put(cb);
+		hl_ctx_put(ctx);
+		return -ENOMEM;
+	}
+
+	hl_fence_init(&pend->fence, ULONG_MAX);
+
+	pend->cq_kernel_addr = (u64 *) cb->kernel_address + cq_counters_offset;
+	pend->cq_target_value = target_value;
+
+	/* We check for completion value as interrupt could have been received
+	 * before we added the node to the wait list
+	 */
+	if (*pend->cq_kernel_addr >= target_value) {
+		*status = HL_WAIT_CS_STATUS_COMPLETED;
+		/* There was no interrupt, we assume the completion is now. */
+		pend->fence.timestamp = ktime_get();
+	}
+
+	if (!timeout_us || (*status == HL_WAIT_CS_STATUS_COMPLETED))
+		goto set_timestamp;
+
+	/* Add pending user interrupt to relevant list for the interrupt
+	 * handler to monitor
+	 */
+	spin_lock_irqsave(&interrupt->wait_list_lock, flags);
+	list_add_tail(&pend->wait_list_node, &interrupt->wait_list_head);
+	spin_unlock_irqrestore(&interrupt->wait_list_lock, flags);
+
+	/* Wait for interrupt handler to signal completion */
+	completion_rc = wait_for_completion_interruptible_timeout(&pend->fence.completion,
+								timeout);
+	if (completion_rc > 0) {
+		*status = HL_WAIT_CS_STATUS_COMPLETED;
+	} else {
+		if (completion_rc == -ERESTARTSYS) {
+			dev_err_ratelimited(hdev->dev,
+					"user process got signal while waiting for interrupt ID %d\n",
+					interrupt->interrupt_id);
+			rc = -EINTR;
+			*status = HL_WAIT_CS_STATUS_ABORTED;
+		} else {
+			if (pend->fence.error == -EIO) {
+				dev_err_ratelimited(hdev->dev,
+						"interrupt based wait ioctl aborted(error:%d) due to a reset cycle initiated\n",
+						pend->fence.error);
+				rc = -EIO;
+				*status = HL_WAIT_CS_STATUS_ABORTED;
+			} else {
+				dev_err_ratelimited(hdev->dev, "Waiting for interrupt ID %d timedout\n",
+						interrupt->interrupt_id);
+				rc = -ETIMEDOUT;
+			}
+			*status = HL_WAIT_CS_STATUS_BUSY;
+		}
+	}
+
+	spin_lock_irqsave(&interrupt->wait_list_lock, flags);
+	list_del(&pend->wait_list_node);
+	spin_unlock_irqrestore(&interrupt->wait_list_lock, flags);
+
+set_timestamp:
+	*timestamp = ktime_to_ns(pend->fence.timestamp);
+
+	kfree(pend);
+	hl_cb_put(cb);
+	hl_ctx_put(ctx);
+
+	return rc;
+}
+
+static int _hl_interrupt_wait_ioctl_user_addr(struct hl_device *hdev, struct hl_ctx *ctx,
 				u64 timeout_us, u64 user_address,
 				u64 target_value, struct hl_user_interrupt *interrupt,
 
@@ -2861,7 +2961,7 @@ static int _hl_interrupt_wait_ioctl(struct hl_device *hdev, struct hl_ctx *ctx,
 
 	hl_ctx_get(hdev, ctx);
 
-	pend = kmalloc(sizeof(*pend), GFP_KERNEL);
+	pend = kzalloc(sizeof(*pend), GFP_KERNEL);
 	if (!pend) {
 		hl_ctx_put(ctx);
 		return -ENOMEM;
@@ -2990,7 +3090,14 @@ static int hl_interrupt_wait_ioctl(struct hl_fpriv *hpriv, void *data)
 	else
 		interrupt = &hdev->user_interrupt[interrupt_id - first_interrupt];
 
-	rc = _hl_interrupt_wait_ioctl(hdev, hpriv->ctx,
+	if (args->in.flags & HL_WAIT_CS_FLAGS_INTERRUPT_KERNEL_CQ)
+		rc = _hl_interrupt_wait_ioctl(hdev, hpriv->ctx, &hpriv->cb_mgr,
+				args->in.interrupt_timeout_us, args->in.cq_counters_handle,
+				args->in.cq_counters_offset,
+				args->in.target, interrupt, &status,
+				&timestamp);
+	else
+		rc = _hl_interrupt_wait_ioctl_user_addr(hdev, hpriv->ctx,
 				args->in.interrupt_timeout_us, args->in.addr,
 				args->in.target, interrupt, &status,
 				&timestamp);
diff --git a/drivers/misc/habanalabs/common/habanalabs.h b/drivers/misc/habanalabs/common/habanalabs.h
index 4d4986177776..78772fe548b9 100644
--- a/drivers/misc/habanalabs/common/habanalabs.h
+++ b/drivers/misc/habanalabs/common/habanalabs.h
@@ -876,10 +876,15 @@ struct hl_user_interrupt {
  *                                    pending on an interrupt
  * @wait_list_node: node in the list of user threads pending on an interrupt
  * @fence: hl fence object for interrupt completion
+ * @cq_target_value: CQ target value
+ * @cq_kernel_addr: CQ kernel address, to be used in the cq interrupt
+ *                  handler for taget value comparison
  */
 struct hl_user_pending_interrupt {
 	struct list_head	wait_list_node;
 	struct hl_fence		fence;
+	u64			cq_target_value;
+	u64			*cq_kernel_addr;
 };
 
 /**
diff --git a/drivers/misc/habanalabs/common/irq.c b/drivers/misc/habanalabs/common/irq.c
index 64e0d9de21bd..6454ea12bf3a 100644
--- a/drivers/misc/habanalabs/common/irq.c
+++ b/drivers/misc/habanalabs/common/irq.c
@@ -145,8 +145,12 @@ static void handle_user_cq(struct hl_device *hdev,
 
 	spin_lock(&user_cq->wait_list_lock);
 	list_for_each_entry(pend, &user_cq->wait_list_head, wait_list_node) {
-		pend->fence.timestamp = now;
-		complete_all(&pend->fence.completion);
+		if ((pend->cq_kernel_addr &&
+				*(pend->cq_kernel_addr) >= pend->cq_target_value) ||
+				!pend->cq_kernel_addr) {
+			pend->fence.timestamp = now;
+			complete_all(&pend->fence.completion);
+		}
 	}
 	spin_unlock(&user_cq->wait_list_lock);
 }
diff --git a/include/uapi/misc/habanalabs.h b/include/uapi/misc/habanalabs.h
index 648850b954a3..371dfc4243b3 100644
--- a/include/uapi/misc/habanalabs.h
+++ b/include/uapi/misc/habanalabs.h
@@ -680,7 +680,10 @@ struct hl_info_args {
 #define HL_MAX_CB_SIZE		(0x200000 - 32)
 
 /* Indicates whether the command buffer should be mapped to the device's MMU */
-#define HL_CB_FLAGS_MAP		0x1
+#define HL_CB_FLAGS_MAP			0x1
+
+/* Used with HL_CB_OP_INFO opcode to get the device va address for kernel mapped CB */
+#define HL_CB_FLAGS_GET_DEVICE_VA	0x2
 
 struct hl_cb_in {
 	/* Handle of CB or 0 if we want to create one */
@@ -702,11 +705,16 @@ struct hl_cb_out {
 		/* Handle of CB */
 		__u64 cb_handle;
 
-		/* Information about CB */
-		struct {
-			/* Usage count of CB */
-			__u32 usage_cnt;
-			__u32 pad;
+		union {
+			/* Information about CB */
+			struct {
+				/* Usage count of CB */
+				__u32 usage_cnt;
+				__u32 pad;
+			};
+
+			/* CB mapped address to device MMU */
+			__u64 device_va;
 		};
 	};
 };
@@ -947,9 +955,10 @@ union hl_cs_args {
 	struct hl_cs_out out;
 };
 
-#define HL_WAIT_CS_FLAGS_INTERRUPT	0x2
-#define HL_WAIT_CS_FLAGS_INTERRUPT_MASK 0xFFF00000
-#define HL_WAIT_CS_FLAGS_MULTI_CS	0x4
+#define HL_WAIT_CS_FLAGS_INTERRUPT		0x2
+#define HL_WAIT_CS_FLAGS_INTERRUPT_MASK		0xFFF00000
+#define HL_WAIT_CS_FLAGS_MULTI_CS		0x4
+#define HL_WAIT_CS_FLAGS_INTERRUPT_KERNEL_CQ	0x10
 
 #define HL_WAIT_MULTI_CS_LIST_MAX_LEN	32
 
@@ -969,14 +978,23 @@ struct hl_wait_cs_in {
 		};
 
 		struct {
-			/* User address for completion comparison.
-			 * upon interrupt, driver will compare the value pointed
-			 * by this address with the supplied target value.
-			 * in order not to perform any comparison, set address
-			 * to all 1s.
-			 * Relevant only when HL_WAIT_CS_FLAGS_INTERRUPT is set
-			 */
-			__u64 addr;
+			union {
+				/* User address for completion comparison.
+				 * upon interrupt, driver will compare the value pointed
+				 * by this address with the supplied target value.
+				 * in order not to perform any comparison, set address
+				 * to all 1s.
+				 * Relevant only when HL_WAIT_CS_FLAGS_INTERRUPT is set
+				 */
+				__u64 addr;
+
+				/* cq_counters_handle to a kernel mapped cb which contains
+				 * cq counters.
+				 * Relevant only when HL_WAIT_CS_FLAGS_INTERRUPT_KERNEL_CQ is set
+				 */
+				__u64 cq_counters_handle;
+			};
+
 			/* Target value for completion comparison */
 			__u64 target;
 		};
@@ -1004,6 +1022,15 @@ struct hl_wait_cs_in {
 		 */
 		__u64 interrupt_timeout_us;
 	};
+
+	/*
+	 * cq counter offset inside the counters cb pointed by cq_counters_handle above.
+	 * upon interrupt, driver will compare the value pointed
+	 * by this address (cq_counters_handle + cq_counters_offset)
+	 * with the supplied target value.
+	 * relevant only when HL_WAIT_CS_FLAGS_INTERRUPT_KERNEL_CQ is set
+	 */
+	__u64 cq_counters_offset;
 };
 
 #define HL_WAIT_CS_STATUS_COMPLETED	0
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 11+ messages in thread

end of thread, other threads:[~2021-12-14 15:06 UTC | newest]

Thread overview: 11+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-12-14 15:05 [PATCH 01/11] habanalabs: return correct clock throttling period Oded Gabbay
2021-12-14 15:05 ` [PATCH 02/11] habanalabs: remove in_debug check in device open Oded Gabbay
2021-12-14 15:05 ` [PATCH 03/11] habanalabs: add current PI value to cpu packets Oded Gabbay
2021-12-14 15:05 ` [PATCH 04/11] habanalabs: fix hwmon handling for legacy f/w Oded Gabbay
2021-12-14 15:05 ` [PATCH 05/11] habanalabs: keep control device alive during hard reset Oded Gabbay
2021-12-14 15:05 ` [PATCH 06/11] habanalabs: sysfs support for two infineon versions Oded Gabbay
2021-12-14 15:05 ` [PATCH 07/11] habanalabs: expose soft reset sysfs nodes for inference ASIC Oded Gabbay
2021-12-14 15:05 ` [PATCH 08/11] habanalabs: clean MMU headers definitions Oded Gabbay
2021-12-14 15:05 ` [PATCH 09/11] habanalabs: modify cpu boot status error print Oded Gabbay
2021-12-14 15:05 ` [PATCH 10/11] habanalabs: prevent wait if CS in multi-CS list completed Oded Gabbay
2021-12-14 15:05 ` [PATCH 11/11] habanalabs: change wait_for_interrupt implementation Oded Gabbay

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.