linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 1/2] habanalabs: make sure device mem alloc is page aligned
@ 2022-02-21  8:49 Oded Gabbay
  2022-02-21  8:49 ` [PATCH 2/2] habanalabs: Fix reset upon device release bug Oded Gabbay
  2022-03-26 13:16 ` [PATCH 1/2] habanalabs: make sure device mem alloc is page aligned Guenter Roeck
  0 siblings, 2 replies; 3+ messages in thread
From: Oded Gabbay @ 2022-02-21  8:49 UTC (permalink / raw)
  To: linux-kernel; +Cc: Ohad Sharabi

From: Ohad Sharabi <osharabi@habana.ai>

Working with MMU that supports multiple page sizes requires that mapping
of a page of a certain size will be aligned to the same size (e.g. the
physical address of 32MB page shall be aligned to 32MB).

To achieve this the gen_poll allocation is now using the "align" variant
to comply with the alignment requirements.

Signed-off-by: Ohad Sharabi <osharabi@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
---
 drivers/misc/habanalabs/common/memory.c | 19 ++++++++++++++-----
 1 file changed, 14 insertions(+), 5 deletions(-)

diff --git a/drivers/misc/habanalabs/common/memory.c b/drivers/misc/habanalabs/common/memory.c
index 385bf3448c73..0b76f40e9930 100644
--- a/drivers/misc/habanalabs/common/memory.c
+++ b/drivers/misc/habanalabs/common/memory.c
@@ -90,8 +90,8 @@ static int alloc_device_memory(struct hl_ctx *ctx, struct hl_mem_in *args,
 	struct hl_vm_phys_pg_pack *phys_pg_pack;
 	u64 paddr = 0, total_size, num_pgs, i;
 	u32 num_curr_pgs, page_size;
-	int handle, rc;
 	bool contiguous;
+	int handle, rc;
 
 	num_curr_pgs = 0;
 
@@ -110,7 +110,11 @@ static int alloc_device_memory(struct hl_ctx *ctx, struct hl_mem_in *args,
 	contiguous = args->flags & HL_MEM_CONTIGUOUS;
 
 	if (contiguous) {
-		paddr = (u64) gen_pool_alloc(vm->dram_pg_pool, total_size);
+		if (is_power_of_2(page_size))
+			paddr = (u64) gen_pool_dma_alloc_align(vm->dram_pg_pool, total_size, NULL,
+							page_size);
+		else
+			paddr = (u64) gen_pool_alloc(vm->dram_pg_pool, total_size);
 		if (!paddr) {
 			dev_err(hdev->dev,
 				"failed to allocate %llu contiguous pages with total size of %llu\n",
@@ -144,9 +148,14 @@ static int alloc_device_memory(struct hl_ctx *ctx, struct hl_mem_in *args,
 			phys_pg_pack->pages[i] = paddr + i * page_size;
 	} else {
 		for (i = 0 ; i < num_pgs ; i++) {
-			phys_pg_pack->pages[i] = (u64) gen_pool_alloc(
-							vm->dram_pg_pool,
-							page_size);
+			if (is_power_of_2(page_size))
+				phys_pg_pack->pages[i] =
+						(u64) gen_pool_dma_alloc_align(vm->dram_pg_pool,
+										page_size, NULL,
+										page_size);
+			else
+				phys_pg_pack->pages[i] = (u64) gen_pool_alloc(vm->dram_pg_pool,
+										page_size);
 			if (!phys_pg_pack->pages[i]) {
 				dev_err(hdev->dev,
 					"Failed to allocate device memory (out of memory)\n");
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 3+ messages in thread

* [PATCH 2/2] habanalabs: Fix reset upon device release bug
  2022-02-21  8:49 [PATCH 1/2] habanalabs: make sure device mem alloc is page aligned Oded Gabbay
@ 2022-02-21  8:49 ` Oded Gabbay
  2022-03-26 13:16 ` [PATCH 1/2] habanalabs: make sure device mem alloc is page aligned Guenter Roeck
  1 sibling, 0 replies; 3+ messages in thread
From: Oded Gabbay @ 2022-02-21  8:49 UTC (permalink / raw)
  To: linux-kernel; +Cc: farah kassabri

From: farah kassabri <fkassabri@habana.ai>

In case user application was interrupted while some cs still in-flight
or in the middle of completion handling in driver, the
last refcount of the kernel private data for the user process
will not be put in the fd close flow, but in the cs completion
workqueue context.

This means that the device reset-upon-device-release will be called
from that context. During the reset flow, the driver flushes all the cs
workqueue to ensure that any scheduled work has run to completion,
and since we are running from the completion context we will
have deadlock.

Therefore, we need to skip flushing the workqueue in those cases.
It is safe to do it because the user won't be able to release the device
unless the workqueues are already empty.

Signed-off-by: farah kassabri <fkassabri@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
---
 .../misc/habanalabs/common/command_submission.c | 17 ++++++++++-------
 drivers/misc/habanalabs/common/device.c         | 13 ++++++++-----
 drivers/misc/habanalabs/common/habanalabs.h     |  2 +-
 3 files changed, 19 insertions(+), 13 deletions(-)

diff --git a/drivers/misc/habanalabs/common/command_submission.c b/drivers/misc/habanalabs/common/command_submission.c
index c7757c78d0b1..d93ef9f1c45c 100644
--- a/drivers/misc/habanalabs/common/command_submission.c
+++ b/drivers/misc/habanalabs/common/command_submission.c
@@ -921,18 +921,21 @@ static void cs_rollback(struct hl_device *hdev, struct hl_cs *cs)
 		complete_job(hdev, job);
 }
 
-void hl_cs_rollback_all(struct hl_device *hdev)
+void hl_cs_rollback_all(struct hl_device *hdev, bool skip_wq_flush)
 {
 	int i;
 	struct hl_cs *cs, *tmp;
 
-	flush_workqueue(hdev->ts_free_obj_wq);
+	if (!skip_wq_flush) {
+		flush_workqueue(hdev->ts_free_obj_wq);
 
-	/* flush all completions before iterating over the CS mirror list in
-	 * order to avoid a race with the release functions
-	 */
-	for (i = 0 ; i < hdev->asic_prop.completion_queues_count ; i++)
-		flush_workqueue(hdev->cq_wq[i]);
+		/* flush all completions before iterating over the CS mirror list in
+		 * order to avoid a race with the release functions
+		 */
+		for (i = 0 ; i < hdev->asic_prop.completion_queues_count ; i++)
+			flush_workqueue(hdev->cq_wq[i]);
+
+	}
 
 	/* Make sure we don't have leftovers in the CS mirror list */
 	list_for_each_entry_safe(cs, tmp, &hdev->cs_mirror_list, mirror_node) {
diff --git a/drivers/misc/habanalabs/common/device.c b/drivers/misc/habanalabs/common/device.c
index 8ea9dfe3f79b..d52381d1fbd2 100644
--- a/drivers/misc/habanalabs/common/device.c
+++ b/drivers/misc/habanalabs/common/device.c
@@ -685,7 +685,8 @@ static void take_release_locks(struct hl_device *hdev)
 	mutex_unlock(&hdev->fpriv_ctrl_list_lock);
 }
 
-static void cleanup_resources(struct hl_device *hdev, bool hard_reset, bool fw_reset)
+static void cleanup_resources(struct hl_device *hdev, bool hard_reset, bool fw_reset,
+				bool skip_wq_flush)
 {
 	if (hard_reset)
 		device_late_fini(hdev);
@@ -698,7 +699,7 @@ static void cleanup_resources(struct hl_device *hdev, bool hard_reset, bool fw_r
 	hdev->asic_funcs->halt_engines(hdev, hard_reset, fw_reset);
 
 	/* Go over all the queues, release all CS and their jobs */
-	hl_cs_rollback_all(hdev);
+	hl_cs_rollback_all(hdev, skip_wq_flush);
 
 	/* Release all pending user interrupts, each pending user interrupt
 	 * holds a reference to user context
@@ -978,7 +979,8 @@ static void handle_reset_trigger(struct hl_device *hdev, u32 flags)
 int hl_device_reset(struct hl_device *hdev, u32 flags)
 {
 	bool hard_reset, from_hard_reset_thread, fw_reset, hard_instead_soft = false,
-			reset_upon_device_release = false, schedule_hard_reset = false;
+			reset_upon_device_release = false, schedule_hard_reset = false,
+			skip_wq_flush = false;
 	u64 idle_mask[HL_BUSY_ENGINES_MASK_EXT_SIZE] = {0};
 	struct hl_ctx *ctx;
 	int i, rc;
@@ -991,6 +993,7 @@ int hl_device_reset(struct hl_device *hdev, u32 flags)
 	hard_reset = !!(flags & HL_DRV_RESET_HARD);
 	from_hard_reset_thread = !!(flags & HL_DRV_RESET_FROM_RESET_THR);
 	fw_reset = !!(flags & HL_DRV_RESET_BYPASS_REQ_TO_FW);
+	skip_wq_flush = !!(flags & HL_DRV_RESET_DEV_RELEASE);
 
 	if (!hard_reset && !hdev->asic_prop.supports_soft_reset) {
 		hard_instead_soft = true;
@@ -1076,7 +1079,7 @@ int hl_device_reset(struct hl_device *hdev, u32 flags)
 		return 0;
 	}
 
-	cleanup_resources(hdev, hard_reset, fw_reset);
+	cleanup_resources(hdev, hard_reset, fw_reset, skip_wq_flush);
 
 kill_processes:
 	if (hard_reset) {
@@ -1686,7 +1689,7 @@ void hl_device_fini(struct hl_device *hdev)
 
 	hl_hwmon_fini(hdev);
 
-	cleanup_resources(hdev, true, false);
+	cleanup_resources(hdev, true, false, false);
 
 	/* Kill processes here after CS rollback. This is because the process
 	 * can't really exit until all its CSs are done, which is what we
diff --git a/drivers/misc/habanalabs/common/habanalabs.h b/drivers/misc/habanalabs/common/habanalabs.h
index 677ae4ff922c..cef4717d0916 100644
--- a/drivers/misc/habanalabs/common/habanalabs.h
+++ b/drivers/misc/habanalabs/common/habanalabs.h
@@ -3054,7 +3054,7 @@ int hl_cb_pool_fini(struct hl_device *hdev);
 int hl_cb_va_pool_init(struct hl_ctx *ctx);
 void hl_cb_va_pool_fini(struct hl_ctx *ctx);
 
-void hl_cs_rollback_all(struct hl_device *hdev);
+void hl_cs_rollback_all(struct hl_device *hdev, bool skip_wq_flush);
 struct hl_cs_job *hl_cs_allocate_job(struct hl_device *hdev,
 		enum hl_queue_type queue_type, bool is_kernel_allocated_cb);
 void hl_sob_reset_error(struct kref *ref);
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 3+ messages in thread

* Re: [PATCH 1/2] habanalabs: make sure device mem alloc is page aligned
  2022-02-21  8:49 [PATCH 1/2] habanalabs: make sure device mem alloc is page aligned Oded Gabbay
  2022-02-21  8:49 ` [PATCH 2/2] habanalabs: Fix reset upon device release bug Oded Gabbay
@ 2022-03-26 13:16 ` Guenter Roeck
  1 sibling, 0 replies; 3+ messages in thread
From: Guenter Roeck @ 2022-03-26 13:16 UTC (permalink / raw)
  To: Oded Gabbay; +Cc: linux-kernel, Ohad Sharabi

On Mon, Feb 21, 2022 at 10:49:13AM +0200, Oded Gabbay wrote:
> From: Ohad Sharabi <osharabi@habana.ai>
> 
> Working with MMU that supports multiple page sizes requires that mapping
> of a page of a certain size will be aligned to the same size (e.g. the
> physical address of 32MB page shall be aligned to 32MB).
> 
> To achieve this the gen_poll allocation is now using the "align" variant
> to comply with the alignment requirements.
> 
> Signed-off-by: Ohad Sharabi <osharabi@habana.ai>
> Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
> Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
> ---
>  drivers/misc/habanalabs/common/memory.c | 19 ++++++++++++++-----
>  1 file changed, 14 insertions(+), 5 deletions(-)
> 
> diff --git a/drivers/misc/habanalabs/common/memory.c b/drivers/misc/habanalabs/common/memory.c
> index 385bf3448c73..0b76f40e9930 100644
> --- a/drivers/misc/habanalabs/common/memory.c
> +++ b/drivers/misc/habanalabs/common/memory.c
> @@ -90,8 +90,8 @@ static int alloc_device_memory(struct hl_ctx *ctx, struct hl_mem_in *args,
>  	struct hl_vm_phys_pg_pack *phys_pg_pack;
>  	u64 paddr = 0, total_size, num_pgs, i;
>  	u32 num_curr_pgs, page_size;
> -	int handle, rc;
>  	bool contiguous;
> +	int handle, rc;
>  
>  	num_curr_pgs = 0;
>  
> @@ -110,7 +110,11 @@ static int alloc_device_memory(struct hl_ctx *ctx, struct hl_mem_in *args,
>  	contiguous = args->flags & HL_MEM_CONTIGUOUS;
>  
>  	if (contiguous) {
> -		paddr = (u64) gen_pool_alloc(vm->dram_pg_pool, total_size);
> +		if (is_power_of_2(page_size))
> +			paddr = (u64) gen_pool_dma_alloc_align(vm->dram_pg_pool, total_size, NULL,
> +							page_size);

I see that this is fixed in -next, but ...

> +		else
> +			paddr = (u64) gen_pool_alloc(vm->dram_pg_pool, total_size);

... this has an unnecessary typecast in -next (gen_pool_alloc() returns
unsigned long, not a pointer), and ...

>  		if (!paddr) {
>  			dev_err(hdev->dev,
>  				"failed to allocate %llu contiguous pages with total size of %llu\n",
> @@ -144,9 +148,14 @@ static int alloc_device_memory(struct hl_ctx *ctx, struct hl_mem_in *args,
>  			phys_pg_pack->pages[i] = paddr + i * page_size;
>  	} else {
>  		for (i = 0 ; i < num_pgs ; i++) {
> -			phys_pg_pack->pages[i] = (u64) gen_pool_alloc(
> -							vm->dram_pg_pool,
> -							page_size);
> +			if (is_power_of_2(page_size))
> +				phys_pg_pack->pages[i] =
> +						(u64) gen_pool_dma_alloc_align(vm->dram_pg_pool,
> +										page_size, NULL,
> +										page_size);

... this still causes a compile error when building 32-bit test images.

Building mips:allmodconfig ... failed
--------------
Error log:
drivers/misc/habanalabs/common/memory.c: In function 'alloc_device_memory':
drivers/misc/habanalabs/common/memory.c:153:49: error: cast from pointer to integer of different size

Guenter

> +			else
> +				phys_pg_pack->pages[i] = (u64) gen_pool_alloc(vm->dram_pg_pool,
> +										page_size);
>  			if (!phys_pg_pack->pages[i]) {
>  				dev_err(hdev->dev,
>  					"Failed to allocate device memory (out of memory)\n");
> -- 
> 2.25.1
> 

^ permalink raw reply	[flat|nested] 3+ messages in thread

end of thread, other threads:[~2022-03-26 13:16 UTC | newest]

Thread overview: 3+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-02-21  8:49 [PATCH 1/2] habanalabs: make sure device mem alloc is page aligned Oded Gabbay
2022-02-21  8:49 ` [PATCH 2/2] habanalabs: Fix reset upon device release bug Oded Gabbay
2022-03-26 13:16 ` [PATCH 1/2] habanalabs: make sure device mem alloc is page aligned Guenter Roeck

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).