* [PATCH 1/6] accel/habanalabs: unmap mapped memory when TLB inv fails
From: Oded Gabbay @ 2023-03-23 11:35 UTC
  To: dri-devel; +Cc: Koby Elbaz

From: Koby Elbaz <kelbaz@habana.ai>

After a memory mapping is added to the page tables, a TLB invalidation
request follows, and that request can fail (e.g. on a HW failure).
If it fails, remove the mapping as part of the failure handling.
Also make the TLB invalidation failure prints more accurate: they now
report which MMU (PMMU or HMMU) failed and the return code, and the
range variant also prints the VA and size.

Signed-off-by: Koby Elbaz <kelbaz@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
---
 drivers/accel/habanalabs/common/command_buffer.c | 15 ++++++++++++---
 drivers/accel/habanalabs/common/mmu/mmu.c        |  8 ++++++--
 2 files changed, 18 insertions(+), 5 deletions(-)

diff --git a/drivers/accel/habanalabs/common/command_buffer.c b/drivers/accel/habanalabs/common/command_buffer.c
index 3a0535ac28b1..6e09f48750a0 100644
--- a/drivers/accel/habanalabs/common/command_buffer.c
+++ b/drivers/accel/habanalabs/common/command_buffer.c
@@ -45,20 +45,29 @@ static int cb_map_mem(struct hl_ctx *ctx, struct hl_cb *cb)
 	}
 
 	mutex_lock(&hdev->mmu_lock);
+
 	rc = hl_mmu_map_contiguous(ctx, cb->virtual_addr, cb->bus_address, cb->roundup_size);
 	if (rc) {
 		dev_err(hdev->dev, "Failed to map VA %#llx to CB\n", cb->virtual_addr);
-		goto err_va_umap;
+		goto err_va_pool_free;
 	}
+
 	rc = hl_mmu_invalidate_cache(hdev, false, MMU_OP_USERPTR | MMU_OP_SKIP_LOW_CACHE_INV);
+	if (rc)
+		goto err_mmu_unmap;
+
 	mutex_unlock(&hdev->mmu_lock);
 
 	cb->is_mmu_mapped = true;
-	return rc;
 
-err_va_umap:
+	return 0;
+
+err_mmu_unmap:
+	hl_mmu_unmap_contiguous(ctx, cb->virtual_addr, cb->roundup_size);
+err_va_pool_free:
 	mutex_unlock(&hdev->mmu_lock);
 	gen_pool_free(ctx->cb_va_pool, cb->virtual_addr, cb->roundup_size);
+
 	return rc;
 }
 
diff --git a/drivers/accel/habanalabs/common/mmu/mmu.c b/drivers/accel/habanalabs/common/mmu/mmu.c
index 17581b1bcc77..f379e5b461a6 100644
--- a/drivers/accel/habanalabs/common/mmu/mmu.c
+++ b/drivers/accel/habanalabs/common/mmu/mmu.c
@@ -679,7 +679,9 @@ int hl_mmu_invalidate_cache(struct hl_device *hdev, bool is_hard, u32 flags)
 
 	rc = hdev->asic_funcs->mmu_invalidate_cache(hdev, is_hard, flags);
 	if (rc)
-		dev_err_ratelimited(hdev->dev, "MMU cache invalidation failed\n");
+		dev_err_ratelimited(hdev->dev,
+				"%s cache invalidation failed, rc=%d\n",
+				flags == VM_TYPE_USERPTR ? "PMMU" : "HMMU", rc);
 
 	return rc;
 }
@@ -692,7 +694,9 @@ int hl_mmu_invalidate_cache_range(struct hl_device *hdev, bool is_hard,
 	rc = hdev->asic_funcs->mmu_invalidate_cache_range(hdev, is_hard, flags,
 								asid, va, size);
 	if (rc)
-		dev_err_ratelimited(hdev->dev, "MMU cache range invalidation failed\n");
+		dev_err_ratelimited(hdev->dev,
+				"%s cache range invalidation failed: va=%#llx, size=%llu, rc=%d\n",
+				flags == VM_TYPE_USERPTR ? "PMMU" : "HMMU", va, size, rc);
 
 	return rc;
 }
-- 
2.40.0
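
For readers following the error path in cb_map_mem(), here is a minimal
userspace sketch of the same goto-based unwind order: a failed TLB
invalidation first removes the mapping, then falls through to release
the lock and the VA block. Every sketch_* name, the struct fields, and
the failure-injection flags are hypothetical stand-ins and not part of
the driver; the lock and the VA pool are modelled as plain booleans.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical model of the state touched by the mapping path. */
struct sketch_dev {
	bool va_reserved;	/* VA block taken from the pool */
	bool mapped;		/* page-table mapping present */
	bool locked;		/* stands in for the MMU lock */
	bool fail_map;		/* inject a mapping failure */
	bool fail_inv;		/* inject a TLB invalidation failure */
};

static int sketch_map(struct sketch_dev *d)
{
	if (d->fail_map)
		return -ENOMEM;
	d->mapped = true;
	return 0;
}

static void sketch_unmap(struct sketch_dev *d)
{
	/* Only reachable after a successful sketch_map(). */
	assert(d->mapped);
	d->mapped = false;
}

static int sketch_invalidate(struct sketch_dev *d)
{
	return d->fail_inv ? -EIO : 0;
}

int sketch_cb_map_mem(struct sketch_dev *d)
{
	int rc;

	d->va_reserved = true;	/* assumed to happen before the lock is taken */
	d->locked = true;

	rc = sketch_map(d);
	if (rc)
		goto err_va_pool_free;	/* nothing mapped yet, skip the unmap */

	rc = sketch_invalidate(d);
	if (rc)
		goto err_mmu_unmap;	/* mapping exists and must be removed */

	d->locked = false;
	return 0;

err_mmu_unmap:
	sketch_unmap(d);
err_va_pool_free:
	d->locked = false;
	d->va_reserved = false;
	return rc;
}
```

The labels are ordered so that each failure point undoes only what was
already done: the later label (err_mmu_unmap) falls through into the
earlier one (err_va_pool_free).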
