All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 01/11] habanalabs: set non-0 value in dram default page size
@ 2022-03-16 11:41 Oded Gabbay
  2022-03-16 11:41 ` [PATCH 02/11] habanalabs: add DRAM default page size to HW info Oded Gabbay
                   ` (9 more replies)
  0 siblings, 10 replies; 13+ messages in thread
From: Oded Gabbay @ 2022-03-16 11:41 UTC (permalink / raw)
  To: linux-kernel; +Cc: Ohad Sharabi

From: Ohad Sharabi <osharabi@habana.ai>

Looking forward we will need to report to the user what is the default
page size used.

This will be done more conveniently by explicitly updating the property
rather than to rely on a "0 meaning default" value.

Signed-off-by: Ohad Sharabi <osharabi@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
---
 drivers/misc/habanalabs/common/habanalabs.h | 5 +++++
 drivers/misc/habanalabs/common/memory.c     | 2 +-
 drivers/misc/habanalabs/gaudi/gaudi.c       | 1 +
 drivers/misc/habanalabs/goya/goya.c         | 1 +
 4 files changed, 8 insertions(+), 1 deletion(-)

diff --git a/drivers/misc/habanalabs/common/habanalabs.h b/drivers/misc/habanalabs/common/habanalabs.h
index 1edaf6ab67bd..af47accd4a56 100644
--- a/drivers/misc/habanalabs/common/habanalabs.h
+++ b/drivers/misc/habanalabs/common/habanalabs.h
@@ -528,6 +528,10 @@ struct hl_hints_range {
  * @fw_app_cpu_boot_dev_sts1: bitmap representation of application security
  *                            status reported by FW, bit description can be
  *                            found in CPU_BOOT_DEV_STS1
+ * @device_mem_alloc_default_page_size: may be different than dram_page_size only for ASICs for
+ *                                      which the property supports_user_set_page_size is true
+ *                                      (i.e. the DRAM supports multiple page sizes), otherwise
+ *                                      it will shall  be equal to dram_page_size.
  * @collective_first_sob: first sync object available for collective use
  * @collective_first_mon: first monitor available for collective use
  * @sync_stream_first_sob: first sync object available for sync stream use
@@ -626,6 +630,7 @@ struct asic_fixed_properties {
 	u32				fw_bootfit_cpu_boot_dev_sts1;
 	u32				fw_app_cpu_boot_dev_sts0;
 	u32				fw_app_cpu_boot_dev_sts1;
+	u32				device_mem_alloc_default_page_size;
 	u16				collective_first_sob;
 	u16				collective_first_mon;
 	u16				sync_stream_first_sob;
diff --git a/drivers/misc/habanalabs/common/memory.c b/drivers/misc/habanalabs/common/memory.c
index e008d82e4ba3..671633414504 100644
--- a/drivers/misc/habanalabs/common/memory.c
+++ b/drivers/misc/habanalabs/common/memory.c
@@ -41,7 +41,7 @@ static int set_alloc_page_size(struct hl_device *hdev, struct hl_mem_in *args, u
 			return -EINVAL;
 		}
 	} else {
-		psize = hdev->asic_prop.dram_page_size;
+		psize = prop->device_mem_alloc_default_page_size;
 	}
 
 	*page_size = psize;
diff --git a/drivers/misc/habanalabs/gaudi/gaudi.c b/drivers/misc/habanalabs/gaudi/gaudi.c
index 21c2b678ff72..feb1323a8f4a 100644
--- a/drivers/misc/habanalabs/gaudi/gaudi.c
+++ b/drivers/misc/habanalabs/gaudi/gaudi.c
@@ -595,6 +595,7 @@ static int gaudi_set_fixed_properties(struct hl_device *hdev)
 	prop->mmu_hop_table_size = HOP_TABLE_SIZE_512_PTE;
 	prop->mmu_hop0_tables_total_size = HOP0_512_PTE_TABLES_TOTAL_SIZE;
 	prop->dram_page_size = PAGE_SIZE_2MB;
+	prop->device_mem_alloc_default_page_size = prop->dram_page_size;
 	prop->dram_supports_virtual_memory = false;
 
 	prop->pmmu.hop0_shift = MMU_V1_1_HOP0_SHIFT;
diff --git a/drivers/misc/habanalabs/goya/goya.c b/drivers/misc/habanalabs/goya/goya.c
index ec9358bcbf0b..5bd665188ea6 100644
--- a/drivers/misc/habanalabs/goya/goya.c
+++ b/drivers/misc/habanalabs/goya/goya.c
@@ -413,6 +413,7 @@ int goya_set_fixed_properties(struct hl_device *hdev)
 	prop->mmu_hop_table_size = HOP_TABLE_SIZE_512_PTE;
 	prop->mmu_hop0_tables_total_size = HOP0_512_PTE_TABLES_TOTAL_SIZE;
 	prop->dram_page_size = PAGE_SIZE_2MB;
+	prop->device_mem_alloc_default_page_size = prop->dram_page_size;
 	prop->dram_supports_virtual_memory = true;
 
 	prop->dmmu.hop0_shift = MMU_V1_0_HOP0_SHIFT;
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCH 02/11] habanalabs: add DRAM default page size to HW info
  2022-03-16 11:41 [PATCH 01/11] habanalabs: set non-0 value in dram default page size Oded Gabbay
@ 2022-03-16 11:41 ` Oded Gabbay
  2022-03-16 11:41 ` [PATCH 03/11] habanalabs: change mmu_get_real_page_size to be ASIC-specific Oded Gabbay
                   ` (8 subsequent siblings)
  9 siblings, 0 replies; 13+ messages in thread
From: Oded Gabbay @ 2022-03-16 11:41 UTC (permalink / raw)
  To: linux-kernel; +Cc: Ohad Sharabi

From: Ohad Sharabi <osharabi@habana.ai>

When using the device memory allocation API the user ought to know what
is the default allocation page size.

Signed-off-by: Ohad Sharabi <osharabi@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
---
 drivers/misc/habanalabs/common/habanalabs_ioctl.c | 1 +
 include/uapi/misc/habanalabs.h                    | 3 +++
 2 files changed, 4 insertions(+)

diff --git a/drivers/misc/habanalabs/common/habanalabs_ioctl.c b/drivers/misc/habanalabs/common/habanalabs_ioctl.c
index c13a3c2a7013..14c58579b9cd 100644
--- a/drivers/misc/habanalabs/common/habanalabs_ioctl.c
+++ b/drivers/misc/habanalabs/common/habanalabs_ioctl.c
@@ -76,6 +76,7 @@ static int hw_ip_info(struct hl_device *hdev, struct hl_info_args *args)
 	if (hw_ip.dram_size > PAGE_SIZE)
 		hw_ip.dram_enabled = 1;
 	hw_ip.dram_page_size = prop->dram_page_size;
+	hw_ip.device_mem_alloc_default_page_size = prop->device_mem_alloc_default_page_size;
 	hw_ip.num_of_events = prop->num_of_events;
 
 	memcpy(hw_ip.cpucp_version, prop->cpucp_info.cpucp_version,
diff --git a/include/uapi/misc/habanalabs.h b/include/uapi/misc/habanalabs.h
index 1d6b4f0c4159..ae2441521467 100644
--- a/include/uapi/misc/habanalabs.h
+++ b/include/uapi/misc/habanalabs.h
@@ -409,6 +409,7 @@ enum hl_server_type {
  * @dram_page_size: The DRAM physical page size.
  * @number_of_user_interrupts: The number of interrupts that are available to the userspace
  *                             application to use. Relevant for Gaudi2 and later.
+ * @device_mem_alloc_default_page_size: default page size used in device memory allocation.
  */
 struct hl_info_hw_ip_info {
 	__u64 sram_base_address;
@@ -436,6 +437,8 @@ struct hl_info_hw_ip_info {
 	__u32 reserved3;
 	__u16 number_of_user_interrupts;
 	__u16 pad2;
+	__u64 reserved4;
+	__u64 device_mem_alloc_default_page_size;
 };
 
 struct hl_info_dram_usage {
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCH 03/11] habanalabs: change mmu_get_real_page_size to be ASIC-specific
  2022-03-16 11:41 [PATCH 01/11] habanalabs: set non-0 value in dram default page size Oded Gabbay
  2022-03-16 11:41 ` [PATCH 02/11] habanalabs: add DRAM default page size to HW info Oded Gabbay
@ 2022-03-16 11:41 ` Oded Gabbay
  2022-03-16 22:33   ` kernel test robot
  2022-03-16 11:41 ` [PATCH 04/11] habanalabs: convert all MMU masks/shifts to arrays Oded Gabbay
                   ` (7 subsequent siblings)
  9 siblings, 1 reply; 13+ messages in thread
From: Oded Gabbay @ 2022-03-16 11:41 UTC (permalink / raw)
  To: linux-kernel; +Cc: Ohad Sharabi

From: Ohad Sharabi <osharabi@habana.ai>

This patch breaks the cumbersome implementation of "get real page size"
along with it's multiple inner conditions and implement each case
(according to the real complexity) inside an ASIC function.

Signed-off-by: Ohad Sharabi <osharabi@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
---
 drivers/misc/habanalabs/common/habanalabs.h |   4 +
 drivers/misc/habanalabs/common/mmu/mmu.c    | 204 +++++++++++---------
 drivers/misc/habanalabs/gaudi/gaudi.c       |   3 +-
 drivers/misc/habanalabs/goya/goya.c         |   3 +-
 4 files changed, 116 insertions(+), 98 deletions(-)

diff --git a/drivers/misc/habanalabs/common/habanalabs.h b/drivers/misc/habanalabs/common/habanalabs.h
index af47accd4a56..990190fc3054 100644
--- a/drivers/misc/habanalabs/common/habanalabs.h
+++ b/drivers/misc/habanalabs/common/habanalabs.h
@@ -1450,6 +1450,8 @@ struct hl_asic_funcs {
 	void (*set_pci_memory_regions)(struct hl_device *hdev);
 	u32* (*get_stream_master_qid_arr)(void);
 	bool (*is_valid_dram_page_size)(u32 page_size);
+	int (*mmu_get_real_page_size)(struct hl_device *hdev, struct hl_mmu_properties *mmu_prop,
+					u32 page_size, u32 *real_page_size, bool is_dram_addr);
 };
 
 
@@ -3109,6 +3111,8 @@ int hl_mmu_ctx_init(struct hl_ctx *ctx);
 void hl_mmu_ctx_fini(struct hl_ctx *ctx);
 int hl_mmu_map_page(struct hl_ctx *ctx, u64 virt_addr, u64 phys_addr,
 		u32 page_size, bool flush_pte);
+int hl_mmu_get_real_page_size(struct hl_device *hdev, struct hl_mmu_properties *mmu_prop,
+				u32 page_size, u32 *real_page_size, bool is_dram_addr);
 int hl_mmu_unmap_page(struct hl_ctx *ctx, u64 virt_addr, u32 page_size,
 		bool flush_pte);
 int hl_mmu_map_contiguous(struct hl_ctx *ctx, u64 virt_addr,
diff --git a/drivers/misc/habanalabs/common/mmu/mmu.c b/drivers/misc/habanalabs/common/mmu/mmu.c
index 810b73421ce1..eb85d9fb7462 100644
--- a/drivers/misc/habanalabs/common/mmu/mmu.c
+++ b/drivers/misc/habanalabs/common/mmu/mmu.c
@@ -9,6 +9,20 @@
 
 #include "../habanalabs.h"
 
+/**
+ * hl_mmu_get_funcs() - get MMU functions structure
+ * @hdev: habanalabs device structure.
+ * @pgt_residency: page table residency.
+ * @is_dram_addr: true if we need HMMU functions
+ *
+ * @return appropriate MMU functions structure
+ */
+static struct hl_mmu_funcs *hl_mmu_get_funcs(struct hl_device *hdev, int pgt_residency,
+									bool is_dram_addr)
+{
+	return &hdev->mmu_func[pgt_residency];
+}
+
 bool hl_is_dram_va(struct hl_device *hdev, u64 virt_addr)
 {
 	struct asic_fixed_properties *prop = &hdev->asic_prop;
@@ -121,6 +135,53 @@ void hl_mmu_ctx_fini(struct hl_ctx *ctx)
 	mutex_destroy(&ctx->mmu_lock);
 }
 
+/*
+ * hl_mmu_get_real_page_size - get real page size to use in map/unmap operation
+ *
+ * @hdev: pointer to device data.
+ * @mmu_prop: MMU properties.
+ * @page_size: page size
+ * @real_page_size: set here the actual page size to use for the operation
+ * @is_dram_addr: true if DRAM address, otherwise false.
+ *
+ * @return 0 on success, otherwise non 0 error code
+ *
+ * note that this is general implementation that can fit most MMU arch. but as this is used as an
+ * MMU function:
+ * 1. it shall not be called directly- only from mmu_func structure instance
+ * 2. each MMU may modify the implementation internally
+ */
+int hl_mmu_get_real_page_size(struct hl_device *hdev, struct hl_mmu_properties *mmu_prop,
+				u32 page_size, u32 *real_page_size, bool is_dram_addr)
+{
+	/*
+	 * The H/W handles mapping of specific page sizes. Hence if the page
+	 * size is bigger, we break it to sub-pages and map them separately.
+	 */
+	if ((page_size % mmu_prop->page_size) == 0) {
+		*real_page_size = mmu_prop->page_size;
+		return 0;
+	}
+
+	dev_err(hdev->dev, "page size of %u is not %uKB aligned, can't map\n",
+						page_size, mmu_prop->page_size >> 10);
+
+	return -EFAULT;
+}
+
+static struct hl_mmu_properties *hl_mmu_get_prop(struct hl_device *hdev, u32 page_size,
+							bool is_dram_addr)
+{
+	struct asic_fixed_properties *prop = &hdev->asic_prop;
+
+	if (is_dram_addr)
+		return &prop->dmmu;
+	else if ((page_size % prop->pmmu_huge.page_size) == 0)
+		return &prop->pmmu_huge;
+
+	return &prop->pmmu;
+}
+
 /*
  * hl_mmu_unmap_page - unmaps a virtual addr
  *
@@ -142,60 +203,37 @@ void hl_mmu_ctx_fini(struct hl_ctx *ctx)
  * For optimization reasons PCI flush may be requested once after unmapping of
  * large area.
  */
-int hl_mmu_unmap_page(struct hl_ctx *ctx, u64 virt_addr, u32 page_size,
-		bool flush_pte)
+int hl_mmu_unmap_page(struct hl_ctx *ctx, u64 virt_addr, u32 page_size, bool flush_pte)
 {
 	struct hl_device *hdev = ctx->hdev;
-	struct asic_fixed_properties *prop = &hdev->asic_prop;
+	struct asic_fixed_properties *prop;
 	struct hl_mmu_properties *mmu_prop;
-	u64 real_virt_addr;
+	struct hl_mmu_funcs *mmu_funcs;
+	int i, pgt_residency, rc = 0;
 	u32 real_page_size, npages;
-	int i, rc = 0, pgt_residency;
+	u64 real_virt_addr;
 	bool is_dram_addr;
 
 	if (!hdev->mmu_enable)
 		return 0;
 
+	prop = &hdev->asic_prop;
 	is_dram_addr = hl_is_dram_va(hdev, virt_addr);
-
-	if (is_dram_addr)
-		mmu_prop = &prop->dmmu;
-	else if ((page_size % prop->pmmu_huge.page_size) == 0)
-		mmu_prop = &prop->pmmu_huge;
-	else
-		mmu_prop = &prop->pmmu;
+	mmu_prop = hl_mmu_get_prop(hdev, page_size, is_dram_addr);
 
 	pgt_residency = mmu_prop->host_resident ? MMU_HR_PGT : MMU_DR_PGT;
-	/*
-	 * The H/W handles mapping of specific page sizes. Hence if the page
-	 * size is bigger, we break it to sub-pages and unmap them separately.
-	 */
-	if ((page_size % mmu_prop->page_size) == 0) {
-		real_page_size = mmu_prop->page_size;
-	} else {
-		/*
-		 * MMU page size may differ from DRAM page size.
-		 * In such case work with the DRAM page size and let the MMU
-		 * scrambling routine to handle this mismatch when
-		 * calculating the address to remove from the MMU page table
-		 */
-		if (is_dram_addr && ((page_size % prop->dram_page_size) == 0)) {
-			real_page_size = prop->dram_page_size;
-		} else {
-			dev_err(hdev->dev,
-				"page size of %u is not %uKB aligned, can't unmap\n",
-				page_size, mmu_prop->page_size >> 10);
+	mmu_funcs = hl_mmu_get_funcs(hdev, pgt_residency, is_dram_addr);
 
-			return -EFAULT;
-		}
-	}
+	rc = hdev->asic_funcs->mmu_get_real_page_size(hdev, mmu_prop, page_size, &real_page_size,
+							is_dram_addr);
+	if (rc)
+		return rc;
 
 	npages = page_size / real_page_size;
 	real_virt_addr = virt_addr;
 
 	for (i = 0 ; i < npages ; i++) {
-		rc = hdev->mmu_func[pgt_residency].unmap(ctx,
-						real_virt_addr, is_dram_addr);
+		rc = mmu_funcs->unmap(ctx, real_virt_addr, is_dram_addr);
 		if (rc)
 			break;
 
@@ -203,7 +241,7 @@ int hl_mmu_unmap_page(struct hl_ctx *ctx, u64 virt_addr, u32 page_size,
 	}
 
 	if (flush_pte)
-		hdev->mmu_func[pgt_residency].flush(ctx);
+		mmu_funcs->flush(ctx);
 
 	return rc;
 }
@@ -230,56 +268,33 @@ int hl_mmu_unmap_page(struct hl_ctx *ctx, u64 virt_addr, u32 page_size,
  * For optimization reasons PCI flush may be requested once after mapping of
  * large area.
  */
-int hl_mmu_map_page(struct hl_ctx *ctx, u64 virt_addr, u64 phys_addr,
-		u32 page_size, bool flush_pte)
+int hl_mmu_map_page(struct hl_ctx *ctx, u64 virt_addr, u64 phys_addr, u32 page_size,
+			bool flush_pte)
 {
+	int i, rc, pgt_residency, mapped_cnt = 0;
 	struct hl_device *hdev = ctx->hdev;
-	struct asic_fixed_properties *prop = &hdev->asic_prop;
+	struct asic_fixed_properties *prop;
 	struct hl_mmu_properties *mmu_prop;
 	u64 real_virt_addr, real_phys_addr;
+	struct hl_mmu_funcs *mmu_funcs;
 	u32 real_page_size, npages;
-	int i, rc, pgt_residency, mapped_cnt = 0;
 	bool is_dram_addr;
 
 
 	if (!hdev->mmu_enable)
 		return 0;
 
+	prop = &hdev->asic_prop;
 	is_dram_addr = hl_is_dram_va(hdev, virt_addr);
-
-	if (is_dram_addr)
-		mmu_prop = &prop->dmmu;
-	else if ((page_size % prop->pmmu_huge.page_size) == 0)
-		mmu_prop = &prop->pmmu_huge;
-	else
-		mmu_prop = &prop->pmmu;
+	mmu_prop = hl_mmu_get_prop(hdev, page_size, is_dram_addr);
 
 	pgt_residency = mmu_prop->host_resident ? MMU_HR_PGT : MMU_DR_PGT;
+	mmu_funcs = hl_mmu_get_funcs(hdev, pgt_residency, is_dram_addr);
 
-	/*
-	 * The H/W handles mapping of specific page sizes. Hence if the page
-	 * size is bigger, we break it to sub-pages and map them separately.
-	 */
-	if ((page_size % mmu_prop->page_size) == 0) {
-		real_page_size = mmu_prop->page_size;
-	} else if (is_dram_addr && ((page_size % prop->dram_page_size) == 0) &&
-			(prop->dram_page_size < mmu_prop->page_size)) {
-		/*
-		 * MMU page size may differ from DRAM page size.
-		 * In such case work with the DRAM page size and let the MMU
-		 * scrambling routine handle this mismatch when calculating
-		 * the address to place in the MMU page table. (in that case
-		 * also make sure that the dram_page_size smaller than the
-		 * mmu page size)
-		 */
-		real_page_size = prop->dram_page_size;
-	} else {
-		dev_err(hdev->dev,
-			"page size of %u is not %uKB aligned, can't map\n",
-			page_size, mmu_prop->page_size >> 10);
-
-		return -EFAULT;
-	}
+	rc = hdev->asic_funcs->mmu_get_real_page_size(hdev, mmu_prop, page_size, &real_page_size,
+							is_dram_addr);
+	if (rc)
+		return rc;
 
 	/*
 	 * Verify that the phys and virt addresses are aligned with the
@@ -302,9 +317,8 @@ int hl_mmu_map_page(struct hl_ctx *ctx, u64 virt_addr, u64 phys_addr,
 	real_phys_addr = phys_addr;
 
 	for (i = 0 ; i < npages ; i++) {
-		rc = hdev->mmu_func[pgt_residency].map(ctx,
-						real_virt_addr, real_phys_addr,
-						real_page_size, is_dram_addr);
+		rc = mmu_funcs->map(ctx, real_virt_addr, real_phys_addr, real_page_size,
+										is_dram_addr);
 		if (rc)
 			goto err;
 
@@ -314,22 +328,21 @@ int hl_mmu_map_page(struct hl_ctx *ctx, u64 virt_addr, u64 phys_addr,
 	}
 
 	if (flush_pte)
-		hdev->mmu_func[pgt_residency].flush(ctx);
+		mmu_funcs->flush(ctx);
 
 	return 0;
 
 err:
 	real_virt_addr = virt_addr;
 	for (i = 0 ; i < mapped_cnt ; i++) {
-		if (hdev->mmu_func[pgt_residency].unmap(ctx,
-						real_virt_addr, is_dram_addr))
+		if (mmu_funcs->unmap(ctx, real_virt_addr, is_dram_addr))
 			dev_warn_ratelimited(hdev->dev,
 				"failed to unmap va: 0x%llx\n", real_virt_addr);
 
 		real_virt_addr += real_page_size;
 	}
 
-	hdev->mmu_func[pgt_residency].flush(ctx);
+	mmu_funcs->flush(ctx);
 
 	return rc;
 }
@@ -508,7 +521,7 @@ static void hl_mmu_pa_page_with_offset(struct hl_ctx *ctx, u64 virt_addr,
 		/*
 		 * Bit arithmetics cannot be used for non power of two page
 		 * sizes. In addition, since bit arithmetics is not used,
-		 * we cannot ignore dram base. All that shall be considerd.
+		 * we cannot ignore dram base. All that shall be considered.
 		 */
 
 		dram_page_size = prop->dram_page_size;
@@ -557,40 +570,39 @@ int hl_mmu_get_tlb_info(struct hl_ctx *ctx, u64 virt_addr,
 			struct hl_mmu_hop_info *hops)
 {
 	struct hl_device *hdev = ctx->hdev;
-	struct asic_fixed_properties *prop = &hdev->asic_prop;
+	struct asic_fixed_properties *prop;
 	struct hl_mmu_properties *mmu_prop;
-	int rc;
+	struct hl_mmu_funcs *mmu_funcs;
+	int pgt_residency, rc;
 	bool is_dram_addr;
 
 	if (!hdev->mmu_enable)
 		return -EOPNOTSUPP;
 
+	prop = &hdev->asic_prop;
 	hops->scrambled_vaddr = virt_addr;      /* assume no scrambling */
 
 	is_dram_addr = hl_mem_area_inside_range(virt_addr, prop->dmmu.page_size,
-						prop->dmmu.start_addr,
-						prop->dmmu.end_addr);
+								prop->dmmu.start_addr,
+								prop->dmmu.end_addr);
 
-	/* host-residency is the same in PMMU and HPMMU, use one of them */
+	/* host-residency is the same in PMMU and PMMU huge, no need to distinguish here */
 	mmu_prop = is_dram_addr ? &prop->dmmu : &prop->pmmu;
+	pgt_residency = mmu_prop->host_resident ? MMU_HR_PGT : MMU_DR_PGT;
+	mmu_funcs = hl_mmu_get_funcs(hdev, pgt_residency, is_dram_addr);
 
 	mutex_lock(&ctx->mmu_lock);
-
-	if (mmu_prop->host_resident)
-		rc = hdev->mmu_func[MMU_HR_PGT].get_tlb_info(ctx,
-							virt_addr, hops);
-	else
-		rc = hdev->mmu_func[MMU_DR_PGT].get_tlb_info(ctx,
-							virt_addr, hops);
-
+	rc = mmu_funcs->get_tlb_info(ctx, virt_addr, hops);
 	mutex_unlock(&ctx->mmu_lock);
 
+	if (rc)
+		return rc;
+
 	/* add page offset to physical address */
 	if (hops->unscrambled_paddr)
-		hl_mmu_pa_page_with_offset(ctx, virt_addr, hops,
-					&hops->unscrambled_paddr);
+		hl_mmu_pa_page_with_offset(ctx, virt_addr, hops, &hops->unscrambled_paddr);
 
-	return rc;
+	return 0;
 }
 
 int hl_mmu_if_set_funcs(struct hl_device *hdev)
diff --git a/drivers/misc/habanalabs/gaudi/gaudi.c b/drivers/misc/habanalabs/gaudi/gaudi.c
index feb1323a8f4a..47afc5d1aef4 100644
--- a/drivers/misc/habanalabs/gaudi/gaudi.c
+++ b/drivers/misc/habanalabs/gaudi/gaudi.c
@@ -9487,7 +9487,8 @@ static const struct hl_asic_funcs gaudi_funcs = {
 	.get_sob_addr = gaudi_get_sob_addr,
 	.set_pci_memory_regions = gaudi_set_pci_memory_regions,
 	.get_stream_master_qid_arr = gaudi_get_stream_master_qid_arr,
-	.is_valid_dram_page_size = NULL
+	.is_valid_dram_page_size = NULL,
+	.mmu_get_real_page_size = hl_mmu_get_real_page_size,
 };
 
 /**
diff --git a/drivers/misc/habanalabs/goya/goya.c b/drivers/misc/habanalabs/goya/goya.c
index 5bd665188ea6..e4b7b9706d1a 100644
--- a/drivers/misc/habanalabs/goya/goya.c
+++ b/drivers/misc/habanalabs/goya/goya.c
@@ -5765,7 +5765,8 @@ static const struct hl_asic_funcs goya_funcs = {
 	.get_sob_addr = &goya_get_sob_addr,
 	.set_pci_memory_regions = goya_set_pci_memory_regions,
 	.get_stream_master_qid_arr = goya_get_stream_master_qid_arr,
-	.is_valid_dram_page_size = NULL
+	.is_valid_dram_page_size = NULL,
+	.mmu_get_real_page_size = hl_mmu_get_real_page_size,
 };
 
 /*
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCH 04/11] habanalabs: convert all MMU masks/shifts to arrays
  2022-03-16 11:41 [PATCH 01/11] habanalabs: set non-0 value in dram default page size Oded Gabbay
  2022-03-16 11:41 ` [PATCH 02/11] habanalabs: add DRAM default page size to HW info Oded Gabbay
  2022-03-16 11:41 ` [PATCH 03/11] habanalabs: change mmu_get_real_page_size to be ASIC-specific Oded Gabbay
@ 2022-03-16 11:41 ` Oded Gabbay
  2022-03-16 11:41 ` [PATCH 05/11] habanalabs: add user API to get valid DRAM page sizes Oded Gabbay
                   ` (6 subsequent siblings)
  9 siblings, 0 replies; 13+ messages in thread
From: Oded Gabbay @ 2022-03-16 11:41 UTC (permalink / raw)
  To: linux-kernel; +Cc: Ohad Sharabi

From: Ohad Sharabi <osharabi@habana.ai>

There is no need to hold each MMU mask/shift as a denoted structure
member (e.g. hop0_mask).

Instead converting it to array will result in smaller and more readable
code.

Signed-off-by: Ohad Sharabi <osharabi@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
---
 drivers/misc/habanalabs/common/habanalabs.h   | 28 ++---------
 drivers/misc/habanalabs/common/mmu/mmu.c      | 46 ++++---------------
 drivers/misc/habanalabs/common/mmu/mmu_v1.c   | 20 ++++----
 drivers/misc/habanalabs/gaudi/gaudi.c         | 20 ++++----
 drivers/misc/habanalabs/goya/goya.c           | 20 ++++----
 .../include/hw_ip/mmu/mmu_general.h           | 10 ++++
 6 files changed, 52 insertions(+), 92 deletions(-)

diff --git a/drivers/misc/habanalabs/common/habanalabs.h b/drivers/misc/habanalabs/common/habanalabs.h
index 990190fc3054..6eb35e4124c2 100644
--- a/drivers/misc/habanalabs/common/habanalabs.h
+++ b/drivers/misc/habanalabs/common/habanalabs.h
@@ -394,18 +394,8 @@ enum hl_device_hw_state {
  * struct hl_mmu_properties - ASIC specific MMU address translation properties.
  * @start_addr: virtual start address of the memory region.
  * @end_addr: virtual end address of the memory region.
- * @hop0_shift: shift of hop 0 mask.
- * @hop1_shift: shift of hop 1 mask.
- * @hop2_shift: shift of hop 2 mask.
- * @hop3_shift: shift of hop 3 mask.
- * @hop4_shift: shift of hop 4 mask.
- * @hop5_shift: shift of hop 5 mask.
- * @hop0_mask: mask to get the PTE address in hop 0.
- * @hop1_mask: mask to get the PTE address in hop 1.
- * @hop2_mask: mask to get the PTE address in hop 2.
- * @hop3_mask: mask to get the PTE address in hop 3.
- * @hop4_mask: mask to get the PTE address in hop 4.
- * @hop5_mask: mask to get the PTE address in hop 5.
+ * @hop_shifts: array holds HOPs shifts.
+ * @hop_masks: array holds HOPs masks.
  * @last_mask: mask to get the bit indicating this is the last hop.
  * @pgt_size: size for page tables.
  * @page_size: default page size used to allocate memory.
@@ -418,18 +408,8 @@ enum hl_device_hw_state {
 struct hl_mmu_properties {
 	u64	start_addr;
 	u64	end_addr;
-	u64	hop0_shift;
-	u64	hop1_shift;
-	u64	hop2_shift;
-	u64	hop3_shift;
-	u64	hop4_shift;
-	u64	hop5_shift;
-	u64	hop0_mask;
-	u64	hop1_mask;
-	u64	hop2_mask;
-	u64	hop3_mask;
-	u64	hop4_mask;
-	u64	hop5_mask;
+	u64	hop_shifts[MMU_HOP_MAX];
+	u64	hop_masks[MMU_HOP_MAX];
 	u64	last_mask;
 	u64	pgt_size;
 	u32	page_size;
diff --git a/drivers/misc/habanalabs/common/mmu/mmu.c b/drivers/misc/habanalabs/common/mmu/mmu.c
index eb85d9fb7462..b5d439aceb32 100644
--- a/drivers/misc/habanalabs/common/mmu/mmu.c
+++ b/drivers/misc/habanalabs/common/mmu/mmu.c
@@ -493,11 +493,9 @@ static void hl_mmu_pa_page_with_offset(struct hl_ctx *ctx, u64 virt_addr,
 						struct hl_mmu_hop_info *hops,
 						u64 *phys_addr)
 {
-	struct hl_device *hdev = ctx->hdev;
-	struct asic_fixed_properties *prop = &hdev->asic_prop;
+	struct asic_fixed_properties *prop = &ctx->hdev->asic_prop;
 	u64 offset_mask, addr_mask, hop_shift, tmp_phys_addr;
-	u32 hop0_shift_off;
-	void *p;
+	struct hl_mmu_properties *mmu_prop;
 
 	/* last hop holds the phys address and flags */
 	if (hops->unscrambled_paddr)
@@ -506,11 +504,11 @@ static void hl_mmu_pa_page_with_offset(struct hl_ctx *ctx, u64 virt_addr,
 		tmp_phys_addr = hops->hop_info[hops->used_hops - 1].hop_pte_val;
 
 	if (hops->range_type == HL_VA_RANGE_TYPE_HOST_HUGE)
-		p = &prop->pmmu_huge;
+		mmu_prop = &prop->pmmu_huge;
 	else if (hops->range_type == HL_VA_RANGE_TYPE_HOST)
-		p = &prop->pmmu;
+		mmu_prop = &prop->pmmu;
 	else /* HL_VA_RANGE_TYPE_DRAM */
-		p = &prop->dmmu;
+		mmu_prop = &prop->dmmu;
 
 	if ((hops->range_type == HL_VA_RANGE_TYPE_DRAM) &&
 			!is_power_of_2(prop->dram_page_size)) {
@@ -539,10 +537,7 @@ static void hl_mmu_pa_page_with_offset(struct hl_ctx *ctx, u64 virt_addr,
 		 * structure in order to determine the right masks
 		 * for the page offset.
 		 */
-		hop0_shift_off = offsetof(struct hl_mmu_properties, hop0_shift);
-		p = (char *)p + hop0_shift_off;
-		p = (char *)p + ((hops->used_hops - 1) * sizeof(u64));
-		hop_shift = *(u64 *)p;
+		hop_shift = mmu_prop->hop_shifts[hops->used_hops - 1];
 		offset_mask = (1ull << hop_shift) - 1;
 		addr_mask = ~(offset_mask);
 		*phys_addr = (tmp_phys_addr & addr_mask) |
@@ -698,33 +693,8 @@ u64 hl_mmu_get_hop_pte_phys_addr(struct hl_ctx *ctx, struct hl_mmu_properties *m
 		return U64_MAX;
 	}
 
-	/* currently max number of HOPs is 6 */
-	switch (hop_idx) {
-	case 0:
-		mask = mmu_prop->hop0_mask;
-		shift = mmu_prop->hop0_shift;
-		break;
-	case 1:
-		mask = mmu_prop->hop1_mask;
-		shift = mmu_prop->hop1_shift;
-		break;
-	case 2:
-		mask = mmu_prop->hop2_mask;
-		shift = mmu_prop->hop2_shift;
-		break;
-	case 3:
-		mask = mmu_prop->hop3_mask;
-		shift = mmu_prop->hop3_shift;
-		break;
-	case 4:
-		mask = mmu_prop->hop4_mask;
-		shift = mmu_prop->hop4_shift;
-		break;
-	default:
-		mask = mmu_prop->hop5_mask;
-		shift = mmu_prop->hop5_shift;
-		break;
-	}
+	shift = mmu_prop->hop_shifts[hop_idx];
+	mask = mmu_prop->hop_masks[hop_idx];
 
 	return hop_addr + ctx->hdev->asic_prop.mmu_pte_size * ((virt_addr & mask) >> shift);
 }
diff --git a/drivers/misc/habanalabs/common/mmu/mmu_v1.c b/drivers/misc/habanalabs/common/mmu/mmu_v1.c
index d03786d0c407..f43657ad442b 100644
--- a/drivers/misc/habanalabs/common/mmu/mmu_v1.c
+++ b/drivers/misc/habanalabs/common/mmu/mmu_v1.c
@@ -181,40 +181,40 @@ static inline u64 get_hop0_pte_addr(struct hl_ctx *ctx,
 					struct hl_mmu_properties *mmu_prop,
 					u64 hop_addr, u64 vaddr)
 {
-	return get_hopN_pte_addr(ctx, hop_addr, vaddr, mmu_prop->hop0_mask,
-					mmu_prop->hop0_shift);
+	return get_hopN_pte_addr(ctx, hop_addr, vaddr, mmu_prop->hop_masks[MMU_HOP0],
+					mmu_prop->hop_shifts[MMU_HOP0]);
 }
 
 static inline u64 get_hop1_pte_addr(struct hl_ctx *ctx,
 					struct hl_mmu_properties *mmu_prop,
 					u64 hop_addr, u64 vaddr)
 {
-	return get_hopN_pte_addr(ctx, hop_addr, vaddr, mmu_prop->hop1_mask,
-					mmu_prop->hop1_shift);
+	return get_hopN_pte_addr(ctx, hop_addr, vaddr, mmu_prop->hop_masks[MMU_HOP1],
+					mmu_prop->hop_shifts[MMU_HOP1]);
 }
 
 static inline u64 get_hop2_pte_addr(struct hl_ctx *ctx,
 					struct hl_mmu_properties *mmu_prop,
 					u64 hop_addr, u64 vaddr)
 {
-	return get_hopN_pte_addr(ctx, hop_addr, vaddr, mmu_prop->hop2_mask,
-					mmu_prop->hop2_shift);
+	return get_hopN_pte_addr(ctx, hop_addr, vaddr, mmu_prop->hop_masks[MMU_HOP2],
+					mmu_prop->hop_shifts[MMU_HOP2]);
 }
 
 static inline u64 get_hop3_pte_addr(struct hl_ctx *ctx,
 					struct hl_mmu_properties *mmu_prop,
 					u64 hop_addr, u64 vaddr)
 {
-	return get_hopN_pte_addr(ctx, hop_addr, vaddr, mmu_prop->hop3_mask,
-					mmu_prop->hop3_shift);
+	return get_hopN_pte_addr(ctx, hop_addr, vaddr, mmu_prop->hop_masks[MMU_HOP3],
+					mmu_prop->hop_shifts[MMU_HOP3]);
 }
 
 static inline u64 get_hop4_pte_addr(struct hl_ctx *ctx,
 					struct hl_mmu_properties *mmu_prop,
 					u64 hop_addr, u64 vaddr)
 {
-	return get_hopN_pte_addr(ctx, hop_addr, vaddr, mmu_prop->hop4_mask,
-					mmu_prop->hop4_shift);
+	return get_hopN_pte_addr(ctx, hop_addr, vaddr, mmu_prop->hop_masks[MMU_HOP4],
+					mmu_prop->hop_shifts[MMU_HOP4]);
 }
 
 static inline u64 get_alloc_next_hop_addr(struct hl_ctx *ctx, u64 curr_pte,
diff --git a/drivers/misc/habanalabs/gaudi/gaudi.c b/drivers/misc/habanalabs/gaudi/gaudi.c
index 47afc5d1aef4..5979434d1905 100644
--- a/drivers/misc/habanalabs/gaudi/gaudi.c
+++ b/drivers/misc/habanalabs/gaudi/gaudi.c
@@ -598,16 +598,16 @@ static int gaudi_set_fixed_properties(struct hl_device *hdev)
 	prop->device_mem_alloc_default_page_size = prop->dram_page_size;
 	prop->dram_supports_virtual_memory = false;
 
-	prop->pmmu.hop0_shift = MMU_V1_1_HOP0_SHIFT;
-	prop->pmmu.hop1_shift = MMU_V1_1_HOP1_SHIFT;
-	prop->pmmu.hop2_shift = MMU_V1_1_HOP2_SHIFT;
-	prop->pmmu.hop3_shift = MMU_V1_1_HOP3_SHIFT;
-	prop->pmmu.hop4_shift = MMU_V1_1_HOP4_SHIFT;
-	prop->pmmu.hop0_mask = MMU_V1_1_HOP0_MASK;
-	prop->pmmu.hop1_mask = MMU_V1_1_HOP1_MASK;
-	prop->pmmu.hop2_mask = MMU_V1_1_HOP2_MASK;
-	prop->pmmu.hop3_mask = MMU_V1_1_HOP3_MASK;
-	prop->pmmu.hop4_mask = MMU_V1_1_HOP4_MASK;
+	prop->pmmu.hop_shifts[MMU_HOP0] = MMU_V1_1_HOP0_SHIFT;
+	prop->pmmu.hop_shifts[MMU_HOP1] = MMU_V1_1_HOP1_SHIFT;
+	prop->pmmu.hop_shifts[MMU_HOP2] = MMU_V1_1_HOP2_SHIFT;
+	prop->pmmu.hop_shifts[MMU_HOP3] = MMU_V1_1_HOP3_SHIFT;
+	prop->pmmu.hop_shifts[MMU_HOP4] = MMU_V1_1_HOP4_SHIFT;
+	prop->pmmu.hop_masks[MMU_HOP0] = MMU_V1_1_HOP0_MASK;
+	prop->pmmu.hop_masks[MMU_HOP1] = MMU_V1_1_HOP1_MASK;
+	prop->pmmu.hop_masks[MMU_HOP2] = MMU_V1_1_HOP2_MASK;
+	prop->pmmu.hop_masks[MMU_HOP3] = MMU_V1_1_HOP3_MASK;
+	prop->pmmu.hop_masks[MMU_HOP4] = MMU_V1_1_HOP4_MASK;
 	prop->pmmu.start_addr = VA_HOST_SPACE_START;
 	prop->pmmu.end_addr =
 			(VA_HOST_SPACE_START + VA_HOST_SPACE_SIZE / 2) - 1;
diff --git a/drivers/misc/habanalabs/goya/goya.c b/drivers/misc/habanalabs/goya/goya.c
index e4b7b9706d1a..ec347bd3bb69 100644
--- a/drivers/misc/habanalabs/goya/goya.c
+++ b/drivers/misc/habanalabs/goya/goya.c
@@ -416,16 +416,16 @@ int goya_set_fixed_properties(struct hl_device *hdev)
 	prop->device_mem_alloc_default_page_size = prop->dram_page_size;
 	prop->dram_supports_virtual_memory = true;
 
-	prop->dmmu.hop0_shift = MMU_V1_0_HOP0_SHIFT;
-	prop->dmmu.hop1_shift = MMU_V1_0_HOP1_SHIFT;
-	prop->dmmu.hop2_shift = MMU_V1_0_HOP2_SHIFT;
-	prop->dmmu.hop3_shift = MMU_V1_0_HOP3_SHIFT;
-	prop->dmmu.hop4_shift = MMU_V1_0_HOP4_SHIFT;
-	prop->dmmu.hop0_mask = MMU_V1_0_HOP0_MASK;
-	prop->dmmu.hop1_mask = MMU_V1_0_HOP1_MASK;
-	prop->dmmu.hop2_mask = MMU_V1_0_HOP2_MASK;
-	prop->dmmu.hop3_mask = MMU_V1_0_HOP3_MASK;
-	prop->dmmu.hop4_mask = MMU_V1_0_HOP4_MASK;
+	prop->dmmu.hop_shifts[MMU_HOP0] = MMU_V1_0_HOP0_SHIFT;
+	prop->dmmu.hop_shifts[MMU_HOP1] = MMU_V1_0_HOP1_SHIFT;
+	prop->dmmu.hop_shifts[MMU_HOP2] = MMU_V1_0_HOP2_SHIFT;
+	prop->dmmu.hop_shifts[MMU_HOP3] = MMU_V1_0_HOP3_SHIFT;
+	prop->dmmu.hop_shifts[MMU_HOP4] = MMU_V1_0_HOP4_SHIFT;
+	prop->dmmu.hop_masks[MMU_HOP0] = MMU_V1_0_HOP0_MASK;
+	prop->dmmu.hop_masks[MMU_HOP1] = MMU_V1_0_HOP1_MASK;
+	prop->dmmu.hop_masks[MMU_HOP2] = MMU_V1_0_HOP2_MASK;
+	prop->dmmu.hop_masks[MMU_HOP3] = MMU_V1_0_HOP3_MASK;
+	prop->dmmu.hop_masks[MMU_HOP4] = MMU_V1_0_HOP4_MASK;
 	prop->dmmu.start_addr = VA_DDR_SPACE_START;
 	prop->dmmu.end_addr = VA_DDR_SPACE_END;
 	prop->dmmu.page_size = PAGE_SIZE_2MB;
diff --git a/drivers/misc/habanalabs/include/hw_ip/mmu/mmu_general.h b/drivers/misc/habanalabs/include/hw_ip/mmu/mmu_general.h
index 758f246627f8..cae8ac8bc5b1 100644
--- a/drivers/misc/habanalabs/include/hw_ip/mmu/mmu_general.h
+++ b/drivers/misc/habanalabs/include/hw_ip/mmu/mmu_general.h
@@ -34,4 +34,14 @@
 
 #define MMU_CONFIG_TIMEOUT_USEC		2000 /* 2 ms */
 
+enum mmu_hop_num {
+	MMU_HOP0,
+	MMU_HOP1,
+	MMU_HOP2,
+	MMU_HOP3,
+	MMU_HOP4,
+	MMU_HOP5,
+	MMU_HOP_MAX,
+};
+
 #endif /* INCLUDE_MMU_GENERAL_H_ */
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCH 05/11] habanalabs: add user API to get valid DRAM page sizes
  2022-03-16 11:41 [PATCH 01/11] habanalabs: set non-0 value in dram default page size Oded Gabbay
                   ` (2 preceding siblings ...)
  2022-03-16 11:41 ` [PATCH 04/11] habanalabs: convert all MMU masks/shifts to arrays Oded Gabbay
@ 2022-03-16 11:41 ` Oded Gabbay
  2022-03-16 19:37   ` kernel test robot
  2022-03-16 11:41 ` [PATCH 06/11] habanalabs: add new return code to device fd open Oded Gabbay
                   ` (5 subsequent siblings)
  9 siblings, 1 reply; 13+ messages in thread
From: Oded Gabbay @ 2022-03-16 11:41 UTC (permalink / raw)
  To: linux-kernel; +Cc: Ohad Sharabi

From: Ohad Sharabi <osharabi@habana.ai>

Future devices will support multiple device memory page sizes.
In addition, an API for the user was added for it to be able to control
the device memory allocation page size.

This patch is a complementary patch to inform the user of the available
page size supported by the device.

Signed-off-by: Ohad Sharabi <osharabi@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
---
 drivers/misc/habanalabs/common/habanalabs.h   |  2 +
 .../misc/habanalabs/common/habanalabs_ioctl.c | 24 +++++++
 drivers/misc/habanalabs/gaudi/gaudi.c         |  7 +++
 drivers/misc/habanalabs/goya/goya.c           |  7 +++
 include/uapi/misc/habanalabs.h                | 63 +++++++++++--------
 5 files changed, 77 insertions(+), 26 deletions(-)

diff --git a/drivers/misc/habanalabs/common/habanalabs.h b/drivers/misc/habanalabs/common/habanalabs.h
index 6eb35e4124c2..564797766f42 100644
--- a/drivers/misc/habanalabs/common/habanalabs.h
+++ b/drivers/misc/habanalabs/common/habanalabs.h
@@ -1304,6 +1304,7 @@ struct fw_load_mgr {
  * @get_stream_master_qid_arr: get pointer to stream masters QID array
  * @is_valid_dram_page_size: return true if page size is supported in device
  *                           memory allocation, otherwise false.
+ * @get_valid_dram_page_orders: get valid device memory allocation page orders
  */
 struct hl_asic_funcs {
 	int (*early_init)(struct hl_device *hdev);
@@ -1432,6 +1433,7 @@ struct hl_asic_funcs {
 	bool (*is_valid_dram_page_size)(u32 page_size);
 	int (*mmu_get_real_page_size)(struct hl_device *hdev, struct hl_mmu_properties *mmu_prop,
 					u32 page_size, u32 *real_page_size, bool is_dram_addr);
+	void (*get_valid_dram_page_orders)(struct hl_info_dev_memalloc_page_sizes *info);
 };
 
 
diff --git a/drivers/misc/habanalabs/common/habanalabs_ioctl.c b/drivers/misc/habanalabs/common/habanalabs_ioctl.c
index 14c58579b9cd..c6fe35ae1238 100644
--- a/drivers/misc/habanalabs/common/habanalabs_ioctl.c
+++ b/drivers/misc/habanalabs/common/habanalabs_ioctl.c
@@ -591,6 +591,27 @@ static int razwi_info(struct hl_fpriv *hpriv, struct hl_info_args *args)
 	return copy_to_user(out, &info, min_t(size_t, max_size, sizeof(info))) ? -EFAULT : 0;
 }
 
+static int dev_mem_alloc_page_sizes_info(struct hl_fpriv *hpriv, struct hl_info_args *args)
+{
+	void __user *out = (void __user *) (uintptr_t) args->return_pointer;
+	struct hl_info_dev_memalloc_page_sizes info = {0};
+	struct hl_device *hdev = hpriv->hdev;
+	u32 max_size = args->return_size;
+
+	if ((!max_size) || (!out))
+		return -EINVAL;
+
+	/*
+	 * Future ASICs that will support multiple DRAM page sizes will support only "powers of 2"
+	 * pages (unlike some of the ASICs before supporting multiple page sizes).
+	 * For this reason for all ASICs that not support multiple page size the function will
+	 * return an empty bitmask indicating that multiple page sizes is not supported.
+	 */
+	hdev->asic_funcs->get_valid_dram_page_orders(&info);
+
+	return copy_to_user(out, &info, min_t(size_t, max_size, sizeof(info))) ? -EFAULT : 0;
+}
+
 static int _hl_info_ioctl(struct hl_fpriv *hpriv, void *data,
 				struct device *dev)
 {
@@ -641,6 +662,9 @@ static int _hl_info_ioctl(struct hl_fpriv *hpriv, void *data,
 	case HL_INFO_RAZWI_EVENT:
 		return razwi_info(hpriv, args);
 
+	case HL_INFO_DEV_MEM_ALLOC_PAGE_SIZES:
+		return dev_mem_alloc_page_sizes_info(hpriv, args);
+
 	default:
 		break;
 	}
diff --git a/drivers/misc/habanalabs/gaudi/gaudi.c b/drivers/misc/habanalabs/gaudi/gaudi.c
index 5979434d1905..950a7d6a4a35 100644
--- a/drivers/misc/habanalabs/gaudi/gaudi.c
+++ b/drivers/misc/habanalabs/gaudi/gaudi.c
@@ -9378,6 +9378,12 @@ static u32 *gaudi_get_stream_master_qid_arr(void)
 	return gaudi_stream_master;
 }
 
+void gaudi_get_valid_dram_page_orders(struct hl_info_dev_memalloc_page_sizes *info)
+{
+	/* set 0 since multiple pages are not supported */
+	info->page_order_bitmask = 0;
+}
+
 static ssize_t infineon_ver_show(struct device *dev, struct device_attribute *attr, char *buf)
 {
 	struct hl_device *hdev = dev_get_drvdata(dev);
@@ -9489,6 +9495,7 @@ static const struct hl_asic_funcs gaudi_funcs = {
 	.get_stream_master_qid_arr = gaudi_get_stream_master_qid_arr,
 	.is_valid_dram_page_size = NULL,
 	.mmu_get_real_page_size = hl_mmu_get_real_page_size,
+	.get_valid_dram_page_orders = gaudi_get_valid_dram_page_orders,
 };
 
 /**
diff --git a/drivers/misc/habanalabs/goya/goya.c b/drivers/misc/habanalabs/goya/goya.c
index ec347bd3bb69..38af7014b845 100644
--- a/drivers/misc/habanalabs/goya/goya.c
+++ b/drivers/misc/habanalabs/goya/goya.c
@@ -5679,6 +5679,12 @@ static u32 *goya_get_stream_master_qid_arr(void)
 	return NULL;
 }
 
+void goya_get_valid_dram_page_orders(struct hl_info_dev_memalloc_page_sizes *info)
+{
+	/* set 0 since multiple pages are not supported */
+	info->page_order_bitmask = 0;
+}
+
 static const struct hl_asic_funcs goya_funcs = {
 	.early_init = goya_early_init,
 	.early_fini = goya_early_fini,
@@ -5767,6 +5773,7 @@ static const struct hl_asic_funcs goya_funcs = {
 	.get_stream_master_qid_arr = goya_get_stream_master_qid_arr,
 	.is_valid_dram_page_size = NULL,
 	.mmu_get_real_page_size = hl_mmu_get_real_page_size,
+	.get_valid_dram_page_orders = goya_get_valid_dram_page_orders,
 };
 
 /*
diff --git a/include/uapi/misc/habanalabs.h b/include/uapi/misc/habanalabs.h
index ae2441521467..f474e7fb018d 100644
--- a/include/uapi/misc/habanalabs.h
+++ b/include/uapi/misc/habanalabs.h
@@ -348,33 +348,35 @@ enum hl_server_type {
  *                            The address which accessing it caused the razwi.
  *                            Razwi initiator.
  *                            Razwi cause, was it a page fault or MMU access error.
+ * HL_INFO_DEV_MEM_ALLOC_PAGE_SIZES - Retrieve valid page sizes for device memory allocation
  */
-#define HL_INFO_HW_IP_INFO		0
-#define HL_INFO_HW_EVENTS		1
-#define HL_INFO_DRAM_USAGE		2
-#define HL_INFO_HW_IDLE			3
-#define HL_INFO_DEVICE_STATUS		4
-#define HL_INFO_DEVICE_UTILIZATION	6
-#define HL_INFO_HW_EVENTS_AGGREGATE	7
-#define HL_INFO_CLK_RATE		8
-#define HL_INFO_RESET_COUNT		9
-#define HL_INFO_TIME_SYNC		10
-#define HL_INFO_CS_COUNTERS		11
-#define HL_INFO_PCI_COUNTERS		12
-#define HL_INFO_CLK_THROTTLE_REASON	13
-#define HL_INFO_SYNC_MANAGER		14
-#define HL_INFO_TOTAL_ENERGY		15
-#define HL_INFO_PLL_FREQUENCY		16
-#define HL_INFO_POWER			17
-#define HL_INFO_OPEN_STATS		18
-#define HL_INFO_DRAM_REPLACED_ROWS	21
-#define HL_INFO_DRAM_PENDING_ROWS	22
-#define HL_INFO_LAST_ERR_OPEN_DEV_TIME	23
-#define HL_INFO_CS_TIMEOUT_EVENT	24
-#define HL_INFO_RAZWI_EVENT		25
-
-#define HL_INFO_VERSION_MAX_LEN		128
-#define HL_INFO_CARD_NAME_MAX_LEN	16
+#define HL_INFO_HW_IP_INFO			0
+#define HL_INFO_HW_EVENTS			1
+#define HL_INFO_DRAM_USAGE			2
+#define HL_INFO_HW_IDLE				3
+#define HL_INFO_DEVICE_STATUS			4
+#define HL_INFO_DEVICE_UTILIZATION		6
+#define HL_INFO_HW_EVENTS_AGGREGATE		7
+#define HL_INFO_CLK_RATE			8
+#define HL_INFO_RESET_COUNT			9
+#define HL_INFO_TIME_SYNC			10
+#define HL_INFO_CS_COUNTERS			11
+#define HL_INFO_PCI_COUNTERS			12
+#define HL_INFO_CLK_THROTTLE_REASON		13
+#define HL_INFO_SYNC_MANAGER			14
+#define HL_INFO_TOTAL_ENERGY			15
+#define HL_INFO_PLL_FREQUENCY			16
+#define HL_INFO_POWER				17
+#define HL_INFO_OPEN_STATS			18
+#define HL_INFO_DRAM_REPLACED_ROWS		21
+#define HL_INFO_DRAM_PENDING_ROWS		22
+#define HL_INFO_LAST_ERR_OPEN_DEV_TIME		23
+#define HL_INFO_CS_TIMEOUT_EVENT		24
+#define HL_INFO_RAZWI_EVENT			25
+#define HL_INFO_DEV_MEM_ALLOC_PAGE_SIZES	26
+
+#define HL_INFO_VERSION_MAX_LEN			128
+#define HL_INFO_CARD_NAME_MAX_LEN		16
 
 /**
  * struct hl_info_hw_ip_info - hardware information on various IPs in the ASIC
@@ -643,6 +645,15 @@ struct hl_info_razwi_event {
 	__u8 pad[2];
 };
 
+/**
+ * struct hl_info_dev_memalloc_page_sizes - valid page sizes in device mem alloc information.
+ * @page_order_bitmask: bitmap in which a set bit represents the order of the supported page size
+ *                      (e.g. 0x2100000 means that 1MB and 32MB pages are supported).
+ */
+struct hl_info_dev_memalloc_page_sizes {
+	__u64 page_order_bitmask;
+};
+
 enum gaudi_dcores {
 	HL_GAUDI_WS_DCORE,
 	HL_GAUDI_WN_DCORE,
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCH 06/11] habanalabs: add new return code to device fd open
  2022-03-16 11:41 [PATCH 01/11] habanalabs: set non-0 value in dram default page size Oded Gabbay
                   ` (3 preceding siblings ...)
  2022-03-16 11:41 ` [PATCH 05/11] habanalabs: add user API to get valid DRAM page sizes Oded Gabbay
@ 2022-03-16 11:41 ` Oded Gabbay
  2022-03-16 11:41 ` [PATCH 07/11] habanalabs: expose compute ctx status through info ioctl Oded Gabbay
                   ` (4 subsequent siblings)
  9 siblings, 0 replies; 13+ messages in thread
From: Oded Gabbay @ 2022-03-16 11:41 UTC (permalink / raw)
  To: linux-kernel; +Cc: Ofir Bitton

From: Ofir Bitton <obitton@habana.ai>

In order to be more informative during device open, we are adding a
new return code -EAGAIN that indicates device is still going through
resource reclaiming and hence it cannot be used yet.

Signed-off-by: Ofir Bitton <obitton@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
---
 drivers/misc/habanalabs/common/device.c         |  4 ++++
 drivers/misc/habanalabs/common/habanalabs.h     |  2 ++
 drivers/misc/habanalabs/common/habanalabs_drv.c | 15 ++++++++++++++-
 3 files changed, 20 insertions(+), 1 deletion(-)

diff --git a/drivers/misc/habanalabs/common/device.c b/drivers/misc/habanalabs/common/device.c
index dc9341a64541..3eb392b4308a 100644
--- a/drivers/misc/habanalabs/common/device.c
+++ b/drivers/misc/habanalabs/common/device.c
@@ -107,6 +107,8 @@ static void hpriv_release(struct kref *ref)
 	hdev->is_compute_ctx_active = false;
 	mutex_unlock(&hdev->fpriv_list_lock);
 
+	hdev->compute_ctx_in_release = 0;
+
 	kfree(hpriv);
 }
 
@@ -150,6 +152,8 @@ static int hl_device_release(struct inode *inode, struct file *filp)
 	hl_ts_mgr_fini(hpriv->hdev, &hpriv->ts_mem_mgr);
 	hl_ctx_mgr_fini(hdev, &hpriv->ctx_mgr);
 
+	hdev->compute_ctx_in_release = 1;
+
 	if (!hl_hpriv_put(hpriv))
 		dev_notice(hdev->dev,
 			"User process closed FD but device still in use\n");
diff --git a/drivers/misc/habanalabs/common/habanalabs.h b/drivers/misc/habanalabs/common/habanalabs.h
index 564797766f42..0079f43bc596 100644
--- a/drivers/misc/habanalabs/common/habanalabs.h
+++ b/drivers/misc/habanalabs/common/habanalabs.h
@@ -2710,6 +2710,7 @@ struct hl_reset_info {
  *                        cases where Linux was not loaded to device CPU
  * @supports_wait_for_multi_cs: true if wait for multi CS is supported
  * @is_compute_ctx_active: Whether there is an active compute context executing.
+ * @compute_ctx_in_release: true if the current compute context is being released.
  */
 struct hl_device {
 	struct pci_dev			*pdev;
@@ -2828,6 +2829,7 @@ struct hl_device {
 	u8				supports_wait_for_multi_cs;
 	u8				stream_master_qid_arr_size;
 	u8				is_compute_ctx_active;
+	u8				compute_ctx_in_release;
 
 	/* Parameters for bring-up */
 	u64				nic_ports_mask;
diff --git a/drivers/misc/habanalabs/common/habanalabs_drv.c b/drivers/misc/habanalabs/common/habanalabs_drv.c
index ca404ed9d9a7..e870c32a0733 100644
--- a/drivers/misc/habanalabs/common/habanalabs_drv.c
+++ b/drivers/misc/habanalabs/common/habanalabs_drv.c
@@ -150,7 +150,20 @@ int hl_device_open(struct inode *inode, struct file *filp)
 		dev_err_ratelimited(hdev->dev,
 			"Can't open %s because it is %s\n",
 			dev_name(hdev->dev), hdev->status[status]);
-		rc = -EPERM;
+
+		if (status == HL_DEVICE_STATUS_IN_RESET)
+			rc = -EAGAIN;
+		else
+			rc = -EPERM;
+
+		goto out_err;
+	}
+
+	if (hdev->compute_ctx_in_release) {
+		dev_dbg_ratelimited(hdev->dev,
+			"Can't open %s because another user is still releasing it\n",
+			dev_name(hdev->dev));
+		rc = -EAGAIN;
 		goto out_err;
 	}
 
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCH 07/11] habanalabs: expose compute ctx status through info ioctl
  2022-03-16 11:41 [PATCH 01/11] habanalabs: set non-0 value in dram default page size Oded Gabbay
                   ` (4 preceding siblings ...)
  2022-03-16 11:41 ` [PATCH 06/11] habanalabs: add new return code to device fd open Oded Gabbay
@ 2022-03-16 11:41 ` Oded Gabbay
  2022-03-16 11:41 ` [PATCH 08/11] habanalabs/gaudi: increase submission resources Oded Gabbay
                   ` (3 subsequent siblings)
  9 siblings, 0 replies; 13+ messages in thread
From: Oded Gabbay @ 2022-03-16 11:41 UTC (permalink / raw)
  To: linux-kernel; +Cc: Ofir Bitton

From: Ofir Bitton <obitton@habana.ai>

In order for the user to know if he can try and open device, we
expose the compute ctx state. The user can now know if the context
is used by another process or whether the device is still ongoing
through cleanup or reset and will be available soon.

Signed-off-by: Ofir Bitton <obitton@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
---
 drivers/misc/habanalabs/common/habanalabs_ioctl.c | 2 ++
 include/uapi/misc/habanalabs.h                    | 5 +++++
 2 files changed, 7 insertions(+)

diff --git a/drivers/misc/habanalabs/common/habanalabs_ioctl.c b/drivers/misc/habanalabs/common/habanalabs_ioctl.c
index c6fe35ae1238..bfb5cfe68110 100644
--- a/drivers/misc/habanalabs/common/habanalabs_ioctl.c
+++ b/drivers/misc/habanalabs/common/habanalabs_ioctl.c
@@ -498,6 +498,8 @@ static int open_stats_info(struct hl_fpriv *hpriv, struct hl_info_args *args)
 	open_stats_info.last_open_period_ms = jiffies64_to_msecs(
 		hdev->last_open_session_duration_jif);
 	open_stats_info.open_counter = hdev->open_counter;
+	open_stats_info.is_compute_ctx_active = hdev->is_compute_ctx_active;
+	open_stats_info.compute_ctx_in_release = hdev->compute_ctx_in_release;
 
 	return copy_to_user(out, &open_stats_info,
 		min((size_t) max_size, sizeof(open_stats_info))) ? -EFAULT : 0;
diff --git a/include/uapi/misc/habanalabs.h b/include/uapi/misc/habanalabs.h
index f474e7fb018d..ca2af5f98056 100644
--- a/include/uapi/misc/habanalabs.h
+++ b/include/uapi/misc/habanalabs.h
@@ -543,10 +543,15 @@ struct hl_pll_frequency_info {
  * struct hl_open_stats_info - device open statistics information
  * @open_counter: ever growing counter, increased on each successful dev open
  * @last_open_period_ms: duration (ms) device was open last time
+ * @is_compute_ctx_active: Whether there is an active compute context executing
+ * @compute_ctx_in_release: true if the current compute context is being released
  */
 struct hl_open_stats_info {
 	__u64 open_counter;
 	__u64 last_open_period_ms;
+	__u8 is_compute_ctx_active;
+	__u8 compute_ctx_in_release;
+	__u8 pad[6];
 };
 
 /**
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCH 08/11] habanalabs/gaudi: increase submission resources
  2022-03-16 11:41 [PATCH 01/11] habanalabs: set non-0 value in dram default page size Oded Gabbay
                   ` (5 preceding siblings ...)
  2022-03-16 11:41 ` [PATCH 07/11] habanalabs: expose compute ctx status through info ioctl Oded Gabbay
@ 2022-03-16 11:41 ` Oded Gabbay
  2022-03-16 11:41 ` [PATCH 09/11] habanalabs/gaudi: avoid resetting max power in hard reset Oded Gabbay
                   ` (2 subsequent siblings)
  9 siblings, 0 replies; 13+ messages in thread
From: Oded Gabbay @ 2022-03-16 11:41 UTC (permalink / raw)
  To: linux-kernel; +Cc: Ofir Bitton

From: Ofir Bitton <obitton@habana.ai>

In order to allow user to have larger amount of submissions, we
increase the DMA and NIC queue depth to 4K.

Signed-off-by: Ofir Bitton <obitton@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
---
 drivers/misc/habanalabs/gaudi/gaudiP.h | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/misc/habanalabs/gaudi/gaudiP.h b/drivers/misc/habanalabs/gaudi/gaudiP.h
index 54de7c599072..4fbcf3f0afe5 100644
--- a/drivers/misc/habanalabs/gaudi/gaudiP.h
+++ b/drivers/misc/habanalabs/gaudi/gaudiP.h
@@ -148,14 +148,14 @@
 #define MME_QMAN_LENGTH			1024
 #define MME_QMAN_SIZE_IN_BYTES		(MME_QMAN_LENGTH * QMAN_PQ_ENTRY_SIZE)
 
-#define HBM_DMA_QMAN_LENGTH		1024
+#define HBM_DMA_QMAN_LENGTH		4096
 #define HBM_DMA_QMAN_SIZE_IN_BYTES	\
 				(HBM_DMA_QMAN_LENGTH * QMAN_PQ_ENTRY_SIZE)
 
 #define TPC_QMAN_LENGTH			1024
 #define TPC_QMAN_SIZE_IN_BYTES		(TPC_QMAN_LENGTH * QMAN_PQ_ENTRY_SIZE)
 
-#define NIC_QMAN_LENGTH			1024
+#define NIC_QMAN_LENGTH			4096
 #define NIC_QMAN_SIZE_IN_BYTES		(NIC_QMAN_LENGTH * QMAN_PQ_ENTRY_SIZE)
 
 
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCH 09/11] habanalabs/gaudi: avoid resetting max power in hard reset
  2022-03-16 11:41 [PATCH 01/11] habanalabs: set non-0 value in dram default page size Oded Gabbay
                   ` (6 preceding siblings ...)
  2022-03-16 11:41 ` [PATCH 08/11] habanalabs/gaudi: increase submission resources Oded Gabbay
@ 2022-03-16 11:41 ` Oded Gabbay
  2022-03-16 11:41 ` [PATCH 10/11] habanalabs: parse full firmware versions Oded Gabbay
  2022-03-16 11:41 ` [PATCH 11/11] habanalabs: modify dma_mask to be ASIC specific property Oded Gabbay
  9 siblings, 0 replies; 13+ messages in thread
From: Oded Gabbay @ 2022-03-16 11:41 UTC (permalink / raw)
  To: linux-kernel; +Cc: Tomer Tayar

From: Tomer Tayar <ttayar@habana.ai>

The default max power is deduced from the card type value in the CPU-CP
info. This value is then set in the max power variable of the device
structure.
Getting the CPU-CP info is done as part of the late init phase
which is called also during reset. This means that a max power value
which is modified via sysfs will be reset during hard reset back to the
default value.
As the max power is updated in any case during device init in
hl_sysfs_init(), this setting in late init can be removed, and the
overriding during reset is thus avoided.

Signed-off-by: Tomer Tayar <ttayar@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
---
 drivers/misc/habanalabs/gaudi/gaudi.c | 2 --
 1 file changed, 2 deletions(-)

diff --git a/drivers/misc/habanalabs/gaudi/gaudi.c b/drivers/misc/habanalabs/gaudi/gaudi.c
index 950a7d6a4a35..3cb461288533 100644
--- a/drivers/misc/habanalabs/gaudi/gaudi.c
+++ b/drivers/misc/habanalabs/gaudi/gaudi.c
@@ -8329,8 +8329,6 @@ static int gaudi_cpucp_info_get(struct hl_device *hdev)
 
 	set_default_power_values(hdev);
 
-	hdev->max_power = prop->max_power_default;
-
 	return 0;
 }
 
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCH 10/11] habanalabs: parse full firmware versions
  2022-03-16 11:41 [PATCH 01/11] habanalabs: set non-0 value in dram default page size Oded Gabbay
                   ` (7 preceding siblings ...)
  2022-03-16 11:41 ` [PATCH 09/11] habanalabs/gaudi: avoid resetting max power in hard reset Oded Gabbay
@ 2022-03-16 11:41 ` Oded Gabbay
  2022-03-16 11:41 ` [PATCH 11/11] habanalabs: modify dma_mask to be ASIC specific property Oded Gabbay
  9 siblings, 0 replies; 13+ messages in thread
From: Oded Gabbay @ 2022-03-16 11:41 UTC (permalink / raw)
  To: linux-kernel; +Cc: Ofir Bitton

From: Ofir Bitton <obitton@habana.ai>

When parsing firmware versions strings, driver should not
assume a specific length and parse up to the maximum supported
version length.

Signed-off-by: Ofir Bitton <obitton@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
---
 drivers/misc/habanalabs/common/firmware_if.c | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/drivers/misc/habanalabs/common/firmware_if.c b/drivers/misc/habanalabs/common/firmware_if.c
index 3262126cc7ca..2665919dbbdd 100644
--- a/drivers/misc/habanalabs/common/firmware_if.c
+++ b/drivers/misc/habanalabs/common/firmware_if.c
@@ -18,8 +18,9 @@
 static char *extract_fw_ver_from_str(const char *fw_str)
 {
 	char *str, *fw_ver, *whitespace;
+	u32 ver_offset;
 
-	fw_ver = kmalloc(16, GFP_KERNEL);
+	fw_ver = kmalloc(VERSION_MAX_LEN, GFP_KERNEL);
 	if (!fw_ver)
 		return NULL;
 
@@ -29,9 +30,10 @@ static char *extract_fw_ver_from_str(const char *fw_str)
 
 	/* Skip the fw- part */
 	str += 3;
+	ver_offset = str - fw_str;
 
 	/* Copy until the next whitespace */
-	whitespace =  strnstr(str, " ", 15);
+	whitespace =  strnstr(str, " ", VERSION_MAX_LEN - ver_offset);
 	if (!whitespace)
 		goto free_fw_ver;
 
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCH 11/11] habanalabs: modify dma_mask to be ASIC specific property
  2022-03-16 11:41 [PATCH 01/11] habanalabs: set non-0 value in dram default page size Oded Gabbay
                   ` (8 preceding siblings ...)
  2022-03-16 11:41 ` [PATCH 10/11] habanalabs: parse full firmware versions Oded Gabbay
@ 2022-03-16 11:41 ` Oded Gabbay
  9 siblings, 0 replies; 13+ messages in thread
From: Oded Gabbay @ 2022-03-16 11:41 UTC (permalink / raw)
  To: linux-kernel; +Cc: Tomer Tayar

From: Tomer Tayar <ttayar@habana.ai>

The required DMA mask is no longer based on input from the F/W, but it
is fixed per ASIC according to its address space.
As such, the per-ASIC function to get this value can be replaced with a
property variable.

Signed-off-by: Tomer Tayar <ttayar@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
---
 drivers/misc/habanalabs/common/habanalabs.h     |  7 ++-----
 drivers/misc/habanalabs/common/habanalabs_drv.c |  3 ---
 drivers/misc/habanalabs/common/pci/pci.c        | 10 ++++------
 drivers/misc/habanalabs/gaudi/gaudi.c           | 10 ++--------
 drivers/misc/habanalabs/goya/goya.c             | 10 ++--------
 5 files changed, 10 insertions(+), 30 deletions(-)

diff --git a/drivers/misc/habanalabs/common/habanalabs.h b/drivers/misc/habanalabs/common/habanalabs.h
index 0079f43bc596..3e7012f7b1a3 100644
--- a/drivers/misc/habanalabs/common/habanalabs.h
+++ b/drivers/misc/habanalabs/common/habanalabs.h
@@ -552,6 +552,7 @@ struct hl_hints_range {
  * @configurable_stop_on_err: is stop-on-error option configurable via debugfs.
  * @set_max_power_on_device_init: true if need to set max power in F/W on device init.
  * @supports_user_set_page_size: true if user can set the allocation page size.
+ * @dma_mask: the dma mask to be set for this device
  */
 struct asic_fixed_properties {
 	struct hw_queue_properties	*hw_queues_props;
@@ -639,6 +640,7 @@ struct asic_fixed_properties {
 	u8				configurable_stop_on_err;
 	u8				set_max_power_on_device_init;
 	u8				supports_user_set_page_size;
+	u8				dma_mask;
 };
 
 /**
@@ -1274,8 +1276,6 @@ struct fw_load_mgr {
  * @gen_wait_cb: Generate a wait CB.
  * @reset_sob: Reset a SOB.
  * @reset_sob_group: Reset SOB group
- * @set_dma_mask_from_fw: set the DMA mask in the driver according to the
- *                        firmware configuration
  * @get_device_time: Get the device time.
  * @collective_wait_init_cs: Generate collective master/slave packets
  *                           and place them in the relevant cs jobs
@@ -1407,7 +1407,6 @@ struct hl_asic_funcs {
 			struct hl_gen_wait_properties *prop);
 	void (*reset_sob)(struct hl_device *hdev, void *data);
 	void (*reset_sob_group)(struct hl_device *hdev, u16 sob_group);
-	void (*set_dma_mask_from_fw)(struct hl_device *hdev);
 	u64 (*get_device_time)(struct hl_device *hdev);
 	int (*collective_wait_init_cs)(struct hl_cs *cs);
 	int (*collective_wait_create_jobs)(struct hl_device *hdev,
@@ -2688,7 +2687,6 @@ struct hl_reset_info {
  *                   huge pages.
  * @init_done: is the initialization of the device done.
  * @device_cpu_disabled: is the device CPU disabled (due to timeouts)
- * @dma_mask: the dma mask that was set for this device
  * @in_debug: whether the device is in a state where the profiling/tracing infrastructure
  *            can be used. This indication is needed because in some ASICs we need to do
  *            specific operations to enable that infrastructure.
@@ -2813,7 +2811,6 @@ struct hl_device {
 	u8				pmmu_huge_range;
 	u8				init_done;
 	u8				device_cpu_disabled;
-	u8				dma_mask;
 	u8				in_debug;
 	u8				cdev_sysfs_created;
 	u8				stop_on_err;
diff --git a/drivers/misc/habanalabs/common/habanalabs_drv.c b/drivers/misc/habanalabs/common/habanalabs_drv.c
index e870c32a0733..95c000d7dc26 100644
--- a/drivers/misc/habanalabs/common/habanalabs_drv.c
+++ b/drivers/misc/habanalabs/common/habanalabs_drv.c
@@ -309,9 +309,6 @@ static int fixup_device_params(struct hl_device *hdev)
 	/* Enable only after the initialization of the device */
 	hdev->disabled = true;
 
-	/* Set default DMA mask to 32 bits */
-	hdev->dma_mask = 32;
-
 	return 0;
 }
 
diff --git a/drivers/misc/habanalabs/common/pci/pci.c b/drivers/misc/habanalabs/common/pci/pci.c
index bb9ce22bafc4..610acd4a8057 100644
--- a/drivers/misc/habanalabs/common/pci/pci.c
+++ b/drivers/misc/habanalabs/common/pci/pci.c
@@ -392,6 +392,7 @@ enum pci_region hl_get_pci_memory_region(struct hl_device *hdev, u64 addr)
  */
 int hl_pci_init(struct hl_device *hdev)
 {
+	struct asic_fixed_properties *prop = &hdev->asic_prop;
 	struct pci_dev *pdev = hdev->pdev;
 	int rc;
 
@@ -419,17 +420,14 @@ int hl_pci_init(struct hl_device *hdev)
 	}
 
 	/* Driver must sleep in order for FW to finish the iATU configuration */
-	if (hdev->asic_prop.iatu_done_by_fw) {
+	if (hdev->asic_prop.iatu_done_by_fw)
 		usleep_range(2000, 3000);
-		hdev->asic_funcs->set_dma_mask_from_fw(hdev);
-	}
 
-	rc = dma_set_mask_and_coherent(&pdev->dev,
-					DMA_BIT_MASK(hdev->dma_mask));
+	rc = dma_set_mask_and_coherent(&pdev->dev, DMA_BIT_MASK(prop->dma_mask));
 	if (rc) {
 		dev_err(hdev->dev,
 			"Failed to set dma mask to %d bits, error %d\n",
-			hdev->dma_mask, rc);
+			prop->dma_mask, rc);
 		goto unmap_pci_bars;
 	}
 
diff --git a/drivers/misc/habanalabs/gaudi/gaudi.c b/drivers/misc/habanalabs/gaudi/gaudi.c
index 3cb461288533..f35fa7d7f918 100644
--- a/drivers/misc/habanalabs/gaudi/gaudi.c
+++ b/drivers/misc/habanalabs/gaudi/gaudi.c
@@ -674,6 +674,8 @@ static int gaudi_set_fixed_properties(struct hl_device *hdev)
 
 	prop->set_max_power_on_device_init = true;
 
+	prop->dma_mask = 48;
+
 	return 0;
 }
 
@@ -755,8 +757,6 @@ static int gaudi_init_iatu(struct hl_device *hdev)
 	if (rc)
 		goto done;
 
-	hdev->asic_funcs->set_dma_mask_from_fw(hdev);
-
 	/* Outbound Region 0 - Point to Host */
 	outbound_region.addr = HOST_PHYS_BASE;
 	outbound_region.size = HOST_PHYS_SIZE;
@@ -9065,11 +9065,6 @@ static void gaudi_reset_sob(struct hl_device *hdev, void *data)
 	kref_init(&hw_sob->kref);
 }
 
-static void gaudi_set_dma_mask_from_fw(struct hl_device *hdev)
-{
-	hdev->dma_mask = 48;
-}
-
 static u64 gaudi_get_device_time(struct hl_device *hdev)
 {
 	u64 device_time = ((u64) RREG32(mmPSOC_TIMESTAMP_CNTCVU)) << 32;
@@ -9474,7 +9469,6 @@ static const struct hl_asic_funcs gaudi_funcs = {
 	.gen_wait_cb = gaudi_gen_wait_cb,
 	.reset_sob = gaudi_reset_sob,
 	.reset_sob_group = gaudi_reset_sob_group,
-	.set_dma_mask_from_fw = gaudi_set_dma_mask_from_fw,
 	.get_device_time = gaudi_get_device_time,
 	.collective_wait_init_cs = gaudi_collective_wait_init_cs,
 	.collective_wait_create_jobs = gaudi_collective_wait_create_jobs,
diff --git a/drivers/misc/habanalabs/goya/goya.c b/drivers/misc/habanalabs/goya/goya.c
index 38af7014b845..8c165ba3a6cc 100644
--- a/drivers/misc/habanalabs/goya/goya.c
+++ b/drivers/misc/habanalabs/goya/goya.c
@@ -488,6 +488,8 @@ int goya_set_fixed_properties(struct hl_device *hdev)
 
 	prop->set_max_power_on_device_init = true;
 
+	prop->dma_mask = 48;
+
 	return 0;
 }
 
@@ -575,8 +577,6 @@ static int goya_init_iatu(struct hl_device *hdev)
 	if (rc)
 		goto done;
 
-	hdev->asic_funcs->set_dma_mask_from_fw(hdev);
-
 	/* Outbound Region 0 - Point to Host  */
 	outbound_region.addr = HOST_PHYS_BASE;
 	outbound_region.size = HOST_PHYS_SIZE;
@@ -5562,11 +5562,6 @@ static void goya_reset_sob_group(struct hl_device *hdev, u16 sob_group)
 
 }
 
-static void goya_set_dma_mask_from_fw(struct hl_device *hdev)
-{
-	hdev->dma_mask = 48;
-}
-
 u64 goya_get_device_time(struct hl_device *hdev)
 {
 	u64 device_time = ((u64) RREG32(mmPSOC_TIMESTAMP_CNTCVU)) << 32;
@@ -5754,7 +5749,6 @@ static const struct hl_asic_funcs goya_funcs = {
 	.gen_wait_cb = goya_gen_wait_cb,
 	.reset_sob = goya_reset_sob,
 	.reset_sob_group = goya_reset_sob_group,
-	.set_dma_mask_from_fw = goya_set_dma_mask_from_fw,
 	.get_device_time = goya_get_device_time,
 	.collective_wait_init_cs = goya_collective_wait_init_cs,
 	.collective_wait_create_jobs = goya_collective_wait_create_jobs,
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* Re: [PATCH 05/11] habanalabs: add user API to get valid DRAM page sizes
  2022-03-16 11:41 ` [PATCH 05/11] habanalabs: add user API to get valid DRAM page sizes Oded Gabbay
@ 2022-03-16 19:37   ` kernel test robot
  0 siblings, 0 replies; 13+ messages in thread
From: kernel test robot @ 2022-03-16 19:37 UTC (permalink / raw)
  To: Oded Gabbay; +Cc: llvm, kbuild-all

Hi Oded,

Thank you for the patch! Perhaps something to improve:

[auto build test WARNING on char-misc/char-misc-testing]
[also build test WARNING on next-20220316]
[cannot apply to linux/master linus/master v5.17-rc8]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch]

url:    https://github.com/0day-ci/linux/commits/Oded-Gabbay/habanalabs-set-non-0-value-in-dram-default-page-size/20220316-194323
base:   https://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc.git d6cd2f85931f87dbd07c664c9c6e806db1dd7c75
config: s390-randconfig-r031-20220317 (https://download.01.org/0day-ci/archive/20220317/202203170327.EdLrr0lQ-lkp@intel.com/config)
compiler: clang version 15.0.0 (https://github.com/llvm/llvm-project a6ec1e3d798f8eab43fb3a91028c6ab04e115fcb)
reproduce (this is a W=1 build):
        wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
        chmod +x ~/bin/make.cross
        # install s390 cross compiling tool for clang build
        # apt-get install binutils-s390x-linux-gnu
        # https://github.com/0day-ci/linux/commit/1a30c683a2ac64915d5cf4391a973666ac172396
        git remote add linux-review https://github.com/0day-ci/linux
        git fetch --no-tags linux-review Oded-Gabbay/habanalabs-set-non-0-value-in-dram-default-page-size/20220316-194323
        git checkout 1a30c683a2ac64915d5cf4391a973666ac172396
        # save the config file to linux build tree
        mkdir build_dir
        COMPILER_INSTALL_PATH=$HOME/0day COMPILER=clang make.cross W=1 O=build_dir ARCH=s390 SHELL=/bin/bash drivers/misc/

If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot <lkp@intel.com>

All warnings (new ones prefixed by >>):

   In file included from drivers/misc/habanalabs/goya/goya.c:8:
   In file included from drivers/misc/habanalabs/goya/goyaP.h:12:
   In file included from drivers/misc/habanalabs/goya/../common/habanalabs.h:11:
   In file included from drivers/misc/habanalabs/goya/../common/../include/common/cpucp_if.h:12:
   In file included from include/linux/if_ether.h:19:
   In file included from include/linux/skbuff.h:31:
   In file included from include/linux/dma-mapping.h:10:
   In file included from include/linux/scatterlist.h:9:
   In file included from arch/s390/include/asm/io.h:75:
   include/asm-generic/io.h:464:31: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
           val = __raw_readb(PCI_IOBASE + addr);
                             ~~~~~~~~~~ ^
   include/asm-generic/io.h:477:61: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
           val = __le16_to_cpu((__le16 __force)__raw_readw(PCI_IOBASE + addr));
                                                           ~~~~~~~~~~ ^
   include/uapi/linux/byteorder/big_endian.h:37:59: note: expanded from macro '__le16_to_cpu'
   #define __le16_to_cpu(x) __swab16((__force __u16)(__le16)(x))
                                                             ^
   include/uapi/linux/swab.h:102:54: note: expanded from macro '__swab16'
   #define __swab16(x) (__u16)__builtin_bswap16((__u16)(x))
                                                        ^
   In file included from drivers/misc/habanalabs/goya/goya.c:8:
   In file included from drivers/misc/habanalabs/goya/goyaP.h:12:
   In file included from drivers/misc/habanalabs/goya/../common/habanalabs.h:11:
   In file included from drivers/misc/habanalabs/goya/../common/../include/common/cpucp_if.h:12:
   In file included from include/linux/if_ether.h:19:
   In file included from include/linux/skbuff.h:31:
   In file included from include/linux/dma-mapping.h:10:
   In file included from include/linux/scatterlist.h:9:
   In file included from arch/s390/include/asm/io.h:75:
   include/asm-generic/io.h:490:61: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
           val = __le32_to_cpu((__le32 __force)__raw_readl(PCI_IOBASE + addr));
                                                           ~~~~~~~~~~ ^
   include/uapi/linux/byteorder/big_endian.h:35:59: note: expanded from macro '__le32_to_cpu'
   #define __le32_to_cpu(x) __swab32((__force __u32)(__le32)(x))
                                                             ^
   include/uapi/linux/swab.h:115:54: note: expanded from macro '__swab32'
   #define __swab32(x) (__u32)__builtin_bswap32((__u32)(x))
                                                        ^
   In file included from drivers/misc/habanalabs/goya/goya.c:8:
   In file included from drivers/misc/habanalabs/goya/goyaP.h:12:
   In file included from drivers/misc/habanalabs/goya/../common/habanalabs.h:11:
   In file included from drivers/misc/habanalabs/goya/../common/../include/common/cpucp_if.h:12:
   In file included from include/linux/if_ether.h:19:
   In file included from include/linux/skbuff.h:31:
   In file included from include/linux/dma-mapping.h:10:
   In file included from include/linux/scatterlist.h:9:
   In file included from arch/s390/include/asm/io.h:75:
   include/asm-generic/io.h:501:33: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
           __raw_writeb(value, PCI_IOBASE + addr);
                               ~~~~~~~~~~ ^
   include/asm-generic/io.h:511:59: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
           __raw_writew((u16 __force)cpu_to_le16(value), PCI_IOBASE + addr);
                                                         ~~~~~~~~~~ ^
   include/asm-generic/io.h:521:59: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
           __raw_writel((u32 __force)cpu_to_le32(value), PCI_IOBASE + addr);
                                                         ~~~~~~~~~~ ^
   include/asm-generic/io.h:609:20: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
           readsb(PCI_IOBASE + addr, buffer, count);
                  ~~~~~~~~~~ ^
   include/asm-generic/io.h:617:20: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
           readsw(PCI_IOBASE + addr, buffer, count);
                  ~~~~~~~~~~ ^
   include/asm-generic/io.h:625:20: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
           readsl(PCI_IOBASE + addr, buffer, count);
                  ~~~~~~~~~~ ^
   include/asm-generic/io.h:634:21: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
           writesb(PCI_IOBASE + addr, buffer, count);
                   ~~~~~~~~~~ ^
   include/asm-generic/io.h:643:21: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
           writesw(PCI_IOBASE + addr, buffer, count);
                   ~~~~~~~~~~ ^
   include/asm-generic/io.h:652:21: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
           writesl(PCI_IOBASE + addr, buffer, count);
                   ~~~~~~~~~~ ^
>> drivers/misc/habanalabs/goya/goya.c:5682:6: warning: no previous prototype for function 'goya_get_valid_dram_page_orders' [-Wmissing-prototypes]
   void goya_get_valid_dram_page_orders(struct hl_info_dev_memalloc_page_sizes *info)
        ^
   drivers/misc/habanalabs/goya/goya.c:5682:1: note: declare 'static' if the function is not intended to be used outside of this translation unit
   void goya_get_valid_dram_page_orders(struct hl_info_dev_memalloc_page_sizes *info)
   ^
   static 
   13 warnings generated.
--
   In file included from drivers/misc/habanalabs/gaudi/gaudi.c:8:
   In file included from drivers/misc/habanalabs/gaudi/gaudiP.h:12:
   In file included from drivers/misc/habanalabs/gaudi/../common/habanalabs.h:11:
   In file included from drivers/misc/habanalabs/gaudi/../common/../include/common/cpucp_if.h:12:
   In file included from include/linux/if_ether.h:19:
   In file included from include/linux/skbuff.h:31:
   In file included from include/linux/dma-mapping.h:10:
   In file included from include/linux/scatterlist.h:9:
   In file included from arch/s390/include/asm/io.h:75:
   include/asm-generic/io.h:464:31: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
           val = __raw_readb(PCI_IOBASE + addr);
                             ~~~~~~~~~~ ^
   include/asm-generic/io.h:477:61: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
           val = __le16_to_cpu((__le16 __force)__raw_readw(PCI_IOBASE + addr));
                                                           ~~~~~~~~~~ ^
   include/uapi/linux/byteorder/big_endian.h:37:59: note: expanded from macro '__le16_to_cpu'
   #define __le16_to_cpu(x) __swab16((__force __u16)(__le16)(x))
                                                             ^
   include/uapi/linux/swab.h:102:54: note: expanded from macro '__swab16'
   #define __swab16(x) (__u16)__builtin_bswap16((__u16)(x))
                                                        ^
   In file included from drivers/misc/habanalabs/gaudi/gaudi.c:8:
   In file included from drivers/misc/habanalabs/gaudi/gaudiP.h:12:
   In file included from drivers/misc/habanalabs/gaudi/../common/habanalabs.h:11:
   In file included from drivers/misc/habanalabs/gaudi/../common/../include/common/cpucp_if.h:12:
   In file included from include/linux/if_ether.h:19:
   In file included from include/linux/skbuff.h:31:
   In file included from include/linux/dma-mapping.h:10:
   In file included from include/linux/scatterlist.h:9:
   In file included from arch/s390/include/asm/io.h:75:
   include/asm-generic/io.h:490:61: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
           val = __le32_to_cpu((__le32 __force)__raw_readl(PCI_IOBASE + addr));
                                                           ~~~~~~~~~~ ^
   include/uapi/linux/byteorder/big_endian.h:35:59: note: expanded from macro '__le32_to_cpu'
   #define __le32_to_cpu(x) __swab32((__force __u32)(__le32)(x))
                                                             ^
   include/uapi/linux/swab.h:115:54: note: expanded from macro '__swab32'
   #define __swab32(x) (__u32)__builtin_bswap32((__u32)(x))
                                                        ^
   In file included from drivers/misc/habanalabs/gaudi/gaudi.c:8:
   In file included from drivers/misc/habanalabs/gaudi/gaudiP.h:12:
   In file included from drivers/misc/habanalabs/gaudi/../common/habanalabs.h:11:
   In file included from drivers/misc/habanalabs/gaudi/../common/../include/common/cpucp_if.h:12:
   In file included from include/linux/if_ether.h:19:
   In file included from include/linux/skbuff.h:31:
   In file included from include/linux/dma-mapping.h:10:
   In file included from include/linux/scatterlist.h:9:
   In file included from arch/s390/include/asm/io.h:75:
   include/asm-generic/io.h:501:33: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
           __raw_writeb(value, PCI_IOBASE + addr);
                               ~~~~~~~~~~ ^
   include/asm-generic/io.h:511:59: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
           __raw_writew((u16 __force)cpu_to_le16(value), PCI_IOBASE + addr);
                                                         ~~~~~~~~~~ ^
   include/asm-generic/io.h:521:59: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
           __raw_writel((u32 __force)cpu_to_le32(value), PCI_IOBASE + addr);
                                                         ~~~~~~~~~~ ^
   include/asm-generic/io.h:609:20: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
           readsb(PCI_IOBASE + addr, buffer, count);
                  ~~~~~~~~~~ ^
   include/asm-generic/io.h:617:20: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
           readsw(PCI_IOBASE + addr, buffer, count);
                  ~~~~~~~~~~ ^
   include/asm-generic/io.h:625:20: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
           readsl(PCI_IOBASE + addr, buffer, count);
                  ~~~~~~~~~~ ^
   include/asm-generic/io.h:634:21: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
           writesb(PCI_IOBASE + addr, buffer, count);
                   ~~~~~~~~~~ ^
   include/asm-generic/io.h:643:21: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
           writesw(PCI_IOBASE + addr, buffer, count);
                   ~~~~~~~~~~ ^
   include/asm-generic/io.h:652:21: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
           writesl(PCI_IOBASE + addr, buffer, count);
                   ~~~~~~~~~~ ^
>> drivers/misc/habanalabs/gaudi/gaudi.c:9381:6: warning: no previous prototype for function 'gaudi_get_valid_dram_page_orders' [-Wmissing-prototypes]
   void gaudi_get_valid_dram_page_orders(struct hl_info_dev_memalloc_page_sizes *info)
        ^
   drivers/misc/habanalabs/gaudi/gaudi.c:9381:1: note: declare 'static' if the function is not intended to be used outside of this translation unit
   void gaudi_get_valid_dram_page_orders(struct hl_info_dev_memalloc_page_sizes *info)
   ^
   static 
   13 warnings generated.


vim +/goya_get_valid_dram_page_orders +5682 drivers/misc/habanalabs/goya/goya.c

  5681	
> 5682	void goya_get_valid_dram_page_orders(struct hl_info_dev_memalloc_page_sizes *info)
  5683	{
  5684		/* set 0 since multiple pages are not supported */
  5685		info->page_order_bitmask = 0;
  5686	}
  5687	

---
0-DAY CI Kernel Test Service
https://lists.01.org/hyperkitty/list/kbuild-all@lists.01.org

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH 03/11] habanalabs: change mmu_get_real_page_size to be ASIC-specific
  2022-03-16 11:41 ` [PATCH 03/11] habanalabs: change mmu_get_real_page_size to be ASIC-specific Oded Gabbay
@ 2022-03-16 22:33   ` kernel test robot
  0 siblings, 0 replies; 13+ messages in thread
From: kernel test robot @ 2022-03-16 22:33 UTC (permalink / raw)
  To: Oded Gabbay; +Cc: llvm, kbuild-all

Hi Oded,

Thank you for the patch! Perhaps something to improve:

[auto build test WARNING on char-misc/char-misc-testing]
[also build test WARNING on next-20220316]
[cannot apply to linux/master linus/master v5.17-rc8]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch]

url:    https://github.com/0day-ci/linux/commits/Oded-Gabbay/habanalabs-set-non-0-value-in-dram-default-page-size/20220316-194323
base:   https://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc.git d6cd2f85931f87dbd07c664c9c6e806db1dd7c75
config: x86_64-randconfig-a013-20220314 (https://download.01.org/0day-ci/archive/20220317/202203170651.S8azQIor-lkp@intel.com/config)
compiler: clang version 15.0.0 (https://github.com/llvm/llvm-project a6ec1e3d798f8eab43fb3a91028c6ab04e115fcb)
reproduce (this is a W=1 build):
        wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
        chmod +x ~/bin/make.cross
        # https://github.com/0day-ci/linux/commit/9535025e314bc12dbdeebee7c71634699759bcfa
        git remote add linux-review https://github.com/0day-ci/linux
        git fetch --no-tags linux-review Oded-Gabbay/habanalabs-set-non-0-value-in-dram-default-page-size/20220316-194323
        git checkout 9535025e314bc12dbdeebee7c71634699759bcfa
        # save the config file to linux build tree
        mkdir build_dir
        COMPILER_INSTALL_PATH=$HOME/0day COMPILER=clang make.cross W=1 O=build_dir ARCH=x86_64 SHELL=/bin/bash drivers/misc/

If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot <lkp@intel.com>

All warnings (new ones prefixed by >>):

>> drivers/misc/habanalabs/common/mmu/mmu.c:209:32: warning: variable 'prop' set but not used [-Wunused-but-set-variable]
           struct asic_fixed_properties *prop;
                                         ^
   drivers/misc/habanalabs/common/mmu/mmu.c:276:32: warning: variable 'prop' set but not used [-Wunused-but-set-variable]
           struct asic_fixed_properties *prop;
                                         ^
   2 warnings generated.


vim +/prop +209 drivers/misc/habanalabs/common/mmu/mmu.c

   184	
   185	/*
   186	 * hl_mmu_unmap_page - unmaps a virtual addr
   187	 *
   188	 * @ctx: pointer to the context structure
   189	 * @virt_addr: virt addr to map from
   190	 * @page_size: size of the page to unmap
   191	 * @flush_pte: whether to do a PCI flush
   192	 *
   193	 * This function does the following:
   194	 * - Check that the virt addr is mapped
   195	 * - Unmap the virt addr and frees pgts if possible
   196	 * - Returns 0 on success, -EINVAL if the given addr is not mapped
   197	 *
   198	 * Because this function changes the page tables in the device and because it
   199	 * changes the MMU hash, it must be protected by a lock.
   200	 * However, because it maps only a single page, the lock should be implemented
   201	 * in a higher level in order to protect the entire mapping of the memory area
   202	 *
   203	 * For optimization reasons PCI flush may be requested once after unmapping of
   204	 * large area.
   205	 */
   206	int hl_mmu_unmap_page(struct hl_ctx *ctx, u64 virt_addr, u32 page_size, bool flush_pte)
   207	{
   208		struct hl_device *hdev = ctx->hdev;
 > 209		struct asic_fixed_properties *prop;
   210		struct hl_mmu_properties *mmu_prop;
   211		struct hl_mmu_funcs *mmu_funcs;
   212		int i, pgt_residency, rc = 0;
   213		u32 real_page_size, npages;
   214		u64 real_virt_addr;
   215		bool is_dram_addr;
   216	
   217		if (!hdev->mmu_enable)
   218			return 0;
   219	
   220		prop = &hdev->asic_prop;
   221		is_dram_addr = hl_is_dram_va(hdev, virt_addr);
   222		mmu_prop = hl_mmu_get_prop(hdev, page_size, is_dram_addr);
   223	
   224		pgt_residency = mmu_prop->host_resident ? MMU_HR_PGT : MMU_DR_PGT;
   225		mmu_funcs = hl_mmu_get_funcs(hdev, pgt_residency, is_dram_addr);
   226	
   227		rc = hdev->asic_funcs->mmu_get_real_page_size(hdev, mmu_prop, page_size, &real_page_size,
   228								is_dram_addr);
   229		if (rc)
   230			return rc;
   231	
   232		npages = page_size / real_page_size;
   233		real_virt_addr = virt_addr;
   234	
   235		for (i = 0 ; i < npages ; i++) {
   236			rc = mmu_funcs->unmap(ctx, real_virt_addr, is_dram_addr);
   237			if (rc)
   238				break;
   239	
   240			real_virt_addr += real_page_size;
   241		}
   242	
   243		if (flush_pte)
   244			mmu_funcs->flush(ctx);
   245	
   246		return rc;
   247	}
   248	

---
0-DAY CI Kernel Test Service
https://lists.01.org/hyperkitty/list/kbuild-all@lists.01.org

^ permalink raw reply	[flat|nested] 13+ messages in thread

end of thread, other threads:[~2022-03-16 22:34 UTC | newest]

Thread overview: 13+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-03-16 11:41 [PATCH 01/11] habanalabs: set non-0 value in dram default page size Oded Gabbay
2022-03-16 11:41 ` [PATCH 02/11] habanalabs: add DRAM default page size to HW info Oded Gabbay
2022-03-16 11:41 ` [PATCH 03/11] habanalabs: change mmu_get_real_page_size to be ASIC-specific Oded Gabbay
2022-03-16 22:33   ` kernel test robot
2022-03-16 11:41 ` [PATCH 04/11] habanalabs: convert all MMU masks/shifts to arrays Oded Gabbay
2022-03-16 11:41 ` [PATCH 05/11] habanalabs: add user API to get valid DRAM page sizes Oded Gabbay
2022-03-16 19:37   ` kernel test robot
2022-03-16 11:41 ` [PATCH 06/11] habanalabs: add new return code to device fd open Oded Gabbay
2022-03-16 11:41 ` [PATCH 07/11] habanalabs: expose compute ctx status through info ioctl Oded Gabbay
2022-03-16 11:41 ` [PATCH 08/11] habanalabs/gaudi: increase submission resources Oded Gabbay
2022-03-16 11:41 ` [PATCH 09/11] habanalabs/gaudi: avoid resetting max power in hard reset Oded Gabbay
2022-03-16 11:41 ` [PATCH 10/11] habanalabs: parse full firmware versions Oded Gabbay
2022-03-16 11:41 ` [PATCH 11/11] habanalabs: modify dma_mask to be ASIC specific property Oded Gabbay

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.