amd-gfx.lists.freedesktop.org archive mirror
* [PATCH v2 0/6] Best effort contiguous VRAM allocation
@ 2024-04-18 13:57 Philip Yang
  2024-04-18 13:57 ` [PATCH v2 1/6] drm/amdgpu: Support " Philip Yang
                   ` (5 more replies)
  0 siblings, 6 replies; 9+ messages in thread
From: Philip Yang @ 2024-04-18 13:57 UTC (permalink / raw)
  To: amd-gfx
  Cc: Felix.Kuehling, christian.koenig, Arunpravin.PaneerSelvam, Philip Yang

This patch series implements a new KFD memory allocation flag for best-effort
contiguous VRAM allocation, to support peer-direct access by RDMA devices with
limited scatter-gather DMA capability.

v2: rebase on patch ("drm/amdgpu: Modify the contiguous flags behaviour")
    to avoid adding the new GEM flag

Philip Yang (6):
  drm/amdgpu: Support contiguous VRAM allocation
  drm/amdgpu: Evict BOs from same process for contiguous allocation
  drm/amdkfd: Evict BO itself for contiguous allocation
  drm/amdkfd: Increase KFD bo restore wait time
  drm/amdgpu: Skip dma map resource for null RDMA device
  drm/amdkfd: Bump kfd version for contiguous VRAM allocation

 .../gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c  | 21 +++++++++-
 drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c       |  3 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c  | 42 ++++++++++++-------
 drivers/gpu/drm/amd/amdkfd/kfd_priv.h         |  2 +-
 include/uapi/linux/kfd_ioctl.h                |  4 +-
 5 files changed, 52 insertions(+), 20 deletions(-)

-- 
2.43.2



* [PATCH v2 1/6] drm/amdgpu: Support contiguous VRAM allocation
  2024-04-18 13:57 [PATCH v2 0/6] Best effort contiguous VRAM allocation Philip Yang
@ 2024-04-18 13:57 ` Philip Yang
  2024-04-18 14:37   ` Christian König
  2024-04-18 13:57 ` [PATCH v2 2/6] drm/amdgpu: Evict BOs from same process for contiguous allocation Philip Yang
                   ` (4 subsequent siblings)
  5 siblings, 1 reply; 9+ messages in thread
From: Philip Yang @ 2024-04-18 13:57 UTC (permalink / raw)
  To: amd-gfx
  Cc: Felix.Kuehling, christian.koenig, Arunpravin.PaneerSelvam, Philip Yang

An RDMA device with limited scatter-gather ability requires contiguous VRAM
buffer allocation for RDMA peer direct support.

Add a new KFD alloc memory flag and store it as the BO alloc flag
AMDGPU_GEM_CREATE_VRAM_CONTIGUOUS. When this BO is pinned for export for
RDMA peer direct access, this sets the TTM_PL_FLAG_CONTIGUOUS flag and asks
the VRAM buddy allocator for contiguous VRAM.

Remove the 2GB max memory block size limit for contiguous allocation.

Signed-off-by: Philip Yang <Philip.Yang@amd.com>
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c | 4 ++++
 drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c     | 9 +++++++--
 include/uapi/linux/kfd_ioctl.h                   | 1 +
 3 files changed, 12 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
index 0ae9fd844623..ef9154043757 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
@@ -1712,6 +1712,10 @@ int amdgpu_amdkfd_gpuvm_alloc_memory_of_gpu(
 			alloc_flags = AMDGPU_GEM_CREATE_VRAM_WIPE_ON_RELEASE;
 			alloc_flags |= (flags & KFD_IOC_ALLOC_MEM_FLAGS_PUBLIC) ?
 			AMDGPU_GEM_CREATE_CPU_ACCESS_REQUIRED : 0;
+
+			/* For contiguous VRAM allocation */
+			if (flags & KFD_IOC_ALLOC_MEM_FLAGS_CONTIGUOUS_BEST_EFFORT)
+				alloc_flags |= AMDGPU_GEM_CREATE_VRAM_CONTIGUOUS;
 		}
 		xcp_id = fpriv->xcp_id == AMDGPU_XCP_NO_PARTITION ?
 					0 : fpriv->xcp_id;
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c
index 4be8b091099a..2f2ae7177771 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c
@@ -532,8 +532,13 @@ static int amdgpu_vram_mgr_new(struct ttm_resource_manager *man,
 
 		BUG_ON(min_block_size < mm->chunk_size);
 
-		/* Limit maximum size to 2GiB due to SG table limitations */
-		size = min(remaining_size, 2ULL << 30);
+		if (place->flags & TTM_PL_FLAG_CONTIGUOUS)
+			size = remaining_size;
+		else
+			/* Limit maximum size to 2GiB due to SG table limitations
+			 * for non-contiguous allocation.
+			 */
+			size = min(remaining_size, 2ULL << 30);
 
 		if ((size >= (u64)pages_per_block << PAGE_SHIFT) &&
 				!(size & (((u64)pages_per_block << PAGE_SHIFT) - 1)))
diff --git a/include/uapi/linux/kfd_ioctl.h b/include/uapi/linux/kfd_ioctl.h
index 2040a470ddb4..c1394c162d4e 100644
--- a/include/uapi/linux/kfd_ioctl.h
+++ b/include/uapi/linux/kfd_ioctl.h
@@ -407,6 +407,7 @@ struct kfd_ioctl_acquire_vm_args {
 #define KFD_IOC_ALLOC_MEM_FLAGS_COHERENT	(1 << 26)
 #define KFD_IOC_ALLOC_MEM_FLAGS_UNCACHED	(1 << 25)
 #define KFD_IOC_ALLOC_MEM_FLAGS_EXT_COHERENT	(1 << 24)
+#define KFD_IOC_ALLOC_MEM_FLAGS_CONTIGUOUS_BEST_EFFORT	(1 << 23)
 
 /* Allocate memory for later SVM (shared virtual memory) mapping.
  *
-- 
2.43.2
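
The flag translation in the first hunk can be read in isolation. The
following is a minimal userspace sketch, not the kernel code: all
SKETCH_* names are hypothetical stand-ins, and only the bit-23 value
of the new KFD flag is taken from the patch itself.

```c
#include <stdint.h>

/* Hypothetical stand-ins for illustration. Only the bit-23 value is
 * taken from the patch; the real definitions live in the uapi headers.
 */
#define SKETCH_KFD_FLAG_PUBLIC		(1u << 29)
#define SKETCH_KFD_FLAG_CONTIGUOUS	(1u << 23)
#define SKETCH_BO_CPU_ACCESS_REQUIRED	(1ull << 0)
#define SKETCH_BO_VRAM_CONTIGUOUS	(1ull << 5)
#define SKETCH_BO_WIPE_ON_RELEASE	(1ull << 9)

/* Translate userspace KFD alloc flags into BO creation flags for VRAM. */
static uint64_t sketch_vram_alloc_flags(uint32_t kfd_flags)
{
	uint64_t alloc_flags = SKETCH_BO_WIPE_ON_RELEASE;

	if (kfd_flags & SKETCH_KFD_FLAG_PUBLIC)
		alloc_flags |= SKETCH_BO_CPU_ACCESS_REQUIRED;

	/* The new best-effort flag is only recorded on the BO here. */
	if (kfd_flags & SKETCH_KFD_FLAG_CONTIGUOUS)
		alloc_flags |= SKETCH_BO_VRAM_CONTIGUOUS;

	return alloc_flags;
}
```

The point of the sketch is that allocation only stores the request on
the BO; per the commit message, the contiguous placement is asked for
later, when the BO is pinned for export.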



* [PATCH v2 2/6] drm/amdgpu: Evict BOs from same process for contiguous allocation
  2024-04-18 13:57 [PATCH v2 0/6] Best effort contiguous VRAM allocation Philip Yang
  2024-04-18 13:57 ` [PATCH v2 1/6] drm/amdgpu: Support " Philip Yang
@ 2024-04-18 13:57 ` Philip Yang
  2024-04-18 13:58 ` [PATCH v2 3/6] drm/amdkfd: Evict BO itself " Philip Yang
                   ` (3 subsequent siblings)
  5 siblings, 0 replies; 9+ messages in thread
From: Philip Yang @ 2024-04-18 13:57 UTC (permalink / raw)
  To: amd-gfx
  Cc: Felix.Kuehling, christian.koenig, Arunpravin.PaneerSelvam, Philip Yang

When TTM fails to allocate VRAM, it tries to evict BOs from VRAM to system
memory and then retries the allocation. This skips KFD BOs from the same
process, because KFD requires all BOs to be resident for user queues.

If TTM is allocating contiguous VRAM with the TTM_PL_FLAG_CONTIGUOUS flag,
allow TTM to evict KFD BOs from the same process. This evicts the user
queues first and restores them after the contiguous VRAM allocation.

Signed-off-by: Philip Yang <Philip.Yang@amd.com>
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
index 851509c6e90e..c907d6005641 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
@@ -1398,7 +1398,8 @@ static bool amdgpu_ttm_bo_eviction_valuable(struct ttm_buffer_object *bo,
 	 */
 	dma_resv_for_each_fence(&resv_cursor, bo->base.resv,
 				DMA_RESV_USAGE_BOOKKEEP, f) {
-		if (amdkfd_fence_check_mm(f, current->mm))
+		if (amdkfd_fence_check_mm(f, current->mm) &&
+		    !(place->flags & TTM_PL_FLAG_CONTIGUOUS))
 			return false;
 	}
 
-- 
2.43.2
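
The changed condition reduces to a small truth table. A minimal sketch
of just that decision, assuming a hypothetical SKETCH_PL_FLAG_CONTIGUOUS
stand-in for the TTM placement flag and a boolean in place of the
amdkfd_fence_check_mm() result (the real function goes on to further
checks that are not modelled here):

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the TTM contiguous placement flag. */
#define SKETCH_PL_FLAG_CONTIGUOUS	(1u << 0)

/*
 * Models only the check touched by this patch: a BO fenced by the
 * calling process is skipped unless contiguous VRAM is requested.
 */
static bool sketch_eviction_valuable(bool fence_from_same_process,
				     uint32_t place_flags)
{
	if (fence_from_same_process &&
	    !(place_flags & SKETCH_PL_FLAG_CONTIGUOUS))
		return false;

	return true;
}
```

Only the combination "same process, no contiguous request" still
protects the BO from eviction; the other three combinations fall
through.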



* [PATCH v2 3/6] drm/amdkfd: Evict BO itself for contiguous allocation
  2024-04-18 13:57 [PATCH v2 0/6] Best effort contiguous VRAM allocation Philip Yang
  2024-04-18 13:57 ` [PATCH v2 1/6] drm/amdgpu: Support " Philip Yang
  2024-04-18 13:57 ` [PATCH v2 2/6] drm/amdgpu: Evict BOs from same process for contiguous allocation Philip Yang
@ 2024-04-18 13:58 ` Philip Yang
  2024-04-18 13:58 ` [PATCH v2 4/6] drm/amdkfd: Increase KFD bo restore wait time Philip Yang
                   ` (2 subsequent siblings)
  5 siblings, 0 replies; 9+ messages in thread
From: Philip Yang @ 2024-04-18 13:58 UTC (permalink / raw)
  To: amd-gfx
  Cc: Felix.Kuehling, christian.koenig, Arunpravin.PaneerSelvam, Philip Yang

If the BO pages pinned for RDMA are not contiguous in VRAM, evict the BO to
system memory first to free the VRAM space, then allocate contiguous
VRAM space, and then move the BO from system memory back to VRAM.

Signed-off-by: Philip Yang <Philip.Yang@amd.com>
---
 .../gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c    | 17 ++++++++++++++++-
 1 file changed, 16 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
index ef9154043757..ff7f54741661 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
@@ -1470,13 +1470,28 @@ static int amdgpu_amdkfd_gpuvm_pin_bo(struct amdgpu_bo *bo, u32 domain)
 	if (unlikely(ret))
 		return ret;
 
+	if (bo->flags & AMDGPU_GEM_CREATE_VRAM_CONTIGUOUS) {
+		/*
+		 * If bo is not contiguous on VRAM, move to system memory first to ensure
+		 * we can get contiguous VRAM space after evicting other BOs.
+		 */
+		if (!(bo->tbo.resource->placement & TTM_PL_FLAG_CONTIGUOUS)) {
+			ret = amdgpu_amdkfd_bo_validate(bo, AMDGPU_GEM_DOMAIN_GTT, false);
+			if (unlikely(ret)) {
+				pr_debug("validate bo 0x%p to GTT failed %d\n", &bo->tbo, ret);
+				goto out;
+			}
+		}
+	}
+
 	ret = amdgpu_bo_pin_restricted(bo, domain, 0, 0);
 	if (ret)
 		pr_err("Error in Pinning BO to domain: %d\n", domain);
 
 	amdgpu_bo_sync_wait(bo, AMDGPU_FENCE_OWNER_KFD, false);
-	amdgpu_bo_unreserve(bo);
 
+out:
+	amdgpu_bo_unreserve(bo);
 	return ret;
 }
 
-- 
2.43.2
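
The ordering described in the commit message (move out to system
memory, then pin back into VRAM) can be sketched with a toy model.
Everything here is hypothetical: struct sketch_bo, sketch_move() and
sketch_pin_vram() are illustration-only names, the move helper always
succeeds, and the model assumes the second VRAM placement comes back
contiguous, which the real best-effort path does not guarantee.

```c
#include <stdbool.h>

enum sketch_domain { SKETCH_VRAM, SKETCH_GTT };

/* Toy buffer object; field names are illustrative only. */
struct sketch_bo {
	bool wants_contiguous;	/* BO was created with the contiguous flag */
	bool is_contiguous;	/* current placement is contiguous */
	enum sketch_domain domain;
	int moves;		/* number of migrations performed */
};

/* Toy migration that always succeeds. */
static int sketch_move(struct sketch_bo *bo, enum sketch_domain domain,
		       bool contiguous)
{
	bo->domain = domain;
	bo->is_contiguous = contiguous;
	bo->moves++;
	return 0;
}

static int sketch_pin_vram(struct sketch_bo *bo)
{
	int ret = 0;

	/* Step 1: free the fragmented VRAM by moving the BO out first. */
	if (bo->wants_contiguous && !bo->is_contiguous) {
		ret = sketch_move(bo, SKETCH_GTT, false);
		if (ret)
			return ret;
	}

	/* Step 2: bring it back to VRAM if it is not there already. */
	if (bo->domain != SKETCH_VRAM)
		ret = sketch_move(bo, SKETCH_VRAM, bo->wants_contiguous);

	return ret;
}
```

A BO that is already contiguous, or never asked to be, is pinned in
place with no migration; only the fragmented contiguous case pays for
two moves.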



* [PATCH v2 4/6] drm/amdkfd: Increase KFD bo restore wait time
  2024-04-18 13:57 [PATCH v2 0/6] Best effort contiguous VRAM allocation Philip Yang
                   ` (2 preceding siblings ...)
  2024-04-18 13:58 ` [PATCH v2 3/6] drm/amdkfd: Evict BO itself " Philip Yang
@ 2024-04-18 13:58 ` Philip Yang
  2024-04-18 13:58 ` [PATCH v2 5/6] drm/amdgpu: Skip dma map resource for null RDMA device Philip Yang
  2024-04-18 13:58 ` [PATCH v2 6/6] drm/amdkfd: Bump kfd version for contiguous VRAM allocation Philip Yang
  5 siblings, 0 replies; 9+ messages in thread
From: Philip Yang @ 2024-04-18 13:58 UTC (permalink / raw)
  To: amd-gfx
  Cc: Felix.Kuehling, christian.koenig, Arunpravin.PaneerSelvam, Philip Yang

Allocating contiguous VRAM through TTM may take more than 1 second to evict
BOs for larger RDMA buffers. Because the KFD restore BO worker reserves all
KFD BOs, TTM cannot take the locks of the remaining KFD BOs to evict them,
which causes TTM to fail to allocate contiguous VRAM.

Increase the KFD restore BO wait time to 2 seconds, long enough for the RDMA
pin BO path to allocate the contiguous VRAM.

Signed-off-by: Philip Yang <Philip.Yang@amd.com>
---
 drivers/gpu/drm/amd/amdkfd/kfd_priv.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_priv.h b/drivers/gpu/drm/amd/amdkfd/kfd_priv.h
index a81ef232fdef..c205e2d3acf9 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_priv.h
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_priv.h
@@ -698,7 +698,7 @@ struct qcm_process_device {
 /* KFD Memory Eviction */
 
 /* Approx. wait time before attempting to restore evicted BOs */
-#define PROCESS_RESTORE_TIME_MS 100
+#define PROCESS_RESTORE_TIME_MS 2000
 /* Approx. back off time if restore fails due to lack of memory */
 #define PROCESS_BACK_OFF_TIME_MS 100
 /* Approx. time before evicting the process again */
-- 
2.43.2



* [PATCH v2 5/6] drm/amdgpu: Skip dma map resource for null RDMA device
  2024-04-18 13:57 [PATCH v2 0/6] Best effort contiguous VRAM allocation Philip Yang
                   ` (3 preceding siblings ...)
  2024-04-18 13:58 ` [PATCH v2 4/6] drm/amdkfd: Increase KFD bo restore wait time Philip Yang
@ 2024-04-18 13:58 ` Philip Yang
  2024-04-18 13:58 ` [PATCH v2 6/6] drm/amdkfd: Bump kfd version for contiguous VRAM allocation Philip Yang
  5 siblings, 0 replies; 9+ messages in thread
From: Philip Yang @ 2024-04-18 13:58 UTC (permalink / raw)
  To: amd-gfx
  Cc: Felix.Kuehling, christian.koenig, Arunpravin.PaneerSelvam, Philip Yang

To test RDMA using a dummy driver on a system without a NIC/RDMA device,
the get/put DMA pages calls pass in a null device pointer. Skip the
DMA map/unmap resource calls in that case to avoid a null pointer access.

Signed-off-by: Philip Yang <Philip.Yang@amd.com>
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c | 33 +++++++++++---------
 1 file changed, 19 insertions(+), 14 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c
index 2f2ae7177771..4c512a372ec7 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c
@@ -703,12 +703,15 @@ int amdgpu_vram_mgr_alloc_sgt(struct amdgpu_device *adev,
 		size_t size = cursor.size;
 		dma_addr_t addr;
 
-		addr = dma_map_resource(dev, phys, size, dir,
-					DMA_ATTR_SKIP_CPU_SYNC);
-		r = dma_mapping_error(dev, addr);
-		if (r)
-			goto error_unmap;
-
+		if (dev) {
+			addr = dma_map_resource(dev, phys, size, dir,
+						DMA_ATTR_SKIP_CPU_SYNC);
+			r = dma_mapping_error(dev, addr);
+			if (r)
+				goto error_unmap;
+		} else {
+			addr = phys;
+		}
 		sg_set_page(sg, NULL, size, 0);
 		sg_dma_address(sg) = addr;
 		sg_dma_len(sg) = size;
@@ -722,10 +725,10 @@ int amdgpu_vram_mgr_alloc_sgt(struct amdgpu_device *adev,
 	for_each_sgtable_sg((*sgt), sg, i) {
 		if (!sg->length)
 			continue;
-
-		dma_unmap_resource(dev, sg->dma_address,
-				   sg->length, dir,
-				   DMA_ATTR_SKIP_CPU_SYNC);
+		if (dev)
+			dma_unmap_resource(dev, sg->dma_address,
+					   sg->length, dir,
+					   DMA_ATTR_SKIP_CPU_SYNC);
 	}
 	sg_free_table(*sgt);
 
@@ -750,10 +753,12 @@ void amdgpu_vram_mgr_free_sgt(struct device *dev,
 	struct scatterlist *sg;
 	int i;
 
-	for_each_sgtable_sg(sgt, sg, i)
-		dma_unmap_resource(dev, sg->dma_address,
-				   sg->length, dir,
-				   DMA_ATTR_SKIP_CPU_SYNC);
+	if (dev) {
+		for_each_sgtable_sg(sgt, sg, i)
+			dma_unmap_resource(dev, sg->dma_address,
+					   sg->length, dir,
+					   DMA_ATTR_SKIP_CPU_SYNC);
+	}
 	sg_free_table(sgt);
 	kfree(sgt);
 }
-- 
2.43.2
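
The null-device fallback amounts to "map if there is a device, else
use the physical address as-is". A minimal sketch under loud
assumptions: struct sketch_device and its fixed dma_offset are a
hypothetical toy mapping, not how dma_map_resource() behaves on real
hardware.

```c
#include <stddef.h>
#include <stdint.h>

/* Toy peer device: "mapping" just adds a fixed offset (an assumption). */
struct sketch_device {
	uint64_t dma_offset;
};

static uint64_t sketch_map(const struct sketch_device *dev, uint64_t phys)
{
	return phys + dev->dma_offset;
}

/*
 * With a device, return the mapped address; with a NULL device (the
 * dummy-driver test case), fall back to the physical address itself.
 */
static uint64_t sketch_dma_addr(const struct sketch_device *dev,
				uint64_t phys)
{
	if (dev)
		return sketch_map(dev, phys);

	return phys;
}
```

The unmap side has to mirror this: an address that was never mapped
must not be passed to the unmap call, which is why the patch guards
both paths on the same pointer.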



* [PATCH v2 6/6] drm/amdkfd: Bump kfd version for contiguous VRAM allocation
  2024-04-18 13:57 [PATCH v2 0/6] Best effort contiguous VRAM allocation Philip Yang
                   ` (4 preceding siblings ...)
  2024-04-18 13:58 ` [PATCH v2 5/6] drm/amdgpu: Skip dma map resource for null RDMA device Philip Yang
@ 2024-04-18 13:58 ` Philip Yang
  5 siblings, 0 replies; 9+ messages in thread
From: Philip Yang @ 2024-04-18 13:58 UTC (permalink / raw)
  To: amd-gfx
  Cc: Felix.Kuehling, christian.koenig, Arunpravin.PaneerSelvam, Philip Yang

Bump the KFD ioctl minor version to declare support for the contiguous
VRAM allocation flag.

Signed-off-by: Philip Yang <Philip.Yang@amd.com>
---
 include/uapi/linux/kfd_ioctl.h | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/include/uapi/linux/kfd_ioctl.h b/include/uapi/linux/kfd_ioctl.h
index c1394c162d4e..a0af2ef696ea 100644
--- a/include/uapi/linux/kfd_ioctl.h
+++ b/include/uapi/linux/kfd_ioctl.h
@@ -41,9 +41,10 @@
  * - 1.13 - Add debugger API
  * - 1.14 - Update kfd_event_data
  * - 1.15 - Enable managing mappings in compute VMs with GEM_VA ioctl
+ * - 1.16 - Add contiguous VRAM allocation flag for RDMA
  */
 #define KFD_IOCTL_MAJOR_VERSION 1
-#define KFD_IOCTL_MINOR_VERSION 15
+#define KFD_IOCTL_MINOR_VERSION 16
 
 struct kfd_ioctl_get_version_args {
 	__u32 major_version;	/* from KFD */
-- 
2.43.2
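
A userspace runtime would typically gate the new flag on the reported
ioctl version. A minimal sketch of such a check, where
sketch_has_contiguous_flag() is a hypothetical helper and the major
and minor values are assumed to come from the KFD get-version ioctl:

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Returns true when the reported KFD ioctl version is 1.16 or a later
 * 1.x minor. Other major versions are treated as unknown, since this
 * series says nothing about them.
 */
static bool sketch_has_contiguous_flag(uint32_t major, uint32_t minor)
{
	return major == 1 && minor >= 16;
}
```

Callers that get false would simply leave the best-effort flag out of
the allocation request and accept whatever layout VRAM provides.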



* Re: [PATCH v2 1/6] drm/amdgpu: Support contiguous VRAM allocation
  2024-04-18 13:57 ` [PATCH v2 1/6] drm/amdgpu: Support " Philip Yang
@ 2024-04-18 14:37   ` Christian König
  2024-04-18 20:50     ` Philip Yang
  0 siblings, 1 reply; 9+ messages in thread
From: Christian König @ 2024-04-18 14:37 UTC (permalink / raw)
  To: Philip Yang, amd-gfx
  Cc: Felix.Kuehling, christian.koenig, Arunpravin.PaneerSelvam



On 18.04.24 at 15:57, Philip Yang wrote:
> RDMA device with limited scatter-gather ability requires contiguous VRAM
> buffer allocation for RDMA peer direct support.
>
> Add a new KFD alloc memory flag and store as bo alloc flag
> AMDGPU_GEM_CREATE_VRAM_CONTIGUOUS. When pin this bo to export for RDMA
> peerdirect access, this will set TTM_PL_FLAG_CONTIGUOUS flag, and ask
> VRAM buddy allocator to get contiguous VRAM.
>
> Remove the 2GB max memory block size limit for contiguous allocation.
>
> Signed-off-by: Philip Yang <Philip.Yang@amd.com>
> ---
>   drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c | 4 ++++
>   drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c     | 9 +++++++--
>   include/uapi/linux/kfd_ioctl.h                   | 1 +
>   3 files changed, 12 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
> index 0ae9fd844623..ef9154043757 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
> @@ -1712,6 +1712,10 @@ int amdgpu_amdkfd_gpuvm_alloc_memory_of_gpu(
>   			alloc_flags = AMDGPU_GEM_CREATE_VRAM_WIPE_ON_RELEASE;
>   			alloc_flags |= (flags & KFD_IOC_ALLOC_MEM_FLAGS_PUBLIC) ?
>   			AMDGPU_GEM_CREATE_CPU_ACCESS_REQUIRED : 0;
> +
> +			/* For contiguous VRAM allocation */
> +			if (flags & KFD_IOC_ALLOC_MEM_FLAGS_CONTIGUOUS_BEST_EFFORT)
> +				alloc_flags |= AMDGPU_GEM_CREATE_VRAM_CONTIGUOUS;
>   		}
>   		xcp_id = fpriv->xcp_id == AMDGPU_XCP_NO_PARTITION ?
>   					0 : fpriv->xcp_id;
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c
> index 4be8b091099a..2f2ae7177771 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c
> @@ -532,8 +532,13 @@ static int amdgpu_vram_mgr_new(struct ttm_resource_manager *man,
>   
>   		BUG_ON(min_block_size < mm->chunk_size);
>   
> -		/* Limit maximum size to 2GiB due to SG table limitations */
> -		size = min(remaining_size, 2ULL << 30);
> +		if (place->flags & TTM_PL_FLAG_CONTIGUOUS)
> +			size = remaining_size;
> +		else
> +			/* Limit maximum size to 2GiB due to SG table limitations
> +			 * for non-contiguous allocation.
> +			 */
> +			size = min(remaining_size, 2ULL << 30);

Oh, I totally missed this in the first review. That won't work like that:
the sg table limit is still there even if the BO is contiguous.

The only thing we could do is fix up the VRAM P2P support to use multiple
segments in the sg table.

Regards,
Christian.
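
The "multiple segments" idea can be sketched as splitting one
contiguous range into entries that each stay under the cap. This is
an illustration only, not the amdgpu P2P code: struct sketch_seg and
sketch_split_range() are hypothetical, and the 2 GiB cap is taken
from the code comment in the hunk above.

```c
#include <stddef.h>
#include <stdint.h>

/* Per-entry cap, using the 2 GiB figure from the quoted comment. */
#define SKETCH_MAX_SEG_BYTES	(2ULL << 30)

struct sketch_seg {
	uint64_t addr;
	uint64_t len;
};

/*
 * Split the contiguous range [start, start + size) into segments of at
 * most SKETCH_MAX_SEG_BYTES. Fills up to max_segs entries and returns
 * the number of segments the range needs.
 */
static size_t sketch_split_range(uint64_t start, uint64_t size,
				 struct sketch_seg *segs, size_t max_segs)
{
	size_t n = 0;

	while (size) {
		uint64_t len = size < SKETCH_MAX_SEG_BYTES ?
			       size : SKETCH_MAX_SEG_BYTES;

		if (n < max_segs) {
			segs[n].addr = start;
			segs[n].len = len;
		}
		start += len;
		size -= len;
		n++;
	}

	return n;
}
```

For a 5 GiB contiguous buffer this yields three back-to-back entries
of 2 GiB, 2 GiB and 1 GiB, so the memory stays physically contiguous
while no single entry exceeds the cap.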

>   
>   		if ((size >= (u64)pages_per_block << PAGE_SHIFT) &&
>   				!(size & (((u64)pages_per_block << PAGE_SHIFT) - 1)))
> diff --git a/include/uapi/linux/kfd_ioctl.h b/include/uapi/linux/kfd_ioctl.h
> index 2040a470ddb4..c1394c162d4e 100644
> --- a/include/uapi/linux/kfd_ioctl.h
> +++ b/include/uapi/linux/kfd_ioctl.h
> @@ -407,6 +407,7 @@ struct kfd_ioctl_acquire_vm_args {
>   #define KFD_IOC_ALLOC_MEM_FLAGS_COHERENT	(1 << 26)
>   #define KFD_IOC_ALLOC_MEM_FLAGS_UNCACHED	(1 << 25)
>   #define KFD_IOC_ALLOC_MEM_FLAGS_EXT_COHERENT	(1 << 24)
> +#define KFD_IOC_ALLOC_MEM_FLAGS_CONTIGUOUS_BEST_EFFORT	(1 << 23)
>   
>   /* Allocate memory for later SVM (shared virtual memory) mapping.
>    *



* Re: [PATCH v2 1/6] drm/amdgpu: Support contiguous VRAM allocation
  2024-04-18 14:37   ` Christian König
@ 2024-04-18 20:50     ` Philip Yang
  0 siblings, 0 replies; 9+ messages in thread
From: Philip Yang @ 2024-04-18 20:50 UTC (permalink / raw)
  To: Christian König, Philip Yang, amd-gfx
  Cc: Felix.Kuehling, christian.koenig, Arunpravin.PaneerSelvam

[-- Attachment #1: Type: text/html, Size: 7559 bytes --]

