From: "Errabolu, Ramesh" <Ramesh.Errabolu@amd.com>
To: "Kuehling, Felix" <Felix.Kuehling@amd.com>,
	"Zeng, Oak" <Oak.Zeng@amd.com>,
	"amd-gfx@lists.freedesktop.org" <amd-gfx@lists.freedesktop.org>,
	"dri-devel@lists.freedesktop.org"
	<dri-devel@lists.freedesktop.org>
Subject: RE: [PATCH v2 05/10] drm/amdgpu: Add multi-GPU DMA mapping helpers
Date: Mon, 10 May 2021 22:05:03 +0000	[thread overview]
Message-ID: <SN6PR12MB2672D07C35FEFA6DDA91D344E3549@SN6PR12MB2672.namprd12.prod.outlook.com> (raw)
In-Reply-To: <1c3566b0-6275-b6c2-3f3f-28178bf60b44@amd.com>

[AMD Official Use Only - Internal Distribution Only]

Acked-by: Ramesh Errabolu <ramesh.errabolu@amd.com>

-----Original Message-----
From: amd-gfx <amd-gfx-bounces@lists.freedesktop.org> On Behalf Of Kuehling, Felix
Sent: Monday, April 26, 2021 10:41 PM
To: Zeng, Oak <Oak.Zeng@amd.com>; amd-gfx@lists.freedesktop.org; dri-devel@lists.freedesktop.org
Subject: Re: [PATCH v2 05/10] drm/amdgpu: Add multi-GPU DMA mapping helpers

Am 2021-04-26 um 8:09 p.m. schrieb Zeng, Oak:
> As I understand it, when one GPU maps another GPU's VRAM, that VRAM should also be mapped in the IOMMU page table. Normal GTT memory (as opposed to userptr) also needs to be mapped in the IOMMU, but I don't see that code below.

Right, I'm not solving all the problems at once. The next patch handles GTT BOs.

Peer mappings of doorbells, MMIO and VRAM still need to be handled in the future. I'm trying to fix the worst issues first. This series should get 99% of real-world tests working.


> I only see you mapping userptrs in the IOMMU. Or maybe you map the others in the IOMMU at a time other than memory attachment?
>
> Also see a nit-pick inline
>
> Regards,
> Oak
>
>  
>
> On 2021-04-21, 9:31 PM, "dri-devel on behalf of Felix Kuehling" <dri-devel-bounces@lists.freedesktop.org on behalf of Felix.Kuehling@amd.com> wrote:
>
>     Add BO-type specific helper functions to DMA-map and unmap
>     kfd_mem_attachments. Implement this functionality for userptrs by creating
>     one SG BO per GPU and filling it with a DMA mapping of the pages from the
>     original mem->bo.
>
>     Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
>     ---
>      drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h    |   8 +-
>      .../gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c  | 146 +++++++++++++++++-
>      2 files changed, 145 insertions(+), 9 deletions(-)
>
>     diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h
>     index c24b2478f445..63668433f5a6 100644
>     --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h
>     +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h
>     @@ -38,11 +38,17 @@ extern uint64_t amdgpu_amdkfd_total_mem_size;
>
>      struct amdgpu_device;
>
>     +enum kfd_mem_attachment_type {
>     +	KFD_MEM_ATT_SHARED,	/* Share kgd_mem->bo or another attachment's */
>     +	KFD_MEM_ATT_USERPTR,	/* SG bo to DMA map pages from a userptr bo */
>     +};
>     +
>      struct kfd_mem_attachment {
>      	struct list_head list;
>     +	enum kfd_mem_attachment_type type;
>     +	bool is_mapped;
>      	struct amdgpu_bo_va *bo_va;
>      	struct amdgpu_device *adev;
>     -	bool is_mapped;
>      	uint64_t va;
>      	uint64_t pte_flags;
>      };
>     diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
>     index fbd7e786b54e..49d1af4aa5f1 100644
>     --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
>     +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
>     @@ -473,12 +473,117 @@ static uint64_t get_pte_flags(struct amdgpu_device *adev, struct kgd_mem *mem)
>      	return pte_flags;
>      }
>
>     +static int
>     +kfd_mem_dmamap_userptr(struct kgd_mem *mem,
>     +		       struct kfd_mem_attachment *attachment)
>     +{
>     +	enum dma_data_direction direction =
>     +		mem->alloc_flags & KFD_IOC_ALLOC_MEM_FLAGS_WRITABLE ?
>     +		DMA_BIDIRECTIONAL : DMA_TO_DEVICE;
>     +	struct ttm_operation_ctx ctx = {.interruptible = true};
>     +	struct amdgpu_bo *bo = attachment->bo_va->base.bo;
>     +	struct amdgpu_device *adev = attachment->adev;
>     +	struct ttm_tt *src_ttm = mem->bo->tbo.ttm;
>     +	struct ttm_tt *ttm = bo->tbo.ttm;
>     +	int ret;
>     +
>     +	ttm->sg = kmalloc(sizeof(*ttm->sg), GFP_KERNEL);
>     +	if (unlikely(!ttm->sg))
>     +		return -ENOMEM;
>     +
>     +	if (WARN_ON(ttm->num_pages != src_ttm->num_pages))
>     +		return -EINVAL;
>     +
>     +	/* Same sequence as in amdgpu_ttm_tt_pin_userptr */
>     +	ret = sg_alloc_table_from_pages(ttm->sg, src_ttm->pages,
>     +					ttm->num_pages, 0,
>     +					(u64)ttm->num_pages << PAGE_SHIFT,
>     +					GFP_KERNEL);
>     +	if (unlikely(ret))
>     +		goto release_sg;
> Should this go to a label starting from the kfree below?

Thanks, I'll fix that.

Regards,
  Felix


>     +
>     +	ret = dma_map_sgtable(adev->dev, ttm->sg, direction, 0);
>     +	if (unlikely(ret))
>     +		goto release_sg;
>     +
>     +	drm_prime_sg_to_dma_addr_array(ttm->sg, ttm->dma_address,
>     +				       ttm->num_pages);
>     +
>     +	amdgpu_bo_placement_from_domain(bo, AMDGPU_GEM_DOMAIN_GTT);
>     +	ret = ttm_bo_validate(&bo->tbo, &bo->placement, &ctx);
>     +	if (ret)
>     +		goto release_sg;
>     +
>     +	return 0;
>     +
>     +release_sg:
>     +	pr_err("DMA map userptr failed: %d\n", ret);
>     +	sg_free_table(ttm->sg);
>     +	kfree(ttm->sg);
>     +	ttm->sg = NULL;
>     +	return ret;
>     +}
>     +
>     +static int
>     +kfd_mem_dmamap_attachment(struct kgd_mem *mem,
>     +			  struct kfd_mem_attachment *attachment)
>     +{
>     +	switch (attachment->type) {
>     +	case KFD_MEM_ATT_SHARED:
>     +		return 0;
>     +	case KFD_MEM_ATT_USERPTR:
>     +		return kfd_mem_dmamap_userptr(mem, attachment);
>     +	default:
>     +		WARN_ON_ONCE(1);
>     +	}
>     +	return -EINVAL;
>     +}
>     +
>     +static void
>     +kfd_mem_dmaunmap_userptr(struct kgd_mem *mem,
>     +			 struct kfd_mem_attachment *attachment)
>     +{
>     +	enum dma_data_direction direction =
>     +		mem->alloc_flags & KFD_IOC_ALLOC_MEM_FLAGS_WRITABLE ?
>     +		DMA_BIDIRECTIONAL : DMA_TO_DEVICE;
>     +	struct ttm_operation_ctx ctx = {.interruptible = false};
>     +	struct amdgpu_bo *bo = attachment->bo_va->base.bo;
>     +	struct amdgpu_device *adev = attachment->adev;
>     +	struct ttm_tt *ttm = bo->tbo.ttm;
>     +
>     +	if (unlikely(!ttm->sg))
>     +		return;
>     +
>     +	amdgpu_bo_placement_from_domain(bo, AMDGPU_GEM_DOMAIN_CPU);
>     +	ttm_bo_validate(&bo->tbo, &bo->placement, &ctx);
>     +
>     +	dma_unmap_sgtable(adev->dev, ttm->sg, direction, 0);
>     +	sg_free_table(ttm->sg);
>     +	ttm->sg = NULL;
>     +}
>     +
>     +static void
>     +kfd_mem_dmaunmap_attachment(struct kgd_mem *mem,
>     +			    struct kfd_mem_attachment *attachment)
>     +{
>     +	switch (attachment->type) {
>     +	case KFD_MEM_ATT_SHARED:
>     +		break;
>     +	case KFD_MEM_ATT_USERPTR:
>     +		kfd_mem_dmaunmap_userptr(mem, attachment);
>     +		break;
>     +	default:
>     +		WARN_ON_ONCE(1);
>     +	}
>     +}
>     +
>      /* kfd_mem_attach - Add a BO to a VM
>       *
>      * Everything that needs to be done only once when a BO is first added
>       * to a VM. It can later be mapped and unmapped many times without
>       * repeating these steps.
>       *
>     + * 0. Create BO for DMA mapping, if needed
>       * 1. Allocate and initialize BO VA entry data structure
>       * 2. Add BO to the VM
>       * 3. Determine ASIC-specific PTE flags
>     @@ -488,10 +593,12 @@ static uint64_t get_pte_flags(struct amdgpu_device *adev, struct kgd_mem *mem)
>      static int kfd_mem_attach(struct amdgpu_device *adev, struct kgd_mem *mem,
>      		struct amdgpu_vm *vm, bool is_aql)
>      {
>     +	struct amdgpu_device *bo_adev = amdgpu_ttm_adev(mem->bo->tbo.bdev);
>      	unsigned long bo_size = mem->bo->tbo.base.size;
>      	uint64_t va = mem->va;
>      	struct kfd_mem_attachment *attachment[2] = {NULL, NULL};
>      	struct amdgpu_bo *bo[2] = {NULL, NULL};
>     +	struct drm_gem_object *gobj;
>      	int i, ret;
>
>      	if (!va) {
>     @@ -509,14 +616,37 @@ static int kfd_mem_attach(struct amdgpu_device *adev, struct kgd_mem *mem,
>      		pr_debug("\t add VA 0x%llx - 0x%llx to vm %p\n", va,
>      			 va + bo_size, vm);
>
>     -		/* FIXME: For now all attachments use the same BO. This is
>     -		 * incorrect because one BO can only have one DMA mapping
>     -		 * for one GPU. We need one BO per GPU, e.g. a DMABuf
>     -		 * import with dynamic attachment. This will be addressed
>     -		 * one BO-type at a time in subsequent patches.
>     -		 */
>     -		bo[i] = mem->bo;
>     -		drm_gem_object_get(&bo[i]->tbo.base);
>     +		if (adev == bo_adev || (mem->domain == AMDGPU_GEM_DOMAIN_VRAM &&
>     +					amdgpu_xgmi_same_hive(adev, bo_adev))) {
>     +			/* Mappings on the local GPU and VRAM mappings in the
>     +			 * local hive share the original BO
>     +			 */
>     +			attachment[i]->type = KFD_MEM_ATT_SHARED;
>     +			bo[i] = mem->bo;
>     +			drm_gem_object_get(&bo[i]->tbo.base);
>     +		} else if (i > 0) {
>     +			/* Multiple mappings on the same GPU share the BO */
>     +			attachment[i]->type = KFD_MEM_ATT_SHARED;
>     +			bo[i] = bo[0];
>     +			drm_gem_object_get(&bo[i]->tbo.base);
>     +		} else if (amdgpu_ttm_tt_get_usermm(mem->bo->tbo.ttm)) {
>     +			/* Create an SG BO to DMA-map userptrs on other GPUs */
>     +			attachment[i]->type = KFD_MEM_ATT_USERPTR;
>     +			ret = amdgpu_gem_object_create(adev, bo_size, 1,
>     +						       AMDGPU_GEM_DOMAIN_CPU,
>     +						       0, ttm_bo_type_sg,
>     +						       mem->bo->tbo.base.resv,
>     +						       &gobj);
>     +			if (ret)
>     +				goto unwind;
>     +			bo[i] = gem_to_amdgpu_bo(gobj);
>     +			bo[i]->parent = amdgpu_bo_ref(mem->bo);
>     +		} else {
>     +			/* FIXME: Need to DMA-map other BO types */
>     +			attachment[i]->type = KFD_MEM_ATT_SHARED;
>     +			bo[i] = mem->bo;
>     +			drm_gem_object_get(&bo[i]->tbo.base);
>     +		}
>
>      		/* Add BO to VM internal data structures */
>      		attachment[i]->bo_va = amdgpu_vm_bo_add(adev, vm, bo[i]);
>     -- 
>     2.31.1
>
>     _______________________________________________
>     dri-devel mailing list
>     dri-devel@lists.freedesktop.org
>     
> https://lists.freedesktop.org/mailman/listinfo/dri-devel
>
_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


Thread overview: 64+ messages
2021-04-22  1:30 [PATCH v2 00/10] Implement multi-GPU DMA mappings for KFD Felix Kuehling
2021-04-22  1:30 ` Felix Kuehling
2021-04-22  1:30 ` [PATCH v2 01/10] rock-dbg_defconfig: Enable Intel IOMMU Felix Kuehling
2021-04-22  1:30   ` Felix Kuehling
2021-04-22  1:30 ` [PATCH v2 02/10] drm/amdgpu: Rename kfd_bo_va_list to kfd_mem_attachment Felix Kuehling
2021-04-22  1:30   ` Felix Kuehling
2021-05-10 22:00   ` Errabolu, Ramesh
2021-05-10 22:00     ` Errabolu, Ramesh
2021-04-22  1:30 ` [PATCH v2 03/10] drm/amdgpu: Keep a bo-reference per-attachment Felix Kuehling
2021-04-22  1:30   ` Felix Kuehling
2021-05-10 22:00   ` Errabolu, Ramesh
2021-05-10 22:00     ` Errabolu, Ramesh
2021-04-22  1:30 ` [PATCH v2 04/10] drm/amdgpu: Simplify AQL queue mapping Felix Kuehling
2021-04-22  1:30   ` Felix Kuehling
2021-04-23  1:33   ` Zeng, Oak
2021-04-23  1:33     ` Zeng, Oak
2021-04-23  7:23     ` Felix Kuehling
2021-04-23  7:23       ` Felix Kuehling
2021-05-10 22:03       ` Errabolu, Ramesh
2021-05-10 22:03         ` Errabolu, Ramesh
2021-04-22  1:30 ` [PATCH v2 05/10] drm/amdgpu: Add multi-GPU DMA mapping helpers Felix Kuehling
2021-04-22  1:30   ` Felix Kuehling
2021-04-27  0:09   ` Zeng, Oak
2021-04-27  0:09     ` Zeng, Oak
2021-04-27  3:41     ` Felix Kuehling
2021-04-27  3:41       ` Felix Kuehling
2021-05-10 22:05       ` Errabolu, Ramesh [this message]
2021-05-10 22:05         ` Errabolu, Ramesh
2021-04-22  1:30 ` [PATCH v2 06/10] drm/amdgpu: DMA map/unmap when updating GPU mappings Felix Kuehling
2021-04-22  1:30   ` Felix Kuehling
2021-04-27  0:23   ` Zeng, Oak
2021-04-27  0:23     ` Zeng, Oak
2021-04-27  3:47     ` Felix Kuehling
2021-04-27  3:47       ` Felix Kuehling
2021-05-10 22:06       ` Errabolu, Ramesh
2021-05-10 22:06         ` Errabolu, Ramesh
2021-04-22  1:30 ` [PATCH v2 07/10] drm/amdgpu: Move kfd_mem_attach outside reservation Felix Kuehling
2021-04-22  1:30   ` Felix Kuehling
2021-05-10 22:06   ` Errabolu, Ramesh
2021-05-10 22:06     ` Errabolu, Ramesh
2021-04-22  1:30 ` [PATCH v2 08/10] drm/amdgpu: Add DMA mapping of GTT BOs Felix Kuehling
2021-04-22  1:30   ` Felix Kuehling
2021-04-27  0:35   ` Zeng, Oak
2021-04-27  0:35     ` Zeng, Oak
2021-04-27  3:56     ` Felix Kuehling
2021-04-27  3:56       ` Felix Kuehling
2021-04-27 14:29       ` Zeng, Oak
2021-04-27 14:29         ` Zeng, Oak
2021-04-27 15:08         ` Felix Kuehling
2021-04-27 15:08           ` Felix Kuehling
2021-05-10 22:07           ` Errabolu, Ramesh
2021-05-10 22:07             ` Errabolu, Ramesh
2021-04-22  1:30 ` [PATCH v2 09/10] drm/ttm: Don't count pages in SG BOs against pages_limit Felix Kuehling
2021-04-22  1:30   ` Felix Kuehling
2021-05-10 22:08   ` Errabolu, Ramesh
2021-05-10 22:08     ` Errabolu, Ramesh
2021-04-22  1:30 ` [PATCH v2 10/10] drm/amdgpu: Move dmabuf attach/detach to backend_(un)bind Felix Kuehling
2021-04-22  1:30   ` Felix Kuehling
2021-04-22 11:20   ` Christian König
2021-04-22 11:20     ` Christian König
2021-05-10 22:09     ` Errabolu, Ramesh
2021-05-10 22:09       ` Errabolu, Ramesh
2021-04-27 15:16 ` [PATCH v2 00/10] Implement multi-GPU DMA mappings for KFD Zeng, Oak
2021-04-27 15:16   ` Zeng, Oak
