dri-devel.lists.freedesktop.org archive mirror
* [PATCH 0/6] amdgpu: Allow explicitly synchronized submissions.
@ 2022-08-13  1:27 Bas Nieuwenhuizen
  2022-08-13  1:27 ` [PATCH 1/6] drm/ttm: Add usage to ttm_validate_buffer Bas Nieuwenhuizen
                   ` (6 more replies)
  0 siblings, 7 replies; 14+ messages in thread
From: Bas Nieuwenhuizen @ 2022-08-13  1:27 UTC (permalink / raw)
  To: dri-devel, amd-gfx; +Cc: christian.koenig

This adds a context option to use DMA_RESV_USAGE_BOOKKEEP for userspace submissions,
based on Christian's TTM work.

Disabling implicit sync is something we've wanted in radv for a while to resolve
some corner cases. A more immediate benefit is that GPU map/unmap operations no
longer implicitly sync against previous submissions, which helps with stutter
around sparse maps/unmaps.

This gives a significant improvement in stutter in Forza Horizon 5 and Forza
Horizon 4 (games that had significant sparse-binding-related stutter). I've been
able to pass a full vulkan-cts run on navi21 with this.

Userspace code for this is available at
https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18032 and a branch
for the kernel code is available at
https://github.com/BNieuwenhuizen/linux/tree/no-implicit-sync-5.19

This is a follow-up on RFC series https://patchwork.freedesktop.org/series/104578/ .

The main changes were:

1) Instead of replacing num_shared with usage, I'm just adding usage, since
   num_shared was actually needed.
2) We now agree that DMA_RESV_USAGE_BOOKKEEP is reasonable for this purpose.

Please let me know if I missed anything, especially with the change to VM updates,
as we went back and forth a ton of times on that.


Bas Nieuwenhuizen (6):
  drm/ttm: Add usage to ttm_validate_buffer.
  drm/amdgpu: Add separate mode for syncing DMA_RESV_USAGE_BOOKKEEP.
  drm/amdgpu: Allow explicit sync for VM ops.
  drm/amdgpu: Refactor amdgpu_vm_get_pd_bo.
  drm/amdgpu: Add option to disable implicit sync for a context.
  drm/amdgpu: Bump amdgpu driver version.

 .../gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c  | 16 +++++++---
 drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c        | 20 +++++++++---
 drivers/gpu/drm/amd/amdgpu/amdgpu_csa.c       |  3 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c       | 32 +++++++++++++++++--
 drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.h       |  1 +
 drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c       |  3 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c       | 12 ++++---
 drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c       |  3 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_object.c    | 11 ++++---
 drivers/gpu/drm/amd/amdgpu/amdgpu_object.h    |  3 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_sync.c      | 11 +++++--
 drivers/gpu/drm/amd/amdgpu/amdgpu_sync.h      |  4 +--
 drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c       |  1 +
 drivers/gpu/drm/amd/amdgpu/amdgpu_uvd.c       |  2 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c        |  5 ++-
 drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h        |  3 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_vm_cpu.c    |  3 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_vm_sdma.c   |  3 +-
 drivers/gpu/drm/amd/amdkfd/kfd_svm.c          |  1 +
 drivers/gpu/drm/qxl/qxl_release.c             |  1 +
 drivers/gpu/drm/radeon/radeon_cs.c            |  2 ++
 drivers/gpu/drm/radeon/radeon_gem.c           |  1 +
 drivers/gpu/drm/radeon/radeon_vm.c            |  2 ++
 drivers/gpu/drm/ttm/ttm_execbuf_util.c        |  3 +-
 drivers/gpu/drm/vmwgfx/vmwgfx_resource.c      |  7 +++-
 drivers/gpu/drm/vmwgfx/vmwgfx_validation.c    |  1 +
 include/drm/ttm/ttm_execbuf_util.h            |  2 ++
 include/uapi/drm/amdgpu_drm.h                 |  3 ++
 28 files changed, 122 insertions(+), 37 deletions(-)

-- 
2.37.1


^ permalink raw reply	[flat|nested] 14+ messages in thread

* [PATCH 1/6] drm/ttm: Add usage to ttm_validate_buffer.
  2022-08-13  1:27 [PATCH 0/6] amdgpu: Allow explicitly synchronized submissions Bas Nieuwenhuizen
@ 2022-08-13  1:27 ` Bas Nieuwenhuizen
  2022-08-17 22:04   ` Felix Kuehling
  2022-08-13  1:27 ` [PATCH 2/6] drm/amdgpu: Add separate mode for syncing DMA_RESV_USAGE_BOOKKEEP Bas Nieuwenhuizen
                   ` (5 subsequent siblings)
  6 siblings, 1 reply; 14+ messages in thread
From: Bas Nieuwenhuizen @ 2022-08-13  1:27 UTC (permalink / raw)
  To: dri-devel, amd-gfx; +Cc: christian.koenig

This way callsites can choose between READ/BOOKKEEP reservations.
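For reference, the old ttm_eu_fence_buffer_objects() behavior that the new field
generalizes can be sketched as follows. This is a standalone mock for
illustration (the enum mirrors include/linux/dma-resv.h; it is not the actual
TTM code):

```c
#include <assert.h>

/* Standalone mirror of the kernel's enum dma_resv_usage ordering;
 * lower value = more important usage. */
enum dma_resv_usage {
	DMA_RESV_USAGE_KERNEL,
	DMA_RESV_USAGE_WRITE,
	DMA_RESV_USAGE_READ,
	DMA_RESV_USAGE_BOOKKEEP,
};

/* Before this patch, ttm_eu_fence_buffer_objects() derived the usage
 * from num_shared: any shared count meant READ, zero meant WRITE.
 * The explicit field lets call sites also pick e.g. BOOKKEEP, which
 * this mapping could never produce. */
static enum dma_resv_usage legacy_usage(unsigned int num_shared)
{
	return num_shared ? DMA_RESV_USAGE_READ : DMA_RESV_USAGE_WRITE;
}
```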

Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c | 5 +++++
 drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c           | 9 +++++++--
 drivers/gpu/drm/amd/amdgpu/amdgpu_csa.c          | 1 +
 drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c          | 8 ++++++--
 drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c          | 1 +
 drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c           | 1 +
 drivers/gpu/drm/amd/amdkfd/kfd_svm.c             | 1 +
 drivers/gpu/drm/qxl/qxl_release.c                | 1 +
 drivers/gpu/drm/radeon/radeon_cs.c               | 2 ++
 drivers/gpu/drm/radeon/radeon_gem.c              | 1 +
 drivers/gpu/drm/radeon/radeon_vm.c               | 2 ++
 drivers/gpu/drm/ttm/ttm_execbuf_util.c           | 3 +--
 drivers/gpu/drm/vmwgfx/vmwgfx_resource.c         | 7 ++++++-
 drivers/gpu/drm/vmwgfx/vmwgfx_validation.c       | 1 +
 include/drm/ttm/ttm_execbuf_util.h               | 2 ++
 15 files changed, 38 insertions(+), 7 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
index 4608599ba6bb..a6eb7697c936 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
@@ -775,6 +775,7 @@ static void add_kgd_mem_to_kfd_bo_list(struct kgd_mem *mem,
 
 	INIT_LIST_HEAD(&entry->head);
 	entry->num_shared = 1;
+	entry->usage = DMA_RESV_USAGE_READ;
 	entry->bo = &bo->tbo;
 	mutex_lock(&process_info->lock);
 	if (userptr)
@@ -919,6 +920,7 @@ static int reserve_bo_and_vm(struct kgd_mem *mem,
 	ctx->kfd_bo.priority = 0;
 	ctx->kfd_bo.tv.bo = &bo->tbo;
 	ctx->kfd_bo.tv.num_shared = 1;
+	ctx->kfd_bo.tv.usage = DMA_RESV_USAGE_READ;
 	list_add(&ctx->kfd_bo.tv.head, &ctx->list);
 
 	amdgpu_vm_get_pd_bo(vm, &ctx->list, &ctx->vm_pd[0]);
@@ -982,6 +984,7 @@ static int reserve_bo_and_cond_vms(struct kgd_mem *mem,
 	ctx->kfd_bo.priority = 0;
 	ctx->kfd_bo.tv.bo = &bo->tbo;
 	ctx->kfd_bo.tv.num_shared = 1;
+	ctx->kfd_bo.tv.usage = DMA_RESV_USAGE_READ;
 	list_add(&ctx->kfd_bo.tv.head, &ctx->list);
 
 	i = 0;
@@ -2207,6 +2210,7 @@ static int validate_invalid_user_pages(struct amdkfd_process_info *process_info)
 		list_add_tail(&mem->resv_list.head, &resv_list);
 		mem->resv_list.bo = mem->validate_list.bo;
 		mem->resv_list.num_shared = mem->validate_list.num_shared;
+		mem->resv_list.usage = mem->validate_list.usage;
 	}
 
 	/* Reserve all BOs and page tables for validation */
@@ -2406,6 +2410,7 @@ int amdgpu_amdkfd_gpuvm_restore_process_bos(void *info, struct dma_fence **ef)
 		list_add_tail(&mem->resv_list.head, &ctx.list);
 		mem->resv_list.bo = mem->validate_list.bo;
 		mem->resv_list.num_shared = mem->validate_list.num_shared;
+		mem->resv_list.usage = mem->validate_list.usage;
 	}
 
 	ret = ttm_eu_reserve_buffers(&ctx.ticket, &ctx.list,
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
index d8f1335bc68f..f1ceb25d1b84 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
@@ -57,6 +57,7 @@ static int amdgpu_cs_user_fence_chunk(struct amdgpu_cs_parser *p,
 	p->uf_entry.tv.bo = &bo->tbo;
 	/* One for TTM and two for the CS job */
 	p->uf_entry.tv.num_shared = 3;
+	p->uf_entry.tv.usage = DMA_RESV_USAGE_READ;
 
 	drm_gem_object_put(gobj);
 
@@ -522,8 +523,10 @@ static int amdgpu_cs_parser_bos(struct amdgpu_cs_parser *p,
 	mutex_lock(&p->bo_list->bo_list_mutex);
 
 	/* One for TTM and one for the CS job */
-	amdgpu_bo_list_for_each_entry(e, p->bo_list)
+	amdgpu_bo_list_for_each_entry(e, p->bo_list) {
 		e->tv.num_shared = 2;
+		e->tv.usage = DMA_RESV_USAGE_READ;
+	}
 
 	amdgpu_bo_list_get_list(p->bo_list, &p->validated);
 
@@ -1282,8 +1285,10 @@ static int amdgpu_cs_submit(struct amdgpu_cs_parser *p,
 	amdgpu_vm_move_to_lru_tail(p->adev, &fpriv->vm);
 
 	/* Make sure all BOs are remembered as writers */
-	amdgpu_bo_list_for_each_entry(e, p->bo_list)
+	amdgpu_bo_list_for_each_entry(e, p->bo_list) {
 		e->tv.num_shared = 0;
+		e->tv.usage = DMA_RESV_USAGE_WRITE;
+	}
 
 	ttm_eu_fence_buffer_objects(&p->ticket, &p->validated, p->fence);
 	mutex_unlock(&p->adev->notifier_lock);
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_csa.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_csa.c
index c6d4d41c4393..24941ed1a5ec 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_csa.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_csa.c
@@ -75,6 +75,7 @@ int amdgpu_map_static_csa(struct amdgpu_device *adev, struct amdgpu_vm *vm,
 	INIT_LIST_HEAD(&csa_tv.head);
 	csa_tv.bo = &bo->tbo;
 	csa_tv.num_shared = 1;
+	csa_tv.usage = DMA_RESV_USAGE_READ;
 
 	list_add(&csa_tv.head, &list);
 	amdgpu_vm_get_pd_bo(vm, &list, &pd);
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
index 8ef31d687ef3..f8cf52eb1931 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
@@ -208,6 +208,7 @@ static void amdgpu_gem_object_close(struct drm_gem_object *obj,
 
 	tv.bo = &bo->tbo;
 	tv.num_shared = 2;
+	tv.usage = DMA_RESV_USAGE_READ;
 	list_add(&tv.head, &list);
 
 	amdgpu_vm_get_pd_bo(vm, &list, &vm_pd);
@@ -733,10 +734,13 @@ int amdgpu_gem_va_ioctl(struct drm_device *dev, void *data,
 			return -ENOENT;
 		abo = gem_to_amdgpu_bo(gobj);
 		tv.bo = &abo->tbo;
-		if (abo->flags & AMDGPU_GEM_CREATE_VM_ALWAYS_VALID)
+		if (abo->flags & AMDGPU_GEM_CREATE_VM_ALWAYS_VALID) {
 			tv.num_shared = 1;
-		else
+			tv.usage = DMA_RESV_USAGE_READ;
+		} else {
 			tv.num_shared = 0;
+			tv.usage = DMA_RESV_USAGE_WRITE;
+		}
 		list_add(&tv.head, &list);
 	} else {
 		gobj = NULL;
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c
index 69a70a0aaed9..6b1da37c2280 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c
@@ -996,6 +996,7 @@ int amdgpu_mes_ctx_map_meta_data(struct amdgpu_device *adev,
 
 	csa_tv.bo = &ctx_data->meta_data_obj->tbo;
 	csa_tv.num_shared = 1;
+	csa_tv.usage = DMA_RESV_USAGE_READ;
 
 	list_add(&csa_tv.head, &list);
 	amdgpu_vm_get_pd_bo(vm, &list, &pd);
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
index dc76d2b3ce52..1b5d2317b987 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
@@ -325,6 +325,7 @@ void amdgpu_vm_get_pd_bo(struct amdgpu_vm *vm,
 	entry->tv.bo = &vm->root.bo->tbo;
 	/* Two for VM updates, one for TTM and one for the CS job */
 	entry->tv.num_shared = 4;
+	entry->tv.usage = DMA_RESV_USAGE_READ;
 	entry->user_pages = NULL;
 	list_add(&entry->tv.head, validated);
 }
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_svm.c b/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
index 7b332246eda3..83531b00b29d 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
@@ -1410,6 +1410,7 @@ static int svm_range_reserve_bos(struct svm_validate_context *ctx)
 
 		ctx->tv[gpuidx].bo = &vm->root.bo->tbo;
 		ctx->tv[gpuidx].num_shared = 4;
+		ctx->tv[gpuidx].usage = DMA_RESV_USAGE_READ;
 		list_add(&ctx->tv[gpuidx].head, &ctx->validate_list);
 	}
 
diff --git a/drivers/gpu/drm/qxl/qxl_release.c b/drivers/gpu/drm/qxl/qxl_release.c
index 368d26da0d6a..0c6e45992604 100644
--- a/drivers/gpu/drm/qxl/qxl_release.c
+++ b/drivers/gpu/drm/qxl/qxl_release.c
@@ -184,6 +184,7 @@ int qxl_release_list_add(struct qxl_release *release, struct qxl_bo *bo)
 	qxl_bo_ref(bo);
 	entry->tv.bo = &bo->tbo;
 	entry->tv.num_shared = 0;
+	entry->tv.usage = DMA_RESV_USAGE_WRITE;
 	list_add_tail(&entry->tv.head, &release->bos);
 	return 0;
 }
diff --git a/drivers/gpu/drm/radeon/radeon_cs.c b/drivers/gpu/drm/radeon/radeon_cs.c
index 446f7bae54c4..6cc470dcf177 100644
--- a/drivers/gpu/drm/radeon/radeon_cs.c
+++ b/drivers/gpu/drm/radeon/radeon_cs.c
@@ -184,6 +184,8 @@ static int radeon_cs_parser_relocs(struct radeon_cs_parser *p)
 
 		p->relocs[i].tv.bo = &p->relocs[i].robj->tbo;
 		p->relocs[i].tv.num_shared = !r->write_domain;
+		p->relocs[i].tv.usage = r->write_domain ? DMA_RESV_USAGE_WRITE :
+							  DMA_RESV_USAGE_READ;
 
 		radeon_cs_buckets_add(&buckets, &p->relocs[i].tv.head,
 				      priority);
diff --git a/drivers/gpu/drm/radeon/radeon_gem.c b/drivers/gpu/drm/radeon/radeon_gem.c
index 8c01a7f0e027..e7abd535bdc2 100644
--- a/drivers/gpu/drm/radeon/radeon_gem.c
+++ b/drivers/gpu/drm/radeon/radeon_gem.c
@@ -636,6 +636,7 @@ static void radeon_gem_va_update_vm(struct radeon_device *rdev,
 
 	tv.bo = &bo_va->bo->tbo;
 	tv.num_shared = 1;
+	tv.usage = DMA_RESV_USAGE_READ;
 	list_add(&tv.head, &list);
 
 	vm_bos = radeon_vm_get_bos(rdev, bo_va->vm, &list);
diff --git a/drivers/gpu/drm/radeon/radeon_vm.c b/drivers/gpu/drm/radeon/radeon_vm.c
index 987cabbf1318..72ff5347b56d 100644
--- a/drivers/gpu/drm/radeon/radeon_vm.c
+++ b/drivers/gpu/drm/radeon/radeon_vm.c
@@ -144,6 +144,7 @@ struct radeon_bo_list *radeon_vm_get_bos(struct radeon_device *rdev,
 	list[0].allowed_domains = RADEON_GEM_DOMAIN_VRAM;
 	list[0].tv.bo = &vm->page_directory->tbo;
 	list[0].tv.num_shared = 1;
+	list[0].tv.usage = DMA_RESV_USAGE_READ;
 	list[0].tiling_flags = 0;
 	list_add(&list[0].tv.head, head);
 
@@ -156,6 +157,7 @@ struct radeon_bo_list *radeon_vm_get_bos(struct radeon_device *rdev,
 		list[idx].allowed_domains = RADEON_GEM_DOMAIN_VRAM;
 		list[idx].tv.bo = &list[idx].robj->tbo;
 		list[idx].tv.num_shared = 1;
+		list[idx].tv.usage = DMA_RESV_USAGE_READ;
 		list[idx].tiling_flags = 0;
 		list_add(&list[idx++].tv.head, head);
 	}
diff --git a/drivers/gpu/drm/ttm/ttm_execbuf_util.c b/drivers/gpu/drm/ttm/ttm_execbuf_util.c
index dbee34a058df..44a6bce66cf7 100644
--- a/drivers/gpu/drm/ttm/ttm_execbuf_util.c
+++ b/drivers/gpu/drm/ttm/ttm_execbuf_util.c
@@ -154,8 +154,7 @@ void ttm_eu_fence_buffer_objects(struct ww_acquire_ctx *ticket,
 	list_for_each_entry(entry, list, head) {
 		struct ttm_buffer_object *bo = entry->bo;
 
-		dma_resv_add_fence(bo->base.resv, fence, entry->num_shared ?
-				   DMA_RESV_USAGE_READ : DMA_RESV_USAGE_WRITE);
+		dma_resv_add_fence(bo->base.resv, fence, entry->usage);
 		ttm_bo_move_to_lru_tail_unlocked(bo);
 		dma_resv_unlock(bo->base.resv);
 	}
diff --git a/drivers/gpu/drm/vmwgfx/vmwgfx_resource.c b/drivers/gpu/drm/vmwgfx/vmwgfx_resource.c
index a7d62a4eb47b..0de0365504d6 100644
--- a/drivers/gpu/drm/vmwgfx/vmwgfx_resource.c
+++ b/drivers/gpu/drm/vmwgfx/vmwgfx_resource.c
@@ -131,6 +131,7 @@ static void vmw_resource_release(struct kref *kref)
 
 			val_buf.bo = bo;
 			val_buf.num_shared = 0;
+			val_buf.usage = DMA_RESV_USAGE_WRITE;
 			res->func->unbind(res, false, &val_buf);
 		}
 		res->backup_dirty = false;
@@ -553,6 +554,7 @@ vmw_resource_check_buffer(struct ww_acquire_ctx *ticket,
 	ttm_bo_get(&res->backup->base);
 	val_buf->bo = &res->backup->base;
 	val_buf->num_shared = 0;
+	val_buf->usage = DMA_RESV_USAGE_WRITE;
 	list_add_tail(&val_buf->head, &val_list);
 	ret = ttm_eu_reserve_buffers(ticket, &val_list, interruptible, NULL);
 	if (unlikely(ret != 0))
@@ -658,6 +660,7 @@ static int vmw_resource_do_evict(struct ww_acquire_ctx *ticket,
 
 	val_buf.bo = NULL;
 	val_buf.num_shared = 0;
+	val_buf.usage = DMA_RESV_USAGE_WRITE;
 	ret = vmw_resource_check_buffer(ticket, res, interruptible, &val_buf);
 	if (unlikely(ret != 0))
 		return ret;
@@ -709,6 +712,7 @@ int vmw_resource_validate(struct vmw_resource *res, bool intr,
 
 	val_buf.bo = NULL;
 	val_buf.num_shared = 0;
+	val_buf.usage = DMA_RESV_USAGE_WRITE;
 	if (res->backup)
 		val_buf.bo = &res->backup->base;
 	do {
@@ -777,7 +781,8 @@ void vmw_resource_unbind_list(struct vmw_buffer_object *vbo)
 {
 	struct ttm_validate_buffer val_buf = {
 		.bo = &vbo->base,
-		.num_shared = 0
+		.num_shared = 0,
+		.usage = DMA_RESV_USAGE_WRITE
 	};
 
 	dma_resv_assert_held(vbo->base.base.resv);
diff --git a/drivers/gpu/drm/vmwgfx/vmwgfx_validation.c b/drivers/gpu/drm/vmwgfx/vmwgfx_validation.c
index f46891012be3..913e91962af1 100644
--- a/drivers/gpu/drm/vmwgfx/vmwgfx_validation.c
+++ b/drivers/gpu/drm/vmwgfx/vmwgfx_validation.c
@@ -289,6 +289,7 @@ int vmw_validation_add_bo(struct vmw_validation_context *ctx,
 		if (!val_buf->bo)
 			return -ESRCH;
 		val_buf->num_shared = 0;
+		val_buf->usage = DMA_RESV_USAGE_WRITE;
 		list_add_tail(&val_buf->head, &ctx->bo_list);
 		bo_node->as_mob = as_mob;
 		bo_node->cpu_blit = cpu_blit;
diff --git a/include/drm/ttm/ttm_execbuf_util.h b/include/drm/ttm/ttm_execbuf_util.h
index a99d7fdf2964..5b65f5e1354a 100644
--- a/include/drm/ttm/ttm_execbuf_util.h
+++ b/include/drm/ttm/ttm_execbuf_util.h
@@ -41,12 +41,14 @@
  * @head:           list head for thread-private list.
  * @bo:             refcounted buffer object pointer.
  * @num_shared:     How many shared fences we want to add.
+ * @usage:          dma_resv usage of the fences to add.
  */
 
 struct ttm_validate_buffer {
 	struct list_head head;
 	struct ttm_buffer_object *bo;
 	unsigned int num_shared;
+	enum dma_resv_usage usage;
 };
 
 /**
-- 
2.37.1



* [PATCH 2/6] drm/amdgpu: Add separate mode for syncing DMA_RESV_USAGE_BOOKKEEP.
  2022-08-13  1:27 [PATCH 0/6] amdgpu: Allow explicitly synchronized submissions Bas Nieuwenhuizen
  2022-08-13  1:27 ` [PATCH 1/6] drm/ttm: Add usage to ttm_validate_buffer Bas Nieuwenhuizen
@ 2022-08-13  1:27 ` Bas Nieuwenhuizen
  2022-08-13  1:27 ` [PATCH 3/6] drm/amdgpu: Allow explicit sync for VM ops Bas Nieuwenhuizen
                   ` (4 subsequent siblings)
  6 siblings, 0 replies; 14+ messages in thread
From: Bas Nieuwenhuizen @ 2022-08-13  1:27 UTC (permalink / raw)
  To: dri-devel, amd-gfx; +Cc: christian.koenig

To prep for allowing different sync modes in a follow-up patch.
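The per-fence mode selection this patch adds to amdgpu_sync_resv() boils down
to the following. This is a standalone sketch with mock enums (the real
definitions live in amdgpu_sync.h and include/linux/dma-resv.h), not the kernel
code itself:

```c
#include <assert.h>

/* Mock copies of the two kernel enums, for illustration only. */
enum dma_resv_usage { DMA_RESV_USAGE_KERNEL, DMA_RESV_USAGE_WRITE,
		      DMA_RESV_USAGE_READ, DMA_RESV_USAGE_BOOKKEEP };
enum amdgpu_sync_mode { AMDGPU_SYNC_ALWAYS, AMDGPU_SYNC_NE_OWNER,
			AMDGPU_SYNC_EQ_OWNER, AMDGPU_SYNC_EXPLICIT };

/* Per-fence choice made inside the dma_resv_for_each_fence() loop:
 * fences attached with BOOKKEEP usage get their own sync mode, while
 * all other fences keep the existing implicit-sync mode. */
static enum amdgpu_sync_mode pick_mode(enum dma_resv_usage fence_usage,
				       enum amdgpu_sync_mode implicit_mode,
				       enum amdgpu_sync_mode explicit_mode)
{
	return fence_usage >= DMA_RESV_USAGE_BOOKKEEP ? explicit_mode
						      : implicit_mode;
}
```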

Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c |  2 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c           |  2 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_object.c       | 11 +++++++----
 drivers/gpu/drm/amd/amdgpu/amdgpu_object.h       |  3 ++-
 drivers/gpu/drm/amd/amdgpu/amdgpu_sync.c         | 11 ++++++++---
 drivers/gpu/drm/amd/amdgpu/amdgpu_sync.h         |  4 ++--
 drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c          |  1 +
 drivers/gpu/drm/amd/amdgpu/amdgpu_uvd.c          |  2 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_vm_cpu.c       |  2 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_vm_sdma.c      |  2 +-
 10 files changed, 25 insertions(+), 15 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
index a6eb7697c936..746f44c1c3f9 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
@@ -1158,7 +1158,7 @@ static int process_sync_pds_resv(struct amdkfd_process_info *process_info,
 		struct amdgpu_bo *pd = peer_vm->root.bo;
 
 		ret = amdgpu_sync_resv(NULL, sync, pd->tbo.base.resv,
-				       AMDGPU_SYNC_NE_OWNER,
+				       AMDGPU_SYNC_NE_OWNER, AMDGPU_SYNC_NE_OWNER,
 				       AMDGPU_FENCE_OWNER_KFD);
 		if (ret)
 			return ret;
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
index f1ceb25d1b84..91958e9db90b 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
@@ -675,7 +675,7 @@ static int amdgpu_cs_sync_rings(struct amdgpu_cs_parser *p)
 		sync_mode = amdgpu_bo_explicit_sync(bo) ?
 			AMDGPU_SYNC_EXPLICIT : AMDGPU_SYNC_NE_OWNER;
 		r = amdgpu_sync_resv(p->adev, &p->job->sync, resv, sync_mode,
-				     &fpriv->vm);
+				     AMDGPU_SYNC_EXPLICIT, &fpriv->vm);
 		if (r)
 			return r;
 	}
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
index 2c82b1d5a0d7..20c45f502536 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
@@ -1410,7 +1410,8 @@ void amdgpu_bo_fence(struct amdgpu_bo *bo, struct dma_fence *fence,
  *
  * @adev: amdgpu device pointer
  * @resv: reservation object to sync to
- * @sync_mode: synchronization mode
+ * @implicit_sync_mode: synchronization mode for usage <= DMA_RESV_USAGE_READ
+ * @explicit_sync_mode: synchronization mode for usage DMA_RESV_USAGE_BOOKKEEP
  * @owner: fence owner
  * @intr: Whether the wait is interruptible
  *
@@ -1420,14 +1421,15 @@ void amdgpu_bo_fence(struct amdgpu_bo *bo, struct dma_fence *fence,
  * 0 on success, errno otherwise.
  */
 int amdgpu_bo_sync_wait_resv(struct amdgpu_device *adev, struct dma_resv *resv,
-			     enum amdgpu_sync_mode sync_mode, void *owner,
+			     enum amdgpu_sync_mode implicit_sync_mode,
+			     enum amdgpu_sync_mode explicit_sync_mode, void *owner,
 			     bool intr)
 {
 	struct amdgpu_sync sync;
 	int r;
 
 	amdgpu_sync_create(&sync);
-	amdgpu_sync_resv(adev, &sync, resv, sync_mode, owner);
+	amdgpu_sync_resv(adev, &sync, resv, implicit_sync_mode, explicit_sync_mode, owner);
 	r = amdgpu_sync_wait(&sync, intr);
 	amdgpu_sync_free(&sync);
 	return r;
@@ -1448,7 +1450,8 @@ int amdgpu_bo_sync_wait(struct amdgpu_bo *bo, void *owner, bool intr)
 	struct amdgpu_device *adev = amdgpu_ttm_adev(bo->tbo.bdev);
 
 	return amdgpu_bo_sync_wait_resv(adev, bo->tbo.base.resv,
-					AMDGPU_SYNC_NE_OWNER, owner, intr);
+					AMDGPU_SYNC_NE_OWNER, AMDGPU_SYNC_EXPLICIT,
+					owner, intr);
 }
 
 /**
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.h
index 147b79c10cbb..36ce9abb579c 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.h
@@ -320,7 +320,8 @@ vm_fault_t amdgpu_bo_fault_reserve_notify(struct ttm_buffer_object *bo);
 void amdgpu_bo_fence(struct amdgpu_bo *bo, struct dma_fence *fence,
 		     bool shared);
 int amdgpu_bo_sync_wait_resv(struct amdgpu_device *adev, struct dma_resv *resv,
-			     enum amdgpu_sync_mode sync_mode, void *owner,
+			     enum amdgpu_sync_mode implicit_sync_mode,
+			     enum amdgpu_sync_mode explicit_sync_mode, void *owner,
 			     bool intr);
 int amdgpu_bo_sync_wait(struct amdgpu_bo *bo, void *owner, bool intr);
 u64 amdgpu_bo_gpu_offset(struct amdgpu_bo *bo);
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_sync.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_sync.c
index 504af1b93bfa..de508cb3f6a5 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_sync.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_sync.c
@@ -225,14 +225,15 @@ static bool amdgpu_sync_test_fence(struct amdgpu_device *adev,
  * @adev: amdgpu device
  * @sync: sync object to add fences from reservation object to
  * @resv: reservation object with embedded fence
- * @mode: how owner affects which fences we sync to
+ * @implicit_mode: how owner affects which fences with usage <= DMA_RESV_USAGE_READ we sync to
+ * @explicit_mode: how owner affects which fences with usage DMA_RESV_USAGE_BOOKKEEP we sync to
  * @owner: owner of the planned job submission
  *
  * Sync to the fence
  */
 int amdgpu_sync_resv(struct amdgpu_device *adev, struct amdgpu_sync *sync,
-		     struct dma_resv *resv, enum amdgpu_sync_mode mode,
-		     void *owner)
+		     struct dma_resv *resv, enum amdgpu_sync_mode implicit_mode,
+		     enum amdgpu_sync_mode explicit_mode, void *owner)
 {
 	struct dma_resv_iter cursor;
 	struct dma_fence *f;
@@ -245,6 +246,10 @@ int amdgpu_sync_resv(struct amdgpu_device *adev, struct amdgpu_sync *sync,
 	dma_resv_for_each_fence(&cursor, resv, DMA_RESV_USAGE_BOOKKEEP, f) {
 		dma_fence_chain_for_each(f, f) {
 			struct dma_fence *tmp = dma_fence_chain_contained(f);
+			enum amdgpu_sync_mode mode = implicit_mode;
+
+			if (dma_resv_iter_usage(&cursor) >= DMA_RESV_USAGE_BOOKKEEP)
+				mode = explicit_mode;
 
 			if (amdgpu_sync_test_fence(adev, mode, owner, tmp)) {
 				r = amdgpu_sync_fence(sync, f);
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_sync.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_sync.h
index 2d5c613cda10..57a39eedff78 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_sync.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_sync.h
@@ -48,8 +48,8 @@ struct amdgpu_sync {
 void amdgpu_sync_create(struct amdgpu_sync *sync);
 int amdgpu_sync_fence(struct amdgpu_sync *sync, struct dma_fence *f);
 int amdgpu_sync_resv(struct amdgpu_device *adev, struct amdgpu_sync *sync,
-		     struct dma_resv *resv, enum amdgpu_sync_mode mode,
-		     void *owner);
+		     struct dma_resv *resv, enum amdgpu_sync_mode implicit_mode,
+		     enum amdgpu_sync_mode explicit_mode, void *owner);
 struct dma_fence *amdgpu_sync_peek_fence(struct amdgpu_sync *sync,
 				     struct amdgpu_ring *ring);
 struct dma_fence *amdgpu_sync_get_fence(struct amdgpu_sync *sync);
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
index 3b4c19412625..9d5fc6359191 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
@@ -1978,6 +1978,7 @@ static int amdgpu_ttm_prepare_job(struct amdgpu_device *adev,
 	if (resv) {
 		r = amdgpu_sync_resv(adev, &(*job)->sync, resv,
 				     AMDGPU_SYNC_ALWAYS,
+				     AMDGPU_SYNC_EXPLICIT,
 				     AMDGPU_FENCE_OWNER_UNDEFINED);
 		if (r) {
 			DRM_ERROR("sync failed (%d).\n", r);
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_uvd.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_uvd.c
index 6eac649499d3..de08bab400d5 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_uvd.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_uvd.c
@@ -1176,7 +1176,7 @@ static int amdgpu_uvd_send_msg(struct amdgpu_ring *ring, struct amdgpu_bo *bo,
 			goto err_free;
 	} else {
 		r = amdgpu_sync_resv(adev, &job->sync, bo->tbo.base.resv,
-				     AMDGPU_SYNC_ALWAYS,
+				     AMDGPU_SYNC_ALWAYS, AMDGPU_SYNC_ALWAYS,
 				     AMDGPU_FENCE_OWNER_UNDEFINED);
 		if (r)
 			goto err_free;
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm_cpu.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm_cpu.c
index 31913ae86de6..f10332e1c6c0 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm_cpu.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm_cpu.c
@@ -51,7 +51,7 @@ static int amdgpu_vm_cpu_prepare(struct amdgpu_vm_update_params *p,
 	if (!resv)
 		return 0;
 
-	return amdgpu_bo_sync_wait_resv(p->adev, resv, sync_mode, p->vm, true);
+	return amdgpu_bo_sync_wait_resv(p->adev, resv, sync_mode, sync_mode, p->vm, true);
 }
 
 /**
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm_sdma.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm_sdma.c
index 1fd3cbca20a2..6ec6217f0b0e 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm_sdma.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm_sdma.c
@@ -75,7 +75,7 @@ static int amdgpu_vm_sdma_prepare(struct amdgpu_vm_update_params *p,
 	if (!resv)
 		return 0;
 
-	return amdgpu_sync_resv(p->adev, &p->job->sync, resv, sync_mode, p->vm);
+	return amdgpu_sync_resv(p->adev, &p->job->sync, resv, sync_mode, sync_mode, p->vm);
 }
 
 /**
-- 
2.37.1



* [PATCH 3/6] drm/amdgpu: Allow explicit sync for VM ops.
  2022-08-13  1:27 [PATCH 0/6] amdgpu: Allow explicitly synchronized submissions Bas Nieuwenhuizen
  2022-08-13  1:27 ` [PATCH 1/6] drm/ttm: Add usage to ttm_validate_buffer Bas Nieuwenhuizen
  2022-08-13  1:27 ` [PATCH 2/6] drm/amdgpu: Add separate mode for syncing DMA_RESV_USAGE_BOOKKEEP Bas Nieuwenhuizen
@ 2022-08-13  1:27 ` Bas Nieuwenhuizen
  2022-08-13  1:27 ` [PATCH 4/6] drm/amdgpu: Refactor amdgpu_vm_get_pd_bo Bas Nieuwenhuizen
                   ` (3 subsequent siblings)
  6 siblings, 0 replies; 14+ messages in thread
From: Bas Nieuwenhuizen @ 2022-08-13  1:27 UTC (permalink / raw)
  To: dri-devel, amd-gfx; +Cc: christian.koenig

This should be okay because moves themselves use KERNEL usage and
hence still sync with BOOKKEEP usage. Then any later submits still
wait on any pending VM operations.

(i.e. we only made VM ops not wait on BOOKKEEP submits, not the other
 way around)
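The reason this is safe follows from the dma_resv usage ordering: iterating a
reservation with a given usage also returns every fence with a more important
(lower) usage, so a wait filtered at BOOKKEEP still sees the KERNEL fences that
moves attach. A standalone sketch of that ordering rule, assuming the ordering
in include/linux/dma-resv.h (mock enum, not kernel code):

```c
#include <assert.h>

/* Mock of enum dma_resv_usage; lower value = more important. */
enum dma_resv_usage { DMA_RESV_USAGE_KERNEL, DMA_RESV_USAGE_WRITE,
		      DMA_RESV_USAGE_READ, DMA_RESV_USAGE_BOOKKEEP };

/* dma_resv_for_each_fence(resv, filter, ...) visits every fence whose
 * usage is at least as important as the filter, i.e. usage <= filter.
 * So a BOOKKEEP iteration sees KERNEL (move) fences, but a READ
 * iteration skips BOOKKEEP (explicitly-synced submission) fences. */
static int fence_visited(enum dma_resv_usage fence_usage,
			 enum dma_resv_usage filter)
{
	return fence_usage <= filter;
}
```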

Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_vm_cpu.c  | 3 ++-
 drivers/gpu/drm/amd/amdgpu/amdgpu_vm_sdma.c | 3 ++-
 2 files changed, 4 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm_cpu.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm_cpu.c
index f10332e1c6c0..e898a549f86d 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm_cpu.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm_cpu.c
@@ -51,7 +51,8 @@ static int amdgpu_vm_cpu_prepare(struct amdgpu_vm_update_params *p,
 	if (!resv)
 		return 0;
 
-	return amdgpu_bo_sync_wait_resv(p->adev, resv, sync_mode, sync_mode, p->vm, true);
+	return amdgpu_bo_sync_wait_resv(p->adev, resv, sync_mode,
+					AMDGPU_SYNC_EXPLICIT, p->vm, true);
 }
 
 /**
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm_sdma.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm_sdma.c
index 6ec6217f0b0e..9233ea3c9404 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm_sdma.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm_sdma.c
@@ -75,7 +75,8 @@ static int amdgpu_vm_sdma_prepare(struct amdgpu_vm_update_params *p,
 	if (!resv)
 		return 0;
 
-	return amdgpu_sync_resv(p->adev, &p->job->sync, resv, sync_mode, sync_mode, p->vm);
+	return amdgpu_sync_resv(p->adev, &p->job->sync, resv, sync_mode,
+				AMDGPU_SYNC_EXPLICIT, p->vm);
 }
 
 /**
-- 
2.37.1



* [PATCH 4/6] drm/amdgpu: Refactor amdgpu_vm_get_pd_bo.
  2022-08-13  1:27 [PATCH 0/6] amdgpu: Allow explicitly synchronized submissions Bas Nieuwenhuizen
                   ` (2 preceding siblings ...)
  2022-08-13  1:27 ` [PATCH 3/6] drm/amdgpu: Allow explicit sync for VM ops Bas Nieuwenhuizen
@ 2022-08-13  1:27 ` Bas Nieuwenhuizen
  2022-08-13  1:28 ` [PATCH 5/6] drm/amdgpu: Add option to disable implicit sync for a context Bas Nieuwenhuizen
                   ` (2 subsequent siblings)
  6 siblings, 0 replies; 14+ messages in thread
From: Bas Nieuwenhuizen @ 2022-08-13  1:27 UTC (permalink / raw)
  To: dri-devel, amd-gfx; +Cc: christian.koenig

We want to take only a BOOKKEEP usage for contexts that are not
implicitly synced.
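With the usage parameter threaded through, the per-context choice that patch
5/6 enables would look roughly like the sketch below. The helper name and flag
are hypothetical, for illustration only (mock enum as in
include/linux/dma-resv.h):

```c
#include <assert.h>
#include <stdbool.h>

enum dma_resv_usage { DMA_RESV_USAGE_KERNEL, DMA_RESV_USAGE_WRITE,
		      DMA_RESV_USAGE_READ, DMA_RESV_USAGE_BOOKKEEP };

/* Hypothetical helper: a context that has opted out of implicit sync
 * takes only a BOOKKEEP usage on the page directory, so its VM
 * updates no longer feed into implicit synchronization; implicitly
 * synced contexts keep the READ usage as before. */
static enum dma_resv_usage pd_resv_usage(bool implicit_sync)
{
	return implicit_sync ? DMA_RESV_USAGE_READ : DMA_RESV_USAGE_BOOKKEEP;
}
```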

Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c | 9 +++++----
 drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c           | 2 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_csa.c          | 2 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c          | 4 ++--
 drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c          | 2 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c           | 6 ++++--
 drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h           | 3 ++-
 7 files changed, 16 insertions(+), 12 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
index 746f44c1c3f9..cc4fcc82eec1 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
@@ -923,7 +923,7 @@ static int reserve_bo_and_vm(struct kgd_mem *mem,
 	ctx->kfd_bo.tv.usage = DMA_RESV_USAGE_READ;
 	list_add(&ctx->kfd_bo.tv.head, &ctx->list);
 
-	amdgpu_vm_get_pd_bo(vm, &ctx->list, &ctx->vm_pd[0]);
+	amdgpu_vm_get_pd_bo(vm, &ctx->list, &ctx->vm_pd[0], DMA_RESV_USAGE_READ);
 
 	ret = ttm_eu_reserve_buffers(&ctx->ticket, &ctx->list,
 				     false, &ctx->duplicates);
@@ -995,7 +995,7 @@ static int reserve_bo_and_cond_vms(struct kgd_mem *mem,
 			continue;
 
 		amdgpu_vm_get_pd_bo(entry->bo_va->base.vm, &ctx->list,
-				&ctx->vm_pd[i]);
+				&ctx->vm_pd[i], DMA_RESV_USAGE_READ);
 		i++;
 	}
 
@@ -2203,7 +2203,7 @@ static int validate_invalid_user_pages(struct amdkfd_process_info *process_info)
 	list_for_each_entry(peer_vm, &process_info->vm_list_head,
 			    vm_list_node)
 		amdgpu_vm_get_pd_bo(peer_vm, &resv_list,
-				    &pd_bo_list_entries[i++]);
+				    &pd_bo_list_entries[i++], DMA_RESV_USAGE_READ);
 	/* Add the userptr_inval_list entries to resv_list */
 	list_for_each_entry(mem, &process_info->userptr_inval_list,
 			    validate_list.head) {
@@ -2399,7 +2399,8 @@ int amdgpu_amdkfd_gpuvm_restore_process_bos(void *info, struct dma_fence **ef)
 	mutex_lock(&process_info->lock);
 	list_for_each_entry(peer_vm, &process_info->vm_list_head,
 			vm_list_node)
-		amdgpu_vm_get_pd_bo(peer_vm, &ctx.list, &pd_bo_list[i++]);
+		amdgpu_vm_get_pd_bo(peer_vm, &ctx.list, &pd_bo_list[i++],
+				    DMA_RESV_USAGE_READ);
 
 	/* Reserve all BOs and page tables/directory. Add all BOs from
 	 * kfd_bo_list to ctx.list
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
index 91958e9db90b..175fc2c2feec 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
@@ -531,7 +531,7 @@ static int amdgpu_cs_parser_bos(struct amdgpu_cs_parser *p,
 	amdgpu_bo_list_get_list(p->bo_list, &p->validated);
 
 	INIT_LIST_HEAD(&duplicates);
-	amdgpu_vm_get_pd_bo(&fpriv->vm, &p->validated, &p->vm_pd);
+	amdgpu_vm_get_pd_bo(&fpriv->vm, &p->validated, &p->vm_pd, DMA_RESV_USAGE_READ);
 
 	if (p->uf_entry.tv.bo && !ttm_to_amdgpu_bo(p->uf_entry.tv.bo)->parent)
 		list_add(&p->uf_entry.tv.head, &p->validated);
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_csa.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_csa.c
index 24941ed1a5ec..0cc2c863808f 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_csa.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_csa.c
@@ -78,7 +78,7 @@ int amdgpu_map_static_csa(struct amdgpu_device *adev, struct amdgpu_vm *vm,
 	csa_tv.usage = DMA_RESV_USAGE_READ;
 
 	list_add(&csa_tv.head, &list);
-	amdgpu_vm_get_pd_bo(vm, &list, &pd);
+	amdgpu_vm_get_pd_bo(vm, &list, &pd, DMA_RESV_USAGE_READ);
 
 	r = ttm_eu_reserve_buffers(&ticket, &list, true, NULL);
 	if (r) {
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
index f8cf52eb1931..0f0e0acec691 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
@@ -211,7 +211,7 @@ static void amdgpu_gem_object_close(struct drm_gem_object *obj,
 	tv.usage = DMA_RESV_USAGE_READ;
 	list_add(&tv.head, &list);
 
-	amdgpu_vm_get_pd_bo(vm, &list, &vm_pd);
+	amdgpu_vm_get_pd_bo(vm, &list, &vm_pd, DMA_RESV_USAGE_READ);
 
 	r = ttm_eu_reserve_buffers(&ticket, &list, false, &duplicates);
 	if (r) {
@@ -747,7 +747,7 @@ int amdgpu_gem_va_ioctl(struct drm_device *dev, void *data,
 		abo = NULL;
 	}
 
-	amdgpu_vm_get_pd_bo(&fpriv->vm, &list, &vm_pd);
+	amdgpu_vm_get_pd_bo(&fpriv->vm, &list, &vm_pd, DMA_RESV_USAGE_READ);
 
 	r = ttm_eu_reserve_buffers(&ticket, &list, true, &duplicates);
 	if (r)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c
index 6b1da37c2280..852057cccc54 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c
@@ -999,7 +999,7 @@ int amdgpu_mes_ctx_map_meta_data(struct amdgpu_device *adev,
 	csa_tv.usage = DMA_RESV_USAGE_READ;
 
 	list_add(&csa_tv.head, &list);
-	amdgpu_vm_get_pd_bo(vm, &list, &pd);
+	amdgpu_vm_get_pd_bo(vm, &list, &pd, DMA_RESV_USAGE_READ);
 
 	r = ttm_eu_reserve_buffers(&ticket, &list, true, NULL);
 	if (r) {
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
index 1b5d2317b987..17cfe16a68ce 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
@@ -313,19 +313,21 @@ void amdgpu_vm_bo_base_init(struct amdgpu_vm_bo_base *base,
  * @vm: vm providing the BOs
  * @validated: head of validation list
  * @entry: entry to add
+ * @resv_usage: resv usage for the synchronization
  *
  * Add the page directory to the list of BOs to
  * validate for command submission.
  */
 void amdgpu_vm_get_pd_bo(struct amdgpu_vm *vm,
 			 struct list_head *validated,
-			 struct amdgpu_bo_list_entry *entry)
+			 struct amdgpu_bo_list_entry *entry,
+			 enum dma_resv_usage resv_usage)
 {
 	entry->priority = 0;
 	entry->tv.bo = &vm->root.bo->tbo;
 	/* Two for VM updates, one for TTM and one for the CS job */
 	entry->tv.num_shared = 4;
-	entry->tv.usage = DMA_RESV_USAGE_READ;
+	entry->tv.usage = resv_usage;
 	entry->user_pages = NULL;
 	list_add(&entry->tv.head, validated);
 }
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h
index 9ecb7f663e19..da0de4df13ef 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h
@@ -386,7 +386,8 @@ void amdgpu_vm_release_compute(struct amdgpu_device *adev, struct amdgpu_vm *vm)
 void amdgpu_vm_fini(struct amdgpu_device *adev, struct amdgpu_vm *vm);
 void amdgpu_vm_get_pd_bo(struct amdgpu_vm *vm,
 			 struct list_head *validated,
-			 struct amdgpu_bo_list_entry *entry);
+			 struct amdgpu_bo_list_entry *entry,
+			 enum dma_resv_usage resv_usage);
 bool amdgpu_vm_ready(struct amdgpu_vm *vm);
 int amdgpu_vm_validate_pt_bos(struct amdgpu_device *adev, struct amdgpu_vm *vm,
 			      int (*callback)(void *p, struct amdgpu_bo *bo),
-- 
2.37.1


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH 5/6] drm/amdgpu: Add option to disable implicit sync for a context.
  2022-08-13  1:27 [PATCH 0/6] amdgpu: Allow explicitly synchronized submissions Bas Nieuwenhuizen
                   ` (3 preceding siblings ...)
  2022-08-13  1:27 ` [PATCH 4/6] drm/amdgpu: Refactor amdgpu_vm_get_pd_bo Bas Nieuwenhuizen
@ 2022-08-13  1:28 ` Bas Nieuwenhuizen
  2022-08-13  1:28 ` [PATCH 6/6] drm/amdgpu: Bump amdgpu driver version Bas Nieuwenhuizen
  2022-08-18 13:20 ` [PATCH 0/6] amdgpu: Allow explicitly synchronized submissions Christian König
  6 siblings, 0 replies; 14+ messages in thread
From: Bas Nieuwenhuizen @ 2022-08-13  1:28 UTC (permalink / raw)
  To: dri-devel, amd-gfx; +Cc: christian.koenig

This changes all BO usages in a submit to BOOKKEEP instead of READ/WRITE,
which effectively disables implicit sync for these submits.

This is configured at a context level using the existing IOCTL.
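
For illustration, userspace could fill the ioctl arguments roughly as
follows. This is a hedged sketch: the struct below is a local stand-in
mirroring drm_amdgpu_ctx_in from include/uapi/drm/amdgpu_drm.h, the
constants mirror the uapi additions in this series (spelling as in the
patch), and the actual DRM_IOCTL_AMDGPU_CTX call on a device fd is
elided.

```c
#include <stdint.h>
#include <string.h>

/* Constants as added by this series (spelling as in the patch). */
#define AMDGPU_CTX_OP_SET_IMPLICIT_SYNC	7
#define AMDGPU_CTX_IMPICIT_SYNC_ENABLED	1

/* Local stand-in for drm_amdgpu_ctx_in from the amdgpu uapi header. */
struct amdgpu_ctx_in_sketch {
	uint32_t op;
	uint32_t flags;
	uint32_t ctx_id;
	int32_t  priority;
};

/* Build the ioctl input that disables implicit sync for context @ctx_id;
 * the caller would pass this to DRM_IOCTL_AMDGPU_CTX on the device fd. */
static struct amdgpu_ctx_in_sketch
amdgpu_ctx_disable_implicit_sync_args(uint32_t ctx_id)
{
	struct amdgpu_ctx_in_sketch in;

	memset(&in, 0, sizeof(in));
	in.op = AMDGPU_CTX_OP_SET_IMPLICIT_SYNC;
	in.ctx_id = ctx_id;
	in.flags = 0;	/* not AMDGPU_CTX_IMPICIT_SYNC_ENABLED */
	return in;
}
```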

Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c  | 13 ++++++----
 drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c | 32 +++++++++++++++++++++++--
 drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.h |  1 +
 include/uapi/drm/amdgpu_drm.h           |  3 +++
 4 files changed, 43 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
index 175fc2c2feec..5246defa4de8 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
@@ -500,6 +500,7 @@ static int amdgpu_cs_parser_bos(struct amdgpu_cs_parser *p,
 	struct amdgpu_bo *gws;
 	struct amdgpu_bo *oa;
 	int r;
+	enum dma_resv_usage resv_usage;
 
 	INIT_LIST_HEAD(&p->validated);
 
@@ -522,16 +523,19 @@ static int amdgpu_cs_parser_bos(struct amdgpu_cs_parser *p,
 
 	mutex_lock(&p->bo_list->bo_list_mutex);
 
+	resv_usage = p->ctx->disable_implicit_sync ? DMA_RESV_USAGE_BOOKKEEP :
+						     DMA_RESV_USAGE_READ;
+
 	/* One for TTM and one for the CS job */
 	amdgpu_bo_list_for_each_entry(e, p->bo_list) {
 		e->tv.num_shared = 2;
-		e->tv.usage = DMA_RESV_USAGE_READ;
+		e->tv.usage = resv_usage;
 	}
 
 	amdgpu_bo_list_get_list(p->bo_list, &p->validated);
 
 	INIT_LIST_HEAD(&duplicates);
-	amdgpu_vm_get_pd_bo(&fpriv->vm, &p->validated, &p->vm_pd, DMA_RESV_USAGE_READ);
+	amdgpu_vm_get_pd_bo(&fpriv->vm, &p->validated, &p->vm_pd, resv_usage);
 
 	if (p->uf_entry.tv.bo && !ttm_to_amdgpu_bo(p->uf_entry.tv.bo)->parent)
 		list_add(&p->uf_entry.tv.head, &p->validated);
@@ -672,7 +676,7 @@ static int amdgpu_cs_sync_rings(struct amdgpu_cs_parser *p)
 		struct dma_resv *resv = bo->tbo.base.resv;
 		enum amdgpu_sync_mode sync_mode;
 
-		sync_mode = amdgpu_bo_explicit_sync(bo) ?
+		sync_mode = (amdgpu_bo_explicit_sync(bo) || p->ctx->disable_implicit_sync) ?
 			AMDGPU_SYNC_EXPLICIT : AMDGPU_SYNC_NE_OWNER;
 		r = amdgpu_sync_resv(p->adev, &p->job->sync, resv, sync_mode,
 				     AMDGPU_SYNC_EXPLICIT, &fpriv->vm);
@@ -1287,7 +1291,8 @@ static int amdgpu_cs_submit(struct amdgpu_cs_parser *p,
 	/* Make sure all BOs are remembered as writers */
 	amdgpu_bo_list_for_each_entry(e, p->bo_list) {
 		e->tv.num_shared = 0;
-		e->tv.usage = DMA_RESV_USAGE_WRITE;
+		e->tv.usage = p->ctx->disable_implicit_sync ? DMA_RESV_USAGE_BOOKKEEP
+							    : DMA_RESV_USAGE_WRITE;
 	}
 
 	ttm_eu_fence_buffer_objects(&p->ticket, &p->validated, p->fence);
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c
index 7dc92ef36b2b..c01140a449da 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c
@@ -596,8 +596,6 @@ static int amdgpu_ctx_query2(struct amdgpu_device *adev,
 	return 0;
 }
 
-
-
 static int amdgpu_ctx_stable_pstate(struct amdgpu_device *adev,
 				    struct amdgpu_fpriv *fpriv, uint32_t id,
 				    bool set, u32 *stable_pstate)
@@ -626,6 +624,30 @@ static int amdgpu_ctx_stable_pstate(struct amdgpu_device *adev,
 	return r;
 }
 
+static int amdgpu_ctx_set_implicit_sync(struct amdgpu_device *adev,
+					struct amdgpu_fpriv *fpriv, uint32_t id,
+					bool enable)
+{
+	struct amdgpu_ctx *ctx;
+	struct amdgpu_ctx_mgr *mgr;
+
+	if (!fpriv)
+		return -EINVAL;
+
+	mgr = &fpriv->ctx_mgr;
+	mutex_lock(&mgr->lock);
+	ctx = idr_find(&mgr->ctx_handles, id);
+	if (!ctx) {
+		mutex_unlock(&mgr->lock);
+		return -EINVAL;
+	}
+
+	ctx->disable_implicit_sync = !enable;
+
+	mutex_unlock(&mgr->lock);
+	return 0;
+}
+
 int amdgpu_ctx_ioctl(struct drm_device *dev, void *data,
 		     struct drm_file *filp)
 {
@@ -674,6 +696,12 @@ int amdgpu_ctx_ioctl(struct drm_device *dev, void *data,
 			return -EINVAL;
 		r = amdgpu_ctx_stable_pstate(adev, fpriv, id, true, &stable_pstate);
 		break;
+	case AMDGPU_CTX_OP_SET_IMPLICIT_SYNC:
+		if ((args->in.flags & ~AMDGPU_CTX_IMPICIT_SYNC_ENABLED) || args->in.priority)
+			return -EINVAL;
+		r = amdgpu_ctx_set_implicit_sync(adev, fpriv, id,
+						 args->in.flags & AMDGPU_CTX_IMPICIT_SYNC_ENABLED);
+		break;
 	default:
 		return -EINVAL;
 	}
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.h
index cc7c8afff414..60149a7de4fc 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.h
@@ -58,6 +58,7 @@ struct amdgpu_ctx {
 	unsigned long			ras_counter_ce;
 	unsigned long			ras_counter_ue;
 	uint32_t			stable_pstate;
+	bool				disable_implicit_sync;
 };
 
 struct amdgpu_ctx_mgr {
diff --git a/include/uapi/drm/amdgpu_drm.h b/include/uapi/drm/amdgpu_drm.h
index 18d3246d636e..27e61466b5d0 100644
--- a/include/uapi/drm/amdgpu_drm.h
+++ b/include/uapi/drm/amdgpu_drm.h
@@ -212,6 +212,7 @@ union drm_amdgpu_bo_list {
 #define AMDGPU_CTX_OP_QUERY_STATE2	4
 #define AMDGPU_CTX_OP_GET_STABLE_PSTATE	5
 #define AMDGPU_CTX_OP_SET_STABLE_PSTATE	6
+#define AMDGPU_CTX_OP_SET_IMPLICIT_SYNC	7
 
 /* GPU reset status */
 #define AMDGPU_CTX_NO_RESET		0
@@ -252,6 +253,8 @@ union drm_amdgpu_bo_list {
 #define AMDGPU_CTX_STABLE_PSTATE_MIN_MCLK  3
 #define AMDGPU_CTX_STABLE_PSTATE_PEAK  4
 
+#define AMDGPU_CTX_IMPICIT_SYNC_ENABLED 1
+
 struct drm_amdgpu_ctx_in {
 	/** AMDGPU_CTX_OP_* */
 	__u32	op;
-- 
2.37.1


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH 6/6] drm/amdgpu: Bump amdgpu driver version.
  2022-08-13  1:27 [PATCH 0/6] amdgpu: Allow explicitly synchronized submissions Bas Nieuwenhuizen
                   ` (4 preceding siblings ...)
  2022-08-13  1:28 ` [PATCH 5/6] drm/amdgpu: Add option to disable implicit sync for a context Bas Nieuwenhuizen
@ 2022-08-13  1:28 ` Bas Nieuwenhuizen
  2022-08-18 13:20 ` [PATCH 0/6] amdgpu: Allow explicitly synchronized submissions Christian König
  6 siblings, 0 replies; 14+ messages in thread
From: Bas Nieuwenhuizen @ 2022-08-13  1:28 UTC (permalink / raw)
  To: dri-devel, amd-gfx; +Cc: christian.koenig

This allows userspace to detect the new explicit sync functionality
without having to try the ioctl.
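
A userspace driver could gate use of the new context op on this version
bump, e.g. via the major/minor reported by drmGetVersion(). A minimal
sketch, assuming 3.48 is the first version carrying the op (the libdrm
query itself is elided):

```c
#include <stdbool.h>

/* True if a kernel reporting this KMS driver version supports
 * AMDGPU_CTX_OP_SET_IMPLICIT_SYNC (first shipped in 3.48 per this patch). */
static bool amdgpu_supports_set_implicit_sync(int kms_major, int kms_minor)
{
	return kms_major > 3 || (kms_major == 3 && kms_minor >= 48);
}
```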

Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
index 8890300766a5..6d92e8846b21 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
@@ -101,9 +101,10 @@
  * - 3.45.0 - Add context ioctl stable pstate interface
  * - 3.46.0 - To enable hot plug amdgpu tests in libdrm
  * - 3.47.0 - Add AMDGPU_GEM_CREATE_DISCARDABLE and AMDGPU_VM_NOALLOC flags
+ * - 3.48.0 - Add AMDGPU_CTX_OP_SET_IMPLICIT_SYNC context operation.
  */
 #define KMS_DRIVER_MAJOR	3
-#define KMS_DRIVER_MINOR	47
+#define KMS_DRIVER_MINOR	48
 #define KMS_DRIVER_PATCHLEVEL	0
 
 int amdgpu_vram_limit;
-- 
2.37.1


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* Re: [PATCH 1/6] drm/ttm: Add usage to ttm_validate_buffer.
  2022-08-13  1:27 ` [PATCH 1/6] drm/ttm: Add usage to ttm_validate_buffer Bas Nieuwenhuizen
@ 2022-08-17 22:04   ` Felix Kuehling
  2022-08-18  0:30     ` Bas Nieuwenhuizen
  0 siblings, 1 reply; 14+ messages in thread
From: Felix Kuehling @ 2022-08-17 22:04 UTC (permalink / raw)
  To: Bas Nieuwenhuizen, dri-devel, amd-gfx; +Cc: christian.koenig

Am 2022-08-12 um 21:27 schrieb Bas Nieuwenhuizen:
> This way callsites can choose between READ/BOOKKEEP reservations.
>
> Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
> ---
>   drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c | 5 +++++
>   drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c           | 9 +++++++--
>   drivers/gpu/drm/amd/amdgpu/amdgpu_csa.c          | 1 +
>   drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c          | 8 ++++++--
>   drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c          | 1 +
>   drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c           | 1 +
>   drivers/gpu/drm/amd/amdkfd/kfd_svm.c             | 1 +
>   drivers/gpu/drm/qxl/qxl_release.c                | 1 +
>   drivers/gpu/drm/radeon/radeon_cs.c               | 2 ++
>   drivers/gpu/drm/radeon/radeon_gem.c              | 1 +
>   drivers/gpu/drm/radeon/radeon_vm.c               | 2 ++
>   drivers/gpu/drm/ttm/ttm_execbuf_util.c           | 3 +--
>   drivers/gpu/drm/vmwgfx/vmwgfx_resource.c         | 7 ++++++-
>   drivers/gpu/drm/vmwgfx/vmwgfx_validation.c       | 1 +
>   include/drm/ttm/ttm_execbuf_util.h               | 2 ++
>   15 files changed, 38 insertions(+), 7 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
> index 4608599ba6bb..a6eb7697c936 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
> @@ -775,6 +775,7 @@ static void add_kgd_mem_to_kfd_bo_list(struct kgd_mem *mem,
>   
>   	INIT_LIST_HEAD(&entry->head);
>   	entry->num_shared = 1;
> +	entry->usage = DMA_RESV_USAGE_READ;

KFD code never calls ttm_eu_fence_buffer_objects. Does it make sense to 
set this field at all in this case?

Furthermore, I remember reviewing an RFC patch series by Christian that 
replaced all the execbuf_util functions with an iterator API. Is 
Christian's work abandoned or still in progress? How will that interact 
with your patch series?

Regards,
   Felix


>   	entry->bo = &bo->tbo;
>   	mutex_lock(&process_info->lock);
>   	if (userptr)
> @@ -919,6 +920,7 @@ static int reserve_bo_and_vm(struct kgd_mem *mem,
>   	ctx->kfd_bo.priority = 0;
>   	ctx->kfd_bo.tv.bo = &bo->tbo;
>   	ctx->kfd_bo.tv.num_shared = 1;
> +	ctx->kfd_bo.tv.usage = DMA_RESV_USAGE_READ;
>   	list_add(&ctx->kfd_bo.tv.head, &ctx->list);
>   
>   	amdgpu_vm_get_pd_bo(vm, &ctx->list, &ctx->vm_pd[0]);
> @@ -982,6 +984,7 @@ static int reserve_bo_and_cond_vms(struct kgd_mem *mem,
>   	ctx->kfd_bo.priority = 0;
>   	ctx->kfd_bo.tv.bo = &bo->tbo;
>   	ctx->kfd_bo.tv.num_shared = 1;
> +	ctx->kfd_bo.tv.usage = DMA_RESV_USAGE_READ;
>   	list_add(&ctx->kfd_bo.tv.head, &ctx->list);
>   
>   	i = 0;
> @@ -2207,6 +2210,7 @@ static int validate_invalid_user_pages(struct amdkfd_process_info *process_info)
>   		list_add_tail(&mem->resv_list.head, &resv_list);
>   		mem->resv_list.bo = mem->validate_list.bo;
>   		mem->resv_list.num_shared = mem->validate_list.num_shared;
> +		mem->resv_list.usage = mem->validate_list.usage;
>   	}
>   
>   	/* Reserve all BOs and page tables for validation */
> @@ -2406,6 +2410,7 @@ int amdgpu_amdkfd_gpuvm_restore_process_bos(void *info, struct dma_fence **ef)
>   		list_add_tail(&mem->resv_list.head, &ctx.list);
>   		mem->resv_list.bo = mem->validate_list.bo;
>   		mem->resv_list.num_shared = mem->validate_list.num_shared;
> +		mem->resv_list.usage = mem->validate_list.usage;
>   	}
>   
>   	ret = ttm_eu_reserve_buffers(&ctx.ticket, &ctx.list,
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
> index d8f1335bc68f..f1ceb25d1b84 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
> @@ -57,6 +57,7 @@ static int amdgpu_cs_user_fence_chunk(struct amdgpu_cs_parser *p,
>   	p->uf_entry.tv.bo = &bo->tbo;
>   	/* One for TTM and two for the CS job */
>   	p->uf_entry.tv.num_shared = 3;
> +	p->uf_entry.tv.usage = DMA_RESV_USAGE_READ;
>   
>   	drm_gem_object_put(gobj);
>   
> @@ -522,8 +523,10 @@ static int amdgpu_cs_parser_bos(struct amdgpu_cs_parser *p,
>   	mutex_lock(&p->bo_list->bo_list_mutex);
>   
>   	/* One for TTM and one for the CS job */
> -	amdgpu_bo_list_for_each_entry(e, p->bo_list)
> +	amdgpu_bo_list_for_each_entry(e, p->bo_list) {
>   		e->tv.num_shared = 2;
> +		e->tv.usage = DMA_RESV_USAGE_READ;
> +	}
>   
>   	amdgpu_bo_list_get_list(p->bo_list, &p->validated);
>   
> @@ -1282,8 +1285,10 @@ static int amdgpu_cs_submit(struct amdgpu_cs_parser *p,
>   	amdgpu_vm_move_to_lru_tail(p->adev, &fpriv->vm);
>   
>   	/* Make sure all BOs are remembered as writers */
> -	amdgpu_bo_list_for_each_entry(e, p->bo_list)
> +	amdgpu_bo_list_for_each_entry(e, p->bo_list) {
>   		e->tv.num_shared = 0;
> +		e->tv.usage = DMA_RESV_USAGE_WRITE;
> +	}
>   
>   	ttm_eu_fence_buffer_objects(&p->ticket, &p->validated, p->fence);
>   	mutex_unlock(&p->adev->notifier_lock);
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_csa.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_csa.c
> index c6d4d41c4393..24941ed1a5ec 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_csa.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_csa.c
> @@ -75,6 +75,7 @@ int amdgpu_map_static_csa(struct amdgpu_device *adev, struct amdgpu_vm *vm,
>   	INIT_LIST_HEAD(&csa_tv.head);
>   	csa_tv.bo = &bo->tbo;
>   	csa_tv.num_shared = 1;
> +	csa_tv.usage = DMA_RESV_USAGE_READ;
>   
>   	list_add(&csa_tv.head, &list);
>   	amdgpu_vm_get_pd_bo(vm, &list, &pd);
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
> index 8ef31d687ef3..f8cf52eb1931 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
> @@ -208,6 +208,7 @@ static void amdgpu_gem_object_close(struct drm_gem_object *obj,
>   
>   	tv.bo = &bo->tbo;
>   	tv.num_shared = 2;
> +	tv.usage = DMA_RESV_USAGE_READ;
>   	list_add(&tv.head, &list);
>   
>   	amdgpu_vm_get_pd_bo(vm, &list, &vm_pd);
> @@ -733,10 +734,13 @@ int amdgpu_gem_va_ioctl(struct drm_device *dev, void *data,
>   			return -ENOENT;
>   		abo = gem_to_amdgpu_bo(gobj);
>   		tv.bo = &abo->tbo;
> -		if (abo->flags & AMDGPU_GEM_CREATE_VM_ALWAYS_VALID)
> +		if (abo->flags & AMDGPU_GEM_CREATE_VM_ALWAYS_VALID) {
>   			tv.num_shared = 1;
> -		else
> +			tv.usage = DMA_RESV_USAGE_READ;
> +		} else {
>   			tv.num_shared = 0;
> +			tv.usage = DMA_RESV_USAGE_WRITE;
> +		}
>   		list_add(&tv.head, &list);
>   	} else {
>   		gobj = NULL;
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c
> index 69a70a0aaed9..6b1da37c2280 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c
> @@ -996,6 +996,7 @@ int amdgpu_mes_ctx_map_meta_data(struct amdgpu_device *adev,
>   
>   	csa_tv.bo = &ctx_data->meta_data_obj->tbo;
>   	csa_tv.num_shared = 1;
> +	csa_tv.usage = DMA_RESV_USAGE_READ;
>   
>   	list_add(&csa_tv.head, &list);
>   	amdgpu_vm_get_pd_bo(vm, &list, &pd);
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
> index dc76d2b3ce52..1b5d2317b987 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
> @@ -325,6 +325,7 @@ void amdgpu_vm_get_pd_bo(struct amdgpu_vm *vm,
>   	entry->tv.bo = &vm->root.bo->tbo;
>   	/* Two for VM updates, one for TTM and one for the CS job */
>   	entry->tv.num_shared = 4;
> +	entry->tv.usage = DMA_RESV_USAGE_READ;
>   	entry->user_pages = NULL;
>   	list_add(&entry->tv.head, validated);
>   }
> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_svm.c b/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
> index 7b332246eda3..83531b00b29d 100644
> --- a/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
> @@ -1410,6 +1410,7 @@ static int svm_range_reserve_bos(struct svm_validate_context *ctx)
>   
>   		ctx->tv[gpuidx].bo = &vm->root.bo->tbo;
>   		ctx->tv[gpuidx].num_shared = 4;
> +		ctx->tv[gpuidx].usage = DMA_RESV_USAGE_READ;
>   		list_add(&ctx->tv[gpuidx].head, &ctx->validate_list);
>   	}
>   
> diff --git a/drivers/gpu/drm/qxl/qxl_release.c b/drivers/gpu/drm/qxl/qxl_release.c
> index 368d26da0d6a..0c6e45992604 100644
> --- a/drivers/gpu/drm/qxl/qxl_release.c
> +++ b/drivers/gpu/drm/qxl/qxl_release.c
> @@ -184,6 +184,7 @@ int qxl_release_list_add(struct qxl_release *release, struct qxl_bo *bo)
>   	qxl_bo_ref(bo);
>   	entry->tv.bo = &bo->tbo;
>   	entry->tv.num_shared = 0;
> +	entry->tv.usage = DMA_RESV_USAGE_WRITE;
>   	list_add_tail(&entry->tv.head, &release->bos);
>   	return 0;
>   }
> diff --git a/drivers/gpu/drm/radeon/radeon_cs.c b/drivers/gpu/drm/radeon/radeon_cs.c
> index 446f7bae54c4..6cc470dcf177 100644
> --- a/drivers/gpu/drm/radeon/radeon_cs.c
> +++ b/drivers/gpu/drm/radeon/radeon_cs.c
> @@ -184,6 +184,8 @@ static int radeon_cs_parser_relocs(struct radeon_cs_parser *p)
>   
>   		p->relocs[i].tv.bo = &p->relocs[i].robj->tbo;
>   		p->relocs[i].tv.num_shared = !r->write_domain;
> +		p->relocs[i].tv.usage = r->write_domain ? DMA_RESV_USAGE_WRITE :
> +							  DMA_RESV_USAGE_READ;
>   
>   		radeon_cs_buckets_add(&buckets, &p->relocs[i].tv.head,
>   				      priority);
> diff --git a/drivers/gpu/drm/radeon/radeon_gem.c b/drivers/gpu/drm/radeon/radeon_gem.c
> index 8c01a7f0e027..e7abd535bdc2 100644
> --- a/drivers/gpu/drm/radeon/radeon_gem.c
> +++ b/drivers/gpu/drm/radeon/radeon_gem.c
> @@ -636,6 +636,7 @@ static void radeon_gem_va_update_vm(struct radeon_device *rdev,
>   
>   	tv.bo = &bo_va->bo->tbo;
>   	tv.num_shared = 1;
> +	tv.usage = DMA_RESV_USAGE_READ;
>   	list_add(&tv.head, &list);
>   
>   	vm_bos = radeon_vm_get_bos(rdev, bo_va->vm, &list);
> diff --git a/drivers/gpu/drm/radeon/radeon_vm.c b/drivers/gpu/drm/radeon/radeon_vm.c
> index 987cabbf1318..72ff5347b56d 100644
> --- a/drivers/gpu/drm/radeon/radeon_vm.c
> +++ b/drivers/gpu/drm/radeon/radeon_vm.c
> @@ -144,6 +144,7 @@ struct radeon_bo_list *radeon_vm_get_bos(struct radeon_device *rdev,
>   	list[0].allowed_domains = RADEON_GEM_DOMAIN_VRAM;
>   	list[0].tv.bo = &vm->page_directory->tbo;
>   	list[0].tv.num_shared = 1;
> +	list[0].tv.usage = DMA_RESV_USAGE_READ;
>   	list[0].tiling_flags = 0;
>   	list_add(&list[0].tv.head, head);
>   
> @@ -156,6 +157,7 @@ struct radeon_bo_list *radeon_vm_get_bos(struct radeon_device *rdev,
>   		list[idx].allowed_domains = RADEON_GEM_DOMAIN_VRAM;
>   		list[idx].tv.bo = &list[idx].robj->tbo;
>   		list[idx].tv.num_shared = 1;
> +		list[idx].tv.usage = DMA_RESV_USAGE_READ;
>   		list[idx].tiling_flags = 0;
>   		list_add(&list[idx++].tv.head, head);
>   	}
> diff --git a/drivers/gpu/drm/ttm/ttm_execbuf_util.c b/drivers/gpu/drm/ttm/ttm_execbuf_util.c
> index dbee34a058df..44a6bce66cf7 100644
> --- a/drivers/gpu/drm/ttm/ttm_execbuf_util.c
> +++ b/drivers/gpu/drm/ttm/ttm_execbuf_util.c
> @@ -154,8 +154,7 @@ void ttm_eu_fence_buffer_objects(struct ww_acquire_ctx *ticket,
>   	list_for_each_entry(entry, list, head) {
>   		struct ttm_buffer_object *bo = entry->bo;
>   
> -		dma_resv_add_fence(bo->base.resv, fence, entry->num_shared ?
> -				   DMA_RESV_USAGE_READ : DMA_RESV_USAGE_WRITE);
> +		dma_resv_add_fence(bo->base.resv, fence, entry->usage);
>   		ttm_bo_move_to_lru_tail_unlocked(bo);
>   		dma_resv_unlock(bo->base.resv);
>   	}
> diff --git a/drivers/gpu/drm/vmwgfx/vmwgfx_resource.c b/drivers/gpu/drm/vmwgfx/vmwgfx_resource.c
> index a7d62a4eb47b..0de0365504d6 100644
> --- a/drivers/gpu/drm/vmwgfx/vmwgfx_resource.c
> +++ b/drivers/gpu/drm/vmwgfx/vmwgfx_resource.c
> @@ -131,6 +131,7 @@ static void vmw_resource_release(struct kref *kref)
>   
>   			val_buf.bo = bo;
>   			val_buf.num_shared = 0;
> +			val_buf.usage = DMA_RESV_USAGE_WRITE;
>   			res->func->unbind(res, false, &val_buf);
>   		}
>   		res->backup_dirty = false;
> @@ -553,6 +554,7 @@ vmw_resource_check_buffer(struct ww_acquire_ctx *ticket,
>   	ttm_bo_get(&res->backup->base);
>   	val_buf->bo = &res->backup->base;
>   	val_buf->num_shared = 0;
> +	val_buf->usage = DMA_RESV_USAGE_WRITE;
>   	list_add_tail(&val_buf->head, &val_list);
>   	ret = ttm_eu_reserve_buffers(ticket, &val_list, interruptible, NULL);
>   	if (unlikely(ret != 0))
> @@ -658,6 +660,7 @@ static int vmw_resource_do_evict(struct ww_acquire_ctx *ticket,
>   
>   	val_buf.bo = NULL;
>   	val_buf.num_shared = 0;
> +	val_buf.usage = DMA_RESV_USAGE_WRITE;
>   	ret = vmw_resource_check_buffer(ticket, res, interruptible, &val_buf);
>   	if (unlikely(ret != 0))
>   		return ret;
> @@ -709,6 +712,7 @@ int vmw_resource_validate(struct vmw_resource *res, bool intr,
>   
>   	val_buf.bo = NULL;
>   	val_buf.num_shared = 0;
> +	val_buf.usage = DMA_RESV_USAGE_WRITE;
>   	if (res->backup)
>   		val_buf.bo = &res->backup->base;
>   	do {
> @@ -777,7 +781,8 @@ void vmw_resource_unbind_list(struct vmw_buffer_object *vbo)
>   {
>   	struct ttm_validate_buffer val_buf = {
>   		.bo = &vbo->base,
> -		.num_shared = 0
> +		.num_shared = 0,
> +		.usage = DMA_RESV_USAGE_WRITE
>   	};
>   
>   	dma_resv_assert_held(vbo->base.base.resv);
> diff --git a/drivers/gpu/drm/vmwgfx/vmwgfx_validation.c b/drivers/gpu/drm/vmwgfx/vmwgfx_validation.c
> index f46891012be3..913e91962af1 100644
> --- a/drivers/gpu/drm/vmwgfx/vmwgfx_validation.c
> +++ b/drivers/gpu/drm/vmwgfx/vmwgfx_validation.c
> @@ -289,6 +289,7 @@ int vmw_validation_add_bo(struct vmw_validation_context *ctx,
>   		if (!val_buf->bo)
>   			return -ESRCH;
>   		val_buf->num_shared = 0;
> +		val_buf->usage = DMA_RESV_USAGE_WRITE;
>   		list_add_tail(&val_buf->head, &ctx->bo_list);
>   		bo_node->as_mob = as_mob;
>   		bo_node->cpu_blit = cpu_blit;
> diff --git a/include/drm/ttm/ttm_execbuf_util.h b/include/drm/ttm/ttm_execbuf_util.h
> index a99d7fdf2964..5b65f5e1354a 100644
> --- a/include/drm/ttm/ttm_execbuf_util.h
> +++ b/include/drm/ttm/ttm_execbuf_util.h
> @@ -41,12 +41,14 @@
>    * @head:           list head for thread-private list.
>    * @bo:             refcounted buffer object pointer.
>    * @num_shared:     How many shared fences we want to add.
>    * @usage:          dma resv usage of the fences to add.
>    */
>   
>   struct ttm_validate_buffer {
>   	struct list_head head;
>   	struct ttm_buffer_object *bo;
>   	unsigned int num_shared;
> +	enum dma_resv_usage usage;
>   };
>   
>   /**

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 1/6] drm/ttm: Add usage to ttm_validate_buffer.
  2022-08-17 22:04   ` Felix Kuehling
@ 2022-08-18  0:30     ` Bas Nieuwenhuizen
  2022-08-18  9:33       ` Christian König
  0 siblings, 1 reply; 14+ messages in thread
From: Bas Nieuwenhuizen @ 2022-08-18  0:30 UTC (permalink / raw)
  To: Felix Kuehling; +Cc: amd-gfx, dri-devel, christian.koenig

On Thu, Aug 18, 2022 at 12:04 AM Felix Kuehling <felix.kuehling@amd.com> wrote:
>
> Am 2022-08-12 um 21:27 schrieb Bas Nieuwenhuizen:
> > This way callsites can choose between READ/BOOKKEEP reservations.
> >
> > Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
> > ---
> >   drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c | 5 +++++
> >   drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c           | 9 +++++++--
> >   drivers/gpu/drm/amd/amdgpu/amdgpu_csa.c          | 1 +
> >   drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c          | 8 ++++++--
> >   drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c          | 1 +
> >   drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c           | 1 +
> >   drivers/gpu/drm/amd/amdkfd/kfd_svm.c             | 1 +
> >   drivers/gpu/drm/qxl/qxl_release.c                | 1 +
> >   drivers/gpu/drm/radeon/radeon_cs.c               | 2 ++
> >   drivers/gpu/drm/radeon/radeon_gem.c              | 1 +
> >   drivers/gpu/drm/radeon/radeon_vm.c               | 2 ++
> >   drivers/gpu/drm/ttm/ttm_execbuf_util.c           | 3 +--
> >   drivers/gpu/drm/vmwgfx/vmwgfx_resource.c         | 7 ++++++-
> >   drivers/gpu/drm/vmwgfx/vmwgfx_validation.c       | 1 +
> >   include/drm/ttm/ttm_execbuf_util.h               | 2 ++
> >   15 files changed, 38 insertions(+), 7 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
> > index 4608599ba6bb..a6eb7697c936 100644
> > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
> > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
> > @@ -775,6 +775,7 @@ static void add_kgd_mem_to_kfd_bo_list(struct kgd_mem *mem,
> >
> >       INIT_LIST_HEAD(&entry->head);
> >       entry->num_shared = 1;
> > +     entry->usage = DMA_RESV_USAGE_READ;
>
> KFD code never calls ttm_eu_fence_buffer_objects. Does it make sense to
> set this field at all in this case?

Okay, not super familiar with this code, just wanted to make sure that
whatever we're doing in this patch is obviously not a functional
change. I guess it isn't strictly necessary.


>
> Furthermore, I remember reviewing an RFC patch series by Christian that
> replaced all the execbuf_util functions with an iterator API. Is
> Christian's work abandoned or still in progress? How will that interact
> with your patch series?

I think instead of doing the above one would just adjust the
DMA_RESV_USAGE_WRITE references in
https://patchwork.freedesktop.org/patch/484765/?series=103522&rev=1 to
DMA_RESV_USAGE_BOOKKEEP if the submission is on a context with
disabled implicit sync. And then obviously this patch wouldn't be
necessary anymore (as well as the PD patch).
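
The selection described above amounts to something like the following
(a hedged sketch with stand-in enum names rather than the kernel's
enum dma_resv_usage definition):

```c
#include <stdbool.h>

/* Stand-ins for the relevant dma_resv usage levels. */
enum resv_usage_sketch {
	USAGE_WRITE,
	USAGE_READ,
	USAGE_BOOKKEEP,
};

/* Pick the reservation usage for fences added at submit time: contexts
 * with implicit sync disabled only bookkeep their fences, otherwise
 * writers/readers stay visible to implicit sync as before. */
static enum resv_usage_sketch
submit_resv_usage(bool disable_implicit_sync, bool is_writer)
{
	if (disable_implicit_sync)
		return USAGE_BOOKKEEP;
	return is_writer ? USAGE_WRITE : USAGE_READ;
}
```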

>
> Regards,
>    Felix
>
>
> >       entry->bo = &bo->tbo;
> >       mutex_lock(&process_info->lock);
> >       if (userptr)
> > @@ -919,6 +920,7 @@ static int reserve_bo_and_vm(struct kgd_mem *mem,
> >       ctx->kfd_bo.priority = 0;
> >       ctx->kfd_bo.tv.bo = &bo->tbo;
> >       ctx->kfd_bo.tv.num_shared = 1;
> > +     ctx->kfd_bo.tv.usage = DMA_RESV_USAGE_READ;
> >       list_add(&ctx->kfd_bo.tv.head, &ctx->list);
> >
> >       amdgpu_vm_get_pd_bo(vm, &ctx->list, &ctx->vm_pd[0]);
> > @@ -982,6 +984,7 @@ static int reserve_bo_and_cond_vms(struct kgd_mem *mem,
> >       ctx->kfd_bo.priority = 0;
> >       ctx->kfd_bo.tv.bo = &bo->tbo;
> >       ctx->kfd_bo.tv.num_shared = 1;
> > +     ctx->kfd_bo.tv.usage = DMA_RESV_USAGE_READ;
> >       list_add(&ctx->kfd_bo.tv.head, &ctx->list);
> >
> >       i = 0;
> > @@ -2207,6 +2210,7 @@ static int validate_invalid_user_pages(struct amdkfd_process_info *process_info)
> >               list_add_tail(&mem->resv_list.head, &resv_list);
> >               mem->resv_list.bo = mem->validate_list.bo;
> >               mem->resv_list.num_shared = mem->validate_list.num_shared;
> > +             mem->resv_list.usage = mem->validate_list.usage;
> >       }
> >
> >       /* Reserve all BOs and page tables for validation */
> > @@ -2406,6 +2410,7 @@ int amdgpu_amdkfd_gpuvm_restore_process_bos(void *info, struct dma_fence **ef)
> >               list_add_tail(&mem->resv_list.head, &ctx.list);
> >               mem->resv_list.bo = mem->validate_list.bo;
> >               mem->resv_list.num_shared = mem->validate_list.num_shared;
> > +             mem->resv_list.usage = mem->validate_list.usage;
> >       }
> >
> >       ret = ttm_eu_reserve_buffers(&ctx.ticket, &ctx.list,
> > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
> > index d8f1335bc68f..f1ceb25d1b84 100644
> > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
> > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
> > @@ -57,6 +57,7 @@ static int amdgpu_cs_user_fence_chunk(struct amdgpu_cs_parser *p,
> >       p->uf_entry.tv.bo = &bo->tbo;
> >       /* One for TTM and two for the CS job */
> >       p->uf_entry.tv.num_shared = 3;
> > +     p->uf_entry.tv.usage = DMA_RESV_USAGE_READ;
> >
> >       drm_gem_object_put(gobj);
> >
> > @@ -522,8 +523,10 @@ static int amdgpu_cs_parser_bos(struct amdgpu_cs_parser *p,
> >       mutex_lock(&p->bo_list->bo_list_mutex);
> >
> >       /* One for TTM and one for the CS job */
> > -     amdgpu_bo_list_for_each_entry(e, p->bo_list)
> > +     amdgpu_bo_list_for_each_entry(e, p->bo_list) {
> >               e->tv.num_shared = 2;
> > +             e->tv.usage = DMA_RESV_USAGE_READ;
> > +     }
> >
> >       amdgpu_bo_list_get_list(p->bo_list, &p->validated);
> >
> > @@ -1282,8 +1285,10 @@ static int amdgpu_cs_submit(struct amdgpu_cs_parser *p,
> >       amdgpu_vm_move_to_lru_tail(p->adev, &fpriv->vm);
> >
> >       /* Make sure all BOs are remembered as writers */
> > -     amdgpu_bo_list_for_each_entry(e, p->bo_list)
> > +     amdgpu_bo_list_for_each_entry(e, p->bo_list) {
> >               e->tv.num_shared = 0;
> > +             e->tv.usage = DMA_RESV_USAGE_WRITE;
> > +     }
> >
> >       ttm_eu_fence_buffer_objects(&p->ticket, &p->validated, p->fence);
> >       mutex_unlock(&p->adev->notifier_lock);
> > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_csa.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_csa.c
> > index c6d4d41c4393..24941ed1a5ec 100644
> > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_csa.c
> > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_csa.c
> > @@ -75,6 +75,7 @@ int amdgpu_map_static_csa(struct amdgpu_device *adev, struct amdgpu_vm *vm,
> >       INIT_LIST_HEAD(&csa_tv.head);
> >       csa_tv.bo = &bo->tbo;
> >       csa_tv.num_shared = 1;
> > +     csa_tv.usage = DMA_RESV_USAGE_READ;
> >
> >       list_add(&csa_tv.head, &list);
> >       amdgpu_vm_get_pd_bo(vm, &list, &pd);
> > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
> > index 8ef31d687ef3..f8cf52eb1931 100644
> > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
> > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
> > @@ -208,6 +208,7 @@ static void amdgpu_gem_object_close(struct drm_gem_object *obj,
> >
> >       tv.bo = &bo->tbo;
> >       tv.num_shared = 2;
> > +     tv.usage = DMA_RESV_USAGE_READ;
> >       list_add(&tv.head, &list);
> >
> >       amdgpu_vm_get_pd_bo(vm, &list, &vm_pd);
> > @@ -733,10 +734,13 @@ int amdgpu_gem_va_ioctl(struct drm_device *dev, void *data,
> >                       return -ENOENT;
> >               abo = gem_to_amdgpu_bo(gobj);
> >               tv.bo = &abo->tbo;
> > -             if (abo->flags & AMDGPU_GEM_CREATE_VM_ALWAYS_VALID)
> > +             if (abo->flags & AMDGPU_GEM_CREATE_VM_ALWAYS_VALID) {
> >                       tv.num_shared = 1;
> > -             else
> > +                     tv.usage = DMA_RESV_USAGE_READ;
> > +             } else {
> >                       tv.num_shared = 0;
> > +                     tv.usage = DMA_RESV_USAGE_WRITE;
> > +             }
> >               list_add(&tv.head, &list);
> >       } else {
> >               gobj = NULL;
> > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c
> > index 69a70a0aaed9..6b1da37c2280 100644
> > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c
> > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c
> > @@ -996,6 +996,7 @@ int amdgpu_mes_ctx_map_meta_data(struct amdgpu_device *adev,
> >
> >       csa_tv.bo = &ctx_data->meta_data_obj->tbo;
> >       csa_tv.num_shared = 1;
> > +     csa_tv.usage = DMA_RESV_USAGE_READ;
> >
> >       list_add(&csa_tv.head, &list);
> >       amdgpu_vm_get_pd_bo(vm, &list, &pd);
> > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
> > index dc76d2b3ce52..1b5d2317b987 100644
> > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
> > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
> > @@ -325,6 +325,7 @@ void amdgpu_vm_get_pd_bo(struct amdgpu_vm *vm,
> >       entry->tv.bo = &vm->root.bo->tbo;
> >       /* Two for VM updates, one for TTM and one for the CS job */
> >       entry->tv.num_shared = 4;
> > +     entry->tv.usage = DMA_RESV_USAGE_READ;
> >       entry->user_pages = NULL;
> >       list_add(&entry->tv.head, validated);
> >   }
> > diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_svm.c b/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
> > index 7b332246eda3..83531b00b29d 100644
> > --- a/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
> > +++ b/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
> > @@ -1410,6 +1410,7 @@ static int svm_range_reserve_bos(struct svm_validate_context *ctx)
> >
> >               ctx->tv[gpuidx].bo = &vm->root.bo->tbo;
> >               ctx->tv[gpuidx].num_shared = 4;
> > +             ctx->tv[gpuidx].usage = DMA_RESV_USAGE_READ;
> >               list_add(&ctx->tv[gpuidx].head, &ctx->validate_list);
> >       }
> >
> > diff --git a/drivers/gpu/drm/qxl/qxl_release.c b/drivers/gpu/drm/qxl/qxl_release.c
> > index 368d26da0d6a..0c6e45992604 100644
> > --- a/drivers/gpu/drm/qxl/qxl_release.c
> > +++ b/drivers/gpu/drm/qxl/qxl_release.c
> > @@ -184,6 +184,7 @@ int qxl_release_list_add(struct qxl_release *release, struct qxl_bo *bo)
> >       qxl_bo_ref(bo);
> >       entry->tv.bo = &bo->tbo;
> >       entry->tv.num_shared = 0;
> > +     entry->tv.usage = DMA_RESV_USAGE_WRITE;
> >       list_add_tail(&entry->tv.head, &release->bos);
> >       return 0;
> >   }
> > diff --git a/drivers/gpu/drm/radeon/radeon_cs.c b/drivers/gpu/drm/radeon/radeon_cs.c
> > index 446f7bae54c4..6cc470dcf177 100644
> > --- a/drivers/gpu/drm/radeon/radeon_cs.c
> > +++ b/drivers/gpu/drm/radeon/radeon_cs.c
> > @@ -184,6 +184,8 @@ static int radeon_cs_parser_relocs(struct radeon_cs_parser *p)
> >
> >               p->relocs[i].tv.bo = &p->relocs[i].robj->tbo;
> >               p->relocs[i].tv.num_shared = !r->write_domain;
> > +             p->relocs[i].tv.usage = r->write_domain ? DMA_RESV_USAGE_WRITE :
> > +                                                       DMA_RESV_USAGE_READ;
> >
> >               radeon_cs_buckets_add(&buckets, &p->relocs[i].tv.head,
> >                                     priority);
> > diff --git a/drivers/gpu/drm/radeon/radeon_gem.c b/drivers/gpu/drm/radeon/radeon_gem.c
> > index 8c01a7f0e027..e7abd535bdc2 100644
> > --- a/drivers/gpu/drm/radeon/radeon_gem.c
> > +++ b/drivers/gpu/drm/radeon/radeon_gem.c
> > @@ -636,6 +636,7 @@ static void radeon_gem_va_update_vm(struct radeon_device *rdev,
> >
> >       tv.bo = &bo_va->bo->tbo;
> >       tv.num_shared = 1;
> > +     tv.usage = DMA_RESV_USAGE_READ;
> >       list_add(&tv.head, &list);
> >
> >       vm_bos = radeon_vm_get_bos(rdev, bo_va->vm, &list);
> > diff --git a/drivers/gpu/drm/radeon/radeon_vm.c b/drivers/gpu/drm/radeon/radeon_vm.c
> > index 987cabbf1318..72ff5347b56d 100644
> > --- a/drivers/gpu/drm/radeon/radeon_vm.c
> > +++ b/drivers/gpu/drm/radeon/radeon_vm.c
> > @@ -144,6 +144,7 @@ struct radeon_bo_list *radeon_vm_get_bos(struct radeon_device *rdev,
> >       list[0].allowed_domains = RADEON_GEM_DOMAIN_VRAM;
> >       list[0].tv.bo = &vm->page_directory->tbo;
> >       list[0].tv.num_shared = 1;
> > +     list[0].tv.usage = DMA_RESV_USAGE_READ;
> >       list[0].tiling_flags = 0;
> >       list_add(&list[0].tv.head, head);
> >
> > @@ -156,6 +157,7 @@ struct radeon_bo_list *radeon_vm_get_bos(struct radeon_device *rdev,
> >               list[idx].allowed_domains = RADEON_GEM_DOMAIN_VRAM;
> >               list[idx].tv.bo = &list[idx].robj->tbo;
> >               list[idx].tv.num_shared = 1;
> > +             list[idx].tv.usage = DMA_RESV_USAGE_READ;
> >               list[idx].tiling_flags = 0;
> >               list_add(&list[idx++].tv.head, head);
> >       }
> > diff --git a/drivers/gpu/drm/ttm/ttm_execbuf_util.c b/drivers/gpu/drm/ttm/ttm_execbuf_util.c
> > index dbee34a058df..44a6bce66cf7 100644
> > --- a/drivers/gpu/drm/ttm/ttm_execbuf_util.c
> > +++ b/drivers/gpu/drm/ttm/ttm_execbuf_util.c
> > @@ -154,8 +154,7 @@ void ttm_eu_fence_buffer_objects(struct ww_acquire_ctx *ticket,
> >       list_for_each_entry(entry, list, head) {
> >               struct ttm_buffer_object *bo = entry->bo;
> >
> > -             dma_resv_add_fence(bo->base.resv, fence, entry->num_shared ?
> > -                                DMA_RESV_USAGE_READ : DMA_RESV_USAGE_WRITE);
> > +             dma_resv_add_fence(bo->base.resv, fence, entry->usage);
> >               ttm_bo_move_to_lru_tail_unlocked(bo);
> >               dma_resv_unlock(bo->base.resv);
> >       }
> > diff --git a/drivers/gpu/drm/vmwgfx/vmwgfx_resource.c b/drivers/gpu/drm/vmwgfx/vmwgfx_resource.c
> > index a7d62a4eb47b..0de0365504d6 100644
> > --- a/drivers/gpu/drm/vmwgfx/vmwgfx_resource.c
> > +++ b/drivers/gpu/drm/vmwgfx/vmwgfx_resource.c
> > @@ -131,6 +131,7 @@ static void vmw_resource_release(struct kref *kref)
> >
> >                       val_buf.bo = bo;
> >                       val_buf.num_shared = 0;
> > +                     val_buf.usage = DMA_RESV_USAGE_WRITE;
> >                       res->func->unbind(res, false, &val_buf);
> >               }
> >               res->backup_dirty = false;
> > @@ -553,6 +554,7 @@ vmw_resource_check_buffer(struct ww_acquire_ctx *ticket,
> >       ttm_bo_get(&res->backup->base);
> >       val_buf->bo = &res->backup->base;
> >       val_buf->num_shared = 0;
> > +     val_buf->usage = DMA_RESV_USAGE_WRITE;
> >       list_add_tail(&val_buf->head, &val_list);
> >       ret = ttm_eu_reserve_buffers(ticket, &val_list, interruptible, NULL);
> >       if (unlikely(ret != 0))
> > @@ -658,6 +660,7 @@ static int vmw_resource_do_evict(struct ww_acquire_ctx *ticket,
> >
> >       val_buf.bo = NULL;
> >       val_buf.num_shared = 0;
> > +     val_buf.usage = DMA_RESV_USAGE_WRITE;
> >       ret = vmw_resource_check_buffer(ticket, res, interruptible, &val_buf);
> >       if (unlikely(ret != 0))
> >               return ret;
> > @@ -709,6 +712,7 @@ int vmw_resource_validate(struct vmw_resource *res, bool intr,
> >
> >       val_buf.bo = NULL;
> >       val_buf.num_shared = 0;
> > +     val_buf.usage = DMA_RESV_USAGE_WRITE;
> >       if (res->backup)
> >               val_buf.bo = &res->backup->base;
> >       do {
> > @@ -777,7 +781,8 @@ void vmw_resource_unbind_list(struct vmw_buffer_object *vbo)
> >   {
> >       struct ttm_validate_buffer val_buf = {
> >               .bo = &vbo->base,
> > -             .num_shared = 0
> > +             .num_shared = 0,
> > +             .usage = DMA_RESV_USAGE_WRITE
> >       };
> >
> >       dma_resv_assert_held(vbo->base.base.resv);
> > diff --git a/drivers/gpu/drm/vmwgfx/vmwgfx_validation.c b/drivers/gpu/drm/vmwgfx/vmwgfx_validation.c
> > index f46891012be3..913e91962af1 100644
> > --- a/drivers/gpu/drm/vmwgfx/vmwgfx_validation.c
> > +++ b/drivers/gpu/drm/vmwgfx/vmwgfx_validation.c
> > @@ -289,6 +289,7 @@ int vmw_validation_add_bo(struct vmw_validation_context *ctx,
> >               if (!val_buf->bo)
> >                       return -ESRCH;
> >               val_buf->num_shared = 0;
> > +             val_buf->usage = DMA_RESV_USAGE_WRITE;
> >               list_add_tail(&val_buf->head, &ctx->bo_list);
> >               bo_node->as_mob = as_mob;
> >               bo_node->cpu_blit = cpu_blit;
> > diff --git a/include/drm/ttm/ttm_execbuf_util.h b/include/drm/ttm/ttm_execbuf_util.h
> > index a99d7fdf2964..5b65f5e1354a 100644
> > --- a/include/drm/ttm/ttm_execbuf_util.h
> > +++ b/include/drm/ttm/ttm_execbuf_util.h
> > @@ -41,12 +41,14 @@
> >    * @head:           list head for thread-private list.
> >    * @bo:             refcounted buffer object pointer.
> >    * @num_shared:     How many shared fences we want to add.
> > + * @usage:          dma_resv usage of the fences to add.
> >    */
> >
> >   struct ttm_validate_buffer {
> >       struct list_head head;
> >       struct ttm_buffer_object *bo;
> >       unsigned int num_shared;
> > +     enum dma_resv_usage usage;
> >   };
> >
> >   /**

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 1/6] drm/ttm: Add usage to ttm_validate_buffer.
  2022-08-18  0:30     ` Bas Nieuwenhuizen
@ 2022-08-18  9:33       ` Christian König
  0 siblings, 0 replies; 14+ messages in thread
From: Christian König @ 2022-08-18  9:33 UTC (permalink / raw)
  To: Bas Nieuwenhuizen, Felix Kuehling; +Cc: dri-devel, amd-gfx, christian.koenig



On 18.08.22 at 02:30, Bas Nieuwenhuizen wrote:
> On Thu, Aug 18, 2022 at 12:04 AM Felix Kuehling <felix.kuehling@amd.com> wrote:
>> Am 2022-08-12 um 21:27 schrieb Bas Nieuwenhuizen:
>>> This way callsites can choose between READ/BOOKKEEP reservations.
>>>
>>> Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
>>> ---
>>>    drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c | 5 +++++
>>>    drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c           | 9 +++++++--
>>>    drivers/gpu/drm/amd/amdgpu/amdgpu_csa.c          | 1 +
>>>    drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c          | 8 ++++++--
>>>    drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c          | 1 +
>>>    drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c           | 1 +
>>>    drivers/gpu/drm/amd/amdkfd/kfd_svm.c             | 1 +
>>>    drivers/gpu/drm/qxl/qxl_release.c                | 1 +
>>>    drivers/gpu/drm/radeon/radeon_cs.c               | 2 ++
>>>    drivers/gpu/drm/radeon/radeon_gem.c              | 1 +
>>>    drivers/gpu/drm/radeon/radeon_vm.c               | 2 ++
>>>    drivers/gpu/drm/ttm/ttm_execbuf_util.c           | 3 +--
>>>    drivers/gpu/drm/vmwgfx/vmwgfx_resource.c         | 7 ++++++-
>>>    drivers/gpu/drm/vmwgfx/vmwgfx_validation.c       | 1 +
>>>    include/drm/ttm/ttm_execbuf_util.h               | 2 ++
>>>    15 files changed, 38 insertions(+), 7 deletions(-)
>>>
>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
>>> index 4608599ba6bb..a6eb7697c936 100644
>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
>>> @@ -775,6 +775,7 @@ static void add_kgd_mem_to_kfd_bo_list(struct kgd_mem *mem,
>>>
>>>        INIT_LIST_HEAD(&entry->head);
>>>        entry->num_shared = 1;
>>> +     entry->usage = DMA_RESV_USAGE_READ;
>> KFD code never calls ttm_eu_fence_buffer_objects. Does it make sense to
>> set this field at all in this case?
> Okay, not super familiar with this code, just wanted to make sure that
> whatever we're doing in this patch is obviously not a functional
> change. I guess it isn't strictly necessary.
>
>
>> Furthermore, I remember reviewing an RFC patch series by Christian that
>> replaced all the execbuf_util functions with an iterator API. Is
>> Christian's work abandoned or still in progress? How will that interact
>> with your patch series?
> I think instead of doing the above one would just adjust the
> DMA_RESV_USAGE_WRITE references in
> https://patchwork.freedesktop.org/patch/484765/?series=103522&rev=1 to
> DMA_RESV_USAGE_BOOKKEEP if the submission is on a context with
> disabled implicit sync. And then obviously this patch wouldn't be
> necessary anymore (as well as the PD patch).

Felix is right, my series should already give you the opportunity to use
DMA_RESV_USAGE_BOOKKEEP.

I'm currently rebasing that stuff, so this patch here can be dropped 
when this series is ready.

Apart from that I'm still somewhat sure that we would mess up the VM
synchronization in case of eviction with this change here.

But I really need more time to fully look into this once more.

On the other hand the UAPI looks perfectly fine to me, so you can
probably take that as granted for the userspace implementation.

Regards,
Christian.

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 0/6] amdgpu: Allow explicitly synchronized submissions.
  2022-08-13  1:27 [PATCH 0/6] amdgpu: Allow explicitly synchronized submissions Bas Nieuwenhuizen
                   ` (5 preceding siblings ...)
  2022-08-13  1:28 ` [PATCH 6/6] drm/amdgpu: Bump amdgpu driver version Bas Nieuwenhuizen
@ 2022-08-18 13:20 ` Christian König
  2022-08-21 23:08   ` Bas Nieuwenhuizen
  6 siblings, 1 reply; 14+ messages in thread
From: Christian König @ 2022-08-18 13:20 UTC (permalink / raw)
  To: Bas Nieuwenhuizen, dri-devel, amd-gfx

Hi Bas,

I've just pushed the branch drm-exec to my fdo repository: 
https://gitlab.freedesktop.org/ckoenig/linux-drm.git

This branch contains all the gang submit patches as well as the latest 
drm-exec stuff. VCN3/4 video decoding has some issues on it, but that 
probably shouldn't bother your work.

Please rebase this work on top. It should at least make the TTM changes 
unnecessary.

Going to take a closer look into the VM sync changes now.

Regards,
Christian.

On 13.08.22 at 03:27, Bas Nieuwenhuizen wrote:
> This adds a context option to use DMA_RESV_USAGE_BOOKKEEP for userspace submissions,
> based on Christian's TTM work.
>
> Disabling implicit sync is something we've wanted in radv for a while for resolving
> some corner cases. A more immediate thing that would be solved here is avoiding a
> bunch of implicit sync on GPU map/unmap operations as well, which helps with stutter
> around sparse maps/unmaps.
>
> This has seen a significant improvement in stutter in Forza Horizon 5 and Forza
> Horizon 4. (As games that had significant issues in sparse binding related stutter).
> I've been able to pass a full vulkan-cts run on navi21 with this.
>
> Userspace code for this is available at
> https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18032 and a branch
> for the kernel code is available at
> https://github.com/BNieuwenhuizen/linux/tree/no-implicit-sync-5.19
>
> This is a follow-up on RFC series https://patchwork.freedesktop.org/series/104578/ .
>
> The main changes were:
>
> 1) Instead of replacing num_shared with usage, I'm just adding usage, since
>     num_shared was actually needed.
> 2) We now agree that DMA_RESV_USAGE_BOOKKEEP is reasonable for this purpose.
>
> Please let me know if I missed anything, especially with the change to VM updates,
> as we went back and forth a ton of times on that.
>
>
> Bas Nieuwenhuizen (6):
>    drm/ttm: Add usage to ttm_validate_buffer.
>    drm/amdgpu: Add separate mode for syncing DMA_RESV_USAGE_BOOKKEEP.
>    drm/amdgpu: Allow explicit sync for VM ops.
>    drm/amdgpu: Refactor amdgpu_vm_get_pd_bo.
>    drm/amdgpu: Add option to disable implicit sync for a context.
>    drm/amdgpu: Bump amdgpu driver version.
>
>   .../gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c  | 16 +++++++---
>   drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c        | 20 +++++++++---
>   drivers/gpu/drm/amd/amdgpu/amdgpu_csa.c       |  3 +-
>   drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c       | 32 +++++++++++++++++--
>   drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.h       |  1 +
>   drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c       |  3 +-
>   drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c       | 12 ++++---
>   drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c       |  3 +-
>   drivers/gpu/drm/amd/amdgpu/amdgpu_object.c    | 11 ++++---
>   drivers/gpu/drm/amd/amdgpu/amdgpu_object.h    |  3 +-
>   drivers/gpu/drm/amd/amdgpu/amdgpu_sync.c      | 11 +++++--
>   drivers/gpu/drm/amd/amdgpu/amdgpu_sync.h      |  4 +--
>   drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c       |  1 +
>   drivers/gpu/drm/amd/amdgpu/amdgpu_uvd.c       |  2 +-
>   drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c        |  5 ++-
>   drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h        |  3 +-
>   drivers/gpu/drm/amd/amdgpu/amdgpu_vm_cpu.c    |  3 +-
>   drivers/gpu/drm/amd/amdgpu/amdgpu_vm_sdma.c   |  3 +-
>   drivers/gpu/drm/amd/amdkfd/kfd_svm.c          |  1 +
>   drivers/gpu/drm/qxl/qxl_release.c             |  1 +
>   drivers/gpu/drm/radeon/radeon_cs.c            |  2 ++
>   drivers/gpu/drm/radeon/radeon_gem.c           |  1 +
>   drivers/gpu/drm/radeon/radeon_vm.c            |  2 ++
>   drivers/gpu/drm/ttm/ttm_execbuf_util.c        |  3 +-
>   drivers/gpu/drm/vmwgfx/vmwgfx_resource.c      |  7 +++-
>   drivers/gpu/drm/vmwgfx/vmwgfx_validation.c    |  1 +
>   include/drm/ttm/ttm_execbuf_util.h            |  2 ++
>   include/uapi/drm/amdgpu_drm.h                 |  3 ++
>   28 files changed, 122 insertions(+), 37 deletions(-)
>



* Re: [PATCH 0/6] amdgpu: Allow explicitly synchronized submissions.
  2022-08-18 13:20 ` [PATCH 0/6] amdgpu: Allow explicitly synchronized submissions Christian König
@ 2022-08-21 23:08   ` Bas Nieuwenhuizen
  2022-08-23 10:15     ` Christian König
  0 siblings, 1 reply; 14+ messages in thread
From: Bas Nieuwenhuizen @ 2022-08-21 23:08 UTC (permalink / raw)
  To: Christian König; +Cc: amd-gfx, dri-devel

On Thu, Aug 18, 2022 at 3:20 PM Christian König
<christian.koenig@amd.com> wrote:
>
> Hi Bas,
>
> I've just pushed the branch drm-exec to my fdo repository:
> https://gitlab.freedesktop.org/ckoenig/linux-drm.git
>
> This branch contains all the gang submit patches as well as the latest
> drm-exec stuff. VCN3/4 video decoding has some issues on it, but that
> probably shouldn't bother your work.

Hi Christian,

The drm-exec branch doesn't seem to be capable of running Forza
Horizon 5. First bad commit seems to be

commit 8bb3e919ce0109512f6631422f3fe52169836261
Author: Christian König <christian.koenig@amd.com>
Date:   Thu Jul 14 10:23:38 2022 +0200

   drm/amdgpu: revert "partial revert "remove ctx->lock" v2"

   This reverts commit 94f4c4965e5513ba624488f4b601d6b385635aec.

   We found that the bo_list is missing a protection for its list entries.
   Since that is fixed now this workaround can be removed again.

   Signed-off-by: Christian König <christian.koenig@amd.com>


and

https://patchwork.freedesktop.org/patch/497679/ ("drm/amdgpu: Fix
use-after-free on amdgpu_bo_list mutex")

seems to fix things at that patch, but I'm not seeing the obvious
rebase over "drm/amdgpu: cleanup and reorder amdgpu_cs.c" yet (and/or
whether further issues were introduced).


Error logs:

[  124.821691] ------------[ cut here ]------------
[  124.821696] WARNING: CPU: 3 PID: 2485 at
drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c:667
amdgpu_ttm_tt_get_user_pages+0x15c/0x190 [amdgpu]
[  124.821955] Modules linked in: uinput snd_seq_dummy snd_hrtimer
snd_seq snd_seq_device ccm algif_aead cbc des_generic libdes ecb md4
cmac algif_hash algif_skcipher af_alg bnep intel_rapl_msr
intel_rapl_common snd_soc_acp5x_mach snd_acp5x_i2s snd_acp5x_pcm_dma
edac_mce_amd kvm_amd kvm rtw88_8822ce rtw88_8822c rtw88_pci irqbypass
rapl rtw88_core pcspkr joydev mac80211 btusb s
nd_hda_codec_hdmi btrtl libarc4 snd_hda_intel btbcm btintel
snd_intel_dspcfg btmtk snd_pci_acp5x i2c_piix4 snd_soc_nau8821
snd_intel_sdw_acpi snd_rn_pci_acp3x cfg80211 bluetooth snd_soc_core
snd_hda_codec snd_acp_config snd_soc_acpi snd_pci_acp3x ecdh_generic
snd_hda_core cdc_acm mousedev snd_compress ecc rfkill snd_hwdep
ac97_bus snd_pcm_dmaengine ina2xx_adc snd_pcm kfifo_buf
spi_amd snd_timer opt3001 ina2xx snd industrialio soundcore mac_hid
acpi_cpufreq fuse ip_tables x_tables overlay ext4 crc16 mbcache jbd2
mmc_block vfat fat usbhid amdgpu drm_ttm_helper ttm agpgart drm_exec
gpu_sched i2c_algo_bit
[  124.822016]  drm_display_helper drm_kms_helper syscopyarea
sysfillrect sysimgblt fb_sys_fops drm serio_raw atkbd crct10dif_pclmul
libps2 crc32_pclmul vivaldi_fmap sdhci_pci ghash_clmulni_intel i8042
ccp cqhci sdhci aesni_intel hid_multitouch xhci_pci crypto_simd cryptd
wdat_wdt mmc_core cec sp5100_tco rng_core xhci_pci_renesas serio video
i2c_hid_acpi 8250_dw i2c_hid btrfs
blake2b_generic libcrc32c crc32c_generic crc32c_intel xor raid6_pq
dm_mirror dm_region_hash dm_log dm_mod pkcs8_key_parser crypto_user
[  124.822051] CPU: 3 PID: 2485 Comm: ForzaHorizon5.e Not tainted
5.18.0-1-neptune-00172-g067e00b76d9c #23
[  124.822054] Hardware name: Valve Jupiter/Jupiter, BIOS F7A0105 03/21/2022
[  124.822055] RIP: 0010:amdgpu_ttm_tt_get_user_pages+0x15c/0x190 [amdgpu]
[  124.822262] Code: e1 ef c0 48 c7 c7 10 4a 0c c1 e8 5f f7 3e dd eb
9c 48 c7 c6 85 0a f6 c0 bf 02 00 00 00 e8 8c 74 e2 ff 41 be f2 ff ff
ff eb 8b <0f> 0b eb f4 41 be fd ff ff ff e9 7c ff ff ff 48 83 b8 a0 00
00 00
[  124.822264] RSP: 0018:ffffa257827afb98 EFLAGS: 00010282
[  124.822267] RAX: ffff8b82240e6000 RBX: ffff8b8200a31100 RCX: 0000000000000001
[  124.822268] RDX: 0000000000000dc0 RSI: ffff8b82240e6000 RDI: ffff8b82a4c7e800
[  124.822269] RBP: ffff8b82ee809320 R08: 0000000000001000 R09: ffff8b82240e6000
[  124.822270] R10: 0000000000000006 R11: 0000000000000000 R12: ffff8b82ee6dc9c0
[  124.822272] R13: 0000000031880000 R14: 0000000000000001 R15: ffff8b823face440
[  124.822273] FS:  000000002773f640(0000) GS:ffff8b852fec0000(0000)
knlGS:000000001aba0000
[  124.822275] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  124.822276] CR2: 0000000003ff4000 CR3: 00000001f1c2e000 CR4: 0000000000350ee0
[  124.822278] Call Trace:
[  124.822281]  <TASK>
[  124.822285]  amdgpu_cs_ioctl+0x9cc/0x2070 [amdgpu]
[  124.822496]  ? amdgpu_cs_find_mapping+0x110/0x110 [amdgpu]
[  124.822701]  drm_ioctl_kernel+0xc5/0x170 [drm]
[  124.822728]  ? futex_wait+0x18f/0x260
[  124.822733]  drm_ioctl+0x229/0x400 [drm]
[  124.822757]  ? amdgpu_cs_find_mapping+0x110/0x110 [amdgpu]
[  124.822963]  amdgpu_drm_ioctl+0x4a/0x80 [amdgpu]
[  124.823165]  __x64_sys_ioctl+0x8c/0xc0
[  124.823169]  do_syscall_64+0x3a/0x80
[  124.823174]  entry_SYSCALL_64_after_hwframe+0x44/0xae
[  124.823177] RIP: 0033:0x7f5525e1059b
[  124.823180] Code: ff ff ff 85 c0 79 9b 49 c7 c4 ff ff ff ff 5b 5d
4c 89 e0 41 5c c3 66 0f 1f 84 00 00 00 00 00 f3 0f 1e fa b8 10 00 00
00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d a5 a8 0c 00 f7 d8 64 89
01 48
[  124.823182] RSP: 002b:000000002773d548 EFLAGS: 00000246 ORIG_RAX:
0000000000000010
[  124.823185] RAX: ffffffffffffffda RBX: 000000002773d5d0 RCX: 00007f5525e1059b
[  124.823186] RDX: 000000002773d5d0 RSI: 00000000c0186444 RDI: 0000000000000021
[  124.823187] RBP: 00000000c0186444 R08: 00007f54a4043c80 R09: 000000002773d590
[  124.823188] R10: 0000000000000000 R11: 0000000000000246 R12: 00007f54a4043d50
[  124.823190] R13: 0000000000000021 R14: 00007f54a4043cb0 R15: 00007f54a4043d20
[  124.823192]  </TASK>
[  124.823193] ---[ end trace 0000000000000000 ]---
[  124.823197] [drm:amdgpu_cs_ioctl [amdgpu]] *ERROR* Failed to
process the buffer list -14!
[  124.823410] ------------[ cut here ]------------
[  124.823411] refcount_t: underflow; use-after-free.
[  124.823418] WARNING: CPU: 3 PID: 2485 at lib/refcount.c:28
refcount_warn_saturate+0xa6/0xf0
[  124.823424] Modules linked in: uinput snd_seq_dummy snd_hrtimer
snd_seq snd_seq_device ccm algif_aead cbc des_generic libdes ecb md4
cmac algif_hash algif_skcipher af_alg bnep intel_rapl_msr
intel_rapl_common snd_soc_acp5x_mach snd_acp5x_i2s snd_acp5x_pcm_dma
edac_mce_amd kvm_amd kvm rtw88_8822ce rtw88_8822c rtw88_pci irqbypass
rapl rtw88_core pcspkr joydev mac80211 btusb s
nd_hda_codec_hdmi btrtl libarc4 snd_hda_intel btbcm btintel
snd_intel_dspcfg btmtk snd_pci_acp5x i2c_piix4 snd_soc_nau8821
snd_intel_sdw_acpi snd_rn_pci_acp3x cfg80211 bluetooth snd_soc_core
snd_hda_codec snd_acp_config snd_soc_acpi snd_pci_acp3x ecdh_generic
snd_hda_core cdc_acm mousedev snd_compress ecc rfkill snd_hwdep
ac97_bus snd_pcm_dmaengine ina2xx_adc snd_pcm kfifo_buf
spi_amd snd_timer opt3001 ina2xx snd industrialio soundcore mac_hid
acpi_cpufreq fuse ip_tables x_tables overlay ext4 crc16 mbcache jbd2
mmc_block vfat fat usbhid amdgpu drm_ttm_helper ttm agpgart drm_exec
gpu_sched i2c_algo_bit
[  124.823485]  drm_display_helper drm_kms_helper syscopyarea
sysfillrect sysimgblt fb_sys_fops drm serio_raw atkbd crct10dif_pclmul
libps2 crc32_pclmul vivaldi_fmap sdhci_pci ghash_clmulni_intel i8042
ccp cqhci sdhci aesni_intel hid_multitouch xhci_pci crypto_simd cryptd
wdat_wdt mmc_core cec sp5100_tco rng_core xhci_pci_renesas serio video
i2c_hid_acpi 8250_dw i2c_hid btrfs
blake2b_generic libcrc32c crc32c_generic crc32c_intel xor raid6_pq
dm_mirror dm_region_hash dm_log dm_mod pkcs8_key_parser crypto_user
[  124.823516] CPU: 3 PID: 2485 Comm: ForzaHorizon5.e Tainted: G
 W         5.18.0-1-neptune-00172-g067e00b76d9c #23
[  124.823519] Hardware name: Valve Jupiter/Jupiter, BIOS F7A0105 03/21/2022
[  124.823520] RIP: 0010:refcount_warn_saturate+0xa6/0xf0
[  124.823523] Code: 05 2d c4 6d 01 01 e8 90 68 58 00 0f 0b c3 80 3d
1d c4 6d 01 00 75 95 48 c7 c7 b8 db ba 9e c6 05 0d c4 6d 01 01 e8 71
68 58 00 <0f> 0b c3 80 3d fc c3 6d 01 00 0f 85 72 ff ff ff 48 c7 c7 10
dc ba
[  124.823524] RSP: 0018:ffffa257827afba8 EFLAGS: 00010286
[  124.823526] RAX: 0000000000000000 RBX: ffffa257827afc58 RCX: 0000000000000027
[  124.823527] RDX: ffff8b852fee0768 RSI: 0000000000000001 RDI: ffff8b852fee0760
[  124.823528] RBP: 0000000000000000 R08: 0000000000000000 R09: ffffa257827af9b8
[  124.823529] R10: 0000000000000003 R11: ffffffff9f2c5168 R12: 00000000ffffffff
[  124.823530] R13: 0000000000000018 R14: 0000000000000001 R15: ffff8b823face440
[  124.823531] FS:  000000002773f640(0000) GS:ffff8b852fec0000(0000)
knlGS:000000001aba0000
[  124.823533] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  124.823534] CR2: 0000000003ff4000 CR3: 00000001f1c2e000 CR4: 0000000000350ee0
[  124.823535] Call Trace:
[  124.823537]  <TASK>
[  124.823537]  amdgpu_cs_parser_fini+0x11e/0x160 [amdgpu]
[  124.823745]  amdgpu_cs_ioctl+0x40a/0x2070 [amdgpu]
[  124.823954]  ? amdgpu_cs_find_mapping+0x110/0x110 [amdgpu]
[  124.824159]  drm_ioctl_kernel+0xc5/0x170 [drm]
[  124.824185]  ? futex_wait+0x18f/0x260
[  124.824189]  drm_ioctl+0x229/0x400 [drm]
[  124.824213]  ? amdgpu_cs_find_mapping+0x110/0x110 [amdgpu]
[  124.824444]  amdgpu_drm_ioctl+0x4a/0x80 [amdgpu]
[  124.824651]  __x64_sys_ioctl+0x8c/0xc0
[  124.824655]  do_syscall_64+0x3a/0x80
[  124.824660]  entry_SYSCALL_64_after_hwframe+0x44/0xae
[  124.824663] RIP: 0033:0x7f5525e1059b
[  124.824665] Code: ff ff ff 85 c0 79 9b 49 c7 c4 ff ff ff ff 5b 5d
4c 89 e0 41 5c c3 66 0f 1f 84 00 00 00 00 00 f3 0f 1e fa b8 10 00 00
00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d a5 a8 0c 00 f7 d8 64 89
01 48
[  124.824667] RSP: 002b:000000002773d548 EFLAGS: 00000246 ORIG_RAX:
0000000000000010
[  124.824670] RAX: ffffffffffffffda RBX: 000000002773d5d0 RCX: 00007f5525e1059b
[  124.824671] RDX: 000000002773d5d0 RSI: 00000000c0186444 RDI: 0000000000000021
[  124.824673] RBP: 00000000c0186444 R08: 00007f54a4043c80 R09: 000000002773d590
[  124.824674] R10: 0000000000000000 R11: 0000000000000246 R12: 00007f54a4043d50
[  124.824675] R13: 0000000000000021 R14: 00007f54a4043cb0 R15: 00007f54a4043d20
[  124.824677]  </TASK>
[  124.824678] ---[ end trace 0000000000000000 ]---



>
> Please rebase this work on top. It should at least make the TTM changes
> unnecessary.
>
> Going to take a closer look into the VM sync changes now.
>
> Regards,
> Christian.
>
> [...]


* Re: [PATCH 0/6] amdgpu: Allow explicitly synchronized submissions.
  2022-08-21 23:08   ` Bas Nieuwenhuizen
@ 2022-08-23 10:15     ` Christian König
  2022-08-23 22:35       ` Bas Nieuwenhuizen
  0 siblings, 1 reply; 14+ messages in thread
From: Christian König @ 2022-08-23 10:15 UTC (permalink / raw)
  To: Bas Nieuwenhuizen; +Cc: amd-gfx, dri-devel

Hi Bas,

I've just pushed an updated drm-exec branch to fdo which should now 
include the bo_list bug fix.

Can you please test that with Forza? I'm still fighting getting a new 
kernel on my Steamdeck.

Thanks,
Christian.

Am 22.08.22 um 01:08 schrieb Bas Nieuwenhuizen:
> On Thu, Aug 18, 2022 at 3:20 PM Christian König
> <christian.koenig@amd.com> wrote:
>> Hi Bas,
>>
>> I've just pushed the branch drm-exec to my fdo repository:
>> https://gitlab.freedesktop.org/ckoenig/linux-drm.git
>>
>> This branch contains all the gang submit patches as well as the latest
>> drm-exec stuff. VCN3/4 video decoding has some issues on it, but that
>> probably shouldn't bother your work.
> Hi Christian,
>
> The drm-exec branch doesn't seem to be capable of running Forza
> Horizon 5. First bad commit seems to be
>
> commit 8bb3e919ce0109512f6631422f3fe52169836261
> Author: Christian König <christian.koenig@amd.com>
> Date:   Thu Jul 14 10:23:38 2022 +0200
>
>     drm/amdgpu: revert "partial revert "remove ctx->lock" v2"
>
>     This reverts commit 94f4c4965e5513ba624488f4b601d6b385635aec.
>
>     We found that the bo_list is missing a protection for its list entries.
>     Since that is fixed now this workaround can be removed again.
>
>     Signed-off-by: Christian König <christian.koenig@amd.com>
>
>
> and
>
> https://patchwork.freedesktop.org/patch/497679/ ("drm/amdgpu: Fix
> use-after-free on amdgpu_bo_list mutex")
>
> seems to fix things at that patch, but I'm not seeing the obvious
> rebase over "drm/amdgpu: cleanup and reorder amdgpu_cs.c" yet (and/or
> whether further issues were introduced).
>
> [...]



* Re: [PATCH 0/6] amdgpu: Allow explicitly synchronized submissions.
  2022-08-23 10:15     ` Christian König
@ 2022-08-23 22:35       ` Bas Nieuwenhuizen
  0 siblings, 0 replies; 14+ messages in thread
From: Bas Nieuwenhuizen @ 2022-08-23 22:35 UTC (permalink / raw)
  To: Christian König; +Cc: amd-gfx, dri-devel

On Tue, Aug 23, 2022 at 12:16 PM Christian König
<christian.koenig@amd.com> wrote:
>
> Hi Bas,
>
> I've just pushed an updated drm-exec branch to fdo which should now
> include the bo_list bug fix.

Still getting the same thing:

[  103.598784] ------------[ cut here ]------------
[  103.598787] WARNING: CPU: 2 PID: 2505 at
drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c:667
amdgpu_ttm_tt_get_user_pages+0x15c/0x190 [amdgpu]
[  103.599016] Modules linked in: uinput snd_seq_dummy snd_hrtimer
snd_seq snd_seq_device ccm algif_aead cbc des_generic libdes ecb md4
cmac algif_hash algif_skcipher af_alg bnep intel_rapl_msr
snd_soc_acp5x_mach snd_acp5x_i2s snd_acp5x_pcm_dma
intel_rapl_common rtw88_8822ce rtw88_8822c rtw88_pci edac_mce_amd
rtw88_core kvm_amd mac80211 kvm btusb btrtl libarc4 btbcm irqbypass
rapl btintel pcspkr btmtk joydev cfg80211 snd_hda_codec_hdmi
bluetooth i2c_piix4 snd_soc_nau8821 snd_hda_intel snd_intel_dspcfg
snd_soc_core snd_intel_sdw_acpi snd_hda_codec snd_pci_acp5x
snd_hda_core snd_rn_pci_acp3x snd_compress snd_acp_config
ac97_bus snd_soc_acpi snd_pcm_dmaengine snd_hwdep ecdh_generic
snd_pci_acp3x cdc_acm mousedev ecc rfkill snd_pcm ina2xx_adc
kfifo_buf snd_timer snd opt3001 ina2xx industrialio soundcore
acpi_cpufreq spi_amd mac_hid fuse ip_tables x_tables overlay ext4
crc16 mbcache jbd2 mmc_block vfat fat usbhid amdgpu drm_ttm_helper
ttm agpgart drm_exec gpu_sched i2c_algo_bit
[  103.599064]  drm_display_helper drm_kms_helper syscopyarea
sysfillrect sysimgblt fb_sys_fops drm serio_raw sdhci_pci atkbd libps2
cqhci vivaldi_fmap ccp crct10dif_pclmul sdhci i8042 crc32_pclmul
xhci_pci hid_multitouch ghash_clmulni_intel aesni_intel
crypto_simd cryptd wdat_wdt mmc_core cec sp5100_tco rng_core
xhci_pci_renesas serio video i2c_hid_acpi 8250_dw i2c_hid btrfs
blake2b_generic libcrc32c crc32c_generic crc32c_intel xor
raid6_pq dm_mirror dm_region_hash dm_log dm_mod pkcs8_key_parser
crypto_user
[  103.599091] CPU: 2 PID: 2505 Comm: ForzaHorizon5.e Not tainted
5.18.0-1-neptune-00172-g067e00b76d9c #25
[  103.599093] Hardware name: Valve Jupiter/Jupiter, BIOS F7A0105 03/21/2022
[  103.599095] RIP: 0010:amdgpu_ttm_tt_get_user_pages+0x15c/0x190 [amdgpu]
[  103.599232] Code: 66 ed c0 48 c7 c7 60 cb 09 c1 e8 5f 87 c1 cb eb
9c 48 c7 c6 c5 8e f3 c0 bf 02 00 00 00 e8 8c a4 ee ff 41 be f2 ff ff
ff eb 8b <0f> 0b eb f4 41 be fd ff ff ff e9 7c ff
ff ff 48 83 b8 a0 00 00 00
[  103.599234] RSP: 0018:ffffc195020afb98 EFLAGS: 00010282
[  103.599235] RAX: ffff9d840b878000 RBX: ffff9d8300b1e1c0 RCX: 0000000000000001
[  103.599236] RDX: 0000000000000dc0 RSI: ffff9d840b878000 RDI: ffff9d835bdb3000
[  103.599237] RBP: ffff9d84d2617520 R08: 0000000000001000 R09: ffff9d840b878000
[  103.599238] R10: 0000000000000005 R11: 0000000000000000 R12: ffff9d831030e240
[  103.599239] R13: 0000000032080000 R14: 0000000000000001 R15: ffff9d8313c30060
[  103.599240] FS:  000000001c86f640(0000) GS:ffff9d862fe80000(0000)
knlGS:000000001abf0000
[  103.599242] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  103.599242] CR2: 00007f0be80210e8 CR3: 000000017fa08000 CR4: 0000000000350ee0
[  103.599244] Call Trace:
[  103.599247]  <TASK>
[  103.599250]  amdgpu_cs_ioctl+0x9cc/0x2080 [amdgpu]
[  103.599390]  ? amdgpu_cs_find_mapping+0x110/0x110 [amdgpu]
[  103.599525]  drm_ioctl_kernel+0xc5/0x170 [drm]
[  103.599547]  drm_ioctl+0x229/0x400 [drm]
[  103.599563]  ? amdgpu_cs_find_mapping+0x110/0x110 [amdgpu]
[  103.599700]  amdgpu_drm_ioctl+0x4a/0x80 [amdgpu]
[  103.599832]  __x64_sys_ioctl+0x8c/0xc0
[  103.599837]  do_syscall_64+0x3a/0x80
[  103.599841]  entry_SYSCALL_64_after_hwframe+0x44/0xae
[  103.599844] RIP: 0033:0x7f0c9f59b59b
[  103.599846] Code: ff ff ff 85 c0 79 9b 49 c7 c4 ff ff ff ff 5b 5d
4c 89 e0 41 5c c3 66 0f 1f 84 00 00 00 00 00 f3 0f 1e fa b8 10 00 00
00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d a5
a8 0c 00 f7 d8 64 89 01 48
[  103.599847] RSP: 002b:000000001c86d498 EFLAGS: 00000246 ORIG_RAX:
0000000000000010
[  103.599849] RAX: ffffffffffffffda RBX: 000000001c86d510 RCX: 00007f0c9f59b59b
[  103.599850] RDX: 000000001c86d510 RSI: 00000000c0186444 RDI: 0000000000000021
[  103.599850] RBP: 00000000c0186444 R08: 00007f0c3ad55840 R09: 000000001c86d4e0
[  103.599851] R10: 0000000000000000 R11: 0000000000000246 R12: 00007f0c3a19b200
[  103.599852] R13: 0000000000000021 R14: 00007f0c3aff8cf0 R15: 000000001c86d6e0
[  103.599854]  </TASK>
[  103.599854] ---[ end trace 0000000000000000 ]---
[  103.599856] [drm:amdgpu_cs_ioctl [amdgpu]] *ERROR* Failed to
process the buffer list -14!
[  103.599995] ------------[ cut here ]------------
[  103.599995] refcount_t: underflow; use-after-free.
[  103.600002] WARNING: CPU: 2 PID: 2505 at lib/refcount.c:28
refcount_warn_saturate+0xa6/0xf0
[  103.600006] Modules linked in: uinput snd_seq_dummy snd_hrtimer
snd_seq snd_seq_device ccm algif_aead cbc des_generic libdes ecb md4
cmac algif_hash algif_skcipher af_alg bnep intel_rapl_msr
snd_soc_acp5x_mach snd_acp5x_i2s snd_acp5x_pcm_dma
intel_rapl_common rtw88_8822ce rtw88_8822c rtw88_pci edac_mce_amd
rtw88_core kvm_amd mac80211 kvm btusb btrtl libarc4 btbcm irqbypass
rapl btintel pcspkr btmtk joydev cfg80211 snd_hda_codec_hdmi
bluetooth i2c_piix4 snd_soc_nau8821 snd_hda_intel snd_intel_dspcfg
snd_soc_core snd_intel_sdw_acpi snd_hda_codec snd_pci_acp5x
snd_hda_core snd_rn_pci_acp3x snd_compress snd_acp_config
ac97_bus snd_soc_acpi snd_pcm_dmaengine snd_hwdep ecdh_generic
snd_pci_acp3x cdc_acm mousedev ecc rfkill snd_pcm ina2xx_adc
kfifo_buf snd_timer snd opt3001 ina2xx industrialio soundcore
acpi_cpufreq spi_amd mac_hid fuse ip_tables x_tables overlay ext4
crc16 mbcache jbd2 mmc_block vfat fat usbhid amdgpu drm_ttm_helper
ttm agpgart drm_exec gpu_sched i2c_algo_bit
[  103.600043]  drm_display_helper drm_kms_helper syscopyarea
sysfillrect sysimgblt fb_sys_fops drm serio_raw sdhci_pci atkbd libps2
cqhci vivaldi_fmap ccp crct10dif_pclmul sdhci i8042 crc32_pclmul
xhci_pci hid_multitouch ghash_clmulni_intel aesni_intel
crypto_simd cryptd wdat_wdt mmc_core cec sp5100_tco rng_core
xhci_pci_renesas serio video i2c_hid_acpi 8250_dw i2c_hid btrfs
blake2b_generic libcrc32c crc32c_generic crc32c_intel xor
raid6_pq dm_mirror dm_region_hash dm_log dm_mod pkcs8_key_parser
crypto_user
[  103.600062] CPU: 2 PID: 2505 Comm: ForzaHorizon5.e Tainted: G
 W         5.18.0-1-neptune-00172-g067e00b76d9c #25
[  103.600064] Hardware name: Valve Jupiter/Jupiter, BIOS F7A0105 03/21/2022
[  103.600065] RIP: 0010:refcount_warn_saturate+0xa6/0xf0
[  103.600066] Code: 05 2d c4 6d 01 01 e8 90 68 58 00 0f 0b c3 80 3d
1d c4 6d 01 00 75 95 48 c7 c7 b8 db 3a 8d c6 05 0d c4 6d 01 01 e8 71
68 58 00 <0f> 0b c3 80 3d fc c3 6d 01 00 0f 85 72
ff ff ff 48 c7 c7 10 dc 3a
[  103.600068] RSP: 0018:ffffc195020afba8 EFLAGS: 00010286
[  103.600069] RAX: 0000000000000000 RBX: ffffc195020afc58 RCX: 0000000000000027
[  103.600070] RDX: ffff9d862fea0768 RSI: 0000000000000001 RDI: ffff9d862fea0760
[  103.600070] RBP: 0000000000000000 R08: 0000000000000000 R09: ffffc195020af9b8
[  103.600071] R10: 0000000000000003 R11: ffffffff8dac5168 R12: 00000000ffffffff
[  103.600072] R13: 0000000000000018 R14: 0000000000000001 R15: ffff9d8313c30060
[  103.600073] FS:  000000001c86f640(0000) GS:ffff9d862fe80000(0000)
knlGS:000000001abf0000
[  103.600074] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  103.600075] CR2: 00007f0be80210e8 CR3: 000000017fa08000 CR4: 0000000000350ee0
[  103.600076] Call Trace:
[  103.600076]  <TASK>
[  103.600077]  amdgpu_cs_parser_fini+0x11e/0x160 [amdgpu]
[  103.600213]  amdgpu_cs_ioctl+0x40a/0x2080 [amdgpu]
[  103.600350]  ? amdgpu_cs_find_mapping+0x110/0x110 [amdgpu]
[  103.600485]  drm_ioctl_kernel+0xc5/0x170 [drm]
[  103.600502]  drm_ioctl+0x229/0x400 [drm]
[  103.600518]  ? amdgpu_cs_find_mapping+0x110/0x110 [amdgpu]
[  103.600654]  amdgpu_drm_ioctl+0x4a/0x80 [amdgpu]
[  103.600787]  __x64_sys_ioctl+0x8c/0xc0
[  103.600788]  do_syscall_64+0x3a/0x80
[  103.600791]  entry_SYSCALL_64_after_hwframe+0x44/0xae
[  103.600792] RIP: 0033:0x7f0c9f59b59b
[  103.600793] Code: ff ff ff 85 c0 79 9b 49 c7 c4 ff ff ff ff 5b 5d
4c 89 e0 41 5c c3 66 0f 1f 84 00 00 00 00 00 f3 0f 1e fa b8 10 00 00
00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d a5
a8 0c 00 f7 d8 64 89 01 48
[  103.600794] RSP: 002b:000000001c86d498 EFLAGS: 00000246 ORIG_RAX:
0000000000000010
[  103.600795] RAX: ffffffffffffffda RBX: 000000001c86d510 RCX: 00007f0c9f59b59b
[  103.600796] RDX: 000000001c86d510 RSI: 00000000c0186444 RDI: 0000000000000021
[  103.600797] RBP: 00000000c0186444 R08: 00007f0c3ad55840 R09: 000000001c86d4e0
[  103.600798] R10: 0000000000000000 R11: 0000000000000246 R12: 00007f0c3a19b200
[  103.600798] R13: 0000000000000021 R14: 00007f0c3aff8cf0 R15: 000000001c86d6e0
[  103.600800]  </TASK>
[  103.600800] ---[ end trace 0000000000000000 ]---


>
> Can you please test that with Forza? I'm still fighting getting a new
> kernel on my Steamdeck.
>
> Thanks,
> Christian.
>
> Am 22.08.22 um 01:08 schrieb Bas Nieuwenhuizen:
> > On Thu, Aug 18, 2022 at 3:20 PM Christian König
> > <christian.koenig@amd.com> wrote:
> >> Hi Bas,
> >>
> >> I've just pushed the branch drm-exec to my fdo repository:
> >> https://gitlab.freedesktop.org/ckoenig/linux-drm.git
> >>
> >> This branch contains all the gang submit patches as well as the latest
> >> drm-exec stuff. VCN3/4 video decoding has some issues on it, but that
> >> probably shouldn't bother your work.
> > Hi Christian,
> >
> > The drm-exec branch doesn't seem to be capable of running Forza
> > Horizon 5. First bad commit seems to be
> >
> > commit 8bb3e919ce0109512f6631422f3fe52169836261
> > Author: Christian König <christian.koenig@amd.com>
> > Date:   Thu Jul 14 10:23:38 2022 +0200
> >
> >     drm/amdgpu: revert "partial revert "remove ctx->lock" v2"
> >
> >     This reverts commit 94f4c4965e5513ba624488f4b601d6b385635aec.
> >
> >     We found that the bo_list is missing a protection for its list entries.
> >     Since that is fixed now this workaround can be removed again.
> >
> >     Signed-off-by: Christian König <christian.koenig@amd.com>
> >
> >
> > and
> >
> > https://patchwork.freedesktop.org/patch/497679/ ("drm/amdgpu: Fix
> > use-after-free on amdgpu_bo_list mutex")
> >
> > seems to fix things at that patch, but I'm not seeing the obvious
> > rebase over "drm/amdgpu: cleanup and reorder amdgpu_cs.c" yet (and/or
> > whether further issues were introduced).
> >
> >
> > Error logs:
> >
> > [  124.821691] ------------[ cut here ]------------
> > [  124.821696] WARNING: CPU: 3 PID: 2485 at
> > drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c:667
> > amdgpu_ttm_tt_get_user_pages+0x15c/0x190 [amdgpu]
> > [  124.821955] Modules linked in: uinput snd_seq_dummy snd_hrtimer
> > snd_seq snd_seq_device ccm algif_aead cbc des_generic libdes ecb md4
> > cmac algif_hash algif_skcipher af_alg bnep intel_rapl_msr
> > intel_rapl_common snd_soc_acp5x_mach snd_acp5x_i2s snd_acp5x_pcm_dma
> > edac_mce_amd kvm_amd kvm rtw88_8822ce rtw88_8822c rtw88_pci irqbypass
> > rapl rtw88_core pcspkr joydev mac80211 btusb
> > snd_hda_codec_hdmi btrtl libarc4 snd_hda_intel btbcm btintel
> > snd_intel_dspcfg btmtk snd_pci_acp5x i2c_piix4 snd_soc_nau8821
> > snd_intel_sdw_acpi snd_rn_pci_acp3x cfg80211 bluetooth snd_soc_core
> > snd_hda_codec snd_acp_config snd_soc_acpi snd_pci_acp3x ecdh_generic
> > snd_hda_core cdc_acm mousedev snd_compress ecc rfkill snd_hwdep
> > ac97_bus snd_pcm_dmaengine ina2xx_adc snd_pcm kfifo_buf
> > spi_amd snd_timer opt3001 ina2xx snd industrialio soundcore mac_hid
> > acpi_cpufreq fuse ip_tables x_tables overlay ext4 crc16 mbcache jbd2
> > mmc_block vfat fat usbhid amdgpu drm_ttm_helper ttm agpgart drm_exec
> > gpu_sched i2c_algo_bit
> > [  124.822016]  drm_display_helper drm_kms_helper syscopyarea
> > sysfillrect sysimgblt fb_sys_fops drm serio_raw atkbd crct10dif_pclmul
> > libps2 crc32_pclmul vivaldi_fmap sdhci_pci ghash_clmulni_intel i8042
> > ccp cqhci sdhci aesni_intel hid_multitouch xhci_pci crypto_simd cryptd
> > wdat_wdt mmc_core cec sp5100_tco rng_core xhci_pci_renesas serio video
> > i2c_hid_acpi 8250_dw i2c_hid btrfs
> > blake2b_generic libcrc32c crc32c_generic crc32c_intel xor raid6_pq
> > dm_mirror dm_region_hash dm_log dm_mod pkcs8_key_parser crypto_user
> > [  124.822051] CPU: 3 PID: 2485 Comm: ForzaHorizon5.e Not tainted
> > 5.18.0-1-neptune-00172-g067e00b76d9c #23
> > [  124.822054] Hardware name: Valve Jupiter/Jupiter, BIOS F7A0105 03/21/2022
> > [  124.822055] RIP: 0010:amdgpu_ttm_tt_get_user_pages+0x15c/0x190 [amdgpu]
> > [  124.822262] Code: e1 ef c0 48 c7 c7 10 4a 0c c1 e8 5f f7 3e dd eb
> > 9c 48 c7 c6 85 0a f6 c0 bf 02 00 00 00 e8 8c 74 e2 ff 41 be f2 ff ff
> > ff eb 8b <0f> 0b eb f4 41 be fd ff ff ff e9 7c ff ff ff 48 83 b8 a0 00
> > 00 00
> > [  124.822264] RSP: 0018:ffffa257827afb98 EFLAGS: 00010282
> > [  124.822267] RAX: ffff8b82240e6000 RBX: ffff8b8200a31100 RCX: 0000000000000001
> > [  124.822268] RDX: 0000000000000dc0 RSI: ffff8b82240e6000 RDI: ffff8b82a4c7e800
> > [  124.822269] RBP: ffff8b82ee809320 R08: 0000000000001000 R09: ffff8b82240e6000
> > [  124.822270] R10: 0000000000000006 R11: 0000000000000000 R12: ffff8b82ee6dc9c0
> > [  124.822272] R13: 0000000031880000 R14: 0000000000000001 R15: ffff8b823face440
> > [  124.822273] FS:  000000002773f640(0000) GS:ffff8b852fec0000(0000)
> > knlGS:000000001aba0000
> > [  124.822275] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > [  124.822276] CR2: 0000000003ff4000 CR3: 00000001f1c2e000 CR4: 0000000000350ee0
> > [  124.822278] Call Trace:
> > [  124.822281]  <TASK>
> > [  124.822285]  amdgpu_cs_ioctl+0x9cc/0x2070 [amdgpu]
> > [  124.822496]  ? amdgpu_cs_find_mapping+0x110/0x110 [amdgpu]
> > [  124.822701]  drm_ioctl_kernel+0xc5/0x170 [drm]
> > [  124.822728]  ? futex_wait+0x18f/0x260
> > [  124.822733]  drm_ioctl+0x229/0x400 [drm]
> > [  124.822757]  ? amdgpu_cs_find_mapping+0x110/0x110 [amdgpu]
> > [  124.822963]  amdgpu_drm_ioctl+0x4a/0x80 [amdgpu]
> > [  124.823165]  __x64_sys_ioctl+0x8c/0xc0
> > [  124.823169]  do_syscall_64+0x3a/0x80
> > [  124.823174]  entry_SYSCALL_64_after_hwframe+0x44/0xae
> > [  124.823177] RIP: 0033:0x7f5525e1059b
> > [  124.823180] Code: ff ff ff 85 c0 79 9b 49 c7 c4 ff ff ff ff 5b 5d
> > 4c 89 e0 41 5c c3 66 0f 1f 84 00 00 00 00 00 f3 0f 1e fa b8 10 00 00
> > 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d a5 a8 0c 00 f7 d8 64 89
> > 01 48
> > [  124.823182] RSP: 002b:000000002773d548 EFLAGS: 00000246 ORIG_RAX:
> > 0000000000000010
> > [  124.823185] RAX: ffffffffffffffda RBX: 000000002773d5d0 RCX: 00007f5525e1059b
> > [  124.823186] RDX: 000000002773d5d0 RSI: 00000000c0186444 RDI: 0000000000000021
> > [  124.823187] RBP: 00000000c0186444 R08: 00007f54a4043c80 R09: 000000002773d590
> > [  124.823188] R10: 0000000000000000 R11: 0000000000000246 R12: 00007f54a4043d50
> > [  124.823190] R13: 0000000000000021 R14: 00007f54a4043cb0 R15: 00007f54a4043d20
> > [  124.823192]  </TASK>
> > [  124.823193] ---[ end trace 0000000000000000 ]---
> > [  124.823197] [drm:amdgpu_cs_ioctl [amdgpu]] *ERROR* Failed to
> > process the buffer list -14!
> > [  124.823410] ------------[ cut here ]------------
> > [  124.823411] refcount_t: underflow; use-after-free.
> > [  124.823418] WARNING: CPU: 3 PID: 2485 at lib/refcount.c:28
> > refcount_warn_saturate+0xa6/0xf0
> > [  124.823424] Modules linked in: uinput snd_seq_dummy snd_hrtimer
> > snd_seq snd_seq_device ccm algif_aead cbc des_generic libdes ecb md4
> > cmac algif_hash algif_skcipher af_alg bnep intel_rapl_msr
> > intel_rapl_common snd_soc_acp5x_mach snd_acp5x_i2s snd_acp5x_pcm_dma
> > edac_mce_amd kvm_amd kvm rtw88_8822ce rtw88_8822c rtw88_pci irqbypass
> > rapl rtw88_core pcspkr joydev mac80211 btusb
> > snd_hda_codec_hdmi btrtl libarc4 snd_hda_intel btbcm btintel
> > snd_intel_dspcfg btmtk snd_pci_acp5x i2c_piix4 snd_soc_nau8821
> > snd_intel_sdw_acpi snd_rn_pci_acp3x cfg80211 bluetooth snd_soc_core
> > snd_hda_codec snd_acp_config snd_soc_acpi snd_pci_acp3x ecdh_generic
> > snd_hda_core cdc_acm mousedev snd_compress ecc rfkill snd_hwdep
> > ac97_bus snd_pcm_dmaengine ina2xx_adc snd_pcm kfifo_buf
> > spi_amd snd_timer opt3001 ina2xx snd industrialio soundcore mac_hid
> > acpi_cpufreq fuse ip_tables x_tables overlay ext4 crc16 mbcache jbd2
> > mmc_block vfat fat usbhid amdgpu drm_ttm_helper ttm agpgart drm_exec
> > gpu_sched i2c_algo_bit
> > [  124.823485]  drm_display_helper drm_kms_helper syscopyarea
> > sysfillrect sysimgblt fb_sys_fops drm serio_raw atkbd crct10dif_pclmul
> > libps2 crc32_pclmul vivaldi_fmap sdhci_pci ghash_clmulni_intel i8042
> > ccp cqhci sdhci aesni_intel hid_multitouch xhci_pci crypto_simd cryptd
> > wdat_wdt mmc_core cec sp5100_tco rng_core xhci_pci_renesas serio video
> > i2c_hid_acpi 8250_dw i2c_hid btrfs
> > blake2b_generic libcrc32c crc32c_generic crc32c_intel xor raid6_pq
> > dm_mirror dm_region_hash dm_log dm_mod pkcs8_key_parser crypto_user
> > [  124.823516] CPU: 3 PID: 2485 Comm: ForzaHorizon5.e Tainted: G
> >   W         5.18.0-1-neptune-00172-g067e00b76d9c #23
> > [  124.823519] Hardware name: Valve Jupiter/Jupiter, BIOS F7A0105 03/21/2022
> > [  124.823520] RIP: 0010:refcount_warn_saturate+0xa6/0xf0
> > [  124.823523] Code: 05 2d c4 6d 01 01 e8 90 68 58 00 0f 0b c3 80 3d
> > 1d c4 6d 01 00 75 95 48 c7 c7 b8 db ba 9e c6 05 0d c4 6d 01 01 e8 71
> > 68 58 00 <0f> 0b c3 80 3d fc c3 6d 01 00 0f 85 72 ff ff ff 48 c7 c7 10
> > dc ba
> > [  124.823524] RSP: 0018:ffffa257827afba8 EFLAGS: 00010286
> > [  124.823526] RAX: 0000000000000000 RBX: ffffa257827afc58 RCX: 0000000000000027
> > [  124.823527] RDX: ffff8b852fee0768 RSI: 0000000000000001 RDI: ffff8b852fee0760
> > [  124.823528] RBP: 0000000000000000 R08: 0000000000000000 R09: ffffa257827af9b8
> > [  124.823529] R10: 0000000000000003 R11: ffffffff9f2c5168 R12: 00000000ffffffff
> > [  124.823530] R13: 0000000000000018 R14: 0000000000000001 R15: ffff8b823face440
> > [  124.823531] FS:  000000002773f640(0000) GS:ffff8b852fec0000(0000)
> > knlGS:000000001aba0000
> > [  124.823533] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > [  124.823534] CR2: 0000000003ff4000 CR3: 00000001f1c2e000 CR4: 0000000000350ee0
> > [  124.823535] Call Trace:
> > [  124.823537]  <TASK>
> > [  124.823537]  amdgpu_cs_parser_fini+0x11e/0x160 [amdgpu]
> > [  124.823745]  amdgpu_cs_ioctl+0x40a/0x2070 [amdgpu]
> > [  124.823954]  ? amdgpu_cs_find_mapping+0x110/0x110 [amdgpu]
> > [  124.824159]  drm_ioctl_kernel+0xc5/0x170 [drm]
> > [  124.824185]  ? futex_wait+0x18f/0x260
> > [  124.824189]  drm_ioctl+0x229/0x400 [drm]
> > [  124.824213]  ? amdgpu_cs_find_mapping+0x110/0x110 [amdgpu]
> > [  124.824444]  amdgpu_drm_ioctl+0x4a/0x80 [amdgpu]
> > [  124.824651]  __x64_sys_ioctl+0x8c/0xc0
> > [  124.824655]  do_syscall_64+0x3a/0x80
> > [  124.824660]  entry_SYSCALL_64_after_hwframe+0x44/0xae
> > [  124.824663] RIP: 0033:0x7f5525e1059b
> > [  124.824665] Code: ff ff ff 85 c0 79 9b 49 c7 c4 ff ff ff ff 5b 5d
> > 4c 89 e0 41 5c c3 66 0f 1f 84 00 00 00 00 00 f3 0f 1e fa b8 10 00 00
> > 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d a5 a8 0c 00 f7 d8 64 89
> > 01 48
> > [  124.824667] RSP: 002b:000000002773d548 EFLAGS: 00000246 ORIG_RAX:
> > 0000000000000010
> > [  124.824670] RAX: ffffffffffffffda RBX: 000000002773d5d0 RCX: 00007f5525e1059b
> > [  124.824671] RDX: 000000002773d5d0 RSI: 00000000c0186444 RDI: 0000000000000021
> > [  124.824673] RBP: 00000000c0186444 R08: 00007f54a4043c80 R09: 000000002773d590
> > [  124.824674] R10: 0000000000000000 R11: 0000000000000246 R12: 00007f54a4043d50
> > [  124.824675] R13: 0000000000000021 R14: 00007f54a4043cb0 R15: 00007f54a4043d20
> > [  124.824677]  </TASK>
> > [  124.824678] ---[ end trace 0000000000000000 ]---
> >
> >
> >
> >> Please rebase this work on top. It should at least make the TTM changes
> >> unnecessary.
> >>
> >> Going to take a closer look into the VM sync changes now.
> >>
> >> Regards,
> >> Christian.
> >>
> >> Am 13.08.22 um 03:27 schrieb Bas Nieuwenhuizen:
> >>> This adds a context option to use DMA_RESV_USAGE_BOOKKEEP for userspace submissions,
> >>> based on Christian's TTM work.
> >>>
> >>> Disabling implicit sync is something we've wanted in radv for a while for resolving
> >>> some corner cases. A more immediate thing that would be solved here is avoiding a
> >>> bunch of implicit sync on GPU map/unmap operations as well, which helps with stutter
> >>> around sparse maps/unmaps.
> >>>
> >>> This gives a significant improvement in stutter in Forza Horizon 5 and Forza
> >>> Horizon 4, games that previously had significant sparse-binding-related stutter.
> >>> I've been able to pass a full vulkan-cts run on navi21 with this.
> >>>
> >>> Userspace code for this is available at
> >>> https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18032 and a branch
> >>> for the kernel code is available at
> >>> https://github.com/BNieuwenhuizen/linux/tree/no-implicit-sync-5.19
> >>>
> >>> This is a follow-up on RFC series https://patchwork.freedesktop.org/series/104578/ .
> >>>
> >>> The main changes were:
> >>>
> >>> 1) Instead of replacing num_shared with usage, I'm just adding usage, since
> >>>      num_shared was actually needed.
> >>> 2) We now agree that DMA_RESV_USAGE_BOOKKEEP is reasonable for this purpose.
> >>>
> >>> Please let me know if I missed anything, especially with the change to VM updates,
> >>> as we went back and forth a ton of times on that.
> >>>
> >>>
> >>> Bas Nieuwenhuizen (6):
> >>>     drm/ttm: Add usage to ttm_validate_buffer.
> >>>     drm/amdgpu: Add separate mode for syncing DMA_RESV_USAGE_BOOKKEEP.
> >>>     drm/amdgpu: Allow explicit sync for VM ops.
> >>>     drm/amdgpu: Refactor amdgpu_vm_get_pd_bo.
> >>>     drm/amdgpu: Add option to disable implicit sync for a context.
> >>>     drm/amdgpu: Bump amdgpu driver version.
> >>>
> >>>    .../gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c  | 16 +++++++---
> >>>    drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c        | 20 +++++++++---
> >>>    drivers/gpu/drm/amd/amdgpu/amdgpu_csa.c       |  3 +-
> >>>    drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c       | 32 +++++++++++++++++--
> >>>    drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.h       |  1 +
> >>>    drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c       |  3 +-
> >>>    drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c       | 12 ++++---
> >>>    drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c       |  3 +-
> >>>    drivers/gpu/drm/amd/amdgpu/amdgpu_object.c    | 11 ++++---
> >>>    drivers/gpu/drm/amd/amdgpu/amdgpu_object.h    |  3 +-
> >>>    drivers/gpu/drm/amd/amdgpu/amdgpu_sync.c      | 11 +++++--
> >>>    drivers/gpu/drm/amd/amdgpu/amdgpu_sync.h      |  4 +--
> >>>    drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c       |  1 +
> >>>    drivers/gpu/drm/amd/amdgpu/amdgpu_uvd.c       |  2 +-
> >>>    drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c        |  5 ++-
> >>>    drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h        |  3 +-
> >>>    drivers/gpu/drm/amd/amdgpu/amdgpu_vm_cpu.c    |  3 +-
> >>>    drivers/gpu/drm/amd/amdgpu/amdgpu_vm_sdma.c   |  3 +-
> >>>    drivers/gpu/drm/amd/amdkfd/kfd_svm.c          |  1 +
> >>>    drivers/gpu/drm/qxl/qxl_release.c             |  1 +
> >>>    drivers/gpu/drm/radeon/radeon_cs.c            |  2 ++
> >>>    drivers/gpu/drm/radeon/radeon_gem.c           |  1 +
> >>>    drivers/gpu/drm/radeon/radeon_vm.c            |  2 ++
> >>>    drivers/gpu/drm/ttm/ttm_execbuf_util.c        |  3 +-
> >>>    drivers/gpu/drm/vmwgfx/vmwgfx_resource.c      |  7 +++-
> >>>    drivers/gpu/drm/vmwgfx/vmwgfx_validation.c    |  1 +
> >>>    include/drm/ttm/ttm_execbuf_util.h            |  2 ++
> >>>    include/uapi/drm/amdgpu_drm.h                 |  3 ++
> >>>    28 files changed, 122 insertions(+), 37 deletions(-)
> >>>
>


end of thread, other threads:[~2022-08-24 18:46 UTC | newest]

Thread overview: 14+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-08-13  1:27 [PATCH 0/6] amdgpu: Allow explicitly synchronized submissions Bas Nieuwenhuizen
2022-08-13  1:27 ` [PATCH 1/6] drm/ttm: Add usage to ttm_validate_buffer Bas Nieuwenhuizen
2022-08-17 22:04   ` Felix Kuehling
2022-08-18  0:30     ` Bas Nieuwenhuizen
2022-08-18  9:33       ` Christian König
2022-08-13  1:27 ` [PATCH 2/6] drm/amdgpu: Add separate mode for syncing DMA_RESV_USAGE_BOOKKEEP Bas Nieuwenhuizen
2022-08-13  1:27 ` [PATCH 3/6] drm/amdgpu: Allow explicit sync for VM ops Bas Nieuwenhuizen
2022-08-13  1:27 ` [PATCH 4/6] drm/amdgpu: Refactor amdgpu_vm_get_pd_bo Bas Nieuwenhuizen
2022-08-13  1:28 ` [PATCH 5/6] drm/amdgpu: Add option to disable implicit sync for a context Bas Nieuwenhuizen
2022-08-13  1:28 ` [PATCH 6/6] drm/amdgpu: Bump amdgpu driver version Bas Nieuwenhuizen
2022-08-18 13:20 ` [PATCH 0/6] amdgpu: Allow explicitly synchronized submissions Christian König
2022-08-21 23:08   ` Bas Nieuwenhuizen
2022-08-23 10:15     ` Christian König
2022-08-23 22:35       ` Bas Nieuwenhuizen
