* [PATCH 1/4] drm/amdgpu: return EINVAL instead of ENOENT in the VM code
@ 2020-01-23 14:21 Christian König
  2020-01-23 14:21 ` [PATCH 2/4] drm/amdgpu: allow higher level PD invalidations Christian König
                   ` (2 more replies)
  0 siblings, 3 replies; 15+ messages in thread
From: Christian König @ 2020-01-23 14:21 UTC (permalink / raw)
  To: tom.stdenis, amd-gfx, Felix.Kuehling

That we can't find a PD above the root is expected; this can only happen
when we try to update a larger range than is actually managed by the VM.

Signed-off-by: Christian König <christian.koenig@amd.com>
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
index 5cb182231f5d..8119f32ca94d 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
@@ -1475,7 +1475,7 @@ static int amdgpu_vm_update_ptes(struct amdgpu_vm_update_params *params,
 			 * shift we should go up one level and check it again.
 			 */
 			if (!amdgpu_vm_pt_ancestor(&cursor))
-				return -ENOENT;
+				return -EINVAL;
 			continue;
 		}
 
-- 
2.14.1
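
[Editor's sketch, not part of the patch: a minimal user-space model of the walk this hunk changes. struct pt_cursor and pt_ancestor() are hypothetical stand-ins for the amdgpu page-table cursor; only the new error semantics are the point.]

#include <errno.h>
#include <stdbool.h>
#include <stdio.h>

struct pt_cursor { unsigned level; };	/* level 0 is the root PD */

static bool pt_ancestor(struct pt_cursor *cursor)
{
	if (cursor->level == 0)
		return false;	/* nothing exists above the root */
	--cursor->level;
	return true;
}

/* One "go up a level" step of the update walk. Running out of
 * ancestors means the caller asked to update a range larger than the
 * VM manages: invalid input, so -EINVAL rather than -ENOENT. */
static int walk_up(struct pt_cursor *cursor)
{
	if (!pt_ancestor(cursor))
		return -EINVAL;
	return 0;
}

int main(void)
{
	struct pt_cursor cursor = { .level = 0 };

	printf("%d\n", walk_up(&cursor));	/* -22 on Linux (-EINVAL) */
	return 0;
}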

_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


* [PATCH 2/4] drm/amdgpu: allow higher level PD invalidations
  2020-01-23 14:21 [PATCH 1/4] drm/amdgpu: return EINVAL instead of ENOENT in the VM code Christian König
@ 2020-01-23 14:21 ` Christian König
  2020-01-27 14:57   ` Tom St Denis
  2020-01-23 14:21 ` [PATCH 3/4] drm/amdgpu: simplify and fix amdgpu_sync_resv Christian König
  2020-01-23 14:21 ` [PATCH 4/4] drm/amdgpu: rework synchronization of VM updates v3 Christian König
  2 siblings, 1 reply; 15+ messages in thread
From: Christian König @ 2020-01-23 14:21 UTC (permalink / raw)
  To: tom.stdenis, amd-gfx, Felix.Kuehling

Allow partial invalidation of unallocated PDs. This is useful when we
need to silence faults to stop interrupt floods on Vega.

Signed-off-by: Christian König <christian.koenig@amd.com>
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 23 ++++++++++++++++++-----
 1 file changed, 18 insertions(+), 5 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
index 8119f32ca94d..0f79c17118bf 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
@@ -1467,9 +1467,8 @@ static int amdgpu_vm_update_ptes(struct amdgpu_vm_update_params *params,
 			 * smaller than the address shift. Go to the next
 			 * child entry and try again.
 			 */
-			if (!amdgpu_vm_pt_descendant(adev, &cursor))
-				return -ENOENT;
-			continue;
+			if (amdgpu_vm_pt_descendant(adev, &cursor))
+				continue;
 		} else if (frag >= parent_shift) {
 			/* If the fragment size is even larger than the parent
 			 * shift we should go up one level and check it again.
@@ -1480,8 +1479,19 @@ static int amdgpu_vm_update_ptes(struct amdgpu_vm_update_params *params,
 		}
 
 		pt = cursor.entry->base.bo;
-		if (!pt)
-			return -ENOENT;
+		if (!pt) {
+			/* We need all PDs and PTs for mapping something, */
+			if (flags & AMDGPU_PTE_VALID)
+				return -ENOENT;
+
+			/* but unmapping something can happen at a higher
+			 * level. */
+			if (!amdgpu_vm_pt_ancestor(&cursor))
+				return -EINVAL;
+
+			pt = cursor.entry->base.bo;
+			shift = parent_shift;
+		}
 
 		/* Looks good so far, calculate parameters for the update */
 		incr = (uint64_t)AMDGPU_GPU_PAGE_SIZE << shift;
@@ -1495,6 +1505,9 @@ static int amdgpu_vm_update_ptes(struct amdgpu_vm_update_params *params,
 			uint64_t upd_end = min(entry_end, frag_end);
 			unsigned nptes = (upd_end - frag_start) >> shift;
 
+			/* This can happen when we set higher level PDs to
+			 * silent to stop fault floods. */
+			nptes = max(nptes, 1u);
 			amdgpu_vm_update_flags(params, pt, cursor.level,
 					       pe_start, dst, nptes, incr,
 					       flags | AMDGPU_PTE_FRAG(frag));
-- 
2.14.1
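
[Editor's sketch, not part of the patch: why the nptes clamp in the last hunk is needed. The function below and its page-unit addresses are illustrative stand-ins, not driver code.]

#include <stdint.h>
#include <stdio.h>

/* Entries to write for an update covering [frag_start, upd_end), at a
 * table level where each entry spans (1 << shift) pages. */
static unsigned entries_for_update(uint64_t upd_end, uint64_t frag_start,
				   unsigned shift)
{
	unsigned nptes = (upd_end - frag_start) >> shift;

	/* When the invalidation is handled one level higher (a larger
	 * shift), the range can be smaller than a single entry and the
	 * shift truncates nptes to 0. We still have to write that one
	 * PDE to mark it silent, hence max(nptes, 1u). */
	return nptes ? nptes : 1;
}

int main(void)
{
	/* 16 pages handled at a level whose entries span 512 pages: */
	printf("%u\n", entries_for_update(16, 0, 9));	/* 1, not 0 */
	return 0;
}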


* [PATCH 3/4] drm/amdgpu: simplify and fix amdgpu_sync_resv
  2020-01-23 14:21 [PATCH 1/4] drm/amdgpu: return EINVAL instead of ENOENT in the VM code Christian König
  2020-01-23 14:21 ` [PATCH 2/4] drm/amdgpu: allow higher level PD invalidations Christian König
@ 2020-01-23 14:21 ` Christian König
  2020-01-23 14:21 ` [PATCH 4/4] drm/amdgpu: rework synchronization of VM updates v3 Christian König
  2 siblings, 0 replies; 15+ messages in thread
From: Christian König @ 2020-01-23 14:21 UTC (permalink / raw)
  To: tom.stdenis, amd-gfx, Felix.Kuehling

No matter what, we always need to sync to moves.

Signed-off-by: Christian König <christian.koenig@amd.com>
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_sync.c | 15 +++++++++++----
 1 file changed, 11 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_sync.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_sync.c
index c124f64e7aae..9f42032676da 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_sync.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_sync.c
@@ -232,10 +232,19 @@ int amdgpu_sync_resv(struct amdgpu_device *adev, struct amdgpu_sync *sync,
 
 		f = rcu_dereference_protected(flist->shared[i],
 					      dma_resv_held(resv));
+
+		fence_owner = amdgpu_sync_get_owner(f);
+
+		/* Always sync to moves, no matter what */
+		if (fence_owner == AMDGPU_FENCE_OWNER_UNDEFINED) {
+			r = amdgpu_sync_fence(sync, f, false);
+			if (r)
+				break;
+		}
+
 		/* We only want to trigger KFD eviction fences on
 		 * evict or move jobs. Skip KFD fences otherwise.
 		 */
-		fence_owner = amdgpu_sync_get_owner(f);
 		if (fence_owner == AMDGPU_FENCE_OWNER_KFD &&
 		    owner != AMDGPU_FENCE_OWNER_UNDEFINED)
 			continue;
@@ -265,9 +274,7 @@ int amdgpu_sync_resv(struct amdgpu_device *adev, struct amdgpu_sync *sync,
 			break;
 
 		case AMDGPU_SYNC_EXPLICIT:
-			if (owner != AMDGPU_FENCE_OWNER_UNDEFINED)
-				continue;
-			break;
+			continue;
 		}
 
 		r = amdgpu_sync_fence(sync, f, false);
-- 
2.14.1
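
[Editor's sketch, not part of the patch: a simplified model of the filtering order after this change, where move fences (owner UNDEFINED) are synced unconditionally before any mode-specific skip. The enum values mirror the amdgpu_sync_mode names used in this series; the NE/EQ semantics are inferred from those names, and everything else is a stand-in.]

#include <stddef.h>
#include <stdio.h>

enum sync_mode { SYNC_ALWAYS, SYNC_NE_OWNER, SYNC_EQ_OWNER, SYNC_EXPLICIT };

#define OWNER_UNDEFINED NULL	/* stand-in for AMDGPU_FENCE_OWNER_UNDEFINED */

/* Returns 1 when the fence should be added to the sync object. */
static int must_sync(void *fence_owner, void *owner, enum sync_mode mode)
{
	/* Always sync to moves, no matter what. */
	if (fence_owner == OWNER_UNDEFINED)
		return 1;

	switch (mode) {
	case SYNC_ALWAYS:
		return 1;
	case SYNC_NE_OWNER:	/* sync to everyone else's fences */
		return fence_owner != owner;
	case SYNC_EQ_OWNER:	/* sync only to our own fences */
		return fence_owner == owner;
	case SYNC_EXPLICIT:	/* skip all implicitly attached fences */
	default:
		return 0;
	}
}

int main(void)
{
	int me = 0;

	/* A move fence is synced even in explicit mode: */
	printf("%d\n", must_sync(OWNER_UNDEFINED, &me, SYNC_EXPLICIT)); /* 1 */
	return 0;
}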


* [PATCH 4/4] drm/amdgpu: rework synchronization of VM updates v3
  2020-01-23 14:21 [PATCH 1/4] drm/amdgpu: return EINVAL instead of ENOENT in the VM code Christian König
  2020-01-23 14:21 ` [PATCH 2/4] drm/amdgpu: allow higher level PD invalidations Christian König
  2020-01-23 14:21 ` [PATCH 3/4] drm/amdgpu: simplify and fix amdgpu_sync_resv Christian König
@ 2020-01-23 14:21 ` Christian König
  2020-01-23 14:22   ` Tom St Denis
  2020-01-23 15:39   ` William Lewis
  2 siblings, 2 replies; 15+ messages in thread
From: Christian König @ 2020-01-23 14:21 UTC (permalink / raw)
  To: tom.stdenis, amd-gfx, Felix.Kuehling

If provided, we sync only to the BO's reservation
object and no longer to the root PD.

v2: update comment, cleanup amdgpu_bo_sync_wait_resv
v3: use correct reservation object while clearing

Signed-off-by: Christian König <christian.koenig@amd.com>
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_object.c  | 37 ++++++++++++++++++++------
 drivers/gpu/drm/amd/amdgpu/amdgpu_object.h  |  3 +++
 drivers/gpu/drm/amd/amdgpu/amdgpu_sync.c    |  7 -----
 drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c      | 41 +++++++++++++++++------------
 drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h      |  4 +--
 drivers/gpu/drm/amd/amdgpu/amdgpu_vm_cpu.c  | 22 ++++------------
 drivers/gpu/drm/amd/amdgpu/amdgpu_vm_sdma.c | 15 +++--------
 7 files changed, 67 insertions(+), 62 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
index 46c76e2e1281..c70bbdda078c 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
@@ -1403,30 +1403,51 @@ void amdgpu_bo_fence(struct amdgpu_bo *bo, struct dma_fence *fence,
 }
 
 /**
- * amdgpu_sync_wait_resv - Wait for BO reservation fences
+ * amdgpu_bo_sync_wait_resv - Wait for BO reservation fences
  *
- * @bo: buffer object
+ * @adev: amdgpu device pointer
+ * @resv: reservation object to sync to
+ * @sync_mode: synchronization mode
  * @owner: fence owner
  * @intr: Whether the wait is interruptible
  *
+ * Extract the fences from the reservation object and waits for them to finish.
+ *
  * Returns:
  * 0 on success, errno otherwise.
  */
-int amdgpu_bo_sync_wait(struct amdgpu_bo *bo, void *owner, bool intr)
+int amdgpu_bo_sync_wait_resv(struct amdgpu_device *adev, struct dma_resv *resv,
+			     enum amdgpu_sync_mode sync_mode, void *owner,
+			     bool intr)
 {
-	struct amdgpu_device *adev = amdgpu_ttm_adev(bo->tbo.bdev);
 	struct amdgpu_sync sync;
 	int r;
 
 	amdgpu_sync_create(&sync);
-	amdgpu_sync_resv(adev, &sync, bo->tbo.base.resv,
-			 AMDGPU_SYNC_NE_OWNER, owner);
-	r = amdgpu_sync_wait(&sync, intr);
+	amdgpu_sync_resv(adev, &sync, resv, sync_mode, owner);
+	r = amdgpu_sync_wait(&sync, true);
 	amdgpu_sync_free(&sync);
-
 	return r;
 }
 
+/**
+ * amdgpu_bo_sync_wait - Wrapper for amdgpu_bo_sync_wait_resv
+ * @bo: buffer object to wait for
+ * @owner: fence owner
+ * @intr: Whether the wait is interruptible
+ *
+ * Wrapper to wait for fences in a BO.
+ * Returns:
+ * 0 on success, errno otherwise.
+ */
+int amdgpu_bo_sync_wait(struct amdgpu_bo *bo, void *owner, bool intr)
+{
+	struct amdgpu_device *adev = amdgpu_ttm_adev(bo->tbo.bdev);
+
+	return amdgpu_bo_sync_wait_resv(adev, bo->tbo.base.resv,
+					AMDGPU_SYNC_NE_OWNER, owner, intr);
+}
+
 /**
  * amdgpu_bo_gpu_offset - return GPU offset of bo
  * @bo:	amdgpu object for which we query the offset
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.h
index 2eeafc77c9c1..96d805889e8d 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.h
@@ -283,6 +283,9 @@ void amdgpu_bo_release_notify(struct ttm_buffer_object *bo);
 int amdgpu_bo_fault_reserve_notify(struct ttm_buffer_object *bo);
 void amdgpu_bo_fence(struct amdgpu_bo *bo, struct dma_fence *fence,
 		     bool shared);
+int amdgpu_bo_sync_wait_resv(struct amdgpu_device *adev, struct dma_resv *resv,
+			     enum amdgpu_sync_mode sync_mode, void *owner,
+			     bool intr);
 int amdgpu_bo_sync_wait(struct amdgpu_bo *bo, void *owner, bool intr);
 u64 amdgpu_bo_gpu_offset(struct amdgpu_bo *bo);
 int amdgpu_bo_validate(struct amdgpu_bo *bo);
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_sync.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_sync.c
index 9f42032676da..b86392253696 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_sync.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_sync.c
@@ -249,13 +249,6 @@ int amdgpu_sync_resv(struct amdgpu_device *adev, struct amdgpu_sync *sync,
 		    owner != AMDGPU_FENCE_OWNER_UNDEFINED)
 			continue;
 
-		/* VM updates only sync with moves but not with user
-		 * command submissions or KFD evictions fences
-		 */
-		if (fence_owner != AMDGPU_FENCE_OWNER_UNDEFINED &&
-		    owner == AMDGPU_FENCE_OWNER_VM)
-			continue;
-
 		/* Ignore fences depending on the sync mode */
 		switch (mode) {
 		case AMDGPU_SYNC_ALWAYS:
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
index 0f79c17118bf..c268aa14381e 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
@@ -797,7 +797,7 @@ static int amdgpu_vm_clear_bo(struct amdgpu_device *adev,
 	params.vm = vm;
 	params.direct = direct;
 
-	r = vm->update_funcs->prepare(&params, AMDGPU_FENCE_OWNER_KFD, NULL);
+	r = vm->update_funcs->prepare(&params, NULL, AMDGPU_SYNC_EXPLICIT);
 	if (r)
 		return r;
 
@@ -1293,7 +1293,7 @@ int amdgpu_vm_update_pdes(struct amdgpu_device *adev,
 	params.vm = vm;
 	params.direct = direct;
 
-	r = vm->update_funcs->prepare(&params, AMDGPU_FENCE_OWNER_VM, NULL);
+	r = vm->update_funcs->prepare(&params, NULL, AMDGPU_SYNC_EXPLICIT);
 	if (r)
 		return r;
 
@@ -1552,7 +1552,7 @@ static int amdgpu_vm_update_ptes(struct amdgpu_vm_update_params *params,
  * @adev: amdgpu_device pointer
  * @vm: requested vm
  * @direct: direct submission in a page fault
- * @exclusive: fence we need to sync to
+ * @resv: fences we need to sync to
  * @start: start of mapped range
  * @last: last mapped entry
  * @flags: flags for the entries
@@ -1567,14 +1567,14 @@ static int amdgpu_vm_update_ptes(struct amdgpu_vm_update_params *params,
  */
 static int amdgpu_vm_bo_update_mapping(struct amdgpu_device *adev,
 				       struct amdgpu_vm *vm, bool direct,
-				       struct dma_fence *exclusive,
+				       struct dma_resv *resv,
 				       uint64_t start, uint64_t last,
 				       uint64_t flags, uint64_t addr,
 				       dma_addr_t *pages_addr,
 				       struct dma_fence **fence)
 {
 	struct amdgpu_vm_update_params params;
-	void *owner = AMDGPU_FENCE_OWNER_VM;
+	enum amdgpu_sync_mode sync_mode;
 	int r;
 
 	memset(&params, 0, sizeof(params));
@@ -1583,9 +1583,13 @@ static int amdgpu_vm_bo_update_mapping(struct amdgpu_device *adev,
 	params.direct = direct;
 	params.pages_addr = pages_addr;
 
-	/* sync to everything except eviction fences on unmapping */
+	/* Implicitly sync to command submissions in the same VM before
+	 * unmapping. Sync to moving fences before mapping.
+	 */
 	if (!(flags & AMDGPU_PTE_VALID))
-		owner = AMDGPU_FENCE_OWNER_KFD;
+		sync_mode = AMDGPU_SYNC_EQ_OWNER;
+	else
+		sync_mode = AMDGPU_SYNC_EXPLICIT;
 
 	amdgpu_vm_eviction_lock(vm);
 	if (vm->evicting) {
@@ -1593,7 +1597,7 @@ static int amdgpu_vm_bo_update_mapping(struct amdgpu_device *adev,
 		goto error_unlock;
 	}
 
-	r = vm->update_funcs->prepare(&params, owner, exclusive);
+	r = vm->update_funcs->prepare(&params, resv, sync_mode);
 	if (r)
 		goto error_unlock;
 
@@ -1612,7 +1616,7 @@ static int amdgpu_vm_bo_update_mapping(struct amdgpu_device *adev,
  * amdgpu_vm_bo_split_mapping - split a mapping into smaller chunks
  *
  * @adev: amdgpu_device pointer
- * @exclusive: fence we need to sync to
+ * @resv: fences we need to sync to
  * @pages_addr: DMA addresses to use for mapping
  * @vm: requested vm
  * @mapping: mapped range and flags to use for the update
@@ -1628,7 +1632,7 @@ static int amdgpu_vm_bo_update_mapping(struct amdgpu_device *adev,
  * 0 for success, -EINVAL for failure.
  */
 static int amdgpu_vm_bo_split_mapping(struct amdgpu_device *adev,
-				      struct dma_fence *exclusive,
+				      struct dma_resv *resv,
 				      dma_addr_t *pages_addr,
 				      struct amdgpu_vm *vm,
 				      struct amdgpu_bo_va_mapping *mapping,
@@ -1704,7 +1708,7 @@ static int amdgpu_vm_bo_split_mapping(struct amdgpu_device *adev,
 		}
 
 		last = min((uint64_t)mapping->last, start + max_entries - 1);
-		r = amdgpu_vm_bo_update_mapping(adev, vm, false, exclusive,
+		r = amdgpu_vm_bo_update_mapping(adev, vm, false, resv,
 						start, last, flags, addr,
 						dma_addr, fence);
 		if (r)
@@ -1743,7 +1747,8 @@ int amdgpu_vm_bo_update(struct amdgpu_device *adev, struct amdgpu_bo_va *bo_va,
 	dma_addr_t *pages_addr = NULL;
 	struct ttm_mem_reg *mem;
 	struct drm_mm_node *nodes;
-	struct dma_fence *exclusive, **last_update;
+	struct dma_fence **last_update;
+	struct dma_resv *resv;
 	uint64_t flags;
 	struct amdgpu_device *bo_adev = adev;
 	int r;
@@ -1751,7 +1756,7 @@ int amdgpu_vm_bo_update(struct amdgpu_device *adev, struct amdgpu_bo_va *bo_va,
 	if (clear || !bo) {
 		mem = NULL;
 		nodes = NULL;
-		exclusive = NULL;
+		resv = vm->root.base.bo->tbo.base.resv;
 	} else {
 		struct ttm_dma_tt *ttm;
 
@@ -1761,7 +1766,7 @@ int amdgpu_vm_bo_update(struct amdgpu_device *adev, struct amdgpu_bo_va *bo_va,
 			ttm = container_of(bo->tbo.ttm, struct ttm_dma_tt, ttm);
 			pages_addr = ttm->dma_address;
 		}
-		exclusive = bo->tbo.moving;
+		resv = bo->tbo.base.resv;
 	}
 
 	if (bo) {
@@ -1775,7 +1780,8 @@ int amdgpu_vm_bo_update(struct amdgpu_device *adev, struct amdgpu_bo_va *bo_va,
 		flags = 0x0;
 	}
 
-	if (clear || (bo && bo->tbo.base.resv == vm->root.base.bo->tbo.base.resv))
+	if (clear || (bo && bo->tbo.base.resv ==
+		      vm->root.base.bo->tbo.base.resv))
 		last_update = &vm->last_update;
 	else
 		last_update = &bo_va->last_pt_update;
@@ -1789,7 +1795,7 @@ int amdgpu_vm_bo_update(struct amdgpu_device *adev, struct amdgpu_bo_va *bo_va,
 	}
 
 	list_for_each_entry(mapping, &bo_va->invalids, list) {
-		r = amdgpu_vm_bo_split_mapping(adev, exclusive, pages_addr, vm,
+		r = amdgpu_vm_bo_split_mapping(adev, resv, pages_addr, vm,
 					       mapping, flags, bo_adev, nodes,
 					       last_update);
 		if (r)
@@ -1984,6 +1990,7 @@ int amdgpu_vm_clear_freed(struct amdgpu_device *adev,
 			  struct amdgpu_vm *vm,
 			  struct dma_fence **fence)
 {
+	struct dma_resv *resv = vm->root.base.bo->tbo.base.resv;
 	struct amdgpu_bo_va_mapping *mapping;
 	uint64_t init_pte_value = 0;
 	struct dma_fence *f = NULL;
@@ -1998,7 +2005,7 @@ int amdgpu_vm_clear_freed(struct amdgpu_device *adev,
 		    mapping->start < AMDGPU_GMC_HOLE_START)
 			init_pte_value = AMDGPU_PTE_DEFAULT_ATC;
 
-		r = amdgpu_vm_bo_update_mapping(adev, vm, false, NULL,
+		r = amdgpu_vm_bo_update_mapping(adev, vm, false, resv,
 						mapping->start, mapping->last,
 						init_pte_value, 0, NULL, &f);
 		amdgpu_vm_free_mapping(adev, vm, mapping, f);
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h
index c21a36bebc0c..b5705fcfc935 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h
@@ -230,8 +230,8 @@ struct amdgpu_vm_update_params {
 
 struct amdgpu_vm_update_funcs {
 	int (*map_table)(struct amdgpu_bo *bo);
-	int (*prepare)(struct amdgpu_vm_update_params *p, void * owner,
-		       struct dma_fence *exclusive);
+	int (*prepare)(struct amdgpu_vm_update_params *p, struct dma_resv *resv,
+		       enum amdgpu_sync_mode sync_mode);
 	int (*update)(struct amdgpu_vm_update_params *p,
 		      struct amdgpu_bo *bo, uint64_t pe, uint64_t addr,
 		      unsigned count, uint32_t incr, uint64_t flags);
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm_cpu.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm_cpu.c
index 68b013be3837..e38516304070 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm_cpu.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm_cpu.c
@@ -44,26 +44,14 @@ static int amdgpu_vm_cpu_map_table(struct amdgpu_bo *table)
  * Returns:
  * Negativ errno, 0 for success.
  */
-static int amdgpu_vm_cpu_prepare(struct amdgpu_vm_update_params *p, void *owner,
-				 struct dma_fence *exclusive)
+static int amdgpu_vm_cpu_prepare(struct amdgpu_vm_update_params *p,
+				 struct dma_resv *resv,
+				 enum amdgpu_sync_mode sync_mode)
 {
-	int r;
-
-	/* Wait for any BO move to be completed */
-	if (exclusive) {
-		r = dma_fence_wait(exclusive, true);
-		if (unlikely(r))
-			return r;
-	}
-
-	/* Don't wait for submissions during page fault */
-	if (p->direct)
+	if (!resv)
 		return 0;
 
-	/* Wait for PT BOs to be idle. PTs share the same resv. object
-	 * as the root PD BO
-	 */
-	return amdgpu_bo_sync_wait(p->vm->root.base.bo, owner, true);
+	return amdgpu_bo_sync_wait_resv(p->adev, resv, sync_mode, p->vm, true);
 }
 
 /**
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm_sdma.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm_sdma.c
index ab6481751763..4cc7881f438c 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm_sdma.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm_sdma.c
@@ -58,9 +58,9 @@ static int amdgpu_vm_sdma_map_table(struct amdgpu_bo *table)
  * Negativ errno, 0 for success.
  */
 static int amdgpu_vm_sdma_prepare(struct amdgpu_vm_update_params *p,
-				  void *owner, struct dma_fence *exclusive)
+				  struct dma_resv *resv,
+				  enum amdgpu_sync_mode sync_mode)
 {
-	struct amdgpu_bo *root = p->vm->root.base.bo;
 	unsigned int ndw = AMDGPU_VM_SDMA_MIN_NUM_DW;
 	int r;
 
@@ -70,17 +70,10 @@ static int amdgpu_vm_sdma_prepare(struct amdgpu_vm_update_params *p,
 
 	p->num_dw_left = ndw;
 
-	/* Wait for moves to be completed */
-	r = amdgpu_sync_fence(&p->job->sync, exclusive, false);
-	if (r)
-		return r;
-
-	/* Don't wait for any submissions during page fault handling */
-	if (p->direct)
+	if (!resv)
 		return 0;
 
-	return amdgpu_sync_resv(p->adev, &p->job->sync, root->tbo.base.resv,
-				AMDGPU_SYNC_NE_OWNER, owner);
+	return amdgpu_sync_resv(p->adev, &p->job->sync, resv, sync_mode, p->vm);
 }
 
 /**
-- 
2.14.1
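
[Editor's sketch, not part of the patch: the two decisions this patch centralizes, which reservation object to sync to and which sync mode to request, condensed from the amdgpu_vm.c hunks above with stand-in types.]

#include <stdint.h>
#include <stdio.h>

struct dma_resv;	/* opaque here */
enum sync_mode { SYNC_EXPLICIT, SYNC_EQ_OWNER };

#define PTE_VALID (1ull << 0)	/* stand-in for AMDGPU_PTE_VALID */

/* Mapping only syncs to explicitly provided fences; unmapping first
 * implicitly syncs to command submissions in the same VM. */
static enum sync_mode pick_sync_mode(uint64_t flags)
{
	return (flags & PTE_VALID) ? SYNC_EXPLICIT : SYNC_EQ_OWNER;
}

/* The v3 fix: clearing (or having no BO) must use the root PD's
 * reservation object instead of passing NULL; otherwise use the BO's
 * own reservation object. */
static struct dma_resv *pick_resv(int clear, struct dma_resv *bo_resv,
				  struct dma_resv *root_resv)
{
	return (clear || !bo_resv) ? root_resv : bo_resv;
}

int main(void)
{
	struct dma_resv *root = (struct dma_resv *)&root;	/* dummy */

	printf("%d\n", pick_sync_mode(PTE_VALID) == SYNC_EXPLICIT);	/* 1 */
	printf("%d\n", pick_resv(1, NULL, root) == root);		/* 1 */
	return 0;
}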


* Re: [PATCH 4/4] drm/amdgpu: rework synchronization of VM updates v3
  2020-01-23 14:21 ` [PATCH 4/4] drm/amdgpu: rework synchronization of VM updates v3 Christian König
@ 2020-01-23 14:22   ` Tom St Denis
  2020-01-23 14:24     ` Christian König
  2020-01-23 15:39   ` William Lewis
  1 sibling, 1 reply; 15+ messages in thread
From: Tom St Denis @ 2020-01-23 14:22 UTC (permalink / raw)
  To: Christian König, amd-gfx, Felix.Kuehling

Just applied these now and am trying them out; will report back in ~20 mins.

On 2020-01-23 9:21 a.m., Christian König wrote:
> If provided, we sync only to the BO's reservation
> object and no longer to the root PD.
>
> v2: update comment, cleanup amdgpu_bo_sync_wait_resv
> v3: use correct reservation object while clearing
>
> Signed-off-by: Christian König <christian.koenig@amd.com>
> [patch diff snipped]

* Re: [PATCH 4/4] drm/amdgpu: rework synchronization of VM updates v3
  2020-01-23 14:22   ` Tom St Denis
@ 2020-01-23 14:24     ` Christian König
  2020-01-23 14:25       ` Tom St Denis
  0 siblings, 1 reply; 15+ messages in thread
From: Christian König @ 2020-01-23 14:24 UTC (permalink / raw)
  To: Tom St Denis, amd-gfx, Felix.Kuehling

Thanks, please give them a full round.

Took me a week to figure out that we accidentally pass in the 
reservation object as NULL for cleared BOs.

Thanks,
Christian.

On 23.01.20 at 15:22, Tom St Denis wrote:
> Just applied these now and am trying them out; will report back in ~20 mins.
>
> [quoted patch snipped]


* Re: [PATCH 4/4] drm/amdgpu: rework synchronization of VM updates v3
  2020-01-23 14:24     ` Christian König
@ 2020-01-23 14:25       ` Tom St Denis
  2020-01-23 14:35         ` Christian König
  0 siblings, 1 reply; 15+ messages in thread
From: Tom St Denis @ 2020-01-23 14:25 UTC (permalink / raw)
  To: christian.koenig, amd-gfx, Felix.Kuehling

On the tip of drm-next (as of this morning) I was still getting

[  983.891264] [drm:amdgpu_gem_va_ioctl [amdgpu]] *ERROR* Couldn't update BO_VA (-2)

type errors, so I'm looking for those to stop.  At least LLVM was fixed,
so I can run a full run of piglit without gfx hangs.

Tom

On 2020-01-23 9:24 a.m., Christian König wrote:
> Thanks, please give them a full round.
>
> Took me a week to figure out that we accidentally pass in the 
> reservation object as NULL for cleared BOs.
>
> Thanks,
> Christian.
>
> Am 23.01.20 um 15:22 schrieb Tom St Denis:
>> Just applied these now, trying them out will report back in ~20 mins.
>>
>> On 2020-01-23 9:21 a.m., Christian König wrote:
>>> If provided we only sync to the BOs reservation
>>> object and no longer to the root PD.
>>>
>>> v2: update comment, cleanup amdgpu_bo_sync_wait_resv
>>> v3: use correct reservation object while clearing
>>>
>>> Signed-off-by: Christian König <christian.koenig@amd.com>
_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


* Re: [PATCH 4/4] drm/amdgpu: rework synchronization of VM updates v3
  2020-01-23 14:25       ` Tom St Denis
@ 2020-01-23 14:35         ` Christian König
  2020-01-23 15:23           ` Tom St Denis
  0 siblings, 1 reply; 15+ messages in thread
From: Christian König @ 2020-01-23 14:35 UTC (permalink / raw)
  To: Tom St Denis, amd-gfx, Felix.Kuehling

That is fixed by patch #2 in this series.

Patch #4 is then re-applying the faulty synchronization cleanup.
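
The v3 change is essentially that cleared BOs, which have no backing BO
of their own to take fences from, now fall back to the page table
reservation object instead of NULL (a sketch of the hunk's intent, names
as in the patch):

    /* amdgpu_vm_bo_update(): pick the reservation object to sync to */
    if (clear || !bo)
        /* cleared BO: root PD reservation object instead of NULL */
        resv = vm->root.base.bo->tbo.base.resv;
    else
        resv = bo->tbo.base.resv;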

Regards,
Christian.

Am 23.01.20 um 15:25 schrieb Tom St Denis:
> On the tip of drm-next (as of this morning) I was still getting
>
> [  983.891264] [drm:amdgpu_gem_va_ioctl [amdgpu]] *ERROR* Couldn't update BO_VA (-2)
>
> type errors.  So I'm looking for that to stop.  At least LLVM was 
> fixed so I can run a full run of piglit without gfx hangs.
>
> Tom
>
> On 2020-01-23 9:24 a.m., Christian König wrote:
>> Thanks, please give them a full round.
>>
>> Took me a week to figure out that we accidentally pass in the 
>> reservation object as NULL for cleared BOs.
>>
>> Thanks,
>> Christian.
>>
>> Am 23.01.20 um 15:22 schrieb Tom St Denis:
>>> Just applied these now; trying them out, will report back in ~20 mins.
>>>
>>> On 2020-01-23 9:21 a.m., Christian König wrote:
>>>> If provided we only sync to the BO's reservation
>>>> object and no longer to the root PD.
>>>>
>>>> v2: update comment, cleanup amdgpu_bo_sync_wait_resv
>>>> v3: use correct reservation object while clearing
>>>>
>>>> Signed-off-by: Christian König <christian.koenig@amd.com>

_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


* Re: [PATCH 4/4] drm/amdgpu: rework synchronization of VM updates v3
  2020-01-23 14:35         ` Christian König
@ 2020-01-23 15:23           ` Tom St Denis
  0 siblings, 0 replies; 15+ messages in thread
From: Tom St Denis @ 2020-01-23 15:23 UTC (permalink / raw)
  To: Christian König, amd-gfx, Felix.Kuehling

I've tested piglit, unigine-heaven, and video playback on my 
navi10/raven1 system.  All fine (as far as the kernel is concerned).  
You can add my


Tested-by: Tom St Denis <tom.stdenis@amd.com>


Tom

On 2020-01-23 9:35 a.m., Christian König wrote:
> That is fixed by patch #2 in this series.
>
> Patch #4 is then re-applying the faulty synchronization cleanup.
>
> Regards,
> Christian.
>
> Am 23.01.20 um 15:25 schrieb Tom St Denis:
>> On the tip of drm-next (as of this morning) I was still getting
>>
>> [  983.891264] [drm:amdgpu_gem_va_ioctl [amdgpu]] *ERROR* Couldn't update BO_VA (-2)
>>
>> type errors.  So I'm looking for that to stop.  At least LLVM was 
>> fixed so I can run a full run of piglit without gfx hangs.
>>
>> Tom
>>
>> On 2020-01-23 9:24 a.m., Christian König wrote:
>>> Thanks, please give them a full round.
>>>
>>> Took me a week to figure out that we accidentally pass in the 
>>> reservation object as NULL for cleared BOs.
>>>
>>> Thanks,
>>> Christian.
>>>
>>> Am 23.01.20 um 15:22 schrieb Tom St Denis:
>>>> Just applied these now; trying them out, will report back in ~20 mins.
>>>>
>>>> On 2020-01-23 9:21 a.m., Christian König wrote:
>>>>> If provided we only sync to the BO's reservation
>>>>> object and no longer to the root PD.
>>>>>
>>>>> v2: update comment, cleanup amdgpu_bo_sync_wait_resv
>>>>> v3: use correct reservation object while clearing
>>>>>
>>>>> Signed-off-by: Christian König <christian.koenig@amd.com>
_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


* Re: [PATCH 4/4] drm/amdgpu: rework synchronization of VM updates v3
  2020-01-23 14:21 ` [PATCH 4/4] drm/amdgpu: rework synchronization of VM updates v3 Christian König
  2020-01-23 14:22   ` Tom St Denis
@ 2020-01-23 15:39   ` William Lewis
  2020-01-23 16:20     ` Christian König
  1 sibling, 1 reply; 15+ messages in thread
From: William Lewis @ 2020-01-23 15:39 UTC (permalink / raw)
  To: amd-gfx

Was the change from intr to true at the only use of intr in
amdgpu_bo_sync_wait_resv intentional?  If so, would it not make sense to
remove the argument from the function signature while the API is changing?
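
That is, the hunk in question ends up as (the two variants side by side;
amdgpu_sync_wait() takes the interruptible flag as its second argument):

    r = amdgpu_sync_wait(&sync, true);   /* as posted: @intr accepted but ignored */
    r = amdgpu_sync_wait(&sync, intr);   /* what the @intr kerneldoc suggests */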

On 1/23/20 8:21 AM, Christian König wrote:
> If provided we only sync to the BO's reservation
> object and no longer to the root PD.
>
> v2: update comment, cleanup amdgpu_bo_sync_wait_resv
> v3: use correct reservation object while clearing
>
> Signed-off-by: Christian König <christian.koenig@amd.com>
> ---
>   drivers/gpu/drm/amd/amdgpu/amdgpu_object.c  | 37 ++++++++++++++++++++------
>   drivers/gpu/drm/amd/amdgpu/amdgpu_object.h  |  3 +++
>   drivers/gpu/drm/amd/amdgpu/amdgpu_sync.c    |  7 -----
>   drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c      | 41 +++++++++++++++++------------
>   drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h      |  4 +--
>   drivers/gpu/drm/amd/amdgpu/amdgpu_vm_cpu.c  | 22 ++++------------
>   drivers/gpu/drm/amd/amdgpu/amdgpu_vm_sdma.c | 15 +++--------
>   7 files changed, 67 insertions(+), 62 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
> index 46c76e2e1281..c70bbdda078c 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
> @@ -1403,30 +1403,51 @@ void amdgpu_bo_fence(struct amdgpu_bo *bo, struct dma_fence *fence,
>   }
>   
>   /**
> - * amdgpu_sync_wait_resv - Wait for BO reservation fences
> + * amdgpu_bo_sync_wait_resv - Wait for BO reservation fences
>    *
> - * @bo: buffer object
> + * @adev: amdgpu device pointer
> + * @resv: reservation object to sync to
> + * @sync_mode: synchronization mode
>    * @owner: fence owner
>    * @intr: Whether the wait is interruptible
>    *
> + * Extract the fences from the reservation object and waits for them to finish.
> + *
>    * Returns:
>    * 0 on success, errno otherwise.
>    */
> -int amdgpu_bo_sync_wait(struct amdgpu_bo *bo, void *owner, bool intr)
> +int amdgpu_bo_sync_wait_resv(struct amdgpu_device *adev, struct dma_resv *resv,
> +			     enum amdgpu_sync_mode sync_mode, void *owner,
> +			     bool intr)
>   {
> -	struct amdgpu_device *adev = amdgpu_ttm_adev(bo->tbo.bdev);
>   	struct amdgpu_sync sync;
>   	int r;
>   
>   	amdgpu_sync_create(&sync);
> -	amdgpu_sync_resv(adev, &sync, bo->tbo.base.resv,
> -			 AMDGPU_SYNC_NE_OWNER, owner);
> -	r = amdgpu_sync_wait(&sync, intr);
> +	amdgpu_sync_resv(adev, &sync, resv, sync_mode, owner);
> +	r = amdgpu_sync_wait(&sync, true);
>   	amdgpu_sync_free(&sync);
> -
>   	return r;
>   }
>   
> +/**
> + * amdgpu_bo_sync_wait - Wrapper for amdgpu_bo_sync_wait_resv
> + * @bo: buffer object to wait for
> + * @owner: fence owner
> + * @intr: Whether the wait is interruptible
> + *
> + * Wrapper to wait for fences in a BO.
> + * Returns:
> + * 0 on success, errno otherwise.
> + */
> +int amdgpu_bo_sync_wait(struct amdgpu_bo *bo, void *owner, bool intr)
> +{
> +	struct amdgpu_device *adev = amdgpu_ttm_adev(bo->tbo.bdev);
> +
> +	return amdgpu_bo_sync_wait_resv(adev, bo->tbo.base.resv,
> +					AMDGPU_SYNC_NE_OWNER, owner, intr);
> +}
> +
>   /**
>    * amdgpu_bo_gpu_offset - return GPU offset of bo
>    * @bo:	amdgpu object for which we query the offset
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.h
> index 2eeafc77c9c1..96d805889e8d 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.h
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.h
> @@ -283,6 +283,9 @@ void amdgpu_bo_release_notify(struct ttm_buffer_object *bo);
>   int amdgpu_bo_fault_reserve_notify(struct ttm_buffer_object *bo);
>   void amdgpu_bo_fence(struct amdgpu_bo *bo, struct dma_fence *fence,
>   		     bool shared);
> +int amdgpu_bo_sync_wait_resv(struct amdgpu_device *adev, struct dma_resv *resv,
> +			     enum amdgpu_sync_mode sync_mode, void *owner,
> +			     bool intr);
>   int amdgpu_bo_sync_wait(struct amdgpu_bo *bo, void *owner, bool intr);
>   u64 amdgpu_bo_gpu_offset(struct amdgpu_bo *bo);
>   int amdgpu_bo_validate(struct amdgpu_bo *bo);
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_sync.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_sync.c
> index 9f42032676da..b86392253696 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_sync.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_sync.c
> @@ -249,13 +249,6 @@ int amdgpu_sync_resv(struct amdgpu_device *adev, struct amdgpu_sync *sync,
>   		    owner != AMDGPU_FENCE_OWNER_UNDEFINED)
>   			continue;
>   
> -		/* VM updates only sync with moves but not with user
> -		 * command submissions or KFD evictions fences
> -		 */
> -		if (fence_owner != AMDGPU_FENCE_OWNER_UNDEFINED &&
> -		    owner == AMDGPU_FENCE_OWNER_VM)
> -			continue;
> -
>   		/* Ignore fences depending on the sync mode */
>   		switch (mode) {
>   		case AMDGPU_SYNC_ALWAYS:
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
> index 0f79c17118bf..c268aa14381e 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
> @@ -797,7 +797,7 @@ static int amdgpu_vm_clear_bo(struct amdgpu_device *adev,
>   	params.vm = vm;
>   	params.direct = direct;
>   
> -	r = vm->update_funcs->prepare(&params, AMDGPU_FENCE_OWNER_KFD, NULL);
> +	r = vm->update_funcs->prepare(&params, NULL, AMDGPU_SYNC_EXPLICIT);
>   	if (r)
>   		return r;
>   
> @@ -1293,7 +1293,7 @@ int amdgpu_vm_update_pdes(struct amdgpu_device *adev,
>   	params.vm = vm;
>   	params.direct = direct;
>   
> -	r = vm->update_funcs->prepare(&params, AMDGPU_FENCE_OWNER_VM, NULL);
> +	r = vm->update_funcs->prepare(&params, NULL, AMDGPU_SYNC_EXPLICIT);
>   	if (r)
>   		return r;
>   
> @@ -1552,7 +1552,7 @@ static int amdgpu_vm_update_ptes(struct amdgpu_vm_update_params *params,
>    * @adev: amdgpu_device pointer
>    * @vm: requested vm
>    * @direct: direct submission in a page fault
> - * @exclusive: fence we need to sync to
> + * @resv: fences we need to sync to
>    * @start: start of mapped range
>    * @last: last mapped entry
>    * @flags: flags for the entries
> @@ -1567,14 +1567,14 @@ static int amdgpu_vm_update_ptes(struct amdgpu_vm_update_params *params,
>    */
>   static int amdgpu_vm_bo_update_mapping(struct amdgpu_device *adev,
>   				       struct amdgpu_vm *vm, bool direct,
> -				       struct dma_fence *exclusive,
> +				       struct dma_resv *resv,
>   				       uint64_t start, uint64_t last,
>   				       uint64_t flags, uint64_t addr,
>   				       dma_addr_t *pages_addr,
>   				       struct dma_fence **fence)
>   {
>   	struct amdgpu_vm_update_params params;
> -	void *owner = AMDGPU_FENCE_OWNER_VM;
> +	enum amdgpu_sync_mode sync_mode;
>   	int r;
>   
>   	memset(&params, 0, sizeof(params));
> @@ -1583,9 +1583,13 @@ static int amdgpu_vm_bo_update_mapping(struct amdgpu_device *adev,
>   	params.direct = direct;
>   	params.pages_addr = pages_addr;
>   
> -	/* sync to everything except eviction fences on unmapping */
> +	/* Implicitly sync to command submissions in the same VM before
> +	 * unmapping. Sync to moving fences before mapping.
> +	 */
>   	if (!(flags & AMDGPU_PTE_VALID))
> -		owner = AMDGPU_FENCE_OWNER_KFD;
> +		sync_mode = AMDGPU_SYNC_EQ_OWNER;
> +	else
> +		sync_mode = AMDGPU_SYNC_EXPLICIT;
>   
>   	amdgpu_vm_eviction_lock(vm);
>   	if (vm->evicting) {
> @@ -1593,7 +1597,7 @@ static int amdgpu_vm_bo_update_mapping(struct amdgpu_device *adev,
>   		goto error_unlock;
>   	}
>   
> -	r = vm->update_funcs->prepare(&params, owner, exclusive);
> +	r = vm->update_funcs->prepare(&params, resv, sync_mode);
>   	if (r)
>   		goto error_unlock;
>   
> @@ -1612,7 +1616,7 @@ static int amdgpu_vm_bo_update_mapping(struct amdgpu_device *adev,
>    * amdgpu_vm_bo_split_mapping - split a mapping into smaller chunks
>    *
>    * @adev: amdgpu_device pointer
> - * @exclusive: fence we need to sync to
> + * @resv: fences we need to sync to
>    * @pages_addr: DMA addresses to use for mapping
>    * @vm: requested vm
>    * @mapping: mapped range and flags to use for the update
> @@ -1628,7 +1632,7 @@ static int amdgpu_vm_bo_update_mapping(struct amdgpu_device *adev,
>    * 0 for success, -EINVAL for failure.
>    */
>   static int amdgpu_vm_bo_split_mapping(struct amdgpu_device *adev,
> -				      struct dma_fence *exclusive,
> +				      struct dma_resv *resv,
>   				      dma_addr_t *pages_addr,
>   				      struct amdgpu_vm *vm,
>   				      struct amdgpu_bo_va_mapping *mapping,
> @@ -1704,7 +1708,7 @@ static int amdgpu_vm_bo_split_mapping(struct amdgpu_device *adev,
>   		}
>   
>   		last = min((uint64_t)mapping->last, start + max_entries - 1);
> -		r = amdgpu_vm_bo_update_mapping(adev, vm, false, exclusive,
> +		r = amdgpu_vm_bo_update_mapping(adev, vm, false, resv,
>   						start, last, flags, addr,
>   						dma_addr, fence);
>   		if (r)
> @@ -1743,7 +1747,8 @@ int amdgpu_vm_bo_update(struct amdgpu_device *adev, struct amdgpu_bo_va *bo_va,
>   	dma_addr_t *pages_addr = NULL;
>   	struct ttm_mem_reg *mem;
>   	struct drm_mm_node *nodes;
> -	struct dma_fence *exclusive, **last_update;
> +	struct dma_fence **last_update;
> +	struct dma_resv *resv;
>   	uint64_t flags;
>   	struct amdgpu_device *bo_adev = adev;
>   	int r;
> @@ -1751,7 +1756,7 @@ int amdgpu_vm_bo_update(struct amdgpu_device *adev, struct amdgpu_bo_va *bo_va,
>   	if (clear || !bo) {
>   		mem = NULL;
>   		nodes = NULL;
> -		exclusive = NULL;
> +		resv = vm->root.base.bo->tbo.base.resv;
>   	} else {
>   		struct ttm_dma_tt *ttm;
>   
> @@ -1761,7 +1766,7 @@ int amdgpu_vm_bo_update(struct amdgpu_device *adev, struct amdgpu_bo_va *bo_va,
>   			ttm = container_of(bo->tbo.ttm, struct ttm_dma_tt, ttm);
>   			pages_addr = ttm->dma_address;
>   		}
> -		exclusive = bo->tbo.moving;
> +		resv = bo->tbo.base.resv;
>   	}
>   
>   	if (bo) {
> @@ -1775,7 +1780,8 @@ int amdgpu_vm_bo_update(struct amdgpu_device *adev, struct amdgpu_bo_va *bo_va,
>   		flags = 0x0;
>   	}
>   
> -	if (clear || (bo && bo->tbo.base.resv == vm->root.base.bo->tbo.base.resv))
> +	if (clear || (bo && bo->tbo.base.resv ==
> +		      vm->root.base.bo->tbo.base.resv))
>   		last_update = &vm->last_update;
>   	else
>   		last_update = &bo_va->last_pt_update;
> @@ -1789,7 +1795,7 @@ int amdgpu_vm_bo_update(struct amdgpu_device *adev, struct amdgpu_bo_va *bo_va,
>   	}
>   
>   	list_for_each_entry(mapping, &bo_va->invalids, list) {
> -		r = amdgpu_vm_bo_split_mapping(adev, exclusive, pages_addr, vm,
> +		r = amdgpu_vm_bo_split_mapping(adev, resv, pages_addr, vm,
>   					       mapping, flags, bo_adev, nodes,
>   					       last_update);
>   		if (r)
> @@ -1984,6 +1990,7 @@ int amdgpu_vm_clear_freed(struct amdgpu_device *adev,
>   			  struct amdgpu_vm *vm,
>   			  struct dma_fence **fence)
>   {
> +	struct dma_resv *resv = vm->root.base.bo->tbo.base.resv;
>   	struct amdgpu_bo_va_mapping *mapping;
>   	uint64_t init_pte_value = 0;
>   	struct dma_fence *f = NULL;
> @@ -1998,7 +2005,7 @@ int amdgpu_vm_clear_freed(struct amdgpu_device *adev,
>   		    mapping->start < AMDGPU_GMC_HOLE_START)
>   			init_pte_value = AMDGPU_PTE_DEFAULT_ATC;
>   
> -		r = amdgpu_vm_bo_update_mapping(adev, vm, false, NULL,
> +		r = amdgpu_vm_bo_update_mapping(adev, vm, false, resv,
>   						mapping->start, mapping->last,
>   						init_pte_value, 0, NULL, &f);
>   		amdgpu_vm_free_mapping(adev, vm, mapping, f);
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h
> index c21a36bebc0c..b5705fcfc935 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h
> @@ -230,8 +230,8 @@ struct amdgpu_vm_update_params {
>   
>   struct amdgpu_vm_update_funcs {
>   	int (*map_table)(struct amdgpu_bo *bo);
> -	int (*prepare)(struct amdgpu_vm_update_params *p, void * owner,
> -		       struct dma_fence *exclusive);
> +	int (*prepare)(struct amdgpu_vm_update_params *p, struct dma_resv *resv,
> +		       enum amdgpu_sync_mode sync_mode);
>   	int (*update)(struct amdgpu_vm_update_params *p,
>   		      struct amdgpu_bo *bo, uint64_t pe, uint64_t addr,
>   		      unsigned count, uint32_t incr, uint64_t flags);
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm_cpu.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm_cpu.c
> index 68b013be3837..e38516304070 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm_cpu.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm_cpu.c
> @@ -44,26 +44,14 @@ static int amdgpu_vm_cpu_map_table(struct amdgpu_bo *table)
>    * Returns:
>    * Negativ errno, 0 for success.
>    */
> -static int amdgpu_vm_cpu_prepare(struct amdgpu_vm_update_params *p, void *owner,
> -				 struct dma_fence *exclusive)
> +static int amdgpu_vm_cpu_prepare(struct amdgpu_vm_update_params *p,
> +				 struct dma_resv *resv,
> +				 enum amdgpu_sync_mode sync_mode)
>   {
> -	int r;
> -
> -	/* Wait for any BO move to be completed */
> -	if (exclusive) {
> -		r = dma_fence_wait(exclusive, true);
> -		if (unlikely(r))
> -			return r;
> -	}
> -
> -	/* Don't wait for submissions during page fault */
> -	if (p->direct)
> +	if (!resv)
>   		return 0;
>   
> -	/* Wait for PT BOs to be idle. PTs share the same resv. object
> -	 * as the root PD BO
> -	 */
> -	return amdgpu_bo_sync_wait(p->vm->root.base.bo, owner, true);
> +	return amdgpu_bo_sync_wait_resv(p->adev, resv, sync_mode, p->vm, true);
>   }
>   
>   /**
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm_sdma.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm_sdma.c
> index ab6481751763..4cc7881f438c 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm_sdma.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm_sdma.c
> @@ -58,9 +58,9 @@ static int amdgpu_vm_sdma_map_table(struct amdgpu_bo *table)
>    * Negativ errno, 0 for success.
>    */
>   static int amdgpu_vm_sdma_prepare(struct amdgpu_vm_update_params *p,
> -				  void *owner, struct dma_fence *exclusive)
> +				  struct dma_resv *resv,
> +				  enum amdgpu_sync_mode sync_mode)
>   {
> -	struct amdgpu_bo *root = p->vm->root.base.bo;
>   	unsigned int ndw = AMDGPU_VM_SDMA_MIN_NUM_DW;
>   	int r;
>   
> @@ -70,17 +70,10 @@ static int amdgpu_vm_sdma_prepare(struct amdgpu_vm_update_params *p,
>   
>   	p->num_dw_left = ndw;
>   
> -	/* Wait for moves to be completed */
> -	r = amdgpu_sync_fence(&p->job->sync, exclusive, false);
> -	if (r)
> -		return r;
> -
> -	/* Don't wait for any submissions during page fault handling */
> -	if (p->direct)
> +	if (!resv)
>   		return 0;
>   
> -	return amdgpu_sync_resv(p->adev, &p->job->sync, root->tbo.base.resv,
> -				AMDGPU_SYNC_NE_OWNER, owner);
> +	return amdgpu_sync_resv(p->adev, &p->job->sync, resv, sync_mode, p->vm);
>   }
>   
>   /**
* Re: [PATCH 4/4] drm/amdgpu: rework synchronization of VM updates v3
  2020-01-23 15:39   ` William Lewis
@ 2020-01-23 16:20     ` Christian König
  0 siblings, 0 replies; 15+ messages in thread
From: Christian König @ 2020-01-23 16:20 UTC (permalink / raw)
  To: William Lewis, amd-gfx

On 23.01.20 at 16:39, William Lewis wrote:
> Was the change from intr to true at the only use of intr in
> amdgpu_bo_sync_wait_resv intentional?  If so, would it not make sense to
> remove the argument from the function signature while the API is changing?

That's indeed a copy & paste typo. Thanks for pointing it out.
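
Something like this is what was intended (untested sketch of the obvious
fix; the actual v4 may differ):

	amdgpu_sync_resv(adev, &sync, resv, sync_mode, owner);
	r = amdgpu_sync_wait(&sync, intr);	/* not: true */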

Christian.

>
> On 1/23/20 8:21 AM, Christian König wrote:
>> [...]
>>    	amdgpu_sync_create(&sync);
>> -	amdgpu_sync_resv(adev, &sync, bo->tbo.base.resv,
>> -			 AMDGPU_SYNC_NE_OWNER, owner);
>> -	r = amdgpu_sync_wait(&sync, intr);
>> +	amdgpu_sync_resv(adev, &sync, resv, sync_mode, owner);
>> +	r = amdgpu_sync_wait(&sync, true);
>>    	amdgpu_sync_free(&sync);
>> [...]

* Re: [PATCH 2/4] drm/amdgpu: allow higher level PD invalidations
  2020-01-23 14:21 ` [PATCH 2/4] drm/amdgpu: allow higher level PD invalidations Christian König
@ 2020-01-27 14:57   ` Tom St Denis
  2020-01-27 17:37     ` Tom St Denis
  0 siblings, 1 reply; 15+ messages in thread
From: Tom St Denis @ 2020-01-27 14:57 UTC (permalink / raw)
  To: Christian König, amd-gfx, Felix.Kuehling

Reverting this patch avoids the gmc_v8 errors (I previously sent kernel
logs; here's one for convenience):

[  358.554335] ------------[ cut here ]------------
[  358.554338] kernel BUG at drivers/gpu/drm/amd/amdgpu/gmc_v8_0.c:725!
[  358.554351] invalid opcode: 0000 [#1] SMP NOPTI
[  358.554354] CPU: 0 PID: 4551 Comm: Xwayland Not tainted 5.4.0-rc7+ #14
[  358.554355] Hardware name: System manufacturer System Product 
Name/TUF B350M-PLUS GAMING, BIOS 5220 09/12/2019
[  358.554452] RIP: 0010:gmc_v8_0_get_vm_pde+0x10/0x20 [amdgpu]
[  358.554455] Code: 31 f6 48 89 df e8 30 e9 ff ff e9 28 ff ff ff e8 16 
d6 21 f9 66 0f 1f 44 00 00 48 b8 ff 0f 00 00 00 ff ff ff 48 85 02 75 01 
c3 <0f> 0b 66 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 00 55 48 89 fd e8 c7
[  358.554456] RSP: 0018:ffffa28142287a00 EFLAGS: 00010206
[  358.554458] RAX: ffffff0000000fff RBX: 0000000000000000 RCX: 
ffffa28142287a78
[  358.554459] RDX: ffffa28142287a50 RSI: 0000000000000002 RDI: 
ffff8b9be15e0000
[  358.554460] RBP: 0000000000000001 R08: 0000000000000000 R09: 
0000000000000000
[  358.554461] R10: 000000000000000f R11: 0000000000000406 R12: 
0000000000002030
[  358.554462] R13: 003ffffefea00000 R14: 0000000000101c00 R15: 
ffffa28142287af0
[  358.554464] FS:  00007f180a48ba80(0000) GS:ffff8b9be6c00000(0000) 
knlGS:0000000000000000
[  358.554465] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  358.554466] CR2: 00007f3de8f5dcc0 CR3: 00000002170c8000 CR4: 
00000000001406f0
[  358.554467] Call Trace:
[  358.554502]  amdgpu_vm_update_ptes+0x28a/0x7f0 [amdgpu]
[  358.554534]  ? amdgpu_sync_resv+0x34/0x190 [amdgpu]
[  358.554565]  amdgpu_vm_bo_update_mapping+0x12b/0x160 [amdgpu]
[  358.554596]  amdgpu_vm_bo_update+0x333/0x6a0 [amdgpu]
[  358.554626]  amdgpu_gem_va_ioctl+0x3c1/0x3e0 [amdgpu]
[  358.554658]  ? amdgpu_gem_va_map_flags+0x60/0x60 [amdgpu]
[  358.554663]  drm_ioctl_kernel+0xa5/0xf0
[  358.554665]  drm_ioctl+0x1df/0x366
[  358.554695]  ? amdgpu_gem_va_map_flags+0x60/0x60 [amdgpu]
[  358.554698]  ? __switch_to_asm+0x34/0x70
[  358.554699]  ? __switch_to_asm+0x40/0x70
[  358.554701]  ? __switch_to_asm+0x34/0x70
[  358.554702]  ? __switch_to_asm+0x40/0x70
[  358.554703]  ? __switch_to_asm+0x34/0x70
[  358.554704]  ? __switch_to_asm+0x40/0x70
[  358.554731]  amdgpu_drm_ioctl+0x44/0x80 [amdgpu]
[  358.554735]  do_vfs_ioctl+0x3f0/0x650
[  358.554737]  ? __schedule+0x28c/0x5a0
[  358.554738]  ksys_ioctl+0x59/0x90
[  358.554740]  __x64_sys_ioctl+0x11/0x20
[  358.554743]  do_syscall_64+0x43/0x110
[  358.554745]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
[  358.554747] RIP: 0033:0x7f1809e6638b
[  358.554749] Code: 0f 1e fa 48 8b 05 fd 9a 0c 00 64 c7 00 26 00 00 00 
48 c7 c0 ff ff ff ff c3 66 0f 1f 44 00 00 f3 0f 1e fa b8 10 00 00 00 0f 
05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d cd 9a 0c 00 f7 d8 64 89 01 48
[  358.554750] RSP: 002b:00007fffac20a638 EFLAGS: 00000246 ORIG_RAX: 
0000000000000010
[  358.554751] RAX: ffffffffffffffda RBX: 00007fffac20a690 RCX: 
00007f1809e6638b
[  358.554752] RDX: 00007fffac20a690 RSI: 00000000c0286448 RDI: 
000000000000000e
[  358.554753] RBP: 00000000c0286448 R08: 0000000101600000 R09: 
000000000000000e
[  358.554754] R10: 00000000000000e0 R11: 0000000000000246 R12: 
0000000000000000
[  358.554754] R13: 000000000000000e R14: 0000000000000001 R15: 
0000563d48bfd1f0
[  358.554756] Modules linked in: amdgpu gpu_sched ttm r8152 efivarfs
[  358.554790] ---[ end trace e0d54f6c49902356 ]---
[  358.554824] RIP: 0010:gmc_v8_0_get_vm_pde+0x10/0x20 [amdgpu]

(the gmc_v8 bug triggers regardless of whether I'm running piglit on my 
headless vega20 or directly on the carrizo).

However, with patch 2 of 4 reverted I then get:

[ 1471.338089] [drm:amdgpu_gem_va_ioctl [amdgpu]] *ERROR* Couldn't 
update BO_VA (-2)
[ 1471.338647] [drm:amdgpu_gem_va_ioctl [amdgpu]] *ERROR* Couldn't 
update BO_VA (-2)
[ 1471.339807] [drm:amdgpu_gem_va_ioctl [amdgpu]] *ERROR* Couldn't 
update BO_VA (-2)
[ 1471.341699] [drm:amdgpu_gem_va_ioctl [amdgpu]] *ERROR* Couldn't 
update BO_VA (-2)
[ 1471.342348] [drm:amdgpu_gem_va_ioctl [amdgpu]] *ERROR* Couldn't 
update BO_VA (-2)
[ 1471.342474] [drm:amdgpu_gem_va_ioctl [amdgpu]] *ERROR* Couldn't 
update BO_VA (-2)
[ 1471.342532] [drm:amdgpu_gem_va_ioctl [amdgpu]] *ERROR* Couldn't 
update BO_VA (-2)
[ 1471.342583] [drm:amdgpu_gem_va_ioctl [amdgpu]] *ERROR* Couldn't 
update BO_VA (-2)
[ 1471.342636] [drm:amdgpu_gem_va_ioctl [amdgpu]] *ERROR* Couldn't 
update BO_VA (-2)
[ 1471.342694] [drm:amdgpu_gem_va_ioctl [amdgpu]] *ERROR* Couldn't 
update BO_VA (-2)
[ 1471.342745] [drm:amdgpu_gem_va_ioctl [amdgpu]] *ERROR* Couldn't 
update BO_VA (-2)
[ 1471.342796] [drm:amdgpu_gem_va_ioctl [amdgpu]] *ERROR* Couldn't 
update BO_VA (-2)
[ 1471.343555] [drm:amdgpu_gem_va_ioctl [amdgpu]] *ERROR* Couldn't 
update BO_VA (-2)
[ 1471.350270] [drm:amdgpu_gem_va_ioctl [amdgpu]] *ERROR* Couldn't 
update BO_VA (-2)
[ 1471.350351] [drm:amdgpu_gem_va_ioctl [amdgpu]] *ERROR* Couldn't 
update BO_VA (-2)
[ 1471.350395] [drm:amdgpu_gem_va_ioctl [amdgpu]] *ERROR* Couldn't 
update BO_VA (-2)
[ 1471.351895] amdgpu 0000:00:01.0: failed to clear page tables on GEM 
object close (-2)
[ 1471.353995] amdgpu 0000:00:01.0: failed to clear page tables on GEM 
object close (-2)
[ 1471.354179] [drm:amdgpu_gem_va_ioctl [amdgpu]] *ERROR* Couldn't 
update BO_VA (-2)
[ 1471.354190] amdgpu 0000:00:01.0: failed to clear page tables on GEM 
object close (-2)
[ 1471.354252] [drm:amdgpu_gem_va_ioctl [amdgpu]] *ERROR* Couldn't 
update BO_VA (-2)
[ 1471.354259] amdgpu 0000:00:01.0: failed to clear page tables on GEM 
object close (-2)
[ 1471.354302] [drm:amdgpu_gem_va_ioctl [amdgpu]] *ERROR* Couldn't 
update BO_VA (-2)
[ 1471.354308] amdgpu 0000:00:01.0: failed to clear page tables on GEM 
object close (-2)
[ 1471.354351] [drm:amdgpu_gem_va_ioctl [amdgpu]] *ERROR* Couldn't 
update BO_VA (-2)
[ 1471.354356] amdgpu 0000:00:01.0: failed to clear page tables on GEM 
object close (-2)
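
For what it's worth, -2 is -ENOENT: with the patch reverted we're back to
the old behaviour in amdgpu_vm_update_ptes, where any update that walks
into a never-allocated PD bails out:

	pt = cursor.entry->base.bo;
	if (!pt)
		return -ENOENT;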

So clearly that's not the fix either :-/

(all on top of the latest drm-next from this morning).

Tom


On 2020-01-23 9:21 a.m., Christian König wrote:
> Allow partial invalidation on unallocated PDs. This is useful when we
> need to silence faults to stop interrupt floods on Vega.
>
> Signed-off-by: Christian König <christian.koenig@amd.com>
> ---
>   drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 23 ++++++++++++++++++-----
>   1 file changed, 18 insertions(+), 5 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
> index 8119f32ca94d..0f79c17118bf 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
> @@ -1467,9 +1467,8 @@ static int amdgpu_vm_update_ptes(struct amdgpu_vm_update_params *params,
>   			 * smaller than the address shift. Go to the next
>   			 * child entry and try again.
>   			 */
> -			if (!amdgpu_vm_pt_descendant(adev, &cursor))
> -				return -ENOENT;
> -			continue;
> +			if (amdgpu_vm_pt_descendant(adev, &cursor))
> +				continue;
>   		} else if (frag >= parent_shift) {
>   			/* If the fragment size is even larger than the parent
>   			 * shift we should go up one level and check it again.
> @@ -1480,8 +1479,19 @@ static int amdgpu_vm_update_ptes(struct amdgpu_vm_update_params *params,
>   		}
>   
>   		pt = cursor.entry->base.bo;
> -		if (!pt)
> -			return -ENOENT;
> +		if (!pt) {
> +			/* We need all PDs and PTs for mapping something, */
> +			if (flags & AMDGPU_PTE_VALID)
> +				return -ENOENT;
> +
> +			/* but unmapping something can happen at a higher
> +			 * level. */
> +			if (!amdgpu_vm_pt_ancestor(&cursor))
> +				return -EINVAL;
> +
> +			pt = cursor.entry->base.bo;
> +			shift = parent_shift;
> +		}
>   
>   		/* Looks good so far, calculate parameters for the update */
>   		incr = (uint64_t)AMDGPU_GPU_PAGE_SIZE << shift;
> @@ -1495,6 +1505,9 @@ static int amdgpu_vm_update_ptes(struct amdgpu_vm_update_params *params,
>   			uint64_t upd_end = min(entry_end, frag_end);
>   			unsigned nptes = (upd_end - frag_start) >> shift;
>   
> +			/* This can happen when we set higher level PDs to
> +			 * silent to stop fault floods. */
> +			nptes = max(nptes, 1u);
>   			amdgpu_vm_update_flags(params, pt, cursor.level,
>   					       pe_start, dst, nptes, incr,
>   					       flags | AMDGPU_PTE_FRAG(frag));

* Re: [PATCH 2/4] drm/amdgpu: allow higher level PD invalidations
  2020-01-27 14:57   ` Tom St Denis
@ 2020-01-27 17:37     ` Tom St Denis
  2020-01-27 19:14       ` Christian König
  0 siblings, 1 reply; 15+ messages in thread
From: Tom St Denis @ 2020-01-27 17:37 UTC (permalink / raw)
  To: Christian König, amd-gfx, Felix.Kuehling

The offending PDE address is "3ffffeff600000", which looks like it was
sign-extended into the "reserved" section between bits 40:58 (0x3fff),
hence triggering the BUG() in the gmc_v8 driver.
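
For reference, RAX in the dump above holds 0xffffff0000000fff, so the
check that fires should be the reserved-bits test in
gmc_v8_0_get_vm_pde, presumably just:

	BUG_ON(*addr & 0xFFFFFF0000000FFFULL);

and indeed:

	0x3ffffeff600000 & 0xFFFFFF0000000FFF == 0x3fff0000000000

is non-zero, so it's the high reserved bits that trip it.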

Tom

On 2020-01-27 9:57 a.m., Tom St Denis wrote:
> Reverting this patch avoids the gmc_v8 errors (I previously sent kernel
> logs; here's one for convenience):
> [...]

* Re: [PATCH 2/4] drm/amdgpu: allow higher level PD invalidations
  2020-01-27 17:37     ` Tom St Denis
@ 2020-01-27 19:14       ` Christian König
  2020-01-27 19:18         ` Tom St Denis
  0 siblings, 1 reply; 15+ messages in thread
From: Christian König @ 2020-01-27 19:14 UTC (permalink / raw)
  To: Tom St Denis, amd-gfx, Felix.Kuehling

OH, thanks for that hint! I've been staring at the code for a day now
without having a clue what's going wrong.

I hadn't realized that something could have been sign-extended!
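
One way to end up with exactly such a value (purely illustrative, not
necessarily where it actually happens): take an address from the high
"hole" range and truncate it instead of clearing the sign bits, e.g.

	uint64_t va  = 0xfffffefea00000ULL;	/* sign-extended VA */
	uint64_t pde = va & ((1ULL << 54) - 1);	/* == 0x3ffffefea00000 */

which matches the R13 value from the first log (003ffffefea00000) and
leaves the stale sign bits sitting right in the reserved range.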

Probably going to figure it out by tomorrow now,
Christian.

On 27.01.20 at 18:37, Tom St Denis wrote:
> The offending PDE address is "3ffffeff600000", which looks like it was
> sign-extended into the "reserved" section between bits 40:58 (0x3fff),
> hence triggering the BUG() in the gmc_v8 driver.
>
> Tom

* Re: [PATCH 2/4] drm/amdgpu: allow higher level PD invalidations
  2020-01-27 19:14       ` Christian König
@ 2020-01-27 19:18         ` Tom St Denis
  0 siblings, 0 replies; 15+ messages in thread
From: Tom St Denis @ 2020-01-27 19:18 UTC (permalink / raw)
  To: christian.koenig, amd-gfx, Felix.Kuehling

No problemo, maybe we should split that BUG() statement to catch the two
halves of the PDE address separately so next time it'll be more obvious.
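
Something like this, just a sketch of the idea (not tested):

	/* split the reserved-bits check so the oops shows which half hit */
	BUG_ON(*addr & 0xFFFFFF0000000000ULL);	/* reserved bits 63:40 */
	BUG_ON(*addr & 0x0000000000000FFFULL);	/* low bits 11:0 */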

For reference though, your 4 patches have applied cleanly on top of
drm-next every day and work fine on my navi10/raven1 system. So it seems
like just gmc8 is affected (I don't have any gmc 6/7 hardware to test).

Tom


On 2020-01-27 2:14 p.m., Christian König wrote:
> OH, thanks for that hint! I've been staring at the code for a day now
> without having a clue what's going wrong.
>
> I hadn't realized that something could have been sign-extended!
>
> Probably going to figure it out by tomorrow now,
> Christian.
>>> GEM object close (-2)
>>> [ 1471.354179] [drm:amdgpu_gem_va_ioctl [amdgpu]] *ERROR* Couldn't 
>>> update BO_VA (-2)
>>> [ 1471.354190] amdgpu 0000:00:01.0: failed to clear page tables on 
>>> GEM object close (-2)
>>> [ 1471.354252] [drm:amdgpu_gem_va_ioctl [amdgpu]] *ERROR* Couldn't 
>>> update BO_VA (-2)
>>> [ 1471.354259] amdgpu 0000:00:01.0: failed to clear page tables on 
>>> GEM object close (-2)
>>> [ 1471.354302] [drm:amdgpu_gem_va_ioctl [amdgpu]] *ERROR* Couldn't 
>>> update BO_VA (-2)
>>> [ 1471.354308] amdgpu 0000:00:01.0: failed to clear page tables on 
>>> GEM object close (-2)
>>> [ 1471.354351] [drm:amdgpu_gem_va_ioctl [amdgpu]] *ERROR* Couldn't 
>>> update BO_VA (-2)
>>> [ 1471.354356] amdgpu 0000:00:01.0: failed to clear page tables on 
>>> GEM object close (-2)
>>>
>>> So clearly that's not the fix either :-/
>>>
>>> (all on top of the latest drm-next from this morning).
>>>
>>> Tom
>>>
>>>
>>> On 2020-01-23 9:21 a.m., Christian König wrote:
>>>> Allow partial invalidation on unallocated PDs. This is useful when we
>>>> need to silence faults to stop interrupt floods on Vega.
>>>>
>>>> Signed-off-by: Christian König <christian.koenig@amd.com>
>>>> ---
>>>>   drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 23 ++++++++++++++++++-----
>>>>   1 file changed, 18 insertions(+), 5 deletions(-)
>>>>
>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c 
>>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
>>>> index 8119f32ca94d..0f79c17118bf 100644
>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
>>>> @@ -1467,9 +1467,8 @@ static int amdgpu_vm_update_ptes(struct 
>>>> amdgpu_vm_update_params *params,
>>>>                * smaller than the address shift. Go to the next
>>>>                * child entry and try again.
>>>>                */
>>>> -            if (!amdgpu_vm_pt_descendant(adev, &cursor))
>>>> -                return -ENOENT;
>>>> -            continue;
>>>> +            if (amdgpu_vm_pt_descendant(adev, &cursor))
>>>> +                continue;
>>>>           } else if (frag >= parent_shift) {
>>>>               /* If the fragment size is even larger than the parent
>>>>                * shift we should go up one level and check it again.
>>>> @@ -1480,8 +1479,19 @@ static int amdgpu_vm_update_ptes(struct 
>>>> amdgpu_vm_update_params *params,
>>>>           }
>>>>             pt = cursor.entry->base.bo;
>>>> -        if (!pt)
>>>> -            return -ENOENT;
>>>> +        if (!pt) {
>>>> +            /* We need all PDs and PTs for mapping something, */
>>>> +            if (flags & AMDGPU_PTE_VALID)
>>>> +                return -ENOENT;
>>>> +
>>>> +            /* but unmapping something can happen at a higher
>>>> +             * level. */
>>>> +            if (!amdgpu_vm_pt_ancestor(&cursor))
>>>> +                return -EINVAL;
>>>> +
>>>> +            pt = cursor.entry->base.bo;
>>>> +            shift = parent_shift;
>>>> +        }
>>>>             /* Looks good so far, calculate parameters for the 
>>>> update */
>>>>           incr = (uint64_t)AMDGPU_GPU_PAGE_SIZE << shift;
>>>> @@ -1495,6 +1505,9 @@ static int amdgpu_vm_update_ptes(struct 
>>>> amdgpu_vm_update_params *params,
>>>>               uint64_t upd_end = min(entry_end, frag_end);
>>>>               unsigned nptes = (upd_end - frag_start) >> shift;
>>>>   +            /* This can happen when we set higher level PDs to
>>>> +             * silent to stop fault floods. */
>>>> +            nptes = max(nptes, 1u);
>>>>               amdgpu_vm_update_flags(params, pt, cursor.level,
>>>>                              pe_start, dst, nptes, incr,
>>>>                              flags | AMDGPU_PTE_FRAG(frag));
>
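
As a footnote to the sign-extension diagnosis quoted above: a quick 
standalone check (my illustration only, not driver code; the mask is the 
one visible in the RIP disassembly) shows how that address trips the 
reserved-bits test:

    #include <stdint.h>
    #include <stdio.h>

    int main(void)
    {
        /* PDE address from the fault, and the mask the BUG() tests */
        uint64_t pde  = 0x3ffffeff600000ULL;
        uint64_t mask = 0xFFFFFF0000000FFFULL;

        /* prints 0x3fff0000000000: bits 40:53 are set, so BUG() fires */
        printf("0x%llx\n", (unsigned long long)(pde & mask));
        return 0;
    }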

Thread overview: 15+ messages
2020-01-23 14:21 [PATCH 1/4] drm/amdgpu: return EINVAL instead of ENOENT in the VM code Christian König
2020-01-23 14:21 ` [PATCH 2/4] drm/amdgpu: allow higher level PD invalidations Christian König
2020-01-27 14:57   ` Tom St Denis
2020-01-27 17:37     ` Tom St Denis
2020-01-27 19:14       ` Christian König
2020-01-27 19:18         ` Tom St Denis
2020-01-23 14:21 ` [PATCH 3/4] drm/amdgpu: simplify and fix amdgpu_sync_resv Christian König
2020-01-23 14:21 ` [PATCH 4/4] drm/amdgpu: rework synchronization of VM updates v3 Christian König
2020-01-23 14:22   ` Tom St Denis
2020-01-23 14:24     ` Christian König
2020-01-23 14:25       ` Tom St Denis
2020-01-23 14:35         ` Christian König
2020-01-23 15:23           ` Tom St Denis
2020-01-23 15:39   ` William Lewis
2020-01-23 16:20     ` Christian König
