All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 1/5] drm/amdgpu: stop pipelining VM PDs/PTs moves
@ 2018-09-11  9:55 Christian König
       [not found] ` <20180911095602.10152-1-christian.koenig-5C7GfCeVMHo@public.gmane.org>
  0 siblings, 1 reply; 15+ messages in thread
From: Christian König @ 2018-09-11  9:55 UTC (permalink / raw)
  To: amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW

We are going to need this for recoverable page fault handling and it
makes shadow handling during GPU reset much more easier.

Signed-off-by: Christian König <christian.koenig@amd.com>
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_object.c | 2 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c    | 6 +++++-
 2 files changed, 6 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
index b5f20b42439e..a7e39c9dd14b 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
@@ -1363,7 +1363,7 @@ u64 amdgpu_bo_gpu_offset(struct amdgpu_bo *bo)
 {
 	WARN_ON_ONCE(bo->tbo.mem.mem_type == TTM_PL_SYSTEM);
 	WARN_ON_ONCE(!ww_mutex_is_locked(&bo->tbo.resv->lock) &&
-		     !bo->pin_count);
+		     !bo->pin_count && bo->tbo.type != ttm_bo_type_kernel);
 	WARN_ON_ONCE(bo->tbo.mem.start == AMDGPU_BO_INVALID_OFFSET);
 	WARN_ON_ONCE(bo->tbo.mem.mem_type == TTM_PL_VRAM &&
 		     !(bo->flags & AMDGPU_GEM_CREATE_VRAM_CONTIGUOUS));
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
index d9f3201c9e5c..2f32dc692d40 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
@@ -524,7 +524,11 @@ static int amdgpu_move_blit(struct ttm_buffer_object *bo,
 	if (r)
 		goto error;
 
-	r = ttm_bo_pipeline_move(bo, fence, evict, new_mem);
+	/* Always block for VM page tables before committing the new location */
+	if (bo->type == ttm_bo_type_kernel)
+		r = ttm_bo_move_accel_cleanup(bo, fence, true, new_mem);
+	else
+		r = ttm_bo_pipeline_move(bo, fence, evict, new_mem);
 	dma_fence_put(fence);
 	return r;
 
-- 
2.14.1

_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH 2/5] drm/amdgpu: always enable shadow BOs
       [not found] ` <20180911095602.10152-1-christian.koenig-5C7GfCeVMHo@public.gmane.org>
@ 2018-09-11  9:55   ` Christian König
       [not found]     ` <20180911095602.10152-2-christian.koenig-5C7GfCeVMHo@public.gmane.org>
  2018-09-11  9:56   ` [PATCH 3/5] drm/amdgpu: shadow BOs don't need any alignment Christian König
                     ` (3 subsequent siblings)
  4 siblings, 1 reply; 15+ messages in thread
From: Christian König @ 2018-09-11  9:55 UTC (permalink / raw)
  To: amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW

Even when GPU recovery is disabled we could run into a manually
triggered recovery.

Signed-off-by: Christian König <christian.koenig@amd.com>
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_object.c | 27 +--------------------------
 1 file changed, 1 insertion(+), 26 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
index a7e39c9dd14b..7db0040ca145 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
@@ -38,31 +38,6 @@
 #include "amdgpu_trace.h"
 #include "amdgpu_amdkfd.h"
 
-/**
- * DOC: amdgpu_object
- *
- * This defines the interfaces to operate on an &amdgpu_bo buffer object which
- * represents memory used by driver (VRAM, system memory, etc.). The driver
- * provides DRM/GEM APIs to userspace. DRM/GEM APIs then use these interfaces
- * to create/destroy/set buffer object which are then managed by the kernel TTM
- * memory manager.
- * The interfaces are also used internally by kernel clients, including gfx,
- * uvd, etc. for kernel managed allocations used by the GPU.
- *
- */
-
-static bool amdgpu_bo_need_backup(struct amdgpu_device *adev)
-{
-	if (adev->flags & AMD_IS_APU)
-		return false;
-
-	if (amdgpu_gpu_recovery == 0 ||
-	    (amdgpu_gpu_recovery == -1  && !amdgpu_sriov_vf(adev)))
-		return false;
-
-	return true;
-}
-
 /**
  * amdgpu_bo_subtract_pin_size - Remove BO from pin_size accounting
  *
@@ -596,7 +571,7 @@ int amdgpu_bo_create(struct amdgpu_device *adev,
 	if (r)
 		return r;
 
-	if ((flags & AMDGPU_GEM_CREATE_SHADOW) && amdgpu_bo_need_backup(adev)) {
+	if ((flags & AMDGPU_GEM_CREATE_SHADOW) && !(adev->flags & AMD_IS_APU)) {
 		if (!bp->resv)
 			WARN_ON(reservation_object_lock((*bo_ptr)->tbo.resv,
 							NULL));
-- 
2.14.1

_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH 3/5] drm/amdgpu: shadow BOs don't need any alignment
       [not found] ` <20180911095602.10152-1-christian.koenig-5C7GfCeVMHo@public.gmane.org>
  2018-09-11  9:55   ` [PATCH 2/5] drm/amdgpu: always enable shadow BOs Christian König
@ 2018-09-11  9:56   ` Christian König
       [not found]     ` <20180911095602.10152-3-christian.koenig-5C7GfCeVMHo@public.gmane.org>
  2018-09-11  9:56   ` [PATCH 4/5] drm/amdgpu: always recover VRAM during GPU recovery Christian König
                     ` (2 subsequent siblings)
  4 siblings, 1 reply; 15+ messages in thread
From: Christian König @ 2018-09-11  9:56 UTC (permalink / raw)
  To: amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW

They aren't directly used by the hardware.

Signed-off-by: Christian König <christian.koenig@amd.com>
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_object.c | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
index 7db0040ca145..3a6f92de5504 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
@@ -516,7 +516,7 @@ static int amdgpu_bo_do_create(struct amdgpu_device *adev,
 }
 
 static int amdgpu_bo_create_shadow(struct amdgpu_device *adev,
-				   unsigned long size, int byte_align,
+				   unsigned long size,
 				   struct amdgpu_bo *bo)
 {
 	struct amdgpu_bo_param bp;
@@ -527,7 +527,6 @@ static int amdgpu_bo_create_shadow(struct amdgpu_device *adev,
 
 	memset(&bp, 0, sizeof(bp));
 	bp.size = size;
-	bp.byte_align = byte_align;
 	bp.domain = AMDGPU_GEM_DOMAIN_GTT;
 	bp.flags = AMDGPU_GEM_CREATE_CPU_GTT_USWC |
 		AMDGPU_GEM_CREATE_SHADOW;
@@ -576,7 +575,7 @@ int amdgpu_bo_create(struct amdgpu_device *adev,
 			WARN_ON(reservation_object_lock((*bo_ptr)->tbo.resv,
 							NULL));
 
-		r = amdgpu_bo_create_shadow(adev, bp->size, bp->byte_align, (*bo_ptr));
+		r = amdgpu_bo_create_shadow(adev, bp->size, *bo_ptr);
 
 		if (!bp->resv)
 			reservation_object_unlock((*bo_ptr)->tbo.resv);
-- 
2.14.1

_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH 4/5] drm/amdgpu: always recover VRAM during GPU recovery
       [not found] ` <20180911095602.10152-1-christian.koenig-5C7GfCeVMHo@public.gmane.org>
  2018-09-11  9:55   ` [PATCH 2/5] drm/amdgpu: always enable shadow BOs Christian König
  2018-09-11  9:56   ` [PATCH 3/5] drm/amdgpu: shadow BOs don't need any alignment Christian König
@ 2018-09-11  9:56   ` Christian König
       [not found]     ` <20180911095602.10152-4-christian.koenig-5C7GfCeVMHo@public.gmane.org>
  2018-09-11  9:56   ` [PATCH 5/5] drm/amdgpu: fix shadow BO restoring Christian König
  2018-09-13  9:28   ` [PATCH 1/5] drm/amdgpu: stop pipelining VM PDs/PTs moves Zhang, Jerry(Junwei)
  4 siblings, 1 reply; 15+ messages in thread
From: Christian König @ 2018-09-11  9:56 UTC (permalink / raw)
  To: amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW

It shouldn't add much overhead and we should make sure that critical
VRAM content is always restored.

Signed-off-by: Christian König <christian.koenig@amd.com>
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 10 +++++-----
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
index 93476b8c2e72..5eba66ecf668 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
@@ -2989,7 +2989,7 @@ static int amdgpu_device_recover_vram_from_shadow(struct amdgpu_device *adev,
 }
 
 /**
- * amdgpu_device_handle_vram_lost - Handle the loss of VRAM contents
+ * amdgpu_device_recover_vram - Recover some VRAM contents
  *
  * @adev: amdgpu_device pointer
  *
@@ -2998,7 +2998,7 @@ static int amdgpu_device_recover_vram_from_shadow(struct amdgpu_device *adev,
  * the contents of VRAM might be lost.
  * Returns 0 on success, 1 on failure.
  */
-static int amdgpu_device_handle_vram_lost(struct amdgpu_device *adev)
+static int amdgpu_device_recover_vram(struct amdgpu_device *adev)
 {
 	struct amdgpu_ring *ring = adev->mman.buffer_funcs_ring;
 	struct amdgpu_bo *bo, *tmp;
@@ -3125,8 +3125,8 @@ static int amdgpu_device_reset(struct amdgpu_device *adev)
 		}
 	}
 
-	if (!r && ((need_full_reset && !(adev->flags & AMD_IS_APU)) || vram_lost))
-		r = amdgpu_device_handle_vram_lost(adev);
+	if (!r)
+		r = amdgpu_device_recover_vram(adev);
 
 	return r;
 }
@@ -3172,7 +3172,7 @@ static int amdgpu_device_reset_sriov(struct amdgpu_device *adev,
 	amdgpu_virt_release_full_gpu(adev, true);
 	if (!r && adev->virt.gim_feature & AMDGIM_FEATURE_GIM_FLR_VRAMLOST) {
 		atomic_inc(&adev->vram_lost_counter);
-		r = amdgpu_device_handle_vram_lost(adev);
+		r = amdgpu_device_recover_vram(adev);
 	}
 
 	return r;
-- 
2.14.1

_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH 5/5] drm/amdgpu: fix shadow BO restoring
       [not found] ` <20180911095602.10152-1-christian.koenig-5C7GfCeVMHo@public.gmane.org>
                     ` (2 preceding siblings ...)
  2018-09-11  9:56   ` [PATCH 4/5] drm/amdgpu: always recover VRAM during GPU recovery Christian König
@ 2018-09-11  9:56   ` Christian König
       [not found]     ` <20180911095602.10152-5-christian.koenig-5C7GfCeVMHo@public.gmane.org>
  2018-09-13  9:28   ` [PATCH 1/5] drm/amdgpu: stop pipelining VM PDs/PTs moves Zhang, Jerry(Junwei)
  4 siblings, 1 reply; 15+ messages in thread
From: Christian König @ 2018-09-11  9:56 UTC (permalink / raw)
  To: amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW

Don't grab the reservation lock any more and simplify the handling quite
a bit.

Signed-off-by: Christian König <christian.koenig@amd.com>
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 109 ++++++++---------------------
 drivers/gpu/drm/amd/amdgpu/amdgpu_object.c |  46 ++++--------
 drivers/gpu/drm/amd/amdgpu/amdgpu_object.h |   8 +--
 3 files changed, 43 insertions(+), 120 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
index 5eba66ecf668..20bb702f5c7f 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
@@ -2940,54 +2940,6 @@ static int amdgpu_device_ip_post_soft_reset(struct amdgpu_device *adev)
 	return 0;
 }
 
-/**
- * amdgpu_device_recover_vram_from_shadow - restore shadowed VRAM buffers
- *
- * @adev: amdgpu_device pointer
- * @ring: amdgpu_ring for the engine handling the buffer operations
- * @bo: amdgpu_bo buffer whose shadow is being restored
- * @fence: dma_fence associated with the operation
- *
- * Restores the VRAM buffer contents from the shadow in GTT.  Used to
- * restore things like GPUVM page tables after a GPU reset where
- * the contents of VRAM might be lost.
- * Returns 0 on success, negative error code on failure.
- */
-static int amdgpu_device_recover_vram_from_shadow(struct amdgpu_device *adev,
-						  struct amdgpu_ring *ring,
-						  struct amdgpu_bo *bo,
-						  struct dma_fence **fence)
-{
-	uint32_t domain;
-	int r;
-
-	if (!bo->shadow)
-		return 0;
-
-	r = amdgpu_bo_reserve(bo, true);
-	if (r)
-		return r;
-	domain = amdgpu_mem_type_to_domain(bo->tbo.mem.mem_type);
-	/* if bo has been evicted, then no need to recover */
-	if (domain == AMDGPU_GEM_DOMAIN_VRAM) {
-		r = amdgpu_bo_validate(bo->shadow);
-		if (r) {
-			DRM_ERROR("bo validate failed!\n");
-			goto err;
-		}
-
-		r = amdgpu_bo_restore_from_shadow(adev, ring, bo,
-						 NULL, fence, true);
-		if (r) {
-			DRM_ERROR("recover page table failed!\n");
-			goto err;
-		}
-	}
-err:
-	amdgpu_bo_unreserve(bo);
-	return r;
-}
-
 /**
  * amdgpu_device_recover_vram - Recover some VRAM contents
  *
@@ -2996,16 +2948,15 @@ static int amdgpu_device_recover_vram_from_shadow(struct amdgpu_device *adev,
  * Restores the contents of VRAM buffers from the shadows in GTT.  Used to
  * restore things like GPUVM page tables after a GPU reset where
  * the contents of VRAM might be lost.
- * Returns 0 on success, 1 on failure.
+ *
+ * Returns:
+ * 0 on success, negative error code on failure.
  */
 static int amdgpu_device_recover_vram(struct amdgpu_device *adev)
 {
-	struct amdgpu_ring *ring = adev->mman.buffer_funcs_ring;
-	struct amdgpu_bo *bo, *tmp;
 	struct dma_fence *fence = NULL, *next = NULL;
-	long r = 1;
-	int i = 0;
-	long tmo;
+	struct amdgpu_bo *shadow;
+	long r = 1, tmo;
 
 	if (amdgpu_sriov_runtime(adev))
 		tmo = msecs_to_jiffies(8000);
@@ -3014,44 +2965,40 @@ static int amdgpu_device_recover_vram(struct amdgpu_device *adev)
 
 	DRM_INFO("recover vram bo from shadow start\n");
 	mutex_lock(&adev->shadow_list_lock);
-	list_for_each_entry_safe(bo, tmp, &adev->shadow_list, shadow_list) {
-		next = NULL;
-		amdgpu_device_recover_vram_from_shadow(adev, ring, bo, &next);
+	list_for_each_entry(shadow, &adev->shadow_list, shadow_list) {
+
+		/* No need to recover an evicted BO */
+		if (shadow->tbo.mem.mem_type != TTM_PL_TT ||
+		    shadow->parent->tbo.mem.mem_type != TTM_PL_VRAM)
+			continue;
+
+		r = amdgpu_bo_restore_shadow(shadow, &next);
+		if (r)
+			break;
+
 		if (fence) {
 			r = dma_fence_wait_timeout(fence, false, tmo);
-			if (r == 0)
-				pr_err("wait fence %p[%d] timeout\n", fence, i);
-			else if (r < 0)
-				pr_err("wait fence %p[%d] interrupted\n", fence, i);
-			if (r < 1) {
-				dma_fence_put(fence);
-				fence = next;
+			dma_fence_put(fence);
+			fence = next;
+			if (r <= 0)
 				break;
-			}
-			i++;
+		} else {
+			fence = next;
 		}
-
-		dma_fence_put(fence);
-		fence = next;
 	}
 	mutex_unlock(&adev->shadow_list_lock);
 
-	if (fence) {
-		r = dma_fence_wait_timeout(fence, false, tmo);
-		if (r == 0)
-			pr_err("wait fence %p[%d] timeout\n", fence, i);
-		else if (r < 0)
-			pr_err("wait fence %p[%d] interrupted\n", fence, i);
-
-	}
+	if (fence)
+		tmo = dma_fence_wait_timeout(fence, false, tmo);
 	dma_fence_put(fence);
 
-	if (r > 0)
-		DRM_INFO("recover vram bo from shadow done\n");
-	else
+	if (r <= 0 || tmo <= 0) {
 		DRM_ERROR("recover vram bo from shadow failed\n");
+		return -EIO;
+	}
 
-	return (r > 0) ? 0 : 1;
+	DRM_INFO("recover vram bo from shadow done\n");
+	return 0;
 }
 
 /**
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
index 3a6f92de5504..5b6d5fcc2151 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
@@ -537,7 +537,7 @@ static int amdgpu_bo_create_shadow(struct amdgpu_device *adev,
 	if (!r) {
 		bo->shadow->parent = amdgpu_bo_ref(bo);
 		mutex_lock(&adev->shadow_list_lock);
-		list_add_tail(&bo->shadow_list, &adev->shadow_list);
+		list_add_tail(&bo->shadow->shadow_list, &adev->shadow_list);
 		mutex_unlock(&adev->shadow_list_lock);
 	}
 
@@ -669,13 +669,10 @@ int amdgpu_bo_validate(struct amdgpu_bo *bo)
 }
 
 /**
- * amdgpu_bo_restore_from_shadow - restore an &amdgpu_bo buffer object
- * @adev: amdgpu device object
- * @ring: amdgpu_ring for the engine handling the buffer operations
- * @bo: &amdgpu_bo buffer to be restored
- * @resv: reservation object with embedded fence
+ * amdgpu_bo_restore_shadow - restore an &amdgpu_bo shadow
+ *
+ * @shadow: &amdgpu_bo shadow to be restored
  * @fence: dma_fence associated with the operation
- * @direct: whether to submit the job directly
  *
  * Copies a buffer object's shadow content back to the object.
  * This is used for recovering a buffer from its shadow in case of a gpu
@@ -684,36 +681,19 @@ int amdgpu_bo_validate(struct amdgpu_bo *bo)
  * Returns:
  * 0 for success or a negative error code on failure.
  */
-int amdgpu_bo_restore_from_shadow(struct amdgpu_device *adev,
-				  struct amdgpu_ring *ring,
-				  struct amdgpu_bo *bo,
-				  struct reservation_object *resv,
-				  struct dma_fence **fence,
-				  bool direct)
+int amdgpu_bo_restore_shadow(struct amdgpu_bo *shadow, struct dma_fence **fence)
 
 {
-	struct amdgpu_bo *shadow = bo->shadow;
-	uint64_t bo_addr, shadow_addr;
-	int r;
-
-	if (!shadow)
-		return -EINVAL;
-
-	bo_addr = amdgpu_bo_gpu_offset(bo);
-	shadow_addr = amdgpu_bo_gpu_offset(bo->shadow);
-
-	r = reservation_object_reserve_shared(bo->tbo.resv);
-	if (r)
-		goto err;
+	struct amdgpu_device *adev = amdgpu_ttm_adev(shadow->tbo.bdev);
+	struct amdgpu_ring *ring = adev->mman.buffer_funcs_ring;
+	uint64_t shadow_addr, parent_addr;
 
-	r = amdgpu_copy_buffer(ring, shadow_addr, bo_addr,
-			       amdgpu_bo_size(bo), resv, fence,
-			       direct, false);
-	if (!r)
-		amdgpu_bo_fence(bo, *fence, true);
+	shadow_addr = amdgpu_bo_gpu_offset(shadow);
+	parent_addr = amdgpu_bo_gpu_offset(shadow->parent);
 
-err:
-	return r;
+	return amdgpu_copy_buffer(ring, shadow_addr, parent_addr,
+				  amdgpu_bo_size(shadow), NULL, fence,
+				  true, false);
 }
 
 /**
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.h
index 907fdf46d895..363db417fb2e 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.h
@@ -273,12 +273,8 @@ int amdgpu_bo_backup_to_shadow(struct amdgpu_device *adev,
 			       struct reservation_object *resv,
 			       struct dma_fence **fence, bool direct);
 int amdgpu_bo_validate(struct amdgpu_bo *bo);
-int amdgpu_bo_restore_from_shadow(struct amdgpu_device *adev,
-				  struct amdgpu_ring *ring,
-				  struct amdgpu_bo *bo,
-				  struct reservation_object *resv,
-				  struct dma_fence **fence,
-				  bool direct);
+int amdgpu_bo_restore_shadow(struct amdgpu_bo *shadow,
+			     struct dma_fence **fence);
 uint32_t amdgpu_bo_get_preferred_pin_domain(struct amdgpu_device *adev,
 					    uint32_t domain);
 
-- 
2.14.1

_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply related	[flat|nested] 15+ messages in thread

* Re: [PATCH 2/5] drm/amdgpu: always enable shadow BOs
       [not found]     ` <20180911095602.10152-2-christian.koenig-5C7GfCeVMHo@public.gmane.org>
@ 2018-09-11 12:29       ` Michel Dänzer
       [not found]         ` <733b6122-bfef-442b-e95a-2cd17bfe12b5-otUistvHUpPR7s880joybQ@public.gmane.org>
  0 siblings, 1 reply; 15+ messages in thread
From: Michel Dänzer @ 2018-09-11 12:29 UTC (permalink / raw)
  To: Christian König; +Cc: amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW

On 2018-09-11 11:55 a.m., Christian König wrote:
> Even when GPU recovery is disabled we could run into a manually
> triggered recovery.
> 
> Signed-off-by: Christian König <christian.koenig@amd.com>
> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_object.c | 27 +--------------------------
>  1 file changed, 1 insertion(+), 26 deletions(-)
> 
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
> index a7e39c9dd14b..7db0040ca145 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
> @@ -38,31 +38,6 @@
>  #include "amdgpu_trace.h"
>  #include "amdgpu_amdkfd.h"
>  
> -/**
> - * DOC: amdgpu_object
> - *
> - * This defines the interfaces to operate on an &amdgpu_bo buffer object which
> - * represents memory used by driver (VRAM, system memory, etc.). The driver
> - * provides DRM/GEM APIs to userspace. DRM/GEM APIs then use these interfaces
> - * to create/destroy/set buffer object which are then managed by the kernel TTM
> - * memory manager.
> - * The interfaces are also used internally by kernel clients, including gfx,
> - * uvd, etc. for kernel managed allocations used by the GPU.
> - *
> - */

This comment shouldn't be removed, should it? Otherwise the commit log
needs to explain this, and the reference in Documentation/gpu/amdgpu.rst
needs to be removed as well.


-- 
Earthling Michel Dänzer               |               http://www.amd.com
Libre software enthusiast             |             Mesa and X developer
_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 2/5] drm/amdgpu: always enable shadow BOs
       [not found]         ` <733b6122-bfef-442b-e95a-2cd17bfe12b5-otUistvHUpPR7s880joybQ@public.gmane.org>
@ 2018-09-11 13:21           ` Christian König
       [not found]             ` <ce8f5f17-6ef0-27bc-3ae4-cfcc7fffaaa9-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
  0 siblings, 1 reply; 15+ messages in thread
From: Christian König @ 2018-09-11 13:21 UTC (permalink / raw)
  To: Michel Dänzer; +Cc: amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW

Am 11.09.2018 um 14:29 schrieb Michel Dänzer:
> On 2018-09-11 11:55 a.m., Christian König wrote:
>> Even when GPU recovery is disabled we could run into a manually
>> triggered recovery.
>>
>> Signed-off-by: Christian König <christian.koenig@amd.com>
>> ---
>>   drivers/gpu/drm/amd/amdgpu/amdgpu_object.c | 27 +--------------------------
>>   1 file changed, 1 insertion(+), 26 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
>> index a7e39c9dd14b..7db0040ca145 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
>> @@ -38,31 +38,6 @@
>>   #include "amdgpu_trace.h"
>>   #include "amdgpu_amdkfd.h"
>>   
>> -/**
>> - * DOC: amdgpu_object
>> - *
>> - * This defines the interfaces to operate on an &amdgpu_bo buffer object which
>> - * represents memory used by driver (VRAM, system memory, etc.). The driver
>> - * provides DRM/GEM APIs to userspace. DRM/GEM APIs then use these interfaces
>> - * to create/destroy/set buffer object which are then managed by the kernel TTM
>> - * memory manager.
>> - * The interfaces are also used internally by kernel clients, including gfx,
>> - * uvd, etc. for kernel managed allocations used by the GPU.
>> - *
>> - */
> This comment shouldn't be removed, should it? Otherwise the commit log
> needs to explain this, and the reference in Documentation/gpu/amdgpu.rst
> needs to be removed as well.

Oh, that was indeed an accident. Going to fix that.

Thanks,
Christian.
_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 15+ messages in thread

* RE: [PATCH 2/5] drm/amdgpu: always enable shadow BOs
       [not found]             ` <ce8f5f17-6ef0-27bc-3ae4-cfcc7fffaaa9-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
@ 2018-09-12 12:37               ` Deng, Emily
  0 siblings, 0 replies; 15+ messages in thread
From: Deng, Emily @ 2018-09-12 12:37 UTC (permalink / raw)
  To: Koenig, Christian, Michel Dänzer
  Cc: amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW

Acked-by: Emily Deng <Emily.Deng@amd.com>

>-----Original Message-----
>From: amd-gfx <amd-gfx-bounces@lists.freedesktop.org> On Behalf Of
>Christian König
>Sent: Tuesday, September 11, 2018 9:22 PM
>To: Michel Dänzer <michel@daenzer.net>
>Cc: amd-gfx@lists.freedesktop.org
>Subject: Re: [PATCH 2/5] drm/amdgpu: always enable shadow BOs
>
>Am 11.09.2018 um 14:29 schrieb Michel Dänzer:
>> On 2018-09-11 11:55 a.m., Christian König wrote:
>>> Even when GPU recovery is disabled we could run into a manually
>>> triggered recovery.
>>>
>>> Signed-off-by: Christian König <christian.koenig@amd.com>
>>> ---
>>>   drivers/gpu/drm/amd/amdgpu/amdgpu_object.c | 27 +-----------------------
>---
>>>   1 file changed, 1 insertion(+), 26 deletions(-)
>>>
>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
>>> index a7e39c9dd14b..7db0040ca145 100644
>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
>>> @@ -38,31 +38,6 @@
>>>   #include "amdgpu_trace.h"
>>>   #include "amdgpu_amdkfd.h"
>>>
>>> -/**
>>> - * DOC: amdgpu_object
>>> - *
>>> - * This defines the interfaces to operate on an &amdgpu_bo buffer
>>> object which
>>> - * represents memory used by driver (VRAM, system memory, etc.). The
>>> driver
>>> - * provides DRM/GEM APIs to userspace. DRM/GEM APIs then use these
>>> interfaces
>>> - * to create/destroy/set buffer object which are then managed by the
>>> kernel TTM
>>> - * memory manager.
>>> - * The interfaces are also used internally by kernel clients,
>>> including gfx,
>>> - * uvd, etc. for kernel managed allocations used by the GPU.
>>> - *
>>> - */
>> This comment shouldn't be removed, should it? Otherwise the commit log
>> needs to explain this, and the reference in
>> Documentation/gpu/amdgpu.rst needs to be removed as well.
>
>Oh, that was indeed an accident. Going to fix that.
>
>Thanks,
>Christian.
>_______________________________________________
>amd-gfx mailing list
>amd-gfx@lists.freedesktop.org
>https://lists.freedesktop.org/mailman/listinfo/amd-gfx
_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 1/5] drm/amdgpu: stop pipelining VM PDs/PTs moves
       [not found] ` <20180911095602.10152-1-christian.koenig-5C7GfCeVMHo@public.gmane.org>
                     ` (3 preceding siblings ...)
  2018-09-11  9:56   ` [PATCH 5/5] drm/amdgpu: fix shadow BO restoring Christian König
@ 2018-09-13  9:28   ` Zhang, Jerry(Junwei)
  4 siblings, 0 replies; 15+ messages in thread
From: Zhang, Jerry(Junwei) @ 2018-09-13  9:28 UTC (permalink / raw)
  To: Christian König, amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW


On 09/11/2018 05:55 PM, Christian König wrote:
> We are going to need this for recoverable page fault handling and it
> makes shadow handling during GPU reset much more easier.
>
> Signed-off-by: Christian König <christian.koenig@amd.com>
Acked-by: Junwei Zhang <Jerry.Zhang@amd.com>
> ---
>   drivers/gpu/drm/amd/amdgpu/amdgpu_object.c | 2 +-
>   drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c    | 6 +++++-
>   2 files changed, 6 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
> index b5f20b42439e..a7e39c9dd14b 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
> @@ -1363,7 +1363,7 @@ u64 amdgpu_bo_gpu_offset(struct amdgpu_bo *bo)
>   {
>   	WARN_ON_ONCE(bo->tbo.mem.mem_type == TTM_PL_SYSTEM);
>   	WARN_ON_ONCE(!ww_mutex_is_locked(&bo->tbo.resv->lock) &&
> -		     !bo->pin_count);
> +		     !bo->pin_count && bo->tbo.type != ttm_bo_type_kernel);
>   	WARN_ON_ONCE(bo->tbo.mem.start == AMDGPU_BO_INVALID_OFFSET);
>   	WARN_ON_ONCE(bo->tbo.mem.mem_type == TTM_PL_VRAM &&
>   		     !(bo->flags & AMDGPU_GEM_CREATE_VRAM_CONTIGUOUS));
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
> index d9f3201c9e5c..2f32dc692d40 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
> @@ -524,7 +524,11 @@ static int amdgpu_move_blit(struct ttm_buffer_object *bo,
>   	if (r)
>   		goto error;
>   
> -	r = ttm_bo_pipeline_move(bo, fence, evict, new_mem);
> +	/* Always block for VM page tables before committing the new location */
> +	if (bo->type == ttm_bo_type_kernel)
> +		r = ttm_bo_move_accel_cleanup(bo, fence, true, new_mem);
> +	else
> +		r = ttm_bo_pipeline_move(bo, fence, evict, new_mem);
>   	dma_fence_put(fence);
>   	return r;
>   

_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 4/5] drm/amdgpu: always recover VRAM during GPU recovery
       [not found]     ` <20180911095602.10152-4-christian.koenig-5C7GfCeVMHo@public.gmane.org>
@ 2018-09-13  9:28       ` Zhang, Jerry(Junwei)
  0 siblings, 0 replies; 15+ messages in thread
From: Zhang, Jerry(Junwei) @ 2018-09-13  9:28 UTC (permalink / raw)
  To: Christian König, amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW

On 09/11/2018 05:56 PM, Christian König wrote:
> It shouldn't add much overhead and we should make sure that critical
> VRAM content is always restored.
>
> Signed-off-by: Christian König <christian.koenig@amd.com>
Acked-by: Junwei Zhang <Jerry.Zhang@amd.com>
> ---
>   drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 10 +++++-----
>   1 file changed, 5 insertions(+), 5 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> index 93476b8c2e72..5eba66ecf668 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> @@ -2989,7 +2989,7 @@ static int amdgpu_device_recover_vram_from_shadow(struct amdgpu_device *adev,
>   }
>   
>   /**
> - * amdgpu_device_handle_vram_lost - Handle the loss of VRAM contents
> + * amdgpu_device_recover_vram - Recover some VRAM contents
>    *
>    * @adev: amdgpu_device pointer
>    *
> @@ -2998,7 +2998,7 @@ static int amdgpu_device_recover_vram_from_shadow(struct amdgpu_device *adev,
>    * the contents of VRAM might be lost.
>    * Returns 0 on success, 1 on failure.
>    */
> -static int amdgpu_device_handle_vram_lost(struct amdgpu_device *adev)
> +static int amdgpu_device_recover_vram(struct amdgpu_device *adev)
>   {
>   	struct amdgpu_ring *ring = adev->mman.buffer_funcs_ring;
>   	struct amdgpu_bo *bo, *tmp;
> @@ -3125,8 +3125,8 @@ static int amdgpu_device_reset(struct amdgpu_device *adev)
>   		}
>   	}
>   
> -	if (!r && ((need_full_reset && !(adev->flags & AMD_IS_APU)) || vram_lost))
> -		r = amdgpu_device_handle_vram_lost(adev);
> +	if (!r)
> +		r = amdgpu_device_recover_vram(adev);
>   
>   	return r;
>   }
> @@ -3172,7 +3172,7 @@ static int amdgpu_device_reset_sriov(struct amdgpu_device *adev,
>   	amdgpu_virt_release_full_gpu(adev, true);
>   	if (!r && adev->virt.gim_feature & AMDGIM_FEATURE_GIM_FLR_VRAMLOST) {
>   		atomic_inc(&adev->vram_lost_counter);
> -		r = amdgpu_device_handle_vram_lost(adev);
> +		r = amdgpu_device_recover_vram(adev);
>   	}
>   
>   	return r;

_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 5/5] drm/amdgpu: fix shadow BO restoring
       [not found]     ` <20180911095602.10152-5-christian.koenig-5C7GfCeVMHo@public.gmane.org>
@ 2018-09-13  9:29       ` Zhang, Jerry(Junwei)
       [not found]         ` <073f24f7-df48-55f6-f4fb-2d250cfd2dd1-5C7GfCeVMHo@public.gmane.org>
  0 siblings, 1 reply; 15+ messages in thread
From: Zhang, Jerry(Junwei) @ 2018-09-13  9:29 UTC (permalink / raw)
  To: Christian König, amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW

On 09/11/2018 05:56 PM, Christian König wrote:
> Don't grab the reservation lock any more and simplify the handling quite
> a bit.
>
> Signed-off-by: Christian König <christian.koenig@amd.com>
> ---
>   drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 109 ++++++++---------------------
>   drivers/gpu/drm/amd/amdgpu/amdgpu_object.c |  46 ++++--------
>   drivers/gpu/drm/amd/amdgpu/amdgpu_object.h |   8 +--
>   3 files changed, 43 insertions(+), 120 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> index 5eba66ecf668..20bb702f5c7f 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> @@ -2940,54 +2940,6 @@ static int amdgpu_device_ip_post_soft_reset(struct amdgpu_device *adev)
>   	return 0;
>   }
>   
> -/**
> - * amdgpu_device_recover_vram_from_shadow - restore shadowed VRAM buffers
> - *
> - * @adev: amdgpu_device pointer
> - * @ring: amdgpu_ring for the engine handling the buffer operations
> - * @bo: amdgpu_bo buffer whose shadow is being restored
> - * @fence: dma_fence associated with the operation
> - *
> - * Restores the VRAM buffer contents from the shadow in GTT.  Used to
> - * restore things like GPUVM page tables after a GPU reset where
> - * the contents of VRAM might be lost.
> - * Returns 0 on success, negative error code on failure.
> - */
> -static int amdgpu_device_recover_vram_from_shadow(struct amdgpu_device *adev,
> -						  struct amdgpu_ring *ring,
> -						  struct amdgpu_bo *bo,
> -						  struct dma_fence **fence)
> -{
> -	uint32_t domain;
> -	int r;
> -
> -	if (!bo->shadow)
> -		return 0;
> -
> -	r = amdgpu_bo_reserve(bo, true);
> -	if (r)
> -		return r;
> -	domain = amdgpu_mem_type_to_domain(bo->tbo.mem.mem_type);
> -	/* if bo has been evicted, then no need to recover */
> -	if (domain == AMDGPU_GEM_DOMAIN_VRAM) {
> -		r = amdgpu_bo_validate(bo->shadow);
> -		if (r) {
> -			DRM_ERROR("bo validate failed!\n");
> -			goto err;
> -		}
> -
> -		r = amdgpu_bo_restore_from_shadow(adev, ring, bo,
> -						 NULL, fence, true);
> -		if (r) {
> -			DRM_ERROR("recover page table failed!\n");
> -			goto err;
> -		}
> -	}
> -err:
> -	amdgpu_bo_unreserve(bo);
> -	return r;
> -}
> -
>   /**
>    * amdgpu_device_recover_vram - Recover some VRAM contents
>    *
> @@ -2996,16 +2948,15 @@ static int amdgpu_device_recover_vram_from_shadow(struct amdgpu_device *adev,
>    * Restores the contents of VRAM buffers from the shadows in GTT.  Used to
>    * restore things like GPUVM page tables after a GPU reset where
>    * the contents of VRAM might be lost.
> - * Returns 0 on success, 1 on failure.
> + *
> + * Returns:
> + * 0 on success, negative error code on failure.
>    */
>   static int amdgpu_device_recover_vram(struct amdgpu_device *adev)
>   {
> -	struct amdgpu_ring *ring = adev->mman.buffer_funcs_ring;
> -	struct amdgpu_bo *bo, *tmp;
>   	struct dma_fence *fence = NULL, *next = NULL;
> -	long r = 1;
> -	int i = 0;
> -	long tmo;
> +	struct amdgpu_bo *shadow;
> +	long r = 1, tmo;
>   
>   	if (amdgpu_sriov_runtime(adev))
>   		tmo = msecs_to_jiffies(8000);
> @@ -3014,44 +2965,40 @@ static int amdgpu_device_recover_vram(struct amdgpu_device *adev)
>   
>   	DRM_INFO("recover vram bo from shadow start\n");
>   	mutex_lock(&adev->shadow_list_lock);
> -	list_for_each_entry_safe(bo, tmp, &adev->shadow_list, shadow_list) {
> -		next = NULL;
> -		amdgpu_device_recover_vram_from_shadow(adev, ring, bo, &next);
> +	list_for_each_entry(shadow, &adev->shadow_list, shadow_list) {
> +
> +		/* No need to recover an evicted BO */
> +		if (shadow->tbo.mem.mem_type != TTM_PL_TT ||
> +		    shadow->parent->tbo.mem.mem_type != TTM_PL_VRAM)
is there a change that shadow bo evicted to other domain?
like SYSTEM?

Regards,
Jerry
> +			continue;
> +
> +		r = amdgpu_bo_restore_shadow(shadow, &next);
> +		if (r)
> +			break;
> +
>   		if (fence) {
>   			r = dma_fence_wait_timeout(fence, false, tmo);
> -			if (r == 0)
> -				pr_err("wait fence %p[%d] timeout\n", fence, i);
> -			else if (r < 0)
> -				pr_err("wait fence %p[%d] interrupted\n", fence, i);
> -			if (r < 1) {
> -				dma_fence_put(fence);
> -				fence = next;
> +			dma_fence_put(fence);
> +			fence = next;
> +			if (r <= 0)
>   				break;
> -			}
> -			i++;
> +		} else {
> +			fence = next;
>   		}
> -
> -		dma_fence_put(fence);
> -		fence = next;
>   	}
>   	mutex_unlock(&adev->shadow_list_lock);
>   
> -	if (fence) {
> -		r = dma_fence_wait_timeout(fence, false, tmo);
> -		if (r == 0)
> -			pr_err("wait fence %p[%d] timeout\n", fence, i);
> -		else if (r < 0)
> -			pr_err("wait fence %p[%d] interrupted\n", fence, i);
> -
> -	}
> +	if (fence)
> +		tmo = dma_fence_wait_timeout(fence, false, tmo);
>   	dma_fence_put(fence);
>   
> -	if (r > 0)
> -		DRM_INFO("recover vram bo from shadow done\n");
> -	else
> +	if (r <= 0 || tmo <= 0) {
>   		DRM_ERROR("recover vram bo from shadow failed\n");
> +		return -EIO;
> +	}
>   
> -	return (r > 0) ? 0 : 1;
> +	DRM_INFO("recover vram bo from shadow done\n");
> +	return 0;
>   }
>   
>   /**
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
> index 3a6f92de5504..5b6d5fcc2151 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
> @@ -537,7 +537,7 @@ static int amdgpu_bo_create_shadow(struct amdgpu_device *adev,
>   	if (!r) {
>   		bo->shadow->parent = amdgpu_bo_ref(bo);
>   		mutex_lock(&adev->shadow_list_lock);
> -		list_add_tail(&bo->shadow_list, &adev->shadow_list);
> +		list_add_tail(&bo->shadow->shadow_list, &adev->shadow_list);
>   		mutex_unlock(&adev->shadow_list_lock);
>   	}
>   
> @@ -669,13 +669,10 @@ int amdgpu_bo_validate(struct amdgpu_bo *bo)
>   }
>   
>   /**
> - * amdgpu_bo_restore_from_shadow - restore an &amdgpu_bo buffer object
> - * @adev: amdgpu device object
> - * @ring: amdgpu_ring for the engine handling the buffer operations
> - * @bo: &amdgpu_bo buffer to be restored
> - * @resv: reservation object with embedded fence
> + * amdgpu_bo_restore_shadow - restore an &amdgpu_bo shadow
> + *
> + * @shadow: &amdgpu_bo shadow to be restored
>    * @fence: dma_fence associated with the operation
> - * @direct: whether to submit the job directly
>    *
>    * Copies a buffer object's shadow content back to the object.
>    * This is used for recovering a buffer from its shadow in case of a gpu
> @@ -684,36 +681,19 @@ int amdgpu_bo_validate(struct amdgpu_bo *bo)
>    * Returns:
>    * 0 for success or a negative error code on failure.
>    */
> -int amdgpu_bo_restore_from_shadow(struct amdgpu_device *adev,
> -				  struct amdgpu_ring *ring,
> -				  struct amdgpu_bo *bo,
> -				  struct reservation_object *resv,
> -				  struct dma_fence **fence,
> -				  bool direct)
> +int amdgpu_bo_restore_shadow(struct amdgpu_bo *shadow, struct dma_fence **fence)
>   
>   {
> -	struct amdgpu_bo *shadow = bo->shadow;
> -	uint64_t bo_addr, shadow_addr;
> -	int r;
> -
> -	if (!shadow)
> -		return -EINVAL;
> -
> -	bo_addr = amdgpu_bo_gpu_offset(bo);
> -	shadow_addr = amdgpu_bo_gpu_offset(bo->shadow);
> -
> -	r = reservation_object_reserve_shared(bo->tbo.resv);
> -	if (r)
> -		goto err;
> +	struct amdgpu_device *adev = amdgpu_ttm_adev(shadow->tbo.bdev);
> +	struct amdgpu_ring *ring = adev->mman.buffer_funcs_ring;
> +	uint64_t shadow_addr, parent_addr;
>   
> -	r = amdgpu_copy_buffer(ring, shadow_addr, bo_addr,
> -			       amdgpu_bo_size(bo), resv, fence,
> -			       direct, false);
> -	if (!r)
> -		amdgpu_bo_fence(bo, *fence, true);
> +	shadow_addr = amdgpu_bo_gpu_offset(shadow);
> +	parent_addr = amdgpu_bo_gpu_offset(shadow->parent);
>   
> -err:
> -	return r;
> +	return amdgpu_copy_buffer(ring, shadow_addr, parent_addr,
> +				  amdgpu_bo_size(shadow), NULL, fence,
> +				  true, false);
>   }
>   
>   /**
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.h
> index 907fdf46d895..363db417fb2e 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.h
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.h
> @@ -273,12 +273,8 @@ int amdgpu_bo_backup_to_shadow(struct amdgpu_device *adev,
>   			       struct reservation_object *resv,
>   			       struct dma_fence **fence, bool direct);
>   int amdgpu_bo_validate(struct amdgpu_bo *bo);
> -int amdgpu_bo_restore_from_shadow(struct amdgpu_device *adev,
> -				  struct amdgpu_ring *ring,
> -				  struct amdgpu_bo *bo,
> -				  struct reservation_object *resv,
> -				  struct dma_fence **fence,
> -				  bool direct);
> +int amdgpu_bo_restore_shadow(struct amdgpu_bo *shadow,
> +			     struct dma_fence **fence);
>   uint32_t amdgpu_bo_get_preferred_pin_domain(struct amdgpu_device *adev,
>   					    uint32_t domain);
>   

_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 3/5] drm/amdgpu: shadow BOs don't need any alignment
       [not found]     ` <20180911095602.10152-3-christian.koenig-5C7GfCeVMHo@public.gmane.org>
@ 2018-09-13  9:30       ` Zhang, Jerry(Junwei)
  0 siblings, 0 replies; 15+ messages in thread
From: Zhang, Jerry(Junwei) @ 2018-09-13  9:30 UTC (permalink / raw)
  To: Christian König, amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW

On 09/11/2018 05:56 PM, Christian König wrote:
> They aren't directly used by the hardware.
>
> Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Junwei Zhang <Jerry.Zhang@amd.com>
> ---
>   drivers/gpu/drm/amd/amdgpu/amdgpu_object.c | 5 ++---
>   1 file changed, 2 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
> index 7db0040ca145..3a6f92de5504 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
> @@ -516,7 +516,7 @@ static int amdgpu_bo_do_create(struct amdgpu_device *adev,
>   }
>   
>   static int amdgpu_bo_create_shadow(struct amdgpu_device *adev,
> -				   unsigned long size, int byte_align,
> +				   unsigned long size,
>   				   struct amdgpu_bo *bo)
>   {
>   	struct amdgpu_bo_param bp;
> @@ -527,7 +527,6 @@ static int amdgpu_bo_create_shadow(struct amdgpu_device *adev,
>   
>   	memset(&bp, 0, sizeof(bp));
>   	bp.size = size;
> -	bp.byte_align = byte_align;
>   	bp.domain = AMDGPU_GEM_DOMAIN_GTT;
>   	bp.flags = AMDGPU_GEM_CREATE_CPU_GTT_USWC |
>   		AMDGPU_GEM_CREATE_SHADOW;
> @@ -576,7 +575,7 @@ int amdgpu_bo_create(struct amdgpu_device *adev,
>   			WARN_ON(reservation_object_lock((*bo_ptr)->tbo.resv,
>   							NULL));
>   
> -		r = amdgpu_bo_create_shadow(adev, bp->size, bp->byte_align, (*bo_ptr));
> +		r = amdgpu_bo_create_shadow(adev, bp->size, *bo_ptr);
>   
>   		if (!bp->resv)
>   			reservation_object_unlock((*bo_ptr)->tbo.resv);

_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 5/5] drm/amdgpu: fix shadow BO restoring
       [not found]         ` <073f24f7-df48-55f6-f4fb-2d250cfd2dd1-5C7GfCeVMHo@public.gmane.org>
@ 2018-09-14 11:54           ` Christian König
       [not found]             ` <fa8fb53f-ff84-fb45-59c7-1d52ec57fba7-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
  0 siblings, 1 reply; 15+ messages in thread
From: Christian König @ 2018-09-14 11:54 UTC (permalink / raw)
  To: Zhang, Jerry(Junwei), amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW

Am 13.09.2018 um 11:29 schrieb Zhang, Jerry(Junwei):
> On 09/11/2018 05:56 PM, Christian König wrote:
>> Don't grab the reservation lock any more and simplify the handling quite
>> a bit.
>>
>> Signed-off-by: Christian König <christian.koenig@amd.com>
>> ---
>>   drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 109 
>> ++++++++---------------------
>>   drivers/gpu/drm/amd/amdgpu/amdgpu_object.c |  46 ++++--------
>>   drivers/gpu/drm/amd/amdgpu/amdgpu_object.h |   8 +--
>>   3 files changed, 43 insertions(+), 120 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c 
>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>> index 5eba66ecf668..20bb702f5c7f 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>> @@ -2940,54 +2940,6 @@ static int 
>> amdgpu_device_ip_post_soft_reset(struct amdgpu_device *adev)
>>       return 0;
>>   }
>>   -/**
>> - * amdgpu_device_recover_vram_from_shadow - restore shadowed VRAM 
>> buffers
>> - *
>> - * @adev: amdgpu_device pointer
>> - * @ring: amdgpu_ring for the engine handling the buffer operations
>> - * @bo: amdgpu_bo buffer whose shadow is being restored
>> - * @fence: dma_fence associated with the operation
>> - *
>> - * Restores the VRAM buffer contents from the shadow in GTT. Used to
>> - * restore things like GPUVM page tables after a GPU reset where
>> - * the contents of VRAM might be lost.
>> - * Returns 0 on success, negative error code on failure.
>> - */
>> -static int amdgpu_device_recover_vram_from_shadow(struct 
>> amdgpu_device *adev,
>> -                          struct amdgpu_ring *ring,
>> -                          struct amdgpu_bo *bo,
>> -                          struct dma_fence **fence)
>> -{
>> -    uint32_t domain;
>> -    int r;
>> -
>> -    if (!bo->shadow)
>> -        return 0;
>> -
>> -    r = amdgpu_bo_reserve(bo, true);
>> -    if (r)
>> -        return r;
>> -    domain = amdgpu_mem_type_to_domain(bo->tbo.mem.mem_type);
>> -    /* if bo has been evicted, then no need to recover */
>> -    if (domain == AMDGPU_GEM_DOMAIN_VRAM) {
>> -        r = amdgpu_bo_validate(bo->shadow);
>> -        if (r) {
>> -            DRM_ERROR("bo validate failed!\n");
>> -            goto err;
>> -        }
>> -
>> -        r = amdgpu_bo_restore_from_shadow(adev, ring, bo,
>> -                         NULL, fence, true);
>> -        if (r) {
>> -            DRM_ERROR("recover page table failed!\n");
>> -            goto err;
>> -        }
>> -    }
>> -err:
>> -    amdgpu_bo_unreserve(bo);
>> -    return r;
>> -}
>> -
>>   /**
>>    * amdgpu_device_recover_vram - Recover some VRAM contents
>>    *
>> @@ -2996,16 +2948,15 @@ static int 
>> amdgpu_device_recover_vram_from_shadow(struct amdgpu_device *adev,
>>    * Restores the contents of VRAM buffers from the shadows in GTT.  
>> Used to
>>    * restore things like GPUVM page tables after a GPU reset where
>>    * the contents of VRAM might be lost.
>> - * Returns 0 on success, 1 on failure.
>> + *
>> + * Returns:
>> + * 0 on success, negative error code on failure.
>>    */
>>   static int amdgpu_device_recover_vram(struct amdgpu_device *adev)
>>   {
>> -    struct amdgpu_ring *ring = adev->mman.buffer_funcs_ring;
>> -    struct amdgpu_bo *bo, *tmp;
>>       struct dma_fence *fence = NULL, *next = NULL;
>> -    long r = 1;
>> -    int i = 0;
>> -    long tmo;
>> +    struct amdgpu_bo *shadow;
>> +    long r = 1, tmo;
>>         if (amdgpu_sriov_runtime(adev))
>>           tmo = msecs_to_jiffies(8000);
>> @@ -3014,44 +2965,40 @@ static int amdgpu_device_recover_vram(struct 
>> amdgpu_device *adev)
>>         DRM_INFO("recover vram bo from shadow start\n");
>>       mutex_lock(&adev->shadow_list_lock);
>> -    list_for_each_entry_safe(bo, tmp, &adev->shadow_list, 
>> shadow_list) {
>> -        next = NULL;
>> -        amdgpu_device_recover_vram_from_shadow(adev, ring, bo, &next);
>> +    list_for_each_entry(shadow, &adev->shadow_list, shadow_list) {
>> +
>> +        /* No need to recover an evicted BO */
>> +        if (shadow->tbo.mem.mem_type != TTM_PL_TT ||
>> +            shadow->parent->tbo.mem.mem_type != TTM_PL_VRAM)
> is there a change that shadow bo evicted to other domain?
> like SYSTEM?

Yes, that's why I test "!= TTM_PL_TT" here.

What can happen is that either the shadow or the page table or page 
directory is evicted.

But in this case we don't need to restore anything because of patch #1 
in this series.

Regards,
Christian.

>
> Regards,
> Jerry
>> +            continue;
>> +
>> +        r = amdgpu_bo_restore_shadow(shadow, &next);
>> +        if (r)
>> +            break;
>> +
>>           if (fence) {
>>               r = dma_fence_wait_timeout(fence, false, tmo);
>> -            if (r == 0)
>> -                pr_err("wait fence %p[%d] timeout\n", fence, i);
>> -            else if (r < 0)
>> -                pr_err("wait fence %p[%d] interrupted\n", fence, i);
>> -            if (r < 1) {
>> -                dma_fence_put(fence);
>> -                fence = next;
>> +            dma_fence_put(fence);
>> +            fence = next;
>> +            if (r <= 0)
>>                   break;
>> -            }
>> -            i++;
>> +        } else {
>> +            fence = next;
>>           }
>> -
>> -        dma_fence_put(fence);
>> -        fence = next;
>>       }
>>       mutex_unlock(&adev->shadow_list_lock);
>>   -    if (fence) {
>> -        r = dma_fence_wait_timeout(fence, false, tmo);
>> -        if (r == 0)
>> -            pr_err("wait fence %p[%d] timeout\n", fence, i);
>> -        else if (r < 0)
>> -            pr_err("wait fence %p[%d] interrupted\n", fence, i);
>> -
>> -    }
>> +    if (fence)
>> +        tmo = dma_fence_wait_timeout(fence, false, tmo);
>>       dma_fence_put(fence);
>>   -    if (r > 0)
>> -        DRM_INFO("recover vram bo from shadow done\n");
>> -    else
>> +    if (r <= 0 || tmo <= 0) {
>>           DRM_ERROR("recover vram bo from shadow failed\n");
>> +        return -EIO;
>> +    }
>>   -    return (r > 0) ? 0 : 1;
>> +    DRM_INFO("recover vram bo from shadow done\n");
>> +    return 0;
>>   }
>>     /**
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c 
>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
>> index 3a6f92de5504..5b6d5fcc2151 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
>> @@ -537,7 +537,7 @@ static int amdgpu_bo_create_shadow(struct 
>> amdgpu_device *adev,
>>       if (!r) {
>>           bo->shadow->parent = amdgpu_bo_ref(bo);
>>           mutex_lock(&adev->shadow_list_lock);
>> -        list_add_tail(&bo->shadow_list, &adev->shadow_list);
>> +        list_add_tail(&bo->shadow->shadow_list, &adev->shadow_list);
>>           mutex_unlock(&adev->shadow_list_lock);
>>       }
>>   @@ -669,13 +669,10 @@ int amdgpu_bo_validate(struct amdgpu_bo *bo)
>>   }
>>     /**
>> - * amdgpu_bo_restore_from_shadow - restore an &amdgpu_bo buffer object
>> - * @adev: amdgpu device object
>> - * @ring: amdgpu_ring for the engine handling the buffer operations
>> - * @bo: &amdgpu_bo buffer to be restored
>> - * @resv: reservation object with embedded fence
>> + * amdgpu_bo_restore_shadow - restore an &amdgpu_bo shadow
>> + *
>> + * @shadow: &amdgpu_bo shadow to be restored
>>    * @fence: dma_fence associated with the operation
>> - * @direct: whether to submit the job directly
>>    *
>>    * Copies a buffer object's shadow content back to the object.
>>    * This is used for recovering a buffer from its shadow in case of 
>> a gpu
>> @@ -684,36 +681,19 @@ int amdgpu_bo_validate(struct amdgpu_bo *bo)
>>    * Returns:
>>    * 0 for success or a negative error code on failure.
>>    */
>> -int amdgpu_bo_restore_from_shadow(struct amdgpu_device *adev,
>> -                  struct amdgpu_ring *ring,
>> -                  struct amdgpu_bo *bo,
>> -                  struct reservation_object *resv,
>> -                  struct dma_fence **fence,
>> -                  bool direct)
>> +int amdgpu_bo_restore_shadow(struct amdgpu_bo *shadow, struct 
>> dma_fence **fence)
>>     {
>> -    struct amdgpu_bo *shadow = bo->shadow;
>> -    uint64_t bo_addr, shadow_addr;
>> -    int r;
>> -
>> -    if (!shadow)
>> -        return -EINVAL;
>> -
>> -    bo_addr = amdgpu_bo_gpu_offset(bo);
>> -    shadow_addr = amdgpu_bo_gpu_offset(bo->shadow);
>> -
>> -    r = reservation_object_reserve_shared(bo->tbo.resv);
>> -    if (r)
>> -        goto err;
>> +    struct amdgpu_device *adev = amdgpu_ttm_adev(shadow->tbo.bdev);
>> +    struct amdgpu_ring *ring = adev->mman.buffer_funcs_ring;
>> +    uint64_t shadow_addr, parent_addr;
>>   -    r = amdgpu_copy_buffer(ring, shadow_addr, bo_addr,
>> -                   amdgpu_bo_size(bo), resv, fence,
>> -                   direct, false);
>> -    if (!r)
>> -        amdgpu_bo_fence(bo, *fence, true);
>> +    shadow_addr = amdgpu_bo_gpu_offset(shadow);
>> +    parent_addr = amdgpu_bo_gpu_offset(shadow->parent);
>>   -err:
>> -    return r;
>> +    return amdgpu_copy_buffer(ring, shadow_addr, parent_addr,
>> +                  amdgpu_bo_size(shadow), NULL, fence,
>> +                  true, false);
>>   }
>>     /**
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.h 
>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.h
>> index 907fdf46d895..363db417fb2e 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.h
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.h
>> @@ -273,12 +273,8 @@ int amdgpu_bo_backup_to_shadow(struct 
>> amdgpu_device *adev,
>>                      struct reservation_object *resv,
>>                      struct dma_fence **fence, bool direct);
>>   int amdgpu_bo_validate(struct amdgpu_bo *bo);
>> -int amdgpu_bo_restore_from_shadow(struct amdgpu_device *adev,
>> -                  struct amdgpu_ring *ring,
>> -                  struct amdgpu_bo *bo,
>> -                  struct reservation_object *resv,
>> -                  struct dma_fence **fence,
>> -                  bool direct);
>> +int amdgpu_bo_restore_shadow(struct amdgpu_bo *shadow,
>> +                 struct dma_fence **fence);
>>   uint32_t amdgpu_bo_get_preferred_pin_domain(struct amdgpu_device 
>> *adev,
>>                           uint32_t domain);
>

_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 5/5] drm/amdgpu: fix shadow BO restoring
       [not found]             ` <fa8fb53f-ff84-fb45-59c7-1d52ec57fba7-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
@ 2018-09-18  6:15               ` Zhang, Jerry(Junwei)
  0 siblings, 0 replies; 15+ messages in thread
From: Zhang, Jerry(Junwei) @ 2018-09-18  6:15 UTC (permalink / raw)
  To: christian.koenig-5C7GfCeVMHo, amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW

On 09/14/2018 07:54 PM, Christian König wrote:
> Am 13.09.2018 um 11:29 schrieb Zhang, Jerry(Junwei):
>> On 09/11/2018 05:56 PM, Christian König wrote:
>>> Don't grab the reservation lock any more and simplify the handling 
>>> quite
>>> a bit.
>>>
>>> Signed-off-by: Christian König <christian.koenig@amd.com>
>>> ---
>>>   drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 109 
>>> ++++++++---------------------
>>>   drivers/gpu/drm/amd/amdgpu/amdgpu_object.c |  46 ++++--------
>>>   drivers/gpu/drm/amd/amdgpu/amdgpu_object.h |   8 +--
>>>   3 files changed, 43 insertions(+), 120 deletions(-)
>>>
>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c 
>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>>> index 5eba66ecf668..20bb702f5c7f 100644
>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>>> @@ -2940,54 +2940,6 @@ static int 
>>> amdgpu_device_ip_post_soft_reset(struct amdgpu_device *adev)
>>>       return 0;
>>>   }
>>>   -/**
>>> - * amdgpu_device_recover_vram_from_shadow - restore shadowed VRAM 
>>> buffers
>>> - *
>>> - * @adev: amdgpu_device pointer
>>> - * @ring: amdgpu_ring for the engine handling the buffer operations
>>> - * @bo: amdgpu_bo buffer whose shadow is being restored
>>> - * @fence: dma_fence associated with the operation
>>> - *
>>> - * Restores the VRAM buffer contents from the shadow in GTT. Used to
>>> - * restore things like GPUVM page tables after a GPU reset where
>>> - * the contents of VRAM might be lost.
>>> - * Returns 0 on success, negative error code on failure.
>>> - */
>>> -static int amdgpu_device_recover_vram_from_shadow(struct 
>>> amdgpu_device *adev,
>>> -                          struct amdgpu_ring *ring,
>>> -                          struct amdgpu_bo *bo,
>>> -                          struct dma_fence **fence)
>>> -{
>>> -    uint32_t domain;
>>> -    int r;
>>> -
>>> -    if (!bo->shadow)
>>> -        return 0;
>>> -
>>> -    r = amdgpu_bo_reserve(bo, true);
>>> -    if (r)
>>> -        return r;
>>> -    domain = amdgpu_mem_type_to_domain(bo->tbo.mem.mem_type);
>>> -    /* if bo has been evicted, then no need to recover */
>>> -    if (domain == AMDGPU_GEM_DOMAIN_VRAM) {
>>> -        r = amdgpu_bo_validate(bo->shadow);
>>> -        if (r) {
>>> -            DRM_ERROR("bo validate failed!\n");
>>> -            goto err;
>>> -        }
>>> -
>>> -        r = amdgpu_bo_restore_from_shadow(adev, ring, bo,
>>> -                         NULL, fence, true);
>>> -        if (r) {
>>> -            DRM_ERROR("recover page table failed!\n");
>>> -            goto err;
>>> -        }
>>> -    }
>>> -err:
>>> -    amdgpu_bo_unreserve(bo);
>>> -    return r;
>>> -}
>>> -
>>>   /**
>>>    * amdgpu_device_recover_vram - Recover some VRAM contents
>>>    *
>>> @@ -2996,16 +2948,15 @@ static int 
>>> amdgpu_device_recover_vram_from_shadow(struct amdgpu_device *adev,
>>>    * Restores the contents of VRAM buffers from the shadows in GTT.  
>>> Used to
>>>    * restore things like GPUVM page tables after a GPU reset where
>>>    * the contents of VRAM might be lost.
>>> - * Returns 0 on success, 1 on failure.
>>> + *
>>> + * Returns:
>>> + * 0 on success, negative error code on failure.
>>>    */
>>>   static int amdgpu_device_recover_vram(struct amdgpu_device *adev)
>>>   {
>>> -    struct amdgpu_ring *ring = adev->mman.buffer_funcs_ring;
>>> -    struct amdgpu_bo *bo, *tmp;
>>>       struct dma_fence *fence = NULL, *next = NULL;
>>> -    long r = 1;
>>> -    int i = 0;
>>> -    long tmo;
>>> +    struct amdgpu_bo *shadow;
>>> +    long r = 1, tmo;
>>>         if (amdgpu_sriov_runtime(adev))
>>>           tmo = msecs_to_jiffies(8000);
>>> @@ -3014,44 +2965,40 @@ static int amdgpu_device_recover_vram(struct 
>>> amdgpu_device *adev)
>>>         DRM_INFO("recover vram bo from shadow start\n");
>>>       mutex_lock(&adev->shadow_list_lock);
>>> -    list_for_each_entry_safe(bo, tmp, &adev->shadow_list, 
>>> shadow_list) {
>>> -        next = NULL;
>>> -        amdgpu_device_recover_vram_from_shadow(adev, ring, bo, &next);
>>> +    list_for_each_entry(shadow, &adev->shadow_list, shadow_list) {
>>> +
>>> +        /* No need to recover an evicted BO */
>>> +        if (shadow->tbo.mem.mem_type != TTM_PL_TT ||
>>> +            shadow->parent->tbo.mem.mem_type != TTM_PL_VRAM)
>> is there a change that shadow bo evicted to other domain?
>> like SYSTEM?
>
> Yes, that's why I test "!= TTM_PL_TT" here.
>
> What can happen is that either the shadow or the page table or page 
> directory is evicted.
>
> But in this case we don't need to restore anything because of patch #1 
> in this series.

Thanks, then it's
Acked-by: Junwei Zhang <Jerry.Zhang@amd.com>

Regards,
Jerry

>
> Regards,
> Christian.
>
>>
>> Regards,
>> Jerry
>>> +            continue;
>>> +
>>> +        r = amdgpu_bo_restore_shadow(shadow, &next);
>>> +        if (r)
>>> +            break;
>>> +
>>>           if (fence) {
>>>               r = dma_fence_wait_timeout(fence, false, tmo);
>>> -            if (r == 0)
>>> -                pr_err("wait fence %p[%d] timeout\n", fence, i);
>>> -            else if (r < 0)
>>> -                pr_err("wait fence %p[%d] interrupted\n", fence, i);
>>> -            if (r < 1) {
>>> -                dma_fence_put(fence);
>>> -                fence = next;
>>> +            dma_fence_put(fence);
>>> +            fence = next;
>>> +            if (r <= 0)
>>>                   break;
>>> -            }
>>> -            i++;
>>> +        } else {
>>> +            fence = next;
>>>           }
>>> -
>>> -        dma_fence_put(fence);
>>> -        fence = next;
>>>       }
>>>       mutex_unlock(&adev->shadow_list_lock);
>>>   -    if (fence) {
>>> -        r = dma_fence_wait_timeout(fence, false, tmo);
>>> -        if (r == 0)
>>> -            pr_err("wait fence %p[%d] timeout\n", fence, i);
>>> -        else if (r < 0)
>>> -            pr_err("wait fence %p[%d] interrupted\n", fence, i);
>>> -
>>> -    }
>>> +    if (fence)
>>> +        tmo = dma_fence_wait_timeout(fence, false, tmo);
>>>       dma_fence_put(fence);
>>>   -    if (r > 0)
>>> -        DRM_INFO("recover vram bo from shadow done\n");
>>> -    else
>>> +    if (r <= 0 || tmo <= 0) {
>>>           DRM_ERROR("recover vram bo from shadow failed\n");
>>> +        return -EIO;
>>> +    }
>>>   -    return (r > 0) ? 0 : 1;
>>> +    DRM_INFO("recover vram bo from shadow done\n");
>>> +    return 0;
>>>   }
>>>     /**
>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c 
>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
>>> index 3a6f92de5504..5b6d5fcc2151 100644
>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
>>> @@ -537,7 +537,7 @@ static int amdgpu_bo_create_shadow(struct 
>>> amdgpu_device *adev,
>>>       if (!r) {
>>>           bo->shadow->parent = amdgpu_bo_ref(bo);
>>>           mutex_lock(&adev->shadow_list_lock);
>>> -        list_add_tail(&bo->shadow_list, &adev->shadow_list);
>>> +        list_add_tail(&bo->shadow->shadow_list, &adev->shadow_list);
>>>           mutex_unlock(&adev->shadow_list_lock);
>>>       }
>>>   @@ -669,13 +669,10 @@ int amdgpu_bo_validate(struct amdgpu_bo *bo)
>>>   }
>>>     /**
>>> - * amdgpu_bo_restore_from_shadow - restore an &amdgpu_bo buffer object
>>> - * @adev: amdgpu device object
>>> - * @ring: amdgpu_ring for the engine handling the buffer operations
>>> - * @bo: &amdgpu_bo buffer to be restored
>>> - * @resv: reservation object with embedded fence
>>> + * amdgpu_bo_restore_shadow - restore an &amdgpu_bo shadow
>>> + *
>>> + * @shadow: &amdgpu_bo shadow to be restored
>>>    * @fence: dma_fence associated with the operation
>>> - * @direct: whether to submit the job directly
>>>    *
>>>    * Copies a buffer object's shadow content back to the object.
>>>    * This is used for recovering a buffer from its shadow in case of 
>>> a gpu
>>> @@ -684,36 +681,19 @@ int amdgpu_bo_validate(struct amdgpu_bo *bo)
>>>    * Returns:
>>>    * 0 for success or a negative error code on failure.
>>>    */
>>> -int amdgpu_bo_restore_from_shadow(struct amdgpu_device *adev,
>>> -                  struct amdgpu_ring *ring,
>>> -                  struct amdgpu_bo *bo,
>>> -                  struct reservation_object *resv,
>>> -                  struct dma_fence **fence,
>>> -                  bool direct)
>>> +int amdgpu_bo_restore_shadow(struct amdgpu_bo *shadow, struct 
>>> dma_fence **fence)
>>>     {
>>> -    struct amdgpu_bo *shadow = bo->shadow;
>>> -    uint64_t bo_addr, shadow_addr;
>>> -    int r;
>>> -
>>> -    if (!shadow)
>>> -        return -EINVAL;
>>> -
>>> -    bo_addr = amdgpu_bo_gpu_offset(bo);
>>> -    shadow_addr = amdgpu_bo_gpu_offset(bo->shadow);
>>> -
>>> -    r = reservation_object_reserve_shared(bo->tbo.resv);
>>> -    if (r)
>>> -        goto err;
>>> +    struct amdgpu_device *adev = amdgpu_ttm_adev(shadow->tbo.bdev);
>>> +    struct amdgpu_ring *ring = adev->mman.buffer_funcs_ring;
>>> +    uint64_t shadow_addr, parent_addr;
>>>   -    r = amdgpu_copy_buffer(ring, shadow_addr, bo_addr,
>>> -                   amdgpu_bo_size(bo), resv, fence,
>>> -                   direct, false);
>>> -    if (!r)
>>> -        amdgpu_bo_fence(bo, *fence, true);
>>> +    shadow_addr = amdgpu_bo_gpu_offset(shadow);
>>> +    parent_addr = amdgpu_bo_gpu_offset(shadow->parent);
>>>   -err:
>>> -    return r;
>>> +    return amdgpu_copy_buffer(ring, shadow_addr, parent_addr,
>>> +                  amdgpu_bo_size(shadow), NULL, fence,
>>> +                  true, false);
>>>   }
>>>     /**
>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.h 
>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.h
>>> index 907fdf46d895..363db417fb2e 100644
>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.h
>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.h
>>> @@ -273,12 +273,8 @@ int amdgpu_bo_backup_to_shadow(struct 
>>> amdgpu_device *adev,
>>>                      struct reservation_object *resv,
>>>                      struct dma_fence **fence, bool direct);
>>>   int amdgpu_bo_validate(struct amdgpu_bo *bo);
>>> -int amdgpu_bo_restore_from_shadow(struct amdgpu_device *adev,
>>> -                  struct amdgpu_ring *ring,
>>> -                  struct amdgpu_bo *bo,
>>> -                  struct reservation_object *resv,
>>> -                  struct dma_fence **fence,
>>> -                  bool direct);
>>> +int amdgpu_bo_restore_shadow(struct amdgpu_bo *shadow,
>>> +                 struct dma_fence **fence);
>>>   uint32_t amdgpu_bo_get_preferred_pin_domain(struct amdgpu_device 
>>> *adev,
>>>                           uint32_t domain);
>>
>

_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 15+ messages in thread

* [PATCH 3/5] drm/amdgpu: shadow BOs don't need any alignment
       [not found] ` <20180914134257.2196-1-christian.koenig-5C7GfCeVMHo@public.gmane.org>
@ 2018-09-14 13:42   ` Christian König
  0 siblings, 0 replies; 15+ messages in thread
From: Christian König @ 2018-09-14 13:42 UTC (permalink / raw)
  To: amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW

They aren't directly used by the hardware.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Junwei Zhang <Jerry.Zhang@amd.com>
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_object.c | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
index d8e8d653d518..650c45c896f0 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
@@ -526,7 +526,7 @@ static int amdgpu_bo_do_create(struct amdgpu_device *adev,
 }
 
 static int amdgpu_bo_create_shadow(struct amdgpu_device *adev,
-				   unsigned long size, int byte_align,
+				   unsigned long size,
 				   struct amdgpu_bo *bo)
 {
 	struct amdgpu_bo_param bp;
@@ -537,7 +537,6 @@ static int amdgpu_bo_create_shadow(struct amdgpu_device *adev,
 
 	memset(&bp, 0, sizeof(bp));
 	bp.size = size;
-	bp.byte_align = byte_align;
 	bp.domain = AMDGPU_GEM_DOMAIN_GTT;
 	bp.flags = AMDGPU_GEM_CREATE_CPU_GTT_USWC |
 		AMDGPU_GEM_CREATE_SHADOW;
@@ -586,7 +585,7 @@ int amdgpu_bo_create(struct amdgpu_device *adev,
 			WARN_ON(reservation_object_lock((*bo_ptr)->tbo.resv,
 							NULL));
 
-		r = amdgpu_bo_create_shadow(adev, bp->size, bp->byte_align, (*bo_ptr));
+		r = amdgpu_bo_create_shadow(adev, bp->size, *bo_ptr);
 
 		if (!bp->resv)
 			reservation_object_unlock((*bo_ptr)->tbo.resv);
-- 
2.14.1

_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply related	[flat|nested] 15+ messages in thread

end of thread, other threads:[~2018-09-18  6:15 UTC | newest]

Thread overview: 15+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-09-11  9:55 [PATCH 1/5] drm/amdgpu: stop pipelining VM PDs/PTs moves Christian König
     [not found] ` <20180911095602.10152-1-christian.koenig-5C7GfCeVMHo@public.gmane.org>
2018-09-11  9:55   ` [PATCH 2/5] drm/amdgpu: always enable shadow BOs Christian König
     [not found]     ` <20180911095602.10152-2-christian.koenig-5C7GfCeVMHo@public.gmane.org>
2018-09-11 12:29       ` Michel Dänzer
     [not found]         ` <733b6122-bfef-442b-e95a-2cd17bfe12b5-otUistvHUpPR7s880joybQ@public.gmane.org>
2018-09-11 13:21           ` Christian König
     [not found]             ` <ce8f5f17-6ef0-27bc-3ae4-cfcc7fffaaa9-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
2018-09-12 12:37               ` Deng, Emily
2018-09-11  9:56   ` [PATCH 3/5] drm/amdgpu: shadow BOs don't need any alignment Christian König
     [not found]     ` <20180911095602.10152-3-christian.koenig-5C7GfCeVMHo@public.gmane.org>
2018-09-13  9:30       ` Zhang, Jerry(Junwei)
2018-09-11  9:56   ` [PATCH 4/5] drm/amdgpu: always recover VRAM during GPU recovery Christian König
     [not found]     ` <20180911095602.10152-4-christian.koenig-5C7GfCeVMHo@public.gmane.org>
2018-09-13  9:28       ` Zhang, Jerry(Junwei)
2018-09-11  9:56   ` [PATCH 5/5] drm/amdgpu: fix shadow BO restoring Christian König
     [not found]     ` <20180911095602.10152-5-christian.koenig-5C7GfCeVMHo@public.gmane.org>
2018-09-13  9:29       ` Zhang, Jerry(Junwei)
     [not found]         ` <073f24f7-df48-55f6-f4fb-2d250cfd2dd1-5C7GfCeVMHo@public.gmane.org>
2018-09-14 11:54           ` Christian König
     [not found]             ` <fa8fb53f-ff84-fb45-59c7-1d52ec57fba7-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
2018-09-18  6:15               ` Zhang, Jerry(Junwei)
2018-09-13  9:28   ` [PATCH 1/5] drm/amdgpu: stop pipelining VM PDs/PTs moves Zhang, Jerry(Junwei)
2018-09-14 13:42 Christian König
     [not found] ` <20180914134257.2196-1-christian.koenig-5C7GfCeVMHo@public.gmane.org>
2018-09-14 13:42   ` [PATCH 3/5] drm/amdgpu: shadow BOs don't need any alignment Christian König

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.