All of lore.kernel.org
 help / color / mirror / Atom feed
From: "Christian König" <deathsimple-ANTagKRnAhcb1SvskN2V4Q@public.gmane.org>
To: Chunming Zhou <David1.Zhou-5C7GfCeVMHo@public.gmane.org>,
	amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW@public.gmane.org
Subject: Re: [PATCH 00/18] shadow page table support V3
Date: Fri, 12 Aug 2016 10:00:51 +0200	[thread overview]
Message-ID: <9a1da72b-4834-c135-260a-1d16c8b36ea3@vodafone.de> (raw)
In-Reply-To: <1470983947-32579-1-git-send-email-David1.Zhou-5C7GfCeVMHo@public.gmane.org>

Patch #1-#4 are Reviewed-by: Christian König <christian.koenig@amd.com>, 
please commit those so that we can consider them as done.

Patch #5:
>   
> +int amdgpu_bo_sync_between_bo_and_shadow(struct amdgpu_device *adev,
> +					 struct amdgpu_ring *ring,
> +					 struct amdgpu_bo *bo,
> +					 struct reservation_object *resv,
> +					 struct fence **fence)
Maybe rename this to amdgpu_bo_backup_shadow(), when talking about sync 
at least I always think about synchronization using fences.

> +
> +	if (bo->backup_shadow & AMDGPU_BO_SHADOW_TO_NONE) {
That condition will never be true, cause AMDGPU_BO_SHADOW_TO_NONE is 
defined as 0. You probably wanted to use "==" here, don't you?

> +	r = amdgpu_bo_pin(bo, bo->prefered_domains, &bo_addr);
> +	if (r) {
> +		DRM_ERROR("Failed to pin bo object\n");
> +		goto err1;
> +	}
> +	r = amdgpu_bo_pin(bo->shadow, bo->shadow->prefered_domains, &shadow_addr);
> +	if (r) {
> +		DRM_ERROR("Failed to pin bo shadow object\n");
> +		goto err2;
> +	}
Again don't use amdgpu_bo_pin() here.

Patch #6:
> +	r = amdgpu_bo_reserve(vm->page_directory, false);
> +	if (r)
> +		return r;
You need to double check if the BOs aren't swapped out. See 
amdgpu_gem_va_update_vm() on how to do this.

> +	fence_put(vm->shadow_sync_fence);
> +	vm->shadow_sync_fence = fence_get(fence);
I wouldn't remember the backup operation at all, just return the fence 
to the caller instead.

> +		r = amdgpu_bo_sync_between_bo_and_shadow(adev, ring, bo,
> +							 NULL, &fence);
As discussed we don't want to use the scheduler for the recovery 
operation, so we need a flag to send the commands directly to the ring here.

Patch #7 and following. Thinking about this more the whole idea of using 
a separate entity is a NAK.

The basic problem is that you can't know which updates are completed, 
crashed or in flight when the the GPU recovery hits us. So the shadow 
needs to be updated before the real page table and none of this can be 
done out of order.

I suggest you just switch back to doing the updates twice for now.

Regards,
Christian.

Am 12.08.2016 um 08:38 schrieb Chunming Zhou:
> Since we cannot ensure VRAM is consistent after a GPU reset, page
> table shadowing is necessary. Shadowed page tables are, in a sense, a
> method to recover the consistent state of the page tables before the
> reset occurred.
>
> We need to allocate GTT bo as the shadow of VRAM bo when creating page table,
> and make them the same. After gpu reset, we will need to use SDMA to copy GTT bo
> content to VRAM bo, then page table will be recoveried.
>
>
> V2:
> Shadow bo uses a shadow entity running on normal run queue, after gpu reset,
> we need to wait for all shadow jobs finished first, then recovery page table from shadow.
>
> V3:
> Addressed Christian comments for shadow bo part.
>
> Chunming Zhou (18):
>    drm/amdgpu: add shadow bo support V2
>    drm/amdgpu: validate shadow as well when validating bo
>    drm/amdgpu: allocate shadow for pd/pt bo V2
>    drm/amdgpu: add shadow flag V2
>    drm/amdgpu: sync bo and shadow
>    drm/amdgpu: implement vm recovery function from shadow
>    drm/amdgpu: add shadow_entity for shadow page table updates
>    drm/amdgpu: update pd shadow bo
>    drm/amdgpu: update pt shadow
>    drm/amd: add last fence in sched entity
>    drm/amdgpu: link all vm clients
>    drm/amdgpu: add vm_list_lock
>    drm/amd: add block entity function
>    drm/amdgpu: add shadow fence owner
>    drm/amd: block entity
>    drm/amdgpu: recover page tables after gpu reset
>    drm/amdgpu: add need backup function
>    drm/amdgpu: add backup condition for vm
>
>   drivers/gpu/drm/amd/amdgpu/amdgpu.h           |  25 +++
>   drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c        |  82 +++++---
>   drivers/gpu/drm/amd/amdgpu/amdgpu_device.c    |  88 +++++++-
>   drivers/gpu/drm/amd/amdgpu/amdgpu_object.c    | 100 ++++++++-
>   drivers/gpu/drm/amd/amdgpu/amdgpu_object.h    |   5 +
>   drivers/gpu/drm/amd/amdgpu/amdgpu_sync.c      |   3 +
>   drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c        | 281 +++++++++++++++++++-------
>   drivers/gpu/drm/amd/scheduler/gpu_scheduler.c |  38 +++-
>   drivers/gpu/drm/amd/scheduler/gpu_scheduler.h |   5 +
>   include/uapi/drm/amdgpu_drm.h                 |   2 +
>   10 files changed, 524 insertions(+), 105 deletions(-)
>

_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

      parent reply	other threads:[~2016-08-12  8:00 UTC|newest]

Thread overview: 23+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2016-08-12  6:38 [PATCH 00/18] shadow page table support V3 Chunming Zhou
     [not found] ` <1470983947-32579-1-git-send-email-David1.Zhou-5C7GfCeVMHo@public.gmane.org>
2016-08-12  6:38   ` [PATCH 01/18] drm/amdgpu: add shadow bo support V2 Chunming Zhou
2016-08-12  6:38   ` [PATCH 02/18] drm/amdgpu: validate shadow as well when validating bo Chunming Zhou
2016-08-12  6:38   ` [PATCH 03/18] drm/amdgpu: allocate shadow for pd/pt bo V2 Chunming Zhou
2016-08-12  6:38   ` [PATCH 04/18] drm/amdgpu: add shadow flag V2 Chunming Zhou
2016-08-12  6:38   ` [PATCH 05/18] drm/amdgpu: sync bo and shadow Chunming Zhou
2016-08-12  6:38   ` [PATCH 06/18] drm/amdgpu: implement vm recovery function from shadow Chunming Zhou
2016-08-12  6:38   ` [PATCH 07/18] drm/amdgpu: add shadow_entity for shadow page table updates Chunming Zhou
2016-08-12  6:38   ` [PATCH 08/18] drm/amdgpu: update pd shadow bo Chunming Zhou
     [not found]     ` <1470983947-32579-9-git-send-email-David1.Zhou-5C7GfCeVMHo@public.gmane.org>
2016-08-12  9:50       ` Edward O'Callaghan
2016-08-12  6:38   ` [PATCH 09/18] drm/amdgpu: update pt shadow Chunming Zhou
2016-08-12  6:38   ` [PATCH 10/18] drm/amd: add last fence in sched entity Chunming Zhou
2016-08-12  6:39   ` [PATCH 11/18] drm/amdgpu: link all vm clients Chunming Zhou
2016-08-12  6:39   ` [PATCH 12/18] drm/amdgpu: add vm_list_lock Chunming Zhou
2016-08-12  6:39   ` [PATCH 13/18] drm/amd: add block entity function Chunming Zhou
2016-08-12  6:39   ` [PATCH 14/18] drm/amdgpu: add shadow fence owner Chunming Zhou
2016-08-12  6:39   ` [PATCH 15/18] drm/amd: block entity Chunming Zhou
     [not found]     ` <1470983947-32579-16-git-send-email-David1.Zhou-5C7GfCeVMHo@public.gmane.org>
2016-08-12  9:42       ` Edward O'Callaghan
     [not found]         ` <f8dfeb55-132f-1ddf-2e2f-02776aa5d3e0-dczkZgxz+BNUPWh3PAxdjQ@public.gmane.org>
2016-08-12  9:43           ` zhoucm1
2016-08-12  6:39   ` [PATCH 16/18] drm/amdgpu: recover page tables after gpu reset Chunming Zhou
2016-08-12  6:39   ` [PATCH 17/18] drm/amdgpu: add need backup function Chunming Zhou
2016-08-12  6:39   ` [PATCH 18/18] drm/amdgpu: add backup condition for vm Chunming Zhou
2016-08-12  8:00   ` Christian König [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=9a1da72b-4834-c135-260a-1d16c8b36ea3@vodafone.de \
    --to=deathsimple-antagkrnahcb1svskn2v4q@public.gmane.org \
    --cc=David1.Zhou-5C7GfCeVMHo@public.gmane.org \
    --cc=amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW@public.gmane.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.