From: Christian König
Subject: Re: Reply: [PATCH] drm/amdgpu:put CSA unmap after sched_entity_fini
Date: Fri, 13 Jan 2017 11:23:26 +0100
Message-ID: <936c3f9b-2545-3b18-c7ad-f3440d203ea6@vodafone.de>
References: <1484280664-22845-1-git-send-email-Monk.Liu@amd.com>
To: "Liu, Monk", amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW@public.gmane.org
Ah, in this case please separate the amdgpu_vm_bo_rmv() call from setting csa_addr to NULL.

That's because amdgpu_vm_bo_rmv() should come before amdgpu_vm_fini(), and that in turn should come before waiting for the scheduler, so that the MM knows the memory is about to be freed.
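
Something like this rough sketch of the ordering I have in mind (illustrative only; csa_addr stands for the field from your SR-IOV branch, it doesn't exist in staging-4.9):

	/* In amdgpu_driver_postclose_kms(): remove the CSA mapping
	 * first, so the MM knows the memory is about to be freed. */
	if (amdgpu_sriov_vf(adev)) {
		/* TODO: how to handle reserve failure */
		BUG_ON(amdgpu_bo_reserve(adev->virt.csa_obj, false));
		amdgpu_vm_bo_rmv(adev, fpriv->vm.csa_bo_va);
		fpriv->vm.csa_bo_va = NULL;
		amdgpu_bo_unreserve(adev->virt.csa_obj);
	}

	/* amdgpu_vm_fini() waits for the scheduler through
	 * amd_sched_entity_fini() while tearing the VM down. */
	amdgpu_vm_fini(adev, &fpriv->vm);

	/* Only now, with no jobs left in flight, clear the address
	 * the CP snapshots to. */
	fpriv->vm.csa_addr = 0;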

Regards,
Christian.

On 13.01.2017 at 10:56, Liu, Monk wrote:

With amdgpu_vm_bo_rmv() alone we wouldn't hit such a bug, but in another branch for SR-IOV we not only call vm_bo_rmv(), we also set csa_addr to NULL after it. That NULL address then gets inserted into the RB, and when preemption occurs the CP backs up its snapshot to the NULL address.
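
Roughly, the failing sequence in that branch looks like this (simplified sketch, not the actual code):

	amdgpu_vm_bo_rmv(adev, vm->csa_bo_va);	/* CSA unmapped */
	vm->csa_addr = 0;			/* cleared too early */

	/* The gpu_scheduler is still pushing pending jobs into the
	 * ring buffer; they pick up csa_addr == 0, so when preemption
	 * occurs the CP writes its snapshot to the NULL address and
	 * we get a VM fault on the CSA address. */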


Although in staging-4.9 we don't set csa_addr to NULL (because, as you suggested, we always use a hardcoded macro for the CSA address), logically we'd better put the CSA unmapping behind "sched_entity_fini", which is more reasonable ...


BR Monk


From: amd-gfx <amd-gfx-bounces-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW@public.gmane.org> on behalf of Christian König <deathsimple-ANTagKRnAhcb1SvskN2V4Q@public.gmane.org>
Sent: January 13, 2017 17:25:09
To: Liu, Monk; amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW@public.gmane.org
Subject: Re: [PATCH] drm/amdgpu:put CSA unmap after sched_entity_fini
 
On 13.01.2017 at 05:11, Monk Liu wrote:
> otherwise the CSA may be unmapped before the gpu_scheduler schedules
> jobs, triggering a VM fault on the CSA address
>
> Change-Id: Ib2e25ededf89bca44c764477dd2f9127024ca78c
> Signed-off-by: Monk Liu <Monk.Liu-5C7GfCeVMHo@public.gmane.org>

Did you really run into an issue because of that?

Calling amdgpu_vm_bo_rmv() shouldn't affect the page tables nor already
submitted command submissions in any way.

Regards,
Christian.

> ---
>   drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c | 8 --------
>   drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c  | 8 ++++++++
>   2 files changed, 8 insertions(+), 8 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c
> index 45484c0..e13cdde 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c
> @@ -694,14 +694,6 @@ void amdgpu_driver_postclose_kms(struct drm_device *dev,
>        amdgpu_uvd_free_handles(adev, file_priv);
>        amdgpu_vce_free_handles(adev, file_priv);
>  
> -     if (amdgpu_sriov_vf(adev)) {
> -             /* TODO: how to handle reserve failure */
> -             BUG_ON(amdgpu_bo_reserve(adev->virt.csa_obj, false));
> -             amdgpu_vm_bo_rmv(adev, fpriv->vm.csa_bo_va);
> -             fpriv->vm.csa_bo_va = NULL;
> -             amdgpu_bo_unreserve(adev->virt.csa_obj);
> -     }
> -
>        amdgpu_vm_fini(adev, &fpriv->vm);
>  
>        idr_for_each_entry(&fpriv->bo_list_handles, list, handle)
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
> index d05546e..94098bc 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
> @@ -1608,6 +1608,14 @@ void amdgpu_vm_fini(struct amdgpu_device *adev, struct amdgpu_vm *vm)
>  
>        amd_sched_entity_fini(vm->entity.sched, &vm->entity);
>  
> +     if (amdgpu_sriov_vf(adev)) {
> +             /* TODO: how to handle reserve failure */
> +             BUG_ON(amdgpu_bo_reserve(adev->virt.csa_obj, false));
> +             amdgpu_vm_bo_rmv(adev, vm->csa_bo_va);
> +             vm->csa_bo_va = NULL;
> +             amdgpu_bo_unreserve(adev->virt.csa_obj);
> +     }
> +
>        if (!RB_EMPTY_ROOT(&vm->va)) {
>                dev_err(adev->dev, "still active bo inside vm\n");
>        }


_______________________________________________
amd-gfx mailing list
amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW@public.gmane.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

