* Fixing SDMA TO after GPU reset
@ 2018-09-10 22:52 Andrey Grodzovsky
       [not found] ` <059118f3-2729-12a1-7c8d-e306f69369aa-5C7GfCeVMHo@public.gmane.org>
  0 siblings, 1 reply; 4+ messages in thread
From: Andrey Grodzovsky @ 2018-09-10 22:52 UTC (permalink / raw)
  To: Koenig, Christian, amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW
  Cc: Deucher, Alexander

[-- Attachment #1: Type: text/plain, Size: 337 bytes --]

The attached patch fixes the SDMA timeout (TO) after GPU reset. It is a regression caused 
by cbd5285 drm/amdgpu: move setting the GART addr into TTM.

But to me it looks safer to just revert the original patch altogether, 
since we can never predict for sure whether a VM flush will take place, so 
it's safer to just always assign job->vm_pd_addr.

Andrey


[-- Attachment #2: 0001-drm-amdgpu-Fix-SDMA-TO-after-GPU-reset.patch --]
[-- Type: text/x-patch, Size: 1029 bytes --]

From 038ca047fc41d85fb390b9564cebc7c6da441831 Mon Sep 17 00:00:00 2001
From: Andrey Grodzovsky <andrey.grodzovsky-5C7GfCeVMHo@public.gmane.org>
Date: Mon, 10 Sep 2018 18:43:58 -0400
Subject: drm/amdgpu: Fix SDMA TO after GPU reset

After a GPU reset, amdgpu_vm_clear_bo triggers a VM flush,
but job->vm_pd_addr is not set, causing an SDMA TO.

Fixes cbd5285 drm/amdgpu: move setting the GART addr into TTM.
Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky-5C7GfCeVMHo@public.gmane.org>
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
index f5a9600..88598b1 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
@@ -526,6 +526,8 @@ static int amdgpu_vm_clear_bo(struct amdgpu_device *adev,
 	if (r)
 		goto error;
 
+	job->vm_pd_addr = amdgpu_gmc_pd_addr(adev->gart.bo);
+
 	addr = amdgpu_bo_gpu_offset(bo);
 	if (ats_entries) {
 		uint64_t ats_value;
-- 
2.7.4


[-- Attachment #3: Type: text/plain, Size: 154 bytes --]

_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


* Re: Fixing SDMA TO after GPU reset
       [not found] ` <059118f3-2729-12a1-7c8d-e306f69369aa-5C7GfCeVMHo@public.gmane.org>
@ 2018-09-11 11:46   ` Christian König
       [not found]     ` <aa21f524-ac5e-ddb7-448a-d12ec1599a59-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
  0 siblings, 1 reply; 4+ messages in thread
From: Christian König @ 2018-09-11 11:46 UTC (permalink / raw)
  To: Andrey Grodzovsky, Koenig, Christian,
	amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW
  Cc: Deucher, Alexander


[-- Attachment #1.1: Type: text/plain, Size: 803 bytes --]

It would probably be better to initialize job->vm_pd_addr with 
AMDGPU_BO_INVALID_OFFSET.

And then just drop the VM flush altogether when the vm_pd_addr isn't 
set to something sane.
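
A minimal userspace sketch of that idea, not driver code. The names sketch_job, sketch_job_init, sketch_should_flush and SKETCH_INVALID_OFFSET are hypothetical stand-ins, and the sentinel value is an assumption made only for this illustration:

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for AMDGPU_BO_INVALID_OFFSET; the value used
 * here is an assumption made for this sketch only. */
#define SKETCH_INVALID_OFFSET UINT64_MAX

/* Simplified model of a job; only the field discussed here. */
struct sketch_job {
	uint64_t vm_pd_addr;
};

/* Start every job with the sentinel, so "not set" is explicit. */
static void sketch_job_init(struct sketch_job *job)
{
	job->vm_pd_addr = SKETCH_INVALID_OFFSET;
}

/* Flush only when the submitter provided a real page directory
 * address; otherwise drop the flush entirely, even after a reset. */
static bool sketch_should_flush(const struct sketch_job *job,
				bool had_gpu_reset, bool flush_requested)
{
	if (job->vm_pd_addr == SKETCH_INVALID_OFFSET)
		return false;
	return had_gpu_reset || flush_requested;
}
```

With this shape, a job that never assigns vm_pd_addr can never trigger a flush with a stale or zero address; whether that is acceptable after a GPU reset is a separate question.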

Thanks,
Christian.

On 11.09.2018 at 00:52, Andrey Grodzovsky wrote:
> Attached patch fixes SDMA TO after GPU reset, it's a regression caused 
> by cbd5285 drm/amdgpu: move setting the GART addr into TTM.
>
> But to me it looks safer just to revert the original patch 
> altogether since we never can predict for sure if VM flush will take 
> place and so it's safer to just always assign job->vm_pd_addr.
>
> Andrey



* Re: Fixing SDMA TO after GPU reset
       [not found]     ` <aa21f524-ac5e-ddb7-448a-d12ec1599a59-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
@ 2018-09-11 13:54       ` Andrey Grodzovsky
       [not found]         ` <e73cb83b-750c-fd4f-40c9-635cd04bd979-5C7GfCeVMHo@public.gmane.org>
  0 siblings, 1 reply; 4+ messages in thread
From: Andrey Grodzovsky @ 2018-09-11 13:54 UTC (permalink / raw)
  To: christian.koenig-5C7GfCeVMHo, amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW
  Cc: Deucher, Alexander


[-- Attachment #1.1: Type: text/plain, Size: 1364 bytes --]

By the current code logic, job->vm_pd_addr is never going to be set unless 
the job is created for user CS or for a buffer copy in amdgpu_copy_buffer.
So in any other case we are going to skip the VM flush. But amdgpu_vm_flush 
wants a flush to happen in case a GPU reset just happened 
(amdgpu_vmid_had_gpu_reset is true),
so we will be skipping that VM flush (as in my case with 
amdgpu_driver_open_kms->amdgpu_vm_init->amdgpu_vm_clear_bo right after 
a GPU reset occurred).
Is it safe?
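
To make the case concrete, here is a rough userspace model of the situation, not the driver's code. The enum, struct and helper names are hypothetical, and the rule that only user submissions and buffer copies assign the address is taken from the description above rather than from the source tree:

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical sentinel meaning "no page directory address assigned". */
#define SKETCH_PD_ADDR_UNSET UINT64_MAX

/* Where a job comes from, per the description above. */
enum sketch_job_source {
	SKETCH_USER_SUBMISSION,
	SKETCH_COPY_BUFFER,
	SKETCH_CLEAR_BO,
};

struct sketch_job {
	enum sketch_job_source source;
	uint64_t vm_pd_addr;
};

/* Only user submissions and buffer copies get an address assigned. */
static struct sketch_job sketch_job_create(enum sketch_job_source source,
					   uint64_t pd_addr)
{
	struct sketch_job job = {
		.source = source,
		.vm_pd_addr = SKETCH_PD_ADDR_UNSET,
	};

	if (source == SKETCH_USER_SUBMISSION || source == SKETCH_COPY_BUFFER)
		job.vm_pd_addr = pd_addr;
	return job;
}

/* True when a reset would normally ask for a flush, but the job has
 * no address, so the flush ends up being skipped. */
static bool sketch_flush_skipped_after_reset(const struct sketch_job *job,
					     bool had_gpu_reset)
{
	return had_gpu_reset && job->vm_pd_addr == SKETCH_PD_ADDR_UNSET;
}
```

In this model the clear-BO job right after a reset is exactly the one for which the helper returns true.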

Andrey

On 09/11/2018 07:46 AM, Christian König wrote:
> It would probably be better to initialize job->vm_pd_addr with 
> AMDGPU_BO_INVALID_OFFSET.
>
> And then just drop the vm flush altogether when the vm_pd_addr isn't 
> set to something sane.
>
> Thanks,
> Christian.
>
> On 11.09.2018 at 00:52, Andrey Grodzovsky wrote:
>> Attached patch fixes SDMA TO after GPU reset, it's a regression 
>> caused by cbd5285 drm/amdgpu: move setting the GART addr into TTM.
>>
>> But to me it looks safer just to revert the original patch 
>> altogether since we never can predict for sure if VM flush will take 
>> place and so it's safer to just always assign job->vm_pd_addr.
>>
>> Andrey



* Re: Fixing SDMA TO after GPU reset
       [not found]         ` <e73cb83b-750c-fd4f-40c9-635cd04bd979-5C7GfCeVMHo@public.gmane.org>
@ 2018-09-11 13:57           ` Koenig, Christian
  0 siblings, 0 replies; 4+ messages in thread
From: Koenig, Christian @ 2018-09-11 13:57 UTC (permalink / raw)
  To: Grodzovsky, Andrey
  Cc: Deucher, Alexander, amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW


[-- Attachment #1.1: Type: text/plain, Size: 1633 bytes --]

Yes, that is actually desired.

We should not flush the system VM just because we do some clearing command submission.

Christian.

On 11.09.2018 at 15:54, "Grodzovsky, Andrey" <Andrey.Grodzovsky-5C7GfCeVMHo@public.gmane.org> wrote:
> By current code logic job->vm_pd_addr is never going to be set unless the job is created for user CS or for buffer copy in amdgpu_copy_buffer
> So in any other case we are going to skip VM flush. But amdgpu_vm_flush wants a flush to happen in case GPU reset just happened (amdgpu_vmid_had_gpu_reset is true)
> so we will be skipping that VM flush (as in my case with amdgpu_driver_open_kms->amdgpu_vm_init->amdgpu_vm_clear_bo right after GPU reset occurred)
> Is it safe?
>
> Andrey
>
> On 09/11/2018 07:46 AM, Christian König wrote:
>> It would probably be better to initialize job->vm_pd_addr with AMDGPU_BO_INVALID_OFFSET.
>>
>> And then just drop the vm flush altogether when the vm_pd_addr isn't set to something sane.
>>
>> Thanks,
>> Christian.
>>
>> On 11.09.2018 at 00:52, Andrey Grodzovsky wrote:
>>> Attached patch fixes SDMA TO after GPU reset, it's a regression caused by cbd5285 drm/amdgpu: move setting the GART addr into TTM.
>>>
>>> But to me it looks safer just to revert the original patch altogether since we never can predict for sure if VM flush will take place and so it's safer to just always assign job->vm_pd_addr.
>>>
>>> Andrey



