* [PATCH 1/1] drm/amdgpu: Sync with VM root BO when switching VM to CPU update mode
From: Felix Kuehling @ 2020-05-20 22:51 UTC
  To: amd-gfx; +Cc: jay.cornwall, christian.koenig

This fixes an intermittent bug where a root PD clear operation still in
progress could overwrite a PDE update done by the CPU, resulting in a
VM fault.

Fixes: 108b4d928c03 ("drm/amd/amdgpu: Update VM function pointer")
Reported-by: Jay Cornwall <Jay.Cornwall@amd.com>
Tested-by: Jay Cornwall <Jay.Cornwall@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 11 +++++++++--
 1 file changed, 9 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
index 414a0b1c2e5a..7417754e9141 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
@@ -3000,10 +3000,17 @@ int amdgpu_vm_make_compute(struct amdgpu_device *adev, struct amdgpu_vm *vm,
 		   !amdgpu_gmc_vram_full_visible(&adev->gmc)),
 		  "CPU update of VM recommended only for large BAR system\n");
 
-	if (vm->use_cpu_for_update)
+	if (vm->use_cpu_for_update) {
+		/* Sync with last SDMA update/clear before switching to CPU */
+		r = amdgpu_bo_sync_wait(vm->root.base.bo,
+					AMDGPU_FENCE_OWNER_UNDEFINED, true);
+		if (r)
+			goto free_idr;
+
 		vm->update_funcs = &amdgpu_vm_cpu_funcs;
-	else
+	} else {
 		vm->update_funcs = &amdgpu_vm_sdma_funcs;
+	}
 	dma_fence_put(vm->last_update);
 	vm->last_update = NULL;
 	vm->is_compute_context = true;
-- 
2.17.1
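
The race described in the commit message can be sketched with a small toy model. This is only an illustration under stated assumptions: every `toy_*` name is hypothetical and none of it is amdgpu code. "Waiting" is modelled by simply running the queued clear to completion, and the page directory holds a single entry.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy stand-in for a root page directory holding one entry (hypothetical). */
struct toy_pd {
	uint64_t pde;		/* the single page-directory entry */
	bool clear_pending;	/* an asynchronous clear is still queued */
};

/* Queue an asynchronous clear, standing in for the GPU-side clear. */
static void toy_queue_clear(struct toy_pd *pd)
{
	pd->clear_pending = true;
}

/* The engine gets around to the queued clear and zeroes the entry. */
static void toy_run_pending(struct toy_pd *pd)
{
	if (pd->clear_pending) {
		pd->pde = 0;
		pd->clear_pending = false;
	}
}

/* In this toy, waiting for queued work just runs it to completion. */
static void toy_sync_wait(struct toy_pd *pd)
{
	toy_run_pending(pd);
}

/* CPU update path: write the entry directly. */
static void toy_cpu_write(struct toy_pd *pd, uint64_t value)
{
	assert(value != 0);	/* 0 is reserved for "cleared" in this toy */
	pd->pde = value;
}

/* Switch to CPU updates, optionally syncing first; return the final PDE. */
static uint64_t toy_switch_to_cpu(bool sync_first, uint64_t value)
{
	struct toy_pd pd = { .pde = 0xdeadbeef, .clear_pending = false };

	toy_queue_clear(&pd);
	if (sync_first)
		toy_sync_wait(&pd);
	toy_cpu_write(&pd, value);
	toy_run_pending(&pd);	/* a late clear lands here if still queued */
	return pd.pde;
}
```

Without the sync, the late clear wipes the CPU-written entry and `toy_switch_to_cpu(false, 0x1000)` returns 0, which stands for the stale PDE behind the VM fault. With the sync, the same call with `true` returns 0x1000.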

_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


* Re: [PATCH 1/1] drm/amdgpu: Sync with VM root BO when switching VM to CPU update mode
From: Christian König @ 2020-05-21 13:50 UTC
  To: Felix Kuehling, amd-gfx; +Cc: jay.cornwall, christian.koenig

On 21.05.20 at 00:51, Felix Kuehling wrote:
> This fixes an intermittent bug where a root PD clear operation still in
> progress could overwrite a PDE update done by the CPU, resulting in a
> VM fault.

Hmm, it might be better to add this to amdgpu_vm_cpu_prepare().

This way we could (in theory) switch between CPU- and SDMA-based updates
on the fly elsewhere as well.

Christian.
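
The idea of syncing in a per-backend prepare hook, rather than at one call site, can be sketched as follows. This is a toy illustration only: the `toy_*` names, the callback table layout, and the bookkeeping fields are all assumptions and do not reflect the real amdgpu structures.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct toy_vm;

/* Hypothetical table of update callbacks, selected per VM. */
struct toy_update_funcs {
	int (*prepare)(struct toy_vm *vm);
};

struct toy_vm {
	const struct toy_update_funcs *funcs;
	bool gpu_work_pending;	/* queued GPU-side page-table work */
	int syncs_done;		/* how often the CPU path had to wait */
};

/* CPU backend: wait for any queued GPU work before touching page tables. */
static int toy_cpu_prepare(struct toy_vm *vm)
{
	if (vm->gpu_work_pending) {
		vm->gpu_work_pending = false;	/* "wait" completes the work */
		vm->syncs_done++;
	}
	return 0;
}

/* GPU backend: the update is queued and stays pending for a while. */
static int toy_sdma_prepare(struct toy_vm *vm)
{
	vm->gpu_work_pending = true;
	return 0;
}

static const struct toy_update_funcs toy_cpu_funcs = {
	.prepare = toy_cpu_prepare,
};

static const struct toy_update_funcs toy_sdma_funcs = {
	.prepare = toy_sdma_prepare,
};

/* Select a backend and prepare an update; the sync travels with the hook. */
static int toy_update(struct toy_vm *vm, const struct toy_update_funcs *funcs)
{
	assert(vm != NULL && funcs != NULL && funcs->prepare != NULL);
	vm->funcs = funcs;
	return vm->funcs->prepare(vm);
}
```

Because the wait lives in the CPU backend's hook, any caller that switches backends gets it without extra code at the switch site, which is what would make on-the-fly switching possible in this toy.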


* Re: [PATCH 1/1] drm/amdgpu: Sync with VM root BO when switching VM to CPU update mode
From: Felix Kuehling @ 2020-05-21 17:06 UTC
  To: christian.koenig, amd-gfx; +Cc: jay.cornwall


On 2020-05-21 at 9:50 a.m., Christian König wrote:
> On 21.05.20 at 00:51, Felix Kuehling wrote:
>> This fixes an intermittent bug where a root PD clear operation still in
>> progress could overwrite a PDE update done by the CPU, resulting in a
>> VM fault.
>
> Hmm, it might be better to add this to amdgpu_vm_cpu_prepare().
>
> This way we could (in theory) switch between CPU- and SDMA-based
> updates on the fly elsewhere as well.

That won't work. I want to wait for AMDGPU_FENCE_OWNER_VM fences, so I
need to pass AMDGPU_FENCE_OWNER_UNDEFINED. But in amdgpu_vm_cpu_prepare()
that would also wait for AMDGPU_FENCE_OWNER_KFD eviction fences, which
would trigger unwanted evictions.

This works in amdgpu_vm_make_compute() because it runs before the
eviction fence is attached to the VM.

Regards,
  Felix
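
The ordering argument above can be illustrated with a toy model. It is a sketch under assumptions, not the driver's fence handling: the `toy_*` names are hypothetical, and the only rule modelled is the one stated in this mail, that a wait with an undefined owner blocks on every fence on the object and that waiting on an eviction fence triggers an eviction.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical owner tags, named loosely after those in the thread. */
enum toy_owner { TOY_OWNER_VM, TOY_OWNER_KFD };

struct toy_fence {
	enum toy_owner owner;
	bool waited_on;		/* set once some caller blocks on this fence */
};

/*
 * Toy version of a wait with an "undefined" owner: it blocks on every
 * fence attached to the object, whoever owns it.
 */
static void toy_wait_all(struct toy_fence *fences, size_t n)
{
	assert(fences != NULL || n == 0);
	for (size_t i = 0; i < n; i++)
		fences[i].waited_on = true;
}

/* Count eviction fences that were waited on; each is one unwanted eviction. */
static size_t toy_unwanted_evictions(const struct toy_fence *fences, size_t n)
{
	size_t count = 0;

	for (size_t i = 0; i < n; i++)
		if (fences[i].owner == TOY_OWNER_KFD && fences[i].waited_on)
			count++;
	return count;
}

/* Run the wait either before or after the eviction fence is attached. */
static size_t toy_evictions(bool eviction_fence_attached)
{
	struct toy_fence fences[2] = {
		{ .owner = TOY_OWNER_VM },	/* pending page-table clear */
		{ .owner = TOY_OWNER_KFD },	/* eviction fence */
	};
	size_t n = eviction_fence_attached ? 2 : 1;

	toy_wait_all(fences, n);
	return toy_unwanted_evictions(fences, n);
}
```

In this model `toy_evictions(false)` returns 0 and `toy_evictions(true)` returns 1: the same catch-all wait is harmless while only the VM fence exists and becomes a problem once the eviction fence is on the object.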



* Re: [PATCH 1/1] drm/amdgpu: Sync with VM root BO when switching VM to CPU update mode
From: Christian König @ 2020-05-22  8:59 UTC
  To: Felix Kuehling, amd-gfx; +Cc: jay.cornwall

On 21.05.20 at 19:06, Felix Kuehling wrote:
> On 2020-05-21 at 9:50 a.m., Christian König wrote:
>> On 21.05.20 at 00:51, Felix Kuehling wrote:
>>> This fixes an intermittent bug where a root PD clear operation still in
>>> progress could overwrite a PDE update done by the CPU, resulting in a
>>> VM fault.
>> Hmm, it might be better to add this to amdgpu_vm_cpu_prepare().
>>
>> This way we could (in theory) switch between CPU- and SDMA-based
>> updates on the fly elsewhere as well.
> That won't work. I want to wait for AMDGPU_FENCE_OWNER_VM fences, so I
> need to pass AMDGPU_FENCE_OWNER_UNDEFINED. But in amdgpu_vm_cpu_prepare()
> that would also wait for AMDGPU_FENCE_OWNER_KFD eviction fences, which
> would trigger unwanted evictions.
>
> This works in amdgpu_vm_make_compute() because it runs before the
> eviction fence is attached to the VM.

OK, in that case the patch is
Reviewed-by: Christian König <christian.koenig@amd.com>

