[PATCH 1/2] drm/amdgpu: add flag to indicate amdgpu vm context

* [PATCH 1/2] drm/amdgpu: add flag to indicate amdgpu vm context
@ 2019-11-19 16:37 ` Alex Sierra
  0 siblings, 0 replies; 12+ messages in thread
From: Alex Sierra @ 2019-11-19 16:37 UTC (permalink / raw)
  To: amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW; +Cc: Alex Sierra

Flag added to indicate if the amdgpu vm context is used for compute or
graphics.

Change-Id: Ia813037fda2ec2947d73f5c7328388078fbeebe5
Signed-off-by: Alex Sierra <alex.sierra@amd.com>
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 3 +++
 drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h | 2 ++
 2 files changed, 5 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
index 3c0bd6472a46..d51ac8771ae0 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
@@ -2713,6 +2713,7 @@ int amdgpu_vm_init(struct amdgpu_device *adev, struct amdgpu_vm *vm,
 		goto error_free_direct;
 
 	vm->pte_support_ats = false;
+	vm->is_compute_context = false;
 
 	if (vm_context == AMDGPU_VM_CONTEXT_COMPUTE) {
 		vm->use_cpu_for_update = !!(adev->vm_manager.vm_update_mode &
@@ -2898,6 +2899,7 @@ int amdgpu_vm_make_compute(struct amdgpu_device *adev, struct amdgpu_vm *vm,
 		vm->update_funcs = &amdgpu_vm_sdma_funcs;
 	dma_fence_put(vm->last_update);
 	vm->last_update = NULL;
+	vm->is_compute_context = true;
 
 	if (vm->pasid) {
 		unsigned long flags;
@@ -2952,6 +2954,7 @@ void amdgpu_vm_release_compute(struct amdgpu_device *adev, struct amdgpu_vm *vm)
 		spin_unlock_irqrestore(&adev->vm_manager.pasid_lock, flags);
 	}
 	vm->pasid = 0;
+	vm->is_compute_context = false;
 }
 
 /**
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h
index 5cb25c1c54e0..76fcf853035c 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h
@@ -301,6 +301,8 @@ struct amdgpu_vm {
 	struct ttm_lru_bulk_move lru_bulk_move;
 	/* mark whether can do the bulk move */
 	bool			bulk_moveable;
+	/* Flag to indicate if VM is used for compute */
+	bool			is_compute_context;
 };
 
 struct amdgpu_vm_manager {
-- 
2.17.1

_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply related	[flat|nested] 12+ messages in thread
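
Taken together, the three hunks in amdgpu_vm.c behave like a small state
machine for the new field. Below is a minimal user-space sketch of that
lifecycle, assuming only what the hunks show. struct sketch_vm and the
sketch_* helpers are hypothetical stand-ins and not kernel code.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for struct amdgpu_vm; only the new field is modeled. */
struct sketch_vm {
	bool is_compute_context;
};

/* Mirrors the hunk in amdgpu_vm_init(): the flag is cleared at init time. */
static void sketch_vm_init(struct sketch_vm *vm)
{
	vm->is_compute_context = false;
}

/* Mirrors the hunk in amdgpu_vm_make_compute(): the flag is set. */
static void sketch_vm_make_compute(struct sketch_vm *vm)
{
	vm->is_compute_context = true;
}

/* Mirrors the hunk in amdgpu_vm_release_compute(): the flag is cleared. */
static void sketch_vm_release_compute(struct sketch_vm *vm)
{
	vm->is_compute_context = false;
}

/* Walks one VM through the three transitions and checks the flag each time. */
static void sketch_lifecycle_check(void)
{
	struct sketch_vm vm;

	sketch_vm_init(&vm);
	assert(!vm.is_compute_context);

	sketch_vm_make_compute(&vm);
	assert(vm.is_compute_context);

	sketch_vm_release_compute(&vm);
	assert(!vm.is_compute_context);
}
```

Note that, as far as the hunks shown here go, the flag only becomes true
through amdgpu_vm_make_compute(); the amdgpu_vm_init() hunk clears it
before the vm_context check.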

* [PATCH 2/2] amd/amdgpu: force to trigger a no-retry-fault after a retry-fault
@ 2019-11-19 16:37     ` Alex Sierra
  0 siblings, 0 replies; 12+ messages in thread
From: Alex Sierra @ 2019-11-19 16:37 UTC (permalink / raw)
  To: amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW; +Cc: Alex Sierra

Only for the debugger use case.

[why]
Avoid endless translation retries after an invalid address access has
been issued to the GPU. Instead, entry into the trap handler is forced
by generating a no-retry-fault.
In the debugger case, an s_trap instruction is inserted to let the wave
enter the trap handler and save its context.

[how]
Intentionally use an invalid flag combination (F and P set at the same
time) to trigger a no-retry-fault after a retry-fault happens. This is
only done for compute contexts.

Change-Id: I4180c30e2631dc0401cbd6171f8a6776e4733c9a
Signed-off-by: Alex Sierra <alex.sierra@amd.com>
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 9 ++++++++-
 1 file changed, 8 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
index d51ac8771ae0..cd36195ff8be 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
@@ -3202,11 +3202,18 @@ bool amdgpu_vm_handle_fault(struct amdgpu_device *adev, unsigned int pasid,
 	flags = AMDGPU_PTE_VALID | AMDGPU_PTE_SNOOPED |
 		AMDGPU_PTE_SYSTEM;
 
-	if (amdgpu_vm_fault_stop == AMDGPU_VM_FAULT_STOP_NEVER) {
+	if (vm->is_compute_context) {
+		/* Setting PTE flags to trigger a no-retry-fault  */
+		flags = AMDGPU_PTE_EXECUTABLE | AMDGPU_PDE_PTE |
+			AMDGPU_PTE_TF;
+		value = 0;
+
+	} else if (amdgpu_vm_fault_stop == AMDGPU_VM_FAULT_STOP_NEVER) {
 		/* Redirect the access to the dummy page */
 		value = adev->dummy_page_addr;
 		flags |= AMDGPU_PTE_EXECUTABLE | AMDGPU_PTE_READABLE |
 			AMDGPU_PTE_WRITEABLE;
+
 	} else {
 		/* Let the hw retry silently on the PTE */
 		value = 0;
-- 
2.17.1

^ permalink raw reply related	[flat|nested] 12+ messages in thread
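
The branch order introduced by this version can be exercised outside the
kernel. Below is a minimal sketch of the three-way selection, assuming
the structure visible in the hunk. The SK_* bit positions, struct sk_pte
and sk_pick_pte() are illustrative assumptions; the real AMDGPU_PTE_* and
AMDGPU_PDE_PTE definitions are not part of this thread. The part of the
final else branch that the hunk cuts off is not modeled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Illustrative bit positions only (assumption, not taken from the thread). */
#define SK_PTE_VALID      (1ULL << 0)
#define SK_PTE_SYSTEM     (1ULL << 1)
#define SK_PTE_SNOOPED    (1ULL << 2)
#define SK_PTE_EXECUTABLE (1ULL << 4)
#define SK_PTE_READABLE   (1ULL << 5)
#define SK_PTE_WRITEABLE  (1ULL << 6)
#define SK_PDE_PTE        (1ULL << 54)
#define SK_PTE_TF         (1ULL << 56)

/* Hypothetical stand-in for the amdgpu_vm_fault_stop module setting. */
enum sk_fault_stop { SK_FAULT_STOP_NEVER, SK_FAULT_STOP_OTHER };

struct sk_pte {
	uint64_t value;
	uint64_t flags;
};

/* Picks the PTE value and flags in the same order as the patched branch chain. */
static struct sk_pte sk_pick_pte(bool is_compute, enum sk_fault_stop stop,
				 uint64_t dummy_page_addr)
{
	struct sk_pte pte = {
		.value = 0,
		.flags = SK_PTE_VALID | SK_PTE_SNOOPED | SK_PTE_SYSTEM,
	};

	if (is_compute) {
		/* Intentionally invalid combination to force a no-retry-fault. */
		pte.flags = SK_PTE_EXECUTABLE | SK_PDE_PTE | SK_PTE_TF;
		pte.value = 0;
	} else if (stop == SK_FAULT_STOP_NEVER) {
		/* Redirect the access to the dummy page. */
		pte.value = dummy_page_addr;
		pte.flags |= SK_PTE_EXECUTABLE | SK_PTE_READABLE |
			     SK_PTE_WRITEABLE;
	} else {
		/* Let the hardware retry; the rest of this branch is not shown. */
		pte.value = 0;
	}
	return pte;
}

/* Self-check: a compute VM gets the no-retry flags whatever the stop mode. */
static void sk_selfcheck(void)
{
	struct sk_pte pte = sk_pick_pte(true, SK_FAULT_STOP_OTHER, 0x1000);

	assert(pte.value == 0);
	assert(pte.flags == (SK_PTE_EXECUTABLE | SK_PDE_PTE | SK_PTE_TF));
}
```

Because the compute check comes first, the fault-stop setting only
influences non-compute VMs in this sketch, which matches the ordering in
the hunk above.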

* Re: [PATCH 2/2] amd/amdgpu: force to trigger a no-retry-fault after a retry-fault
@ 2019-11-19 16:45         ` Felix Kuehling
  0 siblings, 0 replies; 12+ messages in thread
From: Felix Kuehling @ 2019-11-19 16:45 UTC (permalink / raw)
  To: Alex Sierra, amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW

On 2019-11-19 11:37, Alex Sierra wrote:
> Only for the debugger use case.
>
> [why]
> Avoid endless translation retries, after an invalid address access has
> been issued to the GPU. Instead, the trap handler is forced to enter by
> generating a no-retry-fault.
> A s_trap instruction is inserted in the debugger case to let the wave to
> enter trap handler to save context.
>
> [how]
> Intentionally using an invalid flag combination (F and P set at the same
> time) to trigger a no-retry-fault, after a retry-fault happens. This is
> only valid under compute context.
>
> Change-Id: I4180c30e2631dc0401cbd6171f8a6776e4733c9a
> Signed-off-by: Alex Sierra <alex.sierra@amd.com>

This commit adds some unnecessary empty lines. See inline. With that 
fixed, the series is

Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>

Please also give Christian a chance to review.

Thanks,
   Felix

> ---
>   drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 9 ++++++++-
>   1 file changed, 8 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
> index d51ac8771ae0..cd36195ff8be 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
> @@ -3202,11 +3202,18 @@ bool amdgpu_vm_handle_fault(struct amdgpu_device *adev, unsigned int pasid,
>   	flags = AMDGPU_PTE_VALID | AMDGPU_PTE_SNOOPED |
>   		AMDGPU_PTE_SYSTEM;
>   
> -	if (amdgpu_vm_fault_stop == AMDGPU_VM_FAULT_STOP_NEVER) {
> +	if (vm->is_compute_context) {
> +		/* Setting PTE flags to trigger a no-retry-fault  */
> +		flags = AMDGPU_PTE_EXECUTABLE | AMDGPU_PDE_PTE |
> +			AMDGPU_PTE_TF;
> +		value = 0;
> +
Unnecessary blank line.
> +	} else if (amdgpu_vm_fault_stop == AMDGPU_VM_FAULT_STOP_NEVER) {
>   		/* Redirect the access to the dummy page */
>   		value = adev->dummy_page_addr;
>   		flags |= AMDGPU_PTE_EXECUTABLE | AMDGPU_PTE_READABLE |
>   			AMDGPU_PTE_WRITEABLE;
> +
Unnecessary blank line.
>   	} else {
>   		/* Let the hw retry silently on the PTE */
>   		value = 0;

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH 2/2] amd/amdgpu: force to trigger a no-retry-fault after a retry-fault
@ 2019-11-19 20:06             ` Christian König
  0 siblings, 0 replies; 12+ messages in thread
From: Christian König @ 2019-11-19 20:06 UTC (permalink / raw)
  To: Felix Kuehling, Alex Sierra, amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW

Am 19.11.19 um 17:45 schrieb Felix Kuehling:
> On 2019-11-19 11:37, Alex Sierra wrote:
>> Only for the debugger use case.
>>
>> [why]
>> Avoid endless translation retries, after an invalid address access has
>> been issued to the GPU. Instead, the trap handler is forced to enter by
>> generating a no-retry-fault.
>> A s_trap instruction is inserted in the debugger case to let the wave to
>> enter trap handler to save context.
>>
>> [how]
>> Intentionally using an invalid flag combination (F and P set at the same
>> time) to trigger a no-retry-fault, after a retry-fault happens. This is
>> only valid under compute context.
>>
>> Change-Id: I4180c30e2631dc0401cbd6171f8a6776e4733c9a
>> Signed-off-by: Alex Sierra <alex.sierra@amd.com>
>
> This commit adds some unnecessary empty lines. See inline. With that 
> fixed, the series is
>
> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>

I actually like the empty lines; they nicely emphasize that a new block
starts.

Maybe note in the code comment that the flag combination is
intentionally invalid as well.

But either way the series is Reviewed-by: Christian König 
<christian.koenig@amd.com>.

Regards,
Christian.

>
> Please also give Christian a chance to review.
>
> Thanks,
>   Felix
>
>> ---
>>   drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 9 ++++++++-
>>   1 file changed, 8 insertions(+), 1 deletion(-)
>>
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
>> index d51ac8771ae0..cd36195ff8be 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
>> @@ -3202,11 +3202,18 @@ bool amdgpu_vm_handle_fault(struct amdgpu_device *adev, unsigned int pasid,
>>       flags = AMDGPU_PTE_VALID | AMDGPU_PTE_SNOOPED |
>>           AMDGPU_PTE_SYSTEM;
>>
>> -    if (amdgpu_vm_fault_stop == AMDGPU_VM_FAULT_STOP_NEVER) {
>> +    if (vm->is_compute_context) {
>> +        /* Setting PTE flags to trigger a no-retry-fault  */
>> +        flags = AMDGPU_PTE_EXECUTABLE | AMDGPU_PDE_PTE |
>> +            AMDGPU_PTE_TF;
>> +        value = 0;
>> +
> Unnecessary blank line.
>> +    } else if (amdgpu_vm_fault_stop == AMDGPU_VM_FAULT_STOP_NEVER) {
>>           /* Redirect the access to the dummy page */
>>           value = adev->dummy_page_addr;
>>           flags |= AMDGPU_PTE_EXECUTABLE | AMDGPU_PTE_READABLE |
>>               AMDGPU_PTE_WRITEABLE;
>> +
> Unnecessary blank line.
>>       } else {
>>           /* Let the hw retry silently on the PTE */
>>           value = 0;

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH 2/2] amd/amdgpu: force to trigger a no-retry-fault after a retry-fault
@ 2019-11-18 22:46         ` Felix Kuehling
  0 siblings, 0 replies; 12+ messages in thread
From: Felix Kuehling @ 2019-11-18 22:46 UTC (permalink / raw)
  To: Alex Sierra, amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW

On 2019-11-18 17:24, Alex Sierra wrote:
> Only for the debugger use case.
>
> [why]
> Avoid endless translation retries, after an invalid address access has
> been issued to the GPU. Instead, the trap handler is forced to enter by
> generating a no-retry-fault.
> A s_trap instruction is inserted in the debugger case to let the wave to
> enter trap handler to save context.
>
> [how]
> Intentionally using an invalid flag combination (F and P set at the same
> time) to trigger a no-retry-fault, after a retry-fault happens. This is
> only valid under compute context.
>
> Change-Id: I4180c30e2631dc0401cbd6171f8a6776e4733c9a
> Signed-off-by: Alex Sierra <alex.sierra@amd.com>
> ---
>   drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 6 ++++++
>   1 file changed, 6 insertions(+)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
> index d51ac8771ae0..358a4f50fcfb 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
> @@ -3207,6 +3207,12 @@ bool amdgpu_vm_handle_fault(struct amdgpu_device *adev, unsigned int pasid,
>   		value = adev->dummy_page_addr;
>   		flags |= AMDGPU_PTE_EXECUTABLE | AMDGPU_PTE_READABLE |
>   			AMDGPU_PTE_WRITEABLE;
> +
> +		if (vm->is_compute_context) {
> +			/* Setting PTE flags to trigger a no-retry-fault  */
> +			flags = AMDGPU_PTE_EXECUTABLE | AMDGPU_PDE_PTE |
> +				AMDGPU_PTE_TF;

Hmm, this looks like you're setting flags twice in the compute-case. I 
was also expecting something more like this:

if (vm->is_compute_context) {
     ...
} else if (amdgpu_vm_fault_stop == AMDGPU_VM_FAULT_STOP_NEVER) {
     ...
} else {
     ...
}

I.e. for compute contexts, we do our compute-specific thing, otherwise 
the behaviour depends on the amdgpu_vm_fault_stop setting.

Regards,
   Felix


> +		}
>   	} else {
>   		/* Let the hw retry silently on the PTE */
>   		value = 0;

^ permalink raw reply	[flat|nested] 12+ messages in thread
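
The difference between the nested form reviewed here and the chained form
Felix suggests can be modeled in a few lines. This is a rough sketch
under stated assumptions: cmp_nested() follows the quoted hunk,
cmp_chained() follows the suggested if / else if / else layout, and the
cmp_* names and the flag_writes counter are hypothetical bookkeeping, not
anything in the driver.

```c
#include <assert.h>
#include <stdbool.h>

/* Which flag set ends up in "flags" after the branch chain has run. */
enum cmp_flags_kind { CMP_FLAGS_NO_RETRY, CMP_FLAGS_DUMMY_PAGE, CMP_FLAGS_BASE };

struct cmp_result {
	enum cmp_flags_kind kind;
	int flag_writes;      /* writes to flags inside the branch chain */
	bool value_is_dummy;  /* true if value = adev->dummy_page_addr */
};

/* Nested form: the compute check sits inside the STOP_NEVER branch. */
static struct cmp_result cmp_nested(bool is_compute, bool stop_never)
{
	struct cmp_result r = { CMP_FLAGS_BASE, 0, false };

	if (stop_never) {
		r.value_is_dummy = true;
		r.kind = CMP_FLAGS_DUMMY_PAGE;
		r.flag_writes++;
		if (is_compute) {
			/* Overwrites the flags that were just set above. */
			r.kind = CMP_FLAGS_NO_RETRY;
			r.flag_writes++;
		}
	}
	return r;
}

/* Chained form: compute first, then the fault-stop setting, then default. */
static struct cmp_result cmp_chained(bool is_compute, bool stop_never)
{
	struct cmp_result r = { CMP_FLAGS_BASE, 0, false };

	if (is_compute) {
		r.kind = CMP_FLAGS_NO_RETRY;
		r.flag_writes++;
	} else if (stop_never) {
		r.value_is_dummy = true;
		r.kind = CMP_FLAGS_DUMMY_PAGE;
		r.flag_writes++;
	}
	return r;
}

/* Self-check of the two behaviours the review comment points at. */
static void cmp_selfcheck(void)
{
	assert(cmp_nested(true, true).flag_writes == 2);
	assert(cmp_nested(true, false).kind == CMP_FLAGS_BASE);
	assert(cmp_chained(true, false).kind == CMP_FLAGS_NO_RETRY);
	assert(cmp_chained(true, true).flag_writes == 1);
}
```

In this model the nested form writes the flags twice for a compute VM and
only reaches the no-retry flags when amdgpu_vm_fault_stop is
AMDGPU_VM_FAULT_STOP_NEVER, while the chained form handles compute VMs
independently of that setting.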

* [PATCH 2/2] amd/amdgpu: force to trigger a no-retry-fault after a retry-fault
@ 2019-11-18 22:24     ` Alex Sierra
  0 siblings, 0 replies; 12+ messages in thread
From: Alex Sierra @ 2019-11-18 22:24 UTC (permalink / raw)
  To: amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW; +Cc: Alex Sierra

Only for the debugger use case.

[why]
Avoid endless translation retries after an invalid address access has
been issued to the GPU. Instead, entry into the trap handler is forced
by generating a no-retry-fault.
In the debugger case, an s_trap instruction is inserted to let the wave
enter the trap handler and save its context.

[how]
Intentionally use an invalid flag combination (F and P set at the same
time) to trigger a no-retry-fault after a retry-fault happens. This is
only done for compute contexts.

Change-Id: I4180c30e2631dc0401cbd6171f8a6776e4733c9a
Signed-off-by: Alex Sierra <alex.sierra@amd.com>
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
index d51ac8771ae0..358a4f50fcfb 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
@@ -3207,6 +3207,12 @@ bool amdgpu_vm_handle_fault(struct amdgpu_device *adev, unsigned int pasid,
 		value = adev->dummy_page_addr;
 		flags |= AMDGPU_PTE_EXECUTABLE | AMDGPU_PTE_READABLE |
 			AMDGPU_PTE_WRITEABLE;
+
+		if (vm->is_compute_context) {
+			/* Setting PTE flags to trigger a no-retry-fault  */
+			flags = AMDGPU_PTE_EXECUTABLE | AMDGPU_PDE_PTE |
+				AMDGPU_PTE_TF;
+		}
 	} else {
 		/* Let the hw retry silently on the PTE */
 		value = 0;
-- 
2.17.1

^ permalink raw reply related	[flat|nested] 12+ messages in thread

end of thread, other threads:[~2019-11-19 20:06 UTC | newest]

Thread overview: 12+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-11-19 16:37 [PATCH 1/2] drm/amdgpu: add flag to indicate amdgpu vm context Alex Sierra
2019-11-19 16:37 ` Alex Sierra
     [not found] ` <20191119163754.4966-1-alex.sierra-5C7GfCeVMHo@public.gmane.org>
2019-11-19 16:37   ` [PATCH 2/2] amd/amdgpu: force to trigger a no-retry-fault after a retry-fault Alex Sierra
2019-11-19 16:37     ` Alex Sierra
     [not found]     ` <20191119163754.4966-2-alex.sierra-5C7GfCeVMHo@public.gmane.org>
2019-11-19 16:45       ` Felix Kuehling
2019-11-19 16:45         ` Felix Kuehling
     [not found]         ` <2b96848e-cf45-b558-e453-8a73de83d4a3-5C7GfCeVMHo@public.gmane.org>
2019-11-19 20:06           ` Christian König
2019-11-19 20:06             ` Christian König
  -- strict thread matches above, loose matches on Subject: below --
2019-11-18 22:24 [PATCH 1/2] drm/amdgpu: add flag to indicate amdgpu vm context Alex Sierra
     [not found] ` <20191118222435.93134-1-alex.sierra-5C7GfCeVMHo@public.gmane.org>
2019-11-18 22:24   ` [PATCH 2/2] amd/amdgpu: force to trigger a no-retry-fault after a retry-fault Alex Sierra
2019-11-18 22:24     ` Alex Sierra
     [not found]     ` <20191118222435.93134-2-alex.sierra-5C7GfCeVMHo@public.gmane.org>
2019-11-18 22:46       ` Felix Kuehling
2019-11-18 22:46         ` Felix Kuehling
