* Avoid uninterruptible sleep during process exit
@ 2018-04-24 15:30 ` Andrey Grodzovsky
  0 siblings, 0 replies; 122+ messages in thread
From: Andrey Grodzovsky @ 2018-04-24 15:30 UTC (permalink / raw)
  To: linux-kernel, amd-gfx
  Cc: Alexander.Deucher, Christian.Koenig, David.Panariti, oleg, akpm,
	ebiederm

The following 3 patches address an issue we encounter in the AMDGPU driver.

When the GPU pipe is stalling for some reason (shader code error, incorrectly
programmed registers, etc.), an uninterruptible wait in the kernel puts the
user process into an unresponsive state which can only be remedied by a hard
reset of the system.
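
To illustrate, a minimal sketch of the difference (the structure and helper
names here are made up for illustration, this is not code from the patches):

#include <linux/wait.h>

/* Hypothetical driver bits, for illustration only. */
struct my_ring {
	wait_queue_head_t flush_wq;
};
static bool ring_idle(struct my_ring *ring);

static int flush_jobs_uninterruptible(struct my_ring *ring)
{
	/* Task sits in D state; if the GPU pipe never drains, only a
	 * hard reset of the machine gets rid of the process. */
	wait_event(ring->flush_wq, ring_idle(ring));
	return 0;
}

static int flush_jobs_killable(struct my_ring *ring)
{
	/* SIGKILL can break this wait, but only if SIGKILL is actually
	 * generated for the task, which is what the first patch enables
	 * for tasks already marked PF_EXITING. */
	if (wait_event_killable(ring->flush_wq, ring_idle(ring)))
		return -ERESTARTSYS;
	return 0;
}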

Each patch addresses a different use case of this problem.

The first one covers normal exit (not via signal processing); the change in
kernel/signal.c allows generation of SIGKILL for a process already marked as
exiting.

The second one covers exit due to death from an unhandled signal during signal
processing, to avoid waiting for SIGKILL when you are called from
...->do_signal->get_signal->do_group_exit->do_exit->...->wait_event_killable

The third one is not related to process exit and just avoids an uninterruptible
wait for a particular job's completion on the GPU pipe.

P.S. Sending this to the kernel mailing list mainly because of the first
patch; the other two are intended more for amd-gfx@lists.freedesktop.org and
are given here just to provide more context for the problem we are trying to
solve.

Andrey Grodzovsky (3):

signals: Allow generation of SIGKILL to exiting task.   
drm/scheduler: Don't call wait_event_killable for signaled process.   
drm/amdgpu: Switch to interrupted wait to recover from ring hang.

drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c   | 14 ++++++++++----
drivers/gpu/drm/scheduler/gpu_scheduler.c |  5 +++--
kernel/signal.c                           |  4 ++--
3 files changed, 15 insertions(+), 8 deletions(-)

^ permalink raw reply	[flat|nested] 122+ messages in thread

* [PATCH 1/3] signals: Allow generation of SIGKILL to exiting task.
@ 2018-04-24 15:30   ` Andrey Grodzovsky
  0 siblings, 0 replies; 122+ messages in thread
From: Andrey Grodzovsky @ 2018-04-24 15:30 UTC (permalink / raw)
  To: linux-kernel, amd-gfx
  Cc: Alexander.Deucher, Christian.Koenig, David.Panariti, oleg, akpm,
	ebiederm, Andrey Grodzovsky

Currently, calling wait_event_killable as part of an exiting process
will stall forever, since SIGKILL generation is suppressed by PF_EXITING.

In our particular case the AMDGPU driver wants to flush all GPU jobs in
flight before shutting down. But if some job hangs the pipe we still want
to be able to kill it and avoid leaving the process stuck in D state.

Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
---
 kernel/signal.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/kernel/signal.c b/kernel/signal.c
index c6e4c83..c49c706 100644
--- a/kernel/signal.c
+++ b/kernel/signal.c
@@ -886,10 +886,10 @@ static inline int wants_signal(int sig, struct task_struct *p)
 {
 	if (sigismember(&p->blocked, sig))
 		return 0;
-	if (p->flags & PF_EXITING)
-		return 0;
 	if (sig == SIGKILL)
 		return 1;
+	if (p->flags & PF_EXITING)
+		return 0;
 	if (task_is_stopped_or_traced(p))
 		return 0;
 	return task_curr(p) || !signal_pending(p);
-- 
2.7.4
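
For reference, wants_signal() with this hunk applied reads as follows
(reconstructed from the context lines of the hunk above):

static inline int wants_signal(int sig, struct task_struct *p)
{
	if (sigismember(&p->blocked, sig))
		return 0;
	/* SIGKILL is now accepted even for a task that is already exiting. */
	if (sig == SIGKILL)
		return 1;
	if (p->flags & PF_EXITING)
		return 0;
	if (task_is_stopped_or_traced(p))
		return 0;
	return task_curr(p) || !signal_pending(p);
}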

^ permalink raw reply related	[flat|nested] 122+ messages in thread

* [PATCH 2/3] drm/scheduler: Don't call wait_event_killable for signaled process.
@ 2018-04-24 15:30   ` Andrey Grodzovsky
  0 siblings, 0 replies; 122+ messages in thread
From: Andrey Grodzovsky @ 2018-04-24 15:30 UTC (permalink / raw)
  To: linux-kernel, amd-gfx
  Cc: Alexander.Deucher, Christian.Koenig, David.Panariti, oleg, akpm,
	ebiederm, Andrey Grodzovsky

Avoid calling wait_event_killable when you are possibly being called
from the get_signal routine, since in that case you end up in a deadlock
where you are already blocked in signal processing and trying to wait
on a new signal.

Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
---
 drivers/gpu/drm/scheduler/gpu_scheduler.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/scheduler/gpu_scheduler.c b/drivers/gpu/drm/scheduler/gpu_scheduler.c
index 088ff2b..09fd258 100644
--- a/drivers/gpu/drm/scheduler/gpu_scheduler.c
+++ b/drivers/gpu/drm/scheduler/gpu_scheduler.c
@@ -227,9 +227,10 @@ void drm_sched_entity_do_release(struct drm_gpu_scheduler *sched,
 		return;
 	/**
 	 * The client will not queue more IBs during this fini, consume existing
-	 * queued IBs or discard them on SIGKILL
+	 * queued IBs or discard them when in death signal state since
+	 * wait_event_killable can't receive signals in that state.
 	*/
-	if ((current->flags & PF_SIGNALED) && current->exit_code == SIGKILL)
+	if (current->flags & PF_SIGNALED)
 		entity->fini_status = -ERESTARTSYS;
 	else
 		entity->fini_status = wait_event_killable(sched->job_scheduled,
-- 
2.7.4
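
The problematic path, as described in the cover letter, roughly looks like
this (the intermediate steps through the file-release machinery are
abbreviated and driver specific):

/*
 *   get_signal()                  fatal signal being delivered
 *     do_group_exit()
 *       do_exit()
 *         ... (file/fd teardown releasing the DRM file) ...
 *           drm_sched_entity_do_release()
 *             wait_event_killable()  <- waits for a (fatal) signal while
 *                                       the task is already inside signal
 *                                       delivery, so it never wakes up
 */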

^ permalink raw reply related	[flat|nested] 122+ messages in thread

* [PATCH 3/3] drm/amdgpu: Switch to interrupted wait to recover from ring hang.
@ 2018-04-24 15:30   ` Andrey Grodzovsky
  0 siblings, 0 replies; 122+ messages in thread
From: Andrey Grodzovsky @ 2018-04-24 15:30 UTC (permalink / raw)
  To: linux-kernel, amd-gfx
  Cc: Alexander.Deucher, Christian.Koenig, David.Panariti, oleg, akpm,
	ebiederm, Andrey Grodzovsky

If the ring is hanging for some reason, allow the wait to be recovered
by sending a fatal signal.

Originally-by: David Panariti <David.Panariti@amd.com>
Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c | 14 ++++++++++----
 1 file changed, 10 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c
index eb80edf..37a36af 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c
@@ -421,10 +421,16 @@ int amdgpu_ctx_wait_prev_fence(struct amdgpu_ctx *ctx, unsigned ring_id)
 
 	if (other) {
 		signed long r;
-		r = dma_fence_wait_timeout(other, false, MAX_SCHEDULE_TIMEOUT);
-		if (r < 0) {
-			DRM_ERROR("Error (%ld) waiting for fence!\n", r);
-			return r;
+
+		while (true) {
+			if ((r = dma_fence_wait_timeout(other, true,
+					MAX_SCHEDULE_TIMEOUT)) >= 0)
+				return 0;
+
+			if (fatal_signal_pending(current)) {
+				DRM_ERROR("Error (%ld) waiting for fence!\n", r);
+				return r;
+			}
 		}
 	}
 
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 122+ messages in thread

* Re: [PATCH 2/3] drm/scheduler: Don't call wait_event_killable for signaled process.
  2018-04-24 15:30   ` Andrey Grodzovsky
  (?)
@ 2018-04-24 15:46   ` Michel Dänzer
  2018-04-24 15:51       ` Andrey Grodzovsky
                       ` (2 more replies)
  -1 siblings, 3 replies; 122+ messages in thread
From: Michel Dänzer @ 2018-04-24 15:46 UTC (permalink / raw)
  To: Andrey Grodzovsky, linux-kernel, amd-gfx, dri-devel
  Cc: David.Panariti, oleg, ebiederm, Alexander.Deucher, akpm,
	Christian.Koenig


Adding the dri-devel list, since this is driver independent code.


On 2018-04-24 05:30 PM, Andrey Grodzovsky wrote:
> Avoid calling wait_event_killable when you are possibly being called
> from get_signal routine since in that case you end up in a deadlock
> where you are alreay blocked in singla processing any trying to wait

Multiple typos here, "[...] already blocked in signal processing and [...]"?


> on a new signal.
> 
> Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
> ---
>  drivers/gpu/drm/scheduler/gpu_scheduler.c | 5 +++--
>  1 file changed, 3 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/gpu/drm/scheduler/gpu_scheduler.c b/drivers/gpu/drm/scheduler/gpu_scheduler.c
> index 088ff2b..09fd258 100644
> --- a/drivers/gpu/drm/scheduler/gpu_scheduler.c
> +++ b/drivers/gpu/drm/scheduler/gpu_scheduler.c
> @@ -227,9 +227,10 @@ void drm_sched_entity_do_release(struct drm_gpu_scheduler *sched,
>  		return;
>  	/**
>  	 * The client will not queue more IBs during this fini, consume existing
> -	 * queued IBs or discard them on SIGKILL
> +	 * queued IBs or discard them when in death signal state since
> +	 * wait_event_killable can't receive signals in that state.
>  	*/
> -	if ((current->flags & PF_SIGNALED) && current->exit_code == SIGKILL)
> +	if (current->flags & PF_SIGNALED)
>  		entity->fini_status = -ERESTARTSYS;
>  	else
>  		entity->fini_status = wait_event_killable(sched->job_scheduled,
> 


-- 
Earthling Michel Dänzer               |               http://www.amd.com
Libre software enthusiast             |             Mesa and X developer

^ permalink raw reply	[flat|nested] 122+ messages in thread

* Re: [PATCH 2/3] drm/scheduler: Don't call wait_event_killable for signaled process.
@ 2018-04-24 15:51       ` Andrey Grodzovsky
  0 siblings, 0 replies; 122+ messages in thread
From: Andrey Grodzovsky @ 2018-04-24 15:51 UTC (permalink / raw)
  To: Michel Dänzer, linux-kernel, amd-gfx, dri-devel
  Cc: David.Panariti, oleg, ebiederm, Alexander.Deucher, akpm,
	Christian.Koenig



On 04/24/2018 11:46 AM, Michel Dänzer wrote:
> Adding the dri-devel list, since this is driver independent code.

Thanks, there were so many addresses that this one slipped out...
>
>
> On 2018-04-24 05:30 PM, Andrey Grodzovsky wrote:
>> Avoid calling wait_event_killable when you are possibly being called
>> from get_signal routine since in that case you end up in a deadlock
>> where you are alreay blocked in singla processing any trying to wait
> Multiple typos here, "[...] already blocked in signal processing and [...]"?

I don't understand where the typos are.

Andrey

>
>
>> on a new signal.
>>
>> Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
>> ---
>>   drivers/gpu/drm/scheduler/gpu_scheduler.c | 5 +++--
>>   1 file changed, 3 insertions(+), 2 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/scheduler/gpu_scheduler.c b/drivers/gpu/drm/scheduler/gpu_scheduler.c
>> index 088ff2b..09fd258 100644
>> --- a/drivers/gpu/drm/scheduler/gpu_scheduler.c
>> +++ b/drivers/gpu/drm/scheduler/gpu_scheduler.c
>> @@ -227,9 +227,10 @@ void drm_sched_entity_do_release(struct drm_gpu_scheduler *sched,
>>   		return;
>>   	/**
>>   	 * The client will not queue more IBs during this fini, consume existing
>> -	 * queued IBs or discard them on SIGKILL
>> +	 * queued IBs or discard them when in death signal state since
>> +	 * wait_event_killable can't receive signals in that state.
>>   	*/
>> -	if ((current->flags & PF_SIGNALED) && current->exit_code == SIGKILL)
>> +	if (current->flags & PF_SIGNALED)
>>   		entity->fini_status = -ERESTARTSYS;
>>   	else
>>   		entity->fini_status = wait_event_killable(sched->job_scheduled,
>>
>

^ permalink raw reply	[flat|nested] 122+ messages in thread

* RE: [PATCH 3/3] drm/amdgpu: Switch to interrupted wait to recover from ring hang.
@ 2018-04-24 15:52     ` Panariti, David
  0 siblings, 0 replies; 122+ messages in thread
From: Panariti, David @ 2018-04-24 15:52 UTC (permalink / raw)
  To: Grodzovsky, Andrey, linux-kernel, amd-gfx
  Cc: Deucher, Alexander, Koenig, Christian, oleg, akpm, ebiederm,
	Grodzovsky, Andrey

Hi,

It looks like there can be an infinite loop if neither of the if()s becomes true.
Is that an impossible condition?

-----Original Message-----
From: Andrey Grodzovsky <andrey.grodzovsky@amd.com> 
Sent: Tuesday, April 24, 2018 11:31 AM
To: linux-kernel@vger.kernel.org; amd-gfx@lists.freedesktop.org
Cc: Deucher, Alexander <Alexander.Deucher@amd.com>; Koenig, Christian <Christian.Koenig@amd.com>; Panariti, David <David.Panariti@amd.com>; oleg@redhat.com; akpm@linux-foundation.org; ebiederm@xmission.com; Grodzovsky, Andrey <Andrey.Grodzovsky@amd.com>
Subject: [PATCH 3/3] drm/amdgpu: Switch to interrupted wait to recover from ring hang.

If the ring is hanging for some reason allow to recover the waiting by sending fatal signal.

Originally-by: David Panariti <David.Panariti@amd.com>
Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c | 14 ++++++++++----
 1 file changed, 10 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c
index eb80edf..37a36af 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c
@@ -421,10 +421,16 @@ int amdgpu_ctx_wait_prev_fence(struct amdgpu_ctx *ctx, unsigned ring_id)
 
 	if (other) {
 		signed long r;
-		r = dma_fence_wait_timeout(other, false, MAX_SCHEDULE_TIMEOUT);
-		if (r < 0) {
-			DRM_ERROR("Error (%ld) waiting for fence!\n", r);
-			return r;
+
+		while (true) {
+			if ((r = dma_fence_wait_timeout(other, true,
+					MAX_SCHEDULE_TIMEOUT)) >= 0)
+				return 0;
+
+			if (fatal_signal_pending(current)) {
+				DRM_ERROR("Error (%ld) waiting for fence!\n", r);
+				return r;
+			}
 		}
 	}
 
--
2.7.4

^ permalink raw reply related	[flat|nested] 122+ messages in thread

* Re: [PATCH 3/3] drm/amdgpu: Switch to interrupted wait to recover from ring hang.
@ 2018-04-24 15:58       ` Andrey Grodzovsky
  0 siblings, 0 replies; 122+ messages in thread
From: Andrey Grodzovsky @ 2018-04-24 15:58 UTC (permalink / raw)
  To: Panariti, David, linux-kernel, amd-gfx
  Cc: Deucher, Alexander, Koenig, Christian, oleg, akpm, ebiederm



On 04/24/2018 11:52 AM, Panariti, David wrote:
> Hi,
>
> It looks like there can be an infinite loop if neither of the if()'s become true.
> Is that an impossible condition?

That's intended: we want to wait until either the fence signals or a
fatal signal is received; we don't want to terminate the wait if the
fence is not signaled even when interrupted by a non-fatal signal.
Kind of a dma_fence_wait_killable, except that we don't have such an API
(maybe worth adding?)
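
A minimal sketch of what such a helper could look like, layered on the
existing dma_fence_wait_timeout() (the name and placement are hypothetical,
no such API exists today):

#include <linux/dma-fence.h>
#include <linux/sched/signal.h>

static signed long dma_fence_wait_killable(struct dma_fence *fence)
{
	signed long r;

	for (;;) {
		/* Interruptible wait so that a pending signal wakes us. */
		r = dma_fence_wait_timeout(fence, true, MAX_SCHEDULE_TIMEOUT);
		if (r >= 0)
			return r;	/* fence signaled */

		/* Only bail out for fatal signals, keep waiting otherwise. */
		if (fatal_signal_pending(current))
			return r;	/* -ERESTARTSYS */
	}
}

The loop in patch 3 is essentially this, open coded in amdgpu.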

Andrey

>
> -----Original Message-----
> From: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
> Sent: Tuesday, April 24, 2018 11:31 AM
> To: linux-kernel@vger.kernel.org; amd-gfx@lists.freedesktop.org
> Cc: Deucher, Alexander <Alexander.Deucher@amd.com>; Koenig, Christian <Christian.Koenig@amd.com>; Panariti, David <David.Panariti@amd.com>; oleg@redhat.com; akpm@linux-foundation.org; ebiederm@xmission.com; Grodzovsky, Andrey <Andrey.Grodzovsky@amd.com>
> Subject: [PATCH 3/3] drm/amdgpu: Switch to interrupted wait to recover from ring hang.
>
> If the ring is hanging for some reason allow to recover the waiting by sending fatal signal.
>
> Originally-by: David Panariti <David.Panariti@amd.com>
> Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
> ---
>   drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c | 14 ++++++++++----
>   1 file changed, 10 insertions(+), 4 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c
> index eb80edf..37a36af 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c
> @@ -421,10 +421,16 @@ int amdgpu_ctx_wait_prev_fence(struct amdgpu_ctx *ctx, unsigned ring_id)
>   
>   	if (other) {
>   		signed long r;
> -		r = dma_fence_wait_timeout(other, false, MAX_SCHEDULE_TIMEOUT);
> -		if (r < 0) {
> -			DRM_ERROR("Error (%ld) waiting for fence!\n", r);
> -			return r;
> +
> +		while (true) {
> +			if ((r = dma_fence_wait_timeout(other, true,
> +					MAX_SCHEDULE_TIMEOUT)) >= 0)
> +				return 0;
> +
> +			if (fatal_signal_pending(current)) {
> +				DRM_ERROR("Error (%ld) waiting for fence!\n", r);
> +				return r;
> +			}
>   		}
>   	}
>   
> --
> 2.7.4
>

^ permalink raw reply	[flat|nested] 122+ messages in thread

* Re: [PATCH 1/3] signals: Allow generation of SIGKILL to exiting task.
  2018-04-24 15:30   ` Andrey Grodzovsky
@ 2018-04-24 16:10     ` Eric W. Biederman
  -1 siblings, 0 replies; 122+ messages in thread
From: Eric W. Biederman @ 2018-04-24 16:10 UTC (permalink / raw)
  To: Andrey Grodzovsky
  Cc: linux-kernel, amd-gfx, Alexander.Deucher, Christian.Koenig,
	David.Panariti, oleg, akpm

Andrey Grodzovsky <andrey.grodzovsky@amd.com> writes:

> Currently calling wait_event_killable as part of exiting process
> will stall forever since SIGKILL generation is suppresed by PF_EXITING.
>
> In our partilaur case AMDGPU driver wants to flush all GPU jobs in
> flight before shutting down. But if some job hangs the pipe we still want to
> be able to kill it and avoid a process in D state.

This makes me profoundly uncomfortable.  You are changing the Linux
semantics of what it means for a process to be exiting.  Functionally
this may require all kinds of changes to when we allow processes to stop
processing signals.

So, without a really well thought out explanation that takes into account
all of the issues involved in process exiting and POSIX conformance:

Nacked-by: "Eric W. Biederman" <ebiederm@xmission.com>

Eric

> Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
> ---
>  kernel/signal.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/kernel/signal.c b/kernel/signal.c
> index c6e4c83..c49c706 100644
> --- a/kernel/signal.c
> +++ b/kernel/signal.c
> @@ -886,10 +886,10 @@ static inline int wants_signal(int sig, struct task_struct *p)
>  {
>  	if (sigismember(&p->blocked, sig))
>  		return 0;
> -	if (p->flags & PF_EXITING)
> -		return 0;
>  	if (sig == SIGKILL)
>  		return 1;
> +	if (p->flags & PF_EXITING)
> +		return 0;
>  	if (task_is_stopped_or_traced(p))
>  		return 0;
>  	return task_curr(p) || !signal_pending(p);

^ permalink raw reply	[flat|nested] 122+ messages in thread

* Re: [PATCH 3/3] drm/amdgpu: Switch to interrupted wait to recover from ring hang.
  2018-04-24 15:30   ` Andrey Grodzovsky
@ 2018-04-24 16:14     ` Eric W. Biederman
  -1 siblings, 0 replies; 122+ messages in thread
From: Eric W. Biederman @ 2018-04-24 16:14 UTC (permalink / raw)
  To: Andrey Grodzovsky
  Cc: linux-kernel, amd-gfx, Alexander.Deucher, Christian.Koenig,
	David.Panariti, oleg, akpm

Andrey Grodzovsky <andrey.grodzovsky@amd.com> writes:

> If the ring is hanging for some reason allow to recover the waiting
> by sending fatal signal.
>
> Originally-by: David Panariti <David.Panariti@amd.com>
> Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c | 14 ++++++++++----
>  1 file changed, 10 insertions(+), 4 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c
> index eb80edf..37a36af 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c
> @@ -421,10 +421,16 @@ int amdgpu_ctx_wait_prev_fence(struct amdgpu_ctx *ctx, unsigned ring_id)
>  
>  	if (other) {
>  		signed long r;
> -		r = dma_fence_wait_timeout(other, false, MAX_SCHEDULE_TIMEOUT);
> -		if (r < 0) {
> -			DRM_ERROR("Error (%ld) waiting for fence!\n", r);
> -			return r;
> +
> +		while (true) {
> +			if ((r = dma_fence_wait_timeout(other, true,
> +					MAX_SCHEDULE_TIMEOUT)) >= 0)
> +				return 0;
> +
> +			if (fatal_signal_pending(current)) {
> +				DRM_ERROR("Error (%ld) waiting for fence!\n", r);
> +				return r;
> +			}

It looks like if you make this code say:
			if (fatal_signal_pending(current) ||
			    (current->flags & PF_EXITING)) {
				DRM_ERROR("Error (%ld) waiting for fence!\n", r);
				return r;
>  		}
>  	}

Then you would not need the horrible hack in wants_signal() to deliver
signals to processes that have passed exit_signal() and don't expect to
need their signal handling mechanisms anymore.

Eric
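
For clarity, the complete wait loop from the patch with that extra check
folded in would look roughly like this (a sketch only, not a tested change):

	if (other) {
		signed long r;

		while (true) {
			r = dma_fence_wait_timeout(other, true,
						   MAX_SCHEDULE_TIMEOUT);
			if (r >= 0)
				return 0;

			/* Give up on a fatal signal, or when the task is
			 * already exiting and will not get one delivered. */
			if (fatal_signal_pending(current) ||
			    (current->flags & PF_EXITING)) {
				DRM_ERROR("Error (%ld) waiting for fence!\n", r);
				return r;
			}
		}
	}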

^ permalink raw reply	[flat|nested] 122+ messages in thread

* Re: [PATCH 3/3] drm/amdgpu: Switch to interrupted wait to recover from ring hang.
@ 2018-04-24 16:20         ` Panariti, David
  0 siblings, 0 replies; 122+ messages in thread
From: Panariti, David @ 2018-04-24 16:20 UTC (permalink / raw)
  To: Grodzovsky, Andrey, linux-kernel, amd-gfx
  Cc: Deucher, Alexander, Koenig, Christian, oleg, akpm, ebiederm

> Kind of dma_fence_wait_killable, except that we don't have such API
> (maybe worth adding ?)
Depends on how many places it would be called from, or how many you think it might be called from.  You can always factor it out the 2nd time it's needed.
Factoring, IMO, rarely hurts.  The factored function can easily be visited using `M-.' ;->

Also, if the wait could be very long, would a log message, something like "xxx has run for Y seconds."  help?
I personally hate hanging w/no info.

regards,
davep

________________________________________
From: Grodzovsky, Andrey
Sent: Tuesday, April 24, 2018 11:58:19 AM
To: Panariti, David; linux-kernel@vger.kernel.org; amd-gfx@lists.freedesktop.org
Cc: Deucher, Alexander; Koenig, Christian; oleg@redhat.com; akpm@linux-foundation.org; ebiederm@xmission.com
Subject: Re: [PATCH 3/3] drm/amdgpu: Switch to interrupted wait to recover from ring hang.



On 04/24/2018 11:52 AM, Panariti, David wrote:
> Hi,
>
> It looks like there can be an infinite loop if neither of the if()'s become true.
> Is that an impossible condition?

That intended, we want to wait until either the fence signals or fatal
signal received, we don't want to terminate the wait if fence is not
signaled  even  when interrupted by non fatal signal.
Kind of dma_fence_wait_killable, except that we don't have such API
(maybe worth adding ?)

Andrey

>
> -----Original Message-----
> From: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
> Sent: Tuesday, April 24, 2018 11:31 AM
> To: linux-kernel@vger.kernel.org; amd-gfx@lists.freedesktop.org
> Cc: Deucher, Alexander <Alexander.Deucher@amd.com>; Koenig, Christian <Christian.Koenig@amd.com>; Panariti, David <David.Panariti@amd.com>; oleg@redhat.com; akpm@linux-foundation.org; ebiederm@xmission.com; Grodzovsky, Andrey <Andrey.Grodzovsky@amd.com>
> Subject: [PATCH 3/3] drm/amdgpu: Switch to interrupted wait to recover from ring hang.
>
> If the ring is hanging for some reason allow to recover the waiting by sending fatal signal.
>
> Originally-by: David Panariti <David.Panariti@amd.com>
> Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
> ---
>   drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c | 14 ++++++++++----
>   1 file changed, 10 insertions(+), 4 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c
> index eb80edf..37a36af 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c
> @@ -421,10 +421,16 @@ int amdgpu_ctx_wait_prev_fence(struct amdgpu_ctx *ctx, unsigned ring_id)
>
>       if (other) {
>               signed long r;
> -             r = dma_fence_wait_timeout(other, false, MAX_SCHEDULE_TIMEOUT);
> -             if (r < 0) {
> -                     DRM_ERROR("Error (%ld) waiting for fence!\n", r);
> -                     return r;
> +
> +             while (true) {
> +                     if ((r = dma_fence_wait_timeout(other, true,
> +                                     MAX_SCHEDULE_TIMEOUT)) >= 0)
> +                             return 0;
> +
> +                     if (fatal_signal_pending(current)) {
> +                             DRM_ERROR("Error (%ld) waiting for fence!\n", r);
> +                             return r;
> +                     }
>               }
>       }
>
> --
> 2.7.4
>

^ permalink raw reply	[flat|nested] 122+ messages in thread

* Re: [PATCH 2/3] drm/scheduler: Don't call wait_event_killable for signaled process.
  2018-04-24 15:30   ` Andrey Grodzovsky
@ 2018-04-24 16:23     ` Eric W. Biederman
  -1 siblings, 0 replies; 122+ messages in thread
From: Eric W. Biederman @ 2018-04-24 16:23 UTC (permalink / raw)
  To: Andrey Grodzovsky
  Cc: linux-kernel, amd-gfx, Alexander.Deucher, Christian.Koenig,
	David.Panariti, oleg, akpm

Andrey Grodzovsky <andrey.grodzovsky@amd.com> writes:

> Avoid calling wait_event_killable when you are possibly being called
> from get_signal routine since in that case you end up in a deadlock
> where you are alreay blocked in singla processing any trying to wait
> on a new signal.

I am curious what call path is problematic here.

In general waiting seems wrong when the process has already been
fatally killed as indicated by PF_SIGNALED.

Returning -ERESTARTSYS seems wrong as nothing should make it back even
to the edge of userspace here.

Given that this is the only use of PF_SIGNALED outside of bsd process
accounting I find this code very suspicious.

It looks like the code path that gets called during exit is buggy and needs
to be sorted out.

Eric


> Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
> ---
>  drivers/gpu/drm/scheduler/gpu_scheduler.c | 5 +++--
>  1 file changed, 3 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/gpu/drm/scheduler/gpu_scheduler.c b/drivers/gpu/drm/scheduler/gpu_scheduler.c
> index 088ff2b..09fd258 100644
> --- a/drivers/gpu/drm/scheduler/gpu_scheduler.c
> +++ b/drivers/gpu/drm/scheduler/gpu_scheduler.c
> @@ -227,9 +227,10 @@ void drm_sched_entity_do_release(struct drm_gpu_scheduler *sched,
>  		return;
>  	/**
>  	 * The client will not queue more IBs during this fini, consume existing
> -	 * queued IBs or discard them on SIGKILL
> +	 * queued IBs or discard them when in death signal state since
> +	 * wait_event_killable can't receive signals in that state.
>  	*/
> -	if ((current->flags & PF_SIGNALED) && current->exit_code == SIGKILL)
> +	if (current->flags & PF_SIGNALED)
>  		entity->fini_status = -ERESTARTSYS;
>  	else
>  		entity->fini_status = wait_event_killable(sched->job_scheduled,

^ permalink raw reply	[flat|nested] 122+ messages in thread

* Re: [PATCH 3/3] drm/amdgpu: Switch to interrupted wait to recover from ring hang.
  2018-04-24 16:20         ` Panariti, David
@ 2018-04-24 16:30           ` Eric W. Biederman
  -1 siblings, 0 replies; 122+ messages in thread
From: Eric W. Biederman @ 2018-04-24 16:30 UTC (permalink / raw)
  To: Panariti, David
  Cc: Grodzovsky, Andrey, linux-kernel, amd-gfx, Deucher, Alexander,
	Koenig, Christian, oleg, akpm

"Panariti, David" <David.Panariti@amd.com> writes:

> Andrey Grodzovsky <andrey.grodzovsky@amd.com> writes:
>> Kind of dma_fence_wait_killable, except that we don't have such API
>> (maybe worth adding ?)
> Depends on how many places it would be called, or think it might be called.  Can always factor on the 2nd time it's needed.
> Factoring, IMO, rarely hurts.  The factored function can easily be visited using `M-.' ;->
>
> Also, if the wait could be very long, would a log message, something like "xxx has run for Y seconds."  help?
> I personally hate hanging w/no info.

Ugh.  This loop appears susceptible to losing wake-ups.  There are
races between when a wake-up happens, when we clear the sleeping state,
and when we test the state to see if we should stay awake.  So yes,
implementing a dma_fence_wait_killable that handles all of that
correctly sounds like a very good idea.

Eric
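
A sketch of a killable fence wait that avoids lost wake-ups by installing a
callback before testing the fence state, in the spirit of
dma_fence_default_wait() (hypothetical helper, not an existing API):

#include <linux/dma-fence.h>
#include <linux/sched.h>
#include <linux/sched/signal.h>

struct kill_wait_cb {
	struct dma_fence_cb base;
	struct task_struct *task;
};

static void kill_wait_wake(struct dma_fence *fence, struct dma_fence_cb *cb)
{
	struct kill_wait_cb *kcb = container_of(cb, struct kill_wait_cb, base);

	wake_up_process(kcb->task);
}

static long dma_fence_wait_killable(struct dma_fence *fence)
{
	struct kill_wait_cb cb = { .task = current };
	long ret = 0;

	/* The callback is registered before we sleep, so a fence that
	 * signals between the state test and schedule() still wakes us.
	 * dma_fence_add_callback() returns -ENOENT if already signaled. */
	if (dma_fence_add_callback(fence, &cb.base, kill_wait_wake))
		return 0;

	for (;;) {
		set_current_state(TASK_KILLABLE);
		if (dma_fence_is_signaled(fence))
			break;
		if (fatal_signal_pending(current)) {
			ret = -ERESTARTSYS;
			break;
		}
		schedule();
	}
	__set_current_state(TASK_RUNNING);
	dma_fence_remove_callback(fence, &cb.base);
	return ret;
}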


>> If the ring is hanging for some reason allow to recover the waiting by sending fatal signal.
>>
>> Originally-by: David Panariti <David.Panariti@amd.com>
>> Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
>> ---
>>   drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c | 14 ++++++++++----
>>   1 file changed, 10 insertions(+), 4 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c
>> index eb80edf..37a36af 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c
>> @@ -421,10 +421,16 @@ int amdgpu_ctx_wait_prev_fence(struct amdgpu_ctx *ctx, unsigned ring_id)
>>
>>       if (other) {
>>               signed long r;
>> -             r = dma_fence_wait_timeout(other, false, MAX_SCHEDULE_TIMEOUT);
>> -             if (r < 0) {
>> -                     DRM_ERROR("Error (%ld) waiting for fence!\n", r);
>> -                     return r;
>> +
>> +             while (true) {
>> +                     if ((r = dma_fence_wait_timeout(other, true,
>> +                                     MAX_SCHEDULE_TIMEOUT)) >= 0)
>> +                             return 0;
>> +
>> +                     if (fatal_signal_pending(current)) {
>> +                             DRM_ERROR("Error (%ld) waiting for fence!\n", r);
>> +                             return r;
>> +                     }
>>               }
>>       }
>>
>> --
>> 2.7.4
>>
Eric

^ permalink raw reply	[flat|nested] 122+ messages in thread
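
For illustration, a minimal sketch of what such a dma_fence_wait_killable()
helper could look like, built on top of the existing dma_fence_wait_timeout().
This is a hypothetical helper, not an existing kernel API, and only a sketch
of the idea discussed above:

    #include <linux/dma-fence.h>
    #include <linux/sched.h>
    #include <linux/sched/signal.h>

    /*
     * Hypothetical: wait for a fence, waking up only for fatal signals.
     * Each iteration re-enters dma_fence_wait_timeout(), which does the
     * add-to-waitqueue/check/sleep sequence itself, so wake-ups are not
     * lost the way a hand-rolled check-then-sleep loop can lose them.
     */
    static signed long dma_fence_wait_killable(struct dma_fence *fence)
    {
            signed long r;

            for (;;) {
                    /* interruptible wait so a pending signal wakes us up */
                    r = dma_fence_wait_timeout(fence, true, MAX_SCHEDULE_TIMEOUT);
                    if (r >= 0)
                            return r;       /* fence signaled (or timed out) */
                    if (fatal_signal_pending(current))
                            return r;       /* -ERESTARTSYS on a fatal signal */
                    /* interrupted by a non-fatal signal: go back to waiting */
            }
    }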

* Re: [PATCH 3/3] drm/amdgpu: Switch to interrupted wait to recover from ring hang.
@ 2018-04-24 16:38       ` Andrey Grodzovsky
  0 siblings, 0 replies; 122+ messages in thread
From: Andrey Grodzovsky @ 2018-04-24 16:38 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: linux-kernel, amd-gfx, Alexander.Deucher, Christian.Koenig,
	David.Panariti, oleg, akpm



On 04/24/2018 12:14 PM, Eric W. Biederman wrote:
> Andrey Grodzovsky <andrey.grodzovsky@amd.com> writes:
>
>> If the ring is hanging for some reason allow to recover the waiting
>> by sending fatal signal.
>>
>> Originally-by: David Panariti <David.Panariti@amd.com>
>> Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
>> ---
>>   drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c | 14 ++++++++++----
>>   1 file changed, 10 insertions(+), 4 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c
>> index eb80edf..37a36af 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c
>> @@ -421,10 +421,16 @@ int amdgpu_ctx_wait_prev_fence(struct amdgpu_ctx *ctx, unsigned ring_id)
>>   
>>   	if (other) {
>>   		signed long r;
>> -		r = dma_fence_wait_timeout(other, false, MAX_SCHEDULE_TIMEOUT);
>> -		if (r < 0) {
>> -			DRM_ERROR("Error (%ld) waiting for fence!\n", r);
>> -			return r;
>> +
>> +		while (true) {
>> +			if ((r = dma_fence_wait_timeout(other, true,
>> +					MAX_SCHEDULE_TIMEOUT)) >= 0)
>> +				return 0;
>> +
>> +			if (fatal_signal_pending(current)) {
>> +				DRM_ERROR("Error (%ld) waiting for fence!\n", r);
>> +				return r;
>> +			}
> It looks like if you make this code say:
> 			if (fatal_signal_pending(current) ||
> 			    (current->flags & PF_EXITING)) {
> 				DRM_ERROR("Error (%ld) waiting for fence!\n", r);
> 				return r;
>>   		}
>>   	}
> Than you would not need the horrible hack want_signal to deliver signals
> to processes who have passed exit_signal() and don't expect to need
> their signal handling mechanisms anymore.

Let me clarify: the change in wants_signal wasn't addressing this code
but the hang in drm_sched_entity_do_release->wait_event_killable, when
you try to gracefully terminate by waiting for all job completions on
the GPU pipe your process is using.

Andrey

>
> Eric
>

^ permalink raw reply	[flat|nested] 122+ messages in thread
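
For reference, folding the extra check suggested above into the loop quoted
in this message would give roughly the following.  This is only an
illustration of the discussion, not the code that was eventually merged:

	if (other) {
		signed long r;

		while (true) {
			r = dma_fence_wait_timeout(other, true,
						   MAX_SCHEDULE_TIMEOUT);
			if (r >= 0)
				return 0;

			/*
			 * Bail out on a fatal signal, or when the task is
			 * already past signal handling in do_exit().
			 */
			if (fatal_signal_pending(current) ||
			    (current->flags & PF_EXITING)) {
				DRM_ERROR("Error (%ld) waiting for fence!\n", r);
				return r;
			}
		}
	}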

* Re: [PATCH 1/3] signals: Allow generation of SIGKILL to exiting task.
  2018-04-24 15:30   ` Andrey Grodzovsky
@ 2018-04-24 16:42     ` Eric W. Biederman
  -1 siblings, 0 replies; 122+ messages in thread
From: Eric W. Biederman @ 2018-04-24 16:42 UTC (permalink / raw)
  To: Andrey Grodzovsky
  Cc: linux-kernel, amd-gfx, Alexander.Deucher, Christian.Koenig,
	David.Panariti, oleg, akpm

Andrey Grodzovsky <andrey.grodzovsky@amd.com> writes:

> Currently calling wait_event_killable as part of exiting process
> will stall forever since SIGKILL generation is suppresed by PF_EXITING.
>
> In our partilaur case AMDGPU driver wants to flush all GPU jobs in
> flight before shutting down. But if some job hangs the pipe we still want to
> be able to kill it and avoid a process in D state.

I should clarify.  This absolutely can not be done.
PF_EXITING is set just before a task starts tearing down it's signal
handling.

So delivering any signal, or otherwise depending on signal handling
after PF_EXITING is set can not be done.  That abstraction is gone.

Eric

> Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
> ---
>  kernel/signal.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/kernel/signal.c b/kernel/signal.c
> index c6e4c83..c49c706 100644
> --- a/kernel/signal.c
> +++ b/kernel/signal.c
> @@ -886,10 +886,10 @@ static inline int wants_signal(int sig, struct task_struct *p)
>  {
>  	if (sigismember(&p->blocked, sig))
>  		return 0;
> -	if (p->flags & PF_EXITING)
> -		return 0;
>  	if (sig == SIGKILL)
>  		return 1;
> +	if (p->flags & PF_EXITING)
> +		return 0;
>  	if (task_is_stopped_or_traced(p))
>  		return 0;
>  	return task_curr(p) || !signal_pending(p);

^ permalink raw reply	[flat|nested] 122+ messages in thread

* Re: [PATCH 2/3] drm/scheduler: Don't call wait_event_killable for signaled process.
@ 2018-04-24 16:43       ` Andrey Grodzovsky
  0 siblings, 0 replies; 122+ messages in thread
From: Andrey Grodzovsky @ 2018-04-24 16:43 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: linux-kernel, amd-gfx, Alexander.Deucher, Christian.Koenig,
	David.Panariti, oleg, akpm



On 04/24/2018 12:23 PM, Eric W. Biederman wrote:
> Andrey Grodzovsky <andrey.grodzovsky@amd.com> writes:
>
>> Avoid calling wait_event_killable when you are possibly being called
>> from get_signal routine since in that case you end up in a deadlock
>> where you are alreay blocked in singla processing any trying to wait
>> on a new signal.
> I am curious what the call path that is problematic here.

Here is the problematic call stack

[<0>] drm_sched_entity_fini+0x10a/0x3a0 [gpu_sched]
[<0>] amdgpu_ctx_do_release+0x129/0x170 [amdgpu]
[<0>] amdgpu_ctx_mgr_fini+0xd5/0xe0 [amdgpu]
[<0>] amdgpu_driver_postclose_kms+0xcd/0x440 [amdgpu]
[<0>] drm_release+0x414/0x5b0 [drm]
[<0>] __fput+0x176/0x350
[<0>] task_work_run+0xa1/0xc0
[<0>] do_exit+0x48f/0x1280
[<0>] do_group_exit+0x89/0x140
[<0>] get_signal+0x375/0x8f0
[<0>] do_signal+0x79/0xaa0
[<0>] exit_to_usermode_loop+0x83/0xd0
[<0>] do_syscall_64+0x244/0x270
[<0>] entry_SYSCALL_64_after_hwframe+0x3d/0xa2
[<0>] 0xffffffffffffffff

On exit from system call you process all the signals you received and 
encounter a fatal signal which triggers process termination.

>
> In general waiting seems wrong when the process has already been
> fatally killed as indicated by PF_SIGNALED.

So indeed this patch avoids wait in this case.

>
> Returning -ERESTARTSYS seems wrong as nothing should make it back even
> to the edge of userspace here.

Can you clarify please - what should be returned here instead ?

Andrey

>
> Given that this is the only use of PF_SIGNALED outside of bsd process
> accounting I find this code very suspicious.
>
> It looks the code path that gets called during exit is buggy and needs
> to be sorted out.
>
> Eric
>
>
>> Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
>> ---
>>   drivers/gpu/drm/scheduler/gpu_scheduler.c | 5 +++--
>>   1 file changed, 3 insertions(+), 2 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/scheduler/gpu_scheduler.c b/drivers/gpu/drm/scheduler/gpu_scheduler.c
>> index 088ff2b..09fd258 100644
>> --- a/drivers/gpu/drm/scheduler/gpu_scheduler.c
>> +++ b/drivers/gpu/drm/scheduler/gpu_scheduler.c
>> @@ -227,9 +227,10 @@ void drm_sched_entity_do_release(struct drm_gpu_scheduler *sched,
>>   		return;
>>   	/**
>>   	 * The client will not queue more IBs during this fini, consume existing
>> -	 * queued IBs or discard them on SIGKILL
>> +	 * queued IBs or discard them when in death signal state since
>> +	 * wait_event_killable can't receive signals in that state.
>>   	*/
>> -	if ((current->flags & PF_SIGNALED) && current->exit_code == SIGKILL)
>> +	if (current->flags & PF_SIGNALED)
>>   		entity->fini_status = -ERESTARTSYS;
>>   	else
>>   		entity->fini_status = wait_event_killable(sched->job_scheduled,

^ permalink raw reply	[flat|nested] 122+ messages in thread

* Re: [PATCH 1/3] signals: Allow generation of SIGKILL to exiting task.
@ 2018-04-24 16:51       ` Andrey Grodzovsky
  0 siblings, 0 replies; 122+ messages in thread
From: Andrey Grodzovsky @ 2018-04-24 16:51 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: linux-kernel, amd-gfx, Alexander.Deucher, Christian.Koenig,
	David.Panariti, oleg, akpm



On 04/24/2018 12:42 PM, Eric W. Biederman wrote:
> Andrey Grodzovsky <andrey.grodzovsky@amd.com> writes:
>
>> Currently calling wait_event_killable as part of exiting process
>> will stall forever since SIGKILL generation is suppresed by PF_EXITING.
>>
>> In our partilaur case AMDGPU driver wants to flush all GPU jobs in
>> flight before shutting down. But if some job hangs the pipe we still want to
>> be able to kill it and avoid a process in D state.
> I should clarify.  This absolutely can not be done.
> PF_EXITING is set just before a task starts tearing down it's signal
> handling.
>
> So delivering any signal, or otherwise depending on signal handling
> after PF_EXITING is set can not be done.  That abstraction is gone.

I see, so you suggest it's the driver's responsibility to avoid creating
such a code path that ends up calling wait_event_killable from the exit
call stack (PF_EXITING == 1)?

Andrey

>
> Eric
>
>> Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
>> ---
>>   kernel/signal.c | 4 ++--
>>   1 file changed, 2 insertions(+), 2 deletions(-)
>>
>> diff --git a/kernel/signal.c b/kernel/signal.c
>> index c6e4c83..c49c706 100644
>> --- a/kernel/signal.c
>> +++ b/kernel/signal.c
>> @@ -886,10 +886,10 @@ static inline int wants_signal(int sig, struct task_struct *p)
>>   {
>>   	if (sigismember(&p->blocked, sig))
>>   		return 0;
>> -	if (p->flags & PF_EXITING)
>> -		return 0;
>>   	if (sig == SIGKILL)
>>   		return 1;
>> +	if (p->flags & PF_EXITING)
>> +		return 0;
>>   	if (task_is_stopped_or_traced(p))
>>   		return 0;
>>   	return task_curr(p) || !signal_pending(p);

^ permalink raw reply	[flat|nested] 122+ messages in thread

* Re: [PATCH 2/3] drm/scheduler: Don't call wait_event_killable for signaled process.
  2018-04-24 16:43       ` Andrey Grodzovsky
  (?)
@ 2018-04-24 17:12       ` Eric W. Biederman
  2018-04-25 13:55         ` Oleg Nesterov
  -1 siblings, 1 reply; 122+ messages in thread
From: Eric W. Biederman @ 2018-04-24 17:12 UTC (permalink / raw)
  To: Andrey Grodzovsky
  Cc: linux-kernel, amd-gfx, Alexander.Deucher, Christian.Koenig,
	David.Panariti, oleg, akpm

Andrey Grodzovsky <Andrey.Grodzovsky@amd.com> writes:

> On 04/24/2018 12:23 PM, Eric W. Biederman wrote:
>> Andrey Grodzovsky <andrey.grodzovsky@amd.com> writes:
>>
>>> Avoid calling wait_event_killable when you are possibly being called
>>> from get_signal routine since in that case you end up in a deadlock
>>> where you are alreay blocked in singla processing any trying to wait
>>> on a new signal.
>> I am curious what the call path that is problematic here.
>
> Here is the problematic call stack
>
> [<0>] drm_sched_entity_fini+0x10a/0x3a0 [gpu_sched]
> [<0>] amdgpu_ctx_do_release+0x129/0x170 [amdgpu]
> [<0>] amdgpu_ctx_mgr_fini+0xd5/0xe0 [amdgpu]
> [<0>] amdgpu_driver_postclose_kms+0xcd/0x440 [amdgpu]
> [<0>] drm_release+0x414/0x5b0 [drm]
> [<0>] __fput+0x176/0x350
> [<0>] task_work_run+0xa1/0xc0
> [<0>] do_exit+0x48f/0x1280
> [<0>] do_group_exit+0x89/0x140
> [<0>] get_signal+0x375/0x8f0
> [<0>] do_signal+0x79/0xaa0
> [<0>] exit_to_usermode_loop+0x83/0xd0
> [<0>] do_syscall_64+0x244/0x270
> [<0>] entry_SYSCALL_64_after_hwframe+0x3d/0xa2
> [<0>] 0xffffffffffffffff
>
> On exit from system call you process all the signals you received and
> encounter a fatal signal which triggers process termination.
>
>>
>> In general waiting seems wrong when the process has already been
>> fatally killed as indicated by PF_SIGNALED.
>
> So indeed this patch avoids wait in this case.
>
>>
>> Returning -ERESTARTSYS seems wrong as nothing should make it back even
>> to the edge of userspace here.
>
> Can you clarify please - what should be returned here instead ?

__fput does not have a return code.  I don't see the return code of
release being used anywhere.  So any return code is going to be lost.
So maybe return something that talks to the drm/kernel layer, but don't
expect your system call to be restarted, which is what -ERESTARTSYS asks for.

Hmm.  When looking at the code that is merged versus whatever your
patch is against it gets even clearer.  The -ERESTARTSYS
return code doesn't even get out of drm_sched_entity_fini.

Caring at all about process state at that point is wrong, as, except for
being in ``process'' context where you can sleep, nothing is connected to
a process.

Let me respectfully suggest that the wait_event_killable on that code
path is wrong.  Possibly you want a wait_event_timeout if you are very
nice.  But the code already has the logic necessary to handle what
happens if it can't sleep.

So I think the justification needs to be why you are trying to sleep
there at all.

The progress guarantee needs to come from the gpu layer or the AMD
driver not from someone getting impatient and sending SIGKILL to
a dead process.


Eric


>>
>> Given that this is the only use of PF_SIGNALED outside of bsd process
>> accounting I find this code very suspicious.
>>
>> It looks the code path that gets called during exit is buggy and needs
>> to be sorted out.
>>
>> Eric
>>
>>
>>> Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
>>> ---
>>>   drivers/gpu/drm/scheduler/gpu_scheduler.c | 5 +++--
>>>   1 file changed, 3 insertions(+), 2 deletions(-)
>>>
>>> diff --git a/drivers/gpu/drm/scheduler/gpu_scheduler.c b/drivers/gpu/drm/scheduler/gpu_scheduler.c
>>> index 088ff2b..09fd258 100644
>>> --- a/drivers/gpu/drm/scheduler/gpu_scheduler.c
>>> +++ b/drivers/gpu/drm/scheduler/gpu_scheduler.c
>>> @@ -227,9 +227,10 @@ void drm_sched_entity_do_release(struct drm_gpu_scheduler *sched,
>>>   		return;
>>>   	/**
>>>   	 * The client will not queue more IBs during this fini, consume existing
>>> -	 * queued IBs or discard them on SIGKILL
>>> +	 * queued IBs or discard them when in death signal state since
>>> +	 * wait_event_killable can't receive signals in that state.
>>>   	*/
>>> -	if ((current->flags & PF_SIGNALED) && current->exit_code == SIGKILL)
>>> +	if (current->flags & PF_SIGNALED)
>>>   		entity->fini_status = -ERESTARTSYS;
>>>   	else
>>>   		entity->fini_status = wait_event_killable(sched->job_scheduled,

^ permalink raw reply	[flat|nested] 122+ messages in thread
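
A rough sketch of the timeout-based wait suggested above for
drm_sched_entity_do_release(), purely as an illustration.  The timeout
value and the entity_is_idle() condition are placeholders (the real wait
condition is elided in the quoted diff), and this is not a proposed or
merged change:

	/* bound the wait instead of relying on SIGKILL being deliverable */
	if (wait_event_timeout(sched->job_scheduled,
			       entity_is_idle(entity),	/* placeholder */
			       msecs_to_jiffies(10 * 1000)))
		entity->fini_status = 0;
	else
		entity->fini_status = -ETIMEDOUT;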

* Re: [PATCH 1/3] signals: Allow generation of SIGKILL to exiting task.
  2018-04-24 16:51       ` Andrey Grodzovsky
  (?)
@ 2018-04-24 17:29       ` Eric W. Biederman
  -1 siblings, 0 replies; 122+ messages in thread
From: Eric W. Biederman @ 2018-04-24 17:29 UTC (permalink / raw)
  To: Andrey Grodzovsky
  Cc: linux-kernel, amd-gfx, Alexander.Deucher, Christian.Koenig,
	David.Panariti, oleg, akpm

Andrey Grodzovsky <Andrey.Grodzovsky@amd.com> writes:

> On 04/24/2018 12:42 PM, Eric W. Biederman wrote:
>> Andrey Grodzovsky <andrey.grodzovsky@amd.com> writes:
>>
>>> Currently calling wait_event_killable as part of exiting process
>>> will stall forever since SIGKILL generation is suppresed by PF_EXITING.
>>>
>>> In our partilaur case AMDGPU driver wants to flush all GPU jobs in
>>> flight before shutting down. But if some job hangs the pipe we still want to
>>> be able to kill it and avoid a process in D state.
>> I should clarify.  This absolutely can not be done.
>> PF_EXITING is set just before a task starts tearing down it's signal
>> handling.
>>
>> So delivering any signal, or otherwise depending on signal handling
>> after PF_EXITING is set can not be done.  That abstraction is gone.
>
> I see, so you suggest it's the driver responsibility to avoid creating
> such code path that ends up
> calling wait_event_killable from exit call stack (PF_EXITING == 1) ?

I don't just suggest.

I am saying clearly that any dependency on receiving SIGKILL after
PF_EXITING is set is a bug.

It looks safe (the bitmap is not freed) to use wait_event_killable on a
dual use code path, but you can't expect SIGKILL ever to be delivered
during f_op->release, as f_op->release is called from exit after signal
handling has been shut down.

The best generic code could do would be to always have
fatal_signal_pending return true after PF_EXITING is set.

Increasingly I am thinking that drm_sched_entity_fini should have a
wait_event_timeout or no wait at all.  The cleanup code should have
a progress guarantee of its own.

Eric

^ permalink raw reply	[flat|nested] 122+ messages in thread
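
For reference, fatal_signal_pending() is essentially a check of the pending
signal set, so once signal delivery has been shut down it stays false.  The
generic-code idea mentioned above would amount to also treating an exiting
task as fatally signaled, roughly as in this sketch (illustration only, not
a proposed patch; the second helper is hypothetical):

	/* fatal_signal_pending() is essentially: */
	static inline int fatal_signal_pending(struct task_struct *p)
	{
		return signal_pending(p) && __fatal_signal_pending(p);
	}

	/* the idea discussed above, as a hypothetical helper: */
	static inline bool fatal_signal_or_exit_pending(struct task_struct *p)
	{
		return fatal_signal_pending(p) || (p->flags & PF_EXITING);
	}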

* Re: [PATCH 2/3] drm/scheduler: Don't call wait_event_killable for signaled process.
@ 2018-04-24 19:44       ` Daniel Vetter
  0 siblings, 0 replies; 122+ messages in thread
From: Daniel Vetter @ 2018-04-24 19:44 UTC (permalink / raw)
  To: Michel Dänzer
  Cc: Andrey Grodzovsky, linux-kernel, amd-gfx, dri-devel,
	David.Panariti, oleg, ebiederm, Alexander.Deucher, akpm,
	Christian.Koenig

On Tue, Apr 24, 2018 at 05:46:52PM +0200, Michel Dänzer wrote:
> 
> Adding the dri-devel list, since this is driver independent code.
> 
> 
> On 2018-04-24 05:30 PM, Andrey Grodzovsky wrote:
> > Avoid calling wait_event_killable when you are possibly being called
> > from get_signal routine since in that case you end up in a deadlock
> > where you are alreay blocked in singla processing any trying to wait
> 
> Multiple typos here, "[...] already blocked in signal processing and [...]"?
> 
> 
> > on a new signal.
> > 
> > Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
> > ---
> >  drivers/gpu/drm/scheduler/gpu_scheduler.c | 5 +++--
> >  1 file changed, 3 insertions(+), 2 deletions(-)
> > 
> > diff --git a/drivers/gpu/drm/scheduler/gpu_scheduler.c b/drivers/gpu/drm/scheduler/gpu_scheduler.c
> > index 088ff2b..09fd258 100644
> > --- a/drivers/gpu/drm/scheduler/gpu_scheduler.c
> > +++ b/drivers/gpu/drm/scheduler/gpu_scheduler.c
> > @@ -227,9 +227,10 @@ void drm_sched_entity_do_release(struct drm_gpu_scheduler *sched,
> >  		return;
> >  	/**
> >  	 * The client will not queue more IBs during this fini, consume existing
> > -	 * queued IBs or discard them on SIGKILL
> > +	 * queued IBs or discard them when in death signal state since
> > +	 * wait_event_killable can't receive signals in that state.
> >  	*/
> > -	if ((current->flags & PF_SIGNALED) && current->exit_code == SIGKILL)
> > +	if (current->flags & PF_SIGNALED)

You want fatal_signal_pending() here, instead of inventing your own broken
version.
> >  		entity->fini_status = -ERESTARTSYS;
> >  	else
> >  		entity->fini_status = wait_event_killable(sched->job_scheduled,

But really this smells like a bug in wait_event_killable, since
wait_event_interruptible does not suffer from the same bug. It will return
immediately when there's a signal pending.

I think this should be fixed in core code, not papered over in some
subsystem.
-Daniel

> > 
> 
> 
> -- 
> Earthling Michel Dänzer               |               http://www.amd.com
> Libre software enthusiast             |             Mesa and X developer
> _______________________________________________
> dri-devel mailing list
> dri-devel@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/dri-devel

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch

^ permalink raw reply	[flat|nested] 122+ messages in thread
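
Applying the fatal_signal_pending() suggestion above to the quoted hunk
would read roughly as below (illustrative only; the wait condition is a
placeholder for the part elided in the diff, and later replies in this
thread question whether any pending-signal check can work from the exit
path at all):

	if (fatal_signal_pending(current))
		entity->fini_status = -ERESTARTSYS;
	else
		entity->fini_status = wait_event_killable(sched->job_scheduled,
				entity_is_idle(entity) /* placeholder */);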

* Re: [PATCH 2/3] drm/scheduler: Don't call wait_event_killable for signaled process.
  2018-04-24 19:44       ` Daniel Vetter
  (?)
@ 2018-04-24 21:00       ` Eric W. Biederman
  -1 siblings, 0 replies; 122+ messages in thread
From: Eric W. Biederman @ 2018-04-24 21:00 UTC (permalink / raw)
  To: Michel Dänzer
  Cc: Andrey Grodzovsky, linux-kernel, amd-gfx, dri-devel,
	David.Panariti, oleg, Alexander.Deucher, akpm, Christian.Koenig

Daniel Vetter <daniel@ffwll.ch> writes:

> On Tue, Apr 24, 2018 at 05:46:52PM +0200, Michel Dänzer wrote:
>> 
>> Adding the dri-devel list, since this is driver independent code.
>> 
>> 
>> On 2018-04-24 05:30 PM, Andrey Grodzovsky wrote:
>> > Avoid calling wait_event_killable when you are possibly being called
>> > from get_signal routine since in that case you end up in a deadlock
>> > where you are alreay blocked in singla processing any trying to wait
>> 
>> Multiple typos here, "[...] already blocked in signal processing and [...]"?
>> 
>> 
>> > on a new signal.
>> > 
>> > Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
>> > ---
>> >  drivers/gpu/drm/scheduler/gpu_scheduler.c | 5 +++--
>> >  1 file changed, 3 insertions(+), 2 deletions(-)
>> > 
>> > diff --git a/drivers/gpu/drm/scheduler/gpu_scheduler.c b/drivers/gpu/drm/scheduler/gpu_scheduler.c
>> > index 088ff2b..09fd258 100644
>> > --- a/drivers/gpu/drm/scheduler/gpu_scheduler.c
>> > +++ b/drivers/gpu/drm/scheduler/gpu_scheduler.c
>> > @@ -227,9 +227,10 @@ void drm_sched_entity_do_release(struct drm_gpu_scheduler *sched,
>> >  		return;
>> >  	/**
>> >  	 * The client will not queue more IBs during this fini, consume existing
>> > -	 * queued IBs or discard them on SIGKILL
>> > +	 * queued IBs or discard them when in death signal state since
>> > +	 * wait_event_killable can't receive signals in that state.
>> >  	*/
>> > -	if ((current->flags & PF_SIGNALED) && current->exit_code == SIGKILL)
>> > +	if (current->flags & PF_SIGNALED)
>
> You want fatal_signal_pending() here, instead of inventing your own broken
> version.
>> >  		entity->fini_status = -ERESTARTSYS;
>> >  	else
>> >  		entity->fini_status = wait_event_killable(sched->job_scheduled,
>
> But really this smells like a bug in wait_event_killable, since
> wait_event_interruptible does not suffer from the same bug. It will return
> immediately when there's a signal pending.
>
> I think this should be fixed in core code, not papered over in some
> subsystem.

PF_SIGNALED does not mean a signal has been sent.  PF_SIGNALED means
the process was killed by a signal.

Neither of interruptible or killable makes sense after the process has
been killed.

Eric

^ permalink raw reply	[flat|nested] 122+ messages in thread

* Re: [PATCH 2/3] drm/scheduler: Don't call wait_event_killable for signaled process.
@ 2018-04-24 21:02         ` Andrey Grodzovsky
  0 siblings, 0 replies; 122+ messages in thread
From: Andrey Grodzovsky @ 2018-04-24 21:02 UTC (permalink / raw)
  To: Michel Dänzer, linux-kernel, amd-gfx, dri-devel,
	David.Panariti, oleg, ebiederm, Alexander.Deucher, akpm,
	Christian.Koenig



On 04/24/2018 03:44 PM, Daniel Vetter wrote:
> On Tue, Apr 24, 2018 at 05:46:52PM +0200, Michel Dänzer wrote:
>> Adding the dri-devel list, since this is driver independent code.
>>
>>
>> On 2018-04-24 05:30 PM, Andrey Grodzovsky wrote:
>>> Avoid calling wait_event_killable when you are possibly being called
>>> from get_signal routine since in that case you end up in a deadlock
>>> where you are alreay blocked in singla processing any trying to wait
>> Multiple typos here, "[...] already blocked in signal processing and [...]"?
>>
>>
>>> on a new signal.
>>>
>>> Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
>>> ---
>>>   drivers/gpu/drm/scheduler/gpu_scheduler.c | 5 +++--
>>>   1 file changed, 3 insertions(+), 2 deletions(-)
>>>
>>> diff --git a/drivers/gpu/drm/scheduler/gpu_scheduler.c b/drivers/gpu/drm/scheduler/gpu_scheduler.c
>>> index 088ff2b..09fd258 100644
>>> --- a/drivers/gpu/drm/scheduler/gpu_scheduler.c
>>> +++ b/drivers/gpu/drm/scheduler/gpu_scheduler.c
>>> @@ -227,9 +227,10 @@ void drm_sched_entity_do_release(struct drm_gpu_scheduler *sched,
>>>   		return;
>>>   	/**
>>>   	 * The client will not queue more IBs during this fini, consume existing
>>> -	 * queued IBs or discard them on SIGKILL
>>> +	 * queued IBs or discard them when in death signal state since
>>> +	 * wait_event_killable can't receive signals in that state.
>>>   	*/
>>> -	if ((current->flags & PF_SIGNALED) && current->exit_code == SIGKILL)
>>> +	if (current->flags & PF_SIGNALED)
> You want fatal_signal_pending() here, instead of inventing your own broken
> version.

I rely on current->flags & PF_SIGNALED because it is being set from
within get_signal, meaning I am within signal processing, in which case
I want to avoid any signal-based wait for that task.
From what I see in the code, task_struct.pending.signal is being set
for other threads in the same group (zap_other_threads) or in other
scenarios; those tasks are still able to receive signals, so calling
wait_event_killable there will not be a problem.
>>>   		entity->fini_status = -ERESTARTSYS;
>>>   	else
>>>   		entity->fini_status = wait_event_killable(sched->job_scheduled,
> But really this smells like a bug in wait_event_killable, since
> wait_event_interruptible does not suffer from the same bug. It will return
> immediately when there's a signal pending.

Even when wait_event_interruptible is called as follows -
...->do_signal->get_signal->....->wait_event_interruptible ?
I haven't tried it, but wait_event_interruptible is very similar to
wait_event_killable, so I would assume it will also not be interrupted
if called like that. (Will give it a try just out of curiosity anyway.)

Andrey

>
> I think this should be fixed in core code, not papered over in some
> subsystem.
> -Daniel
>
>>
>> -- 
>> Earthling Michel Dänzer               |               http://www.amd.com
>> Libre software enthusiast             |             Mesa and X developer
>> _______________________________________________
>> dri-devel mailing list
>> dri-devel@lists.freedesktop.org
>> https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 122+ messages in thread

* Re: [PATCH 2/3] drm/scheduler: Don't call wait_event_killable for signaled process.
  2018-04-24 21:02         ` Andrey Grodzovsky
  (?)
@ 2018-04-24 21:21         ` Eric W. Biederman
  2018-04-24 21:37             ` Andrey Grodzovsky
  -1 siblings, 1 reply; 122+ messages in thread
From: Eric W. Biederman @ 2018-04-24 21:21 UTC (permalink / raw)
  To: Andrey Grodzovsky
  Cc: Michel Dänzer, linux-kernel, amd-gfx, dri-devel,
	David.Panariti, oleg, Alexander.Deucher, akpm, Christian.Koenig

Andrey Grodzovsky <Andrey.Grodzovsky@amd.com> writes:

> On 04/24/2018 03:44 PM, Daniel Vetter wrote:
>> On Tue, Apr 24, 2018 at 05:46:52PM +0200, Michel Dänzer wrote:
>>> Adding the dri-devel list, since this is driver independent code.
>>>
>>>
>>> On 2018-04-24 05:30 PM, Andrey Grodzovsky wrote:
>>>> Avoid calling wait_event_killable when you are possibly being called
>>>> from get_signal routine since in that case you end up in a deadlock
>>>> where you are alreay blocked in singla processing any trying to wait
>>> Multiple typos here, "[...] already blocked in signal processing and [...]"?
>>>
>>>
>>>> on a new signal.
>>>>
>>>> Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
>>>> ---
>>>>   drivers/gpu/drm/scheduler/gpu_scheduler.c | 5 +++--
>>>>   1 file changed, 3 insertions(+), 2 deletions(-)
>>>>
>>>> diff --git a/drivers/gpu/drm/scheduler/gpu_scheduler.c b/drivers/gpu/drm/scheduler/gpu_scheduler.c
>>>> index 088ff2b..09fd258 100644
>>>> --- a/drivers/gpu/drm/scheduler/gpu_scheduler.c
>>>> +++ b/drivers/gpu/drm/scheduler/gpu_scheduler.c
>>>> @@ -227,9 +227,10 @@ void drm_sched_entity_do_release(struct drm_gpu_scheduler *sched,
>>>>   		return;
>>>>   	/**
>>>>   	 * The client will not queue more IBs during this fini, consume existing
>>>> -	 * queued IBs or discard them on SIGKILL
>>>> +	 * queued IBs or discard them when in death signal state since
>>>> +	 * wait_event_killable can't receive signals in that state.
>>>>   	*/
>>>> -	if ((current->flags & PF_SIGNALED) && current->exit_code == SIGKILL)
>>>> +	if (current->flags & PF_SIGNALED)
>> You want fatal_signal_pending() here, instead of inventing your own broken
>> version.
>
> I rely on current->flags & PF_SIGNALED because this being set from
> within get_signal,

It doesn't mean that.  Unless you are called by do_coredump (you
aren't).  The closing of files does not happen in do_coredump.
Which means you are being called from do_exit.
In fact you are being called after exit_files which closes
the files.  The actual __fput processing happens in task_work_run.

> meaning I am within signal processing  in which case I want to avoid
> any signal based wait for that task,
> From what i see in the code, task_struct.pending.signal is being set
> for other threads in same
> group (zap_other_threads) or for other scenarios, those task are still
> able to receive signals
> so calling wait_event_killable there will not have problem.

Except that you are being called from do_exit and after exit_files,
which is after exit_signals().  Which means that PF_EXITING has been set.
Which implies that the kernel signal handling machinery has already
started being torn down.

Not as much as I would like to happen at that point as we are still
left with some old CLONE_PTHREAD messes in the code that need to be
cleaned up.

Still, given the fact that you are in task_work_run, it is quite possible
even release_task has been run on that task before the f_op->release
method is called.  So you simply cannot count on signals working.

Which in practice leaves a timeout for ending your wait.  That code can
legitimately be in a context that is neither interruptible nor killable.

>>>>   		entity->fini_status = -ERESTARTSYS;
>>>>   	else
>>>>   		entity->fini_status = wait_event_killable(sched->job_scheduled,
>> But really this smells like a bug in wait_event_killable, since
>> wait_event_interruptible does not suffer from the same bug. It will return
>> immediately when there's a signal pending.
>
> Even when wait_event_interruptible is called as following - 
> ...->do_signal->get_signal->....->wait_event_interruptible ?
> I haven't tried it but wait_event_interruptible is very much alike to
> wait_event_killable so I would assume it will also
> not be interrupted if called like that. (Will give it a try just out
> of curiosity anyway)

As PF_EXITING is set, wants_signal should fail and the signal state of
the task should not be updatable by signals.

Eric

^ permalink raw reply	[flat|nested] 122+ messages in thread

* Re: [PATCH 2/3] drm/scheduler: Don't call wait_event_killable for signaled process.
@ 2018-04-24 21:37             ` Andrey Grodzovsky
  0 siblings, 0 replies; 122+ messages in thread
From: Andrey Grodzovsky @ 2018-04-24 21:37 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: Michel Dänzer, linux-kernel, amd-gfx, dri-devel,
	David.Panariti, oleg, Alexander.Deucher, akpm, Christian.Koenig



On 04/24/2018 05:21 PM, Eric W. Biederman wrote:
> Andrey Grodzovsky <Andrey.Grodzovsky@amd.com> writes:
>
>> On 04/24/2018 03:44 PM, Daniel Vetter wrote:
>>> On Tue, Apr 24, 2018 at 05:46:52PM +0200, Michel Dänzer wrote:
>>>> Adding the dri-devel list, since this is driver independent code.
>>>>
>>>>
>>>> On 2018-04-24 05:30 PM, Andrey Grodzovsky wrote:
>>>>> Avoid calling wait_event_killable when you are possibly being called
>>>>> from get_signal routine since in that case you end up in a deadlock
>>>>> where you are alreay blocked in singla processing any trying to wait
>>>> Multiple typos here, "[...] already blocked in signal processing and [...]"?
>>>>
>>>>
>>>>> on a new signal.
>>>>>
>>>>> Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
>>>>> ---
>>>>>    drivers/gpu/drm/scheduler/gpu_scheduler.c | 5 +++--
>>>>>    1 file changed, 3 insertions(+), 2 deletions(-)
>>>>>
>>>>> diff --git a/drivers/gpu/drm/scheduler/gpu_scheduler.c b/drivers/gpu/drm/scheduler/gpu_scheduler.c
>>>>> index 088ff2b..09fd258 100644
>>>>> --- a/drivers/gpu/drm/scheduler/gpu_scheduler.c
>>>>> +++ b/drivers/gpu/drm/scheduler/gpu_scheduler.c
>>>>> @@ -227,9 +227,10 @@ void drm_sched_entity_do_release(struct drm_gpu_scheduler *sched,
>>>>>    		return;
>>>>>    	/**
>>>>>    	 * The client will not queue more IBs during this fini, consume existing
>>>>> -	 * queued IBs or discard them on SIGKILL
>>>>> +	 * queued IBs or discard them when in death signal state since
>>>>> +	 * wait_event_killable can't receive signals in that state.
>>>>>    	*/
>>>>> -	if ((current->flags & PF_SIGNALED) && current->exit_code == SIGKILL)
>>>>> +	if (current->flags & PF_SIGNALED)
>>> You want fatal_signal_pending() here, instead of inventing your own broken
>>> version.
>> I rely on current->flags & PF_SIGNALED because this being set from
>> within get_signal,
> It doesn't mean that.  Unless you are called by do_coredump (you
> aren't).

Looking at the latest code here
https://elixir.bootlin.com/linux/v4.17-rc2/source/kernel/signal.c#L2449
I see that current->flags |= PF_SIGNALED; is outside of the
if (sig_kernel_coredump(signr)) {...} scope.

Andrey

> The closing of files does not happen in do_coredump.
> Which means you are being called from do_exit.
> In fact you are being called after exit_files which closes
> the files.  The actual __fput processing happens in task_work_run.
>
>> meaning I am within signal processing  in which case I want to avoid
>> any signal based wait for that task,
>>  From what i see in the code, task_struct.pending.signal is being set
>> for other threads in same
>> group (zap_other_threads) or for other scenarios, those task are still
>> able to receive signals
>> so calling wait_event_killable there will not have problem.
> Excpet that you are geing called after from do_exit and after exit_files
> which is after exit_signal.  Which means that PF_EXITING has been set.
> Which implies that the kernel signal handling machinery has already
> started being torn down.
>
> Not as much as I would like to happen at that point as we are still
> left with some old CLONE_PTHREAD messes in the code that need to be
> cleaned up.
>
> Still given the fact you are task_work_run it is quite possible even
> release_task has been run on that task before the f_op->release method
> is called.  So you simply can not count on signals working.
>
> Which in practice leaves a timeout for ending your wait.  That code can
> legitimately be in a context that is neither interruptible nor killable.
>
>>>>>    		entity->fini_status = -ERESTARTSYS;
>>>>>    	else
>>>>>    		entity->fini_status = wait_event_killable(sched->job_scheduled,
>>> But really this smells like a bug in wait_event_killable, since
>>> wait_event_interruptible does not suffer from the same bug. It will return
>>> immediately when there's a signal pending.
>> Even when wait_event_interruptible is called as following -
>> ...->do_signal->get_signal->....->wait_event_interruptible ?
>> I haven't tried it but wait_event_interruptible is very much alike to
>> wait_event_killable so I would assume it will also
>> not be interrupted if called like that. (Will give it a try just out
>> of curiosity anyway)
> As PF_EXITING is set want_signal should fail and the signal state of the
> task should not be updatable by signals.
>
> Eric
>
>

^ permalink raw reply	[flat|nested] 122+ messages in thread

* Re: [PATCH 2/3] drm/scheduler: Don't call wait_event_killable for signaled process.
@ 2018-04-24 21:37             ` Andrey Grodzovsky
  0 siblings, 0 replies; 122+ messages in thread
From: Andrey Grodzovsky @ 2018-04-24 21:37 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: David.Panariti-5C7GfCeVMHo, Michel Dänzer,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	dri-devel-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW,
	oleg-H+wXaHxf7aLQT0dZR+AlfA,
	amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW,
	Alexander.Deucher-5C7GfCeVMHo,
	akpm-de/tnXTf+JLsfHDXvbKv3WD2FQJk+8+b,
	Christian.Koenig-5C7GfCeVMHo



On 04/24/2018 05:21 PM, Eric W. Biederman wrote:
> Andrey Grodzovsky <Andrey.Grodzovsky@amd.com> writes:
>
>> On 04/24/2018 03:44 PM, Daniel Vetter wrote:
>>> On Tue, Apr 24, 2018 at 05:46:52PM +0200, Michel Dänzer wrote:
>>>> Adding the dri-devel list, since this is driver independent code.
>>>>
>>>>
>>>> On 2018-04-24 05:30 PM, Andrey Grodzovsky wrote:
>>>>> Avoid calling wait_event_killable when you are possibly being called
>>>>> from get_signal routine since in that case you end up in a deadlock
>>>>> where you are alreay blocked in singla processing any trying to wait
>>>> Multiple typos here, "[...] already blocked in signal processing and [...]"?
>>>>
>>>>
>>>>> on a new signal.
>>>>>
>>>>> Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
>>>>> ---
>>>>>    drivers/gpu/drm/scheduler/gpu_scheduler.c | 5 +++--
>>>>>    1 file changed, 3 insertions(+), 2 deletions(-)
>>>>>
>>>>> diff --git a/drivers/gpu/drm/scheduler/gpu_scheduler.c b/drivers/gpu/drm/scheduler/gpu_scheduler.c
>>>>> index 088ff2b..09fd258 100644
>>>>> --- a/drivers/gpu/drm/scheduler/gpu_scheduler.c
>>>>> +++ b/drivers/gpu/drm/scheduler/gpu_scheduler.c
>>>>> @@ -227,9 +227,10 @@ void drm_sched_entity_do_release(struct drm_gpu_scheduler *sched,
>>>>>    		return;
>>>>>    	/**
>>>>>    	 * The client will not queue more IBs during this fini, consume existing
>>>>> -	 * queued IBs or discard them on SIGKILL
>>>>> +	 * queued IBs or discard them when in death signal state since
>>>>> +	 * wait_event_killable can't receive signals in that state.
>>>>>    	*/
>>>>> -	if ((current->flags & PF_SIGNALED) && current->exit_code == SIGKILL)
>>>>> +	if (current->flags & PF_SIGNALED)
>>> You want fatal_signal_pending() here, instead of inventing your own broken
>>> version.
>> I rely on current->flags & PF_SIGNALED because this being set from
>> within get_signal,
> It doesn't mean that.  Unless you are called by do_coredump (you
> aren't).

Looking at the latest code here
https://elixir.bootlin.com/linux/v4.17-rc2/source/kernel/signal.c#L2449
I see that current->flags |= PF_SIGNALED; is outside of the
if (sig_kernel_coredump(signr)) {...} scope.
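
For context, the tail of get_signal() at that point looks roughly like
this (paraphrased from the v4.17-rc2 source linked above, not a verbatim
quote), which is why PF_SIGNALED ends up set on every fatal-signal path,
coredump or not:

		/* Anything else is fatal, maybe with a core dump. */
		current->flags |= PF_SIGNALED;

		if (sig_kernel_coredump(signr)) {
			/* dump core; this also kills the other threads
			 * in the group */
			do_coredump(&ksig->info);
		}

		/* Death signals, no core dump. */
		do_group_exit(ksig->info.si_signo);
		/* NOTREACHED */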

Andrey

> The closing of files does not happen in do_coredump.
> Which means you are being called from do_exit.
> In fact you are being called after exit_files which closes
> the files.  The actual __fput processing happens in task_work_run.
>
>> meaning I am within signal processing  in which case I want to avoid
>> any signal based wait for that task,
>>  From what i see in the code, task_struct.pending.signal is being set
>> for other threads in same
>> group (zap_other_threads) or for other scenarios, those task are still
>> able to receive signals
>> so calling wait_event_killable there will not have problem.
> Excpet that you are geing called after from do_exit and after exit_files
> which is after exit_signal.  Which means that PF_EXITING has been set.
> Which implies that the kernel signal handling machinery has already
> started being torn down.
>
> Not as much as I would like to happen at that point as we are still
> left with some old CLONE_PTHREAD messes in the code that need to be
> cleaned up.
>
> Still given the fact you are task_work_run it is quite possible even
> release_task has been run on that task before the f_op->release method
> is called.  So you simply can not count on signals working.
>
> Which in practice leaves a timeout for ending your wait.  That code can
> legitimately be in a context that is neither interruptible nor killable.
>
>>>>>    		entity->fini_status = -ERESTARTSYS;
>>>>>    	else
>>>>>    		entity->fini_status = wait_event_killable(sched->job_scheduled,
>>> But really this smells like a bug in wait_event_killable, since
>>> wait_event_interruptible does not suffer from the same bug. It will return
>>> immediately when there's a signal pending.
>> Even when wait_event_interruptible is called as following -
>> ...->do_signal->get_signal->....->wait_event_interruptible ?
>> I haven't tried it but wait_event_interruptible is very much alike to
>> wait_event_killable so I would assume it will also
>> not be interrupted if called like that. (Will give it a try just out
>> of curiosity anyway)
> As PF_EXITING is set want_signal should fail and the signal state of the
> task should not be updatable by signals.
>
> Eric
>
>


^ permalink raw reply	[flat|nested] 122+ messages in thread

* Re: [PATCH 2/3] drm/scheduler: Don't call wait_event_killable for signaled process.
  2018-04-24 21:02         ` Andrey Grodzovsky
@ 2018-04-24 21:40           ` Daniel Vetter
  -1 siblings, 0 replies; 122+ messages in thread
From: Daniel Vetter @ 2018-04-24 21:40 UTC (permalink / raw)
  To: Andrey Grodzovsky
  Cc: Michel Dänzer, linux-kernel, amd-gfx, dri-devel,
	David.Panariti, oleg, ebiederm, Alexander.Deucher, akpm,
	Christian.Koenig

On Tue, Apr 24, 2018 at 05:02:40PM -0400, Andrey Grodzovsky wrote:
> 
> 
> On 04/24/2018 03:44 PM, Daniel Vetter wrote:
> > On Tue, Apr 24, 2018 at 05:46:52PM +0200, Michel Dänzer wrote:
> > > Adding the dri-devel list, since this is driver independent code.
> > > 
> > > 
> > > On 2018-04-24 05:30 PM, Andrey Grodzovsky wrote:
> > > > Avoid calling wait_event_killable when you are possibly being called
> > > > from get_signal routine since in that case you end up in a deadlock
> > > > where you are alreay blocked in singla processing any trying to wait
> > > Multiple typos here, "[...] already blocked in signal processing and [...]"?
> > > 
> > > 
> > > > on a new signal.
> > > > 
> > > > Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
> > > > ---
> > > >   drivers/gpu/drm/scheduler/gpu_scheduler.c | 5 +++--
> > > >   1 file changed, 3 insertions(+), 2 deletions(-)
> > > > 
> > > > diff --git a/drivers/gpu/drm/scheduler/gpu_scheduler.c b/drivers/gpu/drm/scheduler/gpu_scheduler.c
> > > > index 088ff2b..09fd258 100644
> > > > --- a/drivers/gpu/drm/scheduler/gpu_scheduler.c
> > > > +++ b/drivers/gpu/drm/scheduler/gpu_scheduler.c
> > > > @@ -227,9 +227,10 @@ void drm_sched_entity_do_release(struct drm_gpu_scheduler *sched,
> > > >   		return;
> > > >   	/**
> > > >   	 * The client will not queue more IBs during this fini, consume existing
> > > > -	 * queued IBs or discard them on SIGKILL
> > > > +	 * queued IBs or discard them when in death signal state since
> > > > +	 * wait_event_killable can't receive signals in that state.
> > > >   	*/
> > > > -	if ((current->flags & PF_SIGNALED) && current->exit_code == SIGKILL)
> > > > +	if (current->flags & PF_SIGNALED)
> > You want fatal_signal_pending() here, instead of inventing your own broken
> > version.
> 
> I rely on current->flags & PF_SIGNALED because this being set from within
> get_signal,
> meaning I am within signal processing  in which case I want to avoid any
> signal based wait for that task,
> From what i see in the code, task_struct.pending.signal is being set for
> other threads in same
> group (zap_other_threads) or for other scenarios, those task are still able
> to receive signals
> so calling wait_event_killable there will not have problem.
> > > >   		entity->fini_status = -ERESTARTSYS;
> > > >   	else
> > > >   		entity->fini_status = wait_event_killable(sched->job_scheduled,
> > But really this smells like a bug in wait_event_killable, since
> > wait_event_interruptible does not suffer from the same bug. It will return
> > immediately when there's a signal pending.
> 
> Even when wait_event_interruptible is called as following -
> ...->do_signal->get_signal->....->wait_event_interruptible ?
> I haven't tried it but wait_event_interruptible is very much alike to
> wait_event_killable so I would assume it will also
> not be interrupted if called like that. (Will give it a try just out of
> curiosity anyway)

wait_event_killable doesn't check for fatal_signal_pending before calling
schedule, so it definitely has a nice race there.

But if you're sure that you really need to check PF_SIGNALED, then I'm
honestly not clear on what you're trying to pull off here. Your sparse
explanation of what happens isn't enough, since I have no idea how you can
get from get_signal() to the above wait_event_killable callsite.
-Daniel

> 
> Andrey
> 
> > 
> > I think this should be fixed in core code, not papered over in some
> > subsystem.
> > -Daniel
> > 
> > > 
> > > -- 
> > > Earthling Michel Dänzer               |               http://www.amd.com
> > > Libre software enthusiast             |             Mesa and X developer
> > > _______________________________________________
> > > dri-devel mailing list
> > > dri-devel@lists.freedesktop.org
> > > https://lists.freedesktop.org/mailman/listinfo/dri-devel
> 
> _______________________________________________
> dri-devel mailing list
> dri-devel@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/dri-devel

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch

^ permalink raw reply	[flat|nested] 122+ messages in thread

* Re: [PATCH 2/3] drm/scheduler: Don't call wait_event_killable for signaled process.
@ 2018-04-24 21:40           ` Daniel Vetter
  0 siblings, 0 replies; 122+ messages in thread
From: Daniel Vetter @ 2018-04-24 21:40 UTC (permalink / raw)
  To: Andrey Grodzovsky
  Cc: David.Panariti, Michel Dänzer, linux-kernel, dri-devel,
	oleg, amd-gfx, Alexander.Deucher, akpm, Christian.Koenig,
	ebiederm

On Tue, Apr 24, 2018 at 05:02:40PM -0400, Andrey Grodzovsky wrote:
> 
> 
> On 04/24/2018 03:44 PM, Daniel Vetter wrote:
> > On Tue, Apr 24, 2018 at 05:46:52PM +0200, Michel Dänzer wrote:
> > > Adding the dri-devel list, since this is driver independent code.
> > > 
> > > 
> > > On 2018-04-24 05:30 PM, Andrey Grodzovsky wrote:
> > > > Avoid calling wait_event_killable when you are possibly being called
> > > > from get_signal routine since in that case you end up in a deadlock
> > > > where you are alreay blocked in singla processing any trying to wait
> > > Multiple typos here, "[...] already blocked in signal processing and [...]"?
> > > 
> > > 
> > > > on a new signal.
> > > > 
> > > > Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
> > > > ---
> > > >   drivers/gpu/drm/scheduler/gpu_scheduler.c | 5 +++--
> > > >   1 file changed, 3 insertions(+), 2 deletions(-)
> > > > 
> > > > diff --git a/drivers/gpu/drm/scheduler/gpu_scheduler.c b/drivers/gpu/drm/scheduler/gpu_scheduler.c
> > > > index 088ff2b..09fd258 100644
> > > > --- a/drivers/gpu/drm/scheduler/gpu_scheduler.c
> > > > +++ b/drivers/gpu/drm/scheduler/gpu_scheduler.c
> > > > @@ -227,9 +227,10 @@ void drm_sched_entity_do_release(struct drm_gpu_scheduler *sched,
> > > >   		return;
> > > >   	/**
> > > >   	 * The client will not queue more IBs during this fini, consume existing
> > > > -	 * queued IBs or discard them on SIGKILL
> > > > +	 * queued IBs or discard them when in death signal state since
> > > > +	 * wait_event_killable can't receive signals in that state.
> > > >   	*/
> > > > -	if ((current->flags & PF_SIGNALED) && current->exit_code == SIGKILL)
> > > > +	if (current->flags & PF_SIGNALED)
> > You want fatal_signal_pending() here, instead of inventing your own broken
> > version.
> 
> I rely on current->flags & PF_SIGNALED because this being set from within
> get_signal,
> meaning I am within signal processing  in which case I want to avoid any
> signal based wait for that task,
> From what i see in the code, task_struct.pending.signal is being set for
> other threads in same
> group (zap_other_threads) or for other scenarios, those task are still able
> to receive signals
> so calling wait_event_killable there will not have problem.
> > > >   		entity->fini_status = -ERESTARTSYS;
> > > >   	else
> > > >   		entity->fini_status = wait_event_killable(sched->job_scheduled,
> > But really this smells like a bug in wait_event_killable, since
> > wait_event_interruptible does not suffer from the same bug. It will return
> > immediately when there's a signal pending.
> 
> Even when wait_event_interruptible is called as following -
> ...->do_signal->get_signal->....->wait_event_interruptible ?
> I haven't tried it but wait_event_interruptible is very much alike to
> wait_event_killable so I would assume it will also
> not be interrupted if called like that. (Will give it a try just out of
> curiosity anyway)

wait_event_killable doesn't check for fatal_signal_pending before calling
schedule, so it definitely has a nice race there.

But if you're sure that you really need to check PF_SIGNALED, then I'm
honestly not clear on what you're trying to pull off here. Your sparse
explanation of what happens isn't enough, since I have no idea how you can
get from get_signal() to the above wait_event_killable callsite.
-Daniel

> 
> Andrey
> 
> > 
> > I think this should be fixed in core code, not papered over in some
> > subsystem.
> > -Daniel
> > 
> > > 
> > > -- 
> > > Earthling Michel Dänzer               |               http://www.amd.com
> > > Libre software enthusiast             |             Mesa and X developer
> > > _______________________________________________
> > > dri-devel mailing list
> > > dri-devel@lists.freedesktop.org
> > > https://lists.freedesktop.org/mailman/listinfo/dri-devel
> 
> _______________________________________________
> dri-devel mailing list
> dri-devel@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/dri-devel

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch

^ permalink raw reply	[flat|nested] 122+ messages in thread

* Re: [PATCH 2/3] drm/scheduler: Don't call wait_event_killable for signaled process.
  2018-04-24 21:37             ` Andrey Grodzovsky
  (?)
@ 2018-04-24 22:11             ` Eric W. Biederman
  -1 siblings, 0 replies; 122+ messages in thread
From: Eric W. Biederman @ 2018-04-24 22:11 UTC (permalink / raw)
  To: Andrey Grodzovsky
  Cc: Michel Dänzer, linux-kernel, amd-gfx, dri-devel,
	David.Panariti, oleg, Alexander.Deucher, akpm, Christian.Koenig

Andrey Grodzovsky <Andrey.Grodzovsky@amd.com> writes:

> On 04/24/2018 05:21 PM, Eric W. Biederman wrote:
>> Andrey Grodzovsky <Andrey.Grodzovsky@amd.com> writes:
>>
>>> On 04/24/2018 03:44 PM, Daniel Vetter wrote:
>>>> On Tue, Apr 24, 2018 at 05:46:52PM +0200, Michel Dänzer wrote:
>>>>> Adding the dri-devel list, since this is driver independent code.
>>>>>
>>>>>
>>>>> On 2018-04-24 05:30 PM, Andrey Grodzovsky wrote:
>>>>>> Avoid calling wait_event_killable when you are possibly being called
>>>>>> from get_signal routine since in that case you end up in a deadlock
>>>>>> where you are alreay blocked in singla processing any trying to wait
>>>>> Multiple typos here, "[...] already blocked in signal processing and [...]"?
>>>>>
>>>>>
>>>>>> on a new signal.
>>>>>>
>>>>>> Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
>>>>>> ---
>>>>>>    drivers/gpu/drm/scheduler/gpu_scheduler.c | 5 +++--
>>>>>>    1 file changed, 3 insertions(+), 2 deletions(-)
>>>>>>
>>>>>> diff --git a/drivers/gpu/drm/scheduler/gpu_scheduler.c b/drivers/gpu/drm/scheduler/gpu_scheduler.c
>>>>>> index 088ff2b..09fd258 100644
>>>>>> --- a/drivers/gpu/drm/scheduler/gpu_scheduler.c
>>>>>> +++ b/drivers/gpu/drm/scheduler/gpu_scheduler.c
>>>>>> @@ -227,9 +227,10 @@ void drm_sched_entity_do_release(struct drm_gpu_scheduler *sched,
>>>>>>    		return;
>>>>>>    	/**
>>>>>>    	 * The client will not queue more IBs during this fini, consume existing
>>>>>> -	 * queued IBs or discard them on SIGKILL
>>>>>> +	 * queued IBs or discard them when in death signal state since
>>>>>> +	 * wait_event_killable can't receive signals in that state.
>>>>>>    	*/
>>>>>> -	if ((current->flags & PF_SIGNALED) && current->exit_code == SIGKILL)
>>>>>> +	if (current->flags & PF_SIGNALED)
>>>> You want fatal_signal_pending() here, instead of inventing your own broken
>>>> version.
>>> I rely on current->flags & PF_SIGNALED because this being set from
>>> within get_signal,
>> It doesn't mean that.  Unless you are called by do_coredump (you
>> aren't).
>
> Looking in latest code here
> https://elixir.bootlin.com/linux/v4.17-rc2/source/kernel/signal.c#L2449
> i see that current->flags |= PF_SIGNALED; is out side of
> if (sig_kernel_coredump(signr)) {...} scope

In small words.  You showed me the backtrace and I have read
the code.

PF_SIGNALED means you got killed by a signal.
get_signal
  do_coredump
  do_group_exit
    do_exit
       exit_signals
          sets PF_EXITING
       exit_mm
          calls fput on mmaps
             calls sched_task_work
       exit_files
          calls fput on open files
             calls sched_task_work
       exit_task_work
          task_work_run
             /* you are here */

So while strictly speaking you are inside of get_signal, it is not
meaningful to speak of yourself as within get_signal.

I am a little surprised to see task_work_run called so early.
I was mostly expecting it to happen when the dead task was
scheduling away, as normally happens.

Testing for PF_SIGNALED does not give you anything at all
that testing for PF_EXITING (the flag indicating that signal
handling is being shut down) does not get you.

There is no point in distinguishing PF_SIGNALED from any other
path to do_exit.  do_exit never returns.

The task is dead.

Blocking indefinitely while shutting down a task is a bad idea.
Blocking indefinitely while closing a file descriptor is a bad idea.

The task has been killed; it can't get any more dead.  SIGKILL is
meaningless at this point.

So you need a timeout, or not to wait at all.
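
A minimal sketch of what a bounded wait could look like at the
drm_sched_entity_do_release() wait discussed in this thread.  The timeout
value is a placeholder and drm_sched_entity_is_idle() is assumed to be
the wait condition; neither is taken from the posted patch.
wait_event_timeout() returns 0 only when it times out with the condition
still false, so the result has to be mapped onto fini_status explicitly:

	long left = wait_event_timeout(sched->job_scheduled,
				       drm_sched_entity_is_idle(entity),
				       msecs_to_jiffies(1000) /* placeholder */);

	entity->fini_status = left ? 0 : -ETIMEDOUT;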


Eric

^ permalink raw reply	[flat|nested] 122+ messages in thread

* Re: [PATCH 2/3] drm/scheduler: Don't call wait_event_killable for signaled process.
  2018-04-24 21:37             ` Andrey Grodzovsky
  (?)
  (?)
@ 2018-04-25  7:14             ` Daniel Vetter
  2018-04-25 13:08                 ` Andrey Grodzovsky
  -1 siblings, 1 reply; 122+ messages in thread
From: Daniel Vetter @ 2018-04-25  7:14 UTC (permalink / raw)
  To: Andrey Grodzovsky
  Cc: Eric W. Biederman, David.Panariti, Michel Dänzer,
	linux-kernel, dri-devel, oleg, amd-gfx, Alexander.Deucher, akpm,
	Christian.Koenig

On Tue, Apr 24, 2018 at 05:37:08PM -0400, Andrey Grodzovsky wrote:
> 
> 
> On 04/24/2018 05:21 PM, Eric W. Biederman wrote:
> > Andrey Grodzovsky <Andrey.Grodzovsky@amd.com> writes:
> > 
> > > On 04/24/2018 03:44 PM, Daniel Vetter wrote:
> > > > On Tue, Apr 24, 2018 at 05:46:52PM +0200, Michel Dänzer wrote:
> > > > > Adding the dri-devel list, since this is driver independent code.
> > > > > 
> > > > > 
> > > > > On 2018-04-24 05:30 PM, Andrey Grodzovsky wrote:
> > > > > > Avoid calling wait_event_killable when you are possibly being called
> > > > > > from get_signal routine since in that case you end up in a deadlock
> > > > > > where you are alreay blocked in singla processing any trying to wait
> > > > > Multiple typos here, "[...] already blocked in signal processing and [...]"?
> > > > > 
> > > > > 
> > > > > > on a new signal.
> > > > > > 
> > > > > > Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
> > > > > > ---
> > > > > >    drivers/gpu/drm/scheduler/gpu_scheduler.c | 5 +++--
> > > > > >    1 file changed, 3 insertions(+), 2 deletions(-)
> > > > > > 
> > > > > > diff --git a/drivers/gpu/drm/scheduler/gpu_scheduler.c b/drivers/gpu/drm/scheduler/gpu_scheduler.c
> > > > > > index 088ff2b..09fd258 100644
> > > > > > --- a/drivers/gpu/drm/scheduler/gpu_scheduler.c
> > > > > > +++ b/drivers/gpu/drm/scheduler/gpu_scheduler.c
> > > > > > @@ -227,9 +227,10 @@ void drm_sched_entity_do_release(struct drm_gpu_scheduler *sched,
> > > > > >    		return;
> > > > > >    	/**
> > > > > >    	 * The client will not queue more IBs during this fini, consume existing
> > > > > > -	 * queued IBs or discard them on SIGKILL
> > > > > > +	 * queued IBs or discard them when in death signal state since
> > > > > > +	 * wait_event_killable can't receive signals in that state.
> > > > > >    	*/
> > > > > > -	if ((current->flags & PF_SIGNALED) && current->exit_code == SIGKILL)
> > > > > > +	if (current->flags & PF_SIGNALED)
> > > > You want fatal_signal_pending() here, instead of inventing your own broken
> > > > version.
> > > I rely on current->flags & PF_SIGNALED because this being set from
> > > within get_signal,
> > It doesn't mean that.  Unless you are called by do_coredump (you
> > aren't).
> 
> Looking in latest code here
> https://elixir.bootlin.com/linux/v4.17-rc2/source/kernel/signal.c#L2449
> i see that current->flags |= PF_SIGNALED; is out side of
> if (sig_kernel_coredump(signr)) {...} scope

Ok I read some more about this, and I guess you go through process exit
and then eventually close. But I'm not sure.

The code in drm_sched_entity_fini also looks strange: You unpark the
scheduler thread before you remove all the IBs. At least from the comment
that doesn't sound like what you want to do.

But in general, PF_SIGNALED is really something deeply internal to the
core (used for some book-keeping and accounting). The drm scheduler is the
only thing looking at it, so smells like a layering violation. I suspect
(but without knowing what you're actually trying to achieve here I can't be
sure) you want to look at something else.

E.g. PF_EXITING seems to be used in a lot more places to cancel stuff
that's no longer relevant when a task exits, not PF_SIGNALED. There's the
TIF_MEMDIE flag if you're hacking around issues with the oom-killer.

This here on the other hand looks really fragile, and probably only does
what you want to do by accident.
-Daniel

> 
> Andrey
> 
> > The closing of files does not happen in do_coredump.
> > Which means you are being called from do_exit.
> > In fact you are being called after exit_files which closes
> > the files.  The actual __fput processing happens in task_work_run.
> > 
> > > meaning I am within signal processing  in which case I want to avoid
> > > any signal based wait for that task,
> > >  From what i see in the code, task_struct.pending.signal is being set
> > > for other threads in same
> > > group (zap_other_threads) or for other scenarios, those task are still
> > > able to receive signals
> > > so calling wait_event_killable there will not have problem.
> > Excpet that you are geing called after from do_exit and after exit_files
> > which is after exit_signal.  Which means that PF_EXITING has been set.
> > Which implies that the kernel signal handling machinery has already
> > started being torn down.
> > 
> > Not as much as I would like to happen at that point as we are still
> > left with some old CLONE_PTHREAD messes in the code that need to be
> > cleaned up.
> > 
> > Still given the fact you are task_work_run it is quite possible even
> > release_task has been run on that task before the f_op->release method
> > is called.  So you simply can not count on signals working.
> > 
> > Which in practice leaves a timeout for ending your wait.  That code can
> > legitimately be in a context that is neither interruptible nor killable.
> > 
> > > > > >    		entity->fini_status = -ERESTARTSYS;
> > > > > >    	else
> > > > > >    		entity->fini_status = wait_event_killable(sched->job_scheduled,
> > > > But really this smells like a bug in wait_event_killable, since
> > > > wait_event_interruptible does not suffer from the same bug. It will return
> > > > immediately when there's a signal pending.
> > > Even when wait_event_interruptible is called as following -
> > > ...->do_signal->get_signal->....->wait_event_interruptible ?
> > > I haven't tried it but wait_event_interruptible is very much alike to
> > > wait_event_killable so I would assume it will also
> > > not be interrupted if called like that. (Will give it a try just out
> > > of curiosity anyway)
> > As PF_EXITING is set want_signal should fail and the signal state of the
> > task should not be updatable by signals.
> > 
> > Eric
> > 
> > 
> 
> _______________________________________________
> dri-devel mailing list
> dri-devel@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/dri-devel

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch

^ permalink raw reply	[flat|nested] 122+ messages in thread

* Re: [PATCH 2/3] drm/scheduler: Don't call wait_event_killable for signaled process.
  2018-04-24 15:30   ` Andrey Grodzovsky
                     ` (2 preceding siblings ...)
  (?)
@ 2018-04-25 13:05   ` Oleg Nesterov
  -1 siblings, 0 replies; 122+ messages in thread
From: Oleg Nesterov @ 2018-04-25 13:05 UTC (permalink / raw)
  To: Andrey Grodzovsky
  Cc: linux-kernel, amd-gfx, Alexander.Deucher, Christian.Koenig,
	David.Panariti, akpm, ebiederm

On 04/24, Andrey Grodzovsky wrote:
>
> --- a/drivers/gpu/drm/scheduler/gpu_scheduler.c
> +++ b/drivers/gpu/drm/scheduler/gpu_scheduler.c
> @@ -227,9 +227,10 @@ void drm_sched_entity_do_release(struct drm_gpu_scheduler *sched,
>  		return;
>  	/**
>  	 * The client will not queue more IBs during this fini, consume existing
> -	 * queued IBs or discard them on SIGKILL
> +	 * queued IBs or discard them when in death signal state since
> +	 * wait_event_killable can't receive signals in that state.
>  	*/
> -	if ((current->flags & PF_SIGNALED) && current->exit_code == SIGKILL)
> +	if (current->flags & PF_SIGNALED)

please do not use PF_SIGNALED, it must die. Besides you can't rely on this flag
in multi-threaded case. current->exit_code doesn't look right too.

>  		entity->fini_status = -ERESTARTSYS;
>  	else
>  		entity->fini_status = wait_event_killable(sched->job_scheduled,

So afaics the problem is that fatal_signal_pending() is not necessarily true
after SIGKILL was already dequeued, and thus wait_event_killable() keeps
sleeping, right?
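
For reference, fatal_signal_pending() reduces to roughly the following
(per include/linux/sched/signal.h of that era), so once get_signal() has
dequeued SIGKILL and cleared the bit from the per-task pending set it
goes back to returning false:

	static inline int __fatal_signal_pending(struct task_struct *p)
	{
		return unlikely(sigismember(&p->pending.signal, SIGKILL));
	}

	static inline int fatal_signal_pending(struct task_struct *p)
	{
		return signal_pending(p) && __fatal_signal_pending(p);
	}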

This was already discussed, but it is not clear what we can/should do. We can
probably change get_signal() to not dequeue SIGKILL or do something else to keep
fatal_signal_pending() == T for the exiting killed thread.

But in this case we probably also want to discriminate the "real" SIGKILL's from
group_exit/exec/coredump.

Oleg.

^ permalink raw reply	[flat|nested] 122+ messages in thread

* Re: [PATCH 2/3] drm/scheduler: Don't call wait_event_killable for signaled process.
  2018-04-25  7:14             ` Daniel Vetter
@ 2018-04-25 13:08                 ` Andrey Grodzovsky
  0 siblings, 0 replies; 122+ messages in thread
From: Andrey Grodzovsky @ 2018-04-25 13:08 UTC (permalink / raw)
  To: Eric W. Biederman, David.Panariti, Michel Dänzer,
	linux-kernel, dri-devel, oleg, amd-gfx, Alexander.Deucher, akpm,
	Christian.Koenig



On 04/25/2018 03:14 AM, Daniel Vetter wrote:
> On Tue, Apr 24, 2018 at 05:37:08PM -0400, Andrey Grodzovsky wrote:
>>
>> On 04/24/2018 05:21 PM, Eric W. Biederman wrote:
>>> Andrey Grodzovsky <Andrey.Grodzovsky@amd.com> writes:
>>>
>>>> On 04/24/2018 03:44 PM, Daniel Vetter wrote:
>>>>> On Tue, Apr 24, 2018 at 05:46:52PM +0200, Michel Dänzer wrote:
>>>>>> Adding the dri-devel list, since this is driver independent code.
>>>>>>
>>>>>>
>>>>>> On 2018-04-24 05:30 PM, Andrey Grodzovsky wrote:
>>>>>>> Avoid calling wait_event_killable when you are possibly being called
>>>>>>> from get_signal routine since in that case you end up in a deadlock
>>>>>>> where you are alreay blocked in singla processing any trying to wait
>>>>>> Multiple typos here, "[...] already blocked in signal processing and [...]"?
>>>>>>
>>>>>>
>>>>>>> on a new signal.
>>>>>>>
>>>>>>> Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
>>>>>>> ---
>>>>>>>     drivers/gpu/drm/scheduler/gpu_scheduler.c | 5 +++--
>>>>>>>     1 file changed, 3 insertions(+), 2 deletions(-)
>>>>>>>
>>>>>>> diff --git a/drivers/gpu/drm/scheduler/gpu_scheduler.c b/drivers/gpu/drm/scheduler/gpu_scheduler.c
>>>>>>> index 088ff2b..09fd258 100644
>>>>>>> --- a/drivers/gpu/drm/scheduler/gpu_scheduler.c
>>>>>>> +++ b/drivers/gpu/drm/scheduler/gpu_scheduler.c
>>>>>>> @@ -227,9 +227,10 @@ void drm_sched_entity_do_release(struct drm_gpu_scheduler *sched,
>>>>>>>     		return;
>>>>>>>     	/**
>>>>>>>     	 * The client will not queue more IBs during this fini, consume existing
>>>>>>> -	 * queued IBs or discard them on SIGKILL
>>>>>>> +	 * queued IBs or discard them when in death signal state since
>>>>>>> +	 * wait_event_killable can't receive signals in that state.
>>>>>>>     	*/
>>>>>>> -	if ((current->flags & PF_SIGNALED) && current->exit_code == SIGKILL)
>>>>>>> +	if (current->flags & PF_SIGNALED)
>>>>> You want fatal_signal_pending() here, instead of inventing your own broken
>>>>> version.
>>>> I rely on current->flags & PF_SIGNALED because this being set from
>>>> within get_signal,
>>> It doesn't mean that.  Unless you are called by do_coredump (you
>>> aren't).
>> Looking in latest code here
>> https://elixir.bootlin.com/linux/v4.17-rc2/source/kernel/signal.c#L2449
>> i see that current->flags |= PF_SIGNALED; is out side of
>> if (sig_kernel_coredump(signr)) {...} scope
> Ok I read some more about this, and I guess you go through process exit
> and then eventually close. But I'm not sure.
>
> The code in drm_sched_entity_fini also looks strange: You unpark the
> scheduler thread before you remove all the IBs. At least from the comment
> that doesn't sound like what you want to do.

I think it should be safe for the dying scheduler entity since before
that (in drm_sched_entity_do_release) we set its runqueue to NULL,
so no new jobs will be dequeued from it by the scheduler thread.

>
> But in general, PF_SIGNALED is really something deeply internal to the
> core (used for some book-keeping and accounting). The drm scheduler is the
> only thing looking at it, so smells like a layering violation. I suspect
> (but without knowing what you're actually trying to achive here can't be
> sure) you want to look at something else.
>
> E.g. PF_EXITING seems to be used in a lot more places to cancel stuff
> that's no longer relevant when a task exits, not PF_SIGNALED. There's the
> TIF_MEMDIE flag if you're hacking around issues with the oom-killer.
>
> This here on the other hand looks really fragile, and probably only does
> what you want to do by accident.
> -Daniel

Yes, that's what Eric also said, and in the V2 patches I will try to
change to PF_EXITING.

Another issue is changing wait_event_killable to wait_event_timeout,
where I need to understand what timeout value is acceptable for all
the drivers using the scheduler, or maybe it should come as a property
of drm_sched_entity.
Andrey
>
>> Andrey
>>
>>> The closing of files does not happen in do_coredump.
>>> Which means you are being called from do_exit.
>>> In fact you are being called after exit_files which closes
>>> the files.  The actual __fput processing happens in task_work_run.
>>>
>>>> meaning I am within signal processing  in which case I want to avoid
>>>> any signal based wait for that task,
>>>>   From what i see in the code, task_struct.pending.signal is being set
>>>> for other threads in same
>>>> group (zap_other_threads) or for other scenarios, those task are still
>>>> able to receive signals
>>>> so calling wait_event_killable there will not have problem.
>>> Excpet that you are geing called after from do_exit and after exit_files
>>> which is after exit_signal.  Which means that PF_EXITING has been set.
>>> Which implies that the kernel signal handling machinery has already
>>> started being torn down.
>>>
>>> Not as much as I would like to happen at that point as we are still
>>> left with some old CLONE_PTHREAD messes in the code that need to be
>>> cleaned up.
>>>
>>> Still given the fact you are task_work_run it is quite possible even
>>> release_task has been run on that task before the f_op->release method
>>> is called.  So you simply can not count on signals working.
>>>
>>> Which in practice leaves a timeout for ending your wait.  That code can
>>> legitimately be in a context that is neither interruptible nor killable.
>>>
>>>>>>>     		entity->fini_status = -ERESTARTSYS;
>>>>>>>     	else
>>>>>>>     		entity->fini_status = wait_event_killable(sched->job_scheduled,
>>>>> But really this smells like a bug in wait_event_killable, since
>>>>> wait_event_interruptible does not suffer from the same bug. It will return
>>>>> immediately when there's a signal pending.
>>>> Even when wait_event_interruptible is called as following -
>>>> ...->do_signal->get_signal->....->wait_event_interruptible ?
>>>> I haven't tried it but wait_event_interruptible is very much alike to
>>>> wait_event_killable so I would assume it will also
>>>> not be interrupted if called like that. (Will give it a try just out
>>>> of curiosity anyway)
>>> As PF_EXITING is set want_signal should fail and the signal state of the
>>> task should not be updatable by signals.
>>>
>>> Eric
>>>
>>>
>> _______________________________________________
>> dri-devel mailing list
>> dri-devel@lists.freedesktop.org
>> https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 122+ messages in thread

* Re: [PATCH 2/3] drm/scheduler: Don't call wait_event_killable for signaled process.
@ 2018-04-25 13:08                 ` Andrey Grodzovsky
  0 siblings, 0 replies; 122+ messages in thread
From: Andrey Grodzovsky @ 2018-04-25 13:08 UTC (permalink / raw)
  To: Eric W. Biederman, David.Panariti, Michel Dänzer,
	linux-kernel, dri-devel, oleg, amd-gfx, Alexander.Deucher, akpm,
	Christian.Koenig



On 04/25/2018 03:14 AM, Daniel Vetter wrote:
> On Tue, Apr 24, 2018 at 05:37:08PM -0400, Andrey Grodzovsky wrote:
>>
>> On 04/24/2018 05:21 PM, Eric W. Biederman wrote:
>>> Andrey Grodzovsky <Andrey.Grodzovsky@amd.com> writes:
>>>
>>>> On 04/24/2018 03:44 PM, Daniel Vetter wrote:
>>>>> On Tue, Apr 24, 2018 at 05:46:52PM +0200, Michel Dänzer wrote:
>>>>>> Adding the dri-devel list, since this is driver independent code.
>>>>>>
>>>>>>
>>>>>> On 2018-04-24 05:30 PM, Andrey Grodzovsky wrote:
>>>>>>> Avoid calling wait_event_killable when you are possibly being called
>>>>>>> from get_signal routine since in that case you end up in a deadlock
>>>>>>> where you are alreay blocked in singla processing any trying to wait
>>>>>> Multiple typos here, "[...] already blocked in signal processing and [...]"?
>>>>>>
>>>>>>
>>>>>>> on a new signal.
>>>>>>>
>>>>>>> Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
>>>>>>> ---
>>>>>>>     drivers/gpu/drm/scheduler/gpu_scheduler.c | 5 +++--
>>>>>>>     1 file changed, 3 insertions(+), 2 deletions(-)
>>>>>>>
>>>>>>> diff --git a/drivers/gpu/drm/scheduler/gpu_scheduler.c b/drivers/gpu/drm/scheduler/gpu_scheduler.c
>>>>>>> index 088ff2b..09fd258 100644
>>>>>>> --- a/drivers/gpu/drm/scheduler/gpu_scheduler.c
>>>>>>> +++ b/drivers/gpu/drm/scheduler/gpu_scheduler.c
>>>>>>> @@ -227,9 +227,10 @@ void drm_sched_entity_do_release(struct drm_gpu_scheduler *sched,
>>>>>>>     		return;
>>>>>>>     	/**
>>>>>>>     	 * The client will not queue more IBs during this fini, consume existing
>>>>>>> -	 * queued IBs or discard them on SIGKILL
>>>>>>> +	 * queued IBs or discard them when in death signal state since
>>>>>>> +	 * wait_event_killable can't receive signals in that state.
>>>>>>>     	*/
>>>>>>> -	if ((current->flags & PF_SIGNALED) && current->exit_code == SIGKILL)
>>>>>>> +	if (current->flags & PF_SIGNALED)
>>>>> You want fatal_signal_pending() here, instead of inventing your own broken
>>>>> version.
>>>> I rely on current->flags & PF_SIGNALED because this being set from
>>>> within get_signal,
>>> It doesn't mean that.  Unless you are called by do_coredump (you
>>> aren't).
>> Looking in latest code here
>> https://elixir.bootlin.com/linux/v4.17-rc2/source/kernel/signal.c#L2449
>> i see that current->flags |= PF_SIGNALED; is out side of
>> if (sig_kernel_coredump(signr)) {...} scope
> Ok I read some more about this, and I guess you go through process exit
> and then eventually close. But I'm not sure.
>
> The code in drm_sched_entity_fini also looks strange: You unpark the
> scheduler thread before you remove all the IBs. At least from the comment
> that doesn't sound like what you want to do.

I think it should be safe for the dying scheduler entity since before
that (in drm_sched_entity_do_release) we set its runqueue to NULL,
so no new jobs will be dequeued from it by the scheduler thread.

>
> But in general, PF_SIGNALED is really something deeply internal to the
> core (used for some book-keeping and accounting). The drm scheduler is the
> only thing looking at it, so smells like a layering violation. I suspect
> (but without knowing what you're actually trying to achive here can't be
> sure) you want to look at something else.
>
> E.g. PF_EXITING seems to be used in a lot more places to cancel stuff
> that's no longer relevant when a task exits, not PF_SIGNALED. There's the
> TIF_MEMDIE flag if you're hacking around issues with the oom-killer.
>
> This here on the other hand looks really fragile, and probably only does
> what you want to do by accident.
> -Daniel

Yes, that's what Eric also said, and in the V2 patches I will try to
change to PF_EXITING.

Another issue is changing wait_event_killable to wait_event_timeout,
where I need to understand what timeout value is acceptable for all
the drivers using the scheduler, or maybe it should come as a property
of drm_sched_entity.
Andrey
>
>> Andrey
>>
>>> The closing of files does not happen in do_coredump.
>>> Which means you are being called from do_exit.
>>> In fact you are being called after exit_files which closes
>>> the files.  The actual __fput processing happens in task_work_run.
>>>
>>>> meaning I am within signal processing  in which case I want to avoid
>>>> any signal based wait for that task,
>>>>   From what i see in the code, task_struct.pending.signal is being set
>>>> for other threads in same
>>>> group (zap_other_threads) or for other scenarios, those task are still
>>>> able to receive signals
>>>> so calling wait_event_killable there will not have problem.
>>> Excpet that you are geing called after from do_exit and after exit_files
>>> which is after exit_signal.  Which means that PF_EXITING has been set.
>>> Which implies that the kernel signal handling machinery has already
>>> started being torn down.
>>>
>>> Not as much as I would like to happen at that point as we are still
>>> left with some old CLONE_PTHREAD messes in the code that need to be
>>> cleaned up.
>>>
>>> Still given the fact you are task_work_run it is quite possible even
>>> release_task has been run on that task before the f_op->release method
>>> is called.  So you simply can not count on signals working.
>>>
>>> Which in practice leaves a timeout for ending your wait.  That code can
>>> legitimately be in a context that is neither interruptible nor killable.
>>>
>>>>>>>     		entity->fini_status = -ERESTARTSYS;
>>>>>>>     	else
>>>>>>>     		entity->fini_status = wait_event_killable(sched->job_scheduled,
>>>>> But really this smells like a bug in wait_event_killable, since
>>>>> wait_event_interruptible does not suffer from the same bug. It will return
>>>>> immediately when there's a signal pending.
>>>> Even when wait_event_interruptible is called as following -
>>>> ...->do_signal->get_signal->....->wait_event_interruptible ?
>>>> I haven't tried it but wait_event_interruptible is very much alike to
>>>> wait_event_killable so I would assume it will also
>>>> not be interrupted if called like that. (Will give it a try just out
>>>> of curiosity anyway)
>>> As PF_EXITING is set want_signal should fail and the signal state of the
>>> task should not be updatable by signals.
>>>
>>> Eric
>>>
>>>
>> _______________________________________________
>> dri-devel mailing list
>> dri-devel@lists.freedesktop.org
>> https://lists.freedesktop.org/mailman/listinfo/dri-devel


^ permalink raw reply	[flat|nested] 122+ messages in thread

* Re: [PATCH 1/3] signals: Allow generation of SIGKILL to exiting task.
  2018-04-24 15:30   ` Andrey Grodzovsky
                     ` (2 preceding siblings ...)
  (?)
@ 2018-04-25 13:13   ` Oleg Nesterov
  -1 siblings, 0 replies; 122+ messages in thread
From: Oleg Nesterov @ 2018-04-25 13:13 UTC (permalink / raw)
  To: Andrey Grodzovsky
  Cc: linux-kernel, amd-gfx, Alexander.Deucher, Christian.Koenig,
	David.Panariti, akpm, ebiederm

On 04/24, Andrey Grodzovsky wrote:
>
> Currently calling wait_event_killable as part of exiting process
> will stall forever since SIGKILL generation is suppresed by PF_EXITING.

See my reply to 2/3,

> In our partilaur case AMDGPU driver wants to flush all GPU jobs in
> flight before shutting down. But if some job hangs the pipe we still want to
> be able to kill it and avoid a process in D state.

this patch won't really help in the multi-threaded case,

> --- a/kernel/signal.c
> +++ b/kernel/signal.c
> @@ -886,10 +886,10 @@ static inline int wants_signal(int sig, struct task_struct *p)
>  {
>  	if (sigismember(&p->blocked, sig))
>  		return 0;
> -	if (p->flags & PF_EXITING)
> -		return 0;
>  	if (sig == SIGKILL)
>  		return 1;
> +	if (p->flags & PF_EXITING)
> +		return 0;

So you want to trigger signal_wake_up() at the end of complete_signal().

Unless you use tkill() you can wake another thread, not the thread blocked
in drm_sched_entity_fini().

And if the whole process is already dying, complete_signal() will do nothing
else.
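
For context, the quoted hunk only reorders two checks; with the patch
applied wants_signal() would look roughly like this (the remaining lines
are reproduced from kernel/signal.c from memory and may differ in
detail):

	static inline int wants_signal(int sig, struct task_struct *p)
	{
		if (sigismember(&p->blocked, sig))
			return 0;
		if (sig == SIGKILL)	/* now checked before PF_EXITING */
			return 1;
		if (p->flags & PF_EXITING)
			return 0;
		if (task_is_stopped_or_traced(p))
			return 0;
		return task_curr(p) || !signal_pending(p);
	}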

Oleg.

^ permalink raw reply	[flat|nested] 122+ messages in thread

* Re: [PATCH 2/3] drm/scheduler: Don't call wait_event_killable for signaled process.
  2018-04-24 21:40           ` Daniel Vetter
  (?)
@ 2018-04-25 13:22           ` Oleg Nesterov
  2018-04-25 13:36             ` Daniel Vetter
  -1 siblings, 1 reply; 122+ messages in thread
From: Oleg Nesterov @ 2018-04-25 13:22 UTC (permalink / raw)
  To: Andrey Grodzovsky, Michel Dänzer, linux-kernel, amd-gfx,
	dri-devel, David.Panariti, ebiederm, Alexander.Deucher, akpm,
	Christian.Koenig

On 04/24, Daniel Vetter wrote:
>
> wait_event_killabel doesn't check for fatal_signal_pending before calling
> schedule, so definitely has a nice race there.

This is fine. See the signal_pending_state() check in __schedule().

And this doesn't differ from wait_event_interruptible(): it too doesn't
check signal_pending(); we rely on schedule(), which must not block if the
caller is signalled/killed.

The problem is that it is not clear what fatal_signal_pending() or
even signal_pending() should mean after exit_signals().
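
The check referred to above is, roughly, include/linux/sched.h's:

	static inline int signal_pending_state(long state, struct task_struct *p)
	{
		if (!(state & (TASK_INTERRUPTIBLE | TASK_WAKEKILL)))
			return 0;
		if (!signal_pending(p))
			return 0;

		return (state & TASK_INTERRUPTIBLE) || __fatal_signal_pending(p);
	}

so a task sleeping in TASK_KILLABLE is only kept runnable by __schedule()
while __fatal_signal_pending() is still true, which is exactly where the
exit_signals() question above comes in.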

Oleg.

^ permalink raw reply	[flat|nested] 122+ messages in thread

* Re: [PATCH 2/3] drm/scheduler: Don't call wait_event_killable for signaled process.
  2018-04-25 13:22           ` Oleg Nesterov
@ 2018-04-25 13:36             ` Daniel Vetter
  2018-04-25 14:18                 ` Oleg Nesterov
  0 siblings, 1 reply; 122+ messages in thread
From: Daniel Vetter @ 2018-04-25 13:36 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: Andrey Grodzovsky, Michel Dänzer, Linux Kernel Mailing List,
	amd-gfx list, dri-devel, David.Panariti, Eric Biederman,
	Alex Deucher, Andrew Morton, Christian König

On Wed, Apr 25, 2018 at 3:22 PM, Oleg Nesterov <oleg@redhat.com> wrote:
> On 04/24, Daniel Vetter wrote:
>>
>> wait_event_killabel doesn't check for fatal_signal_pending before calling
>> schedule, so definitely has a nice race there.
>
> This is fine. See the signal_pending_state() check in __schedule().
>
> And this doesn't differ from wait_event_interruptible(), it too doesn't
> check signal_pending(), we rely on schedule() which must not block if the
> caller is signalled/killed.
>
> The problem is that it is not clear what should fatal_signal_pending() or
> even signal_pending() mean after exit_signals().

Uh, I was totally thrown off in all the wait_event* macros and somehow
landed in the _locked variants, which all need to recheck before they
drop the lock, for efficiency reasons. See do_wait_intr().

Sorry for the confusion.
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch

^ permalink raw reply	[flat|nested] 122+ messages in thread

* Re: [PATCH 2/3] drm/scheduler: Don't call wait_event_killable for signaled process.
  2018-04-24 21:40           ` Daniel Vetter
@ 2018-04-25 13:43             ` Andrey Grodzovsky
  -1 siblings, 0 replies; 122+ messages in thread
From: Andrey Grodzovsky @ 2018-04-25 13:43 UTC (permalink / raw)
  To: Michel Dänzer, linux-kernel, amd-gfx, dri-devel,
	David.Panariti, oleg, ebiederm, Alexander.Deucher, akpm,
	Christian.Koenig



On 04/24/2018 05:40 PM, Daniel Vetter wrote:
> On Tue, Apr 24, 2018 at 05:02:40PM -0400, Andrey Grodzovsky wrote:
>>
>> On 04/24/2018 03:44 PM, Daniel Vetter wrote:
>>> On Tue, Apr 24, 2018 at 05:46:52PM +0200, Michel Dänzer wrote:
>>>> Adding the dri-devel list, since this is driver independent code.
>>>>
>>>>
>>>> On 2018-04-24 05:30 PM, Andrey Grodzovsky wrote:
>>>>> Avoid calling wait_event_killable when you are possibly being called
>>>>> from get_signal routine since in that case you end up in a deadlock
>>>>> where you are alreay blocked in singla processing any trying to wait
>>>> Multiple typos here, "[...] already blocked in signal processing and [...]"?
>>>>
>>>>
>>>>> on a new signal.
>>>>>
>>>>> Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
>>>>> ---
>>>>>    drivers/gpu/drm/scheduler/gpu_scheduler.c | 5 +++--
>>>>>    1 file changed, 3 insertions(+), 2 deletions(-)
>>>>>
>>>>> diff --git a/drivers/gpu/drm/scheduler/gpu_scheduler.c b/drivers/gpu/drm/scheduler/gpu_scheduler.c
>>>>> index 088ff2b..09fd258 100644
>>>>> --- a/drivers/gpu/drm/scheduler/gpu_scheduler.c
>>>>> +++ b/drivers/gpu/drm/scheduler/gpu_scheduler.c
>>>>> @@ -227,9 +227,10 @@ void drm_sched_entity_do_release(struct drm_gpu_scheduler *sched,
>>>>>    		return;
>>>>>    	/**
>>>>>    	 * The client will not queue more IBs during this fini, consume existing
>>>>> -	 * queued IBs or discard them on SIGKILL
>>>>> +	 * queued IBs or discard them when in death signal state since
>>>>> +	 * wait_event_killable can't receive signals in that state.
>>>>>    	*/
>>>>> -	if ((current->flags & PF_SIGNALED) && current->exit_code == SIGKILL)
>>>>> +	if (current->flags & PF_SIGNALED)
>>> You want fatal_signal_pending() here, instead of inventing your own broken
>>> version.
>> I rely on current->flags & PF_SIGNALED because this being set from within
>> get_signal,
>> meaning I am within signal processing  in which case I want to avoid any
>> signal based wait for that task,
>>  From what i see in the code, task_struct.pending.signal is being set for
>> other threads in same
>> group (zap_other_threads) or for other scenarios, those task are still able
>> to receive signals
>> so calling wait_event_killable there will not have problem.
>>>>>    		entity->fini_status = -ERESTARTSYS;
>>>>>    	else
>>>>>    		entity->fini_status = wait_event_killable(sched->job_scheduled,
>>> But really this smells like a bug in wait_event_killable, since
>>> wait_event_interruptible does not suffer from the same bug. It will return
>>> immediately when there's a signal pending.
>> Even when wait_event_interruptible is called as following -
>> ...->do_signal->get_signal->....->wait_event_interruptible ?
>> I haven't tried it but wait_event_interruptible is very much alike to
>> wait_event_killable so I would assume it will also
>> not be interrupted if called like that. (Will give it a try just out of
>> curiosity anyway)
> wait_event_killabel doesn't check for fatal_signal_pending before calling
> schedule, so definitely has a nice race there.
>
> But if you're sure that you really need to check PF_SIGNALED, then I'm
> honestly not clear on what you're trying to pull off here. Your sparse
> explanation of what happens isn't enough, since I have no idea how you can
> get from get_signal() to the above wait_event_killable callsite.

A fatal signal will trigger process termination, during which all FDs are
released, including the DRM fd.

See here -

[<0>] drm_sched_entity_fini+0x10a/0x3a0 [gpu_sched]
[<0>] amdgpu_ctx_do_release+0x129/0x170 [amdgpu]
[<0>] amdgpu_ctx_mgr_fini+0xd5/0xe0 [amdgpu]
[<0>] amdgpu_driver_postclose_kms+0xcd/0x440 [amdgpu]
[<0>] drm_release+0x414/0x5b0 [drm]
[<0>] __fput+0x176/0x350
[<0>] task_work_run+0xa1/0xc0

(From Eric's explanation, the above is triggered by do_exit->exit_files)
...
[<0>] do_exit+0x48f/0x1280
[<0>] do_group_exit+0x89/0x140
[<0>] get_signal+0x375/0x8f0
[<0>] do_signal+0x79/0xaa0
[<0>] exit_to_usermode_loop+0x83/0xd0
[<0>] do_syscall_64+0x244/0x270
[<0>] entry_SYSCALL_64_after_hwframe+0x3d/0xa2

Andrey

> -Daniel
>
>> Andrey
>>
>>> I think this should be fixed in core code, not papered over in some
>>> subsystem.
>>> -Daniel
>>>
>>>> -- 
>>>> Earthling Michel Dänzer               |               http://www.amd.com
>>>> Libre software enthusiast             |             Mesa and X developer
>>>> _______________________________________________
>>>> dri-devel mailing list
>>>> dri-devel@lists.freedesktop.org
>>>> https://lists.freedesktop.org/mailman/listinfo/dri-devel
>> _______________________________________________
>> dri-devel mailing list
>> dri-devel@lists.freedesktop.org
>> https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 122+ messages in thread

* Re: [PATCH 2/3] drm/scheduler: Don't call wait_event_killable for signaled process.
@ 2018-04-25 13:43             ` Andrey Grodzovsky
  0 siblings, 0 replies; 122+ messages in thread
From: Andrey Grodzovsky @ 2018-04-25 13:43 UTC (permalink / raw)
  To: Michel Dänzer, linux-kernel, amd-gfx, dri-devel,
	David.Panariti, oleg, ebiederm, Alexander.Deucher, akpm,
	Christian.Koenig



On 04/24/2018 05:40 PM, Daniel Vetter wrote:
> On Tue, Apr 24, 2018 at 05:02:40PM -0400, Andrey Grodzovsky wrote:
>>
>> On 04/24/2018 03:44 PM, Daniel Vetter wrote:
>>> On Tue, Apr 24, 2018 at 05:46:52PM +0200, Michel Dänzer wrote:
>>>> Adding the dri-devel list, since this is driver independent code.
>>>>
>>>>
>>>> On 2018-04-24 05:30 PM, Andrey Grodzovsky wrote:
>>>>> Avoid calling wait_event_killable when you are possibly being called
>>>>> from get_signal routine since in that case you end up in a deadlock
>>>>> where you are alreay blocked in singla processing any trying to wait
>>>> Multiple typos here, "[...] already blocked in signal processing and [...]"?
>>>>
>>>>
>>>>> on a new signal.
>>>>>
>>>>> Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
>>>>> ---
>>>>>    drivers/gpu/drm/scheduler/gpu_scheduler.c | 5 +++--
>>>>>    1 file changed, 3 insertions(+), 2 deletions(-)
>>>>>
>>>>> diff --git a/drivers/gpu/drm/scheduler/gpu_scheduler.c b/drivers/gpu/drm/scheduler/gpu_scheduler.c
>>>>> index 088ff2b..09fd258 100644
>>>>> --- a/drivers/gpu/drm/scheduler/gpu_scheduler.c
>>>>> +++ b/drivers/gpu/drm/scheduler/gpu_scheduler.c
>>>>> @@ -227,9 +227,10 @@ void drm_sched_entity_do_release(struct drm_gpu_scheduler *sched,
>>>>>    		return;
>>>>>    	/**
>>>>>    	 * The client will not queue more IBs during this fini, consume existing
>>>>> -	 * queued IBs or discard them on SIGKILL
>>>>> +	 * queued IBs or discard them when in death signal state since
>>>>> +	 * wait_event_killable can't receive signals in that state.
>>>>>    	*/
>>>>> -	if ((current->flags & PF_SIGNALED) && current->exit_code == SIGKILL)
>>>>> +	if (current->flags & PF_SIGNALED)
>>> You want fatal_signal_pending() here, instead of inventing your own broken
>>> version.
>> I rely on current->flags & PF_SIGNALED because this being set from within
>> get_signal,
>> meaning I am within signal processing  in which case I want to avoid any
>> signal based wait for that task,
>>  From what i see in the code, task_struct.pending.signal is being set for
>> other threads in same
>> group (zap_other_threads) or for other scenarios, those task are still able
>> to receive signals
>> so calling wait_event_killable there will not have problem.
>>>>>    		entity->fini_status = -ERESTARTSYS;
>>>>>    	else
>>>>>    		entity->fini_status = wait_event_killable(sched->job_scheduled,
>>> But really this smells like a bug in wait_event_killable, since
>>> wait_event_interruptible does not suffer from the same bug. It will return
>>> immediately when there's a signal pending.
>> Even when wait_event_interruptible is called as following -
>> ...->do_signal->get_signal->....->wait_event_interruptible ?
>> I haven't tried it but wait_event_interruptible is very much alike to
>> wait_event_killable so I would assume it will also
>> not be interrupted if called like that. (Will give it a try just out of
>> curiosity anyway)
> wait_event_killabel doesn't check for fatal_signal_pending before calling
> schedule, so definitely has a nice race there.
>
> But if you're sure that you really need to check PF_SIGNALED, then I'm
> honestly not clear on what you're trying to pull off here. Your sparse
> explanation of what happens isn't enough, since I have no idea how you can
> get from get_signal() to the above wait_event_killable callsite.

A fatal signal will trigger process termination, during which all FDs are
released, including the DRM fd.

See here -

[<0>] drm_sched_entity_fini+0x10a/0x3a0 [gpu_sched]
[<0>] amdgpu_ctx_do_release+0x129/0x170 [amdgpu]
[<0>] amdgpu_ctx_mgr_fini+0xd5/0xe0 [amdgpu]
[<0>] amdgpu_driver_postclose_kms+0xcd/0x440 [amdgpu]
[<0>] drm_release+0x414/0x5b0 [drm]
[<0>] __fput+0x176/0x350
[<0>] task_work_run+0xa1/0xc0

(From Eric's explanation, the above is triggered by do_exit->exit_files)
...
[<0>] do_exit+0x48f/0x1280
[<0>] do_group_exit+0x89/0x140
[<0>] get_signal+0x375/0x8f0
[<0>] do_signal+0x79/0xaa0
[<0>] exit_to_usermode_loop+0x83/0xd0
[<0>] do_syscall_64+0x244/0x270
[<0>] entry_SYSCALL_64_after_hwframe+0x3d/0xa2

Andrey

> -Daniel
>
>> Andrey
>>
>>> I think this should be fixed in core code, not papered over in some
>>> subsystem.
>>> -Daniel
>>>
>>>> -- 
>>>> Earthling Michel Dänzer               |               http://www.amd.com
>>>> Libre software enthusiast             |             Mesa and X developer
>>>> _______________________________________________
>>>> dri-devel mailing list
>>>> dri-devel@lists.freedesktop.org
>>>> https://lists.freedesktop.org/mailman/listinfo/dri-devel
>> _______________________________________________
>> dri-devel mailing list
>> dri-devel@lists.freedesktop.org
>> https://lists.freedesktop.org/mailman/listinfo/dri-devel


^ permalink raw reply	[flat|nested] 122+ messages in thread

* Re: [PATCH 2/3] drm/scheduler: Don't call wait_event_killable for signaled process.
  2018-04-24 17:12       ` Eric W. Biederman
@ 2018-04-25 13:55         ` Oleg Nesterov
  2018-04-25 14:21             ` Andrey Grodzovsky
  0 siblings, 1 reply; 122+ messages in thread
From: Oleg Nesterov @ 2018-04-25 13:55 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: Andrey Grodzovsky, linux-kernel, amd-gfx, Alexander.Deucher,
	Christian.Koenig, David.Panariti, akpm

On 04/24, Eric W. Biederman wrote:
>
> Let me respectfully suggest that the wait_event_killable on that code
> path is wrong.

I tend to agree even if I don't know this code.

But if it can be called from f_op->release() then any usage of "current" or
signals looks suspicious, simply because "current" can be a completely
irrelevant task which does the last fput(), say, cat /proc/pid/fdinfo/... or
even a kernel thread.

Oleg.

^ permalink raw reply	[flat|nested] 122+ messages in thread

* Re: [PATCH 2/3] drm/scheduler: Don't call wait_event_killable for signaled process.
  2018-04-25 13:36             ` Daniel Vetter
@ 2018-04-25 14:18                 ` Oleg Nesterov
  0 siblings, 0 replies; 122+ messages in thread
From: Oleg Nesterov @ 2018-04-25 14:18 UTC (permalink / raw)
  To: Daniel Vetter
  Cc: Andrey Grodzovsky, Michel Dänzer, Linux Kernel Mailing List,
	amd-gfx list, dri-devel, David.Panariti, Eric Biederman,
	Alex Deucher, Andrew Morton, Christian König

On 04/25, Daniel Vetter wrote:
>
> On Wed, Apr 25, 2018 at 3:22 PM, Oleg Nesterov <oleg@redhat.com> wrote:
> > On 04/24, Daniel Vetter wrote:
> >>
> >> wait_event_killabel doesn't check for fatal_signal_pending before calling
> >> schedule, so definitely has a nice race there.
> >
> > This is fine. See the signal_pending_state() check in __schedule().
> >
> > And this doesn't differ from wait_event_interruptible(), it too doesn't
> > check signal_pending(), we rely on schedule() which must not block if the
> > caller is signalled/killed.
> >
> > The problem is that it is not clear what should fatal_signal_pending() or
> > even signal_pending() mean after exit_signals().
>
> Uh, I was totally thrown off in all the wait_event* macros and somehow
> landed in the _locked variants, which all need to recheck before they
> drop the lock, for efficiency reasons. See do_wait_intr().

Just in case, note that do_wait_intr() has to check signal_pending()
for a completely different reason. We need to return a non-zero code to
stop the main loop in __wait_event_interruptible_locked(); unlike
___wait_event(), it doesn't check signal_pending() itself.
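
As a sketch of the shape being described (condensed from memory, not a
verbatim copy of kernel/sched/wait.c):

	/* Condensed sketch of do_wait_intr(): the non-zero return on a
	 * pending signal is what stops the caller's wait loop. */
	int do_wait_intr_sketch(wait_queue_head_t *wq, wait_queue_entry_t *wait)
	{
		if (list_empty(&wait->entry))
			__add_wait_queue_entry_tail(wq, wait);

		set_current_state(TASK_INTERRUPTIBLE);
		if (signal_pending(current))
			return -ERESTARTSYS;

		spin_unlock(&wq->lock);	/* wq->lock is held by the _locked caller */
		schedule();
		spin_lock(&wq->lock);

		return 0;
	}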

Oleg.

^ permalink raw reply	[flat|nested] 122+ messages in thread

* Re: [PATCH 2/3] drm/scheduler: Don't call wait_event_killable for signaled process.
@ 2018-04-25 14:21             ` Andrey Grodzovsky
  0 siblings, 0 replies; 122+ messages in thread
From: Andrey Grodzovsky @ 2018-04-25 14:21 UTC (permalink / raw)
  To: Oleg Nesterov, Eric W. Biederman
  Cc: linux-kernel, amd-gfx, Alexander.Deucher, Christian.Koenig,
	David.Panariti, akpm



On 04/25/2018 09:55 AM, Oleg Nesterov wrote:
> On 04/24, Eric W. Biederman wrote:
>> Let me respectfully suggest that the wait_event_killable on that code
>> path is wrong.
> I tend to agree even if I don't know this code.
>
> But if it can be called from f_op->release() then any usage of "current" or
> signals looks suspicious. Simply because "current" can be completely irrelevant
> task which does the last fput(), say, cat /proc/pid/fdinfo/... or even a kernel
> thread.
>
> Oleg.

So what you are saying is that switching to current->PF_EXITING as an
indication of why I am here (drm_sched_entity_fini) is also a bad idea,
but we still want to be able to exit immediately and not wait for GPU
job completion when the reason for reaching this code is a KILL signal
to the user process which opened the device file. With termination from
fput that seems impossible...
But thinking more about it, any task which still references this file,
puts down the reference and is not exiting due to SIGKILL will just have
to go through the slow path - wait for job completion on the GPU (with
some timeout).
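
Roughly, that decision could look like the sketch below; the discard
helper is hypothetical, and the slow-path wait mirrors the existing one
in drm_sched_entity_do_release() as I recall it:

	/* Sketch only: fast path when the task tearing down the entity is
	 * exiting or being SIGKILLed, slow path (the existing killable
	 * wait) for any other task doing the last fput(). */
	if ((current->flags & PF_EXITING) || fatal_signal_pending(current))
		drm_sched_entity_discard_jobs(entity);	/* hypothetical helper */
	else
		wait_event_killable(sched->job_scheduled,
				    drm_sched_entity_is_idle(entity));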

Andrey

>

^ permalink raw reply	[flat|nested] 122+ messages in thread

* Re: [PATCH 2/3] drm/scheduler: Don't call wait_event_killable for signaled process.
  2018-04-25 13:08                 ` Andrey Grodzovsky
  (?)
@ 2018-04-25 15:29                 ` Eric W. Biederman
  2018-04-25 16:13                   ` Andrey Grodzovsky
  -1 siblings, 1 reply; 122+ messages in thread
From: Eric W. Biederman @ 2018-04-25 15:29 UTC (permalink / raw)
  To: Andrey Grodzovsky
  Cc: David.Panariti, Michel Dänzer, linux-kernel, dri-devel,
	oleg, amd-gfx, Alexander.Deucher, akpm, Christian.Koenig

Andrey Grodzovsky <Andrey.Grodzovsky@amd.com> writes:

> On 04/25/2018 03:14 AM, Daniel Vetter wrote:
>> On Tue, Apr 24, 2018 at 05:37:08PM -0400, Andrey Grodzovsky wrote:
>>>
>>> On 04/24/2018 05:21 PM, Eric W. Biederman wrote:
>>>> Andrey Grodzovsky <Andrey.Grodzovsky@amd.com> writes:
>>>>
>>>>> On 04/24/2018 03:44 PM, Daniel Vetter wrote:
>>>>>> On Tue, Apr 24, 2018 at 05:46:52PM +0200, Michel Dänzer wrote:
>>>>>>> Adding the dri-devel list, since this is driver independent code.
>>>>>>>
>>>>>>>
>>>>>>> On 2018-04-24 05:30 PM, Andrey Grodzovsky wrote:
>>>>>>>> Avoid calling wait_event_killable when you are possibly being called
>>>>>>>> from get_signal routine since in that case you end up in a deadlock
>>>>>>>> where you are alreay blocked in singla processing any trying to wait
>>>>>>> Multiple typos here, "[...] already blocked in signal processing and [...]"?
>>>>>>>
>>>>>>>
>>>>>>>> on a new signal.
>>>>>>>>
>>>>>>>> Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
>>>>>>>> ---
>>>>>>>>     drivers/gpu/drm/scheduler/gpu_scheduler.c | 5 +++--
>>>>>>>>     1 file changed, 3 insertions(+), 2 deletions(-)
>>>>>>>>
>>>>>>>> diff --git a/drivers/gpu/drm/scheduler/gpu_scheduler.c b/drivers/gpu/drm/scheduler/gpu_scheduler.c
>>>>>>>> index 088ff2b..09fd258 100644
>>>>>>>> --- a/drivers/gpu/drm/scheduler/gpu_scheduler.c
>>>>>>>> +++ b/drivers/gpu/drm/scheduler/gpu_scheduler.c
>>>>>>>> @@ -227,9 +227,10 @@ void drm_sched_entity_do_release(struct drm_gpu_scheduler *sched,
>>>>>>>>     		return;
>>>>>>>>     	/**
>>>>>>>>     	 * The client will not queue more IBs during this fini, consume existing
>>>>>>>> -	 * queued IBs or discard them on SIGKILL
>>>>>>>> +	 * queued IBs or discard them when in death signal state since
>>>>>>>> +	 * wait_event_killable can't receive signals in that state.
>>>>>>>>     	*/
>>>>>>>> -	if ((current->flags & PF_SIGNALED) && current->exit_code == SIGKILL)
>>>>>>>> +	if (current->flags & PF_SIGNALED)
>>>>>> You want fatal_signal_pending() here, instead of inventing your own broken
>>>>>> version.
>>>>> I rely on current->flags & PF_SIGNALED because this being set from
>>>>> within get_signal,
>>>> It doesn't mean that.  Unless you are called by do_coredump (you
>>>> aren't).
>>> Looking in latest code here
>>> https://elixir.bootlin.com/linux/v4.17-rc2/source/kernel/signal.c#L2449
>>> i see that current->flags |= PF_SIGNALED; is out side of
>>> if (sig_kernel_coredump(signr)) {...} scope
>> Ok I read some more about this, and I guess you go through process exit
>> and then eventually close. But I'm not sure.
>>
>> The code in drm_sched_entity_fini also looks strange: You unpark the
>> scheduler thread before you remove all the IBs. At least from the comment
>> that doesn't sound like what you want to do.
>
> I think it should be safe for the dying scheduler entity since before that (in
> drm_sched_entity_do_release) we set it's runqueue to NULL
> so no new jobs will be dequeued form it by the scheduler thread.
>
>>
>> But in general, PF_SIGNALED is really something deeply internal to the
>> core (used for some book-keeping and accounting). The drm scheduler is the
>> only thing looking at it, so smells like a layering violation. I suspect
>> (but without knowing what you're actually trying to achive here can't be
>> sure) you want to look at something else.
>>
>> E.g. PF_EXITING seems to be used in a lot more places to cancel stuff
>> that's no longer relevant when a task exits, not PF_SIGNALED. There's the
>> TIF_MEMDIE flag if you're hacking around issues with the oom-killer.
>>
>> This here on the other hand looks really fragile, and probably only does
>> what you want to do by accident.
>> -Daniel
>
> Yes , that what Eric also said and in the V2 patches i will try  to change
> PF_EXITING
>
> Another issue is changing wait_event_killable to wait_event_timeout where I need
> to understand
> what TO value is acceptable for all the drivers using the scheduler, or maybe it
> should come as a property
> of drm_sched_entity.

It would not surprise me if you could pick a large value like 1 second
and issue a warning if that timeout ever triggers.  It sounds like the
condition where we wait indefinitely today is because something went
wrong in the driver.
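
Concretely, that suggestion could look something like the sketch below
(the 1 second value is illustrative, and the wait condition is the one
the scheduler already uses as far as I recall, not part of this series):

	/* Sketch of "large timeout plus a warning". */
	if (!wait_event_timeout(sched->job_scheduled,
				drm_sched_entity_is_idle(entity),
				msecs_to_jiffies(1000)))
		DRM_ERROR("entity cleanup timed out, the driver is likely stuck\n");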

Eric

^ permalink raw reply	[flat|nested] 122+ messages in thread

* Re: [PATCH 2/3] drm/scheduler: Don't call wait_event_killable for signaled process.
  2018-04-25 15:29                 ` Eric W. Biederman
@ 2018-04-25 16:13                   ` Andrey Grodzovsky
  2018-04-25 16:31                     ` Eric W. Biederman
  0 siblings, 1 reply; 122+ messages in thread
From: Andrey Grodzovsky @ 2018-04-25 16:13 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: David.Panariti, Michel Dänzer, linux-kernel, amd-gfx, oleg,
	dri-devel, Alexander.Deucher, akpm, Christian.Koenig


[-- Attachment #1.1: Type: text/plain, Size: 871 bytes --]



On 04/25/2018 11:29 AM, Eric W. Biederman wrote:
>> Another issue is changing wait_event_killable to wait_event_timeout where I need
>> to understand
>> what TO value is acceptable for all the drivers using the scheduler, or maybe it
>> should come as a property
>> of drm_sched_entity.
> It would not surprise me if you could pick a large value like 1 second
> and issue a warning if that time outever triggers.  It sounds like the
> condition where we wait indefinitely today is because something went
> wrong in the driver.

We wait here for all GPU jobs in flight which belong to the dying
entity to complete. The driver submits the GPU jobs, but the content of
a job is not under the driver's control and could take a long time to
finish or even hang (e.g. a graphics or compute shader); I guess that is
why the wait was originally made indefinite.

Andrey

>
> Eric


[-- Attachment #1.2: Type: text/html, Size: 1486 bytes --]

[-- Attachment #2: Type: text/plain, Size: 160 bytes --]

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 122+ messages in thread

* Re: [PATCH 2/3] drm/scheduler: Don't call wait_event_killable for signaled process.
  2018-04-25 16:13                   ` Andrey Grodzovsky
@ 2018-04-25 16:31                     ` Eric W. Biederman
  0 siblings, 0 replies; 122+ messages in thread
From: Eric W. Biederman @ 2018-04-25 16:31 UTC (permalink / raw)
  To: Andrey Grodzovsky
  Cc: David.Panariti, Michel Dänzer, linux-kernel, dri-devel,
	oleg, amd-gfx, Alexander.Deucher, akpm, Christian.Koenig

Andrey Grodzovsky <Andrey.Grodzovsky@amd.com> writes:

> On 04/25/2018 11:29 AM, Eric W. Biederman wrote:
>
>>  Another issue is changing wait_event_killable to wait_event_timeout where I need
>> to understand
>> what TO value is acceptable for all the drivers using the scheduler, or maybe it
>> should come as a property
>> of drm_sched_entity.
>>
>> It would not surprise me if you could pick a large value like 1 second
>> and issue a warning if that time outever triggers.  It sounds like the
>> condition where we wait indefinitely today is because something went
>> wrong in the driver.
>
> We wait here for all GPU jobs in flight which belong to the dying entity to complete. The driver submits
> the GPU jobs but the content of the job might be is not under driver's control and could take 
> long time to finish or even hang (e.g. graphic or compute shader) , I
> guess that why originally the wait is indefinite.


I am ignorant of what user space expects or what the semantics of the
subsystem are here, so I might be completely off base.  But this
wait-for-a-long-time behavior is something I would expect much more from
an f_op->flush or f_op->fsync method.

fsync, so it could be obtained without closing the file descriptor.
flush, so that you could get a return value out to close.

But I honestly don't know semantically what your userspace applications
expect and/or require, so I can only say that those are weird semantics.

Eric

^ permalink raw reply	[flat|nested] 122+ messages in thread

* Re: [PATCH 3/3] drm/amdgpu: Switch to interrupted wait to recover from ring hang.
@ 2018-04-25 17:17             ` Andrey Grodzovsky
  0 siblings, 0 replies; 122+ messages in thread
From: Andrey Grodzovsky @ 2018-04-25 17:17 UTC (permalink / raw)
  To: Eric W. Biederman, Panariti, David
  Cc: linux-kernel, amd-gfx, Deucher, Alexander, Koenig, Christian, oleg, akpm



On 04/24/2018 12:30 PM, Eric W. Biederman wrote:
> "Panariti, David" <David.Panariti@amd.com> writes:
>
>> Andrey Grodzovsky <andrey.grodzovsky@amd.com> writes:
>>> Kind of dma_fence_wait_killable, except that we don't have such API
>>> (maybe worth adding ?)
>> Depends on how many places it would be called, or think it might be called.  Can always factor on the 2nd time it's needed.
>> Factoring, IMO, rarely hurts.  The factored function can easily be visited using `M-.' ;->
>>
>> Also, if the wait could be very long, would a log message, something like "xxx has run for Y seconds."  help?
>> I personally hate hanging w/no info.
> Ugh.  This loop appears susceptible to loosing wake ups.  There are
> races between when a wake-up happens, when we clear the sleeping state,
> and when we test the stat to see if we should stat awake.  So yes
> implementing a dma_fence_wait_killable that handles of all that
> correctly sounds like an very good idea.

I am not clear here - could you be more specific about what races would
happen here? More below.
>
> Eric
>
>
>>> If the ring is hanging for some reason allow to recover the waiting by sending fatal signal.
>>>
>>> Originally-by: David Panariti <David.Panariti@amd.com>
>>> Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
>>> ---
>>>    drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c | 14 ++++++++++----
>>>    1 file changed, 10 insertions(+), 4 deletions(-)
>>>
>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c
>>> index eb80edf..37a36af 100644
>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c
>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c
>>> @@ -421,10 +421,16 @@ int amdgpu_ctx_wait_prev_fence(struct amdgpu_ctx *ctx, unsigned ring_id)
>>>
>>>        if (other) {
>>>                signed long r;
>>> -             r = dma_fence_wait_timeout(other, false, MAX_SCHEDULE_TIMEOUT);
>>> -             if (r < 0) {
>>> -                     DRM_ERROR("Error (%ld) waiting for fence!\n", r);
>>> -                     return r;
>>> +
>>> +             while (true) {
>>> +                     if ((r = dma_fence_wait_timeout(other, true,
>>> +                                     MAX_SCHEDULE_TIMEOUT)) >= 0)
>>> +                             return 0;
>>> +

Do you mean that by the time I reach here some other thread from my
thread group might already have dequeued SIGKILL, since it's a shared
signal, and hence fatal_signal_pending will return false? Or are you
talking about the dma_fence_wait_timeout implementation in
dma_fence_default_wait with schedule_timeout?

Andrey

>>> +                     if (fatal_signal_pending(current)) {
>>> +                             DRM_ERROR("Error (%ld) waiting for fence!\n", r);
>>> +                             return r;
>>> +                     }
>>>                }
>>>        }
>>>
>>> --
>>> 2.7.4
>>>
> Eric

^ permalink raw reply	[flat|nested] 122+ messages in thread

* Re: [PATCH 2/3] drm/scheduler: Don't call wait_event_killable for signaled process.
  2018-04-25 14:21             ` Andrey Grodzovsky
  (?)
@ 2018-04-25 17:17             ` Oleg Nesterov
  2018-04-25 18:40                 ` Andrey Grodzovsky
  -1 siblings, 1 reply; 122+ messages in thread
From: Oleg Nesterov @ 2018-04-25 17:17 UTC (permalink / raw)
  To: Andrey Grodzovsky
  Cc: Eric W. Biederman, linux-kernel, amd-gfx, Alexander.Deucher,
	Christian.Koenig, David.Panariti, akpm

On 04/25, Andrey Grodzovsky wrote:
>
> here (drm_sched_entity_fini) is also a bad idea, but we still want to be
> able to exit immediately
> and not wait for GPU jobs completion when the reason for reaching this code
> is because of KILL
> signal to the user process who opened the device file.

Can you hook f_op->flush method?

Oleg.

^ permalink raw reply	[flat|nested] 122+ messages in thread

* Re: [PATCH 2/3] drm/scheduler: Don't call wait_event_killable for signaled process.
@ 2018-04-25 18:40                 ` Andrey Grodzovsky
  0 siblings, 0 replies; 122+ messages in thread
From: Andrey Grodzovsky @ 2018-04-25 18:40 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: Eric W. Biederman, linux-kernel, amd-gfx, Alexander.Deucher,
	Christian.Koenig, David.Panariti, akpm



On 04/25/2018 01:17 PM, Oleg Nesterov wrote:
> On 04/25, Andrey Grodzovsky wrote:
>> here (drm_sched_entity_fini) is also a bad idea, but we still want to be
>> able to exit immediately
>> and not wait for GPU jobs completion when the reason for reaching this code
>> is because of KILL
>> signal to the user process who opened the device file.
> Can you hook f_op->flush method?

But this one is called for each task releasing a reference to the file,
so I am not sure I see how this solves the problem.

Andrey

>
> Oleg.
>

^ permalink raw reply	[flat|nested] 122+ messages in thread

* Re: [PATCH 3/3] drm/amdgpu: Switch to interrupted wait to recover from ring hang.
  2018-04-25 17:17             ` Andrey Grodzovsky
@ 2018-04-25 20:55               ` Eric W. Biederman
  -1 siblings, 0 replies; 122+ messages in thread
From: Eric W. Biederman @ 2018-04-25 20:55 UTC (permalink / raw)
  To: Andrey Grodzovsky
  Cc: Panariti, David, linux-kernel, amd-gfx, Deucher, Alexander,
	Koenig, Christian, oleg, akpm

Andrey Grodzovsky <Andrey.Grodzovsky@amd.com> writes:

> On 04/24/2018 12:30 PM, Eric W. Biederman wrote:
>> "Panariti, David" <David.Panariti@amd.com> writes:
>>
>>> Andrey Grodzovsky <andrey.grodzovsky@amd.com> writes:
>>>> Kind of dma_fence_wait_killable, except that we don't have such API
>>>> (maybe worth adding ?)
>>> Depends on how many places it would be called, or think it might be called.  Can always factor on the 2nd time it's needed.
>>> Factoring, IMO, rarely hurts.  The factored function can easily be visited using `M-.' ;->
>>>
>>> Also, if the wait could be very long, would a log message, something like "xxx has run for Y seconds."  help?
>>> I personally hate hanging w/no info.
>> Ugh.  This loop appears susceptible to loosing wake ups.  There are
>> races between when a wake-up happens, when we clear the sleeping state,
>> and when we test the stat to see if we should stat awake.  So yes
>> implementing a dma_fence_wait_killable that handles of all that
>> correctly sounds like an very good idea.
>
> I am not clear here - could you be more specific about what races will happen
> here, more bellow
>>
>> Eric
>>
>>
>>>> If the ring is hanging for some reason allow to recover the waiting by sending fatal signal.
>>>>
>>>> Originally-by: David Panariti <David.Panariti@amd.com>
>>>> Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
>>>> ---
>>>>    drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c | 14 ++++++++++----
>>>>    1 file changed, 10 insertions(+), 4 deletions(-)
>>>>
>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c
>>>> index eb80edf..37a36af 100644
>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c
>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c
>>>> @@ -421,10 +421,16 @@ int amdgpu_ctx_wait_prev_fence(struct amdgpu_ctx *ctx, unsigned ring_id)
>>>>
>>>>        if (other) {
>>>>                signed long r;
>>>> -             r = dma_fence_wait_timeout(other, false, MAX_SCHEDULE_TIMEOUT);
>>>> -             if (r < 0) {
>>>> -                     DRM_ERROR("Error (%ld) waiting for fence!\n", r);
>>>> -                     return r;
>>>> +
>>>> +             while (true) {
>>>> +                     if ((r = dma_fence_wait_timeout(other, true,
>>>> +                                     MAX_SCHEDULE_TIMEOUT)) >= 0)
>>>> +                             return 0;
>>>> +
>
> Do you mean that by the time I reach here some other thread from my group
> already might dequeued SIGKILL since it's a shared signal and hence
> fatal_signal_pending will return false ? Or are you talking about the
> dma_fence_wait_timeout implementation in dma_fence_default_wait with
> schedule_timeout ?

Given Oleg's earlier comment about the scheduler having special cases
for signals I might be wrong.  But in general there is a pattern:

	for (;;) {
		set_current_state(TASK_UNINTERRUPTIBLE);
		if (loop_is_done())
			break;
		schedule();
	}
        set_current_state(TASK_RUNNING);

If you violate that pattern by testing for a condition without having
first set your task state to TASK_UNINTERRUPTIBLE (or whatever your
sleep state is), then it is possible to miss the wake-up that arrives
between testing the condition and going to sleep.

Thus I am quite concerned that there is a subtle corner case where
you can miss a wakeup and not retest fatal_signal_pending().

Given that there is a timeout, the worst case might have you sleep for
MAX_SCHEDULE_TIMEOUT instead of indefinitely.

Without a comment explaining why this is safe, or having the
fatal_signal_pending check integrated into dma_fence_wait_timeout, I am
not comfortable with this loop.
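
For illustration, a killable variant of that pattern could look like the
sketch below; 'done' and its waker (which would call wake_up_process()
after setting the flag) are hypothetical, and this is not the proposed
dma-fence change itself:

	#include <linux/errno.h>
	#include <linux/sched.h>
	#include <linux/sched/signal.h>

	/* Sketch: the same pattern, but with a killable sleep. */
	static int wait_for_done_killable(bool *done)
	{
		for (;;) {
			/* Publish the sleep state before testing the condition
			 * so a wake-up between the test and schedule() is not
			 * lost. */
			set_current_state(TASK_KILLABLE);
			if (READ_ONCE(*done))
				break;
			/* SIGKILL is either seen here or makes schedule()
			 * return immediately via signal_pending_state(). */
			if (fatal_signal_pending(current))
				break;
			schedule();
		}
		__set_current_state(TASK_RUNNING);

		return READ_ONCE(*done) ? 0 : -ERESTARTSYS;
	}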

Eric


>>>> +                     if (fatal_signal_pending(current)) {
>>>> +                             DRM_ERROR("Error (%ld) waiting for fence!\n", r);
>>>> +                             return r;
>>>> +                     }
>>>>                }
>>>>        }
>>>>
>>>> --
>>>> 2.7.4
>>>>
>> Eric

^ permalink raw reply	[flat|nested] 122+ messages in thread

* Re: [PATCH 2/3] drm/scheduler: Don't call wait_event_killable for signaled process.
  2018-04-25 18:40                 ` Andrey Grodzovsky
  (?)
@ 2018-04-26  0:01                 ` Eric W. Biederman
  2018-04-26 12:34                     ` Andrey Grodzovsky
  2018-04-30 12:08                     ` Christian König
  -1 siblings, 2 replies; 122+ messages in thread
From: Eric W. Biederman @ 2018-04-26  0:01 UTC (permalink / raw)
  To: Andrey Grodzovsky
  Cc: Oleg Nesterov, linux-kernel, amd-gfx, Alexander.Deucher,
	Christian.Koenig, David.Panariti, akpm

Andrey Grodzovsky <Andrey.Grodzovsky@amd.com> writes:

> On 04/25/2018 01:17 PM, Oleg Nesterov wrote:
>> On 04/25, Andrey Grodzovsky wrote:
>>> here (drm_sched_entity_fini) is also a bad idea, but we still want to be
>>> able to exit immediately
>>> and not wait for GPU jobs completion when the reason for reaching this code
>>> is because of KILL
>>> signal to the user process who opened the device file.
>> Can you hook f_op->flush method?
>
> But this one is called for each task releasing a reference to the the file, so
> not sure I see how this solves the problem.

The big question is why do you need to wait during the final closing of
a file?

The wait can be terminated, so the wait does not appear to be simply a
matter of correctness.

Eric

^ permalink raw reply	[flat|nested] 122+ messages in thread

* Re: [PATCH 3/3] drm/amdgpu: Switch to interrupted wait to recover from ring hang.
@ 2018-04-26 12:28                 ` Andrey Grodzovsky
  0 siblings, 0 replies; 122+ messages in thread
From: Andrey Grodzovsky @ 2018-04-26 12:28 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: Panariti, David, linux-kernel, amd-gfx, Deucher, Alexander,
	Koenig, Christian, oleg, akpm



On 04/25/2018 04:55 PM, Eric W. Biederman wrote:
> Andrey Grodzovsky <Andrey.Grodzovsky@amd.com> writes:
>
>> On 04/24/2018 12:30 PM, Eric W. Biederman wrote:
>>> "Panariti, David" <David.Panariti@amd.com> writes:
>>>
>>>> Andrey Grodzovsky <andrey.grodzovsky@amd.com> writes:
>>>>> Kind of dma_fence_wait_killable, except that we don't have such API
>>>>> (maybe worth adding ?)
>>>> Depends on how many places it would be called, or think it might be called.  Can always factor on the 2nd time it's needed.
>>>> Factoring, IMO, rarely hurts.  The factored function can easily be visited using `M-.' ;->
>>>>
>>>> Also, if the wait could be very long, would a log message, something like "xxx has run for Y seconds."  help?
>>>> I personally hate hanging w/no info.
>>> Ugh.  This loop appears susceptible to loosing wake ups.  There are
>>> races between when a wake-up happens, when we clear the sleeping state,
>>> and when we test the stat to see if we should stat awake.  So yes
>>> implementing a dma_fence_wait_killable that handles of all that
>>> correctly sounds like an very good idea.
>> I am not clear here - could you be more specific about what races will happen
>> here, more bellow
>>> Eric
>>>
>>>
>>>>> If the ring is hanging for some reason allow to recover the waiting by sending fatal signal.
>>>>>
>>>>> Originally-by: David Panariti <David.Panariti@amd.com>
>>>>> Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
>>>>> ---
>>>>>     drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c | 14 ++++++++++----
>>>>>     1 file changed, 10 insertions(+), 4 deletions(-)
>>>>>
>>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c
>>>>> index eb80edf..37a36af 100644
>>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c
>>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c
>>>>> @@ -421,10 +421,16 @@ int amdgpu_ctx_wait_prev_fence(struct amdgpu_ctx *ctx, unsigned ring_id)
>>>>>
>>>>>         if (other) {
>>>>>                 signed long r;
>>>>> -             r = dma_fence_wait_timeout(other, false, MAX_SCHEDULE_TIMEOUT);
>>>>> -             if (r < 0) {
>>>>> -                     DRM_ERROR("Error (%ld) waiting for fence!\n", r);
>>>>> -                     return r;
>>>>> +
>>>>> +             while (true) {
>>>>> +                     if ((r = dma_fence_wait_timeout(other, true,
>>>>> +                                     MAX_SCHEDULE_TIMEOUT)) >= 0)
>>>>> +                             return 0;
>>>>> +
>> Do you mean that by the time I reach here some other thread from my group
>> already might dequeued SIGKILL since it's a shared signal and hence
>> fatal_signal_pending will return false ? Or are you talking about the
>> dma_fence_wait_timeout implementation in dma_fence_default_wait with
>> schedule_timeout ?
> Given Oleg's earlier comment about the scheduler having special cases
> for signals I might be wrong.  But in general there is a pattern:
>
> 	for (;;) {
> 		set_current_state(TASK_UNINTERRUPTIBLE);
> 		if (loop_is_done())
> 			break;
> 		schedule();
> 	}
>          set_current_state(TASK_RUNNING);
>
> If you violate that pattern by testing for a condition without
> having first set your task as TASK_UNINTERRUPTIBLE (or whatever your
> sleep state is).  Then it is possible to miss a wake-up that
> tests the condidtion.
>
> Thus I am quite concerned that there is a subtle corner case where
> you can miss a wakeup and not retest fatal_signal_pending().


I see the general problem now. In this particular case
dma_fence_default_wait and the caller of wake_up_state use a lock to
protect both wake-up delivery and the wakeup condition, and
dma_fence_default_wait also retests the wakeup condition on entry.
But obviously it's bad practice to rely on an API's internal
implementation for such assumptions in client code.
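
For reference, a heavily simplified sketch (from memory, not the
verbatim source) of the shape being referred to: both the signaled test
and the task state change happen under fence->lock, which the signaling
side also takes, so the wake-up cannot slip in between them:

	spin_lock_irqsave(fence->lock, flags);
	while (!test_bit(DMA_FENCE_FLAG_SIGNALED_BIT, &fence->flags) && ret > 0) {
		__set_current_state(intr ? TASK_INTERRUPTIBLE
					 : TASK_UNINTERRUPTIBLE);
		spin_unlock_irqrestore(fence->lock, flags);

		ret = schedule_timeout(ret);

		spin_lock_irqsave(fence->lock, flags);
		if (ret > 0 && intr && signal_pending(current))
			ret = -ERESTARTSYS;
	}
	spin_unlock_irqrestore(fence->lock, flags);
	__set_current_state(TASK_RUNNING);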

>
> Given that there is is a timeout the worst case might have you sleep
> MAX_SCHEDULE_TIMEOUT instead of indefinitely.

It actually means never waking up.

>
> Without a comment why this is safe, or having fatal_signal_pending
> check integrated into dma_fence_wait_timeout I am not comfortable
> with this loop.

Agreed, the fatal_signal_pending check should be part of the wait function.

Andrey

>
> Eric
>
>
>>>>> +                     if (fatal_signal_pending(current)) {
>>>>> +                             DRM_ERROR("Error (%ld) waiting for fence!\n", r);
>>>>> +                             return r;
>>>>> +                     }
>>>>>                 }
>>>>>         }
>>>>>
>>>>> --
>>>>> 2.7.4
>>>>>
>>> Eric

^ permalink raw reply	[flat|nested] 122+ messages in thread

* Re: [PATCH 2/3] drm/scheduler: Don't call wait_event_killable for signaled process.
@ 2018-04-26 12:34                     ` Andrey Grodzovsky
  0 siblings, 0 replies; 122+ messages in thread
From: Andrey Grodzovsky @ 2018-04-26 12:34 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: Oleg Nesterov, linux-kernel, amd-gfx, Alexander.Deucher,
	Christian.Koenig, David.Panariti, akpm



On 04/25/2018 08:01 PM, Eric W. Biederman wrote:
> Andrey Grodzovsky <Andrey.Grodzovsky@amd.com> writes:
>
>> On 04/25/2018 01:17 PM, Oleg Nesterov wrote:
>>> On 04/25, Andrey Grodzovsky wrote:
>>>> here (drm_sched_entity_fini) is also a bad idea, but we still want to be
>>>> able to exit immediately
>>>> and not wait for GPU jobs completion when the reason for reaching this code
>>>> is because of KILL
>>>> signal to the user process who opened the device file.
>>> Can you hook f_op->flush method?
>> But this one is called for each task releasing a reference to the the file, so
>> not sure I see how this solves the problem.
> The big question is why do you need to wait during the final closing a
> file?
>
> The wait can be terminated so the wait does not appear to be simply a
> matter of correctness.

Well, as I understand it, it just means that you don't want to abruptly
terminate GPU work in progress without a good reason (such as a KILL
signal). When we exit we are going to release various resources the GPU
is still using, so we either wait for the work to complete or terminate
the remaining jobs.

Andrey

>
> Eric

^ permalink raw reply	[flat|nested] 122+ messages in thread

* Re: [PATCH 2/3] drm/scheduler: Don't call wait_event_killable for signaled process.
@ 2018-04-26 12:52                       ` Andrey Grodzovsky
  0 siblings, 0 replies; 122+ messages in thread
From: Andrey Grodzovsky @ 2018-04-26 12:52 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: Oleg Nesterov, linux-kernel, amd-gfx, Alexander.Deucher,
	Christian.Koenig, David.Panariti, akpm



On 04/26/2018 08:34 AM, Andrey Grodzovsky wrote:
>
>
> On 04/25/2018 08:01 PM, Eric W. Biederman wrote:
>> Andrey Grodzovsky <Andrey.Grodzovsky@amd.com> writes:
>>
>>> On 04/25/2018 01:17 PM, Oleg Nesterov wrote:
>>>> On 04/25, Andrey Grodzovsky wrote:
>>>>> here (drm_sched_entity_fini) is also a bad idea, but we still want 
>>>>> to be
>>>>> able to exit immediately
>>>>> and not wait for GPU jobs completion when the reason for reaching 
>>>>> this code
>>>>> is because of KILL
>>>>> signal to the user process who opened the device file.
>>>> Can you hook f_op->flush method?
>>> But this one is called for each task releasing a reference to the 
>>> the file, so
>>> not sure I see how this solves the problem.
>> The big question is why do you need to wait during the final closing a
>> file?
>>
>> The wait can be terminated so the wait does not appear to be simply a
>> matter of correctness.
>
> Well, as I understand it, it just means that you don't want to 
> abruptly terminate GPU work in progress without a good
> reason (such as KILL signal). When we exit we are going to release 
> various resources GPU is still using so we either
> wait for it to complete or terminate the remaining jobs.

Looked more into the code; a correction: drm_sched_entity_fini means
the SW job queue itself is about to die, so we must either wait for
completion or terminate any outstanding jobs that are still in the SW
queue. Anything already in flight in HW will still complete.
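
The "terminate what is still in the SW queue" half could look roughly
like this sketch (job/fence handling simplified, from memory of the
scheduler code of that era, not taken from this series):

	/* Sketch: drain the entity's software queue instead of waiting;
	 * anything already on the HW ring is left to complete. */
	struct drm_sched_job *job;

	while ((job = to_drm_sched_job(spsc_queue_pop(&entity->job_queue)))) {
		struct drm_sched_fence *s_fence = job->s_fence;

		/* Flag the job as never executed so anyone waiting on its
		 * fences is released, then let the driver free it. */
		drm_sched_fence_scheduled(s_fence);
		dma_fence_set_error(&s_fence->finished, -ESRCH);
		drm_sched_fence_finished(s_fence);
		sched->ops->free_job(job);
	}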

Andrey

>
> Andrey
>
>>
>> Eric
>

^ permalink raw reply	[flat|nested] 122+ messages in thread

* Re: [PATCH 2/3] drm/scheduler: Don't call wait_event_killable for signaled process.
  2018-04-26 12:52                       ` Andrey Grodzovsky
  (?)
@ 2018-04-26 15:57                       ` Eric W. Biederman
  2018-04-26 20:43                           ` Andrey Grodzovsky
  -1 siblings, 1 reply; 122+ messages in thread
From: Eric W. Biederman @ 2018-04-26 15:57 UTC (permalink / raw)
  To: Andrey Grodzovsky
  Cc: Oleg Nesterov, linux-kernel, amd-gfx, Alexander.Deucher,
	Christian.Koenig, David.Panariti, akpm

Andrey Grodzovsky <Andrey.Grodzovsky@amd.com> writes:

> On 04/26/2018 08:34 AM, Andrey Grodzovsky wrote:
>>
>>
>> On 04/25/2018 08:01 PM, Eric W. Biederman wrote:
>>> Andrey Grodzovsky <Andrey.Grodzovsky@amd.com> writes:
>>>
>>>> On 04/25/2018 01:17 PM, Oleg Nesterov wrote:
>>>>> On 04/25, Andrey Grodzovsky wrote:
>>>>>> here (drm_sched_entity_fini) is also a bad idea, but we still want to be
>>>>>> able to exit immediately
>>>>>> and not wait for GPU jobs completion when the reason for reaching this
>>>>>> code
>>>>>> is because of KILL
>>>>>> signal to the user process who opened the device file.
>>>>> Can you hook f_op->flush method?
>>>> But this one is called for each task releasing a reference to the the file,
>>>> so
>>>> not sure I see how this solves the problem.
>>> The big question is why do you need to wait during the final closing a
>>> file?
>>>
>>> The wait can be terminated so the wait does not appear to be simply a
>>> matter of correctness.
>>
>> Well, as I understand it, it just means that you don't want to abruptly
>> terminate GPU work in progress without a good
>> reason (such as KILL signal). When we exit we are going to release various
>> resources GPU is still using so we either
>> wait for it to complete or terminate the remaining jobs.

At the point of do_exit you might as well have received a KILL signal,
however you got there.

> Looked more into code, some correction, drm_sched_entity_fini means the SW job
> queue itself is about to die, so we must
> either wait for completion or terminate any outstanding jobs that are still in
> the SW queue. Anything which already in flight in HW
> will still complete.

It sounds like we don't care if we block the process that had the file
descriptor open; this is just bookkeeping.  That allows having a piece
of code that cleans up resources when the GPU is done with the queue but
does not make userspace wait (option 1).

For it to make sense to let the process run, there has to be something
that cares about the results being completed.  If all of the file
descriptors are closed and the process is killed, I can't see who will
care that the software queue continues to be processed.  So it may be
reasonable to simply kill the queue (option 2).

If userspace really needs the wait it is probably better done in
f_op->flush so that every close of the file descriptor blocks
until the queue is flushed (option 3).
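
A sketch of what option 3 could look like on the amdgpu side;
amdgpu_flush() and the flush helper are hypothetical here, only the
f_op->flush signature and the drm_file/driver_priv plumbing are the
real VFS/DRM ones:

	/* Sketch only: every close() of the fd blocks here until the
	 * file's scheduler entities are drained, while f_op->release
	 * keeps doing the actual teardown. */
	static int amdgpu_flush(struct file *f, fl_owner_t id)
	{
		struct drm_file *file_priv = f->private_data;
		struct amdgpu_fpriv *fpriv = file_priv->driver_priv;

		/* hypothetical helper: wait (with a timeout) for, or cancel,
		 * the jobs queued through this file's contexts */
		amdgpu_ctx_mgr_entity_flush(&fpriv->ctx_mgr);

		return 0;
	}

	/* wired up in amdgpu's file_operations:
	 *	.flush = amdgpu_flush,
	 */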

Do you know if userspace cares about the gpu operations completing?

My skim of the code suggests that nothing actually cares about those
operations, but I really don't know the gpu well.

Eric

^ permalink raw reply	[flat|nested] 122+ messages in thread

* Re: [PATCH 2/3] drm/scheduler: Don't call wait_event_killable for signaled process.
@ 2018-04-26 20:43                           ` Andrey Grodzovsky
  0 siblings, 0 replies; 122+ messages in thread
From: Andrey Grodzovsky @ 2018-04-26 20:43 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: Oleg Nesterov, linux-kernel, amd-gfx, Alexander.Deucher,
	Christian.Koenig, David.Panariti, akpm



On 04/26/2018 11:57 AM, Eric W. Biederman wrote:
> Andrey Grodzovsky <Andrey.Grodzovsky@amd.com> writes:
>
>> On 04/26/2018 08:34 AM, Andrey Grodzovsky wrote:
>>>
>>> On 04/25/2018 08:01 PM, Eric W. Biederman wrote:
>>>> Andrey Grodzovsky <Andrey.Grodzovsky@amd.com> writes:
>>>>
>>>>> On 04/25/2018 01:17 PM, Oleg Nesterov wrote:
>>>>>> On 04/25, Andrey Grodzovsky wrote:
>>>>>>> here (drm_sched_entity_fini) is also a bad idea, but we still want to be
>>>>>>> able to exit immediately
>>>>>>> and not wait for GPU jobs completion when the reason for reaching this
>>>>>>> code
>>>>>>> is because of KILL
>>>>>>> signal to the user process who opened the device file.
>>>>>> Can you hook f_op->flush method?
>>>>> But this one is called for each task releasing a reference to the the file,
>>>>> so
>>>>> not sure I see how this solves the problem.
>>>> The big question is why do you need to wait during the final closing a
>>>> file?
>>>>
>>>> The wait can be terminated so the wait does not appear to be simply a
>>>> matter of correctness.
>>> Well, as I understand it, it just means that you don't want to abruptly
>>> terminate GPU work in progress without a good
>>> reason (such as KILL signal). When we exit we are going to release various
>>> resources GPU is still using so we either
>>> wait for it to complete or terminate the remaining jobs.
> At the point of do_exit you might as well be a KILL signal however you
> got there.
>
>> Looked more into code, some correction, drm_sched_entity_fini means the SW job
>> queue itself is about to die, so we must
>> either wait for completion or terminate any outstanding jobs that are still in
>> the SW queue. Anything which already in flight in HW
>> will still complete.
> It sounds like we don't care if we block the process that had the file
> descriptor open, this is just book keeping.  Which allows having a piece
> of code that cleans up resources when the GPU is done with the queue but
> does not make userspace wait.  (option 1)
>
> For it to make sense that we let the process run there has to be
> something that cares about the results being completed.  If all of the
> file descriptors are closed and the process is killed I can't see who
> will care that the software queue will continue to be processed.  So it
> may be reasonable to simply kill the queue (option 2).
>
> If userspace really needs the wait it is probably better done in
> f_op->flush so that every close of the file descriptor blocks
> until the queue is flushed (option 3).
>
> Do you know if userspace cares about the gpu operations completing?

I don't have a good answer for that; I would assume it depends on the type
of jobs still remaining unprocessed and on the general type of work the
user process is doing.

Some key people who can answer this are currently away for a few days to a
week, so the answer will have to wait a bit.

Andrey

>
> My skim of the code suggests that nothing actually cares about those
> operations, but I really don't know the gpu well.
>
> Eric

^ permalink raw reply	[flat|nested] 122+ messages in thread

* Re: [PATCH 3/3] drm/amdgpu: Switch to interrupted wait to recover from ring hang.
@ 2018-04-30 11:34     ` Christian König
  0 siblings, 0 replies; 122+ messages in thread
From: Christian König @ 2018-04-30 11:34 UTC (permalink / raw)
  To: Andrey Grodzovsky, linux-kernel, amd-gfx
  Cc: Alexander.Deucher, David.Panariti, oleg, akpm, ebiederm

Am 24.04.2018 um 17:30 schrieb Andrey Grodzovsky:
> If the ring is hanging for some reason allow to recover the waiting
> by sending fatal signal.
>
> Originally-by: David Panariti <David.Panariti@amd.com>
> Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
> ---
>   drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c | 14 ++++++++++----
>   1 file changed, 10 insertions(+), 4 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c
> index eb80edf..37a36af 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c
> @@ -421,10 +421,16 @@ int amdgpu_ctx_wait_prev_fence(struct amdgpu_ctx *ctx, unsigned ring_id)
>   
>   	if (other) {
>   		signed long r;
> -		r = dma_fence_wait_timeout(other, false, MAX_SCHEDULE_TIMEOUT);
> -		if (r < 0) {
> -			DRM_ERROR("Error (%ld) waiting for fence!\n", r);
> -			return r;
> +
> +		while (true) {
> +			if ((r = dma_fence_wait_timeout(other, true,
> +					MAX_SCHEDULE_TIMEOUT)) >= 0)
> +				return 0;
> +
> +			if (fatal_signal_pending(current)) {
> +				DRM_ERROR("Error (%ld) waiting for fence!\n", r);
> +				return r;
> +			}

Please drop the whole extra handling. The caller is perfectly capable of 
dealing with interrupted waits.

So all we need to do here is change "dma_fence_wait_timeout(other, 
false, ..." into "dma_fence_wait_timeout(other, true, ..." and suppress 
the error message when the IOCTL was just interrupted by a signal.
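
A minimal sketch of that simpler variant, assuming the same
amdgpu_ctx_wait_prev_fence() context as the patch above (-ERESTARTSYS is
what an interruptible dma_fence_wait_timeout() returns when a signal cuts
the wait short):

    if (other) {
            signed long r;

            /* Interruptible wait: a pending signal ends it early. */
            r = dma_fence_wait_timeout(other, true, MAX_SCHEDULE_TIMEOUT);
            if (r < 0) {
                    /* Don't print an error when the IOCTL was merely
                     * interrupted by a signal. */
                    if (r != -ERESTARTSYS)
                            DRM_ERROR("Error (%ld) waiting for fence!\n", r);
                    return r;
            }
    }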

Regards,
Christian.

>   		}
>   	}
>   

^ permalink raw reply	[flat|nested] 122+ messages in thread

* Re: [PATCH 2/3] drm/scheduler: Don't call wait_event_killable for signaled process.
@ 2018-04-30 12:08                     ` Christian König
  0 siblings, 0 replies; 122+ messages in thread
From: Christian König @ 2018-04-30 12:08 UTC (permalink / raw)
  To: Eric W. Biederman, Andrey Grodzovsky
  Cc: David.Panariti, Oleg Nesterov, amd-gfx, linux-kernel,
	Alexander.Deucher, akpm, Christian.Koenig

Hi Eric,

sorry for the late response, was on vacation last week.

Am 26.04.2018 um 02:01 schrieb Eric W. Biederman:
> Andrey Grodzovsky <Andrey.Grodzovsky@amd.com> writes:
>
>> On 04/25/2018 01:17 PM, Oleg Nesterov wrote:
>>> On 04/25, Andrey Grodzovsky wrote:
>>>> here (drm_sched_entity_fini) is also a bad idea, but we still want to be
>>>> able to exit immediately
>>>> and not wait for GPU jobs completion when the reason for reaching this code
>>>> is because of KILL
>>>> signal to the user process who opened the device file.
>>> Can you hook f_op->flush method?

THANKS! That sounds like a really good idea to me and we haven't 
investigated that direction yet.

>> But this one is called for each task releasing a reference to the the file, so
>> not sure I see how this solves the problem.
> The big question is why do you need to wait during the final closing a
> file?

As always it's because of historical reasons. Initially user space 
pushed commands directly to a hardware queue, and when a process 
finished we didn't need to wait for anything.

Then the GPU scheduler was introduced which delayed pushing the jobs to 
the hardware queue to a later point in time.

This wait was then added to maintain backward compatibility and not break 
userspace (but see below).

> The wait can be terminated so the wait does not appear to be simply a
> matter of correctness.

Well when the process is killed we don't care about correctness any 
more, we just want to get rid of it as quickly as possible (OOM 
situation etc...).

But it is perfectly possible that a process submits some render commands 
and then calls exit() or terminates because of a SIGTERM, SIGINT, etc. 
In this case we need to wait here to make sure that all rendering is 
pushed to the hardware, because the scheduler might need 
resources/settings from the file descriptor.

For example, if you just remove that wait you could close firefox and get 
garbage on the screen for a millisecond because the remaining rendering 
commands were not executed.

So what we essentially need is to distinguish between a SIGKILL (which 
means stop processing as soon as possible) and any other reason, because 
then we don't want to annoy the user with garbage on the screen (even if 
it's just for a few milliseconds).

Constructive ideas on how to handle this would be very welcome, because I 
completely agree that what we have at the moment, checking PF_SIGNALED, 
is just a very, very hacky workaround.

Thanks,
Christian.

>
> Eric

^ permalink raw reply	[flat|nested] 122+ messages in thread

* Re: [PATCH 2/3] drm/scheduler: Don't call wait_event_killable for signaled process.
@ 2018-04-30 14:32                       ` Andrey Grodzovsky
  0 siblings, 0 replies; 122+ messages in thread
From: Andrey Grodzovsky @ 2018-04-30 14:32 UTC (permalink / raw)
  To: christian.koenig, Eric W. Biederman
  Cc: David.Panariti, Oleg Nesterov, amd-gfx, linux-kernel,
	Alexander.Deucher, akpm



On 04/30/2018 08:08 AM, Christian König wrote:
> Hi Eric,
>
> sorry for the late response, was on vacation last week.
>
> Am 26.04.2018 um 02:01 schrieb Eric W. Biederman:
>> Andrey Grodzovsky <Andrey.Grodzovsky@amd.com> writes:
>>
>>> On 04/25/2018 01:17 PM, Oleg Nesterov wrote:
>>>> On 04/25, Andrey Grodzovsky wrote:
>>>>> here (drm_sched_entity_fini) is also a bad idea, but we still want 
>>>>> to be
>>>>> able to exit immediately
>>>>> and not wait for GPU jobs completion when the reason for reaching 
>>>>> this code
>>>>> is because of KILL
>>>>> signal to the user process who opened the device file.
>>>> Can you hook f_op->flush method?
>
> THANKS! That sounds like a really good idea to me and we haven't 
> investigated into that direction yet.
>
>>> But this one is called for each task releasing a reference to the 
>>> the file, so
>>> not sure I see how this solves the problem.
>> The big question is why do you need to wait during the final closing a
>> file?
>
> As always it's because of historical reasons. Initially user space 
> pushed commands directly to a hardware queue and when a processes 
> finished we didn't need to wait for anything.
>
> Then the GPU scheduler was introduced which delayed pushing the jobs 
> to the hardware queue to a later point in time.
>
> This wait was then added to maintain backward compability and not 
> break userspace (but see below).
>
>> The wait can be terminated so the wait does not appear to be simply a
>> matter of correctness.
>
> Well when the process is killed we don't care about correctness any 
> more, we just want to get rid of it as quickly as possible (OOM 
> situation etc...).
>
> But it is perfectly possible that a process submits some render 
> commands and then calls exit() or terminates because of a SIGTERM, 
> SIGINT etc.. In this case we need to wait here to make sure that all 
> rendering is pushed to the hardware because the scheduler might need 
> resources/settings from the file descriptor.
>
> For example if you just remove that wait you could close firefox and 
> get garbage on the screen for a millisecond because the remaining 
> rendering commands where not executed.
>
> So what we essentially need is to distinct between a SIGKILL (which 
> means stop processing as soon as possible) and any other reason 
> because then we don't want to annoy the user with garbage on the 
> screen (even if it's just for a few milliseconds).
>
> Constructive ideas how to handle this would be very welcome, cause I 
> completely agree that what we have at the moment by checking PF_SIGNAL 
> is just a very very hacky workaround.

What about changing PF_SIGNALED to PF_EXITING in 
drm_sched_entity_do_release?

-       if ((current->flags & PF_SIGNALED) && current->exit_code == SIGKILL)
+      if ((current->flags & PF_EXITING) && current->exit_code == SIGKILL)

From looking into do_exit and its callers, current->exit_code will get 
assigned the signal which was delivered to the task. If SIGINT was sent 
then it's SIGINT; if SIGKILL, then SIGKILL.

Andrey


>
> Thanks,
> Christian.


>
>>
>> Eric
>

^ permalink raw reply	[flat|nested] 122+ messages in thread

* Re: [PATCH 2/3] drm/scheduler: Don't call wait_event_killable for signaled process.
@ 2018-04-30 15:25                         ` Christian König
  0 siblings, 0 replies; 122+ messages in thread
From: Christian König @ 2018-04-30 15:25 UTC (permalink / raw)
  To: Andrey Grodzovsky, christian.koenig, Eric W. Biederman
  Cc: David.Panariti, Oleg Nesterov, amd-gfx, linux-kernel,
	Alexander.Deucher, akpm

Am 30.04.2018 um 16:32 schrieb Andrey Grodzovsky:
>
>
> On 04/30/2018 08:08 AM, Christian König wrote:
>> Hi Eric,
>>
>> sorry for the late response, was on vacation last week.
>>
>> Am 26.04.2018 um 02:01 schrieb Eric W. Biederman:
>>> Andrey Grodzovsky <Andrey.Grodzovsky@amd.com> writes:
>>>
>>>> On 04/25/2018 01:17 PM, Oleg Nesterov wrote:
>>>>> On 04/25, Andrey Grodzovsky wrote:
>>>>>> here (drm_sched_entity_fini) is also a bad idea, but we still 
>>>>>> want to be
>>>>>> able to exit immediately
>>>>>> and not wait for GPU jobs completion when the reason for reaching 
>>>>>> this code
>>>>>> is because of KILL
>>>>>> signal to the user process who opened the device file.
>>>>> Can you hook f_op->flush method?
>>
>> THANKS! That sounds like a really good idea to me and we haven't 
>> investigated into that direction yet.
>>
>>>> But this one is called for each task releasing a reference to the 
>>>> the file, so
>>>> not sure I see how this solves the problem.
>>> The big question is why do you need to wait during the final closing a
>>> file?
>>
>> As always it's because of historical reasons. Initially user space 
>> pushed commands directly to a hardware queue and when a processes 
>> finished we didn't need to wait for anything.
>>
>> Then the GPU scheduler was introduced which delayed pushing the jobs 
>> to the hardware queue to a later point in time.
>>
>> This wait was then added to maintain backward compability and not 
>> break userspace (but see below).
>>
>>> The wait can be terminated so the wait does not appear to be simply a
>>> matter of correctness.
>>
>> Well when the process is killed we don't care about correctness any 
>> more, we just want to get rid of it as quickly as possible (OOM 
>> situation etc...).
>>
>> But it is perfectly possible that a process submits some render 
>> commands and then calls exit() or terminates because of a SIGTERM, 
>> SIGINT etc.. In this case we need to wait here to make sure that all 
>> rendering is pushed to the hardware because the scheduler might need 
>> resources/settings from the file descriptor.
>>
>> For example if you just remove that wait you could close firefox and 
>> get garbage on the screen for a millisecond because the remaining 
>> rendering commands where not executed.
>>
>> So what we essentially need is to distinct between a SIGKILL (which 
>> means stop processing as soon as possible) and any other reason 
>> because then we don't want to annoy the user with garbage on the 
>> screen (even if it's just for a few milliseconds).
>>
>> Constructive ideas how to handle this would be very welcome, cause I 
>> completely agree that what we have at the moment by checking 
>> PF_SIGNAL is just a very very hacky workaround.
>
> What about changing PF_SIGNALED to  PF_EXITING in 
> drm_sched_entity_do_release
>
> -       if ((current->flags & PF_SIGNALED) && current->exit_code == 
> SIGKILL)
> +      if ((current->flags & PF_EXITING) && current->exit_code == 
> SIGKILL)
>
> From looking into do_exit and it's callers , current->exit_code will 
> get assign the signal which was delivered to the task. If SIGINT was 
> sent then it's SIGINT, if SIGKILL then SIGKILL.

That's at least a band-aid to stop us from abusing PF_SIGNALED.

But in addition to that change, can you investigate when f_op->flush() 
is called when the process exits normally, because of SIGKILL, or 
because of some other signal?
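
One quick way to do that investigation, sketched as throwaway debug code
(the hook name is made up; the flags and helpers used are the real ones):

    /* Hypothetical debug flush hook: log in which task context and with
     * which exit-related state f_op->flush() actually gets invoked. */
    static int amdgpu_debug_flush(struct file *f, fl_owner_t id)
    {
            pr_info("amdgpu flush: comm=%s pid=%d exiting=%d signaled=%d fatal_sig=%d\n",
                    current->comm, task_pid_nr(current),
                    !!(current->flags & PF_EXITING),
                    !!(current->flags & PF_SIGNALED),
                    fatal_signal_pending(current));
            return 0;
    }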

Could be that this is closer to what we are searching for,
Christian.

>
> Andrey
>
>
>>
>> Thanks,
>> Christian.
>
>
>>
>>>
>>> Eric
>>
>

^ permalink raw reply	[flat|nested] 122+ messages in thread

* Re: [PATCH 2/3] drm/scheduler: Don't call wait_event_killable for signaled process.
  2018-04-30 12:08                     ` Christian König
@ 2018-04-30 15:29                     ` Oleg Nesterov
  -1 siblings, 0 replies; 122+ messages in thread
From: Oleg Nesterov @ 2018-04-30 15:29 UTC (permalink / raw)
  To: Christian König
  Cc: Eric W. Biederman, Andrey Grodzovsky, David.Panariti, amd-gfx,
	linux-kernel, Alexander.Deucher, akpm, Christian.Koenig

On 04/30, Christian König wrote:
>
> Well when the process is killed we don't care about correctness any more, we
> just want to get rid of it as quickly as possible (OOM situation etc...).

OK,

> But it is perfectly possible that a process submits some render commands and
> then calls exit() or terminates because of a SIGTERM, SIGINT etc..

This doesn't differ from SIGKILL. I mean, any unhandled fatal signal translates
to SIGKILL and I think this is fine.

but this doesn't really matter,

> So what we essentially need is to distinct between a SIGKILL (which means
> stop processing as soon as possible) and any other reason because then we
> don't want to annoy the user with garbage on the screen (even if it's just
> for a few milliseconds).

For what?

OK, I see another email from Andrey, I'll reply to that email...

Oleg.

^ permalink raw reply	[flat|nested] 122+ messages in thread

* Re: [PATCH 2/3] drm/scheduler: Don't call wait_event_killable for signaled process.
  2018-04-30 14:32                       ` Andrey Grodzovsky
@ 2018-04-30 16:00                       ` Oleg Nesterov
  2018-04-30 16:10                           ` Andrey Grodzovsky
  -1 siblings, 1 reply; 122+ messages in thread
From: Oleg Nesterov @ 2018-04-30 16:00 UTC (permalink / raw)
  To: Andrey Grodzovsky
  Cc: christian.koenig, Eric W. Biederman, David.Panariti, amd-gfx,
	linux-kernel, Alexander.Deucher, akpm

On 04/30, Andrey Grodzovsky wrote:
>
> What about changing PF_SIGNALED to  PF_EXITING in
> drm_sched_entity_do_release
>
> -       if ((current->flags & PF_SIGNALED) && current->exit_code == SIGKILL)
> +      if ((current->flags & PF_EXITING) && current->exit_code == SIGKILL)

let me repeat, please don't use task->exit_code. And in fact this check is racy.

But this doesn't matter. Say, we can trivially add SIGNAL_GROUP_KILLED_BY_SIGKILL,
or do something else, but I fail to understand what you are trying to do. Suppose
that the check above is correct in that it is true iff the task is exiting and
it was killed by SIGKILL. What about the "else" branch which does

	r = wait_event_killable(sched->job_scheduled, ...)

?

Once again, fatal_signal_pending() (or even signal_pending()) is not well defined
after the exiting task passes exit_signals().

So wait_event_killable() can fail because fatal_signal_pending() is true; and this
can happen even if it was not killed.

Or it can block and SIGKILL won't be able to wake it up.

> If SIGINT was sent then it's SIGINT,

Yes, but see above. In this case fatal_signal_pending() will likely be true, so
wait_event_killable() will fail unless the condition is already true.

Oleg.

^ permalink raw reply	[flat|nested] 122+ messages in thread

* Re: [PATCH 2/3] drm/scheduler: Don't call wait_event_killable for signaled process.
@ 2018-04-30 16:10                           ` Andrey Grodzovsky
  0 siblings, 0 replies; 122+ messages in thread
From: Andrey Grodzovsky @ 2018-04-30 16:10 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: christian.koenig, Eric W. Biederman, David.Panariti, amd-gfx,
	linux-kernel, Alexander.Deucher, akpm



On 04/30/2018 12:00 PM, Oleg Nesterov wrote:
> On 04/30, Andrey Grodzovsky wrote:
>> What about changing PF_SIGNALED to  PF_EXITING in
>> drm_sched_entity_do_release
>>
>> -       if ((current->flags & PF_SIGNALED) && current->exit_code == SIGKILL)
>> +      if ((current->flags & PF_EXITING) && current->exit_code == SIGKILL)
> let me repeat, please don't use task->exit_code. And in fact this check is racy
>
> But this doesn't matter. Say, we can trivially add SIGNAL_GROUP_KILLED_BY_SIGKILL,
> or do something else,


Can you explain where the race is and what a possible alternative would be, then?

>   but I fail to understand what are you trying to do. Suppose
> that the check above is correct in that it is true iff the task is exiting and
> it was killed by SIGKILL. What about the "else" branch which does
>
> 	r = wait_event_killable(sched->job_scheduled, ...)
>
> ?
>
> Once again, fatal_signal_pending() (or even signal_pending()) is not well defined
> after the exiting task passes exit_signals().
>
> So wait_event_killable() can fail because fatal_signal_pending() is true; and this
> can happen even if it was not killed.
>
> Or it can block and SIGKILL won't be able to wake it up.
>
>> If SIGINT was sent then it's SIGINT,
> Yes, but see above. in this case fatal_signal_pending() will be likely true so
> wait_event_killable() will fail unless condition is already true.

My bad, I didn't show the full intended fix; it was just a snippet to 
address the differentiation between exiting due to SIGKILL and any other 
exit. I also intended to change wait_event_killable to 
wait_event_timeout.
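
Putting those two pieces together, the intended change would look roughly
like the sketch below. This is illustrative only: the 5 second timeout is
an arbitrary number, the idle condition is assumed to be the same one the
existing killable wait uses, and the exit_code test is exactly the check
Oleg advises against relying on.

    /* drm_sched_entity_do_release(), sketched: */
    if ((current->flags & PF_EXITING) && current->exit_code == SIGKILL) {
            /* Killed: don't wait for the SW queue to drain at all. */
    } else {
            /* Bounded wait instead of wait_event_killable(), so a stuck
             * ring cannot block process exit forever. */
            wait_event_timeout(sched->job_scheduled,
                               drm_sched_entity_is_idle(entity),
                               msecs_to_jiffies(5000));
    }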

Andrey

>
> Oleg.
>

^ permalink raw reply	[flat|nested] 122+ messages in thread

* Re: [PATCH 2/3] drm/scheduler: Don't call wait_event_killable for signaled process.
  2018-04-30 12:08                     ` Christian König
@ 2018-04-30 16:25                     ` Eric W. Biederman
  2018-04-30 17:18                         ` Andrey Grodzovsky
  -1 siblings, 1 reply; 122+ messages in thread
From: Eric W. Biederman @ 2018-04-30 16:25 UTC (permalink / raw)
  To: Christian König
  Cc: Andrey Grodzovsky, christian.koenig, David.Panariti,
	Oleg Nesterov, amd-gfx, linux-kernel, Alexander.Deucher, akpm

Christian König <ckoenig.leichtzumerken@gmail.com> writes:

> Hi Eric,
>
> sorry for the late response, was on vacation last week.
>
> Am 26.04.2018 um 02:01 schrieb Eric W. Biederman:
>> Andrey Grodzovsky <Andrey.Grodzovsky@amd.com> writes:
>>
>>> On 04/25/2018 01:17 PM, Oleg Nesterov wrote:
>>>> On 04/25, Andrey Grodzovsky wrote:
>>>>> here (drm_sched_entity_fini) is also a bad idea, but we still want to be
>>>>> able to exit immediately
>>>>> and not wait for GPU jobs completion when the reason for reaching this code
>>>>> is because of KILL
>>>>> signal to the user process who opened the device file.
>>>> Can you hook f_op->flush method?
>
> THANKS! That sounds like a really good idea to me and we haven't investigated
> into that direction yet.

For the backwards compatibility concerns you cite below the flush method
seems a much better place to introduce the wait.  You at least really
will be in a process context for that.  Still might be in exit but at
least you will legitimately be in a process.

>>> But this one is called for each task releasing a reference to the the file, so
>>> not sure I see how this solves the problem.
>> The big question is why do you need to wait during the final closing a
>> file?
>
> As always it's because of historical reasons. Initially user space pushed
> commands directly to a hardware queue and when a processes finished we didn't
> need to wait for anything.
>
> Then the GPU scheduler was introduced which delayed pushing the jobs to the
> hardware queue to a later point in time.
>
> This wait was then added to maintain backward compability and not break
> userspace (but see below).

That makes sense.

>> The wait can be terminated so the wait does not appear to be simply a
>> matter of correctness.
>
> Well when the process is killed we don't care about correctness any more, we
> just want to get rid of it as quickly as possible (OOM situation etc...).
>
> But it is perfectly possible that a process submits some render commands and
> then calls exit() or terminates because of a SIGTERM, SIGINT etc.. In this case
> we need to wait here to make sure that all rendering is pushed to the hardware
> because the scheduler might need resources/settings from the file
> descriptor.
>
> For example if you just remove that wait you could close firefox and get garbage
> on the screen for a millisecond because the remaining rendering commands where
> not executed.
>
> So what we essentially need is to distinct between a SIGKILL (which means stop
> processing as soon as possible) and any other reason because then we don't want
> to annoy the user with garbage on the screen (even if it's just for a few
> milliseconds).

I see a couple of issues.

- Running the code in release rather than in flush.

Using flush will catch every close so it should be more backwards
compatible.  f_op->flush always runs in process context so looking at
current makes sense.

- Distinguishing between death by SIGKILL and other process exit deaths.

In f_op->flush the code can test "((tsk->flags & PF_EXITING) &&
(tsk->exit_code == SIGKILL))" to see if it was SIGKILL that terminated
the process.

- Dealing with stuck queues (where this patchset came in).

For stuck queues you are going to need a timeout instead of the current
indefinite wait after PF_EXITING is set.  From what you have described a
few milliseconds should be enough.  If PF_EXITING is not set you can
still just make the wait killable and skip the timeout if that will give
a better backwards compatible user experience.
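
Put together, a flush-based version of that plan might look something like
the sketch below. Everything here is a placeholder except the f_op->flush
signature: the helper names are invented, the 100 ms timeout is arbitrary,
and the SIGKILL test is the same disputed exit_code check discussed above.

    static int amdgpu_flush(struct file *f, fl_owner_t id)
    {
            struct drm_file *file_priv = f->private_data;
            struct amdgpu_fpriv *fpriv = file_priv->driver_priv;
            bool killed = (current->flags & PF_EXITING) &&
                          (current->exit_code == SIGKILL);

            if (killed)
                    return 0;       /* SIGKILL: don't wait at all. */

            if (current->flags & PF_EXITING)
                    /* Exiting for another reason: bounded wait so a stuck
                     * queue cannot hang the exit indefinitely. */
                    amdgpu_ctx_mgr_wait_idle_timeout(&fpriv->ctx_mgr,
                                                     msecs_to_jiffies(100));
            else
                    /* Ordinary close(): keep today's killable wait. */
                    amdgpu_ctx_mgr_wait_idle_killable(&fpriv->ctx_mgr);

            return 0;
    }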

What can't be done is try and catch SIGKILL after a process has called
do_exit.  A dead process is a dead process.

Eric

^ permalink raw reply	[flat|nested] 122+ messages in thread

* Re: [PATCH 2/3] drm/scheduler: Don't call wait_event_killable for signaled process.
@ 2018-04-30 17:18                         ` Andrey Grodzovsky
  0 siblings, 0 replies; 122+ messages in thread
From: Andrey Grodzovsky @ 2018-04-30 17:18 UTC (permalink / raw)
  To: Eric W. Biederman, Christian König
  Cc: christian.koenig, David.Panariti, Oleg Nesterov, amd-gfx,
	linux-kernel, Alexander.Deucher, akpm



On 04/30/2018 12:25 PM, Eric W. Biederman wrote:
> Christian König <ckoenig.leichtzumerken@gmail.com> writes:
>
>> Hi Eric,
>>
>> sorry for the late response, was on vacation last week.
>>
>> Am 26.04.2018 um 02:01 schrieb Eric W. Biederman:
>>> Andrey Grodzovsky <Andrey.Grodzovsky@amd.com> writes:
>>>
>>>> On 04/25/2018 01:17 PM, Oleg Nesterov wrote:
>>>>> On 04/25, Andrey Grodzovsky wrote:
>>>>>> here (drm_sched_entity_fini) is also a bad idea, but we still want to be
>>>>>> able to exit immediately
>>>>>> and not wait for GPU jobs completion when the reason for reaching this code
>>>>>> is because of KILL
>>>>>> signal to the user process who opened the device file.
>>>>> Can you hook f_op->flush method?
>> THANKS! That sounds like a really good idea to me and we haven't investigated
>> into that direction yet.
> For the backwards compatibility concerns you cite below the flush method
> seems a much better place to introduce the wait.  You at least really
> will be in a process context for that.  Still might be in exit but at
> least you will be legitimately be in a process.
>
>>>> But this one is called for each task releasing a reference to the the file, so
>>>> not sure I see how this solves the problem.
>>> The big question is why do you need to wait during the final closing a
>>> file?
>> As always it's because of historical reasons. Initially user space pushed
>> commands directly to a hardware queue and when a processes finished we didn't
>> need to wait for anything.
>>
>> Then the GPU scheduler was introduced which delayed pushing the jobs to the
>> hardware queue to a later point in time.
>>
>> This wait was then added to maintain backward compability and not break
>> userspace (but see below).
> That make sense.
>
>>> The wait can be terminated so the wait does not appear to be simply a
>>> matter of correctness.
>> Well when the process is killed we don't care about correctness any more, we
>> just want to get rid of it as quickly as possible (OOM situation etc...).
>>
>> But it is perfectly possible that a process submits some render commands and
>> then calls exit() or terminates because of a SIGTERM, SIGINT etc.. In this case
>> we need to wait here to make sure that all rendering is pushed to the hardware
>> because the scheduler might need resources/settings from the file
>> descriptor.
>>
>> For example if you just remove that wait you could close firefox and get garbage
>> on the screen for a millisecond because the remaining rendering commands where
>> not executed.
>>
>> So what we essentially need is to distinct between a SIGKILL (which means stop
>> processing as soon as possible) and any other reason because then we don't want
>> to annoy the user with garbage on the screen (even if it's just for a few
>> milliseconds).
> I see a couple of issues.
>
> - Running the code in release rather than in flush.
>
> Using flush will catch every close so it should be more backwards
> compatible.  f_op->flush always runs in process context so looking at
> current makes sense.
>
> - Distinguishing between death by SIGKILL and other process exit deaths.
>
> In f_op->flush the code can test "((tsk->flags & PF_EXITING) &&
> (tsk->code == SIGKILL))" to see if it was SIGKILL that terminated
> the process.

What about Oleg's note not to rely on tsk->exit_code == SIGKILL (still not 
clear why?) and that this entire check is racy (against what?)? Or is it 
relevant to the .release hook only?

Andrey

>
> - Dealing with stuck queues (where this patchset came in).
>
> For stuck queues you are going to need a timeout instead of the current
> indefinite wait after PF_EXITING is set.  From what you have described a
> few milliseconds should be enough.  If PF_EXITING is not set you can
> still just make the wait killable and skip the timeout if that will give
> a better backwards compatible user experience.
>
> What can't be done is try and catch SIGKILL after a process has called
> do_exit.  A dead process is a dead process.
>
> Eric

^ permalink raw reply	[flat|nested] 122+ messages in thread

* Re: [PATCH 2/3] drm/scheduler: Don't call wait_event_killable for signaled process.
@ 2018-04-30 18:29                             ` Christian König
  0 siblings, 0 replies; 122+ messages in thread
From: Christian König @ 2018-04-30 18:29 UTC (permalink / raw)
  To: Andrey Grodzovsky, Oleg Nesterov
  Cc: David.Panariti, linux-kernel, amd-gfx, Eric W. Biederman,
	Alexander.Deucher, akpm, christian.koenig

Am 30.04.2018 um 18:10 schrieb Andrey Grodzovsky:
>
>
> On 04/30/2018 12:00 PM, Oleg Nesterov wrote:
>> On 04/30, Andrey Grodzovsky wrote:
>>> What about changing PF_SIGNALED to PF_EXITING in
>>> drm_sched_entity_do_release
>>>
>>> -       if ((current->flags & PF_SIGNALED) && current->exit_code == 
>>> SIGKILL)
>>> +      if ((current->flags & PF_EXITING) && current->exit_code == 
>>> SIGKILL)
>> let me repeat, please don't use task->exit_code. And in fact this 
>> check is racy
>>
>> But this doesn't matter. Say, we can trivially add 
>> SIGNAL_GROUP_KILLED_BY_SIGKILL,
>> or do something else,
>
>
> Can you explain where is the race and what is a possible alternative 
> then ?

The race is that the release doesn't necessarily come from the 
process/context which used the fd.

E.g. it is just called when the last reference count goes away, but that 
can happen anywhere, not necessarily in the process that was using it, 
e.g. in a kernel thread or a debugger, etc.
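
A tiny userspace illustration of that point (hypothetical program, plain
POSIX fd semantics): the submitting process gets its ->flush on close/exit,
but ->release only runs when the last holder of the struct file goes away,
which can be a completely different task.

    /* illustrative only */
    #include <fcntl.h>
    #include <stdio.h>
    #include <unistd.h>

    int main(void)
    {
            int fd = open("/dev/dri/renderD128", O_RDWR);
            if (fd < 0) { perror("open"); return 1; }

            if (fork() == 0) {      /* child inherits the fd */
                    sleep(10);      /* keeps the struct file alive */
                    _exit(0);       /* ->release runs here, in the child */
            }

            /* Parent: would submit work here, then exits right away.
             * Its exit triggers ->flush in *this* process context, but
             * not ->release -- the child still holds a reference. */
            return 0;
    }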

The approach with the flush is indeed a really nice idea, and I'm kicking 
myself for not having thought of it earlier as well.

Christian.

>
>>   but I fail to understand what are you trying to do. Suppose
>> that the check above is correct in that it is true iff the task is 
>> exiting and
>> it was killed by SIGKILL. What about the "else" branch which does
>>
>>     r = wait_event_killable(sched->job_scheduled, ...)
>>
>> ?
>>
>> Once again, fatal_signal_pending() (or even signal_pending()) is not 
>> well defined
>> after the exiting task passes exit_signals().
>>
>> So wait_event_killable() can fail because fatal_signal_pending() is 
>> true; and this
>> can happen even if it was not killed.
>>
>> Or it can block and SIGKILL won't be able to wake it up.
>>
>>> If SIGINT was sent then it's SIGINT,
>> Yes, but see above. in this case fatal_signal_pending() will be 
>> likely true so
>> wait_event_killable() will fail unless condition is already true.
>
> My bad, I didn't show the full intended fix, it was just a snippet to 
> address the differentiation between exiting
> do to SIGKILL and any other exit, I also intended to change 
> wait_event_killable to wait_event_timeout.
>
> Andrey
>
>>
>> Oleg.
>>
>

^ permalink raw reply	[flat|nested] 122+ messages in thread

* Re: [PATCH 2/3] drm/scheduler: Don't call wait_event_killable for signaled process.
@ 2018-04-30 19:28                               ` Andrey Grodzovsky
  0 siblings, 0 replies; 122+ messages in thread
From: Andrey Grodzovsky @ 2018-04-30 19:28 UTC (permalink / raw)
  To: christian.koenig, Oleg Nesterov
  Cc: David.Panariti, linux-kernel, amd-gfx, Eric W. Biederman,
	Alexander.Deucher, akpm



On 04/30/2018 02:29 PM, Christian König wrote:
> Am 30.04.2018 um 18:10 schrieb Andrey Grodzovsky:
>>
>>
>> On 04/30/2018 12:00 PM, Oleg Nesterov wrote:
>>> On 04/30, Andrey Grodzovsky wrote:
>>>> What about changing PF_SIGNALED to PF_EXITING in
>>>> drm_sched_entity_do_release
>>>>
>>>> -       if ((current->flags & PF_SIGNALED) && current->exit_code == 
>>>> SIGKILL)
>>>> +      if ((current->flags & PF_EXITING) && current->exit_code == 
>>>> SIGKILL)
>>> let me repeat, please don't use task->exit_code. And in fact this 
>>> check is racy
>>>
>>> But this doesn't matter. Say, we can trivially add 
>>> SIGNAL_GROUP_KILLED_BY_SIGKILL,
>>> or do something else,
>>
>>
>> Can you explain where is the race and what is a possible alternative 
>> then ?
>
> The race is that the release doesn't necessarily comes from the 
> process/context which used the fd.
>
> E.g. it is just called when the last reference count goes away, but 
> that can be anywhere not related to the original process using it, 
> e.g. in a kernel thread or a debugger etc...

I still don't see how that is a problem: if the release comes from another 
task, then our process (let's say Firefox, which received SIGKILL) won't 
even get here, since fput will not call .release, so it dies instantly. 
The last process holding the reference (let's say the debugger) will, when 
it finishes, just go to wait_event_timeout and wait for the SW queue to 
drain of jobs (if any). So all the jobs still get their chance to reach 
the HW anyway.

>
> The approach with the flush is indeed a really nice idea and I bite 
> myself to not had that previously as well.

Regarding your request in another email to investigate .flush further:

I looked at the code and did some reading.

From LDD3:
"The flush operation is invoked when a process closes its copy of a file 
descriptor for a device; it should execute (and wait for) any 
outstanding operations on the device"

From printing a backtrace from a dummy .flush hook in our driver:

Normal exit (process terminates on its own)

[  295.586130 <    0.000006>]  dump_stack+0x5c/0x78
[  295.586273 <    0.000143>]  my_flush+0xa/0x10 [amdgpu]
[  295.586283 <    0.000010>]  filp_close+0x4a/0x90
[  295.586288 <    0.000005>]  SyS_close+0x2d/0x60
[  295.586295 <    0.000003>]  do_syscall_64+0xee/0x270

Exit triggered by a fatal signal (unhandled signal, including SIGKILL)

[  356.551456 <    0.000008>]  dump_stack+0x5c/0x78
[  356.551592 <    0.000136>]  my_flush+0xa/0x10 [amdgpu]
[  356.551597 <    0.000005>]  filp_close+0x4a/0x90
[  356.551605 <    0.000008>]  put_files_struct+0xaf/0x120
[  356.551615 <    0.000010>]  do_exit+0x468/0x1280
[  356.551669 <    0.000009>]  do_group_exit+0x89/0x140
[  356.551679 <    0.000010>]  get_signal+0x375/0x8f0
[  356.551696 <    0.000017>]  do_signal+0x79/0xaa0
[  356.551756 <    0.000014>]  exit_to_usermode_loop+0x83/0xd0
[  356.551764 <    0.000008>]  do_syscall_64+0x244/0x270

So, as was said here before, it will be called for every process closing 
its FD to the file.

But again, I don't quite see yet what we gain by using .flush. Is it that 
you force every process holding a reference to the DRM file not to die 
until all jobs are submitted to the HW (as long as the process is not 
being killed by a signal)?

Andrey

>
> Christian.

The idea here is that any task that still references this file, puts down 
the reference, and is not exiting due to SIGKILL will just have to go 
through the slow path - wait for job completion on the GPU (with some 
timeout).
>
>>
>>>   but I fail to understand what are you trying to do. Suppose
>>> that the check above is correct in that it is true iff the task is 
>>> exiting and
>>> it was killed by SIGKILL. What about the "else" branch which does
>>>
>>>     r = wait_event_killable(sched->job_scheduled, ...)
>>>
>>> ?
>>>
>>> Once again, fatal_signal_pending() (or even signal_pending()) is not 
>>> well defined
>>> after the exiting task passes exit_signals().
>>>
>>> So wait_event_killable() can fail because fatal_signal_pending() is 
>>> true; and this
>>> can happen even if it was not killed.
>>>
>>> Or it can block and SIGKILL won't be able to wake it up.
>>>
>>>> If SIGINT was sent then it's SIGINT,
>>> Yes, but see above. in this case fatal_signal_pending() will be 
>>> likely true so
>>> wait_event_killable() will fail unless condition is already true.
>>
>> My bad, I didn't show the full intended fix, it was just a snippet to 
>> address the differentiation between exiting
>> do to SIGKILL and any other exit, I also intended to change 
>> wait_event_killable to wait_event_timeout.
>>
>> Andrey
>>
>>>
>>> Oleg.
>>>
>>
>> _______________________________________________
>> amd-gfx mailing list
>> amd-gfx@lists.freedesktop.org
>> https://lists.freedesktop.org/mailman/listinfo/amd-gfx
>

^ permalink raw reply	[flat|nested] 122+ messages in thread

* Re: [PATCH 2/3] drm/scheduler: Don't call wait_event_killable for signaled process.
  2018-04-30 16:10                           ` Andrey Grodzovsky
  (?)
  (?)
@ 2018-05-01 14:35                           ` Oleg Nesterov
  2018-05-23 15:08                               ` Andrey Grodzovsky
  -1 siblings, 1 reply; 122+ messages in thread
From: Oleg Nesterov @ 2018-05-01 14:35 UTC (permalink / raw)
  To: Andrey Grodzovsky
  Cc: christian.koenig, Eric W. Biederman, David.Panariti, amd-gfx,
	linux-kernel, Alexander.Deucher, akpm

On 04/30, Andrey Grodzovsky wrote:
>
> On 04/30/2018 12:00 PM, Oleg Nesterov wrote:
> >On 04/30, Andrey Grodzovsky wrote:
> >>What about changing PF_SIGNALED to  PF_EXITING in
> >>drm_sched_entity_do_release
> >>
> >>-       if ((current->flags & PF_SIGNALED) && current->exit_code == SIGKILL)
> >>+      if ((current->flags & PF_EXITING) && current->exit_code == SIGKILL)
> >let me repeat, please don't use task->exit_code. And in fact this check is racy
> >
> >But this doesn't matter. Say, we can trivially add SIGNAL_GROUP_KILLED_BY_SIGKILL,
> >or do something else,
>
> Can you explain where is the race and what is a possible alternative then ?

Oh. I mentioned this race automatically, because I am a pedant ;) Let me repeat
that this doesn't really matter, and let me remind you that the caller of
fop->release can be a completely unrelated process, say cat /proc/pid/fdinfo.
And in any case ->exit_code should not be used outside of the ptrace/exit paths.

OK, the race. Consider a process P with a main thread M and a sub-thread T.

T does pthread_exit(), enters do_exit() and gets a preemption before exit_files().

The process is killed by SIGKILL. M calls do_group_exit(), do_exit() and passes
exit_files(). However, it doesn't call close_files() because T has another reference.

T resumes, calls close_files(), fput(), etc, and then exit_task_work(), so
it can finally call ->release() with current->exit_code == 0 despite the fact
the process was killed.

Again, again, this doesn't matter. We can distinguish killed-or-not, by SIGKILL-
or-not. But I still do not think we actually need this. At least in ->release()
paths, ->flush() may differ.

Oleg.

^ permalink raw reply	[flat|nested] 122+ messages in thread

* Re: [PATCH 2/3] drm/scheduler: Don't call wait_event_killable for signaled process.
@ 2018-05-02 11:48                                 ` Christian König
  0 siblings, 0 replies; 122+ messages in thread
From: Christian König @ 2018-05-02 11:48 UTC (permalink / raw)
  To: Andrey Grodzovsky, Oleg Nesterov
  Cc: David.Panariti, linux-kernel, amd-gfx, Eric W. Biederman,
	Alexander.Deucher, akpm

Am 30.04.2018 um 21:28 schrieb Andrey Grodzovsky:
>
>
> On 04/30/2018 02:29 PM, Christian König wrote:
>> Am 30.04.2018 um 18:10 schrieb Andrey Grodzovsky:
>>>
>>>
>>> On 04/30/2018 12:00 PM, Oleg Nesterov wrote:
>>>> On 04/30, Andrey Grodzovsky wrote:
>>>>> What about changing PF_SIGNALED to PF_EXITING in
>>>>> drm_sched_entity_do_release
>>>>>
>>>>> -       if ((current->flags & PF_SIGNALED) && current->exit_code 
>>>>> == SIGKILL)
>>>>> +      if ((current->flags & PF_EXITING) && current->exit_code == 
>>>>> SIGKILL)
>>>> let me repeat, please don't use task->exit_code. And in fact this 
>>>> check is racy
>>>>
>>>> But this doesn't matter. Say, we can trivially add 
>>>> SIGNAL_GROUP_KILLED_BY_SIGKILL,
>>>> or do something else,
>>>
>>>
>>> Can you explain where is the race and what is a possible alternative 
>>> then ?
>>
>> The race is that the release doesn't necessarily comes from the 
>> process/context which used the fd.
>>
>> E.g. it is just called when the last reference count goes away, but 
>> that can be anywhere not related to the original process using it, 
>> e.g. in a kernel thread or a debugger etc...
>
> I still don't see how it is a problem, if release comes from another 
> task, then our process  (let's say Firefox who received SIGKILL) won't 
> even get here since fput will not call .release so it will die instantly,

And that's exactly the problem: we would then just block whatever task 
happens to be the last one to release the fd.

> the last process who holds the reference (let's say the debugger) when 
> finish will just go to wait_event_timeout and wait for SW queue to be 
> empty from jobs (if any). So all the jobs will have their chance to 
> get to HW anyway.

Yeah, but that's exactly what we want to avoid when the process is 
killed by a signal.

>> The approach with the flush is indeed a really nice idea and I bite 
>> myself to not had that previously as well.
>
> Regarding your request from another email to investigate more on .flush
>
> Looked at the code and did some reading -
>
> From LDD3
> "The flush operation is invoked when a process closes its copy of a 
> file descriptor for a device; it should execute (and wait for) any 
> outstanding operations on the device"

Sounds exactly like what we need.

>
> From printing back trace from dummy .flush hook in our driver -
>
> Normal exit (process terminates on it's own)
>
> [  295.586130 <    0.000006>]  dump_stack+0x5c/0x78
> [  295.586273 <    0.000143>]  my_flush+0xa/0x10 [amdgpu]
> [  295.586283 <    0.000010>]  filp_close+0x4a/0x90
> [  295.586288 <    0.000005>]  SyS_close+0x2d/0x60
> [  295.586295 <    0.000003>]  do_syscall_64+0xee/0x270
>
> Exit triggered by fatal signal (not handled  signal, including SIGKILL)
>
> [  356.551456 <    0.000008>]  dump_stack+0x5c/0x78
> [  356.551592 <    0.000136>]  my_flush+0xa/0x10 [amdgpu]
> [  356.551597 <    0.000005>]  filp_close+0x4a/0x90
> [  356.551605 <    0.000008>]  put_files_struct+0xaf/0x120
> [  356.551615 <    0.000010>]  do_exit+0x468/0x1280
> [  356.551669 <    0.000009>]  do_group_exit+0x89/0x140
> [  356.551679 <    0.000010>]  get_signal+0x375/0x8f0
> [  356.551696 <    0.000017>]  do_signal+0x79/0xaa0
> [  356.551756 <    0.000014>] exit_to_usermode_loop+0x83/0xd0
> [  356.551764 <    0.000008>]  do_syscall_64+0x244/0x270
>
> So as it was said here before, it will be called for every process 
> closing his FD to the file.
>
> But again, I don't quire see yet what we earn by using .flush, is it 
> that you force every process holding reference to DRM file not
> die until all jobs are submitted to HW (as long as the process not 
> being killed by  a signal) ?

If, in your example, Firefox dies from a fatal signal, we won't notice that 
when somebody else is holding the last reference, i.e. we would still 
try to submit the jobs to the hardware.

I suggest the following approach:
1. Implement the flush callback and call the function to wait for the 
scheduler to push everything to the hardware (maybe rename the scheduler 
function to flush as well).

2. Change the scheduler to test for PF_EXITING: if it's set, use 
wait_event_timeout(); if it isn't, use wait_event_killable().

When the wait times out or is killed, set a flag so that the _fini 
function knows that. Alternatively you could clean up the _fini function 
to work in all cases, i.e. both when there are still jobs on the queue 
and when the queue is empty. For this you need to add something like a 
struct completion to the main loop to remove this start()/stop() of the 
kernel thread.
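
A very rough sketch of what step 2 could look like in the scheduler (an 
illustration only, not a tested patch; the helper name 
drm_sched_entity_is_idle and the 100 ms value are assumptions):

static void drm_sched_entity_flush(struct drm_gpu_scheduler *sched,
				   struct drm_sched_entity *entity)
{
	if (current->flags & PF_EXITING) {
		/* Exiting task: bounded wait so a stuck queue cannot hang exit. */
		wait_event_timeout(sched->job_scheduled,
				   drm_sched_entity_is_idle(entity),
				   msecs_to_jiffies(100));
	} else {
		/* Normal close: wait, but let a fatal signal interrupt the wait. */
		if (wait_event_killable(sched->job_scheduled,
					drm_sched_entity_is_idle(entity)))
			pr_debug("scheduler flush interrupted by fatal signal\n");
	}
}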

Christian.

>
> Andrey

^ permalink raw reply	[flat|nested] 122+ messages in thread

* Re: [PATCH 2/3] drm/scheduler: Don't call wait_event_killable for signaled process.
  2018-05-02 11:48                                 ` Christian König
  (?)
@ 2018-05-17 11:18                                 ` Andrey Grodzovsky
  2018-05-17 14:48                                   ` Michel Dänzer
  -1 siblings, 1 reply; 122+ messages in thread
From: Andrey Grodzovsky @ 2018-05-17 11:18 UTC (permalink / raw)
  To: Michel Dänzer, ML dri-devel; +Cc: Koenig, Christian

Hi Michel and others, I am trying to implement the approach below to 
resolve AMDGPU's hang when commands are stuck in the pipe during process exit.

I noticed that once I implemented the file_operations.flush callback, 
during a run of X I see the flush callback getting called not only for 
the Xorg process but also for other processes such as 'xkbcomp' and even 
'sh'. It seems Xorg passes its FDs to children. Christian mentioned he 
remembered a discussion about always setting the FD_CLOEXEC flag when 
opening the hardware device file, so we suspect a bug in Xorg with regard 
to this behavior.
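
For reference, the userspace side of what we are talking about boils down 
to something like this (generic POSIX; the function names are only for 
illustration):

	#include <fcntl.h>

	/* Open the DRM node with close-on-exec set atomically. */
	static int open_drm_node(const char *path)
	{
		return open(path, O_RDWR | O_CLOEXEC);
	}

	/* Or mark an already-open fd so it is not inherited across exec(). */
	static int set_cloexec(int fd)
	{
		int flags = fcntl(fd, F_GETFD);

		if (flags < 0)
			return -1;
		return fcntl(fd, F_SETFD, flags | FD_CLOEXEC);
	}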

Any advice on this would be very helpful.


Andrey


On 05/02/2018 07:48 AM, Christian König wrote:
> I suggest the following approach:
> 1. Implement the flush callback and call the function to wait for the 
> scheduler to push everything to the hardware (maybe rename the 
> scheduler function to flush as well).
>
> 2. Change the scheduler to test for PF_EXITING, if it's set use 
> wait_event_timeout() if it isn't set use wait_event_killable().
>
> When the wait times out or is killed set a flag so that the _fini 
> function knows that. Alternatively you could cleanup the _fini 
> function to work in all cases, e.g. both when there are still jobs on 
> the queue and when the queue is empty. For this you need to add 
> something like a struct completion to the main loop to remove this 
> start()/stop() of the kernel thread.
>
> Christian. 

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 122+ messages in thread

* Re: [PATCH 2/3] drm/scheduler: Don't call wait_event_killable for signaled process.
  2018-05-17 11:18                                 ` Andrey Grodzovsky
@ 2018-05-17 14:48                                   ` Michel Dänzer
  2018-05-17 15:33                                     ` Andrey Grodzovsky
  2018-05-17 19:05                                     ` Andrey Grodzovsky
  0 siblings, 2 replies; 122+ messages in thread
From: Michel Dänzer @ 2018-05-17 14:48 UTC (permalink / raw)
  To: Andrey Grodzovsky; +Cc: Koenig, Christian, ML dri-devel

On 2018-05-17 01:18 PM, Andrey Grodzovsky wrote:
> Hi Michele and others, I am trying to implement the approach bellow to
> resolve AMDGPU's hang when commands are stuck in pipe during process exit.
> 
> I noticed that once I implemented the file_operation.flush callback 
> then during run of X, i see the flush callback gets called not only for
> Xorg process but for other
> 
> processes such as 'xkbcomp' and even 'sh', it seems like Xorg passes his
> FDs to children, Christian mentioned he remembered a discussion to
> always set FD_CLOEXEC flag when opening the hardware device file, so
> 
> we suspect a bug in Xorg with regard to this behavior.

Try the libdrm patch below.

Note that the X server passes DRM file descriptors to DRI3 clients.


diff --git a/xf86drm.c b/xf86drm.c
index 3a9d0ed2..c09437b0 100644
--- a/xf86drm.c
+++ b/xf86drm.c
@@ -405,7 +405,7 @@ wait_for_udev:
     }
 #endif

-    fd = open(buf, O_RDWR, 0);
+    fd = open(buf, O_RDWR | O_CLOEXEC, 0);
     drmMsg("drmOpenDevice: open result is %d, (%s)\n",
            fd, fd < 0 ? strerror(errno) : "OK");
     if (fd >= 0)
@@ -425,7 +425,7 @@ wait_for_udev:
             chmod(buf, devmode);
         }
     }
-    fd = open(buf, O_RDWR, 0);
+    fd = open(buf, O_RDWR | O_CLOEXEC, 0);
     drmMsg("drmOpenDevice: open result is %d, (%s)\n",
            fd, fd < 0 ? strerror(errno) : "OK");
     if (fd >= 0)
@@ -474,7 +474,7 @@ static int drmOpenMinor(int minor, int create, int type)
     };

     sprintf(buf, dev_name, DRM_DIR_NAME, minor);
-    if ((fd = open(buf, O_RDWR, 0)) >= 0)
+    if ((fd = open(buf, O_RDWR | O_CLOEXEC, 0)) >= 0)
         return fd;
     return -errno;
 }



-- 
Earthling Michel Dänzer               |               http://www.amd.com
Libre software enthusiast             |             Mesa and X developer
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply related	[flat|nested] 122+ messages in thread

* Re: [PATCH 2/3] drm/scheduler: Don't call wait_event_killable for signaled process.
  2018-05-17 14:48                                   ` Michel Dänzer
@ 2018-05-17 15:33                                     ` Andrey Grodzovsky
  2018-05-17 15:52                                       ` Michel Dänzer
  2018-05-17 19:05                                     ` Andrey Grodzovsky
  1 sibling, 1 reply; 122+ messages in thread
From: Andrey Grodzovsky @ 2018-05-17 15:33 UTC (permalink / raw)
  To: Michel Dänzer; +Cc: Koenig, Christian, ML dri-devel

Thanks Michel, I will give it a try.

BTW, just out of interest, how are the FDs passed to clients? Using 
sockets? Can you point me to the code which does it?

Andrey


On 05/17/2018 10:48 AM, Michel Dänzer wrote:
> On 2018-05-17 01:18 PM, Andrey Grodzovsky wrote:
>> Hi Michele and others, I am trying to implement the approach bellow to
>> resolve AMDGPU's hang when commands are stuck in pipe during process exit.
>>
>> I noticed that once I implemented the file_operation.flush callback
>> then during run of X, i see the flush callback gets called not only for
>> Xorg process but for other
>>
>> processes such as 'xkbcomp' and even 'sh', it seems like Xorg passes his
>> FDs to children, Christian mentioned he remembered a discussion to
>> always set FD_CLOEXEC flag when opening the hardware device file, so
>>
>> we suspect a bug in Xorg with regard to this behavior.
> Try the libdrm patch below.
>
> Note that the X server passes DRM file descriptors to DRI3 clients.
>
>
> diff --git a/xf86drm.c b/xf86drm.c
> index 3a9d0ed2..c09437b0 100644
> --- a/xf86drm.c
> +++ b/xf86drm.c
> @@ -405,7 +405,7 @@ wait_for_udev:
>       }
>   #endif
>
> -    fd = open(buf, O_RDWR, 0);
> +    fd = open(buf, O_RDWR | O_CLOEXEC, 0);
>       drmMsg("drmOpenDevice: open result is %d, (%s)\n",
>              fd, fd < 0 ? strerror(errno) : "OK");
>       if (fd >= 0)
> @@ -425,7 +425,7 @@ wait_for_udev:
>               chmod(buf, devmode);
>           }
>       }
> -    fd = open(buf, O_RDWR, 0);
> +    fd = open(buf, O_RDWR | O_CLOEXEC, 0);
>       drmMsg("drmOpenDevice: open result is %d, (%s)\n",
>              fd, fd < 0 ? strerror(errno) : "OK");
>       if (fd >= 0)
> @@ -474,7 +474,7 @@ static int drmOpenMinor(int minor, int create, int type)
>       };
>
>       sprintf(buf, dev_name, DRM_DIR_NAME, minor);
> -    if ((fd = open(buf, O_RDWR, 0)) >= 0)
> +    if ((fd = open(buf, O_RDWR | O_CLOEXEC, 0)) >= 0)
>           return fd;
>       return -errno;
>   }
>
>
>

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 122+ messages in thread

* Re: [PATCH 2/3] drm/scheduler: Don't call wait_event_killable for signaled process.
  2018-05-17 15:33                                     ` Andrey Grodzovsky
@ 2018-05-17 15:52                                       ` Michel Dänzer
  0 siblings, 0 replies; 122+ messages in thread
From: Michel Dänzer @ 2018-05-17 15:52 UTC (permalink / raw)
  To: Andrey Grodzovsky; +Cc: Koenig, Christian, ML dri-devel

On 2018-05-17 05:33 PM, Andrey Grodzovsky wrote:
> 
> BTW, just out of interest, how the FDs are passed to clients ? Using
> sockets ?

Yes, via the socket used for the X11 display connection.


> Can you point me to the code which does it ?

xserver/dri3/dri3_request.c:dri3_send_open_reply() =>
xserver/os/io.c:WriteFdToClient()

Note that since dri3_send_open_reply passes TRUE for WriteFdToClient's
do_close parameter, the file descriptor is closed in the Xorg process
after sending it to the client.
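
For background, passing an fd over the Unix domain socket is done with an 
SCM_RIGHTS control message. A minimal sender looks roughly like this (a 
generic sketch, not the actual xserver code):

	#include <string.h>
	#include <sys/socket.h>
	#include <sys/uio.h>

	/* Send one file descriptor over a connected AF_UNIX socket. */
	static int send_fd(int sock, int fd)
	{
		char dummy = 'F';	/* at least one byte of real data */
		struct iovec iov = { .iov_base = &dummy, .iov_len = 1 };
		union {
			struct cmsghdr align;
			char buf[CMSG_SPACE(sizeof(int))];
		} control;
		struct msghdr msg;
		struct cmsghdr *cmsg;

		memset(&control, 0, sizeof(control));
		memset(&msg, 0, sizeof(msg));
		msg.msg_iov = &iov;
		msg.msg_iovlen = 1;
		msg.msg_control = control.buf;
		msg.msg_controllen = sizeof(control.buf);

		cmsg = CMSG_FIRSTHDR(&msg);
		cmsg->cmsg_level = SOL_SOCKET;
		cmsg->cmsg_type = SCM_RIGHTS;
		cmsg->cmsg_len = CMSG_LEN(sizeof(int));
		memcpy(CMSG_DATA(cmsg), &fd, sizeof(int));

		return sendmsg(sock, &msg, 0) == 1 ? 0 : -1;
	}

The receiver gets a new fd back from the corresponding control message on 
recvmsg().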


-- 
Earthling Michel Dänzer               |               http://www.amd.com
Libre software enthusiast             |             Mesa and X developer
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 122+ messages in thread

* Re: [PATCH 2/3] drm/scheduler: Don't call wait_event_killable for signaled process.
  2018-05-17 14:48                                   ` Michel Dänzer
  2018-05-17 15:33                                     ` Andrey Grodzovsky
@ 2018-05-17 19:05                                     ` Andrey Grodzovsky
  2018-05-18  8:46                                       ` Michel Dänzer
  1 sibling, 1 reply; 122+ messages in thread
From: Andrey Grodzovsky @ 2018-05-17 19:05 UTC (permalink / raw)
  To: Michel Dänzer; +Cc: Koenig, Christian, ML dri-devel



On 05/17/2018 10:48 AM, Michel Dänzer wrote:
> On 2018-05-17 01:18 PM, Andrey Grodzovsky wrote:
>> Hi Michele and others, I am trying to implement the approach bellow to
>> resolve AMDGPU's hang when commands are stuck in pipe during process exit.
>>
>> I noticed that once I implemented the file_operation.flush callback
>> then during run of X, i see the flush callback gets called not only for
>> Xorg process but for other
>>
>> processes such as 'xkbcomp' and even 'sh', it seems like Xorg passes his
>> FDs to children, Christian mentioned he remembered a discussion to
>> always set FD_CLOEXEC flag when opening the hardware device file, so
>>
>> we suspect a bug in Xorg with regard to this behavior.
> Try the libdrm patch below.
>
> Note that the X server passes DRM file descriptors to DRI3 clients.

Tried it, didn't help. I still see other processes calling .flush for 
/dev/dri/card0.

Thanks,
Andrey

>
>
> diff --git a/xf86drm.c b/xf86drm.c
> index 3a9d0ed2..c09437b0 100644
> --- a/xf86drm.c
> +++ b/xf86drm.c
> @@ -405,7 +405,7 @@ wait_for_udev:
>       }
>   #endif
>
> -    fd = open(buf, O_RDWR, 0);
> +    fd = open(buf, O_RDWR | O_CLOEXEC, 0);
>       drmMsg("drmOpenDevice: open result is %d, (%s)\n",
>              fd, fd < 0 ? strerror(errno) : "OK");
>       if (fd >= 0)
> @@ -425,7 +425,7 @@ wait_for_udev:
>               chmod(buf, devmode);
>           }
>       }
> -    fd = open(buf, O_RDWR, 0);
> +    fd = open(buf, O_RDWR | O_CLOEXEC, 0);
>       drmMsg("drmOpenDevice: open result is %d, (%s)\n",
>              fd, fd < 0 ? strerror(errno) : "OK");
>       if (fd >= 0)
> @@ -474,7 +474,7 @@ static int drmOpenMinor(int minor, int create, int type)
>       };
>
>       sprintf(buf, dev_name, DRM_DIR_NAME, minor);
> -    if ((fd = open(buf, O_RDWR, 0)) >= 0)
> +    if ((fd = open(buf, O_RDWR | O_CLOEXEC, 0)) >= 0)
>           return fd;
>       return -errno;
>   }
>
>
>

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 122+ messages in thread

* Re: [PATCH 2/3] drm/scheduler: Don't call wait_event_killable for signaled process.
  2018-05-17 19:05                                     ` Andrey Grodzovsky
@ 2018-05-18  8:46                                       ` Michel Dänzer
  2018-05-18  9:42                                         ` Christian König
  2018-05-22 15:49                                         ` Andrey Grodzovsky
  0 siblings, 2 replies; 122+ messages in thread
From: Michel Dänzer @ 2018-05-18  8:46 UTC (permalink / raw)
  To: Andrey Grodzovsky; +Cc: Koenig, Christian, ML dri-devel

[-- Attachment #1: Type: text/plain, Size: 1478 bytes --]

On 2018-05-17 09:05 PM, Andrey Grodzovsky wrote:
> On 05/17/2018 10:48 AM, Michel Dänzer wrote:
>> On 2018-05-17 01:18 PM, Andrey Grodzovsky wrote:
>>> Hi Michele and others, I am trying to implement the approach bellow to
>>> resolve AMDGPU's hang when commands are stuck in pipe during process
>>> exit.
>>>
>>> I noticed that once I implemented the file_operation.flush callback
>>> then during run of X, i see the flush callback gets called not only for
>>> Xorg process but for other
>>>
>>> processes such as 'xkbcomp' and even 'sh', it seems like Xorg passes his
>>> FDs to children, Christian mentioned he remembered a discussion to
>>> always set FD_CLOEXEC flag when opening the hardware device file, so
>>>
>>> we suspect a bug in Xorg with regard to this behavior.
>> Try the libdrm patch below.
>>
>> Note that the X server passes DRM file descriptors to DRI3 clients.
> 
> Tried it, didn't help. I still see other processes calling .flush for
> /dev/dri/card0

Try the attached xserver patch on top. With these patches, I no longer 
see any DRM file descriptors being opened without O_CLOEXEC when running 
Xorg -pogo under strace.


Anyway, the kernel can't rely on userspace using O_CLOEXEC. If the flush
callback being called from multiple processes is an issue, maybe the
flush callback isn't appropriate after all.


-- 
Earthling Michel Dänzer               |               http://www.amd.com
Libre software enthusiast             |             Mesa and X developer

[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #2: xserver-DRM-always-CLOEXEC.diff --]
[-- Type: text/x-patch; name="xserver-DRM-always-CLOEXEC.diff", Size: 1271 bytes --]

diff --git a/hw/xfree86/drivers/modesetting/driver.c b/hw/xfree86/drivers/modesetting/driver.c
index 5d8906d63..306541f33 100644
--- a/hw/xfree86/drivers/modesetting/driver.c
+++ b/hw/xfree86/drivers/modesetting/driver.c
@@ -200,12 +200,12 @@ open_hw(const char *dev)
     int fd;
 
     if (dev)
-        fd = open(dev, O_RDWR, 0);
+        fd = open(dev, O_RDWR | O_CLOEXEC, 0);
     else {
         dev = getenv("KMSDEVICE");
-        if ((NULL == dev) || ((fd = open(dev, O_RDWR, 0)) == -1)) {
+        if ((NULL == dev) || ((fd = open(dev, O_RDWR | O_CLOEXEC, 0)) == -1)) {
             dev = "/dev/dri/card0";
-            fd = open(dev, O_RDWR, 0);
+            fd = open(dev, O_RDWR | O_CLOEXEC, 0);
         }
     }
     if (fd == -1)
diff --git a/hw/xfree86/os-support/linux/lnx_platform.c b/hw/xfree86/os-support/linux/lnx_platform.c
index 11af52c46..70374ace8 100644
--- a/hw/xfree86/os-support/linux/lnx_platform.c
+++ b/hw/xfree86/os-support/linux/lnx_platform.c
@@ -43,7 +43,7 @@ get_drm_info(struct OdevAttributes *attribs, char *path, int delayed_index)
     }
 
     if (fd == -1)
-        fd = open(path, O_RDWR, O_CLOEXEC);
+        fd = open(path, O_RDWR | O_CLOEXEC, 0);
 
     if (fd == -1)
         return FALSE;

[-- Attachment #3: Type: text/plain, Size: 160 bytes --]

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply related	[flat|nested] 122+ messages in thread

* Re: [PATCH 2/3] drm/scheduler: Don't call wait_event_killable for signaled process.
  2018-05-18  8:46                                       ` Michel Dänzer
@ 2018-05-18  9:42                                         ` Christian König
  2018-05-18 14:44                                           ` Michel Dänzer
  2018-05-22 15:49                                         ` Andrey Grodzovsky
  1 sibling, 1 reply; 122+ messages in thread
From: Christian König @ 2018-05-18  9:42 UTC (permalink / raw)
  To: Michel Dänzer, Andrey Grodzovsky; +Cc: Koenig, Christian, ML dri-devel


> Anyway, the kernel can't rely on userspace using O_CLOEXEC. If the flush
> callback being called from multiple processes is an issue, maybe the
> flush callback isn't appropriate after all.

Userspace could also grab a reference just by opening /proc/$pid/fd/*.

The idea is just that when any process which used the fd is killed by a 
signal, we prevent the remaining jobs from being submitted to the hardware.

Christian.
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 122+ messages in thread

* Re: [PATCH 2/3] drm/scheduler: Don't call wait_event_killable for signaled process.
  2018-05-18  9:42                                         ` Christian König
@ 2018-05-18 14:44                                           ` Michel Dänzer
  2018-05-18 14:50                                             ` Christian König
  0 siblings, 1 reply; 122+ messages in thread
From: Michel Dänzer @ 2018-05-18 14:44 UTC (permalink / raw)
  To: christian.koenig; +Cc: ML dri-devel

On 2018-05-18 11:42 AM, Christian König wrote:
> 
>> Anyway, the kernel can't rely on userspace using O_CLOEXEC. If the flush
>> callback being called from multiple processes is an issue, maybe the
>> flush callback isn't appropriate after all.
> 
> Userspace could also grab a reference just by opening /proc/$pid/fd/*.
> 
> The idea is just that when any process which used the fd is killed by a
> signal we drop the remaining jobs from being submitted to the hardware.

This must only affect jobs submitted by the killed process, not those
submitted by other processes.


-- 
Earthling Michel Dänzer               |               http://www.amd.com
Libre software enthusiast             |             Mesa and X developer
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 122+ messages in thread

* Re: [PATCH 2/3] drm/scheduler: Don't call wait_event_killable for signaled process.
  2018-05-18 14:44                                           ` Michel Dänzer
@ 2018-05-18 14:50                                             ` Christian König
  2018-05-18 15:02                                               ` Andrey Grodzovsky
  0 siblings, 1 reply; 122+ messages in thread
From: Christian König @ 2018-05-18 14:50 UTC (permalink / raw)
  To: Michel Dänzer; +Cc: ML dri-devel

Am 18.05.2018 um 16:44 schrieb Michel Dänzer:
> On 2018-05-18 11:42 AM, Christian König wrote:
>>> Anyway, the kernel can't rely on userspace using O_CLOEXEC. If the flush
>>> callback being called from multiple processes is an issue, maybe the
>>> flush callback isn't appropriate after all.
>> Userspace could also grab a reference just by opening /proc/$pid/fd/*.
>>
>> The idea is just that when any process which used the fd is killed by a
>> signal we drop the remaining jobs from being submitted to the hardware.
> This must only affect jobs submitted by the killed process, not those
> submitted by other processes.

Yeah, that's exactly the plan here.

For additional security we could save the pid of the job submitter, but 
since this should basically not happen in normal operation I would 
rather avoid that.

Christian.
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 122+ messages in thread

* Re: [PATCH 2/3] drm/scheduler: Don't call wait_event_killable for signaled process.
  2018-05-18 14:50                                             ` Christian König
@ 2018-05-18 15:02                                               ` Andrey Grodzovsky
  2018-05-22 12:58                                                 ` Christian König
  0 siblings, 1 reply; 122+ messages in thread
From: Andrey Grodzovsky @ 2018-05-18 15:02 UTC (permalink / raw)
  To: Christian König, Michel Dänzer; +Cc: ML dri-devel



On 05/18/2018 10:50 AM, Christian König wrote:
> Am 18.05.2018 um 16:44 schrieb Michel Dänzer:
>> On 2018-05-18 11:42 AM, Christian König wrote:
>>>> Anyway, the kernel can't rely on userspace using O_CLOEXEC. If the 
>>>> flush
>>>> callback being called from multiple processes is an issue, maybe the
>>>> flush callback isn't appropriate after all.
>>> Userspace could also grab a reference just by opening /proc/$pid/fd/*.
>>>
>>> The idea is just that when any process which used the fd is killed by a
>>> signal we drop the remaining jobs from being submitted to the hardware.
>> This must only affect jobs submitted by the killed process, not those
>> submitted by other processes.
>
> Yeah, that's exactly the plan here.

I don't see how that is going to happen -
.flush is called for any terminating process, regardless of whether it 
submitted jobs or just (accidentally or not) has the device file FD in 
its private file table. So here we are going to have a problem with that 
requirement. If a process is being killed and .flush is executed, I don't 
have any way to know which amdgpu_ctx to choose in order to terminate its 
pending jobs; the only info I have from the .flush caller is the process id.
As it is now, in amdgpu_ctx_mgr_entity_fini and amdgpu_ctx_mgr_entity_cleanup 
we iterate over all the contexts in the context manager list and terminate 
them all, which indeed sounds wrong to me.
I can save the pid of the context creator in the context structure so I 
can match it during the .flush call, but if someone creates the context 
and then passes the context id to another process for the actual job 
submission, this approach won't work either.
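
For illustration, the context-level variant I mean would be roughly the 
following; the creator_pid field and helper names are hypothetical, not 
actual amdgpu code:

	/* At context creation time, remember who created it. */
	static void amdgpu_ctx_note_creator(struct amdgpu_ctx *ctx)
	{
		ctx->creator_pid = get_pid(task_pid(current)); /* put_pid() on ctx free */
	}

	/* In the .flush path, only touch contexts created by the closing task. */
	static bool amdgpu_ctx_created_by_current(struct amdgpu_ctx *ctx)
	{
		return ctx->creator_pid == task_pid(current);
	}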

Am I missing something here?

Andrey

>
> For additional security we could safe the pid of the job submitter, 
> but since this should basically not happen in normal operation I would 
> rather like to avoid that.
>
> Christian.

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 122+ messages in thread

* Re: [PATCH 2/3] drm/scheduler: Don't call wait_event_killable for signaled process.
  2018-05-18 15:02                                               ` Andrey Grodzovsky
@ 2018-05-22 12:58                                                 ` Christian König
  0 siblings, 0 replies; 122+ messages in thread
From: Christian König @ 2018-05-22 12:58 UTC (permalink / raw)
  To: Andrey Grodzovsky, Michel Dänzer; +Cc: ML dri-devel

Am 18.05.2018 um 17:02 schrieb Andrey Grodzovsky:
>
>
> On 05/18/2018 10:50 AM, Christian König wrote:
>> Am 18.05.2018 um 16:44 schrieb Michel Dänzer:
>>> On 2018-05-18 11:42 AM, Christian König wrote:
>>>>> Anyway, the kernel can't rely on userspace using O_CLOEXEC. If the 
>>>>> flush
>>>>> callback being called from multiple processes is an issue, maybe the
>>>>> flush callback isn't appropriate after all.
>>>> Userspace could also grab a reference just by opening /proc/$pid/fd/*.
>>>>
>>>> The idea is just that when any process which used the fd is killed 
>>>> by a
>>>> signal we drop the remaining jobs from being submitted to the 
>>>> hardware.
>>> This must only affect jobs submitted by the killed process, not those
>>> submitted by other processes.
>>
>> Yeah, that's exactly the plan here.
>
> I don't see how it's gong to happen -
> .flush is being called for any terminating process regardless if he 
> submitted jobs
> or just accidentally (or not)  has the device file FD in his private 
> file table. So here
> we going to have a problem with that requirement. If a process is 
> being killed and .flush is
> executed I don't have any way to know which  amdgpu_ctx to chose to 
> terminate it's pending jobs.
> The only info i have from .flush caller is the process id.
> As it's now in amdgpu_ctx_mgr_entity_fini and 
> amdgpu_ctx_mgr_entity_cleanup we are going to iterate
> all the contextes from the context manager list and terminate them 
> all, which sounds wrong to me indeed.
> I can save the pid of the context creator on the context structure so 
> i can match during .flush call, but in case some one
> creates the context but passes the context id to another process for 
> actual job submission this approach won't work either.
>
> Am I messing something here ?

Your analysis is correct, it's just that I think this case should 
not happen.

What can happen is that the fd is accidentally passed to child processes 
and those child processes are then killed, but passing the fd to child 
processes is a bug in the first place.

When somebody opens the fd on purpose and then kills the process, things 
break and they get to keep the pieces. To open the fd you need to be 
privileged anyway.

What we could do to completely fix the issue:
1. Note for each submitted job which process (pid) submitted it.
2. During flush, wait for or kill only the jobs of the current process.

But I think that this is overkill.
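
A sketch of what that per-job bookkeeping could look like (only an 
illustration of the two steps above; the owner_pid field on the job is an 
assumption, not an existing scheduler field):

	/* 1. At submission time, tag the job with the submitting process. */
	static void drm_sched_job_note_owner(struct drm_sched_job *job)
	{
		job->owner_pid = get_pid(task_pid(current));
	}

	/* 2. During flush, only wait for (or kill) jobs owned by the closing task. */
	static bool drm_sched_job_owned_by_current(struct drm_sched_job *job)
	{
		return job->owner_pid == task_pid(current);
	}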

Christian.

>
> Andrey
>
>>
>> For additional security we could safe the pid of the job submitter, 
>> but since this should basically not happen in normal operation I 
>> would rather like to avoid that.
>>
>> Christian.
>

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 122+ messages in thread

* Re: [PATCH 2/3] drm/scheduler: Don't call wait_event_killable for signaled process.
  2018-05-18  8:46                                       ` Michel Dänzer
  2018-05-18  9:42                                         ` Christian König
@ 2018-05-22 15:49                                         ` Andrey Grodzovsky
  2018-05-22 16:09                                           ` Michel Dänzer
  1 sibling, 1 reply; 122+ messages in thread
From: Andrey Grodzovsky @ 2018-05-22 15:49 UTC (permalink / raw)
  To: Michel Dänzer; +Cc: Koenig, Christian, ML dri-devel

[-- Attachment #1: Type: text/plain, Size: 1789 bytes --]



On 05/18/2018 04:46 AM, Michel Dänzer wrote:
> On 2018-05-17 09:05 PM, Andrey Grodzovsky wrote:
>> On 05/17/2018 10:48 AM, Michel Dänzer wrote:
>>> On 2018-05-17 01:18 PM, Andrey Grodzovsky wrote:
>>>> Hi Michele and others, I am trying to implement the approach bellow to
>>>> resolve AMDGPU's hang when commands are stuck in pipe during process
>>>> exit.
>>>>
>>>> I noticed that once I implemented the file_operation.flush callback
>>>> then during run of X, i see the flush callback gets called not only for
>>>> Xorg process but for other
>>>>
>>>> processes such as 'xkbcomp' and even 'sh', it seems like Xorg passes his
>>>> FDs to children, Christian mentioned he remembered a discussion to
>>>> always set FD_CLOEXEC flag when opening the hardware device file, so
>>>>
>>>> we suspect a bug in Xorg with regard to this behavior.
>>> Try the libdrm patch below.
>>>
>>> Note that the X server passes DRM file descriptors to DRI3 clients.
>> Tried it, didn't help. I still see other processes calling .flush for
>> /dev/dri/card0
> Try the attached xserver patch on top. With these patches, I no longer
> see any DRM file descriptors being opened without O_CLOEXEC running Xorg
> -pogo in strace.

Thanks for the patch; unfortunately this is my first time building xorg 
from source and I hit some blocks with dependencies. I wonder if you 
could quickly apply the attached small patch to amdgpu and run xinit from 
the command line. If the FD is no longer passed, you will only see the 
Xorg print in dmesg afterwards; otherwise 'sh' and 'xkbcomp' will also 
get printed.

Andrey

>
> Anyway, the kernel can't rely on userspace using O_CLOEXEC. If the flush
> callback being called from multiple processes is an issue, maybe the
> flush callback isn't appropriate after all.
>
>


[-- Attachment #2: test_flush.patch --]
[-- Type: text/x-patch, Size: 681 bytes --]

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
index b0bf2f2..1f63712 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
@@ -855,9 +855,18 @@ static const struct dev_pm_ops amdgpu_pm_ops = {
 	.runtime_idle = amdgpu_pmops_runtime_idle,
 };
 
+static int amdgpu_flush(struct file *f, fl_owner_t id) {
+
+	DRM_ERROR("%s\n", current->comm);
+
+       return 0;
+}
+
+
 static const struct file_operations amdgpu_driver_kms_fops = {
 	.owner = THIS_MODULE,
 	.open = drm_open,
+	.flush = amdgpu_flush,
 	.release = drm_release,
 	.unlocked_ioctl = amdgpu_drm_ioctl,
 	.mmap = amdgpu_mmap,

[-- Attachment #3: Type: text/plain, Size: 160 bytes --]

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply related	[flat|nested] 122+ messages in thread

* Re: [PATCH 2/3] drm/scheduler: Don't call wait_event_killable for signaled process.
  2018-05-22 15:49                                         ` Andrey Grodzovsky
@ 2018-05-22 16:09                                           ` Michel Dänzer
  2018-05-22 16:30                                             ` Andrey Grodzovsky
  0 siblings, 1 reply; 122+ messages in thread
From: Michel Dänzer @ 2018-05-22 16:09 UTC (permalink / raw)
  To: Andrey Grodzovsky; +Cc: Koenig, Christian, ML dri-devel

On 2018-05-22 05:49 PM, Andrey Grodzovsky wrote:
> On 05/18/2018 04:46 AM, Michel Dänzer wrote:
>> On 2018-05-17 09:05 PM, Andrey Grodzovsky wrote:
>>> On 05/17/2018 10:48 AM, Michel Dänzer wrote:
>>>> On 2018-05-17 01:18 PM, Andrey Grodzovsky wrote:
>>>>> Hi Michele and others, I am trying to implement the approach bellow to
>>>>> resolve AMDGPU's hang when commands are stuck in pipe during process
>>>>> exit.
>>>>>
>>>>> I noticed that once I implemented the file_operation.flush callback
>>>>> then during run of X, i see the flush callback gets called not only
>>>>> for
>>>>> Xorg process but for other
>>>>>
>>>>> processes such as 'xkbcomp' and even 'sh', it seems like Xorg
>>>>> passes his
>>>>> FDs to children, Christian mentioned he remembered a discussion to
>>>>> always set FD_CLOEXEC flag when opening the hardware device file, so
>>>>>
>>>>> we suspect a bug in Xorg with regard to this behavior.
>>>> Try the libdrm patch below.
>>>>
>>>> Note that the X server passes DRM file descriptors to DRI3 clients.
>>> Tried it, didn't help. I still see other processes calling .flush for
>>> /dev/dri/card0
>> Try the attached xserver patch on top. With these patches, I no longer
>> see any DRM file descriptors being opened without O_CLOEXEC running Xorg
>> -pogo in strace.
> 
> Thanks for the patch, unfortunately this is my first time  building xorg
>> from source and I hit some roadblocks with dependencies. I wonder if you
> could quickly apply to amdgpu the attached small patch and run xinit from
> command line. In case the FD is not passed any more you will only see
>> Xorg print in dmesg afterwards, otherwise 'sh' and 'xkbcomp' will also
> get printed.

As I said above, with these patches I don't see any DRM file descriptors
being opened without O_CLOEXEC, so they aren't getting passed to sh or
xkbcomp.
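
For background on the descriptor passing mentioned above: DRI3 hands DRM/dma-buf
file descriptors to clients over the X connection using the standard SCM_RIGHTS
ancillary-data mechanism of Unix-domain sockets. A generic sketch of that
mechanism follows; it is not X server code, and 'sock' is assumed to be an
already-connected AF_UNIX socket.

#include <string.h>
#include <sys/socket.h>
#include <sys/uio.h>

static int send_fd(int sock, int fd_to_pass)
{
	char dummy = 'x';
	struct iovec iov = { .iov_base = &dummy, .iov_len = 1 };
	union {
		struct cmsghdr align;	/* ensures proper alignment of buf */
		char buf[CMSG_SPACE(sizeof(int))];
	} u;
	struct msghdr msg = {
		.msg_iov = &iov,
		.msg_iovlen = 1,
		.msg_control = u.buf,
		.msg_controllen = sizeof(u.buf),
	};
	struct cmsghdr *cmsg;

	memset(u.buf, 0, sizeof(u.buf));
	cmsg = CMSG_FIRSTHDR(&msg);
	cmsg->cmsg_level = SOL_SOCKET;
	cmsg->cmsg_type = SCM_RIGHTS;
	cmsg->cmsg_len = CMSG_LEN(sizeof(int));
	memcpy(CMSG_DATA(cmsg), &fd_to_pass, sizeof(int));

	/* The receiver gets a duplicate of fd_to_pass in its own fd table. */
	return sendmsg(sock, &msg, 0) == 1 ? 0 : -1;
}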


-- 
Earthling Michel Dänzer               |               http://www.amd.com
Libre software enthusiast             |             Mesa and X developer
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 122+ messages in thread

* Re: [PATCH 2/3] drm/scheduler: Don't call wait_event_killable for signaled process.
  2018-05-22 16:09                                           ` Michel Dänzer
@ 2018-05-22 16:30                                             ` Andrey Grodzovsky
  2018-05-22 16:33                                               ` Michel Dänzer
  0 siblings, 1 reply; 122+ messages in thread
From: Andrey Grodzovsky @ 2018-05-22 16:30 UTC (permalink / raw)
  To: Michel Dänzer; +Cc: Koenig, Christian, ML dri-devel



On 05/22/2018 12:09 PM, Michel Dänzer wrote:
> On 2018-05-22 05:49 PM, Andrey Grodzovsky wrote:
>> On 05/18/2018 04:46 AM, Michel Dänzer wrote:
>>> On 2018-05-17 09:05 PM, Andrey Grodzovsky wrote:
>>>> On 05/17/2018 10:48 AM, Michel Dänzer wrote:
>>>>> On 2018-05-17 01:18 PM, Andrey Grodzovsky wrote:
>>>>>> Hi Michel and others, I am trying to implement the approach below to
>>>>>> resolve AMDGPU's hang when commands are stuck in pipe during process
>>>>>> exit.
>>>>>>
>>>>>> I noticed that once I implemented the file_operation.flush callback
>>>>>> then during a run of X, I see the flush callback gets called not only
>>>>>> for
>>>>>> Xorg process but for other
>>>>>>
>>>>>> processes such as 'xkbcomp' and even 'sh', it seems like Xorg
>>>>>> passes its
>>>>>> FDs to children, Christian mentioned he remembered a discussion to
>>>>>> always set FD_CLOEXEC flag when opening the hardware device file, so
>>>>>>
>>>>>> we suspect a bug in Xorg with regard to this behavior.
>>>>> Try the libdrm patch below.
>>>>>
>>>>> Note that the X server passes DRM file descriptors to DRI3 clients.
>>>> Tried it, didn't help. I still see other processes calling .flush for
>>>> /dev/dri/card0
>>> Try the attached xserver patch on top. With these patches, I no longer
>>> see any DRM file descriptors being opened without O_CLOEXEC running Xorg
>>> -pogo in strace.
>> Thanks for the patch, unfortunately this is my first time  building xorg
>> from source and I hit some roadblocks with dependencies. I wonder if you
>> could quickly apply to amdgpu the attached small patch and run xinit from
>> command line. In case the FD is not passed any more you will only see
>> Xorg print in dmesg afterwards, otherwise 'sh' and 'xkbcomp' will also
>> get printed.
> As I said above, with these patches I don't see any DRM file descriptors
> being opened without O_CLOEXEC, so they aren't getting passed to sh or
> xkbcomp.

OK then, please put me on CC when you send this patch for review.

Andrey

>
>

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 122+ messages in thread

* Re: [PATCH 2/3] drm/scheduler: Don't call wait_event_killable for signaled process.
  2018-05-22 16:30                                             ` Andrey Grodzovsky
@ 2018-05-22 16:33                                               ` Michel Dänzer
  2018-05-22 16:37                                                 ` Andrey Grodzovsky
  0 siblings, 1 reply; 122+ messages in thread
From: Michel Dänzer @ 2018-05-22 16:33 UTC (permalink / raw)
  To: Andrey Grodzovsky; +Cc: Koenig, Christian, ML dri-devel

On 2018-05-22 06:30 PM, Andrey Grodzovsky wrote:
> On 05/22/2018 12:09 PM, Michel Dänzer wrote:
>> On 2018-05-22 05:49 PM, Andrey Grodzovsky wrote:
>>> On 05/18/2018 04:46 AM, Michel Dänzer wrote:
>>>> On 2018-05-17 09:05 PM, Andrey Grodzovsky wrote:
>>>>> On 05/17/2018 10:48 AM, Michel Dänzer wrote:
>>>>>> On 2018-05-17 01:18 PM, Andrey Grodzovsky wrote:
>>>>>>> Hi Michel and others, I am trying to implement the approach
>>>>>>> below to
>>>>>>> resolve AMDGPU's hang when commands are stuck in pipe during process
>>>>>>> exit.
>>>>>>>
>>>>>>> I noticed that once I implemented the file_operation.flush callback
>>>>>>> then during a run of X, I see the flush callback gets called not only
>>>>>>> for
>>>>>>> Xorg process but for other
>>>>>>>
>>>>>>> processes such as 'xkbcomp' and even 'sh', it seems like Xorg
>>>>>>> passes its
>>>>>>> FDs to children, Christian mentioned he remembered a discussion to
>>>>>>> always set FD_CLOEXEC flag when opening the hardware device file, so
>>>>>>>
>>>>>>> we suspect a bug in Xorg with regard to this behavior.
>>>>>> Try the libdrm patch below.
>>>>>>
>>>>>> Note that the X server passes DRM file descriptors to DRI3 clients.
>>>>> Tried it, didn't help. I still see other processes calling .flush for
>>>>> /dev/dri/card0
>>>> Try the attached xserver patch on top. With these patches, I no longer
>>>> see any DRM file descriptors being opened without O_CLOEXEC running
>>>> Xorg
>>>> -pogo in strace.
>>> Thanks for the patch, unfortunately this is my first time  building xorg
>>> from source and I hit some roadblocks with dependencies. I wonder if you
>>> could quickly apply to amdgpu the attached small patch and run xinit
>>> from
>>> command line. In case the FD is not passed any more you will only see
>>> Xorg print in dmesg afterwards, otherwise 'sh' and 'xkbcomp' will also
>>> get printed.
>> As I said above, with these patches I don't see any DRM file descriptors
>> being opened without O_CLOEXEC, so they aren't getting passed to sh or
>> xkbcomp.
> 
> OK then, please put me on CC when you send this patch for review.

Both patches have already landed on the respective Git master branches.


-- 
Earthling Michel Dänzer               |               http://www.amd.com
Libre software enthusiast             |             Mesa and X developer
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 122+ messages in thread

* Re: [PATCH 2/3] drm/scheduler: Don't call wait_event_killable for signaled process.
  2018-05-22 16:33                                               ` Michel Dänzer
@ 2018-05-22 16:37                                                 ` Andrey Grodzovsky
  0 siblings, 0 replies; 122+ messages in thread
From: Andrey Grodzovsky @ 2018-05-22 16:37 UTC (permalink / raw)
  To: Michel Dänzer; +Cc: Koenig, Christian, ML dri-devel



On 05/22/2018 12:33 PM, Michel Dänzer wrote:
> On 2018-05-22 06:30 PM, Andrey Grodzovsky wrote:
>> On 05/22/2018 12:09 PM, Michel Dänzer wrote:
>>> On 2018-05-22 05:49 PM, Andrey Grodzovsky wrote:
>>>> On 05/18/2018 04:46 AM, Michel Dänzer wrote:
>>>>> On 2018-05-17 09:05 PM, Andrey Grodzovsky wrote:
>>>>>> On 05/17/2018 10:48 AM, Michel Dänzer wrote:
>>>>>>> On 2018-05-17 01:18 PM, Andrey Grodzovsky wrote:
>>>>>>>> Hi Michel and others, I am trying to implement the approach
>>>>>>>> below to
>>>>>>>> resolve AMDGPU's hang when commands are stuck in pipe during process
>>>>>>>> exit.
>>>>>>>>
>>>>>>>> I noticed that once I implemented the file_operation.flush callback
>>>>>>>> then during a run of X, I see the flush callback gets called not only
>>>>>>>> for
>>>>>>>> Xorg process but for other
>>>>>>>>
>>>>>>>> processes such as 'xkbcomp' and even 'sh', it seems like Xorg
>>>>>>>> passes its
>>>>>>>> FDs to children, Christian mentioned he remembered a discussion to
>>>>>>>> always set FD_CLOEXEC flag when opening the hardware device file, so
>>>>>>>>
>>>>>>>> we suspect a bug in Xorg with regard to this behavior.
>>>>>>> Try the libdrm patch below.
>>>>>>>
>>>>>>> Note that the X server passes DRM file descriptors to DRI3 clients.
>>>>>> Tried it, didn't help. I still see other processes calling .flush for
>>>>>> /dev/dri/card0
>>>>> Try the attached xserver patch on top. With these patches, I no longer
>>>>> see any DRM file descriptors being opened without O_CLOEXEC running
>>>>> Xorg
>>>>> -pogo in strace.
>>>> Thanks for the patch, unfortunately this is my first time  building xorg
>>>> from source and I hit some roadblocks with dependencies. I wonder if you
>>>> could quickly apply to amdgpu the attached small patch and run xinit
>>>> from
>>>> command line. In case the FD is not passed any more you will only see
>>>> Xorg print in dmesg afterwards, otherwise 'sh' and 'xkbcomp' will also
>>>> get printed.
>>> As I said above, with these patches I don't see any DRM file descriptors
>>> being opened without O_CLOEXEC, so they aren't getting passed to sh or
>>> xkbcomp.
>> OK then, please put me on CC when you send this patch for review.
> Both patches have already landed on the respective Git master branches.

Good to know.

Andrey

>
>

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 122+ messages in thread

* Re: [PATCH 2/3] drm/scheduler: Don't call wait_event_killable for signaled process.
@ 2018-05-23 15:08                               ` Andrey Grodzovsky
  0 siblings, 0 replies; 122+ messages in thread
From: Andrey Grodzovsky @ 2018-05-23 15:08 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: christian.koenig, Eric W. Biederman, David.Panariti, amd-gfx,
	linux-kernel, Alexander.Deucher, akpm



On 05/01/2018 10:35 AM, Oleg Nesterov wrote:
> On 04/30, Andrey Grodzovsky wrote:
>> On 04/30/2018 12:00 PM, Oleg Nesterov wrote:
>>> On 04/30, Andrey Grodzovsky wrote:
>>>> What about changing PF_SIGNALED to  PF_EXITING in
>>>> drm_sched_entity_do_release
>>>>
>>>> -       if ((current->flags & PF_SIGNALED) && current->exit_code == SIGKILL)
>>>> +      if ((current->flags & PF_EXITING) && current->exit_code == SIGKILL)
>>> let me repeat, please don't use task->exit_code. And in fact this check is racy
>>>
>>> But this doesn't matter. Say, we can trivially add SIGNAL_GROUP_KILLED_BY_SIGKILL,
>>> or do something else,
>> Can you explain where is the race and what is a possible alternative then ?
> Oh. I mentioned this race automatically, because I am a pedant ;) Let me repeat
> that this doesn't really matter, and let me remind that the caller of fop->release
> can be completely unrelated process, say $cat /proc/pid/fdinfo. And in any case
> ->exit_code should not be used outside of ptrace/exit paths.
>
> OK, the race. Consider a process P with a main thread M and a sub-thread T.
>
> T does pthread_exit(), enters do_exit() and gets a preemption before exit_files().
>
> The process is killed by SIGKILL. M calls do_group_exit(), do_exit() and passes
> exit_files(). However, it doesn't call close_files() because T has another reference.
>
> T resumes, calls close_files(), fput(), etc, and then exit_task_work(), so
> it can finally call ->release() with current->exit_code == 0 despite the fact
> the process was killed.
>
> Again, again, this doesn't matter. We can distinguish killed-or-not, by SIGKILL-
> or-not. But I still do not think we actually need this. At least in ->release()
> paths, ->flush() may differ.

Hi Oleg, I reworked the code to use the .flush hook and eliminated
wait_event_killable.
So in the .flush case, is it OK to look at task->exit_code == SIGKILL to
distinguish exit by signal?

Andrey

>
> Oleg.
>

^ permalink raw reply	[flat|nested] 122+ messages in thread

* Re: [PATCH 2/3] drm/scheduler: Don't call wait_event_killable for signaled process.
@ 2018-05-23 15:08                               ` Andrey Grodzovsky
  0 siblings, 0 replies; 122+ messages in thread
From: Andrey Grodzovsky @ 2018-05-23 15:08 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: David.Panariti-5C7GfCeVMHo, linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW, Eric W. Biederman,
	Alexander.Deucher-5C7GfCeVMHo,
	akpm-de/tnXTf+JLsfHDXvbKv3WD2FQJk+8+b,
	christian.koenig-5C7GfCeVMHo



On 05/01/2018 10:35 AM, Oleg Nesterov wrote:
> On 04/30, Andrey Grodzovsky wrote:
>> On 04/30/2018 12:00 PM, Oleg Nesterov wrote:
>>> On 04/30, Andrey Grodzovsky wrote:
>>>> What about changing PF_SIGNALED to  PF_EXITING in
>>>> drm_sched_entity_do_release
>>>>
>>>> -       if ((current->flags & PF_SIGNALED) && current->exit_code == SIGKILL)
>>>> +      if ((current->flags & PF_EXITING) && current->exit_code == SIGKILL)
>>> let me repeat, please don't use task->exit_code. And in fact this check is racy
>>>
>>> But this doesn't matter. Say, we can trivially add SIGNAL_GROUP_KILLED_BY_SIGKILL,
>>> or do something else,
>> Can you explain where is the race and what is a possible alternative then ?
> Oh. I mentioned this race automatically, because I am a pedant ;) Let me repeat
> that this doesn't really matter, and let me remind that the caller of fop->release
> can be completely unrelated process, say $cat /proc/pid/fdinfo. And in any case
> ->exit_code should not be used outside of ptrace/exit paths.
>
> OK, the race. Consider a process P with a main thread M and a sub-thread T.
>
> T does pthread_exit(), enters do_exit() and gets a preemption before exit_files().
>
> The process is killed by SIGKILL. M calls do_group_exit(), do_exit() and passes
> exit_files(). However, it doesn't call close_files() because T has another reference.
>
> T resumes, calls close_files(), fput(), etc, and then exit_task_work(), so
> it can finally call ->release() with current->exit_code == 0 despite the fact
> the process was killed.
>
> Again, again, this doesn't matter. We can distinguish killed-or-not, by SIGKILL-
> or-not. But I still do not think we actually need this. At least in ->release()
> paths, ->flush() may differ.

Hi Oleg, I reworked the code to use the .flush hook and eliminated
wait_event_killable.
So in the .flush case, is it OK to look at task->exit_code == SIGKILL to
distinguish exit by signal?

Andrey

>
> Oleg.
>
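
As a rough illustration of the direction discussed above: once a task is
already exiting, further SIGKILLs are not generated for it (which is what
patch 1 in this series was about), so an unbounded killable wait in the
fd-teardown path can hang forever. The sketch below falls back to a bounded
wait in that case. It uses made-up names (struct my_entity, my_entity_flush)
and an arbitrary 100 ms bound; it is not the reworked patch itself, only a
sketch of the idea.

/*
 * Illustrative sketch only; the entity type, wait queue and idle predicate
 * are hypothetical stand-ins, not the actual drm_sched code.
 */
#include <linux/compiler.h>
#include <linux/jiffies.h>
#include <linux/sched.h>
#include <linux/wait.h>

struct my_entity {
	wait_queue_head_t idle_wq;	/* woken when all queued jobs are done */
	bool idle;
};

static bool my_entity_is_idle(struct my_entity *e)
{
	return READ_ONCE(e->idle);
}

/* Called from a file_operations.flush-style path when an fd is closed. */
static void my_entity_flush(struct my_entity *e)
{
	if (current->flags & PF_EXITING) {
		/*
		 * The closing task is already exiting (e.g. after SIGKILL);
		 * new fatal signals will not reach it, so never block it in
		 * an unbounded killable wait here.
		 */
		wait_event_timeout(e->idle_wq, my_entity_is_idle(e),
				   msecs_to_jiffies(100));
		return;
	}

	/* Normal close: a killable wait is still acceptable. */
	wait_event_killable(e->idle_wq, my_entity_is_idle(e));
}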

_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 122+ messages in thread

end of thread, other threads:[~2018-05-23 15:08 UTC | newest]

Thread overview: 122+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-04-24 15:30 Avoid uninterruptible sleep during process exit Andrey Grodzovsky
2018-04-24 15:30 ` Andrey Grodzovsky
2018-04-24 15:30 ` [PATCH 1/3] signals: Allow generation of SIGKILL to exiting task Andrey Grodzovsky
2018-04-24 15:30   ` Andrey Grodzovsky
2018-04-24 16:10   ` Eric W. Biederman
2018-04-24 16:10     ` Eric W. Biederman
2018-04-24 16:42   ` Eric W. Biederman
2018-04-24 16:42     ` Eric W. Biederman
2018-04-24 16:51     ` Andrey Grodzovsky
2018-04-24 16:51       ` Andrey Grodzovsky
2018-04-24 17:29       ` Eric W. Biederman
2018-04-25 13:13   ` Oleg Nesterov
2018-04-24 15:30 ` [PATCH 2/3] drm/scheduler: Don't call wait_event_killable for signaled process Andrey Grodzovsky
2018-04-24 15:30   ` Andrey Grodzovsky
2018-04-24 15:46   ` Michel Dänzer
2018-04-24 15:51     ` Andrey Grodzovsky
2018-04-24 15:51       ` Andrey Grodzovsky
2018-04-24 15:52     ` Andrey Grodzovsky
2018-04-24 15:52       ` Andrey Grodzovsky
2018-04-24 19:44     ` Daniel Vetter
2018-04-24 19:44       ` Daniel Vetter
2018-04-24 21:00       ` Eric W. Biederman
2018-04-24 21:02       ` Andrey Grodzovsky
2018-04-24 21:02         ` Andrey Grodzovsky
2018-04-24 21:21         ` Eric W. Biederman
2018-04-24 21:37           ` Andrey Grodzovsky
2018-04-24 21:37             ` Andrey Grodzovsky
2018-04-24 22:11             ` Eric W. Biederman
2018-04-25  7:14             ` Daniel Vetter
2018-04-25 13:08               ` Andrey Grodzovsky
2018-04-25 13:08                 ` Andrey Grodzovsky
2018-04-25 15:29                 ` Eric W. Biederman
2018-04-25 16:13                   ` Andrey Grodzovsky
2018-04-25 16:31                     ` Eric W. Biederman
2018-04-24 21:40         ` Daniel Vetter
2018-04-24 21:40           ` Daniel Vetter
2018-04-25 13:22           ` Oleg Nesterov
2018-04-25 13:36             ` Daniel Vetter
2018-04-25 14:18               ` Oleg Nesterov
2018-04-25 14:18                 ` Oleg Nesterov
2018-04-25 13:43           ` Andrey Grodzovsky
2018-04-25 13:43             ` Andrey Grodzovsky
2018-04-24 16:23   ` Eric W. Biederman
2018-04-24 16:23     ` Eric W. Biederman
2018-04-24 16:43     ` Andrey Grodzovsky
2018-04-24 16:43       ` Andrey Grodzovsky
2018-04-24 17:12       ` Eric W. Biederman
2018-04-25 13:55         ` Oleg Nesterov
2018-04-25 14:21           ` Andrey Grodzovsky
2018-04-25 14:21             ` Andrey Grodzovsky
2018-04-25 17:17             ` Oleg Nesterov
2018-04-25 18:40               ` Andrey Grodzovsky
2018-04-25 18:40                 ` Andrey Grodzovsky
2018-04-26  0:01                 ` Eric W. Biederman
2018-04-26 12:34                   ` Andrey Grodzovsky
2018-04-26 12:34                     ` Andrey Grodzovsky
2018-04-26 12:52                     ` Andrey Grodzovsky
2018-04-26 12:52                       ` Andrey Grodzovsky
2018-04-26 15:57                       ` Eric W. Biederman
2018-04-26 20:43                         ` Andrey Grodzovsky
2018-04-26 20:43                           ` Andrey Grodzovsky
2018-04-30 12:08                   ` Christian König
2018-04-30 12:08                     ` Christian König
2018-04-30 14:32                     ` Andrey Grodzovsky
2018-04-30 14:32                       ` Andrey Grodzovsky
2018-04-30 15:25                       ` Christian König
2018-04-30 15:25                         ` Christian König
2018-04-30 16:00                       ` Oleg Nesterov
2018-04-30 16:10                         ` Andrey Grodzovsky
2018-04-30 16:10                           ` Andrey Grodzovsky
2018-04-30 18:29                           ` Christian König
2018-04-30 18:29                             ` Christian König
2018-04-30 19:28                             ` Andrey Grodzovsky
2018-04-30 19:28                               ` Andrey Grodzovsky
2018-05-02 11:48                               ` Christian König
2018-05-02 11:48                                 ` Christian König
2018-05-17 11:18                                 ` Andrey Grodzovsky
2018-05-17 14:48                                   ` Michel Dänzer
2018-05-17 15:33                                     ` Andrey Grodzovsky
2018-05-17 15:52                                       ` Michel Dänzer
2018-05-17 19:05                                     ` Andrey Grodzovsky
2018-05-18  8:46                                       ` Michel Dänzer
2018-05-18  9:42                                         ` Christian König
2018-05-18 14:44                                           ` Michel Dänzer
2018-05-18 14:50                                             ` Christian König
2018-05-18 15:02                                               ` Andrey Grodzovsky
2018-05-22 12:58                                                 ` Christian König
2018-05-22 15:49                                         ` Andrey Grodzovsky
2018-05-22 16:09                                           ` Michel Dänzer
2018-05-22 16:30                                             ` Andrey Grodzovsky
2018-05-22 16:33                                               ` Michel Dänzer
2018-05-22 16:37                                                 ` Andrey Grodzovsky
2018-05-01 14:35                           ` Oleg Nesterov
2018-05-23 15:08                             ` Andrey Grodzovsky
2018-05-23 15:08                               ` Andrey Grodzovsky
2018-04-30 15:29                     ` Oleg Nesterov
2018-04-30 16:25                     ` Eric W. Biederman
2018-04-30 17:18                       ` Andrey Grodzovsky
2018-04-30 17:18                         ` Andrey Grodzovsky
2018-04-25 13:05   ` Oleg Nesterov
2018-04-24 15:30 ` [PATCH 3/3] drm/amdgpu: Switch to interrupted wait to recover from ring hang Andrey Grodzovsky
2018-04-24 15:30   ` Andrey Grodzovsky
2018-04-24 15:52   ` Panariti, David
2018-04-24 15:52     ` Panariti, David
2018-04-24 15:58     ` Andrey Grodzovsky
2018-04-24 15:58       ` Andrey Grodzovsky
2018-04-24 16:20       ` Panariti, David
2018-04-24 16:20         ` Panariti, David
2018-04-24 16:30         ` Eric W. Biederman
2018-04-24 16:30           ` Eric W. Biederman
2018-04-25 17:17           ` Andrey Grodzovsky
2018-04-25 17:17             ` Andrey Grodzovsky
2018-04-25 20:55             ` Eric W. Biederman
2018-04-25 20:55               ` Eric W. Biederman
2018-04-26 12:28               ` Andrey Grodzovsky
2018-04-26 12:28                 ` Andrey Grodzovsky
2018-04-24 16:14   ` Eric W. Biederman
2018-04-24 16:14     ` Eric W. Biederman
2018-04-24 16:38     ` Andrey Grodzovsky
2018-04-24 16:38       ` Andrey Grodzovsky
2018-04-30 11:34   ` Christian König
2018-04-30 11:34     ` Christian König

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.