* [PATCH] drm/sched: fix the bug of time out calculation(v3)
@ 2021-08-26  4:55 Monk Liu
  2021-08-26 10:09 ` Christian König
  2021-08-26 14:03 ` Andrey Grodzovsky
  0 siblings, 2 replies; 20+ messages in thread
From: Monk Liu @ 2021-08-26  4:55 UTC (permalink / raw)
  To: amd-gfx; +Cc: dri-devel, Monk Liu

issue:
in cleanup_job the cancel_delayed_work will cancel a TO timer
even though its corresponding job is still running.

fix:
do not cancel the timer in cleanup_job; instead do the cancelling
only when the head job is signaled, and if there is a "next" job
we start_timeout again.

v2:
further clean up the logic, and do the TDR timer cancelling if the signaled job
is the last one in its scheduler.

v3:
change the issue description
remove the cancel_delayed_work at the beginning of cleanup_job
restore the implementation of drm_sched_job_begin.

TODO:
1)introduce pause/resume scheduler in job_timeout to serialize the handling
of scheduler and job_timeout (a rough sketch of this idea follows below).
2)drop the bad job's del and insert in scheduler due to the above serialization
(no race issue anymore with the serialization)
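
A rough, purely hypothetical sketch of what TODO item 1) could look like;
kthread_park()/kthread_unpark() are used here to express the pause/resume,
and the simplified handler body (which also drops the bad job del/insert
from item 2)) is an assumption, not the final implementation:

static void drm_sched_job_timedout(struct work_struct *work)
{
	struct drm_gpu_scheduler *sched =
		container_of(work, struct drm_gpu_scheduler, work_tdr.work);
	struct drm_sched_job *job;

	/* "pause" the scheduler: once the thread is parked it cannot be
	 * inside drm_sched_get_cleanup_job() or drm_sched_job_begin()
	 * touching pending_list, so no del/re-insert of the bad job and
	 * no extra locking dance is needed here. */
	kthread_park(sched->thread);

	job = list_first_entry_or_null(&sched->pending_list,
				       struct drm_sched_job, list);
	if (job)
		/* driver timeout handling; under this scheme the driver
		 * would no longer need to stop the scheduler itself */
		job->sched->ops->timedout_job(job);

	/* "resume" the scheduler */
	kthread_unpark(sched->thread);
}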

Signed-off-by: Monk Liu <Monk.Liu@amd.com>
---
 drivers/gpu/drm/scheduler/sched_main.c | 25 ++++++++++---------------
 1 file changed, 10 insertions(+), 15 deletions(-)

diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c
index a2a9536..ecf8140 100644
--- a/drivers/gpu/drm/scheduler/sched_main.c
+++ b/drivers/gpu/drm/scheduler/sched_main.c
@@ -676,13 +676,7 @@ drm_sched_get_cleanup_job(struct drm_gpu_scheduler *sched)
 {
 	struct drm_sched_job *job, *next;
 
-	/*
-	 * Don't destroy jobs while the timeout worker is running  OR thread
-	 * is being parked and hence assumed to not touch pending_list
-	 */
-	if ((sched->timeout != MAX_SCHEDULE_TIMEOUT &&
-	    !cancel_delayed_work(&sched->work_tdr)) ||
-	    kthread_should_park())
+	if (kthread_should_park())
 		return NULL;
 
 	spin_lock(&sched->job_list_lock);
@@ -693,17 +687,21 @@ drm_sched_get_cleanup_job(struct drm_gpu_scheduler *sched)
 	if (job && dma_fence_is_signaled(&job->s_fence->finished)) {
 		/* remove job from pending_list */
 		list_del_init(&job->list);
+
+		/* cancel this job's TO timer */
+		cancel_delayed_work(&sched->work_tdr);
 		/* make the scheduled timestamp more accurate */
 		next = list_first_entry_or_null(&sched->pending_list,
 						typeof(*next), list);
-		if (next)
+
+		if (next) {
 			next->s_fence->scheduled.timestamp =
 				job->s_fence->finished.timestamp;
-
+			/* start TO timer for next job */
+			drm_sched_start_timeout(sched);
+		}
 	} else {
 		job = NULL;
-		/* queue timeout for next job */
-		drm_sched_start_timeout(sched);
 	}
 
 	spin_unlock(&sched->job_list_lock);
@@ -791,11 +789,8 @@ static int drm_sched_main(void *param)
 					  (entity = drm_sched_select_entity(sched))) ||
 					 kthread_should_stop());
 
-		if (cleanup_job) {
+		if (cleanup_job)
 			sched->ops->free_job(cleanup_job);
-			/* queue timeout for next job */
-			drm_sched_start_timeout(sched);
-		}
 
 		if (!entity)
 			continue;
-- 
2.7.4
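
For readability, this is roughly how the touched code reads with the patch
applied (assembled from the hunks above; unchanged lines between the hunks
are elided rather than reproduced):

	/* drm_sched_get_cleanup_job(): the up-front timer cancel is gone,
	 * only the park check remains. */
	if (kthread_should_park())
		return NULL;

	/* ... unchanged lines elided ... */

	if (job && dma_fence_is_signaled(&job->s_fence->finished)) {
		/* remove job from pending_list */
		list_del_init(&job->list);

		/* cancel this job's TO timer */
		cancel_delayed_work(&sched->work_tdr);
		/* make the scheduled timestamp more accurate */
		next = list_first_entry_or_null(&sched->pending_list,
						typeof(*next), list);

		if (next) {
			next->s_fence->scheduled.timestamp =
				job->s_fence->finished.timestamp;
			/* start TO timer for next job */
			drm_sched_start_timeout(sched);
		}
	} else {
		job = NULL;
	}

	spin_unlock(&sched->job_list_lock);

	/* drm_sched_main(): free_job() no longer re-arms the timer. */
	if (cleanup_job)
		sched->ops->free_job(cleanup_job);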



* Re: [PATCH] drm/sched: fix the bug of time out calculation(v3)
  2021-08-26  4:55 [PATCH] drm/sched: fix the bug of time out calculation(v3) Monk Liu
@ 2021-08-26 10:09 ` Christian König
  2021-08-26 11:55   ` Liu, Monk
  2021-08-26 14:03 ` Andrey Grodzovsky
  1 sibling, 1 reply; 20+ messages in thread
From: Christian König @ 2021-08-26 10:09 UTC (permalink / raw)
  To: Monk Liu, amd-gfx; +Cc: dri-devel

On 26.08.21 at 06:55, Monk Liu wrote:
> issue:
> in cleanup_job the cancle_delayed_work will cancel a TO timer
> even the its corresponding job is still running.

Yeah, that makes a lot more sense.

>
> fix:
> do not cancel the timer in cleanup_job, instead do the cancelling
> only when the heading job is signaled, and if there is a "next" job
> we start_timeout again.
>
> v2:
> further cleanup the logic, and do the TDR timer cancelling if the signaled job
> is the last one in its scheduler.
>
> v3:
> change the issue description
> remove the cancel_delayed_work in the begining of the cleanup_job
> recover the implement of drm_sched_job_begin.
>
> TODO:
> 1)introduce pause/resume scheduler in job_timeout to serial the handling
> of scheduler and job_timeout.
> 2)drop the bad job's del and insert in scheduler due to above serialization
> (no race issue anymore with the serialization)
>
> Signed-off-by: Monk Liu <Monk.Liu@amd.com>
> ---
>   drivers/gpu/drm/scheduler/sched_main.c | 25 ++++++++++---------------
>   1 file changed, 10 insertions(+), 15 deletions(-)
>
> diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c
> index a2a9536..ecf8140 100644
> --- a/drivers/gpu/drm/scheduler/sched_main.c
> +++ b/drivers/gpu/drm/scheduler/sched_main.c
> @@ -676,13 +676,7 @@ drm_sched_get_cleanup_job(struct drm_gpu_scheduler *sched)
>   {
>   	struct drm_sched_job *job, *next;
>   
> -	/*
> -	 * Don't destroy jobs while the timeout worker is running  OR thread
> -	 * is being parked and hence assumed to not touch pending_list
> -	 */
> -	if ((sched->timeout != MAX_SCHEDULE_TIMEOUT &&
> -	    !cancel_delayed_work(&sched->work_tdr)) ||
> -	    kthread_should_park())
> +	if (kthread_should_park())
>   		return NULL;
>   
>   	spin_lock(&sched->job_list_lock);
> @@ -693,17 +687,21 @@ drm_sched_get_cleanup_job(struct drm_gpu_scheduler *sched)
>   	if (job && dma_fence_is_signaled(&job->s_fence->finished)) {
>   		/* remove job from pending_list */
>   		list_del_init(&job->list);
> +
> +		/* cancel this job's TO timer */
> +		cancel_delayed_work(&sched->work_tdr);

I'm not sure if the work_tdr is initialized when a maximum timeout is 
specified. Please double check.

BTW: Can we please drop the "tdr" naming from the scheduler? That is
just timeout functionality and not related to recovery in any way.

We don't even start hardware recovery in a lot of cases now (when the
wave kill is successful).

Regards,
Christian.

>   		/* make the scheduled timestamp more accurate */
>   		next = list_first_entry_or_null(&sched->pending_list,
>   						typeof(*next), list);
> -		if (next)
> +
> +		if (next) {
>   			next->s_fence->scheduled.timestamp =
>   				job->s_fence->finished.timestamp;
> -
> +			/* start TO timer for next job */
> +			drm_sched_start_timeout(sched);
> +		}
>   	} else {
>   		job = NULL;
> -		/* queue timeout for next job */
> -		drm_sched_start_timeout(sched);
>   	}
>   
>   	spin_unlock(&sched->job_list_lock);
> @@ -791,11 +789,8 @@ static int drm_sched_main(void *param)
>   					  (entity = drm_sched_select_entity(sched))) ||
>   					 kthread_should_stop());
>   
> -		if (cleanup_job) {
> +		if (cleanup_job)
>   			sched->ops->free_job(cleanup_job);
> -			/* queue timeout for next job */
> -			drm_sched_start_timeout(sched);
> -		}
>   
>   		if (!entity)
>   			continue;



* RE: [PATCH] drm/sched: fix the bug of time out calculation(v3)
  2021-08-26 10:09 ` Christian König
@ 2021-08-26 11:55   ` Liu, Monk
  2021-08-26 12:37     ` Christian König
  0 siblings, 1 reply; 20+ messages in thread
From: Liu, Monk @ 2021-08-26 11:55 UTC (permalink / raw)
  To: Christian König, amd-gfx; +Cc: dri-devel

[AMD Official Use Only]

>>I'm not sure if the work_tdr is initialized when a maximum timeout is specified. Please double check.

Ok, will do

>>BTW: Can we please drop the "tdr" naming from the scheduler? That is just a timeout functionality and not related to recovery in any way.
We even do not start hardware recovery in a lot of cases now (when wave kill is successfully).

Umm, sounds reasonable. I can rename it to "to" in another patch.

Thanks 

------------------------------------------
Monk Liu | Cloud-GPU Core team
------------------------------------------

-----Original Message-----
From: Christian König <ckoenig.leichtzumerken@gmail.com> 
Sent: Thursday, August 26, 2021 6:09 PM
To: Liu, Monk <Monk.Liu@amd.com>; amd-gfx@lists.freedesktop.org
Cc: dri-devel@lists.freedesktop.org
Subject: Re: [PATCH] drm/sched: fix the bug of time out calculation(v3)

Am 26.08.21 um 06:55 schrieb Monk Liu:
> issue:
> in cleanup_job the cancle_delayed_work will cancel a TO timer even the 
> its corresponding job is still running.

Yeah, that makes a lot more sense.

>
> fix:
> do not cancel the timer in cleanup_job, instead do the cancelling only 
> when the heading job is signaled, and if there is a "next" job we 
> start_timeout again.
>
> v2:
> further cleanup the logic, and do the TDR timer cancelling if the 
> signaled job is the last one in its scheduler.
>
> v3:
> change the issue description
> remove the cancel_delayed_work in the begining of the cleanup_job 
> recover the implement of drm_sched_job_begin.
>
> TODO:
> 1)introduce pause/resume scheduler in job_timeout to serial the 
> handling of scheduler and job_timeout.
> 2)drop the bad job's del and insert in scheduler due to above 
> serialization (no race issue anymore with the serialization)
>
> Signed-off-by: Monk Liu <Monk.Liu@amd.com>
> ---
>   drivers/gpu/drm/scheduler/sched_main.c | 25 ++++++++++---------------
>   1 file changed, 10 insertions(+), 15 deletions(-)
>
> diff --git a/drivers/gpu/drm/scheduler/sched_main.c 
> b/drivers/gpu/drm/scheduler/sched_main.c
> index a2a9536..ecf8140 100644
> --- a/drivers/gpu/drm/scheduler/sched_main.c
> +++ b/drivers/gpu/drm/scheduler/sched_main.c
> @@ -676,13 +676,7 @@ drm_sched_get_cleanup_job(struct drm_gpu_scheduler *sched)
>   {
>   	struct drm_sched_job *job, *next;
>   
> -	/*
> -	 * Don't destroy jobs while the timeout worker is running  OR thread
> -	 * is being parked and hence assumed to not touch pending_list
> -	 */
> -	if ((sched->timeout != MAX_SCHEDULE_TIMEOUT &&
> -	    !cancel_delayed_work(&sched->work_tdr)) ||
> -	    kthread_should_park())
> +	if (kthread_should_park())
>   		return NULL;
>   
>   	spin_lock(&sched->job_list_lock);
> @@ -693,17 +687,21 @@ drm_sched_get_cleanup_job(struct drm_gpu_scheduler *sched)
>   	if (job && dma_fence_is_signaled(&job->s_fence->finished)) {
>   		/* remove job from pending_list */
>   		list_del_init(&job->list);
> +
> +		/* cancel this job's TO timer */
> +		cancel_delayed_work(&sched->work_tdr);

I'm not sure if the work_tdr is initialized when a maximum timeout is specified. Please double check.

BTW: Can we please drop the "tdr" naming from the scheduler? That is just a timeout functionality and not related to recovery in any way.

We even do not start hardware recovery in a lot of cases now (when wave kill is successfully).

Regards,
Christian.

>   		/* make the scheduled timestamp more accurate */
>   		next = list_first_entry_or_null(&sched->pending_list,
>   						typeof(*next), list);
> -		if (next)
> +
> +		if (next) {
>   			next->s_fence->scheduled.timestamp =
>   				job->s_fence->finished.timestamp;
> -
> +			/* start TO timer for next job */
> +			drm_sched_start_timeout(sched);
> +		}
>   	} else {
>   		job = NULL;
> -		/* queue timeout for next job */
> -		drm_sched_start_timeout(sched);
>   	}
>   
>   	spin_unlock(&sched->job_list_lock);
> @@ -791,11 +789,8 @@ static int drm_sched_main(void *param)
>   					  (entity = drm_sched_select_entity(sched))) ||
>   					 kthread_should_stop());
>   
> -		if (cleanup_job) {
> +		if (cleanup_job)
>   			sched->ops->free_job(cleanup_job);
> -			/* queue timeout for next job */
> -			drm_sched_start_timeout(sched);
> -		}
>   
>   		if (!entity)
>   			continue;


* Re: [PATCH] drm/sched: fix the bug of time out calculation(v3)
  2021-08-26 11:55   ` Liu, Monk
@ 2021-08-26 12:37     ` Christian König
  2021-08-26 12:46       ` Daniel Vetter
  2021-08-27  7:30       ` Liu, Monk
  0 siblings, 2 replies; 20+ messages in thread
From: Christian König @ 2021-08-26 12:37 UTC (permalink / raw)
  To: Liu, Monk, amd-gfx; +Cc: dri-devel

On 26.08.21 at 13:55, Liu, Monk wrote:
> [AMD Official Use Only]
>
>>> I'm not sure if the work_tdr is initialized when a maximum timeout is specified. Please double check.
> Ok, will do
>
>>> BTW: Can we please drop the "tdr" naming from the scheduler? That is just a timeout functionality and not related to recovery in any way.
> We even do not start hardware recovery in a lot of cases now (when wave kill is successfully).
>
> Umm, sounds reasonable, I can rename it to "to" with another patch

Maybe more like job_timeout or timeout_work, or something in that
direction.

Christian.

>
> Thanks
>
> ------------------------------------------
> Monk Liu | Cloud-GPU Core team
> ------------------------------------------
>
> -----Original Message-----
> From: Christian König <ckoenig.leichtzumerken@gmail.com>
> Sent: Thursday, August 26, 2021 6:09 PM
> To: Liu, Monk <Monk.Liu@amd.com>; amd-gfx@lists.freedesktop.org
> Cc: dri-devel@lists.freedesktop.org
> Subject: Re: [PATCH] drm/sched: fix the bug of time out calculation(v3)
>
> Am 26.08.21 um 06:55 schrieb Monk Liu:
>> issue:
>> in cleanup_job the cancle_delayed_work will cancel a TO timer even the
>> its corresponding job is still running.
> Yeah, that makes a lot more sense.
>
>> fix:
>> do not cancel the timer in cleanup_job, instead do the cancelling only
>> when the heading job is signaled, and if there is a "next" job we
>> start_timeout again.
>>
>> v2:
>> further cleanup the logic, and do the TDR timer cancelling if the
>> signaled job is the last one in its scheduler.
>>
>> v3:
>> change the issue description
>> remove the cancel_delayed_work in the begining of the cleanup_job
>> recover the implement of drm_sched_job_begin.
>>
>> TODO:
>> 1)introduce pause/resume scheduler in job_timeout to serial the
>> handling of scheduler and job_timeout.
>> 2)drop the bad job's del and insert in scheduler due to above
>> serialization (no race issue anymore with the serialization)
>>
>> Signed-off-by: Monk Liu <Monk.Liu@amd.com>
>> ---
>>    drivers/gpu/drm/scheduler/sched_main.c | 25 ++++++++++---------------
>>    1 file changed, 10 insertions(+), 15 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/scheduler/sched_main.c
>> b/drivers/gpu/drm/scheduler/sched_main.c
>> index a2a9536..ecf8140 100644
>> --- a/drivers/gpu/drm/scheduler/sched_main.c
>> +++ b/drivers/gpu/drm/scheduler/sched_main.c
>> @@ -676,13 +676,7 @@ drm_sched_get_cleanup_job(struct drm_gpu_scheduler *sched)
>>    {
>>    	struct drm_sched_job *job, *next;
>>    
>> -	/*
>> -	 * Don't destroy jobs while the timeout worker is running  OR thread
>> -	 * is being parked and hence assumed to not touch pending_list
>> -	 */
>> -	if ((sched->timeout != MAX_SCHEDULE_TIMEOUT &&
>> -	    !cancel_delayed_work(&sched->work_tdr)) ||
>> -	    kthread_should_park())
>> +	if (kthread_should_park())
>>    		return NULL;
>>    
>>    	spin_lock(&sched->job_list_lock);
>> @@ -693,17 +687,21 @@ drm_sched_get_cleanup_job(struct drm_gpu_scheduler *sched)
>>    	if (job && dma_fence_is_signaled(&job->s_fence->finished)) {
>>    		/* remove job from pending_list */
>>    		list_del_init(&job->list);
>> +
>> +		/* cancel this job's TO timer */
>> +		cancel_delayed_work(&sched->work_tdr);
> I'm not sure if the work_tdr is initialized when a maximum timeout is specified. Please double check.
>
> BTW: Can we please drop the "tdr" naming from the scheduler? That is just a timeout functionality and not related to recovery in any way.
>
> We even do not start hardware recovery in a lot of cases now (when wave kill is successfully).
>
> Regards,
> Christian.
>
>>    		/* make the scheduled timestamp more accurate */
>>    		next = list_first_entry_or_null(&sched->pending_list,
>>    						typeof(*next), list);
>> -		if (next)
>> +
>> +		if (next) {
>>    			next->s_fence->scheduled.timestamp =
>>    				job->s_fence->finished.timestamp;
>> -
>> +			/* start TO timer for next job */
>> +			drm_sched_start_timeout(sched);
>> +		}
>>    	} else {
>>    		job = NULL;
>> -		/* queue timeout for next job */
>> -		drm_sched_start_timeout(sched);
>>    	}
>>    
>>    	spin_unlock(&sched->job_list_lock);
>> @@ -791,11 +789,8 @@ static int drm_sched_main(void *param)
>>    					  (entity = drm_sched_select_entity(sched))) ||
>>    					 kthread_should_stop());
>>    
>> -		if (cleanup_job) {
>> +		if (cleanup_job)
>>    			sched->ops->free_job(cleanup_job);
>> -			/* queue timeout for next job */
>> -			drm_sched_start_timeout(sched);
>> -		}
>>    
>>    		if (!entity)
>>    			continue;



* Re: [PATCH] drm/sched: fix the bug of time out calculation(v3)
  2021-08-26 12:37     ` Christian König
@ 2021-08-26 12:46       ` Daniel Vetter
  2021-08-27  7:30       ` Liu, Monk
  1 sibling, 0 replies; 20+ messages in thread
From: Daniel Vetter @ 2021-08-26 12:46 UTC (permalink / raw)
  To: Christian König; +Cc: Liu, Monk, amd-gfx, dri-devel

On Thu, Aug 26, 2021 at 02:37:40PM +0200, Christian König wrote:
> Am 26.08.21 um 13:55 schrieb Liu, Monk:
> > [AMD Official Use Only]
> > 
> > > > I'm not sure if the work_tdr is initialized when a maximum timeout is specified. Please double check.
> > Ok, will do
> > 
> > > > BTW: Can we please drop the "tdr" naming from the scheduler? That is just a timeout functionality and not related to recovery in any way.
> > We even do not start hardware recovery in a lot of cases now (when wave kill is successfully).
> > 
> > Umm, sounds reasonable, I can rename it to "to" with another patch
> 
> Maybe more like job_timeout or timeout_work or something into that
> direction.

Yeah, that's better. I think TO is even worse than TDR, which is at least
somewhat well-known from the Windows side.

It would also be good to polish the commit message a bit; there are a few
typos and some confusing wording.
-Daniel

> 
> Christian.
> 
> > 
> > Thanks
> > 
> > ------------------------------------------
> > Monk Liu | Cloud-GPU Core team
> > ------------------------------------------
> > 
> > -----Original Message-----
> > From: Christian König <ckoenig.leichtzumerken@gmail.com>
> > Sent: Thursday, August 26, 2021 6:09 PM
> > To: Liu, Monk <Monk.Liu@amd.com>; amd-gfx@lists.freedesktop.org
> > Cc: dri-devel@lists.freedesktop.org
> > Subject: Re: [PATCH] drm/sched: fix the bug of time out calculation(v3)
> > 
> > Am 26.08.21 um 06:55 schrieb Monk Liu:
> > > issue:
> > > in cleanup_job the cancle_delayed_work will cancel a TO timer even the
> > > its corresponding job is still running.
> > Yeah, that makes a lot more sense.
> > 
> > > fix:
> > > do not cancel the timer in cleanup_job, instead do the cancelling only
> > > when the heading job is signaled, and if there is a "next" job we
> > > start_timeout again.
> > > 
> > > v2:
> > > further cleanup the logic, and do the TDR timer cancelling if the
> > > signaled job is the last one in its scheduler.
> > > 
> > > v3:
> > > change the issue description
> > > remove the cancel_delayed_work in the begining of the cleanup_job
> > > recover the implement of drm_sched_job_begin.
> > > 
> > > TODO:
> > > 1)introduce pause/resume scheduler in job_timeout to serial the
> > > handling of scheduler and job_timeout.
> > > 2)drop the bad job's del and insert in scheduler due to above
> > > serialization (no race issue anymore with the serialization)
> > > 
> > > Signed-off-by: Monk Liu <Monk.Liu@amd.com>
> > > ---
> > >    drivers/gpu/drm/scheduler/sched_main.c | 25 ++++++++++---------------
> > >    1 file changed, 10 insertions(+), 15 deletions(-)
> > > 
> > > diff --git a/drivers/gpu/drm/scheduler/sched_main.c
> > > b/drivers/gpu/drm/scheduler/sched_main.c
> > > index a2a9536..ecf8140 100644
> > > --- a/drivers/gpu/drm/scheduler/sched_main.c
> > > +++ b/drivers/gpu/drm/scheduler/sched_main.c
> > > @@ -676,13 +676,7 @@ drm_sched_get_cleanup_job(struct drm_gpu_scheduler *sched)
> > >    {
> > >    	struct drm_sched_job *job, *next;
> > > -	/*
> > > -	 * Don't destroy jobs while the timeout worker is running  OR thread
> > > -	 * is being parked and hence assumed to not touch pending_list
> > > -	 */
> > > -	if ((sched->timeout != MAX_SCHEDULE_TIMEOUT &&
> > > -	    !cancel_delayed_work(&sched->work_tdr)) ||
> > > -	    kthread_should_park())
> > > +	if (kthread_should_park())
> > >    		return NULL;
> > >    	spin_lock(&sched->job_list_lock);
> > > @@ -693,17 +687,21 @@ drm_sched_get_cleanup_job(struct drm_gpu_scheduler *sched)
> > >    	if (job && dma_fence_is_signaled(&job->s_fence->finished)) {
> > >    		/* remove job from pending_list */
> > >    		list_del_init(&job->list);
> > > +
> > > +		/* cancel this job's TO timer */
> > > +		cancel_delayed_work(&sched->work_tdr);
> > I'm not sure if the work_tdr is initialized when a maximum timeout is specified. Please double check.
> > 
> > BTW: Can we please drop the "tdr" naming from the scheduler? That is just a timeout functionality and not related to recovery in any way.
> > 
> > We even do not start hardware recovery in a lot of cases now (when wave kill is successfully).
> > 
> > Regards,
> > Christian.
> > 
> > >    		/* make the scheduled timestamp more accurate */
> > >    		next = list_first_entry_or_null(&sched->pending_list,
> > >    						typeof(*next), list);
> > > -		if (next)
> > > +
> > > +		if (next) {
> > >    			next->s_fence->scheduled.timestamp =
> > >    				job->s_fence->finished.timestamp;
> > > -
> > > +			/* start TO timer for next job */
> > > +			drm_sched_start_timeout(sched);
> > > +		}
> > >    	} else {
> > >    		job = NULL;
> > > -		/* queue timeout for next job */
> > > -		drm_sched_start_timeout(sched);
> > >    	}
> > >    	spin_unlock(&sched->job_list_lock);
> > > @@ -791,11 +789,8 @@ static int drm_sched_main(void *param)
> > >    					  (entity = drm_sched_select_entity(sched))) ||
> > >    					 kthread_should_stop());
> > > -		if (cleanup_job) {
> > > +		if (cleanup_job)
> > >    			sched->ops->free_job(cleanup_job);
> > > -			/* queue timeout for next job */
> > > -			drm_sched_start_timeout(sched);
> > > -		}
> > >    		if (!entity)
> > >    			continue;
> 

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


* Re: [PATCH] drm/sched: fix the bug of time out calculation(v3)
  2021-08-26  4:55 [PATCH] drm/sched: fix the bug of time out calculation(v3) Monk Liu
  2021-08-26 10:09 ` Christian König
@ 2021-08-26 14:03 ` Andrey Grodzovsky
  2021-08-26 20:14   ` Andrey Grodzovsky
  1 sibling, 1 reply; 20+ messages in thread
From: Andrey Grodzovsky @ 2021-08-26 14:03 UTC (permalink / raw)
  To: Monk Liu, amd-gfx; +Cc: dri-devel


On 2021-08-26 12:55 a.m., Monk Liu wrote:
> issue:
> in cleanup_job the cancle_delayed_work will cancel a TO timer
> even the its corresponding job is still running.
>
> fix:
> do not cancel the timer in cleanup_job, instead do the cancelling
> only when the heading job is signaled, and if there is a "next" job
> we start_timeout again.
>
> v2:
> further cleanup the logic, and do the TDR timer cancelling if the signaled job
> is the last one in its scheduler.
>
> v3:
> change the issue description
> remove the cancel_delayed_work in the begining of the cleanup_job
> recover the implement of drm_sched_job_begin.
>
> TODO:
> 1)introduce pause/resume scheduler in job_timeout to serial the handling
> of scheduler and job_timeout.
> 2)drop the bad job's del and insert in scheduler due to above serialization
> (no race issue anymore with the serialization)
>
> Signed-off-by: Monk Liu <Monk.Liu@amd.com>
> ---
>   drivers/gpu/drm/scheduler/sched_main.c | 25 ++++++++++---------------
>   1 file changed, 10 insertions(+), 15 deletions(-)
>
> diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c
> index a2a9536..ecf8140 100644
> --- a/drivers/gpu/drm/scheduler/sched_main.c
> +++ b/drivers/gpu/drm/scheduler/sched_main.c
> @@ -676,13 +676,7 @@ drm_sched_get_cleanup_job(struct drm_gpu_scheduler *sched)
>   {
>   	struct drm_sched_job *job, *next;
>   
> -	/*
> -	 * Don't destroy jobs while the timeout worker is running  OR thread
> -	 * is being parked and hence assumed to not touch pending_list
> -	 */
> -	if ((sched->timeout != MAX_SCHEDULE_TIMEOUT &&
> -	    !cancel_delayed_work(&sched->work_tdr)) ||
> -	    kthread_should_park())
> +	if (kthread_should_park())
>   		return NULL;


I actually don't see why we need to keep the above: on the other side
(in drm_sched_stop) we won't touch the pending list anyway until the
scheduler thread has come to a full stop (kthread_park). If you do see
a reason why this is needed, then I think a comment should be added here.

Andrey
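
For context, an abridged sketch of the drm_sched_stop() ordering referred
to above; only the park-then-walk structure is shown, the per-job teardown
is omitted, and the exact code in the tree may differ:

void drm_sched_stop(struct drm_gpu_scheduler *sched, struct drm_sched_job *bad)
{
	struct drm_sched_job *s_job, *tmp;

	/* The scheduler thread is parked first ... */
	kthread_park(sched->thread);

	/*
	 * ... so by the time pending_list is walked below, the thread can
	 * no longer be inside drm_sched_get_cleanup_job() touching the list.
	 */
	list_for_each_entry_safe_reverse(s_job, tmp,
					 &sched->pending_list, list) {
		/* per-job fence handling and teardown omitted */
	}
}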


>   
>   	spin_lock(&sched->job_list_lock);
> @@ -693,17 +687,21 @@ drm_sched_get_cleanup_job(struct drm_gpu_scheduler *sched)
>   	if (job && dma_fence_is_signaled(&job->s_fence->finished)) {
>   		/* remove job from pending_list */
>   		list_del_init(&job->list);
> +
> +		/* cancel this job's TO timer */
> +		cancel_delayed_work(&sched->work_tdr);
>   		/* make the scheduled timestamp more accurate */
>   		next = list_first_entry_or_null(&sched->pending_list,
>   						typeof(*next), list);
> -		if (next)
> +
> +		if (next) {
>   			next->s_fence->scheduled.timestamp =
>   				job->s_fence->finished.timestamp;
> -
> +			/* start TO timer for next job */
> +			drm_sched_start_timeout(sched);
> +		}
>   	} else {
>   		job = NULL;
> -		/* queue timeout for next job */
> -		drm_sched_start_timeout(sched);
>   	}
>   
>   	spin_unlock(&sched->job_list_lock);
> @@ -791,11 +789,8 @@ static int drm_sched_main(void *param)
>   					  (entity = drm_sched_select_entity(sched))) ||
>   					 kthread_should_stop());
>   
> -		if (cleanup_job) {
> +		if (cleanup_job)
>   			sched->ops->free_job(cleanup_job);
> -			/* queue timeout for next job */
> -			drm_sched_start_timeout(sched);
> -		}
>   
>   		if (!entity)
>   			continue;


* Re: [PATCH] drm/sched: fix the bug of time out calculation(v3)
  2021-08-26 14:03 ` Andrey Grodzovsky
@ 2021-08-26 20:14   ` Andrey Grodzovsky
  2021-08-27  6:12     ` Christian König
  2021-08-27  7:20     ` Liu, Monk
  0 siblings, 2 replies; 20+ messages in thread
From: Andrey Grodzovsky @ 2021-08-26 20:14 UTC (permalink / raw)
  To: Monk Liu, amd-gfx, Christian.Koenig; +Cc: dri-devel

[-- Attachment #1: Type: text/plain, Size: 4218 bytes --]

Attached is a quick patch for per-job TTL calculation to make the next
timer expiration more precise. It's on top of the patch in this thread.
Let me know if this makes sense.

Andrey

On 2021-08-26 10:03 a.m., Andrey Grodzovsky wrote:
>
> On 2021-08-26 12:55 a.m., Monk Liu wrote:
>> issue:
>> in cleanup_job the cancle_delayed_work will cancel a TO timer
>> even the its corresponding job is still running.
>>
>> fix:
>> do not cancel the timer in cleanup_job, instead do the cancelling
>> only when the heading job is signaled, and if there is a "next" job
>> we start_timeout again.
>>
>> v2:
>> further cleanup the logic, and do the TDR timer cancelling if the 
>> signaled job
>> is the last one in its scheduler.
>>
>> v3:
>> change the issue description
>> remove the cancel_delayed_work in the begining of the cleanup_job
>> recover the implement of drm_sched_job_begin.
>>
>> TODO:
>> 1)introduce pause/resume scheduler in job_timeout to serial the handling
>> of scheduler and job_timeout.
>> 2)drop the bad job's del and insert in scheduler due to above 
>> serialization
>> (no race issue anymore with the serialization)
>>
>> Signed-off-by: Monk Liu <Monk.Liu@amd.com>
>> ---
>>   drivers/gpu/drm/scheduler/sched_main.c | 25 ++++++++++---------------
>>   1 file changed, 10 insertions(+), 15 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/scheduler/sched_main.c 
>> b/drivers/gpu/drm/scheduler/sched_main.c
>> index a2a9536..ecf8140 100644
>> --- a/drivers/gpu/drm/scheduler/sched_main.c
>> +++ b/drivers/gpu/drm/scheduler/sched_main.c
>> @@ -676,13 +676,7 @@ drm_sched_get_cleanup_job(struct 
>> drm_gpu_scheduler *sched)
>>   {
>>       struct drm_sched_job *job, *next;
>>   -    /*
>> -     * Don't destroy jobs while the timeout worker is running OR thread
>> -     * is being parked and hence assumed to not touch pending_list
>> -     */
>> -    if ((sched->timeout != MAX_SCHEDULE_TIMEOUT &&
>> -        !cancel_delayed_work(&sched->work_tdr)) ||
>> -        kthread_should_park())
>> +    if (kthread_should_park())
>>           return NULL;
>
>
> I actually don't see why we need to keep the above,
> on the other side (in drm_sched_stop) we won't touch the pending list
> anyway until sched thread came to full stop (kthread_park). If you do 
> see a reason why
> this needed then a comment should be here i think.
>
> Andrey
>
>
>> spin_lock(&sched->job_list_lock);
>> @@ -693,17 +687,21 @@ drm_sched_get_cleanup_job(struct 
>> drm_gpu_scheduler *sched)
>>       if (job && dma_fence_is_signaled(&job->s_fence->finished)) {
>>           /* remove job from pending_list */
>>           list_del_init(&job->list);
>> +
>> +        /* cancel this job's TO timer */
>> +        cancel_delayed_work(&sched->work_tdr);
>>           /* make the scheduled timestamp more accurate */
>>           next = list_first_entry_or_null(&sched->pending_list,
>>                           typeof(*next), list);
>> -        if (next)
>> +
>> +        if (next) {
>>               next->s_fence->scheduled.timestamp =
>>                   job->s_fence->finished.timestamp;
>> -
>> +            /* start TO timer for next job */
>> +            drm_sched_start_timeout(sched);
>> +        }
>>       } else {
>>           job = NULL;
>> -        /* queue timeout for next job */
>> -        drm_sched_start_timeout(sched);
>>       }
>>         spin_unlock(&sched->job_list_lock);
>> @@ -791,11 +789,8 @@ static int drm_sched_main(void *param)
>>                         (entity = drm_sched_select_entity(sched))) ||
>>                        kthread_should_stop());
>>   -        if (cleanup_job) {
>> +        if (cleanup_job)
>>               sched->ops->free_job(cleanup_job);
>> -            /* queue timeout for next job */
>> -            drm_sched_start_timeout(sched);
>> -        }
>>             if (!entity)
>>               continue;

[-- Attachment #2: 0001-drm-sched-Add-TTL-per-job-for-timeout-handling.patch --]
[-- Type: text/x-patch, Size: 2553 bytes --]

From d4671ce3c3b18c369b512cd692aec3769f37e11a Mon Sep 17 00:00:00 2001
From: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Date: Thu, 26 Aug 2021 16:08:01 -0400
Subject: drm/sched: Add TTL per job for timeout handling.

Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
---
 drivers/gpu/drm/scheduler/sched_main.c | 16 ++++++++++++++--
 include/drm/gpu_scheduler.h            |  2 ++
 2 files changed, 16 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c
index ecf8140f6968..c8e31515803c 100644
--- a/drivers/gpu/drm/scheduler/sched_main.c
+++ b/drivers/gpu/drm/scheduler/sched_main.c
@@ -306,6 +306,7 @@ static void drm_sched_job_begin(struct drm_sched_job *s_job)
 
 	spin_lock(&sched->job_list_lock);
 	list_add_tail(&s_job->list, &sched->pending_list);
+	s_job->ts = get_jiffies_64();
 	drm_sched_start_timeout(sched);
 	spin_unlock(&sched->job_list_lock);
 }
@@ -695,10 +696,21 @@ drm_sched_get_cleanup_job(struct drm_gpu_scheduler *sched)
 						typeof(*next), list);
 
 		if (next) {
+			uint64_t ttl;
+
 			next->s_fence->scheduled.timestamp =
 				job->s_fence->finished.timestamp;
-			/* start TO timer for next job */
-			drm_sched_start_timeout(sched);
+
+			/*
+			 * Make a precise calculation of how much time should be
+			 * left for the next job before rearming the timer. In case
+			 * its TTL has already expired, schedule the TO handler right away.
+			 */
+			ttl = get_jiffies_64() - job->ts;
+			if (likely(ttl < sched->timeout))
+				mod_delayed_work(system_wq, &sched->work_tdr, ttl);
+			else
+				mod_delayed_work(system_wq, &sched->work_tdr, 0);
 		}
 	} else {
 		job = NULL;
diff --git a/include/drm/gpu_scheduler.h b/include/drm/gpu_scheduler.h
index d18af49fd009..80cc23e799cf 100644
--- a/include/drm/gpu_scheduler.h
+++ b/include/drm/gpu_scheduler.h
@@ -182,6 +182,7 @@ struct drm_sched_fence *to_drm_sched_fence(struct dma_fence *f);
  * @s_priority: the priority of the job.
  * @entity: the entity to which this job belongs.
  * @cb: the callback for the parent fence in s_fence.
+ * @ts: timestamp used to measure how long the job has been running, in jiffies
  *
  * A job is created by the driver using drm_sched_job_init(), and
  * should call drm_sched_entity_push_job() once it wants the scheduler
@@ -198,6 +199,7 @@ struct drm_sched_job {
 	enum drm_sched_priority		s_priority;
 	struct drm_sched_entity         *entity;
 	struct dma_fence_cb		cb;
+	uint64_t 			ts;
 };
 
 static inline bool drm_sched_invalidate_job(struct drm_sched_job *s_job,
-- 
2.25.1



* Re: [PATCH] drm/sched: fix the bug of time out calculation(v3)
  2021-08-26 20:14   ` Andrey Grodzovsky
@ 2021-08-27  6:12     ` Christian König
  2021-08-27  7:46       ` Liu, Monk
  2021-08-27  7:20     ` Liu, Monk
  1 sibling, 1 reply; 20+ messages in thread
From: Christian König @ 2021-08-27  6:12 UTC (permalink / raw)
  To: Andrey Grodzovsky, Monk Liu, amd-gfx, Christian.Koenig; +Cc: dri-devel

I don't think that this will be necessary or desired.

See, the job should be cleaned up as soon as possible after it is
finished, or otherwise we won't cancel the timeout quickly enough either.

Christian.

On 26.08.21 at 22:14, Andrey Grodzovsky wrote:
> Attached quick patch for per job TTL calculation to make more precises 
> next timer expiration. It's on top of the patch in this thread. Let me 
> know if this makes sense.
>
> Andrey
>
> On 2021-08-26 10:03 a.m., Andrey Grodzovsky wrote:
>>
>> On 2021-08-26 12:55 a.m., Monk Liu wrote:
>>> issue:
>>> in cleanup_job the cancle_delayed_work will cancel a TO timer
>>> even the its corresponding job is still running.
>>>
>>> fix:
>>> do not cancel the timer in cleanup_job, instead do the cancelling
>>> only when the heading job is signaled, and if there is a "next" job
>>> we start_timeout again.
>>>
>>> v2:
>>> further cleanup the logic, and do the TDR timer cancelling if the 
>>> signaled job
>>> is the last one in its scheduler.
>>>
>>> v3:
>>> change the issue description
>>> remove the cancel_delayed_work in the begining of the cleanup_job
>>> recover the implement of drm_sched_job_begin.
>>>
>>> TODO:
>>> 1)introduce pause/resume scheduler in job_timeout to serial the 
>>> handling
>>> of scheduler and job_timeout.
>>> 2)drop the bad job's del and insert in scheduler due to above 
>>> serialization
>>> (no race issue anymore with the serialization)
>>>
>>> Signed-off-by: Monk Liu <Monk.Liu@amd.com>
>>> ---
>>>   drivers/gpu/drm/scheduler/sched_main.c | 25 ++++++++++---------------
>>>   1 file changed, 10 insertions(+), 15 deletions(-)
>>>
>>> diff --git a/drivers/gpu/drm/scheduler/sched_main.c 
>>> b/drivers/gpu/drm/scheduler/sched_main.c
>>> index a2a9536..ecf8140 100644
>>> --- a/drivers/gpu/drm/scheduler/sched_main.c
>>> +++ b/drivers/gpu/drm/scheduler/sched_main.c
>>> @@ -676,13 +676,7 @@ drm_sched_get_cleanup_job(struct 
>>> drm_gpu_scheduler *sched)
>>>   {
>>>       struct drm_sched_job *job, *next;
>>>   -    /*
>>> -     * Don't destroy jobs while the timeout worker is running OR 
>>> thread
>>> -     * is being parked and hence assumed to not touch pending_list
>>> -     */
>>> -    if ((sched->timeout != MAX_SCHEDULE_TIMEOUT &&
>>> -        !cancel_delayed_work(&sched->work_tdr)) ||
>>> -        kthread_should_park())
>>> +    if (kthread_should_park())
>>>           return NULL;
>>
>>
>> I actually don't see why we need to keep the above,
>> on the other side (in drm_sched_stop) we won't touch the pending list
>> anyway until sched thread came to full stop (kthread_park). If you do 
>> see a reason why
>> this needed then a comment should be here i think.
>>
>> Andrey
>>
>>
>>> spin_lock(&sched->job_list_lock);
>>> @@ -693,17 +687,21 @@ drm_sched_get_cleanup_job(struct 
>>> drm_gpu_scheduler *sched)
>>>       if (job && dma_fence_is_signaled(&job->s_fence->finished)) {
>>>           /* remove job from pending_list */
>>>           list_del_init(&job->list);
>>> +
>>> +        /* cancel this job's TO timer */
>>> +        cancel_delayed_work(&sched->work_tdr);
>>>           /* make the scheduled timestamp more accurate */
>>>           next = list_first_entry_or_null(&sched->pending_list,
>>>                           typeof(*next), list);
>>> -        if (next)
>>> +
>>> +        if (next) {
>>>               next->s_fence->scheduled.timestamp =
>>>                   job->s_fence->finished.timestamp;
>>> -
>>> +            /* start TO timer for next job */
>>> +            drm_sched_start_timeout(sched);
>>> +        }
>>>       } else {
>>>           job = NULL;
>>> -        /* queue timeout for next job */
>>> -        drm_sched_start_timeout(sched);
>>>       }
>>>         spin_unlock(&sched->job_list_lock);
>>> @@ -791,11 +789,8 @@ static int drm_sched_main(void *param)
>>>                         (entity = drm_sched_select_entity(sched))) ||
>>>                        kthread_should_stop());
>>>   -        if (cleanup_job) {
>>> +        if (cleanup_job)
>>>               sched->ops->free_job(cleanup_job);
>>> -            /* queue timeout for next job */
>>> -            drm_sched_start_timeout(sched);
>>> -        }
>>>             if (!entity)
>>>               continue;



* RE: [PATCH] drm/sched: fix the bug of time out calculation(v3)
  2021-08-26 20:14   ` Andrey Grodzovsky
  2021-08-27  6:12     ` Christian König
@ 2021-08-27  7:20     ` Liu, Monk
  2021-08-27 13:57       ` Andrey Grodzovsky
  1 sibling, 1 reply; 20+ messages in thread
From: Liu, Monk @ 2021-08-27  7:20 UTC (permalink / raw)
  To: Grodzovsky, Andrey, amd-gfx, Koenig, Christian; +Cc: dri-devel

[AMD Official Use Only]

What is that 'ts' representing? It looks to me like the jiffies value at which the job gets scheduled to the ring, but a job being scheduled to the ring doesn't mean it is being processed by the hw.

Thanks 

------------------------------------------
Monk Liu | Cloud-GPU Core team
------------------------------------------

-----Original Message-----
From: Grodzovsky, Andrey <Andrey.Grodzovsky@amd.com>
Sent: Friday, August 27, 2021 4:14 AM
To: Liu, Monk <Monk.Liu@amd.com>; amd-gfx@lists.freedesktop.org; Koenig, Christian <Christian.Koenig@amd.com>
Cc: dri-devel@lists.freedesktop.org
Subject: Re: [PATCH] drm/sched: fix the bug of time out calculation(v3)

Attached quick patch for per job TTL calculation to make more precises next timer expiration. It's on top of the patch in this thread. Let me know if this makes sense.

Andrey

On 2021-08-26 10:03 a.m., Andrey Grodzovsky wrote:
>
> On 2021-08-26 12:55 a.m., Monk Liu wrote:
>> issue:
>> in cleanup_job the cancle_delayed_work will cancel a TO timer even 
>> the its corresponding job is still running.
>>
>> fix:
>> do not cancel the timer in cleanup_job, instead do the cancelling 
>> only when the heading job is signaled, and if there is a "next" job 
>> we start_timeout again.
>>
>> v2:
>> further cleanup the logic, and do the TDR timer cancelling if the 
>> signaled job is the last one in its scheduler.
>>
>> v3:
>> change the issue description
>> remove the cancel_delayed_work in the begining of the cleanup_job 
>> recover the implement of drm_sched_job_begin.
>>
>> TODO:
>> 1)introduce pause/resume scheduler in job_timeout to serial the 
>> handling of scheduler and job_timeout.
>> 2)drop the bad job's del and insert in scheduler due to above 
>> serialization (no race issue anymore with the serialization)
>>
>> Signed-off-by: Monk Liu <Monk.Liu@amd.com>
>> ---
>>   drivers/gpu/drm/scheduler/sched_main.c | 25
>> ++++++++++---------------
>>   1 file changed, 10 insertions(+), 15 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/scheduler/sched_main.c
>> b/drivers/gpu/drm/scheduler/sched_main.c
>> index a2a9536..ecf8140 100644
>> --- a/drivers/gpu/drm/scheduler/sched_main.c
>> +++ b/drivers/gpu/drm/scheduler/sched_main.c
>> @@ -676,13 +676,7 @@ drm_sched_get_cleanup_job(struct 
>> drm_gpu_scheduler *sched)
>>   {
>>       struct drm_sched_job *job, *next;
>>   -    /*
>> -     * Don't destroy jobs while the timeout worker is running OR 
>> thread
>> -     * is being parked and hence assumed to not touch pending_list
>> -     */
>> -    if ((sched->timeout != MAX_SCHEDULE_TIMEOUT &&
>> -        !cancel_delayed_work(&sched->work_tdr)) ||
>> -        kthread_should_park())
>> +    if (kthread_should_park())
>>           return NULL;
>
>
> I actually don't see why we need to keep the above, on the other side 
> (in drm_sched_stop) we won't touch the pending list anyway until sched 
> thread came to full stop (kthread_park). If you do see a reason why 
> this needed then a comment should be here i think.
>
> Andrey
>
>
>> spin_lock(&sched->job_list_lock);
>> @@ -693,17 +687,21 @@ drm_sched_get_cleanup_job(struct 
>> drm_gpu_scheduler *sched)
>>       if (job && dma_fence_is_signaled(&job->s_fence->finished)) {
>>           /* remove job from pending_list */
>>           list_del_init(&job->list);
>> +
>> +        /* cancel this job's TO timer */
>> +        cancel_delayed_work(&sched->work_tdr);
>>           /* make the scheduled timestamp more accurate */
>>           next = list_first_entry_or_null(&sched->pending_list,
>>                           typeof(*next), list);
>> -        if (next)
>> +
>> +        if (next) {
>>               next->s_fence->scheduled.timestamp =
>>                   job->s_fence->finished.timestamp;
>> -
>> +            /* start TO timer for next job */
>> +            drm_sched_start_timeout(sched);
>> +        }
>>       } else {
>>           job = NULL;
>> -        /* queue timeout for next job */
>> -        drm_sched_start_timeout(sched);
>>       }
>>         spin_unlock(&sched->job_list_lock);
>> @@ -791,11 +789,8 @@ static int drm_sched_main(void *param)
>>                         (entity = drm_sched_select_entity(sched))) ||
>>                        kthread_should_stop());
>>   -        if (cleanup_job) {
>> +        if (cleanup_job)
>>               sched->ops->free_job(cleanup_job);
>> -            /* queue timeout for next job */
>> -            drm_sched_start_timeout(sched);
>> -        }
>>             if (!entity)
>>               continue;


* RE: [PATCH] drm/sched: fix the bug of time out calculation(v3)
  2021-08-26 12:37     ` Christian König
  2021-08-26 12:46       ` Daniel Vetter
@ 2021-08-27  7:30       ` Liu, Monk
  1 sibling, 0 replies; 20+ messages in thread
From: Liu, Monk @ 2021-08-27  7:30 UTC (permalink / raw)
  To: Christian König, amd-gfx; +Cc: dri-devel

[AMD Official Use Only]

>>> I'm not sure if the work_tdr is initialized when a maximum timeout is specified. Please double check.
Even with the timeout set to max, the work_tdr is still initialized:

int drm_sched_init(struct drm_gpu_scheduler *sched,
		   const struct drm_sched_backend_ops *ops,
		   unsigned hw_submission, unsigned hang_limit, long timeout,
		   atomic_t *score, const char *name)
{
	int i, ret;
	sched->ops = ops;
	sched->hw_submission_limit = hw_submission;
	sched->name = name;
	sched->timeout = timeout;
	sched->hang_limit = hang_limit;
	sched->score = score ? score : &sched->_score;
	for (i = DRM_SCHED_PRIORITY_MIN; i < DRM_SCHED_PRIORITY_COUNT; i++)
		drm_sched_rq_init(sched, &sched->sched_rq[i]);

	init_waitqueue_head(&sched->wake_up_worker);
	init_waitqueue_head(&sched->job_scheduled);
	INIT_LIST_HEAD(&sched->pending_list);
	spin_lock_init(&sched->job_list_lock);
	atomic_set(&sched->hw_rq_count, 0);
	INIT_DELAYED_WORK(&sched->work_tdr, drm_sched_job_timedout);
	atomic_set(&sched->_score, 0);
	atomic64_set(&sched->job_id_count, 0);

Thanks 

------------------------------------------
Monk Liu | Cloud-GPU Core team
------------------------------------------

-----Original Message-----
From: Christian König <ckoenig.leichtzumerken@gmail.com> 
Sent: Thursday, August 26, 2021 8:38 PM
To: Liu, Monk <Monk.Liu@amd.com>; amd-gfx@lists.freedesktop.org
Cc: dri-devel@lists.freedesktop.org
Subject: Re: [PATCH] drm/sched: fix the bug of time out calculation(v3)

Am 26.08.21 um 13:55 schrieb Liu, Monk:
> [AMD Official Use Only]
>
>>> I'm not sure if the work_tdr is initialized when a maximum timeout is specified. Please double check.
> Ok, will do
>
>>> BTW: Can we please drop the "tdr" naming from the scheduler? That is just a timeout functionality and not related to recovery in any way.
> We even do not start hardware recovery in a lot of cases now (when wave kill is successfully).
>
> Umm, sounds reasonable, I can rename it to "to" with another patch

Maybe more like job_timeout or timeout_work or something into that direction.

Christian.

>
> Thanks
>
> ------------------------------------------
> Monk Liu | Cloud-GPU Core team
> ------------------------------------------
>
> -----Original Message-----
> From: Christian König <ckoenig.leichtzumerken@gmail.com>
> Sent: Thursday, August 26, 2021 6:09 PM
> To: Liu, Monk <Monk.Liu@amd.com>; amd-gfx@lists.freedesktop.org
> Cc: dri-devel@lists.freedesktop.org
> Subject: Re: [PATCH] drm/sched: fix the bug of time out 
> calculation(v3)
>
> Am 26.08.21 um 06:55 schrieb Monk Liu:
>> issue:
>> in cleanup_job the cancle_delayed_work will cancel a TO timer even 
>> the its corresponding job is still running.
> Yeah, that makes a lot more sense.
>
>> fix:
>> do not cancel the timer in cleanup_job, instead do the cancelling 
>> only when the heading job is signaled, and if there is a "next" job 
>> we start_timeout again.
>>
>> v2:
>> further cleanup the logic, and do the TDR timer cancelling if the 
>> signaled job is the last one in its scheduler.
>>
>> v3:
>> change the issue description
>> remove the cancel_delayed_work in the begining of the cleanup_job 
>> recover the implement of drm_sched_job_begin.
>>
>> TODO:
>> 1)introduce pause/resume scheduler in job_timeout to serial the 
>> handling of scheduler and job_timeout.
>> 2)drop the bad job's del and insert in scheduler due to above 
>> serialization (no race issue anymore with the serialization)
>>
>> Signed-off-by: Monk Liu <Monk.Liu@amd.com>
>> ---
>>    drivers/gpu/drm/scheduler/sched_main.c | 25 ++++++++++---------------
>>    1 file changed, 10 insertions(+), 15 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/scheduler/sched_main.c
>> b/drivers/gpu/drm/scheduler/sched_main.c
>> index a2a9536..ecf8140 100644
>> --- a/drivers/gpu/drm/scheduler/sched_main.c
>> +++ b/drivers/gpu/drm/scheduler/sched_main.c
>> @@ -676,13 +676,7 @@ drm_sched_get_cleanup_job(struct drm_gpu_scheduler *sched)
>>    {
>>    	struct drm_sched_job *job, *next;
>>    
>> -	/*
>> -	 * Don't destroy jobs while the timeout worker is running  OR thread
>> -	 * is being parked and hence assumed to not touch pending_list
>> -	 */
>> -	if ((sched->timeout != MAX_SCHEDULE_TIMEOUT &&
>> -	    !cancel_delayed_work(&sched->work_tdr)) ||
>> -	    kthread_should_park())
>> +	if (kthread_should_park())
>>    		return NULL;
>>    
>>    	spin_lock(&sched->job_list_lock); @@ -693,17 +687,21 @@ 
>> drm_sched_get_cleanup_job(struct drm_gpu_scheduler *sched)
>>    	if (job && dma_fence_is_signaled(&job->s_fence->finished)) {
>>    		/* remove job from pending_list */
>>    		list_del_init(&job->list);
>> +
>> +		/* cancel this job's TO timer */
>> +		cancel_delayed_work(&sched->work_tdr);
> I'm not sure if the work_tdr is initialized when a maximum timeout is specified. Please double check.
>
> BTW: Can we please drop the "tdr" naming from the scheduler? That is just a timeout functionality and not related to recovery in any way.
>
> We even do not start hardware recovery in a lot of cases now (when wave kill is successfully).
>
> Regards,
> Christian.
>
>>    		/* make the scheduled timestamp more accurate */
>>    		next = list_first_entry_or_null(&sched->pending_list,
>>    						typeof(*next), list);
>> -		if (next)
>> +
>> +		if (next) {
>>    			next->s_fence->scheduled.timestamp =
>>    				job->s_fence->finished.timestamp;
>> -
>> +			/* start TO timer for next job */
>> +			drm_sched_start_timeout(sched);
>> +		}
>>    	} else {
>>    		job = NULL;
>> -		/* queue timeout for next job */
>> -		drm_sched_start_timeout(sched);
>>    	}
>>    
>>    	spin_unlock(&sched->job_list_lock);
>> @@ -791,11 +789,8 @@ static int drm_sched_main(void *param)
>>    					  (entity = drm_sched_select_entity(sched))) ||
>>    					 kthread_should_stop());
>>    
>> -		if (cleanup_job) {
>> +		if (cleanup_job)
>>    			sched->ops->free_job(cleanup_job);
>> -			/* queue timeout for next job */
>> -			drm_sched_start_timeout(sched);
>> -		}
>>    
>>    		if (!entity)
>>    			continue;


* RE: [PATCH] drm/sched: fix the bug of time out calculation(v3)
  2021-08-27  6:12     ` Christian König
@ 2021-08-27  7:46       ` Liu, Monk
  2021-08-27 13:45         ` Andrey Grodzovsky
  0 siblings, 1 reply; 20+ messages in thread
From: Liu, Monk @ 2021-08-27  7:46 UTC (permalink / raw)
  To: Christian König, Grodzovsky, Andrey, amd-gfx, Koenig, Christian
  Cc: dri-devel

[AMD Official Use Only]

Yeah, that "kthread_should_park" check also looks irrelevant to me, and it delays the signaled job's cleanup/free.

Thanks 

------------------------------------------
Monk Liu | Cloud-GPU Core team
------------------------------------------

-----Original Message-----
From: Christian König <ckoenig.leichtzumerken@gmail.com> 
Sent: Friday, August 27, 2021 2:12 PM
To: Grodzovsky, Andrey <Andrey.Grodzovsky@amd.com>; Liu, Monk <Monk.Liu@amd.com>; amd-gfx@lists.freedesktop.org; Koenig, Christian <Christian.Koenig@amd.com>
Cc: dri-devel@lists.freedesktop.org
Subject: Re: [PATCH] drm/sched: fix the bug of time out calculation(v3)

I don't think that this will be necessary nor desired.

See the job should be cleaned up as soon as possible after it is finished or otherwise we won't cancel the timeout quick enough either.

Christian.

Am 26.08.21 um 22:14 schrieb Andrey Grodzovsky:
> Attached quick patch for per job TTL calculation to make more precises 
> next timer expiration. It's on top of the patch in this thread. Let me 
> know if this makes sense.
>
> Andrey
>
> On 2021-08-26 10:03 a.m., Andrey Grodzovsky wrote:
>>
>> On 2021-08-26 12:55 a.m., Monk Liu wrote:
>>> issue:
>>> in cleanup_job the cancle_delayed_work will cancel a TO timer even 
>>> the its corresponding job is still running.
>>>
>>> fix:
>>> do not cancel the timer in cleanup_job, instead do the cancelling 
>>> only when the heading job is signaled, and if there is a "next" job 
>>> we start_timeout again.
>>>
>>> v2:
>>> further cleanup the logic, and do the TDR timer cancelling if the 
>>> signaled job is the last one in its scheduler.
>>>
>>> v3:
>>> change the issue description
>>> remove the cancel_delayed_work in the begining of the cleanup_job 
>>> recover the implement of drm_sched_job_begin.
>>>
>>> TODO:
>>> 1)introduce pause/resume scheduler in job_timeout to serial the 
>>> handling of scheduler and job_timeout.
>>> 2)drop the bad job's del and insert in scheduler due to above 
>>> serialization (no race issue anymore with the serialization)
>>>
>>> Signed-off-by: Monk Liu <Monk.Liu@amd.com>
>>> ---
>>>   drivers/gpu/drm/scheduler/sched_main.c | 25 
>>> ++++++++++---------------
>>>   1 file changed, 10 insertions(+), 15 deletions(-)
>>>
>>> diff --git a/drivers/gpu/drm/scheduler/sched_main.c
>>> b/drivers/gpu/drm/scheduler/sched_main.c
>>> index a2a9536..ecf8140 100644
>>> --- a/drivers/gpu/drm/scheduler/sched_main.c
>>> +++ b/drivers/gpu/drm/scheduler/sched_main.c
>>> @@ -676,13 +676,7 @@ drm_sched_get_cleanup_job(struct 
>>> drm_gpu_scheduler *sched)
>>>   {
>>>       struct drm_sched_job *job, *next;
>>>   -    /*
>>> -     * Don't destroy jobs while the timeout worker is running OR 
>>> thread
>>> -     * is being parked and hence assumed to not touch pending_list
>>> -     */
>>> -    if ((sched->timeout != MAX_SCHEDULE_TIMEOUT &&
>>> -        !cancel_delayed_work(&sched->work_tdr)) ||
>>> -        kthread_should_park())
>>> +    if (kthread_should_park())
>>>           return NULL;
>>
>>
>> I actually don't see why we need to keep the above, on the other side 
>> (in drm_sched_stop) we won't touch the pending list anyway until 
>> sched thread came to full stop (kthread_park). If you do see a reason 
>> why this needed then a comment should be here i think.
>>
>> Andrey
>>
>>
>>> spin_lock(&sched->job_list_lock);
>>> @@ -693,17 +687,21 @@ drm_sched_get_cleanup_job(struct 
>>> drm_gpu_scheduler *sched)
>>>       if (job && dma_fence_is_signaled(&job->s_fence->finished)) {
>>>           /* remove job from pending_list */
>>>           list_del_init(&job->list);
>>> +
>>> +        /* cancel this job's TO timer */
>>> +        cancel_delayed_work(&sched->work_tdr);
>>>           /* make the scheduled timestamp more accurate */
>>>           next = list_first_entry_or_null(&sched->pending_list,
>>>                           typeof(*next), list);
>>> -        if (next)
>>> +
>>> +        if (next) {
>>>               next->s_fence->scheduled.timestamp =
>>>                   job->s_fence->finished.timestamp;
>>> -
>>> +            /* start TO timer for next job */
>>> +            drm_sched_start_timeout(sched);
>>> +        }
>>>       } else {
>>>           job = NULL;
>>> -        /* queue timeout for next job */
>>> -        drm_sched_start_timeout(sched);
>>>       }
>>>         spin_unlock(&sched->job_list_lock);
>>> @@ -791,11 +789,8 @@ static int drm_sched_main(void *param)
>>>                         (entity = drm_sched_select_entity(sched))) 
>>> ||
>>>                        kthread_should_stop());
>>>   -        if (cleanup_job) {
>>> +        if (cleanup_job)
>>>               sched->ops->free_job(cleanup_job);
>>> -            /* queue timeout for next job */
>>> -            drm_sched_start_timeout(sched);
>>> -        }
>>>             if (!entity)
>>>               continue;


* Re: [PATCH] drm/sched: fix the bug of time out calculation(v3)
  2021-08-27  7:46       ` Liu, Monk
@ 2021-08-27 13:45         ` Andrey Grodzovsky
  2021-08-27 14:26           ` Christian König
  0 siblings, 1 reply; 20+ messages in thread
From: Andrey Grodzovsky @ 2021-08-27 13:45 UTC (permalink / raw)
  To: Liu, Monk, Christian König, amd-gfx, Koenig, Christian; +Cc: dri-devel

So we agree that the "if (kthread_should_park()) return NULL" check should go away?

Andrey


On 2021-08-27 3:46 a.m., Liu, Monk wrote:
> [AMD Official Use Only]
>
> Yeah, that "kthread_should_park" is also irrelevant looks to me as well and it delays the signaled job's cleanup/free
>
> Thanks
>
> ------------------------------------------
> Monk Liu | Cloud-GPU Core team
> ------------------------------------------
>
> -----Original Message-----
> From: Christian König <ckoenig.leichtzumerken@gmail.com>
> Sent: Friday, August 27, 2021 2:12 PM
> To: Grodzovsky, Andrey <Andrey.Grodzovsky@amd.com>; Liu, Monk <Monk.Liu@amd.com>; amd-gfx@lists.freedesktop.org; Koenig, Christian <Christian.Koenig@amd.com>
> Cc: dri-devel@lists.freedesktop.org
> Subject: Re: [PATCH] drm/sched: fix the bug of time out calculation(v3)
>
> I don't think that this will be necessary nor desired.
>
> See the job should be cleaned up as soon as possible after it is finished or otherwise we won't cancel the timeout quick enough either.
>
> Christian.
>
> Am 26.08.21 um 22:14 schrieb Andrey Grodzovsky:
>> Attached quick patch for per job TTL calculation to make more precises
>> next timer expiration. It's on top of the patch in this thread. Let me
>> know if this makes sense.
>>
>> Andrey
>>
>> On 2021-08-26 10:03 a.m., Andrey Grodzovsky wrote:
>>> On 2021-08-26 12:55 a.m., Monk Liu wrote:
>>>> issue:
>>>> in cleanup_job the cancle_delayed_work will cancel a TO timer even
>>>> the its corresponding job is still running.
>>>>
>>>> fix:
>>>> do not cancel the timer in cleanup_job, instead do the cancelling
>>>> only when the heading job is signaled, and if there is a "next" job
>>>> we start_timeout again.
>>>>
>>>> v2:
>>>> further cleanup the logic, and do the TDR timer cancelling if the
>>>> signaled job is the last one in its scheduler.
>>>>
>>>> v3:
>>>> change the issue description
>>>> remove the cancel_delayed_work in the begining of the cleanup_job
>>>> recover the implement of drm_sched_job_begin.
>>>>
>>>> TODO:
>>>> 1)introduce pause/resume scheduler in job_timeout to serial the
>>>> handling of scheduler and job_timeout.
>>>> 2)drop the bad job's del and insert in scheduler due to above
>>>> serialization (no race issue anymore with the serialization)
>>>>
>>>> Signed-off-by: Monk Liu <Monk.Liu@amd.com>
>>>> ---
>>>>    drivers/gpu/drm/scheduler/sched_main.c | 25
>>>> ++++++++++---------------
>>>>    1 file changed, 10 insertions(+), 15 deletions(-)
>>>>
>>>> diff --git a/drivers/gpu/drm/scheduler/sched_main.c
>>>> b/drivers/gpu/drm/scheduler/sched_main.c
>>>> index a2a9536..ecf8140 100644
>>>> --- a/drivers/gpu/drm/scheduler/sched_main.c
>>>> +++ b/drivers/gpu/drm/scheduler/sched_main.c
>>>> @@ -676,13 +676,7 @@ drm_sched_get_cleanup_job(struct
>>>> drm_gpu_scheduler *sched)
>>>>    {
>>>>        struct drm_sched_job *job, *next;
>>>>    -    /*
>>>> -     * Don't destroy jobs while the timeout worker is running OR
>>>> thread
>>>> -     * is being parked and hence assumed to not touch pending_list
>>>> -     */
>>>> -    if ((sched->timeout != MAX_SCHEDULE_TIMEOUT &&
>>>> -        !cancel_delayed_work(&sched->work_tdr)) ||
>>>> -        kthread_should_park())
>>>> +    if (kthread_should_park())
>>>>            return NULL;
>>>
>>> I actually don't see why we need to keep the above, on the other side
>>> (in drm_sched_stop) we won't touch the pending list anyway until
>>> sched thread came to full stop (kthread_park). If you do see a reason
>>> why this needed then a comment should be here i think.
>>>
>>> Andrey
>>>
>>>
>>>> spin_lock(&sched->job_list_lock);
>>>> @@ -693,17 +687,21 @@ drm_sched_get_cleanup_job(struct
>>>> drm_gpu_scheduler *sched)
>>>>        if (job && dma_fence_is_signaled(&job->s_fence->finished)) {
>>>>            /* remove job from pending_list */
>>>>            list_del_init(&job->list);
>>>> +
>>>> +        /* cancel this job's TO timer */
>>>> +        cancel_delayed_work(&sched->work_tdr);
>>>>            /* make the scheduled timestamp more accurate */
>>>>            next = list_first_entry_or_null(&sched->pending_list,
>>>>                            typeof(*next), list);
>>>> -        if (next)
>>>> +
>>>> +        if (next) {
>>>>                next->s_fence->scheduled.timestamp =
>>>>                    job->s_fence->finished.timestamp;
>>>> -
>>>> +            /* start TO timer for next job */
>>>> +            drm_sched_start_timeout(sched);
>>>> +        }
>>>>        } else {
>>>>            job = NULL;
>>>> -        /* queue timeout for next job */
>>>> -        drm_sched_start_timeout(sched);
>>>>        }
>>>>          spin_unlock(&sched->job_list_lock);
>>>> @@ -791,11 +789,8 @@ static int drm_sched_main(void *param)
>>>>                          (entity = drm_sched_select_entity(sched)))
>>>> ||
>>>>                         kthread_should_stop());
>>>>    -        if (cleanup_job) {
>>>> +        if (cleanup_job)
>>>>                sched->ops->free_job(cleanup_job);
>>>> -            /* queue timeout for next job */
>>>> -            drm_sched_start_timeout(sched);
>>>> -        }
>>>>              if (!entity)
>>>>                continue;


* Re: [PATCH] drm/sched: fix the bug of time out calculation(v3)
  2021-08-27  7:20     ` Liu, Monk
@ 2021-08-27 13:57       ` Andrey Grodzovsky
  2021-08-27 14:29         ` Christian König
  0 siblings, 1 reply; 20+ messages in thread
From: Andrey Grodzovsky @ 2021-08-27 13:57 UTC (permalink / raw)
  To: Liu, Monk, amd-gfx, Koenig, Christian; +Cc: dri-devel

The TS represents the point in time when the job was inserted into the
pending list.
I don't think it matters when it actually starts to be processed; what
matters is when this job was inserted into the pending list, because
right at that point you arm the TO timer (when no other is running
already). So when the previous job completes and you cancel and re-arm
the timer, you can use the TS of the next job in the pending list to
calculate how much time it actually has left to run before TDR must be
initiated, instead of just giving it the full TO value again even if it
has already been running for a while.

Also, I am not sure about the assumption that what we measure is
processing by HW; what we measure is from the moment the job was
scheduled to the ring to the moment it completed (EOP event). At least
that is what our TDR timer measures, so it makes sense to set the TS at
that point.
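
To make the idea a bit more concrete, here is a minimal sketch of what
such a per job TTL calculation could look like when re-arming the timer
(illustration only, not the patch attached here; the submit_ts field is
a made-up placeholder for wherever the TS actually gets recorded):

static void drm_sched_start_timeout_ttl(struct drm_gpu_scheduler *sched,
					struct drm_sched_job *next)
{
	/*
	 * Illustration only: submit_ts is a hypothetical field holding the
	 * jiffies value recorded when 'next' entered the pending list.
	 */
	unsigned long elapsed = jiffies - next->submit_ts;
	unsigned long remaining = 0;

	if (sched->timeout == MAX_SCHEDULE_TIMEOUT)
		return;

	if (elapsed < sched->timeout)
		remaining = sched->timeout - elapsed;

	/* give 'next' only the time it has left, not the full timeout again */
	schedule_delayed_work(&sched->work_tdr, remaining);
}

The point is just that the delay passed to the delayed work shrinks by
however long the next job has already been on the ring.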

Andrey

On 2021-08-27 3:20 a.m., Liu, Monk wrote:
> [AMD Official Use Only]
>
> what is that 'ts' representing for ? it looks to me the jiffies that it get scheduled to the ring,  but a job scheduled to the ring doesn't represent it's being processed by hw.
>
> Thanks
>
> ------------------------------------------
> Monk Liu | Cloud-GPU Core team
> ------------------------------------------
>
> -----Original Message-----
> From: Grodzovsky, Andrey <Andrey.Grodzovsky@amd.com>
> Sent: Friday, August 27, 2021 4:14 AM
> To: Liu, Monk <Monk.Liu@amd.com>; amd-gfx@lists.freedesktop.org; Koenig, Christian <Christian.Koenig@amd.com>
> Cc: dri-devel@lists.freedesktop.org
> Subject: Re: [PATCH] drm/sched: fix the bug of time out calculation(v3)
>
> Attached quick patch for per job TTL calculation to make more precises next timer expiration. It's on top of the patch in this thread. Let me know if this makes sense.
>
> Andrey
>
> On 2021-08-26 10:03 a.m., Andrey Grodzovsky wrote:
>> On 2021-08-26 12:55 a.m., Monk Liu wrote:
>>> issue:
>>> in cleanup_job the cancle_delayed_work will cancel a TO timer even
>>> the its corresponding job is still running.
>>>
>>> fix:
>>> do not cancel the timer in cleanup_job, instead do the cancelling
>>> only when the heading job is signaled, and if there is a "next" job
>>> we start_timeout again.
>>>
>>> v2:
>>> further cleanup the logic, and do the TDR timer cancelling if the
>>> signaled job is the last one in its scheduler.
>>>
>>> v3:
>>> change the issue description
>>> remove the cancel_delayed_work in the begining of the cleanup_job
>>> recover the implement of drm_sched_job_begin.
>>>
>>> TODO:
>>> 1)introduce pause/resume scheduler in job_timeout to serial the
>>> handling of scheduler and job_timeout.
>>> 2)drop the bad job's del and insert in scheduler due to above
>>> serialization (no race issue anymore with the serialization)
>>>
>>> Signed-off-by: Monk Liu <Monk.Liu@amd.com>
>>> ---
>>>    drivers/gpu/drm/scheduler/sched_main.c | 25
>>> ++++++++++---------------
>>>    1 file changed, 10 insertions(+), 15 deletions(-)
>>>
>>> diff --git a/drivers/gpu/drm/scheduler/sched_main.c
>>> b/drivers/gpu/drm/scheduler/sched_main.c
>>> index a2a9536..ecf8140 100644
>>> --- a/drivers/gpu/drm/scheduler/sched_main.c
>>> +++ b/drivers/gpu/drm/scheduler/sched_main.c
>>> @@ -676,13 +676,7 @@ drm_sched_get_cleanup_job(struct
>>> drm_gpu_scheduler *sched)
>>>    {
>>>        struct drm_sched_job *job, *next;
>>>    -    /*
>>> -     * Don't destroy jobs while the timeout worker is running OR
>>> thread
>>> -     * is being parked and hence assumed to not touch pending_list
>>> -     */
>>> -    if ((sched->timeout != MAX_SCHEDULE_TIMEOUT &&
>>> -        !cancel_delayed_work(&sched->work_tdr)) ||
>>> -        kthread_should_park())
>>> +    if (kthread_should_park())
>>>            return NULL;
>>
>> I actually don't see why we need to keep the above, on the other side
>> (in drm_sched_stop) we won't touch the pending list anyway until sched
>> thread came to full stop (kthread_park). If you do see a reason why
>> this needed then a comment should be here i think.
>>
>> Andrey
>>
>>
>>> spin_lock(&sched->job_list_lock);
>>> @@ -693,17 +687,21 @@ drm_sched_get_cleanup_job(struct
>>> drm_gpu_scheduler *sched)
>>>        if (job && dma_fence_is_signaled(&job->s_fence->finished)) {
>>>            /* remove job from pending_list */
>>>            list_del_init(&job->list);
>>> +
>>> +        /* cancel this job's TO timer */
>>> +        cancel_delayed_work(&sched->work_tdr);
>>>            /* make the scheduled timestamp more accurate */
>>>            next = list_first_entry_or_null(&sched->pending_list,
>>>                            typeof(*next), list);
>>> -        if (next)
>>> +
>>> +        if (next) {
>>>                next->s_fence->scheduled.timestamp =
>>>                    job->s_fence->finished.timestamp;
>>> -
>>> +            /* start TO timer for next job */
>>> +            drm_sched_start_timeout(sched);
>>> +        }
>>>        } else {
>>>            job = NULL;
>>> -        /* queue timeout for next job */
>>> -        drm_sched_start_timeout(sched);
>>>        }
>>>          spin_unlock(&sched->job_list_lock);
>>> @@ -791,11 +789,8 @@ static int drm_sched_main(void *param)
>>>                          (entity = drm_sched_select_entity(sched))) ||
>>>                         kthread_should_stop());
>>>    -        if (cleanup_job) {
>>> +        if (cleanup_job)
>>>                sched->ops->free_job(cleanup_job);
>>> -            /* queue timeout for next job */
>>> -            drm_sched_start_timeout(sched);
>>> -        }
>>>              if (!entity)
>>>                continue;


* Re: [PATCH] drm/sched: fix the bug of time out calculation(v3)
  2021-08-27 13:45         ` Andrey Grodzovsky
@ 2021-08-27 14:26           ` Christian König
  0 siblings, 0 replies; 20+ messages in thread
From: Christian König @ 2021-08-27 14:26 UTC (permalink / raw)
  To: Andrey Grodzovsky, Liu, Monk, Christian König, amd-gfx; +Cc: dri-devel

Yes, I don't see any good reason for that either.

Christian.

Am 27.08.21 um 15:45 schrieb Andrey Grodzovsky:
> So we agree if (kthread_should_park()) return NULL should go away ?
>
> Andrey
>
>
> On 2021-08-27 3:46 a.m., Liu, Monk wrote:
>> [AMD Official Use Only]
>>
>> Yeah, that "kthread_should_park" is also irrelevant looks to me as 
>> well and it delays the signaled job's cleanup/free
>>
>> Thanks
>>
>> ------------------------------------------
>> Monk Liu | Cloud-GPU Core team
>> ------------------------------------------
>>
>> -----Original Message-----
>> From: Christian König <ckoenig.leichtzumerken@gmail.com>
>> Sent: Friday, August 27, 2021 2:12 PM
>> To: Grodzovsky, Andrey <Andrey.Grodzovsky@amd.com>; Liu, Monk 
>> <Monk.Liu@amd.com>; amd-gfx@lists.freedesktop.org; Koenig, Christian 
>> <Christian.Koenig@amd.com>
>> Cc: dri-devel@lists.freedesktop.org
>> Subject: Re: [PATCH] drm/sched: fix the bug of time out calculation(v3)
>>
>> I don't think that this will be necessary nor desired.
>>
>> See the job should be cleaned up as soon as possible after it is 
>> finished or otherwise we won't cancel the timeout quick enough either.
>>
>> Christian.
>>
>> Am 26.08.21 um 22:14 schrieb Andrey Grodzovsky:
>>> Attached quick patch for per job TTL calculation to make more precises
>>> next timer expiration. It's on top of the patch in this thread. Let me
>>> know if this makes sense.
>>>
>>> Andrey
>>>
>>> On 2021-08-26 10:03 a.m., Andrey Grodzovsky wrote:
>>>> On 2021-08-26 12:55 a.m., Monk Liu wrote:
>>>>> issue:
>>>>> in cleanup_job the cancle_delayed_work will cancel a TO timer even
>>>>> the its corresponding job is still running.
>>>>>
>>>>> fix:
>>>>> do not cancel the timer in cleanup_job, instead do the cancelling
>>>>> only when the heading job is signaled, and if there is a "next" job
>>>>> we start_timeout again.
>>>>>
>>>>> v2:
>>>>> further cleanup the logic, and do the TDR timer cancelling if the
>>>>> signaled job is the last one in its scheduler.
>>>>>
>>>>> v3:
>>>>> change the issue description
>>>>> remove the cancel_delayed_work in the begining of the cleanup_job
>>>>> recover the implement of drm_sched_job_begin.
>>>>>
>>>>> TODO:
>>>>> 1)introduce pause/resume scheduler in job_timeout to serial the
>>>>> handling of scheduler and job_timeout.
>>>>> 2)drop the bad job's del and insert in scheduler due to above
>>>>> serialization (no race issue anymore with the serialization)
>>>>>
>>>>> Signed-off-by: Monk Liu <Monk.Liu@amd.com>
>>>>> ---
>>>>>    drivers/gpu/drm/scheduler/sched_main.c | 25
>>>>> ++++++++++---------------
>>>>>    1 file changed, 10 insertions(+), 15 deletions(-)
>>>>>
>>>>> diff --git a/drivers/gpu/drm/scheduler/sched_main.c
>>>>> b/drivers/gpu/drm/scheduler/sched_main.c
>>>>> index a2a9536..ecf8140 100644
>>>>> --- a/drivers/gpu/drm/scheduler/sched_main.c
>>>>> +++ b/drivers/gpu/drm/scheduler/sched_main.c
>>>>> @@ -676,13 +676,7 @@ drm_sched_get_cleanup_job(struct
>>>>> drm_gpu_scheduler *sched)
>>>>>    {
>>>>>        struct drm_sched_job *job, *next;
>>>>>    -    /*
>>>>> -     * Don't destroy jobs while the timeout worker is running OR
>>>>> thread
>>>>> -     * is being parked and hence assumed to not touch pending_list
>>>>> -     */
>>>>> -    if ((sched->timeout != MAX_SCHEDULE_TIMEOUT &&
>>>>> -        !cancel_delayed_work(&sched->work_tdr)) ||
>>>>> -        kthread_should_park())
>>>>> +    if (kthread_should_park())
>>>>>            return NULL;
>>>>
>>>> I actually don't see why we need to keep the above, on the other side
>>>> (in drm_sched_stop) we won't touch the pending list anyway until
>>>> sched thread came to full stop (kthread_park). If you do see a reason
>>>> why this needed then a comment should be here i think.
>>>>
>>>> Andrey
>>>>
>>>>
>>>>> spin_lock(&sched->job_list_lock);
>>>>> @@ -693,17 +687,21 @@ drm_sched_get_cleanup_job(struct
>>>>> drm_gpu_scheduler *sched)
>>>>>        if (job && dma_fence_is_signaled(&job->s_fence->finished)) {
>>>>>            /* remove job from pending_list */
>>>>>            list_del_init(&job->list);
>>>>> +
>>>>> +        /* cancel this job's TO timer */
>>>>> +        cancel_delayed_work(&sched->work_tdr);
>>>>>            /* make the scheduled timestamp more accurate */
>>>>>            next = list_first_entry_or_null(&sched->pending_list,
>>>>>                            typeof(*next), list);
>>>>> -        if (next)
>>>>> +
>>>>> +        if (next) {
>>>>>                next->s_fence->scheduled.timestamp =
>>>>>                    job->s_fence->finished.timestamp;
>>>>> -
>>>>> +            /* start TO timer for next job */
>>>>> +            drm_sched_start_timeout(sched);
>>>>> +        }
>>>>>        } else {
>>>>>            job = NULL;
>>>>> -        /* queue timeout for next job */
>>>>> -        drm_sched_start_timeout(sched);
>>>>>        }
>>>>>          spin_unlock(&sched->job_list_lock);
>>>>> @@ -791,11 +789,8 @@ static int drm_sched_main(void *param)
>>>>>                          (entity = drm_sched_select_entity(sched)))
>>>>> ||
>>>>>                         kthread_should_stop());
>>>>>    -        if (cleanup_job) {
>>>>> +        if (cleanup_job)
>>>>>                sched->ops->free_job(cleanup_job);
>>>>> -            /* queue timeout for next job */
>>>>> -            drm_sched_start_timeout(sched);
>>>>> -        }
>>>>>              if (!entity)
>>>>>                continue;



* Re: [PATCH] drm/sched: fix the bug of time out calculation(v3)
  2021-08-27 13:57       ` Andrey Grodzovsky
@ 2021-08-27 14:29         ` Christian König
  2021-08-27 18:22           ` Andrey Grodzovsky
  0 siblings, 1 reply; 20+ messages in thread
From: Christian König @ 2021-08-27 14:29 UTC (permalink / raw)
  To: Andrey Grodzovsky, Liu, Monk, amd-gfx, Koenig, Christian; +Cc: dri-devel

I don't think that makes sense.

See we don't want to start the timer when the job is inserted into the
ring buffer, but rather when it starts processing.

"Starting processing" is a bit loosely defined, but just starting the
timer when the previous job completes should be fine enough.
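
Condensing the hunk from the patch quoted above (nothing new here, just
the control flow in one place), the scheme is roughly:

	if (job && dma_fence_is_signaled(&job->s_fence->finished)) {
		/* the head job is done: drop it and cancel its timer */
		list_del_init(&job->list);
		cancel_delayed_work(&sched->work_tdr);

		next = list_first_entry_or_null(&sched->pending_list,
						typeof(*next), list);
		if (next) {
			/* the timeout now measures 'next', starting here */
			drm_sched_start_timeout(sched);
		}
	}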

Christian.

Am 27.08.21 um 15:57 schrieb Andrey Grodzovsky:
> The TS represents the point in time when the job was inserted into the 
> pending list.
> I don't think it matters when it actually starts to be processed, what 
> matters is when this job was inserted into pending list because right 
> at that point you arm the TO timer (when no other is running already)
> and so when the previous job completes and you cancel and rearm again 
> you can use that TS from the next job in pending list to calculate how 
> much time has actually left for it to run before TDR must be initiated
> and not just give it again full TO value to run even if it has already 
> been running for a while.
>
> Also, i am not sure also about the assumption that what we measure is 
> processing by HW, what we measure is from the moment it was scheduled 
> to ring to the moment the job completed (EOP event). At least that 
> what our TDR timer measures and so it makes sense to set the TS at 
> this point.
>
> Andrey
>
> On 2021-08-27 3:20 a.m., Liu, Monk wrote:
>> [AMD Official Use Only]
>>
>> what is that 'ts' representing for ? it looks to me the jiffies that 
>> it get scheduled to the ring,  but a job scheduled to the ring 
>> doesn't represent it's being processed by hw.
>>
>> Thanks
>>
>> ------------------------------------------
>> Monk Liu | Cloud-GPU Core team
>> ------------------------------------------
>>
>> -----Original Message-----
>> From: Grodzovsky, Andrey <Andrey.Grodzovsky@amd.com>
>> Sent: Friday, August 27, 2021 4:14 AM
>> To: Liu, Monk <Monk.Liu@amd.com>; amd-gfx@lists.freedesktop.org; 
>> Koenig, Christian <Christian.Koenig@amd.com>
>> Cc: dri-devel@lists.freedesktop.org
>> Subject: Re: [PATCH] drm/sched: fix the bug of time out calculation(v3)
>>
>> Attached quick patch for per job TTL calculation to make more 
>> precises next timer expiration. It's on top of the patch in this 
>> thread. Let me know if this makes sense.
>>
>> Andrey
>>
>> On 2021-08-26 10:03 a.m., Andrey Grodzovsky wrote:
>>> On 2021-08-26 12:55 a.m., Monk Liu wrote:
>>>> issue:
>>>> in cleanup_job the cancle_delayed_work will cancel a TO timer even
>>>> the its corresponding job is still running.
>>>>
>>>> fix:
>>>> do not cancel the timer in cleanup_job, instead do the cancelling
>>>> only when the heading job is signaled, and if there is a "next" job
>>>> we start_timeout again.
>>>>
>>>> v2:
>>>> further cleanup the logic, and do the TDR timer cancelling if the
>>>> signaled job is the last one in its scheduler.
>>>>
>>>> v3:
>>>> change the issue description
>>>> remove the cancel_delayed_work in the begining of the cleanup_job
>>>> recover the implement of drm_sched_job_begin.
>>>>
>>>> TODO:
>>>> 1)introduce pause/resume scheduler in job_timeout to serial the
>>>> handling of scheduler and job_timeout.
>>>> 2)drop the bad job's del and insert in scheduler due to above
>>>> serialization (no race issue anymore with the serialization)
>>>>
>>>> Signed-off-by: Monk Liu <Monk.Liu@amd.com>
>>>> ---
>>>>    drivers/gpu/drm/scheduler/sched_main.c | 25
>>>> ++++++++++---------------
>>>>    1 file changed, 10 insertions(+), 15 deletions(-)
>>>>
>>>> diff --git a/drivers/gpu/drm/scheduler/sched_main.c
>>>> b/drivers/gpu/drm/scheduler/sched_main.c
>>>> index a2a9536..ecf8140 100644
>>>> --- a/drivers/gpu/drm/scheduler/sched_main.c
>>>> +++ b/drivers/gpu/drm/scheduler/sched_main.c
>>>> @@ -676,13 +676,7 @@ drm_sched_get_cleanup_job(struct
>>>> drm_gpu_scheduler *sched)
>>>>    {
>>>>        struct drm_sched_job *job, *next;
>>>>    -    /*
>>>> -     * Don't destroy jobs while the timeout worker is running OR
>>>> thread
>>>> -     * is being parked and hence assumed to not touch pending_list
>>>> -     */
>>>> -    if ((sched->timeout != MAX_SCHEDULE_TIMEOUT &&
>>>> -        !cancel_delayed_work(&sched->work_tdr)) ||
>>>> -        kthread_should_park())
>>>> +    if (kthread_should_park())
>>>>            return NULL;
>>>
>>> I actually don't see why we need to keep the above, on the other side
>>> (in drm_sched_stop) we won't touch the pending list anyway until sched
>>> thread came to full stop (kthread_park). If you do see a reason why
>>> this needed then a comment should be here i think.
>>>
>>> Andrey
>>>
>>>
>>>> spin_lock(&sched->job_list_lock);
>>>> @@ -693,17 +687,21 @@ drm_sched_get_cleanup_job(struct
>>>> drm_gpu_scheduler *sched)
>>>>        if (job && dma_fence_is_signaled(&job->s_fence->finished)) {
>>>>            /* remove job from pending_list */
>>>>            list_del_init(&job->list);
>>>> +
>>>> +        /* cancel this job's TO timer */
>>>> +        cancel_delayed_work(&sched->work_tdr);
>>>>            /* make the scheduled timestamp more accurate */
>>>>            next = list_first_entry_or_null(&sched->pending_list,
>>>>                            typeof(*next), list);
>>>> -        if (next)
>>>> +
>>>> +        if (next) {
>>>>                next->s_fence->scheduled.timestamp =
>>>>                    job->s_fence->finished.timestamp;
>>>> -
>>>> +            /* start TO timer for next job */
>>>> +            drm_sched_start_timeout(sched);
>>>> +        }
>>>>        } else {
>>>>            job = NULL;
>>>> -        /* queue timeout for next job */
>>>> -        drm_sched_start_timeout(sched);
>>>>        }
>>>>          spin_unlock(&sched->job_list_lock);
>>>> @@ -791,11 +789,8 @@ static int drm_sched_main(void *param)
>>>>                          (entity = drm_sched_select_entity(sched))) ||
>>>>                         kthread_should_stop());
>>>>    -        if (cleanup_job) {
>>>> +        if (cleanup_job)
>>>>                sched->ops->free_job(cleanup_job);
>>>> -            /* queue timeout for next job */
>>>> -            drm_sched_start_timeout(sched);
>>>> -        }
>>>>              if (!entity)
>>>>                continue;


^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH] drm/sched: fix the bug of time out calculation(v3)
  2021-08-27 14:29         ` Christian König
@ 2021-08-27 18:22           ` Andrey Grodzovsky
  2021-08-27 18:30             ` Christian König
  0 siblings, 1 reply; 20+ messages in thread
From: Andrey Grodzovsky @ 2021-08-27 18:22 UTC (permalink / raw)
  To: Christian König, Liu, Monk, amd-gfx, Koenig, Christian; +Cc: dri-devel

As I mentioned to Monk before - what about cases such as in this test -
https://gitlab.freedesktop.org/mesa/drm/-/commit/bc21168fa924d3fc4a000492e861f50a1a135b25
Here you don't have a serialized sequence where one job finishes
processing before the second starts; they execute concurrently - and for
those cases it seems to me that restarting the timer for the second job
from scratch will let it hang much longer than allowed by the TO value
(e.g. with a 10s timeout, a hung job that started running alongside a
predecessor which completes after 9s would only hit TDR after roughly
19s).

Andrey

On 2021-08-27 10:29 a.m., Christian König wrote:
> I don't think that makes sense.
>
> See we don't want to start the time when the job is inserted into the 
> ring buffer, but rather when it starts processing.
>
> Starting processing is a bit swampy defined, but just starting the 
> timer when the previous job completes should be fine enough.
>
> Christian.
>
> Am 27.08.21 um 15:57 schrieb Andrey Grodzovsky:
>> The TS represents the point in time when the job was inserted into 
>> the pending list.
>> I don't think it matters when it actually starts to be processed, 
>> what matters is when this job was inserted into pending list because 
>> right at that point you arm the TO timer (when no other is running 
>> already)
>> and so when the previous job completes and you cancel and rearm again 
>> you can use that TS from the next job in pending list to calculate 
>> how much time has actually left for it to run before TDR must be 
>> initiated
>> and not just give it again full TO value to run even if it has 
>> already been running for a while.
>>
>> Also, i am not sure also about the assumption that what we measure is 
>> processing by HW, what we measure is from the moment it was scheduled 
>> to ring to the moment the job completed (EOP event). At least that 
>> what our TDR timer measures and so it makes sense to set the TS at 
>> this point.
>>
>> Andrey
>>
>> On 2021-08-27 3:20 a.m., Liu, Monk wrote:
>>> [AMD Official Use Only]
>>>
>>> what is that 'ts' representing for ? it looks to me the jiffies that 
>>> it get scheduled to the ring,  but a job scheduled to the ring 
>>> doesn't represent it's being processed by hw.
>>>
>>> Thanks
>>>
>>> ------------------------------------------
>>> Monk Liu | Cloud-GPU Core team
>>> ------------------------------------------
>>>
>>> -----Original Message-----
>>> From: Grodzovsky, Andrey <Andrey.Grodzovsky@amd.com>
>>> Sent: Friday, August 27, 2021 4:14 AM
>>> To: Liu, Monk <Monk.Liu@amd.com>; amd-gfx@lists.freedesktop.org; 
>>> Koenig, Christian <Christian.Koenig@amd.com>
>>> Cc: dri-devel@lists.freedesktop.org
>>> Subject: Re: [PATCH] drm/sched: fix the bug of time out calculation(v3)
>>>
>>> Attached quick patch for per job TTL calculation to make more 
>>> precises next timer expiration. It's on top of the patch in this 
>>> thread. Let me know if this makes sense.
>>>
>>> Andrey
>>>
>>> On 2021-08-26 10:03 a.m., Andrey Grodzovsky wrote:
>>>> On 2021-08-26 12:55 a.m., Monk Liu wrote:
>>>>> issue:
>>>>> in cleanup_job the cancle_delayed_work will cancel a TO timer even
>>>>> the its corresponding job is still running.
>>>>>
>>>>> fix:
>>>>> do not cancel the timer in cleanup_job, instead do the cancelling
>>>>> only when the heading job is signaled, and if there is a "next" job
>>>>> we start_timeout again.
>>>>>
>>>>> v2:
>>>>> further cleanup the logic, and do the TDR timer cancelling if the
>>>>> signaled job is the last one in its scheduler.
>>>>>
>>>>> v3:
>>>>> change the issue description
>>>>> remove the cancel_delayed_work in the begining of the cleanup_job
>>>>> recover the implement of drm_sched_job_begin.
>>>>>
>>>>> TODO:
>>>>> 1)introduce pause/resume scheduler in job_timeout to serial the
>>>>> handling of scheduler and job_timeout.
>>>>> 2)drop the bad job's del and insert in scheduler due to above
>>>>> serialization (no race issue anymore with the serialization)
>>>>>
>>>>> Signed-off-by: Monk Liu <Monk.Liu@amd.com>
>>>>> ---
>>>>>    drivers/gpu/drm/scheduler/sched_main.c | 25
>>>>> ++++++++++---------------
>>>>>    1 file changed, 10 insertions(+), 15 deletions(-)
>>>>>
>>>>> diff --git a/drivers/gpu/drm/scheduler/sched_main.c
>>>>> b/drivers/gpu/drm/scheduler/sched_main.c
>>>>> index a2a9536..ecf8140 100644
>>>>> --- a/drivers/gpu/drm/scheduler/sched_main.c
>>>>> +++ b/drivers/gpu/drm/scheduler/sched_main.c
>>>>> @@ -676,13 +676,7 @@ drm_sched_get_cleanup_job(struct
>>>>> drm_gpu_scheduler *sched)
>>>>>    {
>>>>>        struct drm_sched_job *job, *next;
>>>>>    -    /*
>>>>> -     * Don't destroy jobs while the timeout worker is running OR
>>>>> thread
>>>>> -     * is being parked and hence assumed to not touch pending_list
>>>>> -     */
>>>>> -    if ((sched->timeout != MAX_SCHEDULE_TIMEOUT &&
>>>>> -        !cancel_delayed_work(&sched->work_tdr)) ||
>>>>> -        kthread_should_park())
>>>>> +    if (kthread_should_park())
>>>>>            return NULL;
>>>>
>>>> I actually don't see why we need to keep the above, on the other side
>>>> (in drm_sched_stop) we won't touch the pending list anyway until sched
>>>> thread came to full stop (kthread_park). If you do see a reason why
>>>> this needed then a comment should be here i think.
>>>>
>>>> Andrey
>>>>
>>>>
>>>>> spin_lock(&sched->job_list_lock);
>>>>> @@ -693,17 +687,21 @@ drm_sched_get_cleanup_job(struct
>>>>> drm_gpu_scheduler *sched)
>>>>>        if (job && dma_fence_is_signaled(&job->s_fence->finished)) {
>>>>>            /* remove job from pending_list */
>>>>>            list_del_init(&job->list);
>>>>> +
>>>>> +        /* cancel this job's TO timer */
>>>>> +        cancel_delayed_work(&sched->work_tdr);
>>>>>            /* make the scheduled timestamp more accurate */
>>>>>            next = list_first_entry_or_null(&sched->pending_list,
>>>>>                            typeof(*next), list);
>>>>> -        if (next)
>>>>> +
>>>>> +        if (next) {
>>>>>                next->s_fence->scheduled.timestamp =
>>>>>                    job->s_fence->finished.timestamp;
>>>>> -
>>>>> +            /* start TO timer for next job */
>>>>> +            drm_sched_start_timeout(sched);
>>>>> +        }
>>>>>        } else {
>>>>>            job = NULL;
>>>>> -        /* queue timeout for next job */
>>>>> -        drm_sched_start_timeout(sched);
>>>>>        }
>>>>>          spin_unlock(&sched->job_list_lock);
>>>>> @@ -791,11 +789,8 @@ static int drm_sched_main(void *param)
>>>>>                          (entity = 
>>>>> drm_sched_select_entity(sched))) ||
>>>>>                         kthread_should_stop());
>>>>>    -        if (cleanup_job) {
>>>>> +        if (cleanup_job)
>>>>>                sched->ops->free_job(cleanup_job);
>>>>> -            /* queue timeout for next job */
>>>>> -            drm_sched_start_timeout(sched);
>>>>> -        }
>>>>>              if (!entity)
>>>>>                continue;
>


* Re: [PATCH] drm/sched: fix the bug of time out calculation(v3)
  2021-08-27 18:22           ` Andrey Grodzovsky
@ 2021-08-27 18:30             ` Christian König
  2021-08-27 18:39               ` Andrey Grodzovsky
  2021-08-31 13:08               ` Daniel Vetter
  0 siblings, 2 replies; 20+ messages in thread
From: Christian König @ 2021-08-27 18:30 UTC (permalink / raw)
  To: Andrey Grodzovsky, Christian König, Liu, Monk, amd-gfx; +Cc: dri-devel

Yeah, that's what I meant by the start of processing a job being a bit
loosely defined.

Jobs overlap, but we simply don't have another good indicator that a
job has started except that the previous one completed.

It's still better than starting the timer when pushing the job to the
ring buffer, because that is completely off.

Christian.

Am 27.08.21 um 20:22 schrieb Andrey Grodzovsky:
> As I mentioned to Monk before - what about cases such as in this test 
> - 
> https://gitlab.freedesktop.org/mesa/drm/-/commit/bc21168fa924d3fc4a000492e861f50a1a135b25 
>
> Here you don't have serialized sequence where when jobs finishes 
> processing and second starts, they execute together  concurrently - 
> for those cases seems
> to me restarting the timer for the second job from scratch will let it 
> hang much longer then allowed by TO value.
>
> Andrey
>
> On 2021-08-27 10:29 a.m., Christian König wrote:
>> I don't think that makes sense.
>>
>> See we don't want to start the time when the job is inserted into the 
>> ring buffer, but rather when it starts processing.
>>
>> Starting processing is a bit swampy defined, but just starting the 
>> timer when the previous job completes should be fine enough.
>>
>> Christian.
>>
>> Am 27.08.21 um 15:57 schrieb Andrey Grodzovsky:
>>> The TS represents the point in time when the job was inserted into 
>>> the pending list.
>>> I don't think it matters when it actually starts to be processed, 
>>> what matters is when this job was inserted into pending list because 
>>> right at that point you arm the TO timer (when no other is running 
>>> already)
>>> and so when the previous job completes and you cancel and rearm 
>>> again you can use that TS from the next job in pending list to 
>>> calculate how much time has actually left for it to run before TDR 
>>> must be initiated
>>> and not just give it again full TO value to run even if it has 
>>> already been running for a while.
>>>
>>> Also, i am not sure also about the assumption that what we measure 
>>> is processing by HW, what we measure is from the moment it was 
>>> scheduled to ring to the moment the job completed (EOP event). At 
>>> least that what our TDR timer measures and so it makes sense to set 
>>> the TS at this point.
>>>
>>> Andrey
>>>
>>> On 2021-08-27 3:20 a.m., Liu, Monk wrote:
>>>> [AMD Official Use Only]
>>>>
>>>> what is that 'ts' representing for ? it looks to me the jiffies 
>>>> that it get scheduled to the ring,  but a job scheduled to the ring 
>>>> doesn't represent it's being processed by hw.
>>>>
>>>> Thanks
>>>>
>>>> ------------------------------------------
>>>> Monk Liu | Cloud-GPU Core team
>>>> ------------------------------------------
>>>>
>>>> -----Original Message-----
>>>> From: Grodzovsky, Andrey <Andrey.Grodzovsky@amd.com>
>>>> Sent: Friday, August 27, 2021 4:14 AM
>>>> To: Liu, Monk <Monk.Liu@amd.com>; amd-gfx@lists.freedesktop.org; 
>>>> Koenig, Christian <Christian.Koenig@amd.com>
>>>> Cc: dri-devel@lists.freedesktop.org
>>>> Subject: Re: [PATCH] drm/sched: fix the bug of time out 
>>>> calculation(v3)
>>>>
>>>> Attached quick patch for per job TTL calculation to make more 
>>>> precises next timer expiration. It's on top of the patch in this 
>>>> thread. Let me know if this makes sense.
>>>>
>>>> Andrey
>>>>
>>>> On 2021-08-26 10:03 a.m., Andrey Grodzovsky wrote:
>>>>> On 2021-08-26 12:55 a.m., Monk Liu wrote:
>>>>>> issue:
>>>>>> in cleanup_job the cancle_delayed_work will cancel a TO timer even
>>>>>> the its corresponding job is still running.
>>>>>>
>>>>>> fix:
>>>>>> do not cancel the timer in cleanup_job, instead do the cancelling
>>>>>> only when the heading job is signaled, and if there is a "next" job
>>>>>> we start_timeout again.
>>>>>>
>>>>>> v2:
>>>>>> further cleanup the logic, and do the TDR timer cancelling if the
>>>>>> signaled job is the last one in its scheduler.
>>>>>>
>>>>>> v3:
>>>>>> change the issue description
>>>>>> remove the cancel_delayed_work in the begining of the cleanup_job
>>>>>> recover the implement of drm_sched_job_begin.
>>>>>>
>>>>>> TODO:
>>>>>> 1)introduce pause/resume scheduler in job_timeout to serial the
>>>>>> handling of scheduler and job_timeout.
>>>>>> 2)drop the bad job's del and insert in scheduler due to above
>>>>>> serialization (no race issue anymore with the serialization)
>>>>>>
>>>>>> Signed-off-by: Monk Liu <Monk.Liu@amd.com>
>>>>>> ---
>>>>>>    drivers/gpu/drm/scheduler/sched_main.c | 25
>>>>>> ++++++++++---------------
>>>>>>    1 file changed, 10 insertions(+), 15 deletions(-)
>>>>>>
>>>>>> diff --git a/drivers/gpu/drm/scheduler/sched_main.c
>>>>>> b/drivers/gpu/drm/scheduler/sched_main.c
>>>>>> index a2a9536..ecf8140 100644
>>>>>> --- a/drivers/gpu/drm/scheduler/sched_main.c
>>>>>> +++ b/drivers/gpu/drm/scheduler/sched_main.c
>>>>>> @@ -676,13 +676,7 @@ drm_sched_get_cleanup_job(struct
>>>>>> drm_gpu_scheduler *sched)
>>>>>>    {
>>>>>>        struct drm_sched_job *job, *next;
>>>>>>    -    /*
>>>>>> -     * Don't destroy jobs while the timeout worker is running OR
>>>>>> thread
>>>>>> -     * is being parked and hence assumed to not touch pending_list
>>>>>> -     */
>>>>>> -    if ((sched->timeout != MAX_SCHEDULE_TIMEOUT &&
>>>>>> -        !cancel_delayed_work(&sched->work_tdr)) ||
>>>>>> -        kthread_should_park())
>>>>>> +    if (kthread_should_park())
>>>>>>            return NULL;
>>>>>
>>>>> I actually don't see why we need to keep the above, on the other side
>>>>> (in drm_sched_stop) we won't touch the pending list anyway until 
>>>>> sched
>>>>> thread came to full stop (kthread_park). If you do see a reason why
>>>>> this needed then a comment should be here i think.
>>>>>
>>>>> Andrey
>>>>>
>>>>>
>>>>>> spin_lock(&sched->job_list_lock);
>>>>>> @@ -693,17 +687,21 @@ drm_sched_get_cleanup_job(struct
>>>>>> drm_gpu_scheduler *sched)
>>>>>>        if (job && dma_fence_is_signaled(&job->s_fence->finished)) {
>>>>>>            /* remove job from pending_list */
>>>>>>            list_del_init(&job->list);
>>>>>> +
>>>>>> +        /* cancel this job's TO timer */
>>>>>> +        cancel_delayed_work(&sched->work_tdr);
>>>>>>            /* make the scheduled timestamp more accurate */
>>>>>>            next = list_first_entry_or_null(&sched->pending_list,
>>>>>>                            typeof(*next), list);
>>>>>> -        if (next)
>>>>>> +
>>>>>> +        if (next) {
>>>>>>                next->s_fence->scheduled.timestamp =
>>>>>> job->s_fence->finished.timestamp;
>>>>>> -
>>>>>> +            /* start TO timer for next job */
>>>>>> +            drm_sched_start_timeout(sched);
>>>>>> +        }
>>>>>>        } else {
>>>>>>            job = NULL;
>>>>>> -        /* queue timeout for next job */
>>>>>> -        drm_sched_start_timeout(sched);
>>>>>>        }
>>>>>>          spin_unlock(&sched->job_list_lock);
>>>>>> @@ -791,11 +789,8 @@ static int drm_sched_main(void *param)
>>>>>>                          (entity = 
>>>>>> drm_sched_select_entity(sched))) ||
>>>>>>                         kthread_should_stop());
>>>>>>    -        if (cleanup_job) {
>>>>>> +        if (cleanup_job)
>>>>>>                sched->ops->free_job(cleanup_job);
>>>>>> -            /* queue timeout for next job */
>>>>>> -            drm_sched_start_timeout(sched);
>>>>>> -        }
>>>>>>              if (!entity)
>>>>>>                continue;
>>



* Re: [PATCH] drm/sched: fix the bug of time out calculation(v3)
  2021-08-27 18:30             ` Christian König
@ 2021-08-27 18:39               ` Andrey Grodzovsky
  2021-08-31 13:08               ` Daniel Vetter
  1 sibling, 0 replies; 20+ messages in thread
From: Andrey Grodzovsky @ 2021-08-27 18:39 UTC (permalink / raw)
  To: Christian König, Christian König, Liu, Monk, amd-gfx; +Cc: dri-devel

Sure then.

Andrey

On 2021-08-27 2:30 p.m., Christian König wrote:
> Yeah, that's what I meant with that the start of processing a job is a 
> bit swampy defined.
>
> Jobs overload, but we simply don't have another good indicator that a 
> job started except that the previous one completed.
>
> It's still better than starting the timer when pushing the job to the 
> ring buffer, because that is completely off.
>
> Christian.
>
> Am 27.08.21 um 20:22 schrieb Andrey Grodzovsky:
>> As I mentioned to Monk before - what about cases such as in this test 
>> - 
>> https://gitlab.freedesktop.org/mesa/drm/-/commit/bc21168fa924d3fc4a000492e861f50a1a135b25 
>>
>> Here you don't have serialized sequence where when jobs finishes 
>> processing and second starts, they execute together concurrently - 
>> for those cases seems
>> to me restarting the timer for the second job from scratch will let 
>> it hang much longer then allowed by TO value.
>>
>> Andrey
>>
>> On 2021-08-27 10:29 a.m., Christian König wrote:
>>> I don't think that makes sense.
>>>
>>> See we don't want to start the time when the job is inserted into 
>>> the ring buffer, but rather when it starts processing.
>>>
>>> Starting processing is a bit swampy defined, but just starting the 
>>> timer when the previous job completes should be fine enough.
>>>
>>> Christian.
>>>
>>> Am 27.08.21 um 15:57 schrieb Andrey Grodzovsky:
>>>> The TS represents the point in time when the job was inserted into 
>>>> the pending list.
>>>> I don't think it matters when it actually starts to be processed, 
>>>> what matters is when this job was inserted into pending list 
>>>> because right at that point you arm the TO timer (when no other is 
>>>> running already)
>>>> and so when the previous job completes and you cancel and rearm 
>>>> again you can use that TS from the next job in pending list to 
>>>> calculate how much time has actually left for it to run before TDR 
>>>> must be initiated
>>>> and not just give it again full TO value to run even if it has 
>>>> already been running for a while.
>>>>
>>>> Also, i am not sure also about the assumption that what we measure 
>>>> is processing by HW, what we measure is from the moment it was 
>>>> scheduled to ring to the moment the job completed (EOP event). At 
>>>> least that what our TDR timer measures and so it makes sense to set 
>>>> the TS at this point.
>>>>
>>>> Andrey
>>>>
>>>> On 2021-08-27 3:20 a.m., Liu, Monk wrote:
>>>>> [AMD Official Use Only]
>>>>>
>>>>> what is that 'ts' representing for ? it looks to me the jiffies 
>>>>> that it get scheduled to the ring,  but a job scheduled to the 
>>>>> ring doesn't represent it's being processed by hw.
>>>>>
>>>>> Thanks
>>>>>
>>>>> ------------------------------------------
>>>>> Monk Liu | Cloud-GPU Core team
>>>>> ------------------------------------------
>>>>>
>>>>> -----Original Message-----
>>>>> From: Grodzovsky, Andrey <Andrey.Grodzovsky@amd.com>
>>>>> Sent: Friday, August 27, 2021 4:14 AM
>>>>> To: Liu, Monk <Monk.Liu@amd.com>; amd-gfx@lists.freedesktop.org; 
>>>>> Koenig, Christian <Christian.Koenig@amd.com>
>>>>> Cc: dri-devel@lists.freedesktop.org
>>>>> Subject: Re: [PATCH] drm/sched: fix the bug of time out 
>>>>> calculation(v3)
>>>>>
>>>>> Attached quick patch for per job TTL calculation to make more 
>>>>> precises next timer expiration. It's on top of the patch in this 
>>>>> thread. Let me know if this makes sense.
>>>>>
>>>>> Andrey
>>>>>
>>>>> On 2021-08-26 10:03 a.m., Andrey Grodzovsky wrote:
>>>>>> On 2021-08-26 12:55 a.m., Monk Liu wrote:
>>>>>>> issue:
>>>>>>> in cleanup_job the cancle_delayed_work will cancel a TO timer even
>>>>>>> the its corresponding job is still running.
>>>>>>>
>>>>>>> fix:
>>>>>>> do not cancel the timer in cleanup_job, instead do the cancelling
>>>>>>> only when the heading job is signaled, and if there is a "next" job
>>>>>>> we start_timeout again.
>>>>>>>
>>>>>>> v2:
>>>>>>> further cleanup the logic, and do the TDR timer cancelling if the
>>>>>>> signaled job is the last one in its scheduler.
>>>>>>>
>>>>>>> v3:
>>>>>>> change the issue description
>>>>>>> remove the cancel_delayed_work in the begining of the cleanup_job
>>>>>>> recover the implement of drm_sched_job_begin.
>>>>>>>
>>>>>>> TODO:
>>>>>>> 1)introduce pause/resume scheduler in job_timeout to serial the
>>>>>>> handling of scheduler and job_timeout.
>>>>>>> 2)drop the bad job's del and insert in scheduler due to above
>>>>>>> serialization (no race issue anymore with the serialization)
>>>>>>>
>>>>>>> Signed-off-by: Monk Liu <Monk.Liu@amd.com>
>>>>>>> ---
>>>>>>>    drivers/gpu/drm/scheduler/sched_main.c | 25
>>>>>>> ++++++++++---------------
>>>>>>>    1 file changed, 10 insertions(+), 15 deletions(-)
>>>>>>>
>>>>>>> diff --git a/drivers/gpu/drm/scheduler/sched_main.c
>>>>>>> b/drivers/gpu/drm/scheduler/sched_main.c
>>>>>>> index a2a9536..ecf8140 100644
>>>>>>> --- a/drivers/gpu/drm/scheduler/sched_main.c
>>>>>>> +++ b/drivers/gpu/drm/scheduler/sched_main.c
>>>>>>> @@ -676,13 +676,7 @@ drm_sched_get_cleanup_job(struct
>>>>>>> drm_gpu_scheduler *sched)
>>>>>>>    {
>>>>>>>        struct drm_sched_job *job, *next;
>>>>>>>    -    /*
>>>>>>> -     * Don't destroy jobs while the timeout worker is running OR
>>>>>>> thread
>>>>>>> -     * is being parked and hence assumed to not touch pending_list
>>>>>>> -     */
>>>>>>> -    if ((sched->timeout != MAX_SCHEDULE_TIMEOUT &&
>>>>>>> - !cancel_delayed_work(&sched->work_tdr)) ||
>>>>>>> -        kthread_should_park())
>>>>>>> +    if (kthread_should_park())
>>>>>>>            return NULL;
>>>>>>
>>>>>> I actually don't see why we need to keep the above, on the other 
>>>>>> side
>>>>>> (in drm_sched_stop) we won't touch the pending list anyway until 
>>>>>> sched
>>>>>> thread came to full stop (kthread_park). If you do see a reason why
>>>>>> this needed then a comment should be here i think.
>>>>>>
>>>>>> Andrey
>>>>>>
>>>>>>
>>>>>>> spin_lock(&sched->job_list_lock);
>>>>>>> @@ -693,17 +687,21 @@ drm_sched_get_cleanup_job(struct
>>>>>>> drm_gpu_scheduler *sched)
>>>>>>>        if (job && dma_fence_is_signaled(&job->s_fence->finished)) {
>>>>>>>            /* remove job from pending_list */
>>>>>>>            list_del_init(&job->list);
>>>>>>> +
>>>>>>> +        /* cancel this job's TO timer */
>>>>>>> +        cancel_delayed_work(&sched->work_tdr);
>>>>>>>            /* make the scheduled timestamp more accurate */
>>>>>>>            next = list_first_entry_or_null(&sched->pending_list,
>>>>>>>                            typeof(*next), list);
>>>>>>> -        if (next)
>>>>>>> +
>>>>>>> +        if (next) {
>>>>>>> next->s_fence->scheduled.timestamp =
>>>>>>> job->s_fence->finished.timestamp;
>>>>>>> -
>>>>>>> +            /* start TO timer for next job */
>>>>>>> +            drm_sched_start_timeout(sched);
>>>>>>> +        }
>>>>>>>        } else {
>>>>>>>            job = NULL;
>>>>>>> -        /* queue timeout for next job */
>>>>>>> -        drm_sched_start_timeout(sched);
>>>>>>>        }
>>>>>>>          spin_unlock(&sched->job_list_lock);
>>>>>>> @@ -791,11 +789,8 @@ static int drm_sched_main(void *param)
>>>>>>>                          (entity = 
>>>>>>> drm_sched_select_entity(sched))) ||
>>>>>>>                         kthread_should_stop());
>>>>>>>    -        if (cleanup_job) {
>>>>>>> +        if (cleanup_job)
>>>>>>> sched->ops->free_job(cleanup_job);
>>>>>>> -            /* queue timeout for next job */
>>>>>>> -            drm_sched_start_timeout(sched);
>>>>>>> -        }
>>>>>>>              if (!entity)
>>>>>>>                continue;
>>>
>


* Re: [PATCH] drm/sched: fix the bug of time out calculation(v3)
  2021-08-27 18:30             ` Christian König
  2021-08-27 18:39               ` Andrey Grodzovsky
@ 2021-08-31 13:08               ` Daniel Vetter
  2021-09-01  1:30                 ` Liu, Monk
  1 sibling, 1 reply; 20+ messages in thread
From: Daniel Vetter @ 2021-08-31 13:08 UTC (permalink / raw)
  To: Christian König
  Cc: Andrey Grodzovsky, Christian König, Liu, Monk, amd-gfx, dri-devel

On Fri, Aug 27, 2021 at 08:30:32PM +0200, Christian König wrote:
> Yeah, that's what I meant with that the start of processing a job is a bit
> swampy defined.
> 
> Jobs overload, but we simply don't have another good indicator that a job
> started except that the previous one completed.
> 
> It's still better than starting the timer when pushing the job to the ring
> buffer, because that is completely off.

I'm not sure this really matters that much in practice, and in some cases
we might actually want to just reset it all if the built-up backlog is way
too long.

For anything that really runs way too long you want preempt-ctx/revoke
fences anyway, not end-of-CS fences with TDR.
-Daniel

> 
> Christian.
> 
> Am 27.08.21 um 20:22 schrieb Andrey Grodzovsky:
> > As I mentioned to Monk before - what about cases such as in this test - https://gitlab.freedesktop.org/mesa/drm/-/commit/bc21168fa924d3fc4a000492e861f50a1a135b25
> > 
> > Here you don't have serialized sequence where when jobs finishes
> > processing and second starts, they execute together  concurrently - for
> > those cases seems
> > to me restarting the timer for the second job from scratch will let it
> > hang much longer then allowed by TO value.
> > 
> > Andrey
> > 
> > On 2021-08-27 10:29 a.m., Christian König wrote:
> > > I don't think that makes sense.
> > > 
> > > See we don't want to start the time when the job is inserted into
> > > the ring buffer, but rather when it starts processing.
> > > 
> > > Starting processing is a bit swampy defined, but just starting the
> > > timer when the previous job completes should be fine enough.
> > > 
> > > Christian.
> > > 
> > > Am 27.08.21 um 15:57 schrieb Andrey Grodzovsky:
> > > > The TS represents the point in time when the job was inserted
> > > > into the pending list.
> > > > I don't think it matters when it actually starts to be
> > > > processed, what matters is when this job was inserted into
> > > > pending list because right at that point you arm the TO timer
> > > > (when no other is running already)
> > > > and so when the previous job completes and you cancel and rearm
> > > > again you can use that TS from the next job in pending list to
> > > > calculate how much time has actually left for it to run before
> > > > TDR must be initiated
> > > > and not just give it again full TO value to run even if it has
> > > > already been running for a while.
> > > > 
> > > > Also, i am not sure also about the assumption that what we
> > > > measure is processing by HW, what we measure is from the moment
> > > > it was scheduled to ring to the moment the job completed (EOP
> > > > event). At least that what our TDR timer measures and so it
> > > > makes sense to set the TS at this point.
> > > > 
> > > > Andrey
> > > > 
> > > > On 2021-08-27 3:20 a.m., Liu, Monk wrote:
> > > > > [AMD Official Use Only]
> > > > > 
> > > > > what is that 'ts' representing for ? it looks to me the
> > > > > jiffies that it get scheduled to the ring,  but a job
> > > > > scheduled to the ring doesn't represent it's being processed
> > > > > by hw.
> > > > > 
> > > > > Thanks
> > > > > 
> > > > > ------------------------------------------
> > > > > Monk Liu | Cloud-GPU Core team
> > > > > ------------------------------------------
> > > > > 
> > > > > -----Original Message-----
> > > > > From: Grodzovsky, Andrey <Andrey.Grodzovsky@amd.com>
> > > > > Sent: Friday, August 27, 2021 4:14 AM
> > > > > To: Liu, Monk <Monk.Liu@amd.com>;
> > > > > amd-gfx@lists.freedesktop.org; Koenig, Christian
> > > > > <Christian.Koenig@amd.com>
> > > > > Cc: dri-devel@lists.freedesktop.org
> > > > > Subject: Re: [PATCH] drm/sched: fix the bug of time out
> > > > > calculation(v3)
> > > > > 
> > > > > Attached quick patch for per job TTL calculation to make
> > > > > more precises next timer expiration. It's on top of the
> > > > > patch in this thread. Let me know if this makes sense.
> > > > > 
> > > > > Andrey
> > > > > 
> > > > > On 2021-08-26 10:03 a.m., Andrey Grodzovsky wrote:
> > > > > > On 2021-08-26 12:55 a.m., Monk Liu wrote:
> > > > > > > issue:
> > > > > > > in cleanup_job the cancle_delayed_work will cancel a TO timer even
> > > > > > > the its corresponding job is still running.
> > > > > > > 
> > > > > > > fix:
> > > > > > > do not cancel the timer in cleanup_job, instead do the cancelling
> > > > > > > only when the heading job is signaled, and if there is a "next" job
> > > > > > > we start_timeout again.
> > > > > > > 
> > > > > > > v2:
> > > > > > > further cleanup the logic, and do the TDR timer cancelling if the
> > > > > > > signaled job is the last one in its scheduler.
> > > > > > > 
> > > > > > > v3:
> > > > > > > change the issue description
> > > > > > > remove the cancel_delayed_work in the begining of the cleanup_job
> > > > > > > recover the implement of drm_sched_job_begin.
> > > > > > > 
> > > > > > > TODO:
> > > > > > > 1)introduce pause/resume scheduler in job_timeout to serial the
> > > > > > > handling of scheduler and job_timeout.
> > > > > > > 2)drop the bad job's del and insert in scheduler due to above
> > > > > > > serialization (no race issue anymore with the serialization)
> > > > > > > 
> > > > > > > Signed-off-by: Monk Liu <Monk.Liu@amd.com>
> > > > > > > ---
> > > > > > >    drivers/gpu/drm/scheduler/sched_main.c | 25
> > > > > > > ++++++++++---------------
> > > > > > >    1 file changed, 10 insertions(+), 15 deletions(-)
> > > > > > > 
> > > > > > > diff --git a/drivers/gpu/drm/scheduler/sched_main.c
> > > > > > > b/drivers/gpu/drm/scheduler/sched_main.c
> > > > > > > index a2a9536..ecf8140 100644
> > > > > > > --- a/drivers/gpu/drm/scheduler/sched_main.c
> > > > > > > +++ b/drivers/gpu/drm/scheduler/sched_main.c
> > > > > > > @@ -676,13 +676,7 @@ drm_sched_get_cleanup_job(struct
> > > > > > > drm_gpu_scheduler *sched)
> > > > > > >    {
> > > > > > >        struct drm_sched_job *job, *next;
> > > > > > >    -    /*
> > > > > > > -     * Don't destroy jobs while the timeout worker is running OR
> > > > > > > thread
> > > > > > > -     * is being parked and hence assumed to not touch pending_list
> > > > > > > -     */
> > > > > > > -    if ((sched->timeout != MAX_SCHEDULE_TIMEOUT &&
> > > > > > > -        !cancel_delayed_work(&sched->work_tdr)) ||
> > > > > > > -        kthread_should_park())
> > > > > > > +    if (kthread_should_park())
> > > > > > >            return NULL;
> > > > > > 
> > > > > > I actually don't see why we need to keep the above, on the other side
> > > > > > (in drm_sched_stop) we won't touch the pending list
> > > > > > anyway until sched
> > > > > > thread came to full stop (kthread_park). If you do see a reason why
> > > > > > this needed then a comment should be here i think.
> > > > > > 
> > > > > > Andrey
> > > > > > 
> > > > > > 
> > > > > > > spin_lock(&sched->job_list_lock);
> > > > > > > @@ -693,17 +687,21 @@ drm_sched_get_cleanup_job(struct
> > > > > > > drm_gpu_scheduler *sched)
> > > > > > >        if (job && dma_fence_is_signaled(&job->s_fence->finished)) {
> > > > > > >            /* remove job from pending_list */
> > > > > > >            list_del_init(&job->list);
> > > > > > > +
> > > > > > > +        /* cancel this job's TO timer */
> > > > > > > +        cancel_delayed_work(&sched->work_tdr);
> > > > > > >            /* make the scheduled timestamp more accurate */
> > > > > > >            next = list_first_entry_or_null(&sched->pending_list,
> > > > > > >                            typeof(*next), list);
> > > > > > > -        if (next)
> > > > > > > +
> > > > > > > +        if (next) {
> > > > > > >                next->s_fence->scheduled.timestamp =
> > > > > > > job->s_fence->finished.timestamp;
> > > > > > > -
> > > > > > > +            /* start TO timer for next job */
> > > > > > > +            drm_sched_start_timeout(sched);
> > > > > > > +        }
> > > > > > >        } else {
> > > > > > >            job = NULL;
> > > > > > > -        /* queue timeout for next job */
> > > > > > > -        drm_sched_start_timeout(sched);
> > > > > > >        }
> > > > > > >          spin_unlock(&sched->job_list_lock);
> > > > > > > @@ -791,11 +789,8 @@ static int drm_sched_main(void *param)
> > > > > > >                          (entity =
> > > > > > > drm_sched_select_entity(sched))) ||
> > > > > > >                         kthread_should_stop());
> > > > > > >    -        if (cleanup_job) {
> > > > > > > +        if (cleanup_job)
> > > > > > >                sched->ops->free_job(cleanup_job);
> > > > > > > -            /* queue timeout for next job */
> > > > > > > -            drm_sched_start_timeout(sched);
> > > > > > > -        }
> > > > > > >              if (!entity)
> > > > > > >                continue;
> > > 
> 

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


* RE: [PATCH] drm/sched: fix the bug of time out calculation(v3)
  2021-08-31 13:08               ` Daniel Vetter
@ 2021-09-01  1:30                 ` Liu, Monk
  0 siblings, 0 replies; 20+ messages in thread
From: Liu, Monk @ 2021-09-01  1:30 UTC (permalink / raw)
  To: Daniel Vetter, Koenig, Christian
  Cc: Grodzovsky, Andrey, Christian König, amd-gfx, dri-devel

[AMD Official Use Only]

That really matters in practice: when two jobs from different processes are scheduled to the ring close to each other, if we don't discriminate A from B then B will be considered a bad job due to A's timeout, which will force B's process to exit (e.g. the X server).

Thanks 

------------------------------------------
Monk Liu | Cloud-GPU Core team
------------------------------------------

-----Original Message-----
From: Daniel Vetter <daniel@ffwll.ch> 
Sent: Tuesday, August 31, 2021 9:09 PM
To: Koenig, Christian <Christian.Koenig@amd.com>
Cc: Grodzovsky, Andrey <Andrey.Grodzovsky@amd.com>; Christian König <ckoenig.leichtzumerken@gmail.com>; Liu, Monk <Monk.Liu@amd.com>; amd-gfx@lists.freedesktop.org; dri-devel@lists.freedesktop.org
Subject: Re: [PATCH] drm/sched: fix the bug of time out calculation(v3)

On Fri, Aug 27, 2021 at 08:30:32PM +0200, Christian König wrote:
> Yeah, that's what I meant with that the start of processing a job is a 
> bit swampy defined.
> 
> Jobs overload, but we simply don't have another good indicator that a 
> job started except that the previous one completed.
> 
> It's still better than starting the timer when pushing the job to the 
> ring buffer, because that is completely off.

Not sure this matters that much really in practice, and in some cases we might want to actually just reset it all if the built up backlog is way too long.

For anything that really runs way too long you want preempt-ctx/revoke fences anyway, not end-of-cs fences with tdr.
-Daniel

> 
> Christian.
> 
> Am 27.08.21 um 20:22 schrieb Andrey Grodzovsky:
> > As I mentioned to Monk before - what about cases such as in this 
> > test - 
> > https://gitlab.freedesktop.org/mesa/drm/-/commit/bc21168fa924d3fc4a000492e861f50a1a135b25
> > 
> > Here you don't have a serialized sequence where one job finishes
> > processing and then the second starts; they execute concurrently -
> > for those cases it seems to me that restarting the timer for the
> > second job from scratch will let it hang much longer than allowed by
> > the TO value.
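
To put numbers on that, a concrete timeline with an assumed 10s TO for
illustration: A is scheduled to the ring at t=0s and B at t=1s; A
completes at t=9s and the timer is restarted from scratch for B, so TDR
cannot fire for B before t=19s, i.e. B may sit on the ring for 18s even
though the configured budget is 10s.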
> > 
> > Andrey
> > 
> > On 2021-08-27 10:29 a.m., Christian König wrote:
> > > I don't think that makes sense.
> > > 
> > > See, we don't want to start the timer when the job is inserted into
> > > the ring buffer, but rather when it starts processing.
> > > 
> > > When processing starts is only loosely defined, but just starting the
> > > timer when the previous job completes should be fine enough.
> > > 
> > > Christian.
> > > 
> > > Am 27.08.21 um 15:57 schrieb Andrey Grodzovsky:
> > > > The TS represents the point in time when the job was inserted
> > > > into the pending list.
> > > > I don't think it matters when it actually starts to be processed;
> > > > what matters is when this job was inserted into the pending list,
> > > > because right at that point you arm the TO timer (when no other is
> > > > already running). So when the previous job completes and you cancel
> > > > and rearm again, you can use that TS from the next job in the
> > > > pending list to calculate how much time it actually has left to run
> > > > before TDR must be initiated, instead of giving it the full TO
> > > > value again even if it has already been running for a while.
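
A minimal sketch of that rearm calculation, assuming a hypothetical
per-job ts field (jiffies recorded when the job was appended to
pending_list) - this is only an illustration, not the patch actually
attached here:

	/* hedged sketch: rearm the TDR timer with only the budget the next
	 * job has left, rather than a full sched->timeout; next->ts is a
	 * hypothetical per-job timestamp in jiffies */
	unsigned long elapsed = jiffies - next->ts;
	unsigned long remaining = 0;

	if (elapsed < sched->timeout)
		remaining = sched->timeout - elapsed;

	/* fire TDR after the remaining budget, not after a fresh full timeout */
	mod_delayed_work(system_wq, &sched->work_tdr, remaining);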
> > > > 
> > > > Also, I am not sure about the assumption that what we measure is
> > > > processing by HW; what we measure is from the moment the job was
> > > > scheduled to the ring to the moment it completed (EOP event). At
> > > > least that is what our TDR timer measures, and so it makes sense
> > > > to set the TS at this point.
> > > > 
> > > > Andrey
> > > > 
> > > > On 2021-08-27 3:20 a.m., Liu, Monk wrote:
> > > > > [AMD Official Use Only]
> > > > > 
> > > > > What is that 'ts' representing? It looks to me like the jiffies
> > > > > at which the job gets scheduled to the ring, but a job scheduled
> > > > > to the ring doesn't mean it's being processed by the hw.
> > > > > 
> > > > > Thanks
> > > > > 
> > > > > ------------------------------------------
> > > > > Monk Liu | Cloud-GPU Core team
> > > > > ------------------------------------------
> > > > > 
> > > > > -----Original Message-----
> > > > > From: Grodzovsky, Andrey <Andrey.Grodzovsky@amd.com>
> > > > > Sent: Friday, August 27, 2021 4:14 AM
> > > > > To: Liu, Monk <Monk.Liu@amd.com>; 
> > > > > amd-gfx@lists.freedesktop.org; Koenig, Christian 
> > > > > <Christian.Koenig@amd.com>
> > > > > Cc: dri-devel@lists.freedesktop.org
> > > > > Subject: Re: [PATCH] drm/sched: fix the bug of time out
> > > > > calculation(v3)
> > > > > 
> > > > > Attached is a quick patch for per-job TTL calculation to make
> > > > > the next timer expiration more precise. It's on top of the patch
> > > > > in this thread. Let me know if this makes sense.
> > > > > 
> > > > > Andrey
> > > > > 
> > > > > On 2021-08-26 10:03 a.m., Andrey Grodzovsky wrote:
> > > > > > On 2021-08-26 12:55 a.m., Monk Liu wrote:
> > > > > > > issue:
> > > > > > > in cleanup_job, cancel_delayed_work will cancel a TO timer
> > > > > > > even though its corresponding job is still running.
> > > > > > > 
> > > > > > > fix:
> > > > > > > do not cancel the timer in cleanup_job; instead do the
> > > > > > > cancelling only when the head job is signaled, and if there
> > > > > > > is a "next" job we call start_timeout again.
> > > > > > > 
> > > > > > > v2:
> > > > > > > further cleanup the logic, and do the TDR timer cancelling 
> > > > > > > if the signaled job is the last one in its scheduler.
> > > > > > > 
> > > > > > > v3:
> > > > > > > change the issue description; remove the cancel_delayed_work
> > > > > > > at the beginning of cleanup_job; restore the implementation
> > > > > > > of drm_sched_job_begin.
> > > > > > > 
> > > > > > > TODO:
> > > > > > > 1) introduce pause/resume of the scheduler in job_timeout to
> > > > > > > serialize the handling of the scheduler and job_timeout.
> > > > > > > 2) drop the bad job's removal and re-insertion in the
> > > > > > > scheduler, thanks to the above serialization (no race issue
> > > > > > > anymore with the serialization)
> > > > > > > 
> > > > > > > Signed-off-by: Monk Liu <Monk.Liu@amd.com>
> > > > > > > ---
> > > > > > >    drivers/gpu/drm/scheduler/sched_main.c | 25
> > > > > > > ++++++++++---------------
> > > > > > >    1 file changed, 10 insertions(+), 15 deletions(-)
> > > > > > > 
> > > > > > > diff --git a/drivers/gpu/drm/scheduler/sched_main.c
> > > > > > > b/drivers/gpu/drm/scheduler/sched_main.c
> > > > > > > index a2a9536..ecf8140 100644
> > > > > > > --- a/drivers/gpu/drm/scheduler/sched_main.c
> > > > > > > +++ b/drivers/gpu/drm/scheduler/sched_main.c
> > > > > > > @@ -676,13 +676,7 @@ drm_sched_get_cleanup_job(struct 
> > > > > > > drm_gpu_scheduler *sched)
> > > > > > >    {
> > > > > > >        struct drm_sched_job *job, *next;
> > > > > > >    -    /*
> > > > > > > -     * Don't destroy jobs while the timeout worker is 
> > > > > > > running OR thread
> > > > > > > -     * is being parked and hence assumed to not touch 
> > > > > > > pending_list
> > > > > > > -     */
> > > > > > > -    if ((sched->timeout != MAX_SCHEDULE_TIMEOUT &&
> > > > > > > -        !cancel_delayed_work(&sched->work_tdr)) ||
> > > > > > > -        kthread_should_park())
> > > > > > > +    if (kthread_should_park())
> > > > > > >            return NULL;
> > > > > > 
> > > > > > I actually don't see why we need to keep the above; on the
> > > > > > other side (in drm_sched_stop) we won't touch the pending list
> > > > > > anyway until the sched thread has come to a full stop
> > > > > > (kthread_park). If you do see a reason why this is needed, then
> > > > > > a comment should be here, I think.
> > > > > > 
> > > > > > Andrey
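
For reference, a rough sketch of the ordering Andrey is pointing at
(simplified, assuming the park happens at the top of drm_sched_stop();
the real function does considerably more):

	/* simplified sketch: the scheduler thread is parked before the
	 * pending list is touched from the reset side, so a
	 * kthread_should_park() check in the cleanup path is what keeps
	 * the two sides from racing */
	static void sketch_sched_stop(struct drm_gpu_scheduler *sched)
	{
		kthread_park(sched->thread);	/* blocks until the thread parks */

		/* only after the thread is parked is pending_list walked or
		 * modified here */
	}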
> > > > > > 
> > > > > > 
> > > > > > > spin_lock(&sched->job_list_lock);
> > > > > > > @@ -693,17 +687,21 @@ drm_sched_get_cleanup_job(struct drm_gpu_scheduler *sched)
> > > > > > >        if (job && 
> > > > > > > dma_fence_is_signaled(&job->s_fence->finished)) {
> > > > > > >            /* remove job from pending_list */
> > > > > > >            list_del_init(&job->list);
> > > > > > > +
> > > > > > > +        /* cancel this job's TO timer */
> > > > > > > +        cancel_delayed_work(&sched->work_tdr);
> > > > > > >            /* make the scheduled timestamp more accurate 
> > > > > > > */
> > > > > > >            next = 
> > > > > > > list_first_entry_or_null(&sched->pending_list,
> > > > > > >                            typeof(*next), list);
> > > > > > > -        if (next)
> > > > > > > +
> > > > > > > +        if (next) {
> > > > > > >                next->s_fence->scheduled.timestamp =
> > > > > > > job->s_fence->finished.timestamp;
> > > > > > > -
> > > > > > > +            /* start TO timer for next job */
> > > > > > > +            drm_sched_start_timeout(sched);
> > > > > > > +        }
> > > > > > >        } else {
> > > > > > >            job = NULL;
> > > > > > > -        /* queue timeout for next job */
> > > > > > > -        drm_sched_start_timeout(sched);
> > > > > > >        }
> > > > > > >          spin_unlock(&sched->job_list_lock);
> > > > > > > @@ -791,11 +789,8 @@ static int drm_sched_main(void 
> > > > > > > *param)
> > > > > > >                          (entity =
> > > > > > > drm_sched_select_entity(sched))) ||
> > > > > > >                         kthread_should_stop());
> > > > > > >    -        if (cleanup_job) {
> > > > > > > +        if (cleanup_job)
> > > > > > >                sched->ops->free_job(cleanup_job);
> > > > > > > -            /* queue timeout for next job */
> > > > > > > -            drm_sched_start_timeout(sched);
> > > > > > > -        }
> > > > > > >              if (!entity)
> > > > > > >                continue;
> > > 
> 

--
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch

^ permalink raw reply	[flat|nested] 20+ messages in thread

end of thread, other threads:[~2021-09-01  1:30 UTC | newest]

Thread overview: 20+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-08-26  4:55 [PATCH] drm/sched: fix the bug of time out calculation(v3) Monk Liu
2021-08-26 10:09 ` Christian König
2021-08-26 11:55   ` Liu, Monk
2021-08-26 12:37     ` Christian König
2021-08-26 12:46       ` Daniel Vetter
2021-08-27  7:30       ` Liu, Monk
2021-08-26 14:03 ` Andrey Grodzovsky
2021-08-26 20:14   ` Andrey Grodzovsky
2021-08-27  6:12     ` Christian König
2021-08-27  7:46       ` Liu, Monk
2021-08-27 13:45         ` Andrey Grodzovsky
2021-08-27 14:26           ` Christian König
2021-08-27  7:20     ` Liu, Monk
2021-08-27 13:57       ` Andrey Grodzovsky
2021-08-27 14:29         ` Christian König
2021-08-27 18:22           ` Andrey Grodzovsky
2021-08-27 18:30             ` Christian König
2021-08-27 18:39               ` Andrey Grodzovsky
2021-08-31 13:08               ` Daniel Vetter
2021-09-01  1:30                 ` Liu, Monk
