All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH v4] drm/scheduler: Avoid accessing freed bad job.
@ 2019-11-25 20:51 ` Andrey Grodzovsky
  0 siblings, 0 replies; 125+ messages in thread
From: Andrey Grodzovsky @ 2019-11-25 20:51 UTC (permalink / raw)
  Cc: amd-gfx, steven.price, Emily.Deng, dri-devel, Christian.Koenig

Problem:
Due to a race between drm_sched_cleanup_jobs in sched thread and
drm_sched_job_timedout in timeout work there is a possiblity that
bad job was already freed while still being accessed from the
timeout thread.

Fix:
Instead of just peeking at the bad job in the mirror list
remove it from the list under lock and then put it back later when
we are garanteed no race with main sched thread is possible which
is after the thread is parked.

v2: Lock around processing ring_mirror_list in drm_sched_cleanup_jobs.

v3: Rebase on top of drm-misc-next. v2 is not needed anymore as
drm_sched_get_cleanup_job already has a lock there.

v4: Fix comments to relfect latest code in drm-misc.

Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Tested-by: Emily Deng <Emily.Deng@amd.com>
---
 drivers/gpu/drm/scheduler/sched_main.c | 27 +++++++++++++++++++++++++++
 1 file changed, 27 insertions(+)

diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c
index 6774955..1bf9c40 100644
--- a/drivers/gpu/drm/scheduler/sched_main.c
+++ b/drivers/gpu/drm/scheduler/sched_main.c
@@ -284,10 +284,21 @@ static void drm_sched_job_timedout(struct work_struct *work)
 	unsigned long flags;
 
 	sched = container_of(work, struct drm_gpu_scheduler, work_tdr.work);
+
+	/* Protects against concurrent deletion in drm_sched_get_cleanup_job */
+	spin_lock_irqsave(&sched->job_list_lock, flags);
 	job = list_first_entry_or_null(&sched->ring_mirror_list,
 				       struct drm_sched_job, node);
 
 	if (job) {
+		/*
+		 * Remove the bad job so it cannot be freed by concurrent
+		 * drm_sched_cleanup_jobs. It will be reinserted back after sched->thread
+		 * is parked at which point it's safe.
+		 */
+		list_del_init(&job->node);
+		spin_unlock_irqrestore(&sched->job_list_lock, flags);
+
 		job->sched->ops->timedout_job(job);
 
 		/*
@@ -298,6 +309,8 @@ static void drm_sched_job_timedout(struct work_struct *work)
 			job->sched->ops->free_job(job);
 			sched->free_guilty = false;
 		}
+	} else {
+		spin_unlock_irqrestore(&sched->job_list_lock, flags);
 	}
 
 	spin_lock_irqsave(&sched->job_list_lock, flags);
@@ -370,6 +383,20 @@ void drm_sched_stop(struct drm_gpu_scheduler *sched, struct drm_sched_job *bad)
 	kthread_park(sched->thread);
 
 	/*
+	 * Reinsert back the bad job here - now it's safe as
+	 * drm_sched_get_cleanup_job cannot race against us and release the
+	 * bad job at this point - we parked (waited for) any in progress
+	 * (earlier) cleanups and drm_sched_get_cleanup_job will not be called
+	 * now until the scheduler thread is unparked.
+	 */
+	if (bad && bad->sched == sched)
+		/*
+		 * Add at the head of the queue to reflect it was the earliest
+		 * job extracted.
+		 */
+		list_add(&bad->node, &sched->ring_mirror_list);
+
+	/*
 	 * Iterate the job list from later to  earlier one and either deactive
 	 * their HW callbacks or remove them from mirror list if they already
 	 * signaled.
-- 
2.7.4

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply related	[flat|nested] 125+ messages in thread

* [PATCH v4] drm/scheduler: Avoid accessing freed bad job.
@ 2019-11-25 20:51 ` Andrey Grodzovsky
  0 siblings, 0 replies; 125+ messages in thread
From: Andrey Grodzovsky @ 2019-11-25 20:51 UTC (permalink / raw)
  Cc: amd-gfx, steven.price, Emily.Deng, dri-devel, Christian.Koenig

Problem:
Due to a race between drm_sched_cleanup_jobs in sched thread and
drm_sched_job_timedout in timeout work there is a possiblity that
bad job was already freed while still being accessed from the
timeout thread.

Fix:
Instead of just peeking at the bad job in the mirror list
remove it from the list under lock and then put it back later when
we are garanteed no race with main sched thread is possible which
is after the thread is parked.

v2: Lock around processing ring_mirror_list in drm_sched_cleanup_jobs.

v3: Rebase on top of drm-misc-next. v2 is not needed anymore as
drm_sched_get_cleanup_job already has a lock there.

v4: Fix comments to relfect latest code in drm-misc.

Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Tested-by: Emily Deng <Emily.Deng@amd.com>
---
 drivers/gpu/drm/scheduler/sched_main.c | 27 +++++++++++++++++++++++++++
 1 file changed, 27 insertions(+)

diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c
index 6774955..1bf9c40 100644
--- a/drivers/gpu/drm/scheduler/sched_main.c
+++ b/drivers/gpu/drm/scheduler/sched_main.c
@@ -284,10 +284,21 @@ static void drm_sched_job_timedout(struct work_struct *work)
 	unsigned long flags;
 
 	sched = container_of(work, struct drm_gpu_scheduler, work_tdr.work);
+
+	/* Protects against concurrent deletion in drm_sched_get_cleanup_job */
+	spin_lock_irqsave(&sched->job_list_lock, flags);
 	job = list_first_entry_or_null(&sched->ring_mirror_list,
 				       struct drm_sched_job, node);
 
 	if (job) {
+		/*
+		 * Remove the bad job so it cannot be freed by concurrent
+		 * drm_sched_cleanup_jobs. It will be reinserted back after sched->thread
+		 * is parked at which point it's safe.
+		 */
+		list_del_init(&job->node);
+		spin_unlock_irqrestore(&sched->job_list_lock, flags);
+
 		job->sched->ops->timedout_job(job);
 
 		/*
@@ -298,6 +309,8 @@ static void drm_sched_job_timedout(struct work_struct *work)
 			job->sched->ops->free_job(job);
 			sched->free_guilty = false;
 		}
+	} else {
+		spin_unlock_irqrestore(&sched->job_list_lock, flags);
 	}
 
 	spin_lock_irqsave(&sched->job_list_lock, flags);
@@ -370,6 +383,20 @@ void drm_sched_stop(struct drm_gpu_scheduler *sched, struct drm_sched_job *bad)
 	kthread_park(sched->thread);
 
 	/*
+	 * Reinsert back the bad job here - now it's safe as
+	 * drm_sched_get_cleanup_job cannot race against us and release the
+	 * bad job at this point - we parked (waited for) any in progress
+	 * (earlier) cleanups and drm_sched_get_cleanup_job will not be called
+	 * now until the scheduler thread is unparked.
+	 */
+	if (bad && bad->sched == sched)
+		/*
+		 * Add at the head of the queue to reflect it was the earliest
+		 * job extracted.
+		 */
+		list_add(&bad->node, &sched->ring_mirror_list);
+
+	/*
 	 * Iterate the job list from later to  earlier one and either deactive
 	 * their HW callbacks or remove them from mirror list if they already
 	 * signaled.
-- 
2.7.4

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply related	[flat|nested] 125+ messages in thread

* [PATCH v4] drm/scheduler: Avoid accessing freed bad job.
@ 2019-11-25 20:51 ` Andrey Grodzovsky
  0 siblings, 0 replies; 125+ messages in thread
From: Andrey Grodzovsky @ 2019-11-25 20:51 UTC (permalink / raw)
  Cc: Andrey Grodzovsky, amd-gfx, steven.price, Emily.Deng, dri-devel,
	Christian.Koenig

Problem:
Due to a race between drm_sched_cleanup_jobs in sched thread and
drm_sched_job_timedout in timeout work there is a possiblity that
bad job was already freed while still being accessed from the
timeout thread.

Fix:
Instead of just peeking at the bad job in the mirror list
remove it from the list under lock and then put it back later when
we are garanteed no race with main sched thread is possible which
is after the thread is parked.

v2: Lock around processing ring_mirror_list in drm_sched_cleanup_jobs.

v3: Rebase on top of drm-misc-next. v2 is not needed anymore as
drm_sched_get_cleanup_job already has a lock there.

v4: Fix comments to relfect latest code in drm-misc.

Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Tested-by: Emily Deng <Emily.Deng@amd.com>
---
 drivers/gpu/drm/scheduler/sched_main.c | 27 +++++++++++++++++++++++++++
 1 file changed, 27 insertions(+)

diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c
index 6774955..1bf9c40 100644
--- a/drivers/gpu/drm/scheduler/sched_main.c
+++ b/drivers/gpu/drm/scheduler/sched_main.c
@@ -284,10 +284,21 @@ static void drm_sched_job_timedout(struct work_struct *work)
 	unsigned long flags;
 
 	sched = container_of(work, struct drm_gpu_scheduler, work_tdr.work);
+
+	/* Protects against concurrent deletion in drm_sched_get_cleanup_job */
+	spin_lock_irqsave(&sched->job_list_lock, flags);
 	job = list_first_entry_or_null(&sched->ring_mirror_list,
 				       struct drm_sched_job, node);
 
 	if (job) {
+		/*
+		 * Remove the bad job so it cannot be freed by concurrent
+		 * drm_sched_cleanup_jobs. It will be reinserted back after sched->thread
+		 * is parked at which point it's safe.
+		 */
+		list_del_init(&job->node);
+		spin_unlock_irqrestore(&sched->job_list_lock, flags);
+
 		job->sched->ops->timedout_job(job);
 
 		/*
@@ -298,6 +309,8 @@ static void drm_sched_job_timedout(struct work_struct *work)
 			job->sched->ops->free_job(job);
 			sched->free_guilty = false;
 		}
+	} else {
+		spin_unlock_irqrestore(&sched->job_list_lock, flags);
 	}
 
 	spin_lock_irqsave(&sched->job_list_lock, flags);
@@ -370,6 +383,20 @@ void drm_sched_stop(struct drm_gpu_scheduler *sched, struct drm_sched_job *bad)
 	kthread_park(sched->thread);
 
 	/*
+	 * Reinsert back the bad job here - now it's safe as
+	 * drm_sched_get_cleanup_job cannot race against us and release the
+	 * bad job at this point - we parked (waited for) any in progress
+	 * (earlier) cleanups and drm_sched_get_cleanup_job will not be called
+	 * now until the scheduler thread is unparked.
+	 */
+	if (bad && bad->sched == sched)
+		/*
+		 * Add at the head of the queue to reflect it was the earliest
+		 * job extracted.
+		 */
+		list_add(&bad->node, &sched->ring_mirror_list);
+
+	/*
 	 * Iterate the job list from later to  earlier one and either deactive
 	 * their HW callbacks or remove them from mirror list if they already
 	 * signaled.
-- 
2.7.4

_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply related	[flat|nested] 125+ messages in thread

* RE: [PATCH v4] drm/scheduler: Avoid accessing freed bad job.
@ 2019-11-25 21:44   ` Deng, Emily
  0 siblings, 0 replies; 125+ messages in thread
From: Deng, Emily @ 2019-11-25 21:44 UTC (permalink / raw)
  To: Grodzovsky, Andrey; +Cc: steven.price, amd-gfx, dri-devel, Koenig, Christian

[AMD Official Use Only - Internal Distribution Only]

Hi Andrey,
    Seems you didn't submit this patch?

Best wishes
Emily Deng



>-----Original Message-----
>From: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
>Sent: Monday, November 25, 2019 12:51 PM
>Cc: dri-devel@lists.freedesktop.org; amd-gfx@lists.freedesktop.org; Koenig,
>Christian <Christian.Koenig@amd.com>; Deng, Emily
><Emily.Deng@amd.com>; steven.price@arm.com; Grodzovsky, Andrey
><Andrey.Grodzovsky@amd.com>
>Subject: [PATCH v4] drm/scheduler: Avoid accessing freed bad job.
>
>Problem:
>Due to a race between drm_sched_cleanup_jobs in sched thread and
>drm_sched_job_timedout in timeout work there is a possiblity that bad job
>was already freed while still being accessed from the timeout thread.
>
>Fix:
>Instead of just peeking at the bad job in the mirror list remove it from the list
>under lock and then put it back later when we are garanteed no race with
>main sched thread is possible which is after the thread is parked.
>
>v2: Lock around processing ring_mirror_list in drm_sched_cleanup_jobs.
>
>v3: Rebase on top of drm-misc-next. v2 is not needed anymore as
>drm_sched_get_cleanup_job already has a lock there.
>
>v4: Fix comments to relfect latest code in drm-misc.
>
>Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
>Reviewed-by: Christian König <christian.koenig@amd.com>
>Tested-by: Emily Deng <Emily.Deng@amd.com>
>---
> drivers/gpu/drm/scheduler/sched_main.c | 27
>+++++++++++++++++++++++++++
> 1 file changed, 27 insertions(+)
>
>diff --git a/drivers/gpu/drm/scheduler/sched_main.c
>b/drivers/gpu/drm/scheduler/sched_main.c
>index 6774955..1bf9c40 100644
>--- a/drivers/gpu/drm/scheduler/sched_main.c
>+++ b/drivers/gpu/drm/scheduler/sched_main.c
>@@ -284,10 +284,21 @@ static void drm_sched_job_timedout(struct
>work_struct *work)
> 	unsigned long flags;
>
> 	sched = container_of(work, struct drm_gpu_scheduler,
>work_tdr.work);
>+
>+	/* Protects against concurrent deletion in
>drm_sched_get_cleanup_job */
>+	spin_lock_irqsave(&sched->job_list_lock, flags);
> 	job = list_first_entry_or_null(&sched->ring_mirror_list,
> 				       struct drm_sched_job, node);
>
> 	if (job) {
>+		/*
>+		 * Remove the bad job so it cannot be freed by concurrent
>+		 * drm_sched_cleanup_jobs. It will be reinserted back after
>sched->thread
>+		 * is parked at which point it's safe.
>+		 */
>+		list_del_init(&job->node);
>+		spin_unlock_irqrestore(&sched->job_list_lock, flags);
>+
> 		job->sched->ops->timedout_job(job);
>
> 		/*
>@@ -298,6 +309,8 @@ static void drm_sched_job_timedout(struct
>work_struct *work)
> 			job->sched->ops->free_job(job);
> 			sched->free_guilty = false;
> 		}
>+	} else {
>+		spin_unlock_irqrestore(&sched->job_list_lock, flags);
> 	}
>
> 	spin_lock_irqsave(&sched->job_list_lock, flags); @@ -370,6 +383,20
>@@ void drm_sched_stop(struct drm_gpu_scheduler *sched, struct
>drm_sched_job *bad)
> 	kthread_park(sched->thread);
>
> 	/*
>+	 * Reinsert back the bad job here - now it's safe as
>+	 * drm_sched_get_cleanup_job cannot race against us and release the
>+	 * bad job at this point - we parked (waited for) any in progress
>+	 * (earlier) cleanups and drm_sched_get_cleanup_job will not be
>called
>+	 * now until the scheduler thread is unparked.
>+	 */
>+	if (bad && bad->sched == sched)
>+		/*
>+		 * Add at the head of the queue to reflect it was the earliest
>+		 * job extracted.
>+		 */
>+		list_add(&bad->node, &sched->ring_mirror_list);
>+
>+	/*
> 	 * Iterate the job list from later to  earlier one and either deactive
> 	 * their HW callbacks or remove them from mirror list if they already
> 	 * signaled.
>--
>2.7.4
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 125+ messages in thread

* RE: [PATCH v4] drm/scheduler: Avoid accessing freed bad job.
@ 2019-11-25 21:44   ` Deng, Emily
  0 siblings, 0 replies; 125+ messages in thread
From: Deng, Emily @ 2019-11-25 21:44 UTC (permalink / raw)
  To: Grodzovsky, Andrey; +Cc: steven.price, amd-gfx, dri-devel, Koenig,  Christian

[AMD Official Use Only - Internal Distribution Only]

Hi Andrey,
    Seems you didn't submit this patch?

Best wishes
Emily Deng



>-----Original Message-----
>From: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
>Sent: Monday, November 25, 2019 12:51 PM
>Cc: dri-devel@lists.freedesktop.org; amd-gfx@lists.freedesktop.org; Koenig,
>Christian <Christian.Koenig@amd.com>; Deng, Emily
><Emily.Deng@amd.com>; steven.price@arm.com; Grodzovsky, Andrey
><Andrey.Grodzovsky@amd.com>
>Subject: [PATCH v4] drm/scheduler: Avoid accessing freed bad job.
>
>Problem:
>Due to a race between drm_sched_cleanup_jobs in sched thread and
>drm_sched_job_timedout in timeout work there is a possiblity that bad job
>was already freed while still being accessed from the timeout thread.
>
>Fix:
>Instead of just peeking at the bad job in the mirror list remove it from the list
>under lock and then put it back later when we are garanteed no race with
>main sched thread is possible which is after the thread is parked.
>
>v2: Lock around processing ring_mirror_list in drm_sched_cleanup_jobs.
>
>v3: Rebase on top of drm-misc-next. v2 is not needed anymore as
>drm_sched_get_cleanup_job already has a lock there.
>
>v4: Fix comments to relfect latest code in drm-misc.
>
>Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
>Reviewed-by: Christian König <christian.koenig@amd.com>
>Tested-by: Emily Deng <Emily.Deng@amd.com>
>---
> drivers/gpu/drm/scheduler/sched_main.c | 27
>+++++++++++++++++++++++++++
> 1 file changed, 27 insertions(+)
>
>diff --git a/drivers/gpu/drm/scheduler/sched_main.c
>b/drivers/gpu/drm/scheduler/sched_main.c
>index 6774955..1bf9c40 100644
>--- a/drivers/gpu/drm/scheduler/sched_main.c
>+++ b/drivers/gpu/drm/scheduler/sched_main.c
>@@ -284,10 +284,21 @@ static void drm_sched_job_timedout(struct
>work_struct *work)
> 	unsigned long flags;
>
> 	sched = container_of(work, struct drm_gpu_scheduler,
>work_tdr.work);
>+
>+	/* Protects against concurrent deletion in
>drm_sched_get_cleanup_job */
>+	spin_lock_irqsave(&sched->job_list_lock, flags);
> 	job = list_first_entry_or_null(&sched->ring_mirror_list,
> 				       struct drm_sched_job, node);
>
> 	if (job) {
>+		/*
>+		 * Remove the bad job so it cannot be freed by concurrent
>+		 * drm_sched_cleanup_jobs. It will be reinserted back after
>sched->thread
>+		 * is parked at which point it's safe.
>+		 */
>+		list_del_init(&job->node);
>+		spin_unlock_irqrestore(&sched->job_list_lock, flags);
>+
> 		job->sched->ops->timedout_job(job);
>
> 		/*
>@@ -298,6 +309,8 @@ static void drm_sched_job_timedout(struct
>work_struct *work)
> 			job->sched->ops->free_job(job);
> 			sched->free_guilty = false;
> 		}
>+	} else {
>+		spin_unlock_irqrestore(&sched->job_list_lock, flags);
> 	}
>
> 	spin_lock_irqsave(&sched->job_list_lock, flags); @@ -370,6 +383,20
>@@ void drm_sched_stop(struct drm_gpu_scheduler *sched, struct
>drm_sched_job *bad)
> 	kthread_park(sched->thread);
>
> 	/*
>+	 * Reinsert back the bad job here - now it's safe as
>+	 * drm_sched_get_cleanup_job cannot race against us and release the
>+	 * bad job at this point - we parked (waited for) any in progress
>+	 * (earlier) cleanups and drm_sched_get_cleanup_job will not be
>called
>+	 * now until the scheduler thread is unparked.
>+	 */
>+	if (bad && bad->sched == sched)
>+		/*
>+		 * Add at the head of the queue to reflect it was the earliest
>+		 * job extracted.
>+		 */
>+		list_add(&bad->node, &sched->ring_mirror_list);
>+
>+	/*
> 	 * Iterate the job list from later to  earlier one and either deactive
> 	 * their HW callbacks or remove them from mirror list if they already
> 	 * signaled.
>--
>2.7.4
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 125+ messages in thread

* RE: [PATCH v4] drm/scheduler: Avoid accessing freed bad job.
@ 2019-11-25 21:44   ` Deng, Emily
  0 siblings, 0 replies; 125+ messages in thread
From: Deng, Emily @ 2019-11-25 21:44 UTC (permalink / raw)
  To: Grodzovsky, Andrey
  Cc: Grodzovsky, Andrey, steven.price, amd-gfx, dri-devel, Koenig,  Christian

[AMD Official Use Only - Internal Distribution Only]

Hi Andrey,
    Seems you didn't submit this patch?

Best wishes
Emily Deng



>-----Original Message-----
>From: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
>Sent: Monday, November 25, 2019 12:51 PM
>Cc: dri-devel@lists.freedesktop.org; amd-gfx@lists.freedesktop.org; Koenig,
>Christian <Christian.Koenig@amd.com>; Deng, Emily
><Emily.Deng@amd.com>; steven.price@arm.com; Grodzovsky, Andrey
><Andrey.Grodzovsky@amd.com>
>Subject: [PATCH v4] drm/scheduler: Avoid accessing freed bad job.
>
>Problem:
>Due to a race between drm_sched_cleanup_jobs in sched thread and
>drm_sched_job_timedout in timeout work there is a possiblity that bad job
>was already freed while still being accessed from the timeout thread.
>
>Fix:
>Instead of just peeking at the bad job in the mirror list remove it from the list
>under lock and then put it back later when we are garanteed no race with
>main sched thread is possible which is after the thread is parked.
>
>v2: Lock around processing ring_mirror_list in drm_sched_cleanup_jobs.
>
>v3: Rebase on top of drm-misc-next. v2 is not needed anymore as
>drm_sched_get_cleanup_job already has a lock there.
>
>v4: Fix comments to relfect latest code in drm-misc.
>
>Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
>Reviewed-by: Christian König <christian.koenig@amd.com>
>Tested-by: Emily Deng <Emily.Deng@amd.com>
>---
> drivers/gpu/drm/scheduler/sched_main.c | 27
>+++++++++++++++++++++++++++
> 1 file changed, 27 insertions(+)
>
>diff --git a/drivers/gpu/drm/scheduler/sched_main.c
>b/drivers/gpu/drm/scheduler/sched_main.c
>index 6774955..1bf9c40 100644
>--- a/drivers/gpu/drm/scheduler/sched_main.c
>+++ b/drivers/gpu/drm/scheduler/sched_main.c
>@@ -284,10 +284,21 @@ static void drm_sched_job_timedout(struct
>work_struct *work)
> 	unsigned long flags;
>
> 	sched = container_of(work, struct drm_gpu_scheduler,
>work_tdr.work);
>+
>+	/* Protects against concurrent deletion in
>drm_sched_get_cleanup_job */
>+	spin_lock_irqsave(&sched->job_list_lock, flags);
> 	job = list_first_entry_or_null(&sched->ring_mirror_list,
> 				       struct drm_sched_job, node);
>
> 	if (job) {
>+		/*
>+		 * Remove the bad job so it cannot be freed by concurrent
>+		 * drm_sched_cleanup_jobs. It will be reinserted back after
>sched->thread
>+		 * is parked at which point it's safe.
>+		 */
>+		list_del_init(&job->node);
>+		spin_unlock_irqrestore(&sched->job_list_lock, flags);
>+
> 		job->sched->ops->timedout_job(job);
>
> 		/*
>@@ -298,6 +309,8 @@ static void drm_sched_job_timedout(struct
>work_struct *work)
> 			job->sched->ops->free_job(job);
> 			sched->free_guilty = false;
> 		}
>+	} else {
>+		spin_unlock_irqrestore(&sched->job_list_lock, flags);
> 	}
>
> 	spin_lock_irqsave(&sched->job_list_lock, flags); @@ -370,6 +383,20
>@@ void drm_sched_stop(struct drm_gpu_scheduler *sched, struct
>drm_sched_job *bad)
> 	kthread_park(sched->thread);
>
> 	/*
>+	 * Reinsert back the bad job here - now it's safe as
>+	 * drm_sched_get_cleanup_job cannot race against us and release the
>+	 * bad job at this point - we parked (waited for) any in progress
>+	 * (earlier) cleanups and drm_sched_get_cleanup_job will not be
>called
>+	 * now until the scheduler thread is unparked.
>+	 */
>+	if (bad && bad->sched == sched)
>+		/*
>+		 * Add at the head of the queue to reflect it was the earliest
>+		 * job extracted.
>+		 */
>+		list_add(&bad->node, &sched->ring_mirror_list);
>+
>+	/*
> 	 * Iterate the job list from later to  earlier one and either deactive
> 	 * their HW callbacks or remove them from mirror list if they already
> 	 * signaled.
>--
>2.7.4
_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 125+ messages in thread

* Re: [PATCH v4] drm/scheduler: Avoid accessing freed bad job.
@ 2019-11-26  0:09     ` Grodzovsky, Andrey
  0 siblings, 0 replies; 125+ messages in thread
From: Grodzovsky, Andrey @ 2019-11-26  0:09 UTC (permalink / raw)
  To: Deng, Emily
  Cc: Deucher, Alexander, steven.price, amd-gfx, dri-devel, Koenig, Christian

Christian asked to submit it to drm-misc instead of our drm-next to avoid later conflicts with Steven's patch which he mentioned in this thread which is not in drm-next yet.
Christian, Alex, once this merged to drm-misc I guess we need to pull all latest changes from there to drm-next so the issue Emily reported can be avoided.

Andrey

________________________________________
From: Deng, Emily <Emily.Deng@amd.com>
Sent: 25 November 2019 16:44:36
To: Grodzovsky, Andrey
Cc: dri-devel@lists.freedesktop.org; amd-gfx@lists.freedesktop.org; Koenig, Christian; steven.price@arm.com; Grodzovsky, Andrey
Subject: RE: [PATCH v4] drm/scheduler: Avoid accessing freed bad job.

[AMD Official Use Only - Internal Distribution Only]

Hi Andrey,
    Seems you didn't submit this patch?

Best wishes
Emily Deng



>-----Original Message-----
>From: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
>Sent: Monday, November 25, 2019 12:51 PM
>Cc: dri-devel@lists.freedesktop.org; amd-gfx@lists.freedesktop.org; Koenig,
>Christian <Christian.Koenig@amd.com>; Deng, Emily
><Emily.Deng@amd.com>; steven.price@arm.com; Grodzovsky, Andrey
><Andrey.Grodzovsky@amd.com>
>Subject: [PATCH v4] drm/scheduler: Avoid accessing freed bad job.
>
>Problem:
>Due to a race between drm_sched_cleanup_jobs in sched thread and
>drm_sched_job_timedout in timeout work there is a possiblity that bad job
>was already freed while still being accessed from the timeout thread.
>
>Fix:
>Instead of just peeking at the bad job in the mirror list remove it from the list
>under lock and then put it back later when we are garanteed no race with
>main sched thread is possible which is after the thread is parked.
>
>v2: Lock around processing ring_mirror_list in drm_sched_cleanup_jobs.
>
>v3: Rebase on top of drm-misc-next. v2 is not needed anymore as
>drm_sched_get_cleanup_job already has a lock there.
>
>v4: Fix comments to relfect latest code in drm-misc.
>
>Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
>Reviewed-by: Christian König <christian.koenig@amd.com>
>Tested-by: Emily Deng <Emily.Deng@amd.com>
>---
> drivers/gpu/drm/scheduler/sched_main.c | 27
>+++++++++++++++++++++++++++
> 1 file changed, 27 insertions(+)
>
>diff --git a/drivers/gpu/drm/scheduler/sched_main.c
>b/drivers/gpu/drm/scheduler/sched_main.c
>index 6774955..1bf9c40 100644
>--- a/drivers/gpu/drm/scheduler/sched_main.c
>+++ b/drivers/gpu/drm/scheduler/sched_main.c
>@@ -284,10 +284,21 @@ static void drm_sched_job_timedout(struct
>work_struct *work)
>       unsigned long flags;
>
>       sched = container_of(work, struct drm_gpu_scheduler,
>work_tdr.work);
>+
>+      /* Protects against concurrent deletion in
>drm_sched_get_cleanup_job */
>+      spin_lock_irqsave(&sched->job_list_lock, flags);
>       job = list_first_entry_or_null(&sched->ring_mirror_list,
>                                      struct drm_sched_job, node);
>
>       if (job) {
>+              /*
>+               * Remove the bad job so it cannot be freed by concurrent
>+               * drm_sched_cleanup_jobs. It will be reinserted back after
>sched->thread
>+               * is parked at which point it's safe.
>+               */
>+              list_del_init(&job->node);
>+              spin_unlock_irqrestore(&sched->job_list_lock, flags);
>+
>               job->sched->ops->timedout_job(job);
>
>               /*
>@@ -298,6 +309,8 @@ static void drm_sched_job_timedout(struct
>work_struct *work)
>                       job->sched->ops->free_job(job);
>                       sched->free_guilty = false;
>               }
>+      } else {
>+              spin_unlock_irqrestore(&sched->job_list_lock, flags);
>       }
>
>       spin_lock_irqsave(&sched->job_list_lock, flags); @@ -370,6 +383,20
>@@ void drm_sched_stop(struct drm_gpu_scheduler *sched, struct
>drm_sched_job *bad)
>       kthread_park(sched->thread);
>
>       /*
>+       * Reinsert back the bad job here - now it's safe as
>+       * drm_sched_get_cleanup_job cannot race against us and release the
>+       * bad job at this point - we parked (waited for) any in progress
>+       * (earlier) cleanups and drm_sched_get_cleanup_job will not be
>called
>+       * now until the scheduler thread is unparked.
>+       */
>+      if (bad && bad->sched == sched)
>+              /*
>+               * Add at the head of the queue to reflect it was the earliest
>+               * job extracted.
>+               */
>+              list_add(&bad->node, &sched->ring_mirror_list);
>+
>+      /*
>        * Iterate the job list from later to  earlier one and either deactive
>        * their HW callbacks or remove them from mirror list if they already
>        * signaled.
>--
>2.7.4
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 125+ messages in thread

* Re: [PATCH v4] drm/scheduler: Avoid accessing freed bad job.
@ 2019-11-26  0:09     ` Grodzovsky, Andrey
  0 siblings, 0 replies; 125+ messages in thread
From: Grodzovsky, Andrey @ 2019-11-26  0:09 UTC (permalink / raw)
  To: Deng, Emily
  Cc: Deucher, Alexander, steven.price, amd-gfx, dri-devel, Koenig,  Christian

Christian asked to submit it to drm-misc instead of our drm-next to avoid later conflicts with Steven's patch which he mentioned in this thread which is not in drm-next yet.
Christian, Alex, once this merged to drm-misc I guess we need to pull all latest changes from there to drm-next so the issue Emily reported can be avoided.

Andrey

________________________________________
From: Deng, Emily <Emily.Deng@amd.com>
Sent: 25 November 2019 16:44:36
To: Grodzovsky, Andrey
Cc: dri-devel@lists.freedesktop.org; amd-gfx@lists.freedesktop.org; Koenig, Christian; steven.price@arm.com; Grodzovsky, Andrey
Subject: RE: [PATCH v4] drm/scheduler: Avoid accessing freed bad job.

[AMD Official Use Only - Internal Distribution Only]

Hi Andrey,
    Seems you didn't submit this patch?

Best wishes
Emily Deng



>-----Original Message-----
>From: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
>Sent: Monday, November 25, 2019 12:51 PM
>Cc: dri-devel@lists.freedesktop.org; amd-gfx@lists.freedesktop.org; Koenig,
>Christian <Christian.Koenig@amd.com>; Deng, Emily
><Emily.Deng@amd.com>; steven.price@arm.com; Grodzovsky, Andrey
><Andrey.Grodzovsky@amd.com>
>Subject: [PATCH v4] drm/scheduler: Avoid accessing freed bad job.
>
>Problem:
>Due to a race between drm_sched_cleanup_jobs in sched thread and
>drm_sched_job_timedout in timeout work there is a possiblity that bad job
>was already freed while still being accessed from the timeout thread.
>
>Fix:
>Instead of just peeking at the bad job in the mirror list remove it from the list
>under lock and then put it back later when we are garanteed no race with
>main sched thread is possible which is after the thread is parked.
>
>v2: Lock around processing ring_mirror_list in drm_sched_cleanup_jobs.
>
>v3: Rebase on top of drm-misc-next. v2 is not needed anymore as
>drm_sched_get_cleanup_job already has a lock there.
>
>v4: Fix comments to relfect latest code in drm-misc.
>
>Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
>Reviewed-by: Christian König <christian.koenig@amd.com>
>Tested-by: Emily Deng <Emily.Deng@amd.com>
>---
> drivers/gpu/drm/scheduler/sched_main.c | 27
>+++++++++++++++++++++++++++
> 1 file changed, 27 insertions(+)
>
>diff --git a/drivers/gpu/drm/scheduler/sched_main.c
>b/drivers/gpu/drm/scheduler/sched_main.c
>index 6774955..1bf9c40 100644
>--- a/drivers/gpu/drm/scheduler/sched_main.c
>+++ b/drivers/gpu/drm/scheduler/sched_main.c
>@@ -284,10 +284,21 @@ static void drm_sched_job_timedout(struct
>work_struct *work)
>       unsigned long flags;
>
>       sched = container_of(work, struct drm_gpu_scheduler,
>work_tdr.work);
>+
>+      /* Protects against concurrent deletion in
>drm_sched_get_cleanup_job */
>+      spin_lock_irqsave(&sched->job_list_lock, flags);
>       job = list_first_entry_or_null(&sched->ring_mirror_list,
>                                      struct drm_sched_job, node);
>
>       if (job) {
>+              /*
>+               * Remove the bad job so it cannot be freed by concurrent
>+               * drm_sched_cleanup_jobs. It will be reinserted back after
>sched->thread
>+               * is parked at which point it's safe.
>+               */
>+              list_del_init(&job->node);
>+              spin_unlock_irqrestore(&sched->job_list_lock, flags);
>+
>               job->sched->ops->timedout_job(job);
>
>               /*
>@@ -298,6 +309,8 @@ static void drm_sched_job_timedout(struct
>work_struct *work)
>                       job->sched->ops->free_job(job);
>                       sched->free_guilty = false;
>               }
>+      } else {
>+              spin_unlock_irqrestore(&sched->job_list_lock, flags);
>       }
>
>       spin_lock_irqsave(&sched->job_list_lock, flags); @@ -370,6 +383,20
>@@ void drm_sched_stop(struct drm_gpu_scheduler *sched, struct
>drm_sched_job *bad)
>       kthread_park(sched->thread);
>
>       /*
>+       * Reinsert back the bad job here - now it's safe as
>+       * drm_sched_get_cleanup_job cannot race against us and release the
>+       * bad job at this point - we parked (waited for) any in progress
>+       * (earlier) cleanups and drm_sched_get_cleanup_job will not be
>called
>+       * now until the scheduler thread is unparked.
>+       */
>+      if (bad && bad->sched == sched)
>+              /*
>+               * Add at the head of the queue to reflect it was the earliest
>+               * job extracted.
>+               */
>+              list_add(&bad->node, &sched->ring_mirror_list);
>+
>+      /*
>        * Iterate the job list from later to  earlier one and either deactive
>        * their HW callbacks or remove them from mirror list if they already
>        * signaled.
>--
>2.7.4
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 125+ messages in thread

* Re: [PATCH v4] drm/scheduler: Avoid accessing freed bad job.
@ 2019-11-26  0:09     ` Grodzovsky, Andrey
  0 siblings, 0 replies; 125+ messages in thread
From: Grodzovsky, Andrey @ 2019-11-26  0:09 UTC (permalink / raw)
  To: Deng, Emily
  Cc: Deucher, Alexander, steven.price, amd-gfx, dri-devel, Koenig,  Christian

Christian asked to submit it to drm-misc instead of our drm-next to avoid later conflicts with Steven's patch which he mentioned in this thread which is not in drm-next yet.
Christian, Alex, once this merged to drm-misc I guess we need to pull all latest changes from there to drm-next so the issue Emily reported can be avoided.

Andrey

________________________________________
From: Deng, Emily <Emily.Deng@amd.com>
Sent: 25 November 2019 16:44:36
To: Grodzovsky, Andrey
Cc: dri-devel@lists.freedesktop.org; amd-gfx@lists.freedesktop.org; Koenig, Christian; steven.price@arm.com; Grodzovsky, Andrey
Subject: RE: [PATCH v4] drm/scheduler: Avoid accessing freed bad job.

[AMD Official Use Only - Internal Distribution Only]

Hi Andrey,
    Seems you didn't submit this patch?

Best wishes
Emily Deng



>-----Original Message-----
>From: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
>Sent: Monday, November 25, 2019 12:51 PM
>Cc: dri-devel@lists.freedesktop.org; amd-gfx@lists.freedesktop.org; Koenig,
>Christian <Christian.Koenig@amd.com>; Deng, Emily
><Emily.Deng@amd.com>; steven.price@arm.com; Grodzovsky, Andrey
><Andrey.Grodzovsky@amd.com>
>Subject: [PATCH v4] drm/scheduler: Avoid accessing freed bad job.
>
>Problem:
>Due to a race between drm_sched_cleanup_jobs in sched thread and
>drm_sched_job_timedout in timeout work there is a possiblity that bad job
>was already freed while still being accessed from the timeout thread.
>
>Fix:
>Instead of just peeking at the bad job in the mirror list remove it from the list
>under lock and then put it back later when we are garanteed no race with
>main sched thread is possible which is after the thread is parked.
>
>v2: Lock around processing ring_mirror_list in drm_sched_cleanup_jobs.
>
>v3: Rebase on top of drm-misc-next. v2 is not needed anymore as
>drm_sched_get_cleanup_job already has a lock there.
>
>v4: Fix comments to relfect latest code in drm-misc.
>
>Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
>Reviewed-by: Christian König <christian.koenig@amd.com>
>Tested-by: Emily Deng <Emily.Deng@amd.com>
>---
> drivers/gpu/drm/scheduler/sched_main.c | 27
>+++++++++++++++++++++++++++
> 1 file changed, 27 insertions(+)
>
>diff --git a/drivers/gpu/drm/scheduler/sched_main.c
>b/drivers/gpu/drm/scheduler/sched_main.c
>index 6774955..1bf9c40 100644
>--- a/drivers/gpu/drm/scheduler/sched_main.c
>+++ b/drivers/gpu/drm/scheduler/sched_main.c
>@@ -284,10 +284,21 @@ static void drm_sched_job_timedout(struct
>work_struct *work)
>       unsigned long flags;
>
>       sched = container_of(work, struct drm_gpu_scheduler,
>work_tdr.work);
>+
>+      /* Protects against concurrent deletion in
>drm_sched_get_cleanup_job */
>+      spin_lock_irqsave(&sched->job_list_lock, flags);
>       job = list_first_entry_or_null(&sched->ring_mirror_list,
>                                      struct drm_sched_job, node);
>
>       if (job) {
>+              /*
>+               * Remove the bad job so it cannot be freed by concurrent
>+               * drm_sched_cleanup_jobs. It will be reinserted back after
>sched->thread
>+               * is parked at which point it's safe.
>+               */
>+              list_del_init(&job->node);
>+              spin_unlock_irqrestore(&sched->job_list_lock, flags);
>+
>               job->sched->ops->timedout_job(job);
>
>               /*
>@@ -298,6 +309,8 @@ static void drm_sched_job_timedout(struct
>work_struct *work)
>                       job->sched->ops->free_job(job);
>                       sched->free_guilty = false;
>               }
>+      } else {
>+              spin_unlock_irqrestore(&sched->job_list_lock, flags);
>       }
>
>       spin_lock_irqsave(&sched->job_list_lock, flags); @@ -370,6 +383,20
>@@ void drm_sched_stop(struct drm_gpu_scheduler *sched, struct
>drm_sched_job *bad)
>       kthread_park(sched->thread);
>
>       /*
>+       * Reinsert back the bad job here - now it's safe as
>+       * drm_sched_get_cleanup_job cannot race against us and release the
>+       * bad job at this point - we parked (waited for) any in progress
>+       * (earlier) cleanups and drm_sched_get_cleanup_job will not be
>called
>+       * now until the scheduler thread is unparked.
>+       */
>+      if (bad && bad->sched == sched)
>+              /*
>+               * Add at the head of the queue to reflect it was the earliest
>+               * job extracted.
>+               */
>+              list_add(&bad->node, &sched->ring_mirror_list);
>+
>+      /*
>        * Iterate the job list from later to  earlier one and either deactive
>        * their HW callbacks or remove them from mirror list if they already
>        * signaled.
>--
>2.7.4
_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 125+ messages in thread

* Re: [PATCH v4] drm/scheduler: Avoid accessing freed bad job.
@ 2019-11-26 15:36         ` Deucher, Alexander
  0 siblings, 0 replies; 125+ messages in thread
From: Deucher, Alexander @ 2019-11-26 15:36 UTC (permalink / raw)
  To: Grodzovsky, Andrey, Deng, Emily
  Cc: steven.price-5wv7dgnIgG8,
	amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW,
	dri-devel-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW, Koenig, Christian


[-- Attachment #1.1: Type: text/plain, Size: 6025 bytes --]

I recently updated amd-staging-drm-next.  Apply whatever makes sense for now and it'll naturally fall out in the next rebase.

Alex
________________________________
From: Grodzovsky, Andrey <Andrey.Grodzovsky-5C7GfCeVMHo@public.gmane.org>
Sent: Monday, November 25, 2019 7:09 PM
To: Deng, Emily <Emily.Deng-5C7GfCeVMHo@public.gmane.org>
Cc: dri-devel-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW@public.gmane.org <dri-devel-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW@public.gmane.org>; amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW@public.gmane.org <amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW@public.gmane.org>; Koenig, Christian <Christian.Koenig-5C7GfCeVMHo@public.gmane.org>; steven.price-5wv7dgnIgG8@public.gmane.org <steven.price-5wv7dgnIgG8@public.gmane.org>; Deucher, Alexander <Alexander.Deucher-5C7GfCeVMHo@public.gmane.org>
Subject: Re: [PATCH v4] drm/scheduler: Avoid accessing freed bad job.

Christian asked to submit it to drm-misc instead of our drm-next to avoid later conflicts with Steven's patch which he mentioned in this thread which is not in drm-next yet.
Christian, Alex, once this merged to drm-misc I guess we need to pull all latest changes from there to drm-next so the issue Emily reported can be avoided.

Andrey

________________________________________
From: Deng, Emily <Emily.Deng-5C7GfCeVMHo@public.gmane.org>
Sent: 25 November 2019 16:44:36
To: Grodzovsky, Andrey
Cc: dri-devel-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW@public.gmane.org; amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW@public.gmane.org; Koenig, Christian; steven.price-5wv7dgnIgG8@public.gmane.org; Grodzovsky, Andrey
Subject: RE: [PATCH v4] drm/scheduler: Avoid accessing freed bad job.

[AMD Official Use Only - Internal Distribution Only]

Hi Andrey,
    Seems you didn't submit this patch?

Best wishes
Emily Deng



>-----Original Message-----
>From: Andrey Grodzovsky <andrey.grodzovsky-5C7GfCeVMHo@public.gmane.org>
>Sent: Monday, November 25, 2019 12:51 PM
>Cc: dri-devel-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW@public.gmane.org; amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW@public.gmane.org; Koenig,
>Christian <Christian.Koenig-5C7GfCeVMHo@public.gmane.org>; Deng, Emily
><Emily.Deng-5C7GfCeVMHo@public.gmane.org>; steven.price-5wv7dgnIgG8@public.gmane.org; Grodzovsky, Andrey
><Andrey.Grodzovsky-5C7GfCeVMHo@public.gmane.org>
>Subject: [PATCH v4] drm/scheduler: Avoid accessing freed bad job.
>
>Problem:
>Due to a race between drm_sched_cleanup_jobs in sched thread and
>drm_sched_job_timedout in timeout work there is a possiblity that bad job
>was already freed while still being accessed from the timeout thread.
>
>Fix:
>Instead of just peeking at the bad job in the mirror list remove it from the list
>under lock and then put it back later when we are garanteed no race with
>main sched thread is possible which is after the thread is parked.
>
>v2: Lock around processing ring_mirror_list in drm_sched_cleanup_jobs.
>
>v3: Rebase on top of drm-misc-next. v2 is not needed anymore as
>drm_sched_get_cleanup_job already has a lock there.
>
>v4: Fix comments to relfect latest code in drm-misc.
>
>Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky-5C7GfCeVMHo@public.gmane.org>
>Reviewed-by: Christian König <christian.koenig-5C7GfCeVMHo@public.gmane.org>
>Tested-by: Emily Deng <Emily.Deng-5C7GfCeVMHo@public.gmane.org>
>---
> drivers/gpu/drm/scheduler/sched_main.c | 27
>+++++++++++++++++++++++++++
> 1 file changed, 27 insertions(+)
>
>diff --git a/drivers/gpu/drm/scheduler/sched_main.c
>b/drivers/gpu/drm/scheduler/sched_main.c
>index 6774955..1bf9c40 100644
>--- a/drivers/gpu/drm/scheduler/sched_main.c
>+++ b/drivers/gpu/drm/scheduler/sched_main.c
>@@ -284,10 +284,21 @@ static void drm_sched_job_timedout(struct
>work_struct *work)
>       unsigned long flags;
>
>       sched = container_of(work, struct drm_gpu_scheduler,
>work_tdr.work);
>+
>+      /* Protects against concurrent deletion in
>drm_sched_get_cleanup_job */
>+      spin_lock_irqsave(&sched->job_list_lock, flags);
>       job = list_first_entry_or_null(&sched->ring_mirror_list,
>                                      struct drm_sched_job, node);
>
>       if (job) {
>+              /*
>+               * Remove the bad job so it cannot be freed by concurrent
>+               * drm_sched_cleanup_jobs. It will be reinserted back after
>sched->thread
>+               * is parked at which point it's safe.
>+               */
>+              list_del_init(&job->node);
>+              spin_unlock_irqrestore(&sched->job_list_lock, flags);
>+
>               job->sched->ops->timedout_job(job);
>
>               /*
>@@ -298,6 +309,8 @@ static void drm_sched_job_timedout(struct
>work_struct *work)
>                       job->sched->ops->free_job(job);
>                       sched->free_guilty = false;
>               }
>+      } else {
>+              spin_unlock_irqrestore(&sched->job_list_lock, flags);
>       }
>
>       spin_lock_irqsave(&sched->job_list_lock, flags); @@ -370,6 +383,20
>@@ void drm_sched_stop(struct drm_gpu_scheduler *sched, struct
>drm_sched_job *bad)
>       kthread_park(sched->thread);
>
>       /*
>+       * Reinsert back the bad job here - now it's safe as
>+       * drm_sched_get_cleanup_job cannot race against us and release the
>+       * bad job at this point - we parked (waited for) any in progress
>+       * (earlier) cleanups and drm_sched_get_cleanup_job will not be
>called
>+       * now until the scheduler thread is unparked.
>+       */
>+      if (bad && bad->sched == sched)
>+              /*
>+               * Add at the head of the queue to reflect it was the earliest
>+               * job extracted.
>+               */
>+              list_add(&bad->node, &sched->ring_mirror_list);
>+
>+      /*
>        * Iterate the job list from later to  earlier one and either deactive
>        * their HW callbacks or remove them from mirror list if they already
>        * signaled.
>--
>2.7.4

[-- Attachment #1.2: Type: text/html, Size: 10334 bytes --]

[-- Attachment #2: Type: text/plain, Size: 153 bytes --]

_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 125+ messages in thread

* Re: [PATCH v4] drm/scheduler: Avoid accessing freed bad job.
@ 2019-11-26 15:36         ` Deucher, Alexander
  0 siblings, 0 replies; 125+ messages in thread
From: Deucher, Alexander @ 2019-11-26 15:36 UTC (permalink / raw)
  To: Grodzovsky, Andrey, Deng, Emily
  Cc: steven.price, amd-gfx, dri-devel, Koenig,  Christian


[-- Attachment #1.1: Type: text/plain, Size: 5465 bytes --]

I recently updated amd-staging-drm-next.  Apply whatever makes sense for now and it'll naturally fall out in the next rebase.

Alex
________________________________
From: Grodzovsky, Andrey <Andrey.Grodzovsky@amd.com>
Sent: Monday, November 25, 2019 7:09 PM
To: Deng, Emily <Emily.Deng@amd.com>
Cc: dri-devel@lists.freedesktop.org <dri-devel@lists.freedesktop.org>; amd-gfx@lists.freedesktop.org <amd-gfx@lists.freedesktop.org>; Koenig, Christian <Christian.Koenig@amd.com>; steven.price@arm.com <steven.price@arm.com>; Deucher, Alexander <Alexander.Deucher@amd.com>
Subject: Re: [PATCH v4] drm/scheduler: Avoid accessing freed bad job.

Christian asked to submit it to drm-misc instead of our drm-next to avoid later conflicts with Steven's patch which he mentioned in this thread which is not in drm-next yet.
Christian, Alex, once this merged to drm-misc I guess we need to pull all latest changes from there to drm-next so the issue Emily reported can be avoided.

Andrey

________________________________________
From: Deng, Emily <Emily.Deng@amd.com>
Sent: 25 November 2019 16:44:36
To: Grodzovsky, Andrey
Cc: dri-devel@lists.freedesktop.org; amd-gfx@lists.freedesktop.org; Koenig, Christian; steven.price@arm.com; Grodzovsky, Andrey
Subject: RE: [PATCH v4] drm/scheduler: Avoid accessing freed bad job.

[AMD Official Use Only - Internal Distribution Only]

Hi Andrey,
    Seems you didn't submit this patch?

Best wishes
Emily Deng



>-----Original Message-----
>From: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
>Sent: Monday, November 25, 2019 12:51 PM
>Cc: dri-devel@lists.freedesktop.org; amd-gfx@lists.freedesktop.org; Koenig,
>Christian <Christian.Koenig@amd.com>; Deng, Emily
><Emily.Deng@amd.com>; steven.price@arm.com; Grodzovsky, Andrey
><Andrey.Grodzovsky@amd.com>
>Subject: [PATCH v4] drm/scheduler: Avoid accessing freed bad job.
>
>Problem:
>Due to a race between drm_sched_cleanup_jobs in sched thread and
>drm_sched_job_timedout in timeout work there is a possiblity that bad job
>was already freed while still being accessed from the timeout thread.
>
>Fix:
>Instead of just peeking at the bad job in the mirror list remove it from the list
>under lock and then put it back later when we are garanteed no race with
>main sched thread is possible which is after the thread is parked.
>
>v2: Lock around processing ring_mirror_list in drm_sched_cleanup_jobs.
>
>v3: Rebase on top of drm-misc-next. v2 is not needed anymore as
>drm_sched_get_cleanup_job already has a lock there.
>
>v4: Fix comments to relfect latest code in drm-misc.
>
>Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
>Reviewed-by: Christian König <christian.koenig@amd.com>
>Tested-by: Emily Deng <Emily.Deng@amd.com>
>---
> drivers/gpu/drm/scheduler/sched_main.c | 27
>+++++++++++++++++++++++++++
> 1 file changed, 27 insertions(+)
>
>diff --git a/drivers/gpu/drm/scheduler/sched_main.c
>b/drivers/gpu/drm/scheduler/sched_main.c
>index 6774955..1bf9c40 100644
>--- a/drivers/gpu/drm/scheduler/sched_main.c
>+++ b/drivers/gpu/drm/scheduler/sched_main.c
>@@ -284,10 +284,21 @@ static void drm_sched_job_timedout(struct
>work_struct *work)
>       unsigned long flags;
>
>       sched = container_of(work, struct drm_gpu_scheduler,
>work_tdr.work);
>+
>+      /* Protects against concurrent deletion in
>drm_sched_get_cleanup_job */
>+      spin_lock_irqsave(&sched->job_list_lock, flags);
>       job = list_first_entry_or_null(&sched->ring_mirror_list,
>                                      struct drm_sched_job, node);
>
>       if (job) {
>+              /*
>+               * Remove the bad job so it cannot be freed by concurrent
>+               * drm_sched_cleanup_jobs. It will be reinserted back after
>sched->thread
>+               * is parked at which point it's safe.
>+               */
>+              list_del_init(&job->node);
>+              spin_unlock_irqrestore(&sched->job_list_lock, flags);
>+
>               job->sched->ops->timedout_job(job);
>
>               /*
>@@ -298,6 +309,8 @@ static void drm_sched_job_timedout(struct
>work_struct *work)
>                       job->sched->ops->free_job(job);
>                       sched->free_guilty = false;
>               }
>+      } else {
>+              spin_unlock_irqrestore(&sched->job_list_lock, flags);
>       }
>
>       spin_lock_irqsave(&sched->job_list_lock, flags); @@ -370,6 +383,20
>@@ void drm_sched_stop(struct drm_gpu_scheduler *sched, struct
>drm_sched_job *bad)
>       kthread_park(sched->thread);
>
>       /*
>+       * Reinsert back the bad job here - now it's safe as
>+       * drm_sched_get_cleanup_job cannot race against us and release the
>+       * bad job at this point - we parked (waited for) any in progress
>+       * (earlier) cleanups and drm_sched_get_cleanup_job will not be
>called
>+       * now until the scheduler thread is unparked.
>+       */
>+      if (bad && bad->sched == sched)
>+              /*
>+               * Add at the head of the queue to reflect it was the earliest
>+               * job extracted.
>+               */
>+              list_add(&bad->node, &sched->ring_mirror_list);
>+
>+      /*
>        * Iterate the job list from later to  earlier one and either deactive
>        * their HW callbacks or remove them from mirror list if they already
>        * signaled.
>--
>2.7.4

[-- Attachment #1.2: Type: text/html, Size: 9792 bytes --]

[-- Attachment #2: Type: text/plain, Size: 159 bytes --]

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 125+ messages in thread

* Re: [PATCH v4] drm/scheduler: Avoid accessing freed bad job.
@ 2019-11-26 15:36         ` Deucher, Alexander
  0 siblings, 0 replies; 125+ messages in thread
From: Deucher, Alexander @ 2019-11-26 15:36 UTC (permalink / raw)
  To: Grodzovsky, Andrey, Deng, Emily
  Cc: steven.price, amd-gfx, dri-devel, Koenig,  Christian


[-- Attachment #1.1: Type: text/plain, Size: 5465 bytes --]

I recently updated amd-staging-drm-next.  Apply whatever makes sense for now and it'll naturally fall out in the next rebase.

Alex
________________________________
From: Grodzovsky, Andrey <Andrey.Grodzovsky@amd.com>
Sent: Monday, November 25, 2019 7:09 PM
To: Deng, Emily <Emily.Deng@amd.com>
Cc: dri-devel@lists.freedesktop.org <dri-devel@lists.freedesktop.org>; amd-gfx@lists.freedesktop.org <amd-gfx@lists.freedesktop.org>; Koenig, Christian <Christian.Koenig@amd.com>; steven.price@arm.com <steven.price@arm.com>; Deucher, Alexander <Alexander.Deucher@amd.com>
Subject: Re: [PATCH v4] drm/scheduler: Avoid accessing freed bad job.

Christian asked to submit it to drm-misc instead of our drm-next to avoid later conflicts with Steven's patch which he mentioned in this thread which is not in drm-next yet.
Christian, Alex, once this merged to drm-misc I guess we need to pull all latest changes from there to drm-next so the issue Emily reported can be avoided.

Andrey

________________________________________
From: Deng, Emily <Emily.Deng@amd.com>
Sent: 25 November 2019 16:44:36
To: Grodzovsky, Andrey
Cc: dri-devel@lists.freedesktop.org; amd-gfx@lists.freedesktop.org; Koenig, Christian; steven.price@arm.com; Grodzovsky, Andrey
Subject: RE: [PATCH v4] drm/scheduler: Avoid accessing freed bad job.

[AMD Official Use Only - Internal Distribution Only]

Hi Andrey,
    Seems you didn't submit this patch?

Best wishes
Emily Deng



>-----Original Message-----
>From: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
>Sent: Monday, November 25, 2019 12:51 PM
>Cc: dri-devel@lists.freedesktop.org; amd-gfx@lists.freedesktop.org; Koenig,
>Christian <Christian.Koenig@amd.com>; Deng, Emily
><Emily.Deng@amd.com>; steven.price@arm.com; Grodzovsky, Andrey
><Andrey.Grodzovsky@amd.com>
>Subject: [PATCH v4] drm/scheduler: Avoid accessing freed bad job.
>
>Problem:
>Due to a race between drm_sched_cleanup_jobs in sched thread and
>drm_sched_job_timedout in timeout work there is a possiblity that bad job
>was already freed while still being accessed from the timeout thread.
>
>Fix:
>Instead of just peeking at the bad job in the mirror list remove it from the list
>under lock and then put it back later when we are garanteed no race with
>main sched thread is possible which is after the thread is parked.
>
>v2: Lock around processing ring_mirror_list in drm_sched_cleanup_jobs.
>
>v3: Rebase on top of drm-misc-next. v2 is not needed anymore as
>drm_sched_get_cleanup_job already has a lock there.
>
>v4: Fix comments to relfect latest code in drm-misc.
>
>Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
>Reviewed-by: Christian König <christian.koenig@amd.com>
>Tested-by: Emily Deng <Emily.Deng@amd.com>
>---
> drivers/gpu/drm/scheduler/sched_main.c | 27
>+++++++++++++++++++++++++++
> 1 file changed, 27 insertions(+)
>
>diff --git a/drivers/gpu/drm/scheduler/sched_main.c
>b/drivers/gpu/drm/scheduler/sched_main.c
>index 6774955..1bf9c40 100644
>--- a/drivers/gpu/drm/scheduler/sched_main.c
>+++ b/drivers/gpu/drm/scheduler/sched_main.c
>@@ -284,10 +284,21 @@ static void drm_sched_job_timedout(struct
>work_struct *work)
>       unsigned long flags;
>
>       sched = container_of(work, struct drm_gpu_scheduler,
>work_tdr.work);
>+
>+      /* Protects against concurrent deletion in
>drm_sched_get_cleanup_job */
>+      spin_lock_irqsave(&sched->job_list_lock, flags);
>       job = list_first_entry_or_null(&sched->ring_mirror_list,
>                                      struct drm_sched_job, node);
>
>       if (job) {
>+              /*
>+               * Remove the bad job so it cannot be freed by concurrent
>+               * drm_sched_cleanup_jobs. It will be reinserted back after
>sched->thread
>+               * is parked at which point it's safe.
>+               */
>+              list_del_init(&job->node);
>+              spin_unlock_irqrestore(&sched->job_list_lock, flags);
>+
>               job->sched->ops->timedout_job(job);
>
>               /*
>@@ -298,6 +309,8 @@ static void drm_sched_job_timedout(struct
>work_struct *work)
>                       job->sched->ops->free_job(job);
>                       sched->free_guilty = false;
>               }
>+      } else {
>+              spin_unlock_irqrestore(&sched->job_list_lock, flags);
>       }
>
>       spin_lock_irqsave(&sched->job_list_lock, flags); @@ -370,6 +383,20
>@@ void drm_sched_stop(struct drm_gpu_scheduler *sched, struct
>drm_sched_job *bad)
>       kthread_park(sched->thread);
>
>       /*
>+       * Reinsert back the bad job here - now it's safe as
>+       * drm_sched_get_cleanup_job cannot race against us and release the
>+       * bad job at this point - we parked (waited for) any in progress
>+       * (earlier) cleanups and drm_sched_get_cleanup_job will not be
>called
>+       * now until the scheduler thread is unparked.
>+       */
>+      if (bad && bad->sched == sched)
>+              /*
>+               * Add at the head of the queue to reflect it was the earliest
>+               * job extracted.
>+               */
>+              list_add(&bad->node, &sched->ring_mirror_list);
>+
>+      /*
>        * Iterate the job list from later to  earlier one and either deactive
>        * their HW callbacks or remove them from mirror list if they already
>        * signaled.
>--
>2.7.4

[-- Attachment #1.2: Type: text/html, Size: 9792 bytes --]

[-- Attachment #2: Type: text/plain, Size: 153 bytes --]

_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 125+ messages in thread

* Re: [PATCH v4] drm/scheduler: Avoid accessing freed bad job.
@ 2019-11-26 15:37   ` Andrey Grodzovsky
  0 siblings, 0 replies; 125+ messages in thread
From: Andrey Grodzovsky @ 2019-11-26 15:37 UTC (permalink / raw)
  Cc: Emily.Deng, steven.price, amd-gfx, dri-devel, Christian.Koenig

Ping

Andrey

On 11/25/19 3:51 PM, Andrey Grodzovsky wrote:
> Problem:
> Due to a race between drm_sched_cleanup_jobs in sched thread and
> drm_sched_job_timedout in timeout work there is a possiblity that
> bad job was already freed while still being accessed from the
> timeout thread.
>
> Fix:
> Instead of just peeking at the bad job in the mirror list
> remove it from the list under lock and then put it back later when
> we are garanteed no race with main sched thread is possible which
> is after the thread is parked.
>
> v2: Lock around processing ring_mirror_list in drm_sched_cleanup_jobs.
>
> v3: Rebase on top of drm-misc-next. v2 is not needed anymore as
> drm_sched_get_cleanup_job already has a lock there.
>
> v4: Fix comments to relfect latest code in drm-misc.
>
> Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
> Reviewed-by: Christian König <christian.koenig@amd.com>
> Tested-by: Emily Deng <Emily.Deng@amd.com>
> ---
>   drivers/gpu/drm/scheduler/sched_main.c | 27 +++++++++++++++++++++++++++
>   1 file changed, 27 insertions(+)
>
> diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c
> index 6774955..1bf9c40 100644
> --- a/drivers/gpu/drm/scheduler/sched_main.c
> +++ b/drivers/gpu/drm/scheduler/sched_main.c
> @@ -284,10 +284,21 @@ static void drm_sched_job_timedout(struct work_struct *work)
>   	unsigned long flags;
>   
>   	sched = container_of(work, struct drm_gpu_scheduler, work_tdr.work);
> +
> +	/* Protects against concurrent deletion in drm_sched_get_cleanup_job */
> +	spin_lock_irqsave(&sched->job_list_lock, flags);
>   	job = list_first_entry_or_null(&sched->ring_mirror_list,
>   				       struct drm_sched_job, node);
>   
>   	if (job) {
> +		/*
> +		 * Remove the bad job so it cannot be freed by concurrent
> +		 * drm_sched_cleanup_jobs. It will be reinserted back after sched->thread
> +		 * is parked at which point it's safe.
> +		 */
> +		list_del_init(&job->node);
> +		spin_unlock_irqrestore(&sched->job_list_lock, flags);
> +
>   		job->sched->ops->timedout_job(job);
>   
>   		/*
> @@ -298,6 +309,8 @@ static void drm_sched_job_timedout(struct work_struct *work)
>   			job->sched->ops->free_job(job);
>   			sched->free_guilty = false;
>   		}
> +	} else {
> +		spin_unlock_irqrestore(&sched->job_list_lock, flags);
>   	}
>   
>   	spin_lock_irqsave(&sched->job_list_lock, flags);
> @@ -370,6 +383,20 @@ void drm_sched_stop(struct drm_gpu_scheduler *sched, struct drm_sched_job *bad)
>   	kthread_park(sched->thread);
>   
>   	/*
> +	 * Reinsert back the bad job here - now it's safe as
> +	 * drm_sched_get_cleanup_job cannot race against us and release the
> +	 * bad job at this point - we parked (waited for) any in progress
> +	 * (earlier) cleanups and drm_sched_get_cleanup_job will not be called
> +	 * now until the scheduler thread is unparked.
> +	 */
> +	if (bad && bad->sched == sched)
> +		/*
> +		 * Add at the head of the queue to reflect it was the earliest
> +		 * job extracted.
> +		 */
> +		list_add(&bad->node, &sched->ring_mirror_list);
> +
> +	/*
>   	 * Iterate the job list from later to  earlier one and either deactive
>   	 * their HW callbacks or remove them from mirror list if they already
>   	 * signaled.
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 125+ messages in thread

* Re: [PATCH v4] drm/scheduler: Avoid accessing freed bad job.
@ 2019-11-26 15:37   ` Andrey Grodzovsky
  0 siblings, 0 replies; 125+ messages in thread
From: Andrey Grodzovsky @ 2019-11-26 15:37 UTC (permalink / raw)
  Cc: Emily.Deng, steven.price, amd-gfx, dri-devel, Christian.Koenig

Ping

Andrey

On 11/25/19 3:51 PM, Andrey Grodzovsky wrote:
> Problem:
> Due to a race between drm_sched_cleanup_jobs in sched thread and
> drm_sched_job_timedout in timeout work there is a possiblity that
> bad job was already freed while still being accessed from the
> timeout thread.
>
> Fix:
> Instead of just peeking at the bad job in the mirror list
> remove it from the list under lock and then put it back later when
> we are garanteed no race with main sched thread is possible which
> is after the thread is parked.
>
> v2: Lock around processing ring_mirror_list in drm_sched_cleanup_jobs.
>
> v3: Rebase on top of drm-misc-next. v2 is not needed anymore as
> drm_sched_get_cleanup_job already has a lock there.
>
> v4: Fix comments to relfect latest code in drm-misc.
>
> Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
> Reviewed-by: Christian König <christian.koenig@amd.com>
> Tested-by: Emily Deng <Emily.Deng@amd.com>
> ---
>   drivers/gpu/drm/scheduler/sched_main.c | 27 +++++++++++++++++++++++++++
>   1 file changed, 27 insertions(+)
>
> diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c
> index 6774955..1bf9c40 100644
> --- a/drivers/gpu/drm/scheduler/sched_main.c
> +++ b/drivers/gpu/drm/scheduler/sched_main.c
> @@ -284,10 +284,21 @@ static void drm_sched_job_timedout(struct work_struct *work)
>   	unsigned long flags;
>   
>   	sched = container_of(work, struct drm_gpu_scheduler, work_tdr.work);
> +
> +	/* Protects against concurrent deletion in drm_sched_get_cleanup_job */
> +	spin_lock_irqsave(&sched->job_list_lock, flags);
>   	job = list_first_entry_or_null(&sched->ring_mirror_list,
>   				       struct drm_sched_job, node);
>   
>   	if (job) {
> +		/*
> +		 * Remove the bad job so it cannot be freed by concurrent
> +		 * drm_sched_cleanup_jobs. It will be reinserted back after sched->thread
> +		 * is parked at which point it's safe.
> +		 */
> +		list_del_init(&job->node);
> +		spin_unlock_irqrestore(&sched->job_list_lock, flags);
> +
>   		job->sched->ops->timedout_job(job);
>   
>   		/*
> @@ -298,6 +309,8 @@ static void drm_sched_job_timedout(struct work_struct *work)
>   			job->sched->ops->free_job(job);
>   			sched->free_guilty = false;
>   		}
> +	} else {
> +		spin_unlock_irqrestore(&sched->job_list_lock, flags);
>   	}
>   
>   	spin_lock_irqsave(&sched->job_list_lock, flags);
> @@ -370,6 +383,20 @@ void drm_sched_stop(struct drm_gpu_scheduler *sched, struct drm_sched_job *bad)
>   	kthread_park(sched->thread);
>   
>   	/*
> +	 * Reinsert back the bad job here - now it's safe as
> +	 * drm_sched_get_cleanup_job cannot race against us and release the
> +	 * bad job at this point - we parked (waited for) any in progress
> +	 * (earlier) cleanups and drm_sched_get_cleanup_job will not be called
> +	 * now until the scheduler thread is unparked.
> +	 */
> +	if (bad && bad->sched == sched)
> +		/*
> +		 * Add at the head of the queue to reflect it was the earliest
> +		 * job extracted.
> +		 */
> +		list_add(&bad->node, &sched->ring_mirror_list);
> +
> +	/*
>   	 * Iterate the job list from later to  earlier one and either deactive
>   	 * their HW callbacks or remove them from mirror list if they already
>   	 * signaled.
_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 125+ messages in thread

* RE: [PATCH v4] drm/scheduler: Avoid accessing freed bad job.
@ 2019-11-27  0:41       ` Deng, Emily
  0 siblings, 0 replies; 125+ messages in thread
From: Deng, Emily @ 2019-11-27  0:41 UTC (permalink / raw)
  To: Grodzovsky, Andrey
  Cc: steven.price-5wv7dgnIgG8,
	amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW,
	dri-devel-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW, Koenig, Christian

[AMD Official Use Only - Internal Distribution Only]

Reviewed-by: Emily Deng <Emily.Deng@amd.com>

>-----Original Message-----
>From: Grodzovsky, Andrey <Andrey.Grodzovsky@amd.com>
>Sent: Tuesday, November 26, 2019 7:37 AM
>Cc: dri-devel@lists.freedesktop.org; amd-gfx@lists.freedesktop.org; Koenig,
>Christian <Christian.Koenig@amd.com>; Deng, Emily
><Emily.Deng@amd.com>; steven.price@arm.com
>Subject: Re: [PATCH v4] drm/scheduler: Avoid accessing freed bad job.
>
>Ping
>
>Andrey
>
>On 11/25/19 3:51 PM, Andrey Grodzovsky wrote:
>> Problem:
>> Due to a race between drm_sched_cleanup_jobs in sched thread and
>> drm_sched_job_timedout in timeout work there is a possiblity that bad
>> job was already freed while still being accessed from the timeout
>> thread.
>>
>> Fix:
>> Instead of just peeking at the bad job in the mirror list remove it
>> from the list under lock and then put it back later when we are
>> garanteed no race with main sched thread is possible which is after
>> the thread is parked.
>>
>> v2: Lock around processing ring_mirror_list in drm_sched_cleanup_jobs.
>>
>> v3: Rebase on top of drm-misc-next. v2 is not needed anymore as
>> drm_sched_get_cleanup_job already has a lock there.
>>
>> v4: Fix comments to relfect latest code in drm-misc.
>>
>> Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
>> Reviewed-by: Christian König <christian.koenig@amd.com>
>> Tested-by: Emily Deng <Emily.Deng@amd.com>
>> ---
>>   drivers/gpu/drm/scheduler/sched_main.c | 27
>+++++++++++++++++++++++++++
>>   1 file changed, 27 insertions(+)
>>
>> diff --git a/drivers/gpu/drm/scheduler/sched_main.c
>> b/drivers/gpu/drm/scheduler/sched_main.c
>> index 6774955..1bf9c40 100644
>> --- a/drivers/gpu/drm/scheduler/sched_main.c
>> +++ b/drivers/gpu/drm/scheduler/sched_main.c
>> @@ -284,10 +284,21 @@ static void drm_sched_job_timedout(struct
>work_struct *work)
>>   	unsigned long flags;
>>
>>   	sched = container_of(work, struct drm_gpu_scheduler,
>> work_tdr.work);
>> +
>> +	/* Protects against concurrent deletion in
>drm_sched_get_cleanup_job */
>> +	spin_lock_irqsave(&sched->job_list_lock, flags);
>>   	job = list_first_entry_or_null(&sched->ring_mirror_list,
>>   				       struct drm_sched_job, node);
>>
>>   	if (job) {
>> +		/*
>> +		 * Remove the bad job so it cannot be freed by concurrent
>> +		 * drm_sched_cleanup_jobs. It will be reinserted back after
>sched->thread
>> +		 * is parked at which point it's safe.
>> +		 */
>> +		list_del_init(&job->node);
>> +		spin_unlock_irqrestore(&sched->job_list_lock, flags);
>> +
>>   		job->sched->ops->timedout_job(job);
>>
>>   		/*
>> @@ -298,6 +309,8 @@ static void drm_sched_job_timedout(struct
>work_struct *work)
>>   			job->sched->ops->free_job(job);
>>   			sched->free_guilty = false;
>>   		}
>> +	} else {
>> +		spin_unlock_irqrestore(&sched->job_list_lock, flags);
>>   	}
>>
>>   	spin_lock_irqsave(&sched->job_list_lock, flags); @@ -370,6 +383,20
>> @@ void drm_sched_stop(struct drm_gpu_scheduler *sched, struct
>drm_sched_job *bad)
>>   	kthread_park(sched->thread);
>>
>>   	/*
>> +	 * Reinsert back the bad job here - now it's safe as
>> +	 * drm_sched_get_cleanup_job cannot race against us and release the
>> +	 * bad job at this point - we parked (waited for) any in progress
>> +	 * (earlier) cleanups and drm_sched_get_cleanup_job will not be
>called
>> +	 * now until the scheduler thread is unparked.
>> +	 */
>> +	if (bad && bad->sched == sched)
>> +		/*
>> +		 * Add at the head of the queue to reflect it was the earliest
>> +		 * job extracted.
>> +		 */
>> +		list_add(&bad->node, &sched->ring_mirror_list);
>> +
>> +	/*
>>   	 * Iterate the job list from later to  earlier one and either deactive
>>   	 * their HW callbacks or remove them from mirror list if they already
>>   	 * signaled.
_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 125+ messages in thread

* RE: [PATCH v4] drm/scheduler: Avoid accessing freed bad job.
@ 2019-11-27  0:41       ` Deng, Emily
  0 siblings, 0 replies; 125+ messages in thread
From: Deng, Emily @ 2019-11-27  0:41 UTC (permalink / raw)
  To: Grodzovsky, Andrey; +Cc: steven.price, amd-gfx, dri-devel, Koenig,  Christian

[AMD Official Use Only - Internal Distribution Only]

Reviewed-by: Emily Deng <Emily.Deng@amd.com>

>-----Original Message-----
>From: Grodzovsky, Andrey <Andrey.Grodzovsky@amd.com>
>Sent: Tuesday, November 26, 2019 7:37 AM
>Cc: dri-devel@lists.freedesktop.org; amd-gfx@lists.freedesktop.org; Koenig,
>Christian <Christian.Koenig@amd.com>; Deng, Emily
><Emily.Deng@amd.com>; steven.price@arm.com
>Subject: Re: [PATCH v4] drm/scheduler: Avoid accessing freed bad job.
>
>Ping
>
>Andrey
>
>On 11/25/19 3:51 PM, Andrey Grodzovsky wrote:
>> Problem:
>> Due to a race between drm_sched_cleanup_jobs in sched thread and
>> drm_sched_job_timedout in timeout work there is a possiblity that bad
>> job was already freed while still being accessed from the timeout
>> thread.
>>
>> Fix:
>> Instead of just peeking at the bad job in the mirror list remove it
>> from the list under lock and then put it back later when we are
>> garanteed no race with main sched thread is possible which is after
>> the thread is parked.
>>
>> v2: Lock around processing ring_mirror_list in drm_sched_cleanup_jobs.
>>
>> v3: Rebase on top of drm-misc-next. v2 is not needed anymore as
>> drm_sched_get_cleanup_job already has a lock there.
>>
>> v4: Fix comments to relfect latest code in drm-misc.
>>
>> Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
>> Reviewed-by: Christian König <christian.koenig@amd.com>
>> Tested-by: Emily Deng <Emily.Deng@amd.com>
>> ---
>>   drivers/gpu/drm/scheduler/sched_main.c | 27
>+++++++++++++++++++++++++++
>>   1 file changed, 27 insertions(+)
>>
>> diff --git a/drivers/gpu/drm/scheduler/sched_main.c
>> b/drivers/gpu/drm/scheduler/sched_main.c
>> index 6774955..1bf9c40 100644
>> --- a/drivers/gpu/drm/scheduler/sched_main.c
>> +++ b/drivers/gpu/drm/scheduler/sched_main.c
>> @@ -284,10 +284,21 @@ static void drm_sched_job_timedout(struct
>work_struct *work)
>>   	unsigned long flags;
>>
>>   	sched = container_of(work, struct drm_gpu_scheduler,
>> work_tdr.work);
>> +
>> +	/* Protects against concurrent deletion in
>drm_sched_get_cleanup_job */
>> +	spin_lock_irqsave(&sched->job_list_lock, flags);
>>   	job = list_first_entry_or_null(&sched->ring_mirror_list,
>>   				       struct drm_sched_job, node);
>>
>>   	if (job) {
>> +		/*
>> +		 * Remove the bad job so it cannot be freed by concurrent
>> +		 * drm_sched_cleanup_jobs. It will be reinserted back after
>sched->thread
>> +		 * is parked at which point it's safe.
>> +		 */
>> +		list_del_init(&job->node);
>> +		spin_unlock_irqrestore(&sched->job_list_lock, flags);
>> +
>>   		job->sched->ops->timedout_job(job);
>>
>>   		/*
>> @@ -298,6 +309,8 @@ static void drm_sched_job_timedout(struct
>work_struct *work)
>>   			job->sched->ops->free_job(job);
>>   			sched->free_guilty = false;
>>   		}
>> +	} else {
>> +		spin_unlock_irqrestore(&sched->job_list_lock, flags);
>>   	}
>>
>>   	spin_lock_irqsave(&sched->job_list_lock, flags); @@ -370,6 +383,20
>> @@ void drm_sched_stop(struct drm_gpu_scheduler *sched, struct
>drm_sched_job *bad)
>>   	kthread_park(sched->thread);
>>
>>   	/*
>> +	 * Reinsert back the bad job here - now it's safe as
>> +	 * drm_sched_get_cleanup_job cannot race against us and release the
>> +	 * bad job at this point - we parked (waited for) any in progress
>> +	 * (earlier) cleanups and drm_sched_get_cleanup_job will not be
>called
>> +	 * now until the scheduler thread is unparked.
>> +	 */
>> +	if (bad && bad->sched == sched)
>> +		/*
>> +		 * Add at the head of the queue to reflect it was the earliest
>> +		 * job extracted.
>> +		 */
>> +		list_add(&bad->node, &sched->ring_mirror_list);
>> +
>> +	/*
>>   	 * Iterate the job list from later to  earlier one and either deactive
>>   	 * their HW callbacks or remove them from mirror list if they already
>>   	 * signaled.
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 125+ messages in thread

* RE: [PATCH v4] drm/scheduler: Avoid accessing freed bad job.
@ 2019-11-27  0:41       ` Deng, Emily
  0 siblings, 0 replies; 125+ messages in thread
From: Deng, Emily @ 2019-11-27  0:41 UTC (permalink / raw)
  To: Grodzovsky, Andrey; +Cc: steven.price, amd-gfx, dri-devel, Koenig,  Christian

[AMD Official Use Only - Internal Distribution Only]

Reviewed-by: Emily Deng <Emily.Deng@amd.com>

>-----Original Message-----
>From: Grodzovsky, Andrey <Andrey.Grodzovsky@amd.com>
>Sent: Tuesday, November 26, 2019 7:37 AM
>Cc: dri-devel@lists.freedesktop.org; amd-gfx@lists.freedesktop.org; Koenig,
>Christian <Christian.Koenig@amd.com>; Deng, Emily
><Emily.Deng@amd.com>; steven.price@arm.com
>Subject: Re: [PATCH v4] drm/scheduler: Avoid accessing freed bad job.
>
>Ping
>
>Andrey
>
>On 11/25/19 3:51 PM, Andrey Grodzovsky wrote:
>> Problem:
>> Due to a race between drm_sched_cleanup_jobs in sched thread and
>> drm_sched_job_timedout in timeout work there is a possiblity that bad
>> job was already freed while still being accessed from the timeout
>> thread.
>>
>> Fix:
>> Instead of just peeking at the bad job in the mirror list remove it
>> from the list under lock and then put it back later when we are
>> garanteed no race with main sched thread is possible which is after
>> the thread is parked.
>>
>> v2: Lock around processing ring_mirror_list in drm_sched_cleanup_jobs.
>>
>> v3: Rebase on top of drm-misc-next. v2 is not needed anymore as
>> drm_sched_get_cleanup_job already has a lock there.
>>
>> v4: Fix comments to relfect latest code in drm-misc.
>>
>> Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
>> Reviewed-by: Christian König <christian.koenig@amd.com>
>> Tested-by: Emily Deng <Emily.Deng@amd.com>
>> ---
>>   drivers/gpu/drm/scheduler/sched_main.c | 27
>+++++++++++++++++++++++++++
>>   1 file changed, 27 insertions(+)
>>
>> diff --git a/drivers/gpu/drm/scheduler/sched_main.c
>> b/drivers/gpu/drm/scheduler/sched_main.c
>> index 6774955..1bf9c40 100644
>> --- a/drivers/gpu/drm/scheduler/sched_main.c
>> +++ b/drivers/gpu/drm/scheduler/sched_main.c
>> @@ -284,10 +284,21 @@ static void drm_sched_job_timedout(struct
>work_struct *work)
>>   	unsigned long flags;
>>
>>   	sched = container_of(work, struct drm_gpu_scheduler,
>> work_tdr.work);
>> +
>> +	/* Protects against concurrent deletion in
>drm_sched_get_cleanup_job */
>> +	spin_lock_irqsave(&sched->job_list_lock, flags);
>>   	job = list_first_entry_or_null(&sched->ring_mirror_list,
>>   				       struct drm_sched_job, node);
>>
>>   	if (job) {
>> +		/*
>> +		 * Remove the bad job so it cannot be freed by concurrent
>> +		 * drm_sched_cleanup_jobs. It will be reinserted back after
>sched->thread
>> +		 * is parked at which point it's safe.
>> +		 */
>> +		list_del_init(&job->node);
>> +		spin_unlock_irqrestore(&sched->job_list_lock, flags);
>> +
>>   		job->sched->ops->timedout_job(job);
>>
>>   		/*
>> @@ -298,6 +309,8 @@ static void drm_sched_job_timedout(struct
>work_struct *work)
>>   			job->sched->ops->free_job(job);
>>   			sched->free_guilty = false;
>>   		}
>> +	} else {
>> +		spin_unlock_irqrestore(&sched->job_list_lock, flags);
>>   	}
>>
>>   	spin_lock_irqsave(&sched->job_list_lock, flags); @@ -370,6 +383,20
>> @@ void drm_sched_stop(struct drm_gpu_scheduler *sched, struct
>drm_sched_job *bad)
>>   	kthread_park(sched->thread);
>>
>>   	/*
>> +	 * Reinsert back the bad job here - now it's safe as
>> +	 * drm_sched_get_cleanup_job cannot race against us and release the
>> +	 * bad job at this point - we parked (waited for) any in progress
>> +	 * (earlier) cleanups and drm_sched_get_cleanup_job will not be
>called
>> +	 * now until the scheduler thread is unparked.
>> +	 */
>> +	if (bad && bad->sched == sched)
>> +		/*
>> +		 * Add at the head of the queue to reflect it was the earliest
>> +		 * job extracted.
>> +		 */
>> +		list_add(&bad->node, &sched->ring_mirror_list);
>> +
>> +	/*
>>   	 * Iterate the job list from later to  earlier one and either deactive
>>   	 * their HW callbacks or remove them from mirror list if they already
>>   	 * signaled.
_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 125+ messages in thread

* RE: [PATCH v4] drm/scheduler: Avoid accessing freed bad job.
@ 2019-12-02 19:24           ` Deng, Emily
  0 siblings, 0 replies; 125+ messages in thread
From: Deng, Emily @ 2019-12-02 19:24 UTC (permalink / raw)
  To: Grodzovsky, Andrey
  Cc: steven.price-5wv7dgnIgG8,
	amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW,
	dri-devel-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW, Koenig, Christian

[AMD Official Use Only - Internal Distribution Only]

Hi Andrey,
    Seems this patch is still not in amd-staging-drm-next?

Best wishes
Emily Deng



>-----Original Message-----
>From: Deng, Emily
>Sent: Tuesday, November 26, 2019 4:41 PM
>To: Grodzovsky, Andrey <Andrey.Grodzovsky@amd.com>
>Cc: dri-devel@lists.freedesktop.org; amd-gfx@lists.freedesktop.org; Koenig,
>Christian <Christian.Koenig@amd.com>; steven.price@arm.com
>Subject: RE: [PATCH v4] drm/scheduler: Avoid accessing freed bad job.
>
>[AMD Official Use Only - Internal Distribution Only]
>
>Reviewed-by: Emily Deng <Emily.Deng@amd.com>
>
>>-----Original Message-----
>>From: Grodzovsky, Andrey <Andrey.Grodzovsky@amd.com>
>>Sent: Tuesday, November 26, 2019 7:37 AM
>>Cc: dri-devel@lists.freedesktop.org; amd-gfx@lists.freedesktop.org;
>>Koenig, Christian <Christian.Koenig@amd.com>; Deng, Emily
>><Emily.Deng@amd.com>; steven.price@arm.com
>>Subject: Re: [PATCH v4] drm/scheduler: Avoid accessing freed bad job.
>>
>>Ping
>>
>>Andrey
>>
>>On 11/25/19 3:51 PM, Andrey Grodzovsky wrote:
>>> Problem:
>>> Due to a race between drm_sched_cleanup_jobs in sched thread and
>>> drm_sched_job_timedout in timeout work there is a possiblity that bad
>>> job was already freed while still being accessed from the timeout
>>> thread.
>>>
>>> Fix:
>>> Instead of just peeking at the bad job in the mirror list remove it
>>> from the list under lock and then put it back later when we are
>>> garanteed no race with main sched thread is possible which is after
>>> the thread is parked.
>>>
>>> v2: Lock around processing ring_mirror_list in drm_sched_cleanup_jobs.
>>>
>>> v3: Rebase on top of drm-misc-next. v2 is not needed anymore as
>>> drm_sched_get_cleanup_job already has a lock there.
>>>
>>> v4: Fix comments to relfect latest code in drm-misc.
>>>
>>> Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
>>> Reviewed-by: Christian König <christian.koenig@amd.com>
>>> Tested-by: Emily Deng <Emily.Deng@amd.com>
>>> ---
>>>   drivers/gpu/drm/scheduler/sched_main.c | 27
>>+++++++++++++++++++++++++++
>>>   1 file changed, 27 insertions(+)
>>>
>>> diff --git a/drivers/gpu/drm/scheduler/sched_main.c
>>> b/drivers/gpu/drm/scheduler/sched_main.c
>>> index 6774955..1bf9c40 100644
>>> --- a/drivers/gpu/drm/scheduler/sched_main.c
>>> +++ b/drivers/gpu/drm/scheduler/sched_main.c
>>> @@ -284,10 +284,21 @@ static void drm_sched_job_timedout(struct
>>work_struct *work)
>>>   	unsigned long flags;
>>>
>>>   	sched = container_of(work, struct drm_gpu_scheduler,
>>> work_tdr.work);
>>> +
>>> +	/* Protects against concurrent deletion in
>>drm_sched_get_cleanup_job */
>>> +	spin_lock_irqsave(&sched->job_list_lock, flags);
>>>   	job = list_first_entry_or_null(&sched->ring_mirror_list,
>>>   				       struct drm_sched_job, node);
>>>
>>>   	if (job) {
>>> +		/*
>>> +		 * Remove the bad job so it cannot be freed by concurrent
>>> +		 * drm_sched_cleanup_jobs. It will be reinserted back after
>>sched->thread
>>> +		 * is parked at which point it's safe.
>>> +		 */
>>> +		list_del_init(&job->node);
>>> +		spin_unlock_irqrestore(&sched->job_list_lock, flags);
>>> +
>>>   		job->sched->ops->timedout_job(job);
>>>
>>>   		/*
>>> @@ -298,6 +309,8 @@ static void drm_sched_job_timedout(struct
>>work_struct *work)
>>>   			job->sched->ops->free_job(job);
>>>   			sched->free_guilty = false;
>>>   		}
>>> +	} else {
>>> +		spin_unlock_irqrestore(&sched->job_list_lock, flags);
>>>   	}
>>>
>>>   	spin_lock_irqsave(&sched->job_list_lock, flags); @@ -370,6 +383,20
>>> @@ void drm_sched_stop(struct drm_gpu_scheduler *sched, struct
>>drm_sched_job *bad)
>>>   	kthread_park(sched->thread);
>>>
>>>   	/*
>>> +	 * Reinsert back the bad job here - now it's safe as
>>> +	 * drm_sched_get_cleanup_job cannot race against us and release the
>>> +	 * bad job at this point - we parked (waited for) any in progress
>>> +	 * (earlier) cleanups and drm_sched_get_cleanup_job will not be
>>called
>>> +	 * now until the scheduler thread is unparked.
>>> +	 */
>>> +	if (bad && bad->sched == sched)
>>> +		/*
>>> +		 * Add at the head of the queue to reflect it was the earliest
>>> +		 * job extracted.
>>> +		 */
>>> +		list_add(&bad->node, &sched->ring_mirror_list);
>>> +
>>> +	/*
>>>   	 * Iterate the job list from later to  earlier one and either deactive
>>>   	 * their HW callbacks or remove them from mirror list if they already
>>>   	 * signaled.
_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 125+ messages in thread

* RE: [PATCH v4] drm/scheduler: Avoid accessing freed bad job.
@ 2019-12-02 19:24           ` Deng, Emily
  0 siblings, 0 replies; 125+ messages in thread
From: Deng, Emily @ 2019-12-02 19:24 UTC (permalink / raw)
  To: Grodzovsky, Andrey; +Cc: steven.price, amd-gfx, dri-devel, Koenig,  Christian

[AMD Official Use Only - Internal Distribution Only]

Hi Andrey,
    Seems this patch is still not in amd-staging-drm-next?

Best wishes
Emily Deng



>-----Original Message-----
>From: Deng, Emily
>Sent: Tuesday, November 26, 2019 4:41 PM
>To: Grodzovsky, Andrey <Andrey.Grodzovsky@amd.com>
>Cc: dri-devel@lists.freedesktop.org; amd-gfx@lists.freedesktop.org; Koenig,
>Christian <Christian.Koenig@amd.com>; steven.price@arm.com
>Subject: RE: [PATCH v4] drm/scheduler: Avoid accessing freed bad job.
>
>[AMD Official Use Only - Internal Distribution Only]
>
>Reviewed-by: Emily Deng <Emily.Deng@amd.com>
>
>>-----Original Message-----
>>From: Grodzovsky, Andrey <Andrey.Grodzovsky@amd.com>
>>Sent: Tuesday, November 26, 2019 7:37 AM
>>Cc: dri-devel@lists.freedesktop.org; amd-gfx@lists.freedesktop.org;
>>Koenig, Christian <Christian.Koenig@amd.com>; Deng, Emily
>><Emily.Deng@amd.com>; steven.price@arm.com
>>Subject: Re: [PATCH v4] drm/scheduler: Avoid accessing freed bad job.
>>
>>Ping
>>
>>Andrey
>>
>>On 11/25/19 3:51 PM, Andrey Grodzovsky wrote:
>>> Problem:
>>> Due to a race between drm_sched_cleanup_jobs in sched thread and
>>> drm_sched_job_timedout in timeout work there is a possiblity that bad
>>> job was already freed while still being accessed from the timeout
>>> thread.
>>>
>>> Fix:
>>> Instead of just peeking at the bad job in the mirror list remove it
>>> from the list under lock and then put it back later when we are
>>> garanteed no race with main sched thread is possible which is after
>>> the thread is parked.
>>>
>>> v2: Lock around processing ring_mirror_list in drm_sched_cleanup_jobs.
>>>
>>> v3: Rebase on top of drm-misc-next. v2 is not needed anymore as
>>> drm_sched_get_cleanup_job already has a lock there.
>>>
>>> v4: Fix comments to relfect latest code in drm-misc.
>>>
>>> Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
>>> Reviewed-by: Christian König <christian.koenig@amd.com>
>>> Tested-by: Emily Deng <Emily.Deng@amd.com>
>>> ---
>>>   drivers/gpu/drm/scheduler/sched_main.c | 27
>>+++++++++++++++++++++++++++
>>>   1 file changed, 27 insertions(+)
>>>
>>> diff --git a/drivers/gpu/drm/scheduler/sched_main.c
>>> b/drivers/gpu/drm/scheduler/sched_main.c
>>> index 6774955..1bf9c40 100644
>>> --- a/drivers/gpu/drm/scheduler/sched_main.c
>>> +++ b/drivers/gpu/drm/scheduler/sched_main.c
>>> @@ -284,10 +284,21 @@ static void drm_sched_job_timedout(struct
>>work_struct *work)
>>>   	unsigned long flags;
>>>
>>>   	sched = container_of(work, struct drm_gpu_scheduler,
>>> work_tdr.work);
>>> +
>>> +	/* Protects against concurrent deletion in
>>drm_sched_get_cleanup_job */
>>> +	spin_lock_irqsave(&sched->job_list_lock, flags);
>>>   	job = list_first_entry_or_null(&sched->ring_mirror_list,
>>>   				       struct drm_sched_job, node);
>>>
>>>   	if (job) {
>>> +		/*
>>> +		 * Remove the bad job so it cannot be freed by concurrent
>>> +		 * drm_sched_cleanup_jobs. It will be reinserted back after
>>sched->thread
>>> +		 * is parked at which point it's safe.
>>> +		 */
>>> +		list_del_init(&job->node);
>>> +		spin_unlock_irqrestore(&sched->job_list_lock, flags);
>>> +
>>>   		job->sched->ops->timedout_job(job);
>>>
>>>   		/*
>>> @@ -298,6 +309,8 @@ static void drm_sched_job_timedout(struct
>>work_struct *work)
>>>   			job->sched->ops->free_job(job);
>>>   			sched->free_guilty = false;
>>>   		}
>>> +	} else {
>>> +		spin_unlock_irqrestore(&sched->job_list_lock, flags);
>>>   	}
>>>
>>>   	spin_lock_irqsave(&sched->job_list_lock, flags); @@ -370,6 +383,20
>>> @@ void drm_sched_stop(struct drm_gpu_scheduler *sched, struct
>>drm_sched_job *bad)
>>>   	kthread_park(sched->thread);
>>>
>>>   	/*
>>> +	 * Reinsert back the bad job here - now it's safe as
>>> +	 * drm_sched_get_cleanup_job cannot race against us and release the
>>> +	 * bad job at this point - we parked (waited for) any in progress
>>> +	 * (earlier) cleanups and drm_sched_get_cleanup_job will not be
>>called
>>> +	 * now until the scheduler thread is unparked.
>>> +	 */
>>> +	if (bad && bad->sched == sched)
>>> +		/*
>>> +		 * Add at the head of the queue to reflect it was the earliest
>>> +		 * job extracted.
>>> +		 */
>>> +		list_add(&bad->node, &sched->ring_mirror_list);
>>> +
>>> +	/*
>>>   	 * Iterate the job list from later to  earlier one and either deactive
>>>   	 * their HW callbacks or remove them from mirror list if they already
>>>   	 * signaled.
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 125+ messages in thread

* RE: [PATCH v4] drm/scheduler: Avoid accessing freed bad job.
@ 2019-12-02 19:24           ` Deng, Emily
  0 siblings, 0 replies; 125+ messages in thread
From: Deng, Emily @ 2019-12-02 19:24 UTC (permalink / raw)
  To: Grodzovsky, Andrey; +Cc: steven.price, amd-gfx, dri-devel, Koenig,  Christian

[AMD Official Use Only - Internal Distribution Only]

Hi Andrey,
    Seems this patch is still not in amd-staging-drm-next?

Best wishes
Emily Deng



>-----Original Message-----
>From: Deng, Emily
>Sent: Tuesday, November 26, 2019 4:41 PM
>To: Grodzovsky, Andrey <Andrey.Grodzovsky@amd.com>
>Cc: dri-devel@lists.freedesktop.org; amd-gfx@lists.freedesktop.org; Koenig,
>Christian <Christian.Koenig@amd.com>; steven.price@arm.com
>Subject: RE: [PATCH v4] drm/scheduler: Avoid accessing freed bad job.
>
>[AMD Official Use Only - Internal Distribution Only]
>
>Reviewed-by: Emily Deng <Emily.Deng@amd.com>
>
>>-----Original Message-----
>>From: Grodzovsky, Andrey <Andrey.Grodzovsky@amd.com>
>>Sent: Tuesday, November 26, 2019 7:37 AM
>>Cc: dri-devel@lists.freedesktop.org; amd-gfx@lists.freedesktop.org;
>>Koenig, Christian <Christian.Koenig@amd.com>; Deng, Emily
>><Emily.Deng@amd.com>; steven.price@arm.com
>>Subject: Re: [PATCH v4] drm/scheduler: Avoid accessing freed bad job.
>>
>>Ping
>>
>>Andrey
>>
>>On 11/25/19 3:51 PM, Andrey Grodzovsky wrote:
>>> Problem:
>>> Due to a race between drm_sched_cleanup_jobs in sched thread and
>>> drm_sched_job_timedout in timeout work there is a possiblity that bad
>>> job was already freed while still being accessed from the timeout
>>> thread.
>>>
>>> Fix:
>>> Instead of just peeking at the bad job in the mirror list remove it
>>> from the list under lock and then put it back later when we are
>>> garanteed no race with main sched thread is possible which is after
>>> the thread is parked.
>>>
>>> v2: Lock around processing ring_mirror_list in drm_sched_cleanup_jobs.
>>>
>>> v3: Rebase on top of drm-misc-next. v2 is not needed anymore as
>>> drm_sched_get_cleanup_job already has a lock there.
>>>
>>> v4: Fix comments to relfect latest code in drm-misc.
>>>
>>> Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
>>> Reviewed-by: Christian König <christian.koenig@amd.com>
>>> Tested-by: Emily Deng <Emily.Deng@amd.com>
>>> ---
>>>   drivers/gpu/drm/scheduler/sched_main.c | 27
>>+++++++++++++++++++++++++++
>>>   1 file changed, 27 insertions(+)
>>>
>>> diff --git a/drivers/gpu/drm/scheduler/sched_main.c
>>> b/drivers/gpu/drm/scheduler/sched_main.c
>>> index 6774955..1bf9c40 100644
>>> --- a/drivers/gpu/drm/scheduler/sched_main.c
>>> +++ b/drivers/gpu/drm/scheduler/sched_main.c
>>> @@ -284,10 +284,21 @@ static void drm_sched_job_timedout(struct
>>work_struct *work)
>>>   	unsigned long flags;
>>>
>>>   	sched = container_of(work, struct drm_gpu_scheduler,
>>> work_tdr.work);
>>> +
>>> +	/* Protects against concurrent deletion in
>>drm_sched_get_cleanup_job */
>>> +	spin_lock_irqsave(&sched->job_list_lock, flags);
>>>   	job = list_first_entry_or_null(&sched->ring_mirror_list,
>>>   				       struct drm_sched_job, node);
>>>
>>>   	if (job) {
>>> +		/*
>>> +		 * Remove the bad job so it cannot be freed by concurrent
>>> +		 * drm_sched_cleanup_jobs. It will be reinserted back after
>>sched->thread
>>> +		 * is parked at which point it's safe.
>>> +		 */
>>> +		list_del_init(&job->node);
>>> +		spin_unlock_irqrestore(&sched->job_list_lock, flags);
>>> +
>>>   		job->sched->ops->timedout_job(job);
>>>
>>>   		/*
>>> @@ -298,6 +309,8 @@ static void drm_sched_job_timedout(struct
>>work_struct *work)
>>>   			job->sched->ops->free_job(job);
>>>   			sched->free_guilty = false;
>>>   		}
>>> +	} else {
>>> +		spin_unlock_irqrestore(&sched->job_list_lock, flags);
>>>   	}
>>>
>>>   	spin_lock_irqsave(&sched->job_list_lock, flags); @@ -370,6 +383,20
>>> @@ void drm_sched_stop(struct drm_gpu_scheduler *sched, struct
>>drm_sched_job *bad)
>>>   	kthread_park(sched->thread);
>>>
>>>   	/*
>>> +	 * Reinsert back the bad job here - now it's safe as
>>> +	 * drm_sched_get_cleanup_job cannot race against us and release the
>>> +	 * bad job at this point - we parked (waited for) any in progress
>>> +	 * (earlier) cleanups and drm_sched_get_cleanup_job will not be
>>called
>>> +	 * now until the scheduler thread is unparked.
>>> +	 */
>>> +	if (bad && bad->sched == sched)
>>> +		/*
>>> +		 * Add at the head of the queue to reflect it was the earliest
>>> +		 * job extracted.
>>> +		 */
>>> +		list_add(&bad->node, &sched->ring_mirror_list);
>>> +
>>> +	/*
>>>   	 * Iterate the job list from later to  earlier one and either deactive
>>>   	 * their HW callbacks or remove them from mirror list if they already
>>>   	 * signaled.
_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 125+ messages in thread

* Re: [PATCH v4] drm/scheduler: Avoid accessing freed bad job.
  2019-12-02 19:24           ` Deng, Emily
@ 2019-12-03 19:10             ` Andrey Grodzovsky
  -1 siblings, 0 replies; 125+ messages in thread
From: Andrey Grodzovsky @ 2019-12-03 19:10 UTC (permalink / raw)
  To: Deng, Emily, Deucher, Alexander
  Cc: steven.price, amd-gfx, dri-devel, Koenig, Christian

Yes - Christian just pushed it to drm-next-misc - I guess Alex/Christian 
didn't pull to amd-staging-drm-next yet.

Andrey

On 12/2/19 2:24 PM, Deng, Emily wrote:
> [AMD Official Use Only - Internal Distribution Only]
>
> Hi Andrey,
>      Seems this patch is still not in amd-staging-drm-next?
>
> Best wishes
> Emily Deng
>
>
>
>> -----Original Message-----
>> From: Deng, Emily
>> Sent: Tuesday, November 26, 2019 4:41 PM
>> To: Grodzovsky, Andrey <Andrey.Grodzovsky@amd.com>
>> Cc: dri-devel@lists.freedesktop.org; amd-gfx@lists.freedesktop.org; Koenig,
>> Christian <Christian.Koenig@amd.com>; steven.price@arm.com
>> Subject: RE: [PATCH v4] drm/scheduler: Avoid accessing freed bad job.
>>
>> [AMD Official Use Only - Internal Distribution Only]
>>
>> Reviewed-by: Emily Deng <Emily.Deng@amd.com>
>>
>>> -----Original Message-----
>>> From: Grodzovsky, Andrey <Andrey.Grodzovsky@amd.com>
>>> Sent: Tuesday, November 26, 2019 7:37 AM
>>> Cc: dri-devel@lists.freedesktop.org; amd-gfx@lists.freedesktop.org;
>>> Koenig, Christian <Christian.Koenig@amd.com>; Deng, Emily
>>> <Emily.Deng@amd.com>; steven.price@arm.com
>>> Subject: Re: [PATCH v4] drm/scheduler: Avoid accessing freed bad job.
>>>
>>> Ping
>>>
>>> Andrey
>>>
>>> On 11/25/19 3:51 PM, Andrey Grodzovsky wrote:
>>>> Problem:
>>>> Due to a race between drm_sched_cleanup_jobs in sched thread and
>>>> drm_sched_job_timedout in timeout work there is a possiblity that bad
>>>> job was already freed while still being accessed from the timeout
>>>> thread.
>>>>
>>>> Fix:
>>>> Instead of just peeking at the bad job in the mirror list remove it
>>>> from the list under lock and then put it back later when we are
>>>> garanteed no race with main sched thread is possible which is after
>>>> the thread is parked.
>>>>
>>>> v2: Lock around processing ring_mirror_list in drm_sched_cleanup_jobs.
>>>>
>>>> v3: Rebase on top of drm-misc-next. v2 is not needed anymore as
>>>> drm_sched_get_cleanup_job already has a lock there.
>>>>
>>>> v4: Fix comments to relfect latest code in drm-misc.
>>>>
>>>> Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
>>>> Reviewed-by: Christian König <christian.koenig@amd.com>
>>>> Tested-by: Emily Deng <Emily.Deng@amd.com>
>>>> ---
>>>>    drivers/gpu/drm/scheduler/sched_main.c | 27
>>> +++++++++++++++++++++++++++
>>>>    1 file changed, 27 insertions(+)
>>>>
>>>> diff --git a/drivers/gpu/drm/scheduler/sched_main.c
>>>> b/drivers/gpu/drm/scheduler/sched_main.c
>>>> index 6774955..1bf9c40 100644
>>>> --- a/drivers/gpu/drm/scheduler/sched_main.c
>>>> +++ b/drivers/gpu/drm/scheduler/sched_main.c
>>>> @@ -284,10 +284,21 @@ static void drm_sched_job_timedout(struct
>>> work_struct *work)
>>>>    	unsigned long flags;
>>>>
>>>>    	sched = container_of(work, struct drm_gpu_scheduler,
>>>> work_tdr.work);
>>>> +
>>>> +	/* Protects against concurrent deletion in
>>> drm_sched_get_cleanup_job */
>>>> +	spin_lock_irqsave(&sched->job_list_lock, flags);
>>>>    	job = list_first_entry_or_null(&sched->ring_mirror_list,
>>>>    				       struct drm_sched_job, node);
>>>>
>>>>    	if (job) {
>>>> +		/*
>>>> +		 * Remove the bad job so it cannot be freed by concurrent
>>>> +		 * drm_sched_cleanup_jobs. It will be reinserted back after
>>> sched->thread
>>>> +		 * is parked at which point it's safe.
>>>> +		 */
>>>> +		list_del_init(&job->node);
>>>> +		spin_unlock_irqrestore(&sched->job_list_lock, flags);
>>>> +
>>>>    		job->sched->ops->timedout_job(job);
>>>>
>>>>    		/*
>>>> @@ -298,6 +309,8 @@ static void drm_sched_job_timedout(struct
>>> work_struct *work)
>>>>    			job->sched->ops->free_job(job);
>>>>    			sched->free_guilty = false;
>>>>    		}
>>>> +	} else {
>>>> +		spin_unlock_irqrestore(&sched->job_list_lock, flags);
>>>>    	}
>>>>
>>>>    	spin_lock_irqsave(&sched->job_list_lock, flags); @@ -370,6 +383,20
>>>> @@ void drm_sched_stop(struct drm_gpu_scheduler *sched, struct
>>> drm_sched_job *bad)
>>>>    	kthread_park(sched->thread);
>>>>
>>>>    	/*
>>>> +	 * Reinsert back the bad job here - now it's safe as
>>>> +	 * drm_sched_get_cleanup_job cannot race against us and release the
>>>> +	 * bad job at this point - we parked (waited for) any in progress
>>>> +	 * (earlier) cleanups and drm_sched_get_cleanup_job will not be
>>> called
>>>> +	 * now until the scheduler thread is unparked.
>>>> +	 */
>>>> +	if (bad && bad->sched == sched)
>>>> +		/*
>>>> +		 * Add at the head of the queue to reflect it was the earliest
>>>> +		 * job extracted.
>>>> +		 */
>>>> +		list_add(&bad->node, &sched->ring_mirror_list);
>>>> +
>>>> +	/*
>>>>    	 * Iterate the job list from later to  earlier one and either deactive
>>>>    	 * their HW callbacks or remove them from mirror list if they already
>>>>    	 * signaled.
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 125+ messages in thread

* Re: [PATCH v4] drm/scheduler: Avoid accessing freed bad job.
@ 2019-12-03 19:10             ` Andrey Grodzovsky
  0 siblings, 0 replies; 125+ messages in thread
From: Andrey Grodzovsky @ 2019-12-03 19:10 UTC (permalink / raw)
  To: Deng, Emily, Deucher, Alexander
  Cc: steven.price, amd-gfx, dri-devel, Koenig, Christian

Yes - Christian just pushed it to drm-next-misc - I guess Alex/Christian 
didn't pull to amd-staging-drm-next yet.

Andrey

On 12/2/19 2:24 PM, Deng, Emily wrote:
> [AMD Official Use Only - Internal Distribution Only]
>
> Hi Andrey,
>      Seems this patch is still not in amd-staging-drm-next?
>
> Best wishes
> Emily Deng
>
>
>
>> -----Original Message-----
>> From: Deng, Emily
>> Sent: Tuesday, November 26, 2019 4:41 PM
>> To: Grodzovsky, Andrey <Andrey.Grodzovsky@amd.com>
>> Cc: dri-devel@lists.freedesktop.org; amd-gfx@lists.freedesktop.org; Koenig,
>> Christian <Christian.Koenig@amd.com>; steven.price@arm.com
>> Subject: RE: [PATCH v4] drm/scheduler: Avoid accessing freed bad job.
>>
>> [AMD Official Use Only - Internal Distribution Only]
>>
>> Reviewed-by: Emily Deng <Emily.Deng@amd.com>
>>
>>> -----Original Message-----
>>> From: Grodzovsky, Andrey <Andrey.Grodzovsky@amd.com>
>>> Sent: Tuesday, November 26, 2019 7:37 AM
>>> Cc: dri-devel@lists.freedesktop.org; amd-gfx@lists.freedesktop.org;
>>> Koenig, Christian <Christian.Koenig@amd.com>; Deng, Emily
>>> <Emily.Deng@amd.com>; steven.price@arm.com
>>> Subject: Re: [PATCH v4] drm/scheduler: Avoid accessing freed bad job.
>>>
>>> Ping
>>>
>>> Andrey
>>>
>>> On 11/25/19 3:51 PM, Andrey Grodzovsky wrote:
>>>> Problem:
>>>> Due to a race between drm_sched_cleanup_jobs in sched thread and
>>>> drm_sched_job_timedout in timeout work there is a possiblity that bad
>>>> job was already freed while still being accessed from the timeout
>>>> thread.
>>>>
>>>> Fix:
>>>> Instead of just peeking at the bad job in the mirror list remove it
>>>> from the list under lock and then put it back later when we are
>>>> garanteed no race with main sched thread is possible which is after
>>>> the thread is parked.
>>>>
>>>> v2: Lock around processing ring_mirror_list in drm_sched_cleanup_jobs.
>>>>
>>>> v3: Rebase on top of drm-misc-next. v2 is not needed anymore as
>>>> drm_sched_get_cleanup_job already has a lock there.
>>>>
>>>> v4: Fix comments to relfect latest code in drm-misc.
>>>>
>>>> Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
>>>> Reviewed-by: Christian König <christian.koenig@amd.com>
>>>> Tested-by: Emily Deng <Emily.Deng@amd.com>
>>>> ---
>>>>    drivers/gpu/drm/scheduler/sched_main.c | 27
>>> +++++++++++++++++++++++++++
>>>>    1 file changed, 27 insertions(+)
>>>>
>>>> diff --git a/drivers/gpu/drm/scheduler/sched_main.c
>>>> b/drivers/gpu/drm/scheduler/sched_main.c
>>>> index 6774955..1bf9c40 100644
>>>> --- a/drivers/gpu/drm/scheduler/sched_main.c
>>>> +++ b/drivers/gpu/drm/scheduler/sched_main.c
>>>> @@ -284,10 +284,21 @@ static void drm_sched_job_timedout(struct
>>> work_struct *work)
>>>>    	unsigned long flags;
>>>>
>>>>    	sched = container_of(work, struct drm_gpu_scheduler,
>>>> work_tdr.work);
>>>> +
>>>> +	/* Protects against concurrent deletion in
>>> drm_sched_get_cleanup_job */
>>>> +	spin_lock_irqsave(&sched->job_list_lock, flags);
>>>>    	job = list_first_entry_or_null(&sched->ring_mirror_list,
>>>>    				       struct drm_sched_job, node);
>>>>
>>>>    	if (job) {
>>>> +		/*
>>>> +		 * Remove the bad job so it cannot be freed by concurrent
>>>> +		 * drm_sched_cleanup_jobs. It will be reinserted back after
>>> sched->thread
>>>> +		 * is parked at which point it's safe.
>>>> +		 */
>>>> +		list_del_init(&job->node);
>>>> +		spin_unlock_irqrestore(&sched->job_list_lock, flags);
>>>> +
>>>>    		job->sched->ops->timedout_job(job);
>>>>
>>>>    		/*
>>>> @@ -298,6 +309,8 @@ static void drm_sched_job_timedout(struct
>>> work_struct *work)
>>>>    			job->sched->ops->free_job(job);
>>>>    			sched->free_guilty = false;
>>>>    		}
>>>> +	} else {
>>>> +		spin_unlock_irqrestore(&sched->job_list_lock, flags);
>>>>    	}
>>>>
>>>>    	spin_lock_irqsave(&sched->job_list_lock, flags); @@ -370,6 +383,20
>>>> @@ void drm_sched_stop(struct drm_gpu_scheduler *sched, struct
>>> drm_sched_job *bad)
>>>>    	kthread_park(sched->thread);
>>>>
>>>>    	/*
>>>> +	 * Reinsert back the bad job here - now it's safe as
>>>> +	 * drm_sched_get_cleanup_job cannot race against us and release the
>>>> +	 * bad job at this point - we parked (waited for) any in progress
>>>> +	 * (earlier) cleanups and drm_sched_get_cleanup_job will not be
>>> called
>>>> +	 * now until the scheduler thread is unparked.
>>>> +	 */
>>>> +	if (bad && bad->sched == sched)
>>>> +		/*
>>>> +		 * Add at the head of the queue to reflect it was the earliest
>>>> +		 * job extracted.
>>>> +		 */
>>>> +		list_add(&bad->node, &sched->ring_mirror_list);
>>>> +
>>>> +	/*
>>>>    	 * Iterate the job list from later to  earlier one and either deactive
>>>>    	 * their HW callbacks or remove them from mirror list if they already
>>>>    	 * signaled.
_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 125+ messages in thread

* Re: [PATCH v4] drm/scheduler: Avoid accessing freed bad job.
  2019-12-03 19:10             ` Andrey Grodzovsky
@ 2019-12-03 19:44               ` Deucher, Alexander
  -1 siblings, 0 replies; 125+ messages in thread
From: Deucher, Alexander @ 2019-12-03 19:44 UTC (permalink / raw)
  To: Grodzovsky, Andrey, Deng, Emily
  Cc: steven.price, amd-gfx, dri-devel, Koenig,  Christian


[-- Attachment #1.1: Type: text/plain, Size: 5739 bytes --]

[AMD Official Use Only - Internal Distribution Only]

Please go ahead an apply whatever version is necessary for amd-staging-drm-next.

Alex

________________________________
From: Grodzovsky, Andrey <Andrey.Grodzovsky@amd.com>
Sent: Tuesday, December 3, 2019 2:10 PM
To: Deng, Emily <Emily.Deng@amd.com>; Deucher, Alexander <Alexander.Deucher@amd.com>
Cc: dri-devel@lists.freedesktop.org <dri-devel@lists.freedesktop.org>; amd-gfx@lists.freedesktop.org <amd-gfx@lists.freedesktop.org>; Koenig, Christian <Christian.Koenig@amd.com>; steven.price@arm.com <steven.price@arm.com>
Subject: Re: [PATCH v4] drm/scheduler: Avoid accessing freed bad job.

Yes - Christian just pushed it to drm-next-misc - I guess Alex/Christian
didn't pull to amd-staging-drm-next yet.

Andrey

On 12/2/19 2:24 PM, Deng, Emily wrote:
> [AMD Official Use Only - Internal Distribution Only]
>
> Hi Andrey,
>      Seems this patch is still not in amd-staging-drm-next?
>
> Best wishes
> Emily Deng
>
>
>
>> -----Original Message-----
>> From: Deng, Emily
>> Sent: Tuesday, November 26, 2019 4:41 PM
>> To: Grodzovsky, Andrey <Andrey.Grodzovsky@amd.com>
>> Cc: dri-devel@lists.freedesktop.org; amd-gfx@lists.freedesktop.org; Koenig,
>> Christian <Christian.Koenig@amd.com>; steven.price@arm.com
>> Subject: RE: [PATCH v4] drm/scheduler: Avoid accessing freed bad job.
>>
>> [AMD Official Use Only - Internal Distribution Only]
>>
>> Reviewed-by: Emily Deng <Emily.Deng@amd.com>
>>
>>> -----Original Message-----
>>> From: Grodzovsky, Andrey <Andrey.Grodzovsky@amd.com>
>>> Sent: Tuesday, November 26, 2019 7:37 AM
>>> Cc: dri-devel@lists.freedesktop.org; amd-gfx@lists.freedesktop.org;
>>> Koenig, Christian <Christian.Koenig@amd.com>; Deng, Emily
>>> <Emily.Deng@amd.com>; steven.price@arm.com
>>> Subject: Re: [PATCH v4] drm/scheduler: Avoid accessing freed bad job.
>>>
>>> Ping
>>>
>>> Andrey
>>>
>>> On 11/25/19 3:51 PM, Andrey Grodzovsky wrote:
>>>> Problem:
>>>> Due to a race between drm_sched_cleanup_jobs in sched thread and
>>>> drm_sched_job_timedout in timeout work there is a possiblity that bad
>>>> job was already freed while still being accessed from the timeout
>>>> thread.
>>>>
>>>> Fix:
>>>> Instead of just peeking at the bad job in the mirror list remove it
>>>> from the list under lock and then put it back later when we are
>>>> garanteed no race with main sched thread is possible which is after
>>>> the thread is parked.
>>>>
>>>> v2: Lock around processing ring_mirror_list in drm_sched_cleanup_jobs.
>>>>
>>>> v3: Rebase on top of drm-misc-next. v2 is not needed anymore as
>>>> drm_sched_get_cleanup_job already has a lock there.
>>>>
>>>> v4: Fix comments to relfect latest code in drm-misc.
>>>>
>>>> Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
>>>> Reviewed-by: Christian König <christian.koenig@amd.com>
>>>> Tested-by: Emily Deng <Emily.Deng@amd.com>
>>>> ---
>>>>    drivers/gpu/drm/scheduler/sched_main.c | 27
>>> +++++++++++++++++++++++++++
>>>>    1 file changed, 27 insertions(+)
>>>>
>>>> diff --git a/drivers/gpu/drm/scheduler/sched_main.c
>>>> b/drivers/gpu/drm/scheduler/sched_main.c
>>>> index 6774955..1bf9c40 100644
>>>> --- a/drivers/gpu/drm/scheduler/sched_main.c
>>>> +++ b/drivers/gpu/drm/scheduler/sched_main.c
>>>> @@ -284,10 +284,21 @@ static void drm_sched_job_timedout(struct
>>> work_struct *work)
>>>>     unsigned long flags;
>>>>
>>>>     sched = container_of(work, struct drm_gpu_scheduler,
>>>> work_tdr.work);
>>>> +
>>>> +  /* Protects against concurrent deletion in
>>> drm_sched_get_cleanup_job */
>>>> +  spin_lock_irqsave(&sched->job_list_lock, flags);
>>>>     job = list_first_entry_or_null(&sched->ring_mirror_list,
>>>>                                    struct drm_sched_job, node);
>>>>
>>>>     if (job) {
>>>> +          /*
>>>> +           * Remove the bad job so it cannot be freed by concurrent
>>>> +           * drm_sched_cleanup_jobs. It will be reinserted back after
>>> sched->thread
>>>> +           * is parked at which point it's safe.
>>>> +           */
>>>> +          list_del_init(&job->node);
>>>> +          spin_unlock_irqrestore(&sched->job_list_lock, flags);
>>>> +
>>>>             job->sched->ops->timedout_job(job);
>>>>
>>>>             /*
>>>> @@ -298,6 +309,8 @@ static void drm_sched_job_timedout(struct
>>> work_struct *work)
>>>>                     job->sched->ops->free_job(job);
>>>>                     sched->free_guilty = false;
>>>>             }
>>>> +  } else {
>>>> +          spin_unlock_irqrestore(&sched->job_list_lock, flags);
>>>>     }
>>>>
>>>>     spin_lock_irqsave(&sched->job_list_lock, flags); @@ -370,6 +383,20
>>>> @@ void drm_sched_stop(struct drm_gpu_scheduler *sched, struct
>>> drm_sched_job *bad)
>>>>     kthread_park(sched->thread);
>>>>
>>>>     /*
>>>> +   * Reinsert back the bad job here - now it's safe as
>>>> +   * drm_sched_get_cleanup_job cannot race against us and release the
>>>> +   * bad job at this point - we parked (waited for) any in progress
>>>> +   * (earlier) cleanups and drm_sched_get_cleanup_job will not be
>>> called
>>>> +   * now until the scheduler thread is unparked.
>>>> +   */
>>>> +  if (bad && bad->sched == sched)
>>>> +          /*
>>>> +           * Add at the head of the queue to reflect it was the earliest
>>>> +           * job extracted.
>>>> +           */
>>>> +          list_add(&bad->node, &sched->ring_mirror_list);
>>>> +
>>>> +  /*
>>>>      * Iterate the job list from later to  earlier one and either deactive
>>>>      * their HW callbacks or remove them from mirror list if they already
>>>>      * signaled.

[-- Attachment #1.2: Type: text/html, Size: 10715 bytes --]

[-- Attachment #2: Type: text/plain, Size: 159 bytes --]

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 125+ messages in thread

* Re: [PATCH v4] drm/scheduler: Avoid accessing freed bad job.
@ 2019-12-03 19:44               ` Deucher, Alexander
  0 siblings, 0 replies; 125+ messages in thread
From: Deucher, Alexander @ 2019-12-03 19:44 UTC (permalink / raw)
  To: Grodzovsky, Andrey, Deng, Emily
  Cc: steven.price, amd-gfx, dri-devel, Koenig,  Christian


[-- Attachment #1.1: Type: text/plain, Size: 5739 bytes --]

[AMD Official Use Only - Internal Distribution Only]

Please go ahead an apply whatever version is necessary for amd-staging-drm-next.

Alex

________________________________
From: Grodzovsky, Andrey <Andrey.Grodzovsky@amd.com>
Sent: Tuesday, December 3, 2019 2:10 PM
To: Deng, Emily <Emily.Deng@amd.com>; Deucher, Alexander <Alexander.Deucher@amd.com>
Cc: dri-devel@lists.freedesktop.org <dri-devel@lists.freedesktop.org>; amd-gfx@lists.freedesktop.org <amd-gfx@lists.freedesktop.org>; Koenig, Christian <Christian.Koenig@amd.com>; steven.price@arm.com <steven.price@arm.com>
Subject: Re: [PATCH v4] drm/scheduler: Avoid accessing freed bad job.

Yes - Christian just pushed it to drm-next-misc - I guess Alex/Christian
didn't pull to amd-staging-drm-next yet.

Andrey

On 12/2/19 2:24 PM, Deng, Emily wrote:
> [AMD Official Use Only - Internal Distribution Only]
>
> Hi Andrey,
>      Seems this patch is still not in amd-staging-drm-next?
>
> Best wishes
> Emily Deng
>
>
>
>> -----Original Message-----
>> From: Deng, Emily
>> Sent: Tuesday, November 26, 2019 4:41 PM
>> To: Grodzovsky, Andrey <Andrey.Grodzovsky@amd.com>
>> Cc: dri-devel@lists.freedesktop.org; amd-gfx@lists.freedesktop.org; Koenig,
>> Christian <Christian.Koenig@amd.com>; steven.price@arm.com
>> Subject: RE: [PATCH v4] drm/scheduler: Avoid accessing freed bad job.
>>
>> [AMD Official Use Only - Internal Distribution Only]
>>
>> Reviewed-by: Emily Deng <Emily.Deng@amd.com>
>>
>>> -----Original Message-----
>>> From: Grodzovsky, Andrey <Andrey.Grodzovsky@amd.com>
>>> Sent: Tuesday, November 26, 2019 7:37 AM
>>> Cc: dri-devel@lists.freedesktop.org; amd-gfx@lists.freedesktop.org;
>>> Koenig, Christian <Christian.Koenig@amd.com>; Deng, Emily
>>> <Emily.Deng@amd.com>; steven.price@arm.com
>>> Subject: Re: [PATCH v4] drm/scheduler: Avoid accessing freed bad job.
>>>
>>> Ping
>>>
>>> Andrey
>>>
>>> On 11/25/19 3:51 PM, Andrey Grodzovsky wrote:
>>>> Problem:
>>>> Due to a race between drm_sched_cleanup_jobs in sched thread and
>>>> drm_sched_job_timedout in timeout work there is a possiblity that bad
>>>> job was already freed while still being accessed from the timeout
>>>> thread.
>>>>
>>>> Fix:
>>>> Instead of just peeking at the bad job in the mirror list remove it
>>>> from the list under lock and then put it back later when we are
>>>> garanteed no race with main sched thread is possible which is after
>>>> the thread is parked.
>>>>
>>>> v2: Lock around processing ring_mirror_list in drm_sched_cleanup_jobs.
>>>>
>>>> v3: Rebase on top of drm-misc-next. v2 is not needed anymore as
>>>> drm_sched_get_cleanup_job already has a lock there.
>>>>
>>>> v4: Fix comments to relfect latest code in drm-misc.
>>>>
>>>> Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
>>>> Reviewed-by: Christian König <christian.koenig@amd.com>
>>>> Tested-by: Emily Deng <Emily.Deng@amd.com>
>>>> ---
>>>>    drivers/gpu/drm/scheduler/sched_main.c | 27
>>> +++++++++++++++++++++++++++
>>>>    1 file changed, 27 insertions(+)
>>>>
>>>> diff --git a/drivers/gpu/drm/scheduler/sched_main.c
>>>> b/drivers/gpu/drm/scheduler/sched_main.c
>>>> index 6774955..1bf9c40 100644
>>>> --- a/drivers/gpu/drm/scheduler/sched_main.c
>>>> +++ b/drivers/gpu/drm/scheduler/sched_main.c
>>>> @@ -284,10 +284,21 @@ static void drm_sched_job_timedout(struct
>>> work_struct *work)
>>>>     unsigned long flags;
>>>>
>>>>     sched = container_of(work, struct drm_gpu_scheduler,
>>>> work_tdr.work);
>>>> +
>>>> +  /* Protects against concurrent deletion in
>>> drm_sched_get_cleanup_job */
>>>> +  spin_lock_irqsave(&sched->job_list_lock, flags);
>>>>     job = list_first_entry_or_null(&sched->ring_mirror_list,
>>>>                                    struct drm_sched_job, node);
>>>>
>>>>     if (job) {
>>>> +          /*
>>>> +           * Remove the bad job so it cannot be freed by concurrent
>>>> +           * drm_sched_cleanup_jobs. It will be reinserted back after
>>> sched->thread
>>>> +           * is parked at which point it's safe.
>>>> +           */
>>>> +          list_del_init(&job->node);
>>>> +          spin_unlock_irqrestore(&sched->job_list_lock, flags);
>>>> +
>>>>             job->sched->ops->timedout_job(job);
>>>>
>>>>             /*
>>>> @@ -298,6 +309,8 @@ static void drm_sched_job_timedout(struct
>>> work_struct *work)
>>>>                     job->sched->ops->free_job(job);
>>>>                     sched->free_guilty = false;
>>>>             }
>>>> +  } else {
>>>> +          spin_unlock_irqrestore(&sched->job_list_lock, flags);
>>>>     }
>>>>
>>>>     spin_lock_irqsave(&sched->job_list_lock, flags); @@ -370,6 +383,20
>>>> @@ void drm_sched_stop(struct drm_gpu_scheduler *sched, struct
>>> drm_sched_job *bad)
>>>>     kthread_park(sched->thread);
>>>>
>>>>     /*
>>>> +   * Reinsert back the bad job here - now it's safe as
>>>> +   * drm_sched_get_cleanup_job cannot race against us and release the
>>>> +   * bad job at this point - we parked (waited for) any in progress
>>>> +   * (earlier) cleanups and drm_sched_get_cleanup_job will not be
>>> called
>>>> +   * now until the scheduler thread is unparked.
>>>> +   */
>>>> +  if (bad && bad->sched == sched)
>>>> +          /*
>>>> +           * Add at the head of the queue to reflect it was the earliest
>>>> +           * job extracted.
>>>> +           */
>>>> +          list_add(&bad->node, &sched->ring_mirror_list);
>>>> +
>>>> +  /*
>>>>      * Iterate the job list from later to  earlier one and either deactive
>>>>      * their HW callbacks or remove them from mirror list if they already
>>>>      * signaled.

[-- Attachment #1.2: Type: text/html, Size: 10715 bytes --]

[-- Attachment #2: Type: text/plain, Size: 153 bytes --]

_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 125+ messages in thread

* RE: [PATCH v4] drm/scheduler: Avoid accessing freed bad job.
  2019-12-03 19:10             ` Andrey Grodzovsky
@ 2019-12-03 19:53               ` Deng, Emily
  -1 siblings, 0 replies; 125+ messages in thread
From: Deng, Emily @ 2019-12-03 19:53 UTC (permalink / raw)
  To: Grodzovsky, Andrey, Deucher, Alexander
  Cc: steven.price, amd-gfx, dri-devel, Koenig,  Christian

[AMD Official Use Only - Internal Distribution Only]

Hi Alex,
    When we will cherry pick those patches to drm-next?

>-----Original Message-----
>From: Grodzovsky, Andrey <Andrey.Grodzovsky@amd.com>
>Sent: Tuesday, December 3, 2019 11:10 AM
>To: Deng, Emily <Emily.Deng@amd.com>; Deucher, Alexander
><Alexander.Deucher@amd.com>
>Cc: dri-devel@lists.freedesktop.org; amd-gfx@lists.freedesktop.org; Koenig,
>Christian <Christian.Koenig@amd.com>; steven.price@arm.com
>Subject: Re: [PATCH v4] drm/scheduler: Avoid accessing freed bad job.
>
>Yes - Christian just pushed it to drm-next-misc - I guess Alex/Christian didn't pull
>to amd-staging-drm-next yet.
>
>Andrey
>
>On 12/2/19 2:24 PM, Deng, Emily wrote:
>> [AMD Official Use Only - Internal Distribution Only]
>>
>> Hi Andrey,
>>      Seems this patch is still not in amd-staging-drm-next?
>>
>> Best wishes
>> Emily Deng
>>
>>
>>
>>> -----Original Message-----
>>> From: Deng, Emily
>>> Sent: Tuesday, November 26, 2019 4:41 PM
>>> To: Grodzovsky, Andrey <Andrey.Grodzovsky@amd.com>
>>> Cc: dri-devel@lists.freedesktop.org; amd-gfx@lists.freedesktop.org;
>>> Koenig, Christian <Christian.Koenig@amd.com>; steven.price@arm.com
>>> Subject: RE: [PATCH v4] drm/scheduler: Avoid accessing freed bad job.
>>>
>>> [AMD Official Use Only - Internal Distribution Only]
>>>
>>> Reviewed-by: Emily Deng <Emily.Deng@amd.com>
>>>
>>>> -----Original Message-----
>>>> From: Grodzovsky, Andrey <Andrey.Grodzovsky@amd.com>
>>>> Sent: Tuesday, November 26, 2019 7:37 AM
>>>> Cc: dri-devel@lists.freedesktop.org; amd-gfx@lists.freedesktop.org;
>>>> Koenig, Christian <Christian.Koenig@amd.com>; Deng, Emily
>>>> <Emily.Deng@amd.com>; steven.price@arm.com
>>>> Subject: Re: [PATCH v4] drm/scheduler: Avoid accessing freed bad job.
>>>>
>>>> Ping
>>>>
>>>> Andrey
>>>>
>>>> On 11/25/19 3:51 PM, Andrey Grodzovsky wrote:
>>>>> Problem:
>>>>> Due to a race between drm_sched_cleanup_jobs in sched thread and
>>>>> drm_sched_job_timedout in timeout work there is a possiblity that
>>>>> bad job was already freed while still being accessed from the
>>>>> timeout thread.
>>>>>
>>>>> Fix:
>>>>> Instead of just peeking at the bad job in the mirror list remove it
>>>>> from the list under lock and then put it back later when we are
>>>>> garanteed no race with main sched thread is possible which is after
>>>>> the thread is parked.
>>>>>
>>>>> v2: Lock around processing ring_mirror_list in drm_sched_cleanup_jobs.
>>>>>
>>>>> v3: Rebase on top of drm-misc-next. v2 is not needed anymore as
>>>>> drm_sched_get_cleanup_job already has a lock there.
>>>>>
>>>>> v4: Fix comments to relfect latest code in drm-misc.
>>>>>
>>>>> Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
>>>>> Reviewed-by: Christian König <christian.koenig@amd.com>
>>>>> Tested-by: Emily Deng <Emily.Deng@amd.com>
>>>>> ---
>>>>>    drivers/gpu/drm/scheduler/sched_main.c | 27
>>>> +++++++++++++++++++++++++++
>>>>>    1 file changed, 27 insertions(+)
>>>>>
>>>>> diff --git a/drivers/gpu/drm/scheduler/sched_main.c
>>>>> b/drivers/gpu/drm/scheduler/sched_main.c
>>>>> index 6774955..1bf9c40 100644
>>>>> --- a/drivers/gpu/drm/scheduler/sched_main.c
>>>>> +++ b/drivers/gpu/drm/scheduler/sched_main.c
>>>>> @@ -284,10 +284,21 @@ static void drm_sched_job_timedout(struct
>>>> work_struct *work)
>>>>>    	unsigned long flags;
>>>>>
>>>>>    	sched = container_of(work, struct drm_gpu_scheduler,
>>>>> work_tdr.work);
>>>>> +
>>>>> +	/* Protects against concurrent deletion in
>>>> drm_sched_get_cleanup_job */
>>>>> +	spin_lock_irqsave(&sched->job_list_lock, flags);
>>>>>    	job = list_first_entry_or_null(&sched->ring_mirror_list,
>>>>>    				       struct drm_sched_job, node);
>>>>>
>>>>>    	if (job) {
>>>>> +		/*
>>>>> +		 * Remove the bad job so it cannot be freed by concurrent
>>>>> +		 * drm_sched_cleanup_jobs. It will be reinserted back after
>>>> sched->thread
>>>>> +		 * is parked at which point it's safe.
>>>>> +		 */
>>>>> +		list_del_init(&job->node);
>>>>> +		spin_unlock_irqrestore(&sched->job_list_lock, flags);
>>>>> +
>>>>>    		job->sched->ops->timedout_job(job);
>>>>>
>>>>>    		/*
>>>>> @@ -298,6 +309,8 @@ static void drm_sched_job_timedout(struct
>>>> work_struct *work)
>>>>>    			job->sched->ops->free_job(job);
>>>>>    			sched->free_guilty = false;
>>>>>    		}
>>>>> +	} else {
>>>>> +		spin_unlock_irqrestore(&sched->job_list_lock, flags);
>>>>>    	}
>>>>>
>>>>>    	spin_lock_irqsave(&sched->job_list_lock, flags); @@ -370,6
>>>>> +383,20 @@ void drm_sched_stop(struct drm_gpu_scheduler *sched,
>>>>> struct
>>>> drm_sched_job *bad)
>>>>>    	kthread_park(sched->thread);
>>>>>
>>>>>    	/*
>>>>> +	 * Reinsert back the bad job here - now it's safe as
>>>>> +	 * drm_sched_get_cleanup_job cannot race against us and release the
>>>>> +	 * bad job at this point - we parked (waited for) any in progress
>>>>> +	 * (earlier) cleanups and drm_sched_get_cleanup_job will not be
>>>> called
>>>>> +	 * now until the scheduler thread is unparked.
>>>>> +	 */
>>>>> +	if (bad && bad->sched == sched)
>>>>> +		/*
>>>>> +		 * Add at the head of the queue to reflect it was the earliest
>>>>> +		 * job extracted.
>>>>> +		 */
>>>>> +		list_add(&bad->node, &sched->ring_mirror_list);
>>>>> +
>>>>> +	/*
>>>>>    	 * Iterate the job list from later to  earlier one and either deactive
>>>>>    	 * their HW callbacks or remove them from mirror list if they already
>>>>>    	 * signaled.
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 125+ messages in thread

* RE: [PATCH v4] drm/scheduler: Avoid accessing freed bad job.
@ 2019-12-03 19:53               ` Deng, Emily
  0 siblings, 0 replies; 125+ messages in thread
From: Deng, Emily @ 2019-12-03 19:53 UTC (permalink / raw)
  To: Grodzovsky, Andrey, Deucher, Alexander
  Cc: steven.price, amd-gfx, dri-devel, Koenig,  Christian

[AMD Official Use Only - Internal Distribution Only]

Hi Alex,
    When we will cherry pick those patches to drm-next?

>-----Original Message-----
>From: Grodzovsky, Andrey <Andrey.Grodzovsky@amd.com>
>Sent: Tuesday, December 3, 2019 11:10 AM
>To: Deng, Emily <Emily.Deng@amd.com>; Deucher, Alexander
><Alexander.Deucher@amd.com>
>Cc: dri-devel@lists.freedesktop.org; amd-gfx@lists.freedesktop.org; Koenig,
>Christian <Christian.Koenig@amd.com>; steven.price@arm.com
>Subject: Re: [PATCH v4] drm/scheduler: Avoid accessing freed bad job.
>
>Yes - Christian just pushed it to drm-next-misc - I guess Alex/Christian didn't pull
>to amd-staging-drm-next yet.
>
>Andrey
>
>On 12/2/19 2:24 PM, Deng, Emily wrote:
>> [AMD Official Use Only - Internal Distribution Only]
>>
>> Hi Andrey,
>>      Seems this patch is still not in amd-staging-drm-next?
>>
>> Best wishes
>> Emily Deng
>>
>>
>>
>>> -----Original Message-----
>>> From: Deng, Emily
>>> Sent: Tuesday, November 26, 2019 4:41 PM
>>> To: Grodzovsky, Andrey <Andrey.Grodzovsky@amd.com>
>>> Cc: dri-devel@lists.freedesktop.org; amd-gfx@lists.freedesktop.org;
>>> Koenig, Christian <Christian.Koenig@amd.com>; steven.price@arm.com
>>> Subject: RE: [PATCH v4] drm/scheduler: Avoid accessing freed bad job.
>>>
>>> [AMD Official Use Only - Internal Distribution Only]
>>>
>>> Reviewed-by: Emily Deng <Emily.Deng@amd.com>
>>>
>>>> -----Original Message-----
>>>> From: Grodzovsky, Andrey <Andrey.Grodzovsky@amd.com>
>>>> Sent: Tuesday, November 26, 2019 7:37 AM
>>>> Cc: dri-devel@lists.freedesktop.org; amd-gfx@lists.freedesktop.org;
>>>> Koenig, Christian <Christian.Koenig@amd.com>; Deng, Emily
>>>> <Emily.Deng@amd.com>; steven.price@arm.com
>>>> Subject: Re: [PATCH v4] drm/scheduler: Avoid accessing freed bad job.
>>>>
>>>> Ping
>>>>
>>>> Andrey
>>>>
>>>> On 11/25/19 3:51 PM, Andrey Grodzovsky wrote:
>>>>> Problem:
>>>>> Due to a race between drm_sched_cleanup_jobs in sched thread and
>>>>> drm_sched_job_timedout in timeout work there is a possiblity that
>>>>> bad job was already freed while still being accessed from the
>>>>> timeout thread.
>>>>>
>>>>> Fix:
>>>>> Instead of just peeking at the bad job in the mirror list remove it
>>>>> from the list under lock and then put it back later when we are
>>>>> garanteed no race with main sched thread is possible which is after
>>>>> the thread is parked.
>>>>>
>>>>> v2: Lock around processing ring_mirror_list in drm_sched_cleanup_jobs.
>>>>>
>>>>> v3: Rebase on top of drm-misc-next. v2 is not needed anymore as
>>>>> drm_sched_get_cleanup_job already has a lock there.
>>>>>
>>>>> v4: Fix comments to relfect latest code in drm-misc.
>>>>>
>>>>> Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
>>>>> Reviewed-by: Christian König <christian.koenig@amd.com>
>>>>> Tested-by: Emily Deng <Emily.Deng@amd.com>
>>>>> ---
>>>>>    drivers/gpu/drm/scheduler/sched_main.c | 27
>>>> +++++++++++++++++++++++++++
>>>>>    1 file changed, 27 insertions(+)
>>>>>
>>>>> diff --git a/drivers/gpu/drm/scheduler/sched_main.c
>>>>> b/drivers/gpu/drm/scheduler/sched_main.c
>>>>> index 6774955..1bf9c40 100644
>>>>> --- a/drivers/gpu/drm/scheduler/sched_main.c
>>>>> +++ b/drivers/gpu/drm/scheduler/sched_main.c
>>>>> @@ -284,10 +284,21 @@ static void drm_sched_job_timedout(struct
>>>> work_struct *work)
>>>>>    	unsigned long flags;
>>>>>
>>>>>    	sched = container_of(work, struct drm_gpu_scheduler,
>>>>> work_tdr.work);
>>>>> +
>>>>> +	/* Protects against concurrent deletion in
>>>> drm_sched_get_cleanup_job */
>>>>> +	spin_lock_irqsave(&sched->job_list_lock, flags);
>>>>>    	job = list_first_entry_or_null(&sched->ring_mirror_list,
>>>>>    				       struct drm_sched_job, node);
>>>>>
>>>>>    	if (job) {
>>>>> +		/*
>>>>> +		 * Remove the bad job so it cannot be freed by concurrent
>>>>> +		 * drm_sched_cleanup_jobs. It will be reinserted back after
>>>> sched->thread
>>>>> +		 * is parked at which point it's safe.
>>>>> +		 */
>>>>> +		list_del_init(&job->node);
>>>>> +		spin_unlock_irqrestore(&sched->job_list_lock, flags);
>>>>> +
>>>>>    		job->sched->ops->timedout_job(job);
>>>>>
>>>>>    		/*
>>>>> @@ -298,6 +309,8 @@ static void drm_sched_job_timedout(struct
>>>> work_struct *work)
>>>>>    			job->sched->ops->free_job(job);
>>>>>    			sched->free_guilty = false;
>>>>>    		}
>>>>> +	} else {
>>>>> +		spin_unlock_irqrestore(&sched->job_list_lock, flags);
>>>>>    	}
>>>>>
>>>>>    	spin_lock_irqsave(&sched->job_list_lock, flags); @@ -370,6
>>>>> +383,20 @@ void drm_sched_stop(struct drm_gpu_scheduler *sched,
>>>>> struct
>>>> drm_sched_job *bad)
>>>>>    	kthread_park(sched->thread);
>>>>>
>>>>>    	/*
>>>>> +	 * Reinsert back the bad job here - now it's safe as
>>>>> +	 * drm_sched_get_cleanup_job cannot race against us and release the
>>>>> +	 * bad job at this point - we parked (waited for) any in progress
>>>>> +	 * (earlier) cleanups and drm_sched_get_cleanup_job will not be
>>>> called
>>>>> +	 * now until the scheduler thread is unparked.
>>>>> +	 */
>>>>> +	if (bad && bad->sched == sched)
>>>>> +		/*
>>>>> +		 * Add at the head of the queue to reflect it was the earliest
>>>>> +		 * job extracted.
>>>>> +		 */
>>>>> +		list_add(&bad->node, &sched->ring_mirror_list);
>>>>> +
>>>>> +	/*
>>>>>    	 * Iterate the job list from later to  earlier one and either deactive
>>>>>    	 * their HW callbacks or remove them from mirror list if they already
>>>>>    	 * signaled.
_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 125+ messages in thread

* Re: [PATCH v4] drm/scheduler: Avoid accessing freed bad job.
  2019-12-03 19:44               ` Deucher, Alexander
@ 2019-12-03 19:57                 ` Andrey Grodzovsky
  -1 siblings, 0 replies; 125+ messages in thread
From: Andrey Grodzovsky @ 2019-12-03 19:57 UTC (permalink / raw)
  To: Deucher, Alexander, Deng, Emily
  Cc: steven.price, amd-gfx, dri-devel, Koenig, Christian


[-- Attachment #1.1: Type: text/plain, Size: 6317 bytes --]

I don't think i can apply this patch 'as is' as this has dependency on 
patch by Steven which also wasn't applied yet - 588b982 Steven 
Price        6 weeks ago    drm: Don't free jobs in 
wait_event_interruptible()


Andrey


On 12/3/19 2:44 PM, Deucher, Alexander wrote:
>
> [AMD Official Use Only - Internal Distribution Only]
>
>
> Please go ahead an apply whatever version is necessary for 
> amd-staging-drm-next.
>
> Alex
>
> ------------------------------------------------------------------------
> *From:* Grodzovsky, Andrey <Andrey.Grodzovsky@amd.com>
> *Sent:* Tuesday, December 3, 2019 2:10 PM
> *To:* Deng, Emily <Emily.Deng@amd.com>; Deucher, Alexander 
> <Alexander.Deucher@amd.com>
> *Cc:* dri-devel@lists.freedesktop.org 
> <dri-devel@lists.freedesktop.org>; amd-gfx@lists.freedesktop.org 
> <amd-gfx@lists.freedesktop.org>; Koenig, Christian 
> <Christian.Koenig@amd.com>; steven.price@arm.com <steven.price@arm.com>
> *Subject:* Re: [PATCH v4] drm/scheduler: Avoid accessing freed bad job.
> Yes - Christian just pushed it to drm-next-misc - I guess Alex/Christian
> didn't pull to amd-staging-drm-next yet.
>
> Andrey
>
> On 12/2/19 2:24 PM, Deng, Emily wrote:
> > [AMD Official Use Only - Internal Distribution Only]
> >
> > Hi Andrey,
> >      Seems this patch is still not in amd-staging-drm-next?
> >
> > Best wishes
> > Emily Deng
> >
> >
> >
> >> -----Original Message-----
> >> From: Deng, Emily
> >> Sent: Tuesday, November 26, 2019 4:41 PM
> >> To: Grodzovsky, Andrey <Andrey.Grodzovsky@amd.com>
> >> Cc: dri-devel@lists.freedesktop.org; amd-gfx@lists.freedesktop.org; 
> Koenig,
> >> Christian <Christian.Koenig@amd.com>; steven.price@arm.com
> >> Subject: RE: [PATCH v4] drm/scheduler: Avoid accessing freed bad job.
> >>
> >> [AMD Official Use Only - Internal Distribution Only]
> >>
> >> Reviewed-by: Emily Deng <Emily.Deng@amd.com>
> >>
> >>> -----Original Message-----
> >>> From: Grodzovsky, Andrey <Andrey.Grodzovsky@amd.com>
> >>> Sent: Tuesday, November 26, 2019 7:37 AM
> >>> Cc: dri-devel@lists.freedesktop.org; amd-gfx@lists.freedesktop.org;
> >>> Koenig, Christian <Christian.Koenig@amd.com>; Deng, Emily
> >>> <Emily.Deng@amd.com>; steven.price@arm.com
> >>> Subject: Re: [PATCH v4] drm/scheduler: Avoid accessing freed bad job.
> >>>
> >>> Ping
> >>>
> >>> Andrey
> >>>
> >>> On 11/25/19 3:51 PM, Andrey Grodzovsky wrote:
> >>>> Problem:
> >>>> Due to a race between drm_sched_cleanup_jobs in sched thread and
> >>>> drm_sched_job_timedout in timeout work there is a possiblity that bad
> >>>> job was already freed while still being accessed from the timeout
> >>>> thread.
> >>>>
> >>>> Fix:
> >>>> Instead of just peeking at the bad job in the mirror list remove it
> >>>> from the list under lock and then put it back later when we are
> >>>> garanteed no race with main sched thread is possible which is after
> >>>> the thread is parked.
> >>>>
> >>>> v2: Lock around processing ring_mirror_list in 
> drm_sched_cleanup_jobs.
> >>>>
> >>>> v3: Rebase on top of drm-misc-next. v2 is not needed anymore as
> >>>> drm_sched_get_cleanup_job already has a lock there.
> >>>>
> >>>> v4: Fix comments to relfect latest code in drm-misc.
> >>>>
> >>>> Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
> >>>> Reviewed-by: Christian König <christian.koenig@amd.com>
> >>>> Tested-by: Emily Deng <Emily.Deng@amd.com>
> >>>> ---
> >>>> drivers/gpu/drm/scheduler/sched_main.c | 27
> >>> +++++++++++++++++++++++++++
> >>>>    1 file changed, 27 insertions(+)
> >>>>
> >>>> diff --git a/drivers/gpu/drm/scheduler/sched_main.c
> >>>> b/drivers/gpu/drm/scheduler/sched_main.c
> >>>> index 6774955..1bf9c40 100644
> >>>> --- a/drivers/gpu/drm/scheduler/sched_main.c
> >>>> +++ b/drivers/gpu/drm/scheduler/sched_main.c
> >>>> @@ -284,10 +284,21 @@ static void drm_sched_job_timedout(struct
> >>> work_struct *work)
> >>>>     unsigned long flags;
> >>>>
> >>>>     sched = container_of(work, struct drm_gpu_scheduler,
> >>>> work_tdr.work);
> >>>> +
> >>>> +  /* Protects against concurrent deletion in
> >>> drm_sched_get_cleanup_job */
> >>>> + spin_lock_irqsave(&sched->job_list_lock, flags);
> >>>>     job = list_first_entry_or_null(&sched->ring_mirror_list,
> >>>> struct drm_sched_job, node);
> >>>>
> >>>>     if (job) {
> >>>> +          /*
> >>>> +           * Remove the bad job so it cannot be freed by concurrent
> >>>> +           * drm_sched_cleanup_jobs. It will be reinserted back 
> after
> >>> sched->thread
> >>>> +           * is parked at which point it's safe.
> >>>> +           */
> >>>> + list_del_init(&job->node);
> >>>> + spin_unlock_irqrestore(&sched->job_list_lock, flags);
> >>>> +
> >>>> job->sched->ops->timedout_job(job);
> >>>>
> >>>>             /*
> >>>> @@ -298,6 +309,8 @@ static void drm_sched_job_timedout(struct
> >>> work_struct *work)
> >>>> job->sched->ops->free_job(job);
> >>>> sched->free_guilty = false;
> >>>>             }
> >>>> +  } else {
> >>>> + spin_unlock_irqrestore(&sched->job_list_lock, flags);
> >>>>     }
> >>>>
> >>>> spin_lock_irqsave(&sched->job_list_lock, flags); @@ -370,6 +383,20
> >>>> @@ void drm_sched_stop(struct drm_gpu_scheduler *sched, struct
> >>> drm_sched_job *bad)
> >>>>     kthread_park(sched->thread);
> >>>>
> >>>>     /*
> >>>> +   * Reinsert back the bad job here - now it's safe as
> >>>> +   * drm_sched_get_cleanup_job cannot race against us and 
> release the
> >>>> +   * bad job at this point - we parked (waited for) any in progress
> >>>> +   * (earlier) cleanups and drm_sched_get_cleanup_job will not be
> >>> called
> >>>> +   * now until the scheduler thread is unparked.
> >>>> +   */
> >>>> +  if (bad && bad->sched == sched)
> >>>> +          /*
> >>>> +           * Add at the head of the queue to reflect it was the 
> earliest
> >>>> +           * job extracted.
> >>>> +           */
> >>>> +          list_add(&bad->node, &sched->ring_mirror_list);
> >>>> +
> >>>> +  /*
> >>>>      * Iterate the job list from later to  earlier one and either 
> deactive
> >>>>      * their HW callbacks or remove them from mirror list if they 
> already
> >>>>      * signaled.

[-- Attachment #1.2: Type: text/html, Size: 16629 bytes --]

[-- Attachment #2: Type: text/plain, Size: 159 bytes --]

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 125+ messages in thread

* Re: [PATCH v4] drm/scheduler: Avoid accessing freed bad job.
@ 2019-12-03 19:57                 ` Andrey Grodzovsky
  0 siblings, 0 replies; 125+ messages in thread
From: Andrey Grodzovsky @ 2019-12-03 19:57 UTC (permalink / raw)
  To: Deucher, Alexander, Deng, Emily
  Cc: steven.price, amd-gfx, dri-devel, Koenig, Christian


[-- Attachment #1.1: Type: text/plain, Size: 6317 bytes --]

I don't think i can apply this patch 'as is' as this has dependency on 
patch by Steven which also wasn't applied yet - 588b982 Steven 
Price        6 weeks ago    drm: Don't free jobs in 
wait_event_interruptible()


Andrey


On 12/3/19 2:44 PM, Deucher, Alexander wrote:
>
> [AMD Official Use Only - Internal Distribution Only]
>
>
> Please go ahead an apply whatever version is necessary for 
> amd-staging-drm-next.
>
> Alex
>
> ------------------------------------------------------------------------
> *From:* Grodzovsky, Andrey <Andrey.Grodzovsky@amd.com>
> *Sent:* Tuesday, December 3, 2019 2:10 PM
> *To:* Deng, Emily <Emily.Deng@amd.com>; Deucher, Alexander 
> <Alexander.Deucher@amd.com>
> *Cc:* dri-devel@lists.freedesktop.org 
> <dri-devel@lists.freedesktop.org>; amd-gfx@lists.freedesktop.org 
> <amd-gfx@lists.freedesktop.org>; Koenig, Christian 
> <Christian.Koenig@amd.com>; steven.price@arm.com <steven.price@arm.com>
> *Subject:* Re: [PATCH v4] drm/scheduler: Avoid accessing freed bad job.
> Yes - Christian just pushed it to drm-next-misc - I guess Alex/Christian
> didn't pull to amd-staging-drm-next yet.
>
> Andrey
>
> On 12/2/19 2:24 PM, Deng, Emily wrote:
> > [AMD Official Use Only - Internal Distribution Only]
> >
> > Hi Andrey,
> >      Seems this patch is still not in amd-staging-drm-next?
> >
> > Best wishes
> > Emily Deng
> >
> >
> >
> >> -----Original Message-----
> >> From: Deng, Emily
> >> Sent: Tuesday, November 26, 2019 4:41 PM
> >> To: Grodzovsky, Andrey <Andrey.Grodzovsky@amd.com>
> >> Cc: dri-devel@lists.freedesktop.org; amd-gfx@lists.freedesktop.org; 
> Koenig,
> >> Christian <Christian.Koenig@amd.com>; steven.price@arm.com
> >> Subject: RE: [PATCH v4] drm/scheduler: Avoid accessing freed bad job.
> >>
> >> [AMD Official Use Only - Internal Distribution Only]
> >>
> >> Reviewed-by: Emily Deng <Emily.Deng@amd.com>
> >>
> >>> -----Original Message-----
> >>> From: Grodzovsky, Andrey <Andrey.Grodzovsky@amd.com>
> >>> Sent: Tuesday, November 26, 2019 7:37 AM
> >>> Cc: dri-devel@lists.freedesktop.org; amd-gfx@lists.freedesktop.org;
> >>> Koenig, Christian <Christian.Koenig@amd.com>; Deng, Emily
> >>> <Emily.Deng@amd.com>; steven.price@arm.com
> >>> Subject: Re: [PATCH v4] drm/scheduler: Avoid accessing freed bad job.
> >>>
> >>> Ping
> >>>
> >>> Andrey
> >>>
> >>> On 11/25/19 3:51 PM, Andrey Grodzovsky wrote:
> >>>> Problem:
> >>>> Due to a race between drm_sched_cleanup_jobs in sched thread and
> >>>> drm_sched_job_timedout in timeout work there is a possiblity that bad
> >>>> job was already freed while still being accessed from the timeout
> >>>> thread.
> >>>>
> >>>> Fix:
> >>>> Instead of just peeking at the bad job in the mirror list remove it
> >>>> from the list under lock and then put it back later when we are
> >>>> garanteed no race with main sched thread is possible which is after
> >>>> the thread is parked.
> >>>>
> >>>> v2: Lock around processing ring_mirror_list in 
> drm_sched_cleanup_jobs.
> >>>>
> >>>> v3: Rebase on top of drm-misc-next. v2 is not needed anymore as
> >>>> drm_sched_get_cleanup_job already has a lock there.
> >>>>
> >>>> v4: Fix comments to relfect latest code in drm-misc.
> >>>>
> >>>> Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
> >>>> Reviewed-by: Christian König <christian.koenig@amd.com>
> >>>> Tested-by: Emily Deng <Emily.Deng@amd.com>
> >>>> ---
> >>>> drivers/gpu/drm/scheduler/sched_main.c | 27
> >>> +++++++++++++++++++++++++++
> >>>>    1 file changed, 27 insertions(+)
> >>>>
> >>>> diff --git a/drivers/gpu/drm/scheduler/sched_main.c
> >>>> b/drivers/gpu/drm/scheduler/sched_main.c
> >>>> index 6774955..1bf9c40 100644
> >>>> --- a/drivers/gpu/drm/scheduler/sched_main.c
> >>>> +++ b/drivers/gpu/drm/scheduler/sched_main.c
> >>>> @@ -284,10 +284,21 @@ static void drm_sched_job_timedout(struct
> >>> work_struct *work)
> >>>>     unsigned long flags;
> >>>>
> >>>>     sched = container_of(work, struct drm_gpu_scheduler,
> >>>> work_tdr.work);
> >>>> +
> >>>> +  /* Protects against concurrent deletion in
> >>> drm_sched_get_cleanup_job */
> >>>> + spin_lock_irqsave(&sched->job_list_lock, flags);
> >>>>     job = list_first_entry_or_null(&sched->ring_mirror_list,
> >>>> struct drm_sched_job, node);
> >>>>
> >>>>     if (job) {
> >>>> +          /*
> >>>> +           * Remove the bad job so it cannot be freed by concurrent
> >>>> +           * drm_sched_cleanup_jobs. It will be reinserted back 
> after
> >>> sched->thread
> >>>> +           * is parked at which point it's safe.
> >>>> +           */
> >>>> + list_del_init(&job->node);
> >>>> + spin_unlock_irqrestore(&sched->job_list_lock, flags);
> >>>> +
> >>>> job->sched->ops->timedout_job(job);
> >>>>
> >>>>             /*
> >>>> @@ -298,6 +309,8 @@ static void drm_sched_job_timedout(struct
> >>> work_struct *work)
> >>>> job->sched->ops->free_job(job);
> >>>> sched->free_guilty = false;
> >>>>             }
> >>>> +  } else {
> >>>> + spin_unlock_irqrestore(&sched->job_list_lock, flags);
> >>>>     }
> >>>>
> >>>> spin_lock_irqsave(&sched->job_list_lock, flags); @@ -370,6 +383,20
> >>>> @@ void drm_sched_stop(struct drm_gpu_scheduler *sched, struct
> >>> drm_sched_job *bad)
> >>>>     kthread_park(sched->thread);
> >>>>
> >>>>     /*
> >>>> +   * Reinsert back the bad job here - now it's safe as
> >>>> +   * drm_sched_get_cleanup_job cannot race against us and 
> release the
> >>>> +   * bad job at this point - we parked (waited for) any in progress
> >>>> +   * (earlier) cleanups and drm_sched_get_cleanup_job will not be
> >>> called
> >>>> +   * now until the scheduler thread is unparked.
> >>>> +   */
> >>>> +  if (bad && bad->sched == sched)
> >>>> +          /*
> >>>> +           * Add at the head of the queue to reflect it was the 
> earliest
> >>>> +           * job extracted.
> >>>> +           */
> >>>> +          list_add(&bad->node, &sched->ring_mirror_list);
> >>>> +
> >>>> +  /*
> >>>>      * Iterate the job list from later to  earlier one and either 
> deactive
> >>>>      * their HW callbacks or remove them from mirror list if they 
> already
> >>>>      * signaled.

[-- Attachment #1.2: Type: text/html, Size: 16629 bytes --]

[-- Attachment #2: Type: text/plain, Size: 153 bytes --]

_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 125+ messages in thread

* Re: [PATCH v4] drm/scheduler: Avoid accessing freed bad job.
  2019-12-03 19:57                 ` Andrey Grodzovsky
@ 2019-12-03 19:59                   ` Deucher, Alexander
  -1 siblings, 0 replies; 125+ messages in thread
From: Deucher, Alexander @ 2019-12-03 19:59 UTC (permalink / raw)
  To: Grodzovsky, Andrey, Deng, Emily
  Cc: steven.price, amd-gfx, dri-devel, Koenig,  Christian


[-- Attachment #1.1: Type: text/plain, Size: 7538 bytes --]

[AMD Official Use Only - Internal Distribution Only]

Cherry pick whatever dependencies you need or pick the older version of the patch.  Either way works.

Alex
________________________________
From: Grodzovsky, Andrey <Andrey.Grodzovsky@amd.com>
Sent: Tuesday, December 3, 2019 2:57 PM
To: Deucher, Alexander <Alexander.Deucher@amd.com>; Deng, Emily <Emily.Deng@amd.com>
Cc: dri-devel@lists.freedesktop.org <dri-devel@lists.freedesktop.org>; amd-gfx@lists.freedesktop.org <amd-gfx@lists.freedesktop.org>; Koenig, Christian <Christian.Koenig@amd.com>; steven.price@arm.com <steven.price@arm.com>
Subject: Re: [PATCH v4] drm/scheduler: Avoid accessing freed bad job.


I don't think i can apply this patch 'as is' as this has dependency on patch by Steven which also wasn't applied yet - 588b982 Steven Price        6 weeks ago    drm: Don't free jobs in wait_event_interruptible()


Andrey


On 12/3/19 2:44 PM, Deucher, Alexander wrote:

[AMD Official Use Only - Internal Distribution Only]

Please go ahead an apply whatever version is necessary for amd-staging-drm-next.

Alex

________________________________
From: Grodzovsky, Andrey <Andrey.Grodzovsky@amd.com><mailto:Andrey.Grodzovsky@amd.com>
Sent: Tuesday, December 3, 2019 2:10 PM
To: Deng, Emily <Emily.Deng@amd.com><mailto:Emily.Deng@amd.com>; Deucher, Alexander <Alexander.Deucher@amd.com><mailto:Alexander.Deucher@amd.com>
Cc: dri-devel@lists.freedesktop.org<mailto:dri-devel@lists.freedesktop.org> <dri-devel@lists.freedesktop.org><mailto:dri-devel@lists.freedesktop.org>; amd-gfx@lists.freedesktop.org<mailto:amd-gfx@lists.freedesktop.org> <amd-gfx@lists.freedesktop.org><mailto:amd-gfx@lists.freedesktop.org>; Koenig, Christian <Christian.Koenig@amd.com><mailto:Christian.Koenig@amd.com>; steven.price@arm.com<mailto:steven.price@arm.com> <steven.price@arm.com><mailto:steven.price@arm.com>
Subject: Re: [PATCH v4] drm/scheduler: Avoid accessing freed bad job.

Yes - Christian just pushed it to drm-next-misc - I guess Alex/Christian
didn't pull to amd-staging-drm-next yet.

Andrey

On 12/2/19 2:24 PM, Deng, Emily wrote:
> [AMD Official Use Only - Internal Distribution Only]
>
> Hi Andrey,
>      Seems this patch is still not in amd-staging-drm-next?
>
> Best wishes
> Emily Deng
>
>
>
>> -----Original Message-----
>> From: Deng, Emily
>> Sent: Tuesday, November 26, 2019 4:41 PM
>> To: Grodzovsky, Andrey <Andrey.Grodzovsky@amd.com><mailto:Andrey.Grodzovsky@amd.com>
>> Cc: dri-devel@lists.freedesktop.org<mailto:dri-devel@lists.freedesktop.org>; amd-gfx@lists.freedesktop.org<mailto:amd-gfx@lists.freedesktop.org>; Koenig,
>> Christian <Christian.Koenig@amd.com><mailto:Christian.Koenig@amd.com>; steven.price@arm.com<mailto:steven.price@arm.com>
>> Subject: RE: [PATCH v4] drm/scheduler: Avoid accessing freed bad job.
>>
>> [AMD Official Use Only - Internal Distribution Only]
>>
>> Reviewed-by: Emily Deng <Emily.Deng@amd.com><mailto:Emily.Deng@amd.com>
>>
>>> -----Original Message-----
>>> From: Grodzovsky, Andrey <Andrey.Grodzovsky@amd.com><mailto:Andrey.Grodzovsky@amd.com>
>>> Sent: Tuesday, November 26, 2019 7:37 AM
>>> Cc: dri-devel@lists.freedesktop.org<mailto:dri-devel@lists.freedesktop.org>; amd-gfx@lists.freedesktop.org<mailto:amd-gfx@lists.freedesktop.org>;
>>> Koenig, Christian <Christian.Koenig@amd.com><mailto:Christian.Koenig@amd.com>; Deng, Emily
>>> <Emily.Deng@amd.com><mailto:Emily.Deng@amd.com>; steven.price@arm.com<mailto:steven.price@arm.com>
>>> Subject: Re: [PATCH v4] drm/scheduler: Avoid accessing freed bad job.
>>>
>>> Ping
>>>
>>> Andrey
>>>
>>> On 11/25/19 3:51 PM, Andrey Grodzovsky wrote:
>>>> Problem:
>>>> Due to a race between drm_sched_cleanup_jobs in sched thread and
>>>> drm_sched_job_timedout in timeout work there is a possiblity that bad
>>>> job was already freed while still being accessed from the timeout
>>>> thread.
>>>>
>>>> Fix:
>>>> Instead of just peeking at the bad job in the mirror list remove it
>>>> from the list under lock and then put it back later when we are
>>>> garanteed no race with main sched thread is possible which is after
>>>> the thread is parked.
>>>>
>>>> v2: Lock around processing ring_mirror_list in drm_sched_cleanup_jobs.
>>>>
>>>> v3: Rebase on top of drm-misc-next. v2 is not needed anymore as
>>>> drm_sched_get_cleanup_job already has a lock there.
>>>>
>>>> v4: Fix comments to relfect latest code in drm-misc.
>>>>
>>>> Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com><mailto:andrey.grodzovsky@amd.com>
>>>> Reviewed-by: Christian König <christian.koenig@amd.com><mailto:christian.koenig@amd.com>
>>>> Tested-by: Emily Deng <Emily.Deng@amd.com><mailto:Emily.Deng@amd.com>
>>>> ---
>>>>    drivers/gpu/drm/scheduler/sched_main.c | 27
>>> +++++++++++++++++++++++++++
>>>>    1 file changed, 27 insertions(+)
>>>>
>>>> diff --git a/drivers/gpu/drm/scheduler/sched_main.c
>>>> b/drivers/gpu/drm/scheduler/sched_main.c
>>>> index 6774955..1bf9c40 100644
>>>> --- a/drivers/gpu/drm/scheduler/sched_main.c
>>>> +++ b/drivers/gpu/drm/scheduler/sched_main.c
>>>> @@ -284,10 +284,21 @@ static void drm_sched_job_timedout(struct
>>> work_struct *work)
>>>>     unsigned long flags;
>>>>
>>>>     sched = container_of(work, struct drm_gpu_scheduler,
>>>> work_tdr.work);
>>>> +
>>>> +  /* Protects against concurrent deletion in
>>> drm_sched_get_cleanup_job */
>>>> +  spin_lock_irqsave(&sched->job_list_lock, flags);
>>>>     job = list_first_entry_or_null(&sched->ring_mirror_list,
>>>>                                    struct drm_sched_job, node);
>>>>
>>>>     if (job) {
>>>> +          /*
>>>> +           * Remove the bad job so it cannot be freed by concurrent
>>>> +           * drm_sched_cleanup_jobs. It will be reinserted back after
>>> sched->thread
>>>> +           * is parked at which point it's safe.
>>>> +           */
>>>> +          list_del_init(&job->node);
>>>> +          spin_unlock_irqrestore(&sched->job_list_lock, flags);
>>>> +
>>>>             job->sched->ops->timedout_job(job);
>>>>
>>>>             /*
>>>> @@ -298,6 +309,8 @@ static void drm_sched_job_timedout(struct
>>> work_struct *work)
>>>>                     job->sched->ops->free_job(job);
>>>>                     sched->free_guilty = false;
>>>>             }
>>>> +  } else {
>>>> +          spin_unlock_irqrestore(&sched->job_list_lock, flags);
>>>>     }
>>>>
>>>>     spin_lock_irqsave(&sched->job_list_lock, flags); @@ -370,6 +383,20
>>>> @@ void drm_sched_stop(struct drm_gpu_scheduler *sched, struct
>>> drm_sched_job *bad)
>>>>     kthread_park(sched->thread);
>>>>
>>>>     /*
>>>> +   * Reinsert back the bad job here - now it's safe as
>>>> +   * drm_sched_get_cleanup_job cannot race against us and release the
>>>> +   * bad job at this point - we parked (waited for) any in progress
>>>> +   * (earlier) cleanups and drm_sched_get_cleanup_job will not be
>>> called
>>>> +   * now until the scheduler thread is unparked.
>>>> +   */
>>>> +  if (bad && bad->sched == sched)
>>>> +          /*
>>>> +           * Add at the head of the queue to reflect it was the earliest
>>>> +           * job extracted.
>>>> +           */
>>>> +          list_add(&bad->node, &sched->ring_mirror_list);
>>>> +
>>>> +  /*
>>>>      * Iterate the job list from later to  earlier one and either deactive
>>>>      * their HW callbacks or remove them from mirror list if they already
>>>>      * signaled.

[-- Attachment #1.2: Type: text/html, Size: 14729 bytes --]

[-- Attachment #2: Type: text/plain, Size: 159 bytes --]

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 125+ messages in thread

* Re: [PATCH v4] drm/scheduler: Avoid accessing freed bad job.
@ 2019-12-03 19:59                   ` Deucher, Alexander
  0 siblings, 0 replies; 125+ messages in thread
From: Deucher, Alexander @ 2019-12-03 19:59 UTC (permalink / raw)
  To: Grodzovsky, Andrey, Deng, Emily
  Cc: steven.price, amd-gfx, dri-devel, Koenig,  Christian


[-- Attachment #1.1: Type: text/plain, Size: 7538 bytes --]

[AMD Official Use Only - Internal Distribution Only]

Cherry pick whatever dependencies you need or pick the older version of the patch.  Either way works.

Alex
________________________________
From: Grodzovsky, Andrey <Andrey.Grodzovsky@amd.com>
Sent: Tuesday, December 3, 2019 2:57 PM
To: Deucher, Alexander <Alexander.Deucher@amd.com>; Deng, Emily <Emily.Deng@amd.com>
Cc: dri-devel@lists.freedesktop.org <dri-devel@lists.freedesktop.org>; amd-gfx@lists.freedesktop.org <amd-gfx@lists.freedesktop.org>; Koenig, Christian <Christian.Koenig@amd.com>; steven.price@arm.com <steven.price@arm.com>
Subject: Re: [PATCH v4] drm/scheduler: Avoid accessing freed bad job.


I don't think i can apply this patch 'as is' as this has dependency on patch by Steven which also wasn't applied yet - 588b982 Steven Price        6 weeks ago    drm: Don't free jobs in wait_event_interruptible()


Andrey


On 12/3/19 2:44 PM, Deucher, Alexander wrote:

[AMD Official Use Only - Internal Distribution Only]

Please go ahead an apply whatever version is necessary for amd-staging-drm-next.

Alex

________________________________
From: Grodzovsky, Andrey <Andrey.Grodzovsky@amd.com><mailto:Andrey.Grodzovsky@amd.com>
Sent: Tuesday, December 3, 2019 2:10 PM
To: Deng, Emily <Emily.Deng@amd.com><mailto:Emily.Deng@amd.com>; Deucher, Alexander <Alexander.Deucher@amd.com><mailto:Alexander.Deucher@amd.com>
Cc: dri-devel@lists.freedesktop.org<mailto:dri-devel@lists.freedesktop.org> <dri-devel@lists.freedesktop.org><mailto:dri-devel@lists.freedesktop.org>; amd-gfx@lists.freedesktop.org<mailto:amd-gfx@lists.freedesktop.org> <amd-gfx@lists.freedesktop.org><mailto:amd-gfx@lists.freedesktop.org>; Koenig, Christian <Christian.Koenig@amd.com><mailto:Christian.Koenig@amd.com>; steven.price@arm.com<mailto:steven.price@arm.com> <steven.price@arm.com><mailto:steven.price@arm.com>
Subject: Re: [PATCH v4] drm/scheduler: Avoid accessing freed bad job.

Yes - Christian just pushed it to drm-next-misc - I guess Alex/Christian
didn't pull to amd-staging-drm-next yet.

Andrey

On 12/2/19 2:24 PM, Deng, Emily wrote:
> [AMD Official Use Only - Internal Distribution Only]
>
> Hi Andrey,
>      Seems this patch is still not in amd-staging-drm-next?
>
> Best wishes
> Emily Deng
>
>
>
>> -----Original Message-----
>> From: Deng, Emily
>> Sent: Tuesday, November 26, 2019 4:41 PM
>> To: Grodzovsky, Andrey <Andrey.Grodzovsky@amd.com><mailto:Andrey.Grodzovsky@amd.com>
>> Cc: dri-devel@lists.freedesktop.org<mailto:dri-devel@lists.freedesktop.org>; amd-gfx@lists.freedesktop.org<mailto:amd-gfx@lists.freedesktop.org>; Koenig,
>> Christian <Christian.Koenig@amd.com><mailto:Christian.Koenig@amd.com>; steven.price@arm.com<mailto:steven.price@arm.com>
>> Subject: RE: [PATCH v4] drm/scheduler: Avoid accessing freed bad job.
>>
>> [AMD Official Use Only - Internal Distribution Only]
>>
>> Reviewed-by: Emily Deng <Emily.Deng@amd.com><mailto:Emily.Deng@amd.com>
>>
>>> -----Original Message-----
>>> From: Grodzovsky, Andrey <Andrey.Grodzovsky@amd.com><mailto:Andrey.Grodzovsky@amd.com>
>>> Sent: Tuesday, November 26, 2019 7:37 AM
>>> Cc: dri-devel@lists.freedesktop.org<mailto:dri-devel@lists.freedesktop.org>; amd-gfx@lists.freedesktop.org<mailto:amd-gfx@lists.freedesktop.org>;
>>> Koenig, Christian <Christian.Koenig@amd.com><mailto:Christian.Koenig@amd.com>; Deng, Emily
>>> <Emily.Deng@amd.com><mailto:Emily.Deng@amd.com>; steven.price@arm.com<mailto:steven.price@arm.com>
>>> Subject: Re: [PATCH v4] drm/scheduler: Avoid accessing freed bad job.
>>>
>>> Ping
>>>
>>> Andrey
>>>
>>> On 11/25/19 3:51 PM, Andrey Grodzovsky wrote:
>>>> Problem:
>>>> Due to a race between drm_sched_cleanup_jobs in sched thread and
>>>> drm_sched_job_timedout in timeout work there is a possiblity that bad
>>>> job was already freed while still being accessed from the timeout
>>>> thread.
>>>>
>>>> Fix:
>>>> Instead of just peeking at the bad job in the mirror list remove it
>>>> from the list under lock and then put it back later when we are
>>>> garanteed no race with main sched thread is possible which is after
>>>> the thread is parked.
>>>>
>>>> v2: Lock around processing ring_mirror_list in drm_sched_cleanup_jobs.
>>>>
>>>> v3: Rebase on top of drm-misc-next. v2 is not needed anymore as
>>>> drm_sched_get_cleanup_job already has a lock there.
>>>>
>>>> v4: Fix comments to relfect latest code in drm-misc.
>>>>
>>>> Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com><mailto:andrey.grodzovsky@amd.com>
>>>> Reviewed-by: Christian König <christian.koenig@amd.com><mailto:christian.koenig@amd.com>
>>>> Tested-by: Emily Deng <Emily.Deng@amd.com><mailto:Emily.Deng@amd.com>
>>>> ---
>>>>    drivers/gpu/drm/scheduler/sched_main.c | 27
>>> +++++++++++++++++++++++++++
>>>>    1 file changed, 27 insertions(+)
>>>>
>>>> diff --git a/drivers/gpu/drm/scheduler/sched_main.c
>>>> b/drivers/gpu/drm/scheduler/sched_main.c
>>>> index 6774955..1bf9c40 100644
>>>> --- a/drivers/gpu/drm/scheduler/sched_main.c
>>>> +++ b/drivers/gpu/drm/scheduler/sched_main.c
>>>> @@ -284,10 +284,21 @@ static void drm_sched_job_timedout(struct
>>> work_struct *work)
>>>>     unsigned long flags;
>>>>
>>>>     sched = container_of(work, struct drm_gpu_scheduler,
>>>> work_tdr.work);
>>>> +
>>>> +  /* Protects against concurrent deletion in
>>> drm_sched_get_cleanup_job */
>>>> +  spin_lock_irqsave(&sched->job_list_lock, flags);
>>>>     job = list_first_entry_or_null(&sched->ring_mirror_list,
>>>>                                    struct drm_sched_job, node);
>>>>
>>>>     if (job) {
>>>> +          /*
>>>> +           * Remove the bad job so it cannot be freed by concurrent
>>>> +           * drm_sched_cleanup_jobs. It will be reinserted back after
>>> sched->thread
>>>> +           * is parked at which point it's safe.
>>>> +           */
>>>> +          list_del_init(&job->node);
>>>> +          spin_unlock_irqrestore(&sched->job_list_lock, flags);
>>>> +
>>>>             job->sched->ops->timedout_job(job);
>>>>
>>>>             /*
>>>> @@ -298,6 +309,8 @@ static void drm_sched_job_timedout(struct
>>> work_struct *work)
>>>>                     job->sched->ops->free_job(job);
>>>>                     sched->free_guilty = false;
>>>>             }
>>>> +  } else {
>>>> +          spin_unlock_irqrestore(&sched->job_list_lock, flags);
>>>>     }
>>>>
>>>>     spin_lock_irqsave(&sched->job_list_lock, flags); @@ -370,6 +383,20
>>>> @@ void drm_sched_stop(struct drm_gpu_scheduler *sched, struct
>>> drm_sched_job *bad)
>>>>     kthread_park(sched->thread);
>>>>
>>>>     /*
>>>> +   * Reinsert back the bad job here - now it's safe as
>>>> +   * drm_sched_get_cleanup_job cannot race against us and release the
>>>> +   * bad job at this point - we parked (waited for) any in progress
>>>> +   * (earlier) cleanups and drm_sched_get_cleanup_job will not be
>>> called
>>>> +   * now until the scheduler thread is unparked.
>>>> +   */
>>>> +  if (bad && bad->sched == sched)
>>>> +          /*
>>>> +           * Add at the head of the queue to reflect it was the earliest
>>>> +           * job extracted.
>>>> +           */
>>>> +          list_add(&bad->node, &sched->ring_mirror_list);
>>>> +
>>>> +  /*
>>>>      * Iterate the job list from later to  earlier one and either deactive
>>>>      * their HW callbacks or remove them from mirror list if they already
>>>>      * signaled.

[-- Attachment #1.2: Type: text/html, Size: 14729 bytes --]

[-- Attachment #2: Type: text/plain, Size: 153 bytes --]

_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 125+ messages in thread

* Re: [PATCH v4] drm/scheduler: Avoid accessing freed bad job.
  2019-12-03 19:59                   ` Deucher, Alexander
@ 2019-12-03 20:32                     ` Andrey Grodzovsky
  -1 siblings, 0 replies; 125+ messages in thread
From: Andrey Grodzovsky @ 2019-12-03 20:32 UTC (permalink / raw)
  To: Deucher, Alexander, Deng, Emily
  Cc: steven.price, amd-gfx, dri-devel, Koenig, Christian


[-- Attachment #1.1: Type: text/plain, Size: 8420 bytes --]

Turns out Steven's patch was already in so i just cherry-picked the 
change from drm-next-misc


Emily - it's in.


Andrey


On 12/3/19 2:59 PM, Deucher, Alexander wrote:
>
> [AMD Official Use Only - Internal Distribution Only]
>
>
> Cherry pick whatever dependencies you need or pick the older version 
> of the patch.  Either way works.
>
> Alex
> ------------------------------------------------------------------------
> *From:* Grodzovsky, Andrey <Andrey.Grodzovsky@amd.com>
> *Sent:* Tuesday, December 3, 2019 2:57 PM
> *To:* Deucher, Alexander <Alexander.Deucher@amd.com>; Deng, Emily 
> <Emily.Deng@amd.com>
> *Cc:* dri-devel@lists.freedesktop.org 
> <dri-devel@lists.freedesktop.org>; amd-gfx@lists.freedesktop.org 
> <amd-gfx@lists.freedesktop.org>; Koenig, Christian 
> <Christian.Koenig@amd.com>; steven.price@arm.com <steven.price@arm.com>
> *Subject:* Re: [PATCH v4] drm/scheduler: Avoid accessing freed bad job.
>
> I don't think i can apply this patch 'as is' as this has dependency on 
> patch by Steven which also wasn't applied yet - 588b982 Steven 
> Price        6 weeks ago    drm: Don't free jobs in 
> wait_event_interruptible()
>
>
> Andrey
>
>
> On 12/3/19 2:44 PM, Deucher, Alexander wrote:
>>
>> [AMD Official Use Only - Internal Distribution Only]
>>
>>
>> Please go ahead an apply whatever version is necessary for 
>> amd-staging-drm-next.
>>
>> Alex
>>
>> ------------------------------------------------------------------------
>> *From:* Grodzovsky, Andrey <Andrey.Grodzovsky@amd.com> 
>> <mailto:Andrey.Grodzovsky@amd.com>
>> *Sent:* Tuesday, December 3, 2019 2:10 PM
>> *To:* Deng, Emily <Emily.Deng@amd.com> <mailto:Emily.Deng@amd.com>; 
>> Deucher, Alexander <Alexander.Deucher@amd.com> 
>> <mailto:Alexander.Deucher@amd.com>
>> *Cc:* dri-devel@lists.freedesktop.org 
>> <mailto:dri-devel@lists.freedesktop.org> 
>> <dri-devel@lists.freedesktop.org> 
>> <mailto:dri-devel@lists.freedesktop.org>; 
>> amd-gfx@lists.freedesktop.org <mailto:amd-gfx@lists.freedesktop.org> 
>> <amd-gfx@lists.freedesktop.org> 
>> <mailto:amd-gfx@lists.freedesktop.org>; Koenig, Christian 
>> <Christian.Koenig@amd.com> <mailto:Christian.Koenig@amd.com>; 
>> steven.price@arm.com <mailto:steven.price@arm.com> 
>> <steven.price@arm.com> <mailto:steven.price@arm.com>
>> *Subject:* Re: [PATCH v4] drm/scheduler: Avoid accessing freed bad job.
>> Yes - Christian just pushed it to drm-next-misc - I guess Alex/Christian
>> didn't pull to amd-staging-drm-next yet.
>>
>> Andrey
>>
>> On 12/2/19 2:24 PM, Deng, Emily wrote:
>> > [AMD Official Use Only - Internal Distribution Only]
>> >
>> > Hi Andrey,
>> >      Seems this patch is still not in amd-staging-drm-next?
>> >
>> > Best wishes
>> > Emily Deng
>> >
>> >
>> >
>> >> -----Original Message-----
>> >> From: Deng, Emily
>> >> Sent: Tuesday, November 26, 2019 4:41 PM
>> >> To: Grodzovsky, Andrey <Andrey.Grodzovsky@amd.com> 
>> <mailto:Andrey.Grodzovsky@amd.com>
>> >> Cc: dri-devel@lists.freedesktop.org 
>> <mailto:dri-devel@lists.freedesktop.org>; 
>> amd-gfx@lists.freedesktop.org <mailto:amd-gfx@lists.freedesktop.org>; 
>> Koenig,
>> >> Christian <Christian.Koenig@amd.com> 
>> <mailto:Christian.Koenig@amd.com>; steven.price@arm.com 
>> <mailto:steven.price@arm.com>
>> >> Subject: RE: [PATCH v4] drm/scheduler: Avoid accessing freed bad job.
>> >>
>> >> [AMD Official Use Only - Internal Distribution Only]
>> >>
>> >> Reviewed-by: Emily Deng <Emily.Deng@amd.com> 
>> <mailto:Emily.Deng@amd.com>
>> >>
>> >>> -----Original Message-----
>> >>> From: Grodzovsky, Andrey <Andrey.Grodzovsky@amd.com> 
>> <mailto:Andrey.Grodzovsky@amd.com>
>> >>> Sent: Tuesday, November 26, 2019 7:37 AM
>> >>> Cc: dri-devel@lists.freedesktop.org 
>> <mailto:dri-devel@lists.freedesktop.org>; 
>> amd-gfx@lists.freedesktop.org <mailto:amd-gfx@lists.freedesktop.org>;
>> >>> Koenig, Christian <Christian.Koenig@amd.com> 
>> <mailto:Christian.Koenig@amd.com>; Deng, Emily
>> >>> <Emily.Deng@amd.com> <mailto:Emily.Deng@amd.com>; 
>> steven.price@arm.com <mailto:steven.price@arm.com>
>> >>> Subject: Re: [PATCH v4] drm/scheduler: Avoid accessing freed bad job.
>> >>>
>> >>> Ping
>> >>>
>> >>> Andrey
>> >>>
>> >>> On 11/25/19 3:51 PM, Andrey Grodzovsky wrote:
>> >>>> Problem:
>> >>>> Due to a race between drm_sched_cleanup_jobs in sched thread and
>> >>>> drm_sched_job_timedout in timeout work there is a possiblity 
>> that bad
>> >>>> job was already freed while still being accessed from the timeout
>> >>>> thread.
>> >>>>
>> >>>> Fix:
>> >>>> Instead of just peeking at the bad job in the mirror list remove it
>> >>>> from the list under lock and then put it back later when we are
>> >>>> garanteed no race with main sched thread is possible which is after
>> >>>> the thread is parked.
>> >>>>
>> >>>> v2: Lock around processing ring_mirror_list in 
>> drm_sched_cleanup_jobs.
>> >>>>
>> >>>> v3: Rebase on top of drm-misc-next. v2 is not needed anymore as
>> >>>> drm_sched_get_cleanup_job already has a lock there.
>> >>>>
>> >>>> v4: Fix comments to relfect latest code in drm-misc.
>> >>>>
>> >>>> Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com> 
>> <mailto:andrey.grodzovsky@amd.com>
>> >>>> Reviewed-by: Christian König <christian.koenig@amd.com> 
>> <mailto:christian.koenig@amd.com>
>> >>>> Tested-by: Emily Deng <Emily.Deng@amd.com> 
>> <mailto:Emily.Deng@amd.com>
>> >>>> ---
>> >>>> drivers/gpu/drm/scheduler/sched_main.c | 27
>> >>> +++++++++++++++++++++++++++
>> >>>>    1 file changed, 27 insertions(+)
>> >>>>
>> >>>> diff --git a/drivers/gpu/drm/scheduler/sched_main.c
>> >>>> b/drivers/gpu/drm/scheduler/sched_main.c
>> >>>> index 6774955..1bf9c40 100644
>> >>>> --- a/drivers/gpu/drm/scheduler/sched_main.c
>> >>>> +++ b/drivers/gpu/drm/scheduler/sched_main.c
>> >>>> @@ -284,10 +284,21 @@ static void drm_sched_job_timedout(struct
>> >>> work_struct *work)
>> >>>>     unsigned long flags;
>> >>>>
>> >>>>     sched = container_of(work, struct drm_gpu_scheduler,
>> >>>> work_tdr.work);
>> >>>> +
>> >>>> +  /* Protects against concurrent deletion in
>> >>> drm_sched_get_cleanup_job */
>> >>>> + spin_lock_irqsave(&sched->job_list_lock, flags);
>> >>>>     job = list_first_entry_or_null(&sched->ring_mirror_list,
>> >>>>                                    struct drm_sched_job, node);
>> >>>>
>> >>>>     if (job) {
>> >>>> +          /*
>> >>>> +           * Remove the bad job so it cannot be freed by concurrent
>> >>>> +           * drm_sched_cleanup_jobs. It will be reinserted back 
>> after
>> >>> sched->thread
>> >>>> +           * is parked at which point it's safe.
>> >>>> +           */
>> >>>> + list_del_init(&job->node);
>> >>>> + spin_unlock_irqrestore(&sched->job_list_lock, flags);
>> >>>> +
>> >>>> job->sched->ops->timedout_job(job);
>> >>>>
>> >>>>             /*
>> >>>> @@ -298,6 +309,8 @@ static void drm_sched_job_timedout(struct
>> >>> work_struct *work)
>> >>>> job->sched->ops->free_job(job);
>> >>>> sched->free_guilty = false;
>> >>>>             }
>> >>>> +  } else {
>> >>>> + spin_unlock_irqrestore(&sched->job_list_lock, flags);
>> >>>>     }
>> >>>>
>> >>>> spin_lock_irqsave(&sched->job_list_lock, flags); @@ -370,6 +383,20
>> >>>> @@ void drm_sched_stop(struct drm_gpu_scheduler *sched, struct
>> >>> drm_sched_job *bad)
>> >>>> kthread_park(sched->thread);
>> >>>>
>> >>>>     /*
>> >>>> +   * Reinsert back the bad job here - now it's safe as
>> >>>> +   * drm_sched_get_cleanup_job cannot race against us and 
>> release the
>> >>>> +   * bad job at this point - we parked (waited for) any in progress
>> >>>> +   * (earlier) cleanups and drm_sched_get_cleanup_job will not be
>> >>> called
>> >>>> +   * now until the scheduler thread is unparked.
>> >>>> +   */
>> >>>> +  if (bad && bad->sched == sched)
>> >>>> +          /*
>> >>>> +           * Add at the head of the queue to reflect it was the 
>> earliest
>> >>>> +           * job extracted.
>> >>>> +           */
>> >>>> + list_add(&bad->node, &sched->ring_mirror_list);
>> >>>> +
>> >>>> +  /*
>> >>>>      * Iterate the job list from later to  earlier one and 
>> either deactive
>> >>>>      * their HW callbacks or remove them from mirror list if 
>> they already
>> >>>>      * signaled.

[-- Attachment #1.2: Type: text/html, Size: 22150 bytes --]

[-- Attachment #2: Type: text/plain, Size: 159 bytes --]

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 125+ messages in thread

* Re: [PATCH v4] drm/scheduler: Avoid accessing freed bad job.
@ 2019-12-03 20:32                     ` Andrey Grodzovsky
  0 siblings, 0 replies; 125+ messages in thread
From: Andrey Grodzovsky @ 2019-12-03 20:32 UTC (permalink / raw)
  To: Deucher, Alexander, Deng, Emily
  Cc: steven.price, amd-gfx, dri-devel, Koenig, Christian


[-- Attachment #1.1: Type: text/plain, Size: 8420 bytes --]

Turns out Steven's patch was already in so i just cherry-picked the 
change from drm-next-misc


Emily - it's in.


Andrey


On 12/3/19 2:59 PM, Deucher, Alexander wrote:
>
> [AMD Official Use Only - Internal Distribution Only]
>
>
> Cherry pick whatever dependencies you need or pick the older version 
> of the patch.  Either way works.
>
> Alex
> ------------------------------------------------------------------------
> *From:* Grodzovsky, Andrey <Andrey.Grodzovsky@amd.com>
> *Sent:* Tuesday, December 3, 2019 2:57 PM
> *To:* Deucher, Alexander <Alexander.Deucher@amd.com>; Deng, Emily 
> <Emily.Deng@amd.com>
> *Cc:* dri-devel@lists.freedesktop.org 
> <dri-devel@lists.freedesktop.org>; amd-gfx@lists.freedesktop.org 
> <amd-gfx@lists.freedesktop.org>; Koenig, Christian 
> <Christian.Koenig@amd.com>; steven.price@arm.com <steven.price@arm.com>
> *Subject:* Re: [PATCH v4] drm/scheduler: Avoid accessing freed bad job.
>
> I don't think i can apply this patch 'as is' as this has dependency on 
> patch by Steven which also wasn't applied yet - 588b982 Steven 
> Price        6 weeks ago    drm: Don't free jobs in 
> wait_event_interruptible()
>
>
> Andrey
>
>
> On 12/3/19 2:44 PM, Deucher, Alexander wrote:
>>
>> [AMD Official Use Only - Internal Distribution Only]
>>
>>
>> Please go ahead an apply whatever version is necessary for 
>> amd-staging-drm-next.
>>
>> Alex
>>
>> ------------------------------------------------------------------------
>> *From:* Grodzovsky, Andrey <Andrey.Grodzovsky@amd.com> 
>> <mailto:Andrey.Grodzovsky@amd.com>
>> *Sent:* Tuesday, December 3, 2019 2:10 PM
>> *To:* Deng, Emily <Emily.Deng@amd.com> <mailto:Emily.Deng@amd.com>; 
>> Deucher, Alexander <Alexander.Deucher@amd.com> 
>> <mailto:Alexander.Deucher@amd.com>
>> *Cc:* dri-devel@lists.freedesktop.org 
>> <mailto:dri-devel@lists.freedesktop.org> 
>> <dri-devel@lists.freedesktop.org> 
>> <mailto:dri-devel@lists.freedesktop.org>; 
>> amd-gfx@lists.freedesktop.org <mailto:amd-gfx@lists.freedesktop.org> 
>> <amd-gfx@lists.freedesktop.org> 
>> <mailto:amd-gfx@lists.freedesktop.org>; Koenig, Christian 
>> <Christian.Koenig@amd.com> <mailto:Christian.Koenig@amd.com>; 
>> steven.price@arm.com <mailto:steven.price@arm.com> 
>> <steven.price@arm.com> <mailto:steven.price@arm.com>
>> *Subject:* Re: [PATCH v4] drm/scheduler: Avoid accessing freed bad job.
>> Yes - Christian just pushed it to drm-next-misc - I guess Alex/Christian
>> didn't pull to amd-staging-drm-next yet.
>>
>> Andrey
>>
>> On 12/2/19 2:24 PM, Deng, Emily wrote:
>> > [AMD Official Use Only - Internal Distribution Only]
>> >
>> > Hi Andrey,
>> >      Seems this patch is still not in amd-staging-drm-next?
>> >
>> > Best wishes
>> > Emily Deng
>> >
>> >
>> >
>> >> -----Original Message-----
>> >> From: Deng, Emily
>> >> Sent: Tuesday, November 26, 2019 4:41 PM
>> >> To: Grodzovsky, Andrey <Andrey.Grodzovsky@amd.com> 
>> <mailto:Andrey.Grodzovsky@amd.com>
>> >> Cc: dri-devel@lists.freedesktop.org 
>> <mailto:dri-devel@lists.freedesktop.org>; 
>> amd-gfx@lists.freedesktop.org <mailto:amd-gfx@lists.freedesktop.org>; 
>> Koenig,
>> >> Christian <Christian.Koenig@amd.com> 
>> <mailto:Christian.Koenig@amd.com>; steven.price@arm.com 
>> <mailto:steven.price@arm.com>
>> >> Subject: RE: [PATCH v4] drm/scheduler: Avoid accessing freed bad job.
>> >>
>> >> [AMD Official Use Only - Internal Distribution Only]
>> >>
>> >> Reviewed-by: Emily Deng <Emily.Deng@amd.com> 
>> <mailto:Emily.Deng@amd.com>
>> >>
>> >>> -----Original Message-----
>> >>> From: Grodzovsky, Andrey <Andrey.Grodzovsky@amd.com> 
>> <mailto:Andrey.Grodzovsky@amd.com>
>> >>> Sent: Tuesday, November 26, 2019 7:37 AM
>> >>> Cc: dri-devel@lists.freedesktop.org 
>> <mailto:dri-devel@lists.freedesktop.org>; 
>> amd-gfx@lists.freedesktop.org <mailto:amd-gfx@lists.freedesktop.org>;
>> >>> Koenig, Christian <Christian.Koenig@amd.com> 
>> <mailto:Christian.Koenig@amd.com>; Deng, Emily
>> >>> <Emily.Deng@amd.com> <mailto:Emily.Deng@amd.com>; 
>> steven.price@arm.com <mailto:steven.price@arm.com>
>> >>> Subject: Re: [PATCH v4] drm/scheduler: Avoid accessing freed bad job.
>> >>>
>> >>> Ping
>> >>>
>> >>> Andrey
>> >>>
>> >>> On 11/25/19 3:51 PM, Andrey Grodzovsky wrote:
>> >>>> Problem:
>> >>>> Due to a race between drm_sched_cleanup_jobs in sched thread and
>> >>>> drm_sched_job_timedout in timeout work there is a possiblity 
>> that bad
>> >>>> job was already freed while still being accessed from the timeout
>> >>>> thread.
>> >>>>
>> >>>> Fix:
>> >>>> Instead of just peeking at the bad job in the mirror list remove it
>> >>>> from the list under lock and then put it back later when we are
>> >>>> garanteed no race with main sched thread is possible which is after
>> >>>> the thread is parked.
>> >>>>
>> >>>> v2: Lock around processing ring_mirror_list in 
>> drm_sched_cleanup_jobs.
>> >>>>
>> >>>> v3: Rebase on top of drm-misc-next. v2 is not needed anymore as
>> >>>> drm_sched_get_cleanup_job already has a lock there.
>> >>>>
>> >>>> v4: Fix comments to relfect latest code in drm-misc.
>> >>>>
>> >>>> Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com> 
>> <mailto:andrey.grodzovsky@amd.com>
>> >>>> Reviewed-by: Christian König <christian.koenig@amd.com> 
>> <mailto:christian.koenig@amd.com>
>> >>>> Tested-by: Emily Deng <Emily.Deng@amd.com> 
>> <mailto:Emily.Deng@amd.com>
>> >>>> ---
>> >>>> drivers/gpu/drm/scheduler/sched_main.c | 27
>> >>> +++++++++++++++++++++++++++
>> >>>>    1 file changed, 27 insertions(+)
>> >>>>
>> >>>> diff --git a/drivers/gpu/drm/scheduler/sched_main.c
>> >>>> b/drivers/gpu/drm/scheduler/sched_main.c
>> >>>> index 6774955..1bf9c40 100644
>> >>>> --- a/drivers/gpu/drm/scheduler/sched_main.c
>> >>>> +++ b/drivers/gpu/drm/scheduler/sched_main.c
>> >>>> @@ -284,10 +284,21 @@ static void drm_sched_job_timedout(struct
>> >>> work_struct *work)
>> >>>>     unsigned long flags;
>> >>>>
>> >>>>     sched = container_of(work, struct drm_gpu_scheduler,
>> >>>> work_tdr.work);
>> >>>> +
>> >>>> +  /* Protects against concurrent deletion in
>> >>> drm_sched_get_cleanup_job */
>> >>>> + spin_lock_irqsave(&sched->job_list_lock, flags);
>> >>>>     job = list_first_entry_or_null(&sched->ring_mirror_list,
>> >>>>                                    struct drm_sched_job, node);
>> >>>>
>> >>>>     if (job) {
>> >>>> +          /*
>> >>>> +           * Remove the bad job so it cannot be freed by concurrent
>> >>>> +           * drm_sched_cleanup_jobs. It will be reinserted back 
>> after
>> >>> sched->thread
>> >>>> +           * is parked at which point it's safe.
>> >>>> +           */
>> >>>> + list_del_init(&job->node);
>> >>>> + spin_unlock_irqrestore(&sched->job_list_lock, flags);
>> >>>> +
>> >>>> job->sched->ops->timedout_job(job);
>> >>>>
>> >>>>             /*
>> >>>> @@ -298,6 +309,8 @@ static void drm_sched_job_timedout(struct
>> >>> work_struct *work)
>> >>>> job->sched->ops->free_job(job);
>> >>>> sched->free_guilty = false;
>> >>>>             }
>> >>>> +  } else {
>> >>>> + spin_unlock_irqrestore(&sched->job_list_lock, flags);
>> >>>>     }
>> >>>>
>> >>>> spin_lock_irqsave(&sched->job_list_lock, flags); @@ -370,6 +383,20
>> >>>> @@ void drm_sched_stop(struct drm_gpu_scheduler *sched, struct
>> >>> drm_sched_job *bad)
>> >>>> kthread_park(sched->thread);
>> >>>>
>> >>>>     /*
>> >>>> +   * Reinsert back the bad job here - now it's safe as
>> >>>> +   * drm_sched_get_cleanup_job cannot race against us and 
>> release the
>> >>>> +   * bad job at this point - we parked (waited for) any in progress
>> >>>> +   * (earlier) cleanups and drm_sched_get_cleanup_job will not be
>> >>> called
>> >>>> +   * now until the scheduler thread is unparked.
>> >>>> +   */
>> >>>> +  if (bad && bad->sched == sched)
>> >>>> +          /*
>> >>>> +           * Add at the head of the queue to reflect it was the 
>> earliest
>> >>>> +           * job extracted.
>> >>>> +           */
>> >>>> + list_add(&bad->node, &sched->ring_mirror_list);
>> >>>> +
>> >>>> +  /*
>> >>>>      * Iterate the job list from later to  earlier one and 
>> either deactive
>> >>>>      * their HW callbacks or remove them from mirror list if 
>> they already
>> >>>>      * signaled.

[-- Attachment #1.2: Type: text/html, Size: 22150 bytes --]

[-- Attachment #2: Type: text/plain, Size: 153 bytes --]

_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 125+ messages in thread

* RE: [PATCH v4] drm/scheduler: Avoid accessing freed bad job.
  2019-12-03 20:32                     ` Andrey Grodzovsky
@ 2019-12-03 20:58                       ` Deng, Emily
  -1 siblings, 0 replies; 125+ messages in thread
From: Deng, Emily @ 2019-12-03 20:58 UTC (permalink / raw)
  To: Grodzovsky, Andrey, Deucher, Alexander
  Cc: steven.price, amd-gfx, dri-devel, Koenig,  Christian


[-- Attachment #1.1: Type: text/plain, Size: 8579 bytes --]

[AMD Official Use Only - Internal Distribution Only]

Hi Andrey,
    Thanks very much.

Best wishes
Emily Deng
From: Grodzovsky, Andrey <Andrey.Grodzovsky@amd.com>
Sent: Tuesday, December 3, 2019 12:33 PM
To: Deucher, Alexander <Alexander.Deucher@amd.com>; Deng, Emily <Emily.Deng@amd.com>
Cc: dri-devel@lists.freedesktop.org; amd-gfx@lists.freedesktop.org; Koenig, Christian <Christian.Koenig@amd.com>; steven.price@arm.com
Subject: Re: [PATCH v4] drm/scheduler: Avoid accessing freed bad job.


Turns out Steven's patch was already in so i just cherry-picked the change from drm-next-misc



Emily - it's in.



Andrey


On 12/3/19 2:59 PM, Deucher, Alexander wrote:

[AMD Official Use Only - Internal Distribution Only]

Cherry pick whatever dependencies you need or pick the older version of the patch.  Either way works.

Alex
________________________________
From: Grodzovsky, Andrey <Andrey.Grodzovsky@amd.com><mailto:Andrey.Grodzovsky@amd.com>
Sent: Tuesday, December 3, 2019 2:57 PM
To: Deucher, Alexander <Alexander.Deucher@amd.com><mailto:Alexander.Deucher@amd.com>; Deng, Emily <Emily.Deng@amd.com><mailto:Emily.Deng@amd.com>
Cc: dri-devel@lists.freedesktop.org<mailto:dri-devel@lists.freedesktop.org> <dri-devel@lists.freedesktop.org><mailto:dri-devel@lists.freedesktop.org>; amd-gfx@lists.freedesktop.org<mailto:amd-gfx@lists.freedesktop.org> <amd-gfx@lists.freedesktop.org><mailto:amd-gfx@lists.freedesktop.org>; Koenig, Christian <Christian.Koenig@amd.com><mailto:Christian.Koenig@amd.com>; steven.price@arm.com<mailto:steven.price@arm.com> <steven.price@arm.com><mailto:steven.price@arm.com>
Subject: Re: [PATCH v4] drm/scheduler: Avoid accessing freed bad job.


I don't think i can apply this patch 'as is' as this has dependency on patch by Steven which also wasn't applied yet - 588b982 Steven Price        6 weeks ago    drm: Don't free jobs in wait_event_interruptible()



Andrey


On 12/3/19 2:44 PM, Deucher, Alexander wrote:

[AMD Official Use Only - Internal Distribution Only]

Please go ahead an apply whatever version is necessary for amd-staging-drm-next.

Alex

________________________________
From: Grodzovsky, Andrey <Andrey.Grodzovsky@amd.com><mailto:Andrey.Grodzovsky@amd.com>
Sent: Tuesday, December 3, 2019 2:10 PM
To: Deng, Emily <Emily.Deng@amd.com><mailto:Emily.Deng@amd.com>; Deucher, Alexander <Alexander.Deucher@amd.com><mailto:Alexander.Deucher@amd.com>
Cc: dri-devel@lists.freedesktop.org<mailto:dri-devel@lists.freedesktop.org> <dri-devel@lists.freedesktop.org><mailto:dri-devel@lists.freedesktop.org>; amd-gfx@lists.freedesktop.org<mailto:amd-gfx@lists.freedesktop.org> <amd-gfx@lists.freedesktop.org><mailto:amd-gfx@lists.freedesktop.org>; Koenig, Christian <Christian.Koenig@amd.com><mailto:Christian.Koenig@amd.com>; steven.price@arm.com<mailto:steven.price@arm.com> <steven.price@arm.com><mailto:steven.price@arm.com>
Subject: Re: [PATCH v4] drm/scheduler: Avoid accessing freed bad job.

Yes - Christian just pushed it to drm-next-misc - I guess Alex/Christian
didn't pull to amd-staging-drm-next yet.

Andrey

On 12/2/19 2:24 PM, Deng, Emily wrote:
> [AMD Official Use Only - Internal Distribution Only]
>
> Hi Andrey,
>      Seems this patch is still not in amd-staging-drm-next?
>
> Best wishes
> Emily Deng
>
>
>
>> -----Original Message-----
>> From: Deng, Emily
>> Sent: Tuesday, November 26, 2019 4:41 PM
>> To: Grodzovsky, Andrey <Andrey.Grodzovsky@amd.com><mailto:Andrey.Grodzovsky@amd.com>
>> Cc: dri-devel@lists.freedesktop.org<mailto:dri-devel@lists.freedesktop.org>; amd-gfx@lists.freedesktop.org<mailto:amd-gfx@lists.freedesktop.org>; Koenig,
>> Christian <Christian.Koenig@amd.com><mailto:Christian.Koenig@amd.com>; steven.price@arm.com<mailto:steven.price@arm.com>
>> Subject: RE: [PATCH v4] drm/scheduler: Avoid accessing freed bad job.
>>
>> [AMD Official Use Only - Internal Distribution Only]
>>
>> Reviewed-by: Emily Deng <Emily.Deng@amd.com><mailto:Emily.Deng@amd.com>
>>
>>> -----Original Message-----
>>> From: Grodzovsky, Andrey <Andrey.Grodzovsky@amd.com><mailto:Andrey.Grodzovsky@amd.com>
>>> Sent: Tuesday, November 26, 2019 7:37 AM
>>> Cc: dri-devel@lists.freedesktop.org<mailto:dri-devel@lists.freedesktop.org>; amd-gfx@lists.freedesktop.org<mailto:amd-gfx@lists.freedesktop.org>;
>>> Koenig, Christian <Christian.Koenig@amd.com><mailto:Christian.Koenig@amd.com>; Deng, Emily
>>> <Emily.Deng@amd.com><mailto:Emily.Deng@amd.com>; steven.price@arm.com<mailto:steven.price@arm.com>
>>> Subject: Re: [PATCH v4] drm/scheduler: Avoid accessing freed bad job.
>>>
>>> Ping
>>>
>>> Andrey
>>>
>>> On 11/25/19 3:51 PM, Andrey Grodzovsky wrote:
>>>> Problem:
>>>> Due to a race between drm_sched_cleanup_jobs in sched thread and
>>>> drm_sched_job_timedout in timeout work there is a possiblity that bad
>>>> job was already freed while still being accessed from the timeout
>>>> thread.
>>>>
>>>> Fix:
>>>> Instead of just peeking at the bad job in the mirror list remove it
>>>> from the list under lock and then put it back later when we are
>>>> garanteed no race with main sched thread is possible which is after
>>>> the thread is parked.
>>>>
>>>> v2: Lock around processing ring_mirror_list in drm_sched_cleanup_jobs.
>>>>
>>>> v3: Rebase on top of drm-misc-next. v2 is not needed anymore as
>>>> drm_sched_get_cleanup_job already has a lock there.
>>>>
>>>> v4: Fix comments to relfect latest code in drm-misc.
>>>>
>>>> Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com><mailto:andrey.grodzovsky@amd.com>
>>>> Reviewed-by: Christian König <christian.koenig@amd.com><mailto:christian.koenig@amd.com>
>>>> Tested-by: Emily Deng <Emily.Deng@amd.com><mailto:Emily.Deng@amd.com>
>>>> ---
>>>>    drivers/gpu/drm/scheduler/sched_main.c | 27
>>> +++++++++++++++++++++++++++
>>>>    1 file changed, 27 insertions(+)
>>>>
>>>> diff --git a/drivers/gpu/drm/scheduler/sched_main.c
>>>> b/drivers/gpu/drm/scheduler/sched_main.c
>>>> index 6774955..1bf9c40 100644
>>>> --- a/drivers/gpu/drm/scheduler/sched_main.c
>>>> +++ b/drivers/gpu/drm/scheduler/sched_main.c
>>>> @@ -284,10 +284,21 @@ static void drm_sched_job_timedout(struct
>>> work_struct *work)
>>>>     unsigned long flags;
>>>>
>>>>     sched = container_of(work, struct drm_gpu_scheduler,
>>>> work_tdr.work);
>>>> +
>>>> +  /* Protects against concurrent deletion in
>>> drm_sched_get_cleanup_job */
>>>> +  spin_lock_irqsave(&sched->job_list_lock, flags);
>>>>     job = list_first_entry_or_null(&sched->ring_mirror_list,
>>>>                                    struct drm_sched_job, node);
>>>>
>>>>     if (job) {
>>>> +          /*
>>>> +           * Remove the bad job so it cannot be freed by concurrent
>>>> +           * drm_sched_cleanup_jobs. It will be reinserted back after
>>> sched->thread
>>>> +           * is parked at which point it's safe.
>>>> +           */
>>>> +          list_del_init(&job->node);
>>>> +          spin_unlock_irqrestore(&sched->job_list_lock, flags);
>>>> +
>>>>             job->sched->ops->timedout_job(job);
>>>>
>>>>             /*
>>>> @@ -298,6 +309,8 @@ static void drm_sched_job_timedout(struct
>>> work_struct *work)
>>>>                     job->sched->ops->free_job(job);
>>>>                     sched->free_guilty = false;
>>>>             }
>>>> +  } else {
>>>> +          spin_unlock_irqrestore(&sched->job_list_lock, flags);
>>>>     }
>>>>
>>>>     spin_lock_irqsave(&sched->job_list_lock, flags); @@ -370,6 +383,20
>>>> @@ void drm_sched_stop(struct drm_gpu_scheduler *sched, struct
>>> drm_sched_job *bad)
>>>>     kthread_park(sched->thread);
>>>>
>>>>     /*
>>>> +   * Reinsert back the bad job here - now it's safe as
>>>> +   * drm_sched_get_cleanup_job cannot race against us and release the
>>>> +   * bad job at this point - we parked (waited for) any in progress
>>>> +   * (earlier) cleanups and drm_sched_get_cleanup_job will not be
>>> called
>>>> +   * now until the scheduler thread is unparked.
>>>> +   */
>>>> +  if (bad && bad->sched == sched)
>>>> +          /*
>>>> +           * Add at the head of the queue to reflect it was the earliest
>>>> +           * job extracted.
>>>> +           */
>>>> +          list_add(&bad->node, &sched->ring_mirror_list);
>>>> +
>>>> +  /*
>>>>      * Iterate the job list from later to  earlier one and either deactive
>>>>      * their HW callbacks or remove them from mirror list if they already
>>>>      * signaled.

[-- Attachment #1.2: Type: text/html, Size: 18929 bytes --]

[-- Attachment #2: Type: text/plain, Size: 159 bytes --]

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 125+ messages in thread

* RE: [PATCH v4] drm/scheduler: Avoid accessing freed bad job.
@ 2019-12-03 20:58                       ` Deng, Emily
  0 siblings, 0 replies; 125+ messages in thread
From: Deng, Emily @ 2019-12-03 20:58 UTC (permalink / raw)
  To: Grodzovsky, Andrey, Deucher, Alexander
  Cc: steven.price, amd-gfx, dri-devel, Koenig,  Christian


[-- Attachment #1.1: Type: text/plain, Size: 8579 bytes --]

[AMD Official Use Only - Internal Distribution Only]

Hi Andrey,
    Thanks very much.

Best wishes
Emily Deng
From: Grodzovsky, Andrey <Andrey.Grodzovsky@amd.com>
Sent: Tuesday, December 3, 2019 12:33 PM
To: Deucher, Alexander <Alexander.Deucher@amd.com>; Deng, Emily <Emily.Deng@amd.com>
Cc: dri-devel@lists.freedesktop.org; amd-gfx@lists.freedesktop.org; Koenig, Christian <Christian.Koenig@amd.com>; steven.price@arm.com
Subject: Re: [PATCH v4] drm/scheduler: Avoid accessing freed bad job.


Turns out Steven's patch was already in so i just cherry-picked the change from drm-next-misc



Emily - it's in.



Andrey


On 12/3/19 2:59 PM, Deucher, Alexander wrote:

[AMD Official Use Only - Internal Distribution Only]

Cherry pick whatever dependencies you need or pick the older version of the patch.  Either way works.

Alex
________________________________
From: Grodzovsky, Andrey <Andrey.Grodzovsky@amd.com><mailto:Andrey.Grodzovsky@amd.com>
Sent: Tuesday, December 3, 2019 2:57 PM
To: Deucher, Alexander <Alexander.Deucher@amd.com><mailto:Alexander.Deucher@amd.com>; Deng, Emily <Emily.Deng@amd.com><mailto:Emily.Deng@amd.com>
Cc: dri-devel@lists.freedesktop.org<mailto:dri-devel@lists.freedesktop.org> <dri-devel@lists.freedesktop.org><mailto:dri-devel@lists.freedesktop.org>; amd-gfx@lists.freedesktop.org<mailto:amd-gfx@lists.freedesktop.org> <amd-gfx@lists.freedesktop.org><mailto:amd-gfx@lists.freedesktop.org>; Koenig, Christian <Christian.Koenig@amd.com><mailto:Christian.Koenig@amd.com>; steven.price@arm.com<mailto:steven.price@arm.com> <steven.price@arm.com><mailto:steven.price@arm.com>
Subject: Re: [PATCH v4] drm/scheduler: Avoid accessing freed bad job.


I don't think i can apply this patch 'as is' as this has dependency on patch by Steven which also wasn't applied yet - 588b982 Steven Price        6 weeks ago    drm: Don't free jobs in wait_event_interruptible()



Andrey


On 12/3/19 2:44 PM, Deucher, Alexander wrote:

[AMD Official Use Only - Internal Distribution Only]

Please go ahead an apply whatever version is necessary for amd-staging-drm-next.

Alex

________________________________
From: Grodzovsky, Andrey <Andrey.Grodzovsky@amd.com><mailto:Andrey.Grodzovsky@amd.com>
Sent: Tuesday, December 3, 2019 2:10 PM
To: Deng, Emily <Emily.Deng@amd.com><mailto:Emily.Deng@amd.com>; Deucher, Alexander <Alexander.Deucher@amd.com><mailto:Alexander.Deucher@amd.com>
Cc: dri-devel@lists.freedesktop.org<mailto:dri-devel@lists.freedesktop.org> <dri-devel@lists.freedesktop.org><mailto:dri-devel@lists.freedesktop.org>; amd-gfx@lists.freedesktop.org<mailto:amd-gfx@lists.freedesktop.org> <amd-gfx@lists.freedesktop.org><mailto:amd-gfx@lists.freedesktop.org>; Koenig, Christian <Christian.Koenig@amd.com><mailto:Christian.Koenig@amd.com>; steven.price@arm.com<mailto:steven.price@arm.com> <steven.price@arm.com><mailto:steven.price@arm.com>
Subject: Re: [PATCH v4] drm/scheduler: Avoid accessing freed bad job.

Yes - Christian just pushed it to drm-next-misc - I guess Alex/Christian
didn't pull to amd-staging-drm-next yet.

Andrey

On 12/2/19 2:24 PM, Deng, Emily wrote:
> [AMD Official Use Only - Internal Distribution Only]
>
> Hi Andrey,
>      Seems this patch is still not in amd-staging-drm-next?
>
> Best wishes
> Emily Deng
>
>
>
>> -----Original Message-----
>> From: Deng, Emily
>> Sent: Tuesday, November 26, 2019 4:41 PM
>> To: Grodzovsky, Andrey <Andrey.Grodzovsky@amd.com><mailto:Andrey.Grodzovsky@amd.com>
>> Cc: dri-devel@lists.freedesktop.org<mailto:dri-devel@lists.freedesktop.org>; amd-gfx@lists.freedesktop.org<mailto:amd-gfx@lists.freedesktop.org>; Koenig,
>> Christian <Christian.Koenig@amd.com><mailto:Christian.Koenig@amd.com>; steven.price@arm.com<mailto:steven.price@arm.com>
>> Subject: RE: [PATCH v4] drm/scheduler: Avoid accessing freed bad job.
>>
>> [AMD Official Use Only - Internal Distribution Only]
>>
>> Reviewed-by: Emily Deng <Emily.Deng@amd.com><mailto:Emily.Deng@amd.com>
>>
>>> -----Original Message-----
>>> From: Grodzovsky, Andrey <Andrey.Grodzovsky@amd.com><mailto:Andrey.Grodzovsky@amd.com>
>>> Sent: Tuesday, November 26, 2019 7:37 AM
>>> Cc: dri-devel@lists.freedesktop.org<mailto:dri-devel@lists.freedesktop.org>; amd-gfx@lists.freedesktop.org<mailto:amd-gfx@lists.freedesktop.org>;
>>> Koenig, Christian <Christian.Koenig@amd.com><mailto:Christian.Koenig@amd.com>; Deng, Emily
>>> <Emily.Deng@amd.com><mailto:Emily.Deng@amd.com>; steven.price@arm.com<mailto:steven.price@arm.com>
>>> Subject: Re: [PATCH v4] drm/scheduler: Avoid accessing freed bad job.
>>>
>>> Ping
>>>
>>> Andrey
>>>
>>> On 11/25/19 3:51 PM, Andrey Grodzovsky wrote:
>>>> Problem:
>>>> Due to a race between drm_sched_cleanup_jobs in sched thread and
>>>> drm_sched_job_timedout in timeout work there is a possiblity that bad
>>>> job was already freed while still being accessed from the timeout
>>>> thread.
>>>>
>>>> Fix:
>>>> Instead of just peeking at the bad job in the mirror list remove it
>>>> from the list under lock and then put it back later when we are
>>>> garanteed no race with main sched thread is possible which is after
>>>> the thread is parked.
>>>>
>>>> v2: Lock around processing ring_mirror_list in drm_sched_cleanup_jobs.
>>>>
>>>> v3: Rebase on top of drm-misc-next. v2 is not needed anymore as
>>>> drm_sched_get_cleanup_job already has a lock there.
>>>>
>>>> v4: Fix comments to relfect latest code in drm-misc.
>>>>
>>>> Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com><mailto:andrey.grodzovsky@amd.com>
>>>> Reviewed-by: Christian König <christian.koenig@amd.com><mailto:christian.koenig@amd.com>
>>>> Tested-by: Emily Deng <Emily.Deng@amd.com><mailto:Emily.Deng@amd.com>
>>>> ---
>>>>    drivers/gpu/drm/scheduler/sched_main.c | 27
>>> +++++++++++++++++++++++++++
>>>>    1 file changed, 27 insertions(+)
>>>>
>>>> diff --git a/drivers/gpu/drm/scheduler/sched_main.c
>>>> b/drivers/gpu/drm/scheduler/sched_main.c
>>>> index 6774955..1bf9c40 100644
>>>> --- a/drivers/gpu/drm/scheduler/sched_main.c
>>>> +++ b/drivers/gpu/drm/scheduler/sched_main.c
>>>> @@ -284,10 +284,21 @@ static void drm_sched_job_timedout(struct
>>> work_struct *work)
>>>>     unsigned long flags;
>>>>
>>>>     sched = container_of(work, struct drm_gpu_scheduler,
>>>> work_tdr.work);
>>>> +
>>>> +  /* Protects against concurrent deletion in
>>> drm_sched_get_cleanup_job */
>>>> +  spin_lock_irqsave(&sched->job_list_lock, flags);
>>>>     job = list_first_entry_or_null(&sched->ring_mirror_list,
>>>>                                    struct drm_sched_job, node);
>>>>
>>>>     if (job) {
>>>> +          /*
>>>> +           * Remove the bad job so it cannot be freed by concurrent
>>>> +           * drm_sched_cleanup_jobs. It will be reinserted back after
>>> sched->thread
>>>> +           * is parked at which point it's safe.
>>>> +           */
>>>> +          list_del_init(&job->node);
>>>> +          spin_unlock_irqrestore(&sched->job_list_lock, flags);
>>>> +
>>>>             job->sched->ops->timedout_job(job);
>>>>
>>>>             /*
>>>> @@ -298,6 +309,8 @@ static void drm_sched_job_timedout(struct
>>> work_struct *work)
>>>>                     job->sched->ops->free_job(job);
>>>>                     sched->free_guilty = false;
>>>>             }
>>>> +  } else {
>>>> +          spin_unlock_irqrestore(&sched->job_list_lock, flags);
>>>>     }
>>>>
>>>>     spin_lock_irqsave(&sched->job_list_lock, flags); @@ -370,6 +383,20
>>>> @@ void drm_sched_stop(struct drm_gpu_scheduler *sched, struct
>>> drm_sched_job *bad)
>>>>     kthread_park(sched->thread);
>>>>
>>>>     /*
>>>> +   * Reinsert back the bad job here - now it's safe as
>>>> +   * drm_sched_get_cleanup_job cannot race against us and release the
>>>> +   * bad job at this point - we parked (waited for) any in progress
>>>> +   * (earlier) cleanups and drm_sched_get_cleanup_job will not be
>>> called
>>>> +   * now until the scheduler thread is unparked.
>>>> +   */
>>>> +  if (bad && bad->sched == sched)
>>>> +          /*
>>>> +           * Add at the head of the queue to reflect it was the earliest
>>>> +           * job extracted.
>>>> +           */
>>>> +          list_add(&bad->node, &sched->ring_mirror_list);
>>>> +
>>>> +  /*
>>>>      * Iterate the job list from later to  earlier one and either deactive
>>>>      * their HW callbacks or remove them from mirror list if they already
>>>>      * signaled.

[-- Attachment #1.2: Type: text/html, Size: 18929 bytes --]

[-- Attachment #2: Type: text/plain, Size: 153 bytes --]

_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 125+ messages in thread

* Re: [PATCH v4] drm/scheduler: Avoid accessing freed bad job.
  2019-11-25 20:51 ` Andrey Grodzovsky
@ 2020-02-05 18:24   ` Lucas Stach
  -1 siblings, 0 replies; 125+ messages in thread
From: Lucas Stach @ 2020-02-05 18:24 UTC (permalink / raw)
  To: Andrey Grodzovsky
  Cc: Emily.Deng, Christian.Koenig, dri-devel, amd-gfx, steven.price

Hi Andrey,

This commit breaks all drivers, which may bail out of the timeout
processing as they wish to extend the timeout (etnaviv, v3d).

Those drivers currently just return from the timeout handler before
calling drm_sched_stop(), which means with this commit applied we are
removing the first job from the ring_mirror_list, but never put it
back. This leads to jobs getting lost from the ring mirror, which then
causes quite a bit of fallout like unsignaled fences.

Not sure yet what to do about it, we can either add a function to add
the job back to the ring_mirror if the driver wants to extend the
timeout, or we could look for another way to stop
drm_sched_cleanup_jobs from freeing jobs that are currently in timeout
processing.

Regards,
Lucas

On Mo, 2019-11-25 at 15:51 -0500, Andrey Grodzovsky wrote:
> Problem:
> Due to a race between drm_sched_cleanup_jobs in sched thread and
> drm_sched_job_timedout in timeout work there is a possiblity that
> bad job was already freed while still being accessed from the
> timeout thread.
> 
> Fix:
> Instead of just peeking at the bad job in the mirror list
> remove it from the list under lock and then put it back later when
> we are garanteed no race with main sched thread is possible which
> is after the thread is parked.
> 
> v2: Lock around processing ring_mirror_list in drm_sched_cleanup_jobs.
> 
> v3: Rebase on top of drm-misc-next. v2 is not needed anymore as
> drm_sched_get_cleanup_job already has a lock there.
> 
> v4: Fix comments to relfect latest code in drm-misc.
> 
> Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
> Reviewed-by: Christian König <christian.koenig@amd.com>
> Tested-by: Emily Deng <Emily.Deng@amd.com>
> ---
>  drivers/gpu/drm/scheduler/sched_main.c | 27 +++++++++++++++++++++++++++
>  1 file changed, 27 insertions(+)
> 
> diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c
> index 6774955..1bf9c40 100644
> --- a/drivers/gpu/drm/scheduler/sched_main.c
> +++ b/drivers/gpu/drm/scheduler/sched_main.c
> @@ -284,10 +284,21 @@ static void drm_sched_job_timedout(struct work_struct *work)
>  	unsigned long flags;
>  
>  	sched = container_of(work, struct drm_gpu_scheduler, work_tdr.work);
> +
> +	/* Protects against concurrent deletion in drm_sched_get_cleanup_job */
> +	spin_lock_irqsave(&sched->job_list_lock, flags);
>  	job = list_first_entry_or_null(&sched->ring_mirror_list,
>  				       struct drm_sched_job, node);
>  
>  	if (job) {
> +		/*
> +		 * Remove the bad job so it cannot be freed by concurrent
> +		 * drm_sched_cleanup_jobs. It will be reinserted back after sched->thread
> +		 * is parked at which point it's safe.
> +		 */
> +		list_del_init(&job->node);
> +		spin_unlock_irqrestore(&sched->job_list_lock, flags);
> +
>  		job->sched->ops->timedout_job(job);
>  
>  		/*
> @@ -298,6 +309,8 @@ static void drm_sched_job_timedout(struct work_struct *work)
>  			job->sched->ops->free_job(job);
>  			sched->free_guilty = false;
>  		}
> +	} else {
> +		spin_unlock_irqrestore(&sched->job_list_lock, flags);
>  	}
>  
>  	spin_lock_irqsave(&sched->job_list_lock, flags);
> @@ -370,6 +383,20 @@ void drm_sched_stop(struct drm_gpu_scheduler *sched, struct drm_sched_job *bad)
>  	kthread_park(sched->thread);
>  
>  	/*
> +	 * Reinsert back the bad job here - now it's safe as
> +	 * drm_sched_get_cleanup_job cannot race against us and release the
> +	 * bad job at this point - we parked (waited for) any in progress
> +	 * (earlier) cleanups and drm_sched_get_cleanup_job will not be called
> +	 * now until the scheduler thread is unparked.
> +	 */
> +	if (bad && bad->sched == sched)
> +		/*
> +		 * Add at the head of the queue to reflect it was the earliest
> +		 * job extracted.
> +		 */
> +		list_add(&bad->node, &sched->ring_mirror_list);
> +
> +	/*
>  	 * Iterate the job list from later to  earlier one and either deactive
>  	 * their HW callbacks or remove them from mirror list if they already
>  	 * signaled.

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 125+ messages in thread

* Re: [PATCH v4] drm/scheduler: Avoid accessing freed bad job.
@ 2020-02-05 18:24   ` Lucas Stach
  0 siblings, 0 replies; 125+ messages in thread
From: Lucas Stach @ 2020-02-05 18:24 UTC (permalink / raw)
  To: Andrey Grodzovsky
  Cc: Emily.Deng, Christian.Koenig, dri-devel, amd-gfx, steven.price

Hi Andrey,

This commit breaks all drivers, which may bail out of the timeout
processing as they wish to extend the timeout (etnaviv, v3d).

Those drivers currently just return from the timeout handler before
calling drm_sched_stop(), which means with this commit applied we are
removing the first job from the ring_mirror_list, but never put it
back. This leads to jobs getting lost from the ring mirror, which then
causes quite a bit of fallout like unsignaled fences.

Not sure yet what to do about it, we can either add a function to add
the job back to the ring_mirror if the driver wants to extend the
timeout, or we could look for another way to stop
drm_sched_cleanup_jobs from freeing jobs that are currently in timeout
processing.

Regards,
Lucas

On Mo, 2019-11-25 at 15:51 -0500, Andrey Grodzovsky wrote:
> Problem:
> Due to a race between drm_sched_cleanup_jobs in sched thread and
> drm_sched_job_timedout in timeout work there is a possiblity that
> bad job was already freed while still being accessed from the
> timeout thread.
> 
> Fix:
> Instead of just peeking at the bad job in the mirror list
> remove it from the list under lock and then put it back later when
> we are garanteed no race with main sched thread is possible which
> is after the thread is parked.
> 
> v2: Lock around processing ring_mirror_list in drm_sched_cleanup_jobs.
> 
> v3: Rebase on top of drm-misc-next. v2 is not needed anymore as
> drm_sched_get_cleanup_job already has a lock there.
> 
> v4: Fix comments to relfect latest code in drm-misc.
> 
> Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
> Reviewed-by: Christian König <christian.koenig@amd.com>
> Tested-by: Emily Deng <Emily.Deng@amd.com>
> ---
>  drivers/gpu/drm/scheduler/sched_main.c | 27 +++++++++++++++++++++++++++
>  1 file changed, 27 insertions(+)
> 
> diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c
> index 6774955..1bf9c40 100644
> --- a/drivers/gpu/drm/scheduler/sched_main.c
> +++ b/drivers/gpu/drm/scheduler/sched_main.c
> @@ -284,10 +284,21 @@ static void drm_sched_job_timedout(struct work_struct *work)
>  	unsigned long flags;
>  
>  	sched = container_of(work, struct drm_gpu_scheduler, work_tdr.work);
> +
> +	/* Protects against concurrent deletion in drm_sched_get_cleanup_job */
> +	spin_lock_irqsave(&sched->job_list_lock, flags);
>  	job = list_first_entry_or_null(&sched->ring_mirror_list,
>  				       struct drm_sched_job, node);
>  
>  	if (job) {
> +		/*
> +		 * Remove the bad job so it cannot be freed by concurrent
> +		 * drm_sched_cleanup_jobs. It will be reinserted back after sched->thread
> +		 * is parked at which point it's safe.
> +		 */
> +		list_del_init(&job->node);
> +		spin_unlock_irqrestore(&sched->job_list_lock, flags);
> +
>  		job->sched->ops->timedout_job(job);
>  
>  		/*
> @@ -298,6 +309,8 @@ static void drm_sched_job_timedout(struct work_struct *work)
>  			job->sched->ops->free_job(job);
>  			sched->free_guilty = false;
>  		}
> +	} else {
> +		spin_unlock_irqrestore(&sched->job_list_lock, flags);
>  	}
>  
>  	spin_lock_irqsave(&sched->job_list_lock, flags);
> @@ -370,6 +383,20 @@ void drm_sched_stop(struct drm_gpu_scheduler *sched, struct drm_sched_job *bad)
>  	kthread_park(sched->thread);
>  
>  	/*
> +	 * Reinsert back the bad job here - now it's safe as
> +	 * drm_sched_get_cleanup_job cannot race against us and release the
> +	 * bad job at this point - we parked (waited for) any in progress
> +	 * (earlier) cleanups and drm_sched_get_cleanup_job will not be called
> +	 * now until the scheduler thread is unparked.
> +	 */
> +	if (bad && bad->sched == sched)
> +		/*
> +		 * Add at the head of the queue to reflect it was the earliest
> +		 * job extracted.
> +		 */
> +		list_add(&bad->node, &sched->ring_mirror_list);
> +
> +	/*
>  	 * Iterate the job list from later to  earlier one and either deactive
>  	 * their HW callbacks or remove them from mirror list if they already
>  	 * signaled.

_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 125+ messages in thread

* Re: [PATCH v4] drm/scheduler: Avoid accessing freed bad job.
  2020-02-05 18:24   ` Lucas Stach
@ 2020-02-06 11:10     ` Lucas Stach
  -1 siblings, 0 replies; 125+ messages in thread
From: Lucas Stach @ 2020-02-06 11:10 UTC (permalink / raw)
  To: Andrey Grodzovsky, Christian.Koenig
  Cc: Emily.Deng, amd-gfx, dri-devel, steven.price

Hi all,

On Mi, 2020-02-05 at 19:24 +0100, Lucas Stach wrote:
> Hi Andrey,
> 
> This commit breaks all drivers, which may bail out of the timeout
> processing as they wish to extend the timeout (etnaviv, v3d).
> 
> Those drivers currently just return from the timeout handler before
> calling drm_sched_stop(), which means with this commit applied we are
> removing the first job from the ring_mirror_list, but never put it
> back. This leads to jobs getting lost from the ring mirror, which then
> causes quite a bit of fallout like unsignaled fences.
> 
> Not sure yet what to do about it, we can either add a function to add
> the job back to the ring_mirror if the driver wants to extend the
> timeout, or we could look for another way to stop
> drm_sched_cleanup_jobs from freeing jobs that are currently in timeout
> processing.

So after thinking about this a bit more my opinion is that we need to
revert this change for now and go back to the drawing board for the
scheduler timeout handling.

Right now this starts to feel like a big midlayer mistake with all the
very intricate intertwining between the drivers and the scheduler. The
rules on when it's safe to manipulate the ring mirror and when
completed jobs are signaled and freed are not really well specified.
The fact that we need to mutate state in order to get rid of races
instead of having a single big "timeout processing is owner of the
scheduler state for now" is a big fat warning sign IMHO.

It took me far longer than I'd like to admit to understand the failure
mode with fences not getting signaled after a GPU hang. The back and
forth between scheduler and driver code makes things really hard to
follow.

Regards,
Lucas

> Regards,
> Lucas
> 
> On Mo, 2019-11-25 at 15:51 -0500, Andrey Grodzovsky wrote:
> > Problem:
> > Due to a race between drm_sched_cleanup_jobs in sched thread and
> > drm_sched_job_timedout in timeout work there is a possiblity that
> > bad job was already freed while still being accessed from the
> > timeout thread.
> > 
> > Fix:
> > Instead of just peeking at the bad job in the mirror list
> > remove it from the list under lock and then put it back later when
> > we are garanteed no race with main sched thread is possible which
> > is after the thread is parked.
> > 
> > v2: Lock around processing ring_mirror_list in drm_sched_cleanup_jobs.
> > 
> > v3: Rebase on top of drm-misc-next. v2 is not needed anymore as
> > drm_sched_get_cleanup_job already has a lock there.
> > 
> > v4: Fix comments to relfect latest code in drm-misc.
> > 
> > Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
> > Reviewed-by: Christian König <christian.koenig@amd.com>
> > Tested-by: Emily Deng <Emily.Deng@amd.com>
> > ---
> >  drivers/gpu/drm/scheduler/sched_main.c | 27 +++++++++++++++++++++++++++
> >  1 file changed, 27 insertions(+)
> > 
> > diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c
> > index 6774955..1bf9c40 100644
> > --- a/drivers/gpu/drm/scheduler/sched_main.c
> > +++ b/drivers/gpu/drm/scheduler/sched_main.c
> > @@ -284,10 +284,21 @@ static void drm_sched_job_timedout(struct work_struct *work)
> >  	unsigned long flags;
> >  
> >  	sched = container_of(work, struct drm_gpu_scheduler, work_tdr.work);
> > +
> > +	/* Protects against concurrent deletion in drm_sched_get_cleanup_job */
> > +	spin_lock_irqsave(&sched->job_list_lock, flags);
> >  	job = list_first_entry_or_null(&sched->ring_mirror_list,
> >  				       struct drm_sched_job, node);
> >  
> >  	if (job) {
> > +		/*
> > +		 * Remove the bad job so it cannot be freed by concurrent
> > +		 * drm_sched_cleanup_jobs. It will be reinserted back after sched->thread
> > +		 * is parked at which point it's safe.
> > +		 */
> > +		list_del_init(&job->node);
> > +		spin_unlock_irqrestore(&sched->job_list_lock, flags);
> > +
> >  		job->sched->ops->timedout_job(job);
> >  
> >  		/*
> > @@ -298,6 +309,8 @@ static void drm_sched_job_timedout(struct work_struct *work)
> >  			job->sched->ops->free_job(job);
> >  			sched->free_guilty = false;
> >  		}
> > +	} else {
> > +		spin_unlock_irqrestore(&sched->job_list_lock, flags);
> >  	}
> >  
> >  	spin_lock_irqsave(&sched->job_list_lock, flags);
> > @@ -370,6 +383,20 @@ void drm_sched_stop(struct drm_gpu_scheduler *sched, struct drm_sched_job *bad)
> >  	kthread_park(sched->thread);
> >  
> >  	/*
> > +	 * Reinsert back the bad job here - now it's safe as
> > +	 * drm_sched_get_cleanup_job cannot race against us and release the
> > +	 * bad job at this point - we parked (waited for) any in progress
> > +	 * (earlier) cleanups and drm_sched_get_cleanup_job will not be called
> > +	 * now until the scheduler thread is unparked.
> > +	 */
> > +	if (bad && bad->sched == sched)
> > +		/*
> > +		 * Add at the head of the queue to reflect it was the earliest
> > +		 * job extracted.
> > +		 */
> > +		list_add(&bad->node, &sched->ring_mirror_list);
> > +
> > +	/*
> >  	 * Iterate the job list from later to  earlier one and either deactive
> >  	 * their HW callbacks or remove them from mirror list if they already
> >  	 * signaled.
> 
> _______________________________________________
> dri-devel mailing list
> dri-devel@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/dri-devel

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 125+ messages in thread

* Re: [PATCH v4] drm/scheduler: Avoid accessing freed bad job.
@ 2020-02-06 11:10     ` Lucas Stach
  0 siblings, 0 replies; 125+ messages in thread
From: Lucas Stach @ 2020-02-06 11:10 UTC (permalink / raw)
  To: Andrey Grodzovsky, Christian.Koenig
  Cc: Emily.Deng, amd-gfx, dri-devel, steven.price

Hi all,

On Mi, 2020-02-05 at 19:24 +0100, Lucas Stach wrote:
> Hi Andrey,
> 
> This commit breaks all drivers, which may bail out of the timeout
> processing as they wish to extend the timeout (etnaviv, v3d).
> 
> Those drivers currently just return from the timeout handler before
> calling drm_sched_stop(), which means with this commit applied we are
> removing the first job from the ring_mirror_list, but never put it
> back. This leads to jobs getting lost from the ring mirror, which then
> causes quite a bit of fallout like unsignaled fences.
> 
> Not sure yet what to do about it, we can either add a function to add
> the job back to the ring_mirror if the driver wants to extend the
> timeout, or we could look for another way to stop
> drm_sched_cleanup_jobs from freeing jobs that are currently in timeout
> processing.

So after thinking about this a bit more my opinion is that we need to
revert this change for now and go back to the drawing board for the
scheduler timeout handling.

Right now this starts to feel like a big midlayer mistake with all the
very intricate intertwining between the drivers and the scheduler. The
rules on when it's safe to manipulate the ring mirror and when
completed jobs are signaled and freed are not really well specified.
The fact that we need to mutate state in order to get rid of races
instead of having a single big "timeout processing is owner of the
scheduler state for now" is a big fat warning sign IMHO.

It took me far longer than I'd like to admit to understand the failure
mode with fences not getting signaled after a GPU hang. The back and
forth between scheduler and driver code makes things really hard to
follow.

Regards,
Lucas

> Regards,
> Lucas
> 
> On Mo, 2019-11-25 at 15:51 -0500, Andrey Grodzovsky wrote:
> > Problem:
> > Due to a race between drm_sched_cleanup_jobs in sched thread and
> > drm_sched_job_timedout in timeout work there is a possiblity that
> > bad job was already freed while still being accessed from the
> > timeout thread.
> > 
> > Fix:
> > Instead of just peeking at the bad job in the mirror list
> > remove it from the list under lock and then put it back later when
> > we are garanteed no race with main sched thread is possible which
> > is after the thread is parked.
> > 
> > v2: Lock around processing ring_mirror_list in drm_sched_cleanup_jobs.
> > 
> > v3: Rebase on top of drm-misc-next. v2 is not needed anymore as
> > drm_sched_get_cleanup_job already has a lock there.
> > 
> > v4: Fix comments to relfect latest code in drm-misc.
> > 
> > Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
> > Reviewed-by: Christian König <christian.koenig@amd.com>
> > Tested-by: Emily Deng <Emily.Deng@amd.com>
> > ---
> >  drivers/gpu/drm/scheduler/sched_main.c | 27 +++++++++++++++++++++++++++
> >  1 file changed, 27 insertions(+)
> > 
> > diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c
> > index 6774955..1bf9c40 100644
> > --- a/drivers/gpu/drm/scheduler/sched_main.c
> > +++ b/drivers/gpu/drm/scheduler/sched_main.c
> > @@ -284,10 +284,21 @@ static void drm_sched_job_timedout(struct work_struct *work)
> >  	unsigned long flags;
> >  
> >  	sched = container_of(work, struct drm_gpu_scheduler, work_tdr.work);
> > +
> > +	/* Protects against concurrent deletion in drm_sched_get_cleanup_job */
> > +	spin_lock_irqsave(&sched->job_list_lock, flags);
> >  	job = list_first_entry_or_null(&sched->ring_mirror_list,
> >  				       struct drm_sched_job, node);
> >  
> >  	if (job) {
> > +		/*
> > +		 * Remove the bad job so it cannot be freed by concurrent
> > +		 * drm_sched_cleanup_jobs. It will be reinserted back after sched->thread
> > +		 * is parked at which point it's safe.
> > +		 */
> > +		list_del_init(&job->node);
> > +		spin_unlock_irqrestore(&sched->job_list_lock, flags);
> > +
> >  		job->sched->ops->timedout_job(job);
> >  
> >  		/*
> > @@ -298,6 +309,8 @@ static void drm_sched_job_timedout(struct work_struct *work)
> >  			job->sched->ops->free_job(job);
> >  			sched->free_guilty = false;
> >  		}
> > +	} else {
> > +		spin_unlock_irqrestore(&sched->job_list_lock, flags);
> >  	}
> >  
> >  	spin_lock_irqsave(&sched->job_list_lock, flags);
> > @@ -370,6 +383,20 @@ void drm_sched_stop(struct drm_gpu_scheduler *sched, struct drm_sched_job *bad)
> >  	kthread_park(sched->thread);
> >  
> >  	/*
> > +	 * Reinsert back the bad job here - now it's safe as
> > +	 * drm_sched_get_cleanup_job cannot race against us and release the
> > +	 * bad job at this point - we parked (waited for) any in progress
> > +	 * (earlier) cleanups and drm_sched_get_cleanup_job will not be called
> > +	 * now until the scheduler thread is unparked.
> > +	 */
> > +	if (bad && bad->sched == sched)
> > +		/*
> > +		 * Add at the head of the queue to reflect it was the earliest
> > +		 * job extracted.
> > +		 */
> > +		list_add(&bad->node, &sched->ring_mirror_list);
> > +
> > +	/*
> >  	 * Iterate the job list from later to  earlier one and either deactive
> >  	 * their HW callbacks or remove them from mirror list if they already
> >  	 * signaled.
> 
> _______________________________________________
> dri-devel mailing list
> dri-devel@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/dri-devel

_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 125+ messages in thread

* Re: [PATCH v4] drm/scheduler: Avoid accessing freed bad job.
  2020-02-06 11:10     ` Lucas Stach
@ 2020-02-06 11:49       ` Christian König
  -1 siblings, 0 replies; 125+ messages in thread
From: Christian König @ 2020-02-06 11:49 UTC (permalink / raw)
  To: Lucas Stach, Andrey Grodzovsky, Christian.Koenig
  Cc: Emily.Deng, dri-devel, amd-gfx, steven.price

Am 06.02.20 um 12:10 schrieb Lucas Stach:
> Hi all,
>
> On Mi, 2020-02-05 at 19:24 +0100, Lucas Stach wrote:
>> Hi Andrey,
>>
>> This commit breaks all drivers, which may bail out of the timeout
>> processing as they wish to extend the timeout (etnaviv, v3d).
>>
>> Those drivers currently just return from the timeout handler before
>> calling drm_sched_stop(), which means with this commit applied we are
>> removing the first job from the ring_mirror_list, but never put it
>> back. This leads to jobs getting lost from the ring mirror, which then
>> causes quite a bit of fallout like unsignaled fences.
>>
>> Not sure yet what to do about it, we can either add a function to add
>> the job back to the ring_mirror if the driver wants to extend the
>> timeout, or we could look for another way to stop
>> drm_sched_cleanup_jobs from freeing jobs that are currently in timeout
>> processing.
> So after thinking about this a bit more my opinion is that we need to
> revert this change for now and go back to the drawing board for the
> scheduler timeout handling.
>
> Right now this starts to feel like a big midlayer mistake with all the
> very intricate intertwining between the drivers and the scheduler. The
> rules on when it's safe to manipulate the ring mirror and when
> completed jobs are signaled and freed are not really well specified.
> The fact that we need to mutate state in order to get rid of races
> instead of having a single big "timeout processing is owner of the
> scheduler state for now" is a big fat warning sign IMHO.

Yes, that strongly feels like a hack to me as well. But I didn't had 
time and still haven't to take a closer look and suggest something better.

Christian.

>
> It took me far longer than I'd like to admit to understand the failure
> mode with fences not getting signaled after a GPU hang. The back and
> forth between scheduler and driver code makes things really hard to
> follow.
>
> Regards,
> Lucas
>
>> Regards,
>> Lucas
>>
>> On Mo, 2019-11-25 at 15:51 -0500, Andrey Grodzovsky wrote:
>>> Problem:
>>> Due to a race between drm_sched_cleanup_jobs in sched thread and
>>> drm_sched_job_timedout in timeout work there is a possiblity that
>>> bad job was already freed while still being accessed from the
>>> timeout thread.
>>>
>>> Fix:
>>> Instead of just peeking at the bad job in the mirror list
>>> remove it from the list under lock and then put it back later when
>>> we are garanteed no race with main sched thread is possible which
>>> is after the thread is parked.
>>>
>>> v2: Lock around processing ring_mirror_list in drm_sched_cleanup_jobs.
>>>
>>> v3: Rebase on top of drm-misc-next. v2 is not needed anymore as
>>> drm_sched_get_cleanup_job already has a lock there.
>>>
>>> v4: Fix comments to relfect latest code in drm-misc.
>>>
>>> Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
>>> Reviewed-by: Christian König <christian.koenig@amd.com>
>>> Tested-by: Emily Deng <Emily.Deng@amd.com>
>>> ---
>>>   drivers/gpu/drm/scheduler/sched_main.c | 27 +++++++++++++++++++++++++++
>>>   1 file changed, 27 insertions(+)
>>>
>>> diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c
>>> index 6774955..1bf9c40 100644
>>> --- a/drivers/gpu/drm/scheduler/sched_main.c
>>> +++ b/drivers/gpu/drm/scheduler/sched_main.c
>>> @@ -284,10 +284,21 @@ static void drm_sched_job_timedout(struct work_struct *work)
>>>   	unsigned long flags;
>>>   
>>>   	sched = container_of(work, struct drm_gpu_scheduler, work_tdr.work);
>>> +
>>> +	/* Protects against concurrent deletion in drm_sched_get_cleanup_job */
>>> +	spin_lock_irqsave(&sched->job_list_lock, flags);
>>>   	job = list_first_entry_or_null(&sched->ring_mirror_list,
>>>   				       struct drm_sched_job, node);
>>>   
>>>   	if (job) {
>>> +		/*
>>> +		 * Remove the bad job so it cannot be freed by concurrent
>>> +		 * drm_sched_cleanup_jobs. It will be reinserted back after sched->thread
>>> +		 * is parked at which point it's safe.
>>> +		 */
>>> +		list_del_init(&job->node);
>>> +		spin_unlock_irqrestore(&sched->job_list_lock, flags);
>>> +
>>>   		job->sched->ops->timedout_job(job);
>>>   
>>>   		/*
>>> @@ -298,6 +309,8 @@ static void drm_sched_job_timedout(struct work_struct *work)
>>>   			job->sched->ops->free_job(job);
>>>   			sched->free_guilty = false;
>>>   		}
>>> +	} else {
>>> +		spin_unlock_irqrestore(&sched->job_list_lock, flags);
>>>   	}
>>>   
>>>   	spin_lock_irqsave(&sched->job_list_lock, flags);
>>> @@ -370,6 +383,20 @@ void drm_sched_stop(struct drm_gpu_scheduler *sched, struct drm_sched_job *bad)
>>>   	kthread_park(sched->thread);
>>>   
>>>   	/*
>>> +	 * Reinsert back the bad job here - now it's safe as
>>> +	 * drm_sched_get_cleanup_job cannot race against us and release the
>>> +	 * bad job at this point - we parked (waited for) any in progress
>>> +	 * (earlier) cleanups and drm_sched_get_cleanup_job will not be called
>>> +	 * now until the scheduler thread is unparked.
>>> +	 */
>>> +	if (bad && bad->sched == sched)
>>> +		/*
>>> +		 * Add at the head of the queue to reflect it was the earliest
>>> +		 * job extracted.
>>> +		 */
>>> +		list_add(&bad->node, &sched->ring_mirror_list);
>>> +
>>> +	/*
>>>   	 * Iterate the job list from later to  earlier one and either deactive
>>>   	 * their HW callbacks or remove them from mirror list if they already
>>>   	 * signaled.
>> _______________________________________________
>> dri-devel mailing list
>> dri-devel@lists.freedesktop.org
>> https://lists.freedesktop.org/mailman/listinfo/dri-devel
> _______________________________________________
> amd-gfx mailing list
> amd-gfx@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/amd-gfx

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 125+ messages in thread

* Re: [PATCH v4] drm/scheduler: Avoid accessing freed bad job.
@ 2020-02-06 11:49       ` Christian König
  0 siblings, 0 replies; 125+ messages in thread
From: Christian König @ 2020-02-06 11:49 UTC (permalink / raw)
  To: Lucas Stach, Andrey Grodzovsky, Christian.Koenig
  Cc: Emily.Deng, dri-devel, amd-gfx, steven.price

Am 06.02.20 um 12:10 schrieb Lucas Stach:
> Hi all,
>
> On Mi, 2020-02-05 at 19:24 +0100, Lucas Stach wrote:
>> Hi Andrey,
>>
>> This commit breaks all drivers, which may bail out of the timeout
>> processing as they wish to extend the timeout (etnaviv, v3d).
>>
>> Those drivers currently just return from the timeout handler before
>> calling drm_sched_stop(), which means with this commit applied we are
>> removing the first job from the ring_mirror_list, but never put it
>> back. This leads to jobs getting lost from the ring mirror, which then
>> causes quite a bit of fallout like unsignaled fences.
>>
>> Not sure yet what to do about it, we can either add a function to add
>> the job back to the ring_mirror if the driver wants to extend the
>> timeout, or we could look for another way to stop
>> drm_sched_cleanup_jobs from freeing jobs that are currently in timeout
>> processing.
> So after thinking about this a bit more my opinion is that we need to
> revert this change for now and go back to the drawing board for the
> scheduler timeout handling.
>
> Right now this starts to feel like a big midlayer mistake with all the
> very intricate intertwining between the drivers and the scheduler. The
> rules on when it's safe to manipulate the ring mirror and when
> completed jobs are signaled and freed are not really well specified.
> The fact that we need to mutate state in order to get rid of races
> instead of having a single big "timeout processing is owner of the
> scheduler state for now" is a big fat warning sign IMHO.

Yes, that strongly feels like a hack to me as well. But I didn't had 
time and still haven't to take a closer look and suggest something better.

Christian.

>
> It took me far longer than I'd like to admit to understand the failure
> mode with fences not getting signaled after a GPU hang. The back and
> forth between scheduler and driver code makes things really hard to
> follow.
>
> Regards,
> Lucas
>
>> Regards,
>> Lucas
>>
>> On Mo, 2019-11-25 at 15:51 -0500, Andrey Grodzovsky wrote:
>>> Problem:
>>> Due to a race between drm_sched_cleanup_jobs in sched thread and
>>> drm_sched_job_timedout in timeout work there is a possiblity that
>>> bad job was already freed while still being accessed from the
>>> timeout thread.
>>>
>>> Fix:
>>> Instead of just peeking at the bad job in the mirror list
>>> remove it from the list under lock and then put it back later when
>>> we are garanteed no race with main sched thread is possible which
>>> is after the thread is parked.
>>>
>>> v2: Lock around processing ring_mirror_list in drm_sched_cleanup_jobs.
>>>
>>> v3: Rebase on top of drm-misc-next. v2 is not needed anymore as
>>> drm_sched_get_cleanup_job already has a lock there.
>>>
>>> v4: Fix comments to relfect latest code in drm-misc.
>>>
>>> Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
>>> Reviewed-by: Christian König <christian.koenig@amd.com>
>>> Tested-by: Emily Deng <Emily.Deng@amd.com>
>>> ---
>>>   drivers/gpu/drm/scheduler/sched_main.c | 27 +++++++++++++++++++++++++++
>>>   1 file changed, 27 insertions(+)
>>>
>>> diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c
>>> index 6774955..1bf9c40 100644
>>> --- a/drivers/gpu/drm/scheduler/sched_main.c
>>> +++ b/drivers/gpu/drm/scheduler/sched_main.c
>>> @@ -284,10 +284,21 @@ static void drm_sched_job_timedout(struct work_struct *work)
>>>   	unsigned long flags;
>>>   
>>>   	sched = container_of(work, struct drm_gpu_scheduler, work_tdr.work);
>>> +
>>> +	/* Protects against concurrent deletion in drm_sched_get_cleanup_job */
>>> +	spin_lock_irqsave(&sched->job_list_lock, flags);
>>>   	job = list_first_entry_or_null(&sched->ring_mirror_list,
>>>   				       struct drm_sched_job, node);
>>>   
>>>   	if (job) {
>>> +		/*
>>> +		 * Remove the bad job so it cannot be freed by concurrent
>>> +		 * drm_sched_cleanup_jobs. It will be reinserted back after sched->thread
>>> +		 * is parked at which point it's safe.
>>> +		 */
>>> +		list_del_init(&job->node);
>>> +		spin_unlock_irqrestore(&sched->job_list_lock, flags);
>>> +
>>>   		job->sched->ops->timedout_job(job);
>>>   
>>>   		/*
>>> @@ -298,6 +309,8 @@ static void drm_sched_job_timedout(struct work_struct *work)
>>>   			job->sched->ops->free_job(job);
>>>   			sched->free_guilty = false;
>>>   		}
>>> +	} else {
>>> +		spin_unlock_irqrestore(&sched->job_list_lock, flags);
>>>   	}
>>>   
>>>   	spin_lock_irqsave(&sched->job_list_lock, flags);
>>> @@ -370,6 +383,20 @@ void drm_sched_stop(struct drm_gpu_scheduler *sched, struct drm_sched_job *bad)
>>>   	kthread_park(sched->thread);
>>>   
>>>   	/*
>>> +	 * Reinsert back the bad job here - now it's safe as
>>> +	 * drm_sched_get_cleanup_job cannot race against us and release the
>>> +	 * bad job at this point - we parked (waited for) any in progress
>>> +	 * (earlier) cleanups and drm_sched_get_cleanup_job will not be called
>>> +	 * now until the scheduler thread is unparked.
>>> +	 */
>>> +	if (bad && bad->sched == sched)
>>> +		/*
>>> +		 * Add at the head of the queue to reflect it was the earliest
>>> +		 * job extracted.
>>> +		 */
>>> +		list_add(&bad->node, &sched->ring_mirror_list);
>>> +
>>> +	/*
>>>   	 * Iterate the job list from later to  earlier one and either deactive
>>>   	 * their HW callbacks or remove them from mirror list if they already
>>>   	 * signaled.
>> _______________________________________________
>> dri-devel mailing list
>> dri-devel@lists.freedesktop.org
>> https://lists.freedesktop.org/mailman/listinfo/dri-devel
> _______________________________________________
> amd-gfx mailing list
> amd-gfx@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/amd-gfx

_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 125+ messages in thread

* Re: [PATCH v4] drm/scheduler: Avoid accessing freed bad job.
  2020-02-06 11:49       ` Christian König
@ 2020-02-06 14:49         ` Alex Deucher
  -1 siblings, 0 replies; 125+ messages in thread
From: Alex Deucher @ 2020-02-06 14:49 UTC (permalink / raw)
  To: Christian Koenig
  Cc: amd-gfx list, steven.price, Emily Deng, Maling list - DRI developers

On Thu, Feb 6, 2020 at 6:50 AM Christian König
<ckoenig.leichtzumerken@gmail.com> wrote:
>
> Am 06.02.20 um 12:10 schrieb Lucas Stach:
> > Hi all,
> >
> > On Mi, 2020-02-05 at 19:24 +0100, Lucas Stach wrote:
> >> Hi Andrey,
> >>
> >> This commit breaks all drivers, which may bail out of the timeout
> >> processing as they wish to extend the timeout (etnaviv, v3d).
> >>
> >> Those drivers currently just return from the timeout handler before
> >> calling drm_sched_stop(), which means with this commit applied we are
> >> removing the first job from the ring_mirror_list, but never put it
> >> back. This leads to jobs getting lost from the ring mirror, which then
> >> causes quite a bit of fallout like unsignaled fences.
> >>
> >> Not sure yet what to do about it, we can either add a function to add
> >> the job back to the ring_mirror if the driver wants to extend the
> >> timeout, or we could look for another way to stop
> >> drm_sched_cleanup_jobs from freeing jobs that are currently in timeout
> >> processing.
> > So after thinking about this a bit more my opinion is that we need to
> > revert this change for now and go back to the drawing board for the
> > scheduler timeout handling.
> >
> > Right now this starts to feel like a big midlayer mistake with all the
> > very intricate intertwining between the drivers and the scheduler. The
> > rules on when it's safe to manipulate the ring mirror and when
> > completed jobs are signaled and freed are not really well specified.
> > The fact that we need to mutate state in order to get rid of races
> > instead of having a single big "timeout processing is owner of the
> > scheduler state for now" is a big fat warning sign IMHO.
>
> Yes, that strongly feels like a hack to me as well. But I didn't had
> time and still haven't to take a closer look and suggest something better.
>

In that case, can someone send me a revert?

Alex


> Christian.
>
> >
> > It took me far longer than I'd like to admit to understand the failure
> > mode with fences not getting signaled after a GPU hang. The back and
> > forth between scheduler and driver code makes things really hard to
> > follow.
> >
> > Regards,
> > Lucas
> >
> >> Regards,
> >> Lucas
> >>
> >> On Mo, 2019-11-25 at 15:51 -0500, Andrey Grodzovsky wrote:
> >>> Problem:
> >>> Due to a race between drm_sched_cleanup_jobs in sched thread and
> >>> drm_sched_job_timedout in timeout work there is a possiblity that
> >>> bad job was already freed while still being accessed from the
> >>> timeout thread.
> >>>
> >>> Fix:
> >>> Instead of just peeking at the bad job in the mirror list
> >>> remove it from the list under lock and then put it back later when
> >>> we are garanteed no race with main sched thread is possible which
> >>> is after the thread is parked.
> >>>
> >>> v2: Lock around processing ring_mirror_list in drm_sched_cleanup_jobs.
> >>>
> >>> v3: Rebase on top of drm-misc-next. v2 is not needed anymore as
> >>> drm_sched_get_cleanup_job already has a lock there.
> >>>
> >>> v4: Fix comments to relfect latest code in drm-misc.
> >>>
> >>> Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
> >>> Reviewed-by: Christian König <christian.koenig@amd.com>
> >>> Tested-by: Emily Deng <Emily.Deng@amd.com>
> >>> ---
> >>>   drivers/gpu/drm/scheduler/sched_main.c | 27 +++++++++++++++++++++++++++
> >>>   1 file changed, 27 insertions(+)
> >>>
> >>> diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c
> >>> index 6774955..1bf9c40 100644
> >>> --- a/drivers/gpu/drm/scheduler/sched_main.c
> >>> +++ b/drivers/gpu/drm/scheduler/sched_main.c
> >>> @@ -284,10 +284,21 @@ static void drm_sched_job_timedout(struct work_struct *work)
> >>>     unsigned long flags;
> >>>
> >>>     sched = container_of(work, struct drm_gpu_scheduler, work_tdr.work);
> >>> +
> >>> +   /* Protects against concurrent deletion in drm_sched_get_cleanup_job */
> >>> +   spin_lock_irqsave(&sched->job_list_lock, flags);
> >>>     job = list_first_entry_or_null(&sched->ring_mirror_list,
> >>>                                    struct drm_sched_job, node);
> >>>
> >>>     if (job) {
> >>> +           /*
> >>> +            * Remove the bad job so it cannot be freed by concurrent
> >>> +            * drm_sched_cleanup_jobs. It will be reinserted back after sched->thread
> >>> +            * is parked at which point it's safe.
> >>> +            */
> >>> +           list_del_init(&job->node);
> >>> +           spin_unlock_irqrestore(&sched->job_list_lock, flags);
> >>> +
> >>>             job->sched->ops->timedout_job(job);
> >>>
> >>>             /*
> >>> @@ -298,6 +309,8 @@ static void drm_sched_job_timedout(struct work_struct *work)
> >>>                     job->sched->ops->free_job(job);
> >>>                     sched->free_guilty = false;
> >>>             }
> >>> +   } else {
> >>> +           spin_unlock_irqrestore(&sched->job_list_lock, flags);
> >>>     }
> >>>
> >>>     spin_lock_irqsave(&sched->job_list_lock, flags);
> >>> @@ -370,6 +383,20 @@ void drm_sched_stop(struct drm_gpu_scheduler *sched, struct drm_sched_job *bad)
> >>>     kthread_park(sched->thread);
> >>>
> >>>     /*
> >>> +    * Reinsert back the bad job here - now it's safe as
> >>> +    * drm_sched_get_cleanup_job cannot race against us and release the
> >>> +    * bad job at this point - we parked (waited for) any in progress
> >>> +    * (earlier) cleanups and drm_sched_get_cleanup_job will not be called
> >>> +    * now until the scheduler thread is unparked.
> >>> +    */
> >>> +   if (bad && bad->sched == sched)
> >>> +           /*
> >>> +            * Add at the head of the queue to reflect it was the earliest
> >>> +            * job extracted.
> >>> +            */
> >>> +           list_add(&bad->node, &sched->ring_mirror_list);
> >>> +
> >>> +   /*
> >>>      * Iterate the job list from later to  earlier one and either deactive
> >>>      * their HW callbacks or remove them from mirror list if they already
> >>>      * signaled.
> >> _______________________________________________
> >> dri-devel mailing list
> >> dri-devel@lists.freedesktop.org
> >> https://lists.freedesktop.org/mailman/listinfo/dri-devel
> > _______________________________________________
> > amd-gfx mailing list
> > amd-gfx@lists.freedesktop.org
> > https://lists.freedesktop.org/mailman/listinfo/amd-gfx
>
> _______________________________________________
> dri-devel mailing list
> dri-devel@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/dri-devel
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 125+ messages in thread

* Re: [PATCH v4] drm/scheduler: Avoid accessing freed bad job.
@ 2020-02-06 14:49         ` Alex Deucher
  0 siblings, 0 replies; 125+ messages in thread
From: Alex Deucher @ 2020-02-06 14:49 UTC (permalink / raw)
  To: Christian Koenig
  Cc: Andrey Grodzovsky, amd-gfx list, steven.price, Emily Deng,
	Maling list - DRI developers, Lucas Stach

On Thu, Feb 6, 2020 at 6:50 AM Christian König
<ckoenig.leichtzumerken@gmail.com> wrote:
>
> Am 06.02.20 um 12:10 schrieb Lucas Stach:
> > Hi all,
> >
> > On Mi, 2020-02-05 at 19:24 +0100, Lucas Stach wrote:
> >> Hi Andrey,
> >>
> >> This commit breaks all drivers, which may bail out of the timeout
> >> processing as they wish to extend the timeout (etnaviv, v3d).
> >>
> >> Those drivers currently just return from the timeout handler before
> >> calling drm_sched_stop(), which means with this commit applied we are
> >> removing the first job from the ring_mirror_list, but never put it
> >> back. This leads to jobs getting lost from the ring mirror, which then
> >> causes quite a bit of fallout like unsignaled fences.
> >>
> >> Not sure yet what to do about it, we can either add a function to add
> >> the job back to the ring_mirror if the driver wants to extend the
> >> timeout, or we could look for another way to stop
> >> drm_sched_cleanup_jobs from freeing jobs that are currently in timeout
> >> processing.
> > So after thinking about this a bit more my opinion is that we need to
> > revert this change for now and go back to the drawing board for the
> > scheduler timeout handling.
> >
> > Right now this starts to feel like a big midlayer mistake with all the
> > very intricate intertwining between the drivers and the scheduler. The
> > rules on when it's safe to manipulate the ring mirror and when
> > completed jobs are signaled and freed are not really well specified.
> > The fact that we need to mutate state in order to get rid of races
> > instead of having a single big "timeout processing is owner of the
> > scheduler state for now" is a big fat warning sign IMHO.
>
> Yes, that strongly feels like a hack to me as well. But I didn't had
> time and still haven't to take a closer look and suggest something better.
>

In that case, can someone send me a revert?

Alex


> Christian.
>
> >
> > It took me far longer than I'd like to admit to understand the failure
> > mode with fences not getting signaled after a GPU hang. The back and
> > forth between scheduler and driver code makes things really hard to
> > follow.
> >
> > Regards,
> > Lucas
> >
> >> Regards,
> >> Lucas
> >>
> >> On Mo, 2019-11-25 at 15:51 -0500, Andrey Grodzovsky wrote:
> >>> Problem:
> >>> Due to a race between drm_sched_cleanup_jobs in sched thread and
> >>> drm_sched_job_timedout in timeout work there is a possiblity that
> >>> bad job was already freed while still being accessed from the
> >>> timeout thread.
> >>>
> >>> Fix:
> >>> Instead of just peeking at the bad job in the mirror list
> >>> remove it from the list under lock and then put it back later when
> >>> we are garanteed no race with main sched thread is possible which
> >>> is after the thread is parked.
> >>>
> >>> v2: Lock around processing ring_mirror_list in drm_sched_cleanup_jobs.
> >>>
> >>> v3: Rebase on top of drm-misc-next. v2 is not needed anymore as
> >>> drm_sched_get_cleanup_job already has a lock there.
> >>>
> >>> v4: Fix comments to relfect latest code in drm-misc.
> >>>
> >>> Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
> >>> Reviewed-by: Christian König <christian.koenig@amd.com>
> >>> Tested-by: Emily Deng <Emily.Deng@amd.com>
> >>> ---
> >>>   drivers/gpu/drm/scheduler/sched_main.c | 27 +++++++++++++++++++++++++++
> >>>   1 file changed, 27 insertions(+)
> >>>
> >>> diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c
> >>> index 6774955..1bf9c40 100644
> >>> --- a/drivers/gpu/drm/scheduler/sched_main.c
> >>> +++ b/drivers/gpu/drm/scheduler/sched_main.c
> >>> @@ -284,10 +284,21 @@ static void drm_sched_job_timedout(struct work_struct *work)
> >>>     unsigned long flags;
> >>>
> >>>     sched = container_of(work, struct drm_gpu_scheduler, work_tdr.work);
> >>> +
> >>> +   /* Protects against concurrent deletion in drm_sched_get_cleanup_job */
> >>> +   spin_lock_irqsave(&sched->job_list_lock, flags);
> >>>     job = list_first_entry_or_null(&sched->ring_mirror_list,
> >>>                                    struct drm_sched_job, node);
> >>>
> >>>     if (job) {
> >>> +           /*
> >>> +            * Remove the bad job so it cannot be freed by concurrent
> >>> +            * drm_sched_cleanup_jobs. It will be reinserted back after sched->thread
> >>> +            * is parked at which point it's safe.
> >>> +            */
> >>> +           list_del_init(&job->node);
> >>> +           spin_unlock_irqrestore(&sched->job_list_lock, flags);
> >>> +
> >>>             job->sched->ops->timedout_job(job);
> >>>
> >>>             /*
> >>> @@ -298,6 +309,8 @@ static void drm_sched_job_timedout(struct work_struct *work)
> >>>                     job->sched->ops->free_job(job);
> >>>                     sched->free_guilty = false;
> >>>             }
> >>> +   } else {
> >>> +           spin_unlock_irqrestore(&sched->job_list_lock, flags);
> >>>     }
> >>>
> >>>     spin_lock_irqsave(&sched->job_list_lock, flags);
> >>> @@ -370,6 +383,20 @@ void drm_sched_stop(struct drm_gpu_scheduler *sched, struct drm_sched_job *bad)
> >>>     kthread_park(sched->thread);
> >>>
> >>>     /*
> >>> +    * Reinsert back the bad job here - now it's safe as
> >>> +    * drm_sched_get_cleanup_job cannot race against us and release the
> >>> +    * bad job at this point - we parked (waited for) any in progress
> >>> +    * (earlier) cleanups and drm_sched_get_cleanup_job will not be called
> >>> +    * now until the scheduler thread is unparked.
> >>> +    */
> >>> +   if (bad && bad->sched == sched)
> >>> +           /*
> >>> +            * Add at the head of the queue to reflect it was the earliest
> >>> +            * job extracted.
> >>> +            */
> >>> +           list_add(&bad->node, &sched->ring_mirror_list);
> >>> +
> >>> +   /*
> >>>      * Iterate the job list from later to  earlier one and either deactive
> >>>      * their HW callbacks or remove them from mirror list if they already
> >>>      * signaled.
> >> _______________________________________________
> >> dri-devel mailing list
> >> dri-devel@lists.freedesktop.org
> >> https://lists.freedesktop.org/mailman/listinfo/dri-devel
> > _______________________________________________
> > amd-gfx mailing list
> > amd-gfx@lists.freedesktop.org
> > https://lists.freedesktop.org/mailman/listinfo/amd-gfx
>
> _______________________________________________
> dri-devel mailing list
> dri-devel@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/dri-devel
_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 125+ messages in thread

* Re: [PATCH v4] drm/scheduler: Avoid accessing freed bad job.
  2020-02-06 14:49         ` Alex Deucher
@ 2020-02-06 14:51           ` Christian König
  -1 siblings, 0 replies; 125+ messages in thread
From: Christian König @ 2020-02-06 14:51 UTC (permalink / raw)
  To: Alex Deucher
  Cc: amd-gfx list, steven.price, Emily Deng, Maling list - DRI developers

Am 06.02.20 um 15:49 schrieb Alex Deucher:
> On Thu, Feb 6, 2020 at 6:50 AM Christian König
> <ckoenig.leichtzumerken@gmail.com> wrote:
>> Am 06.02.20 um 12:10 schrieb Lucas Stach:
>>> Hi all,
>>>
>>> On Mi, 2020-02-05 at 19:24 +0100, Lucas Stach wrote:
>>>> Hi Andrey,
>>>>
>>>> This commit breaks all drivers, which may bail out of the timeout
>>>> processing as they wish to extend the timeout (etnaviv, v3d).
>>>>
>>>> Those drivers currently just return from the timeout handler before
>>>> calling drm_sched_stop(), which means with this commit applied we are
>>>> removing the first job from the ring_mirror_list, but never put it
>>>> back. This leads to jobs getting lost from the ring mirror, which then
>>>> causes quite a bit of fallout like unsignaled fences.
>>>>
>>>> Not sure yet what to do about it, we can either add a function to add
>>>> the job back to the ring_mirror if the driver wants to extend the
>>>> timeout, or we could look for another way to stop
>>>> drm_sched_cleanup_jobs from freeing jobs that are currently in timeout
>>>> processing.
>>> So after thinking about this a bit more my opinion is that we need to
>>> revert this change for now and go back to the drawing board for the
>>> scheduler timeout handling.
>>>
>>> Right now this starts to feel like a big midlayer mistake with all the
>>> very intricate intertwining between the drivers and the scheduler. The
>>> rules on when it's safe to manipulate the ring mirror and when
>>> completed jobs are signaled and freed are not really well specified.
>>> The fact that we need to mutate state in order to get rid of races
>>> instead of having a single big "timeout processing is owner of the
>>> scheduler state for now" is a big fat warning sign IMHO.
>> Yes, that strongly feels like a hack to me as well. But I didn't had
>> time and still haven't to take a closer look and suggest something better.
>>
> In that case, can someone send me a revert?

Well a revert would break our driver.

The real solution is that somebody needs to sit down, gather ALL the 
requirements and then come up with a solution which is clean and works 
for everyone.

Christian.

>
> Alex
>
>
>> Christian.
>>
>>> It took me far longer than I'd like to admit to understand the failure
>>> mode with fences not getting signaled after a GPU hang. The back and
>>> forth between scheduler and driver code makes things really hard to
>>> follow.
>>>
>>> Regards,
>>> Lucas
>>>
>>>> Regards,
>>>> Lucas
>>>>
>>>> On Mo, 2019-11-25 at 15:51 -0500, Andrey Grodzovsky wrote:
>>>>> Problem:
>>>>> Due to a race between drm_sched_cleanup_jobs in sched thread and
>>>>> drm_sched_job_timedout in timeout work there is a possiblity that
>>>>> bad job was already freed while still being accessed from the
>>>>> timeout thread.
>>>>>
>>>>> Fix:
>>>>> Instead of just peeking at the bad job in the mirror list
>>>>> remove it from the list under lock and then put it back later when
>>>>> we are garanteed no race with main sched thread is possible which
>>>>> is after the thread is parked.
>>>>>
>>>>> v2: Lock around processing ring_mirror_list in drm_sched_cleanup_jobs.
>>>>>
>>>>> v3: Rebase on top of drm-misc-next. v2 is not needed anymore as
>>>>> drm_sched_get_cleanup_job already has a lock there.
>>>>>
>>>>> v4: Fix comments to relfect latest code in drm-misc.
>>>>>
>>>>> Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
>>>>> Reviewed-by: Christian König <christian.koenig@amd.com>
>>>>> Tested-by: Emily Deng <Emily.Deng@amd.com>
>>>>> ---
>>>>>    drivers/gpu/drm/scheduler/sched_main.c | 27 +++++++++++++++++++++++++++
>>>>>    1 file changed, 27 insertions(+)
>>>>>
>>>>> diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c
>>>>> index 6774955..1bf9c40 100644
>>>>> --- a/drivers/gpu/drm/scheduler/sched_main.c
>>>>> +++ b/drivers/gpu/drm/scheduler/sched_main.c
>>>>> @@ -284,10 +284,21 @@ static void drm_sched_job_timedout(struct work_struct *work)
>>>>>      unsigned long flags;
>>>>>
>>>>>      sched = container_of(work, struct drm_gpu_scheduler, work_tdr.work);
>>>>> +
>>>>> +   /* Protects against concurrent deletion in drm_sched_get_cleanup_job */
>>>>> +   spin_lock_irqsave(&sched->job_list_lock, flags);
>>>>>      job = list_first_entry_or_null(&sched->ring_mirror_list,
>>>>>                                     struct drm_sched_job, node);
>>>>>
>>>>>      if (job) {
>>>>> +           /*
>>>>> +            * Remove the bad job so it cannot be freed by concurrent
>>>>> +            * drm_sched_cleanup_jobs. It will be reinserted back after sched->thread
>>>>> +            * is parked at which point it's safe.
>>>>> +            */
>>>>> +           list_del_init(&job->node);
>>>>> +           spin_unlock_irqrestore(&sched->job_list_lock, flags);
>>>>> +
>>>>>              job->sched->ops->timedout_job(job);
>>>>>
>>>>>              /*
>>>>> @@ -298,6 +309,8 @@ static void drm_sched_job_timedout(struct work_struct *work)
>>>>>                      job->sched->ops->free_job(job);
>>>>>                      sched->free_guilty = false;
>>>>>              }
>>>>> +   } else {
>>>>> +           spin_unlock_irqrestore(&sched->job_list_lock, flags);
>>>>>      }
>>>>>
>>>>>      spin_lock_irqsave(&sched->job_list_lock, flags);
>>>>> @@ -370,6 +383,20 @@ void drm_sched_stop(struct drm_gpu_scheduler *sched, struct drm_sched_job *bad)
>>>>>      kthread_park(sched->thread);
>>>>>
>>>>>      /*
>>>>> +    * Reinsert back the bad job here - now it's safe as
>>>>> +    * drm_sched_get_cleanup_job cannot race against us and release the
>>>>> +    * bad job at this point - we parked (waited for) any in progress
>>>>> +    * (earlier) cleanups and drm_sched_get_cleanup_job will not be called
>>>>> +    * now until the scheduler thread is unparked.
>>>>> +    */
>>>>> +   if (bad && bad->sched == sched)
>>>>> +           /*
>>>>> +            * Add at the head of the queue to reflect it was the earliest
>>>>> +            * job extracted.
>>>>> +            */
>>>>> +           list_add(&bad->node, &sched->ring_mirror_list);
>>>>> +
>>>>> +   /*
>>>>>       * Iterate the job list from later to  earlier one and either deactive
>>>>>       * their HW callbacks or remove them from mirror list if they already
>>>>>       * signaled.
>>>> _______________________________________________
>>>> dri-devel mailing list
>>>> dri-devel@lists.freedesktop.org
>>>> https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Flists.freedesktop.org%2Fmailman%2Flistinfo%2Fdri-devel&amp;data=02%7C01%7Cchristian.koenig%40amd.com%7Ce88b51a2443741b0b56f08d7ab13da74%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C637165974076779365&amp;sdata=L7Hin%2Faw7vK9IYBaZn%2BVmWZKzjqTYBsvJ%2BIL80qB3M4%3D&amp;reserved=0
>>> _______________________________________________
>>> amd-gfx mailing list
>>> amd-gfx@lists.freedesktop.org
>>> https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Flists.freedesktop.org%2Fmailman%2Flistinfo%2Famd-gfx&amp;data=02%7C01%7Cchristian.koenig%40amd.com%7Ce88b51a2443741b0b56f08d7ab13da74%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C637165974076779365&amp;sdata=94EyD8X91MT5IVE8TN9%2FRYed8aIX6tN1Pvl8LJBkCeU%3D&amp;reserved=0
>> _______________________________________________
>> dri-devel mailing list
>> dri-devel@lists.freedesktop.org
>> https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Flists.freedesktop.org%2Fmailman%2Flistinfo%2Fdri-devel&amp;data=02%7C01%7Cchristian.koenig%40amd.com%7Ce88b51a2443741b0b56f08d7ab13da74%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C637165974076779365&amp;sdata=L7Hin%2Faw7vK9IYBaZn%2BVmWZKzjqTYBsvJ%2BIL80qB3M4%3D&amp;reserved=0

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 125+ messages in thread

* Re: [PATCH v4] drm/scheduler: Avoid accessing freed bad job.
@ 2020-02-06 14:51           ` Christian König
  0 siblings, 0 replies; 125+ messages in thread
From: Christian König @ 2020-02-06 14:51 UTC (permalink / raw)
  To: Alex Deucher
  Cc: Andrey Grodzovsky, amd-gfx list, steven.price, Emily Deng,
	Maling list - DRI developers, Lucas Stach

Am 06.02.20 um 15:49 schrieb Alex Deucher:
> On Thu, Feb 6, 2020 at 6:50 AM Christian König
> <ckoenig.leichtzumerken@gmail.com> wrote:
>> Am 06.02.20 um 12:10 schrieb Lucas Stach:
>>> Hi all,
>>>
>>> On Mi, 2020-02-05 at 19:24 +0100, Lucas Stach wrote:
>>>> Hi Andrey,
>>>>
>>>> This commit breaks all drivers, which may bail out of the timeout
>>>> processing as they wish to extend the timeout (etnaviv, v3d).
>>>>
>>>> Those drivers currently just return from the timeout handler before
>>>> calling drm_sched_stop(), which means with this commit applied we are
>>>> removing the first job from the ring_mirror_list, but never put it
>>>> back. This leads to jobs getting lost from the ring mirror, which then
>>>> causes quite a bit of fallout like unsignaled fences.
>>>>
>>>> Not sure yet what to do about it, we can either add a function to add
>>>> the job back to the ring_mirror if the driver wants to extend the
>>>> timeout, or we could look for another way to stop
>>>> drm_sched_cleanup_jobs from freeing jobs that are currently in timeout
>>>> processing.
>>> So after thinking about this a bit more my opinion is that we need to
>>> revert this change for now and go back to the drawing board for the
>>> scheduler timeout handling.
>>>
>>> Right now this starts to feel like a big midlayer mistake with all the
>>> very intricate intertwining between the drivers and the scheduler. The
>>> rules on when it's safe to manipulate the ring mirror and when
>>> completed jobs are signaled and freed are not really well specified.
>>> The fact that we need to mutate state in order to get rid of races
>>> instead of having a single big "timeout processing is owner of the
>>> scheduler state for now" is a big fat warning sign IMHO.
>> Yes, that strongly feels like a hack to me as well. But I didn't had
>> time and still haven't to take a closer look and suggest something better.
>>
> In that case, can someone send me a revert?

Well a revert would break our driver.

The real solution is that somebody needs to sit down, gather ALL the 
requirements and then come up with a solution which is clean and works 
for everyone.

Christian.

>
> Alex
>
>
>> Christian.
>>
>>> It took me far longer than I'd like to admit to understand the failure
>>> mode with fences not getting signaled after a GPU hang. The back and
>>> forth between scheduler and driver code makes things really hard to
>>> follow.
>>>
>>> Regards,
>>> Lucas
>>>
>>>> Regards,
>>>> Lucas
>>>>
>>>> On Mo, 2019-11-25 at 15:51 -0500, Andrey Grodzovsky wrote:
>>>>> Problem:
>>>>> Due to a race between drm_sched_cleanup_jobs in sched thread and
>>>>> drm_sched_job_timedout in timeout work there is a possiblity that
>>>>> bad job was already freed while still being accessed from the
>>>>> timeout thread.
>>>>>
>>>>> Fix:
>>>>> Instead of just peeking at the bad job in the mirror list
>>>>> remove it from the list under lock and then put it back later when
>>>>> we are garanteed no race with main sched thread is possible which
>>>>> is after the thread is parked.
>>>>>
>>>>> v2: Lock around processing ring_mirror_list in drm_sched_cleanup_jobs.
>>>>>
>>>>> v3: Rebase on top of drm-misc-next. v2 is not needed anymore as
>>>>> drm_sched_get_cleanup_job already has a lock there.
>>>>>
>>>>> v4: Fix comments to relfect latest code in drm-misc.
>>>>>
>>>>> Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
>>>>> Reviewed-by: Christian König <christian.koenig@amd.com>
>>>>> Tested-by: Emily Deng <Emily.Deng@amd.com>
>>>>> ---
>>>>>    drivers/gpu/drm/scheduler/sched_main.c | 27 +++++++++++++++++++++++++++
>>>>>    1 file changed, 27 insertions(+)
>>>>>
>>>>> diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c
>>>>> index 6774955..1bf9c40 100644
>>>>> --- a/drivers/gpu/drm/scheduler/sched_main.c
>>>>> +++ b/drivers/gpu/drm/scheduler/sched_main.c
>>>>> @@ -284,10 +284,21 @@ static void drm_sched_job_timedout(struct work_struct *work)
>>>>>      unsigned long flags;
>>>>>
>>>>>      sched = container_of(work, struct drm_gpu_scheduler, work_tdr.work);
>>>>> +
>>>>> +   /* Protects against concurrent deletion in drm_sched_get_cleanup_job */
>>>>> +   spin_lock_irqsave(&sched->job_list_lock, flags);
>>>>>      job = list_first_entry_or_null(&sched->ring_mirror_list,
>>>>>                                     struct drm_sched_job, node);
>>>>>
>>>>>      if (job) {
>>>>> +           /*
>>>>> +            * Remove the bad job so it cannot be freed by concurrent
>>>>> +            * drm_sched_cleanup_jobs. It will be reinserted back after sched->thread
>>>>> +            * is parked at which point it's safe.
>>>>> +            */
>>>>> +           list_del_init(&job->node);
>>>>> +           spin_unlock_irqrestore(&sched->job_list_lock, flags);
>>>>> +
>>>>>              job->sched->ops->timedout_job(job);
>>>>>
>>>>>              /*
>>>>> @@ -298,6 +309,8 @@ static void drm_sched_job_timedout(struct work_struct *work)
>>>>>                      job->sched->ops->free_job(job);
>>>>>                      sched->free_guilty = false;
>>>>>              }
>>>>> +   } else {
>>>>> +           spin_unlock_irqrestore(&sched->job_list_lock, flags);
>>>>>      }
>>>>>
>>>>>      spin_lock_irqsave(&sched->job_list_lock, flags);
>>>>> @@ -370,6 +383,20 @@ void drm_sched_stop(struct drm_gpu_scheduler *sched, struct drm_sched_job *bad)
>>>>>      kthread_park(sched->thread);
>>>>>
>>>>>      /*
>>>>> +    * Reinsert back the bad job here - now it's safe as
>>>>> +    * drm_sched_get_cleanup_job cannot race against us and release the
>>>>> +    * bad job at this point - we parked (waited for) any in progress
>>>>> +    * (earlier) cleanups and drm_sched_get_cleanup_job will not be called
>>>>> +    * now until the scheduler thread is unparked.
>>>>> +    */
>>>>> +   if (bad && bad->sched == sched)
>>>>> +           /*
>>>>> +            * Add at the head of the queue to reflect it was the earliest
>>>>> +            * job extracted.
>>>>> +            */
>>>>> +           list_add(&bad->node, &sched->ring_mirror_list);
>>>>> +
>>>>> +   /*
>>>>>       * Iterate the job list from later to  earlier one and either deactive
>>>>>       * their HW callbacks or remove them from mirror list if they already
>>>>>       * signaled.
>>>> _______________________________________________
>>>> dri-devel mailing list
>>>> dri-devel@lists.freedesktop.org
>>>> https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Flists.freedesktop.org%2Fmailman%2Flistinfo%2Fdri-devel&amp;data=02%7C01%7Cchristian.koenig%40amd.com%7Ce88b51a2443741b0b56f08d7ab13da74%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C637165974076779365&amp;sdata=L7Hin%2Faw7vK9IYBaZn%2BVmWZKzjqTYBsvJ%2BIL80qB3M4%3D&amp;reserved=0
>>> _______________________________________________
>>> amd-gfx mailing list
>>> amd-gfx@lists.freedesktop.org
>>> https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Flists.freedesktop.org%2Fmailman%2Flistinfo%2Famd-gfx&amp;data=02%7C01%7Cchristian.koenig%40amd.com%7Ce88b51a2443741b0b56f08d7ab13da74%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C637165974076779365&amp;sdata=94EyD8X91MT5IVE8TN9%2FRYed8aIX6tN1Pvl8LJBkCeU%3D&amp;reserved=0
>> _______________________________________________
>> dri-devel mailing list
>> dri-devel@lists.freedesktop.org
>> https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Flists.freedesktop.org%2Fmailman%2Flistinfo%2Fdri-devel&amp;data=02%7C01%7Cchristian.koenig%40amd.com%7Ce88b51a2443741b0b56f08d7ab13da74%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C637165974076779365&amp;sdata=L7Hin%2Faw7vK9IYBaZn%2BVmWZKzjqTYBsvJ%2BIL80qB3M4%3D&amp;reserved=0

_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 125+ messages in thread

* Re: [PATCH v4] drm/scheduler: Avoid accessing freed bad job.
  2020-02-06 14:51           ` Christian König
@ 2020-02-06 15:49             ` Andrey Grodzovsky
  -1 siblings, 0 replies; 125+ messages in thread
From: Andrey Grodzovsky @ 2020-02-06 15:49 UTC (permalink / raw)
  To: Christian König, Alex Deucher
  Cc: Emily Deng, steven.price, amd-gfx list, Maling list - DRI developers


On 2/6/20 9:51 AM, Christian König wrote:
> Am 06.02.20 um 15:49 schrieb Alex Deucher:
>> On Thu, Feb 6, 2020 at 6:50 AM Christian König
>> <ckoenig.leichtzumerken@gmail.com> wrote:
>>> Am 06.02.20 um 12:10 schrieb Lucas Stach:
>>>> Hi all,
>>>>
>>>> On Mi, 2020-02-05 at 19:24 +0100, Lucas Stach wrote:
>>>>> Hi Andrey,
>>>>>
>>>>> This commit breaks all drivers, which may bail out of the timeout
>>>>> processing as they wish to extend the timeout (etnaviv, v3d).
>>>>>
>>>>> Those drivers currently just return from the timeout handler before
>>>>> calling drm_sched_stop(), which means with this commit applied we are
>>>>> removing the first job from the ring_mirror_list, but never put it
>>>>> back. This leads to jobs getting lost from the ring mirror, which 
>>>>> then
>>>>> causes quite a bit of fallout like unsignaled fences.
>>>>>
>>>>> Not sure yet what to do about it, we can either add a function to add
>>>>> the job back to the ring_mirror if the driver wants to extend the
>>>>> timeout, or we could look for another way to stop
>>>>> drm_sched_cleanup_jobs from freeing jobs that are currently in 
>>>>> timeout
>>>>> processing.
>>>> So after thinking about this a bit more my opinion is that we need to
>>>> revert this change for now and go back to the drawing board for the
>>>> scheduler timeout handling.
>>>>
>>>> Right now this starts to feel like a big midlayer mistake with all the
>>>> very intricate intertwining between the drivers and the scheduler. The
>>>> rules on when it's safe to manipulate the ring mirror and when
>>>> completed jobs are signaled and freed are not really well specified.
>>>> The fact that we need to mutate state in order to get rid of races
>>>> instead of having a single big "timeout processing is owner of the
>>>> scheduler state for now" is a big fat warning sign IMHO.
>>> Yes, that strongly feels like a hack to me as well. But I didn't had
>>> time and still haven't to take a closer look and suggest something 
>>> better.
>>>
>> In that case, can someone send me a revert?
>
> Well a revert would break our driver.
>
> The real solution is that somebody needs to sit down, gather ALL the 
> requirements and then come up with a solution which is clean and works 
> for everyone.
>
> Christian.


I can to take on this as indeed our general design on this becomes more 
and more entangled as GPU reset scenarios grow in complexity (at least 
in AMD driver). Currently I am on a high priority internal task which 
should take me around a week or 2 to finish and after that I can get to it.

Regarding temporary solution  - I looked into v3d and etnaviv use cases 
and we in AMD actually face the same scenario where we decide to skip HW 
reset if the guilty job did finish by the time we are processing the 
timeout  (see amdgpu_device_gpu_recover and skip_hw_reset goto) - the 
difference is we always call drm_sched_stop/start irrespectively of 
whether we are going to actually HW reset or not (same as extend 
timeout). I wonder if something like this can be done also for ve3 and 
etnaviv ?

Andrey


>
>>
>> Alex
>>
>>
>>> Christian.
>>>
>>>> It took me far longer than I'd like to admit to understand the failure
>>>> mode with fences not getting signaled after a GPU hang. The back and
>>>> forth between scheduler and driver code makes things really hard to
>>>> follow.
>>>>
>>>> Regards,
>>>> Lucas
>>>>
>>>>> Regards,
>>>>> Lucas
>>>>>
>>>>> On Mo, 2019-11-25 at 15:51 -0500, Andrey Grodzovsky wrote:
>>>>>> Problem:
>>>>>> Due to a race between drm_sched_cleanup_jobs in sched thread and
>>>>>> drm_sched_job_timedout in timeout work there is a possiblity that
>>>>>> bad job was already freed while still being accessed from the
>>>>>> timeout thread.
>>>>>>
>>>>>> Fix:
>>>>>> Instead of just peeking at the bad job in the mirror list
>>>>>> remove it from the list under lock and then put it back later when
>>>>>> we are garanteed no race with main sched thread is possible which
>>>>>> is after the thread is parked.
>>>>>>
>>>>>> v2: Lock around processing ring_mirror_list in 
>>>>>> drm_sched_cleanup_jobs.
>>>>>>
>>>>>> v3: Rebase on top of drm-misc-next. v2 is not needed anymore as
>>>>>> drm_sched_get_cleanup_job already has a lock there.
>>>>>>
>>>>>> v4: Fix comments to relfect latest code in drm-misc.
>>>>>>
>>>>>> Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
>>>>>> Reviewed-by: Christian König <christian.koenig@amd.com>
>>>>>> Tested-by: Emily Deng <Emily.Deng@amd.com>
>>>>>> ---
>>>>>>    drivers/gpu/drm/scheduler/sched_main.c | 27 
>>>>>> +++++++++++++++++++++++++++
>>>>>>    1 file changed, 27 insertions(+)
>>>>>>
>>>>>> diff --git a/drivers/gpu/drm/scheduler/sched_main.c 
>>>>>> b/drivers/gpu/drm/scheduler/sched_main.c
>>>>>> index 6774955..1bf9c40 100644
>>>>>> --- a/drivers/gpu/drm/scheduler/sched_main.c
>>>>>> +++ b/drivers/gpu/drm/scheduler/sched_main.c
>>>>>> @@ -284,10 +284,21 @@ static void drm_sched_job_timedout(struct 
>>>>>> work_struct *work)
>>>>>>      unsigned long flags;
>>>>>>
>>>>>>      sched = container_of(work, struct drm_gpu_scheduler, 
>>>>>> work_tdr.work);
>>>>>> +
>>>>>> +   /* Protects against concurrent deletion in 
>>>>>> drm_sched_get_cleanup_job */
>>>>>> +   spin_lock_irqsave(&sched->job_list_lock, flags);
>>>>>>      job = list_first_entry_or_null(&sched->ring_mirror_list,
>>>>>>                                     struct drm_sched_job, node);
>>>>>>
>>>>>>      if (job) {
>>>>>> +           /*
>>>>>> +            * Remove the bad job so it cannot be freed by 
>>>>>> concurrent
>>>>>> +            * drm_sched_cleanup_jobs. It will be reinserted back 
>>>>>> after sched->thread
>>>>>> +            * is parked at which point it's safe.
>>>>>> +            */
>>>>>> +           list_del_init(&job->node);
>>>>>> + spin_unlock_irqrestore(&sched->job_list_lock, flags);
>>>>>> +
>>>>>> job->sched->ops->timedout_job(job);
>>>>>>
>>>>>>              /*
>>>>>> @@ -298,6 +309,8 @@ static void drm_sched_job_timedout(struct 
>>>>>> work_struct *work)
>>>>>> job->sched->ops->free_job(job);
>>>>>>                      sched->free_guilty = false;
>>>>>>              }
>>>>>> +   } else {
>>>>>> + spin_unlock_irqrestore(&sched->job_list_lock, flags);
>>>>>>      }
>>>>>>
>>>>>>      spin_lock_irqsave(&sched->job_list_lock, flags);
>>>>>> @@ -370,6 +383,20 @@ void drm_sched_stop(struct drm_gpu_scheduler 
>>>>>> *sched, struct drm_sched_job *bad)
>>>>>>      kthread_park(sched->thread);
>>>>>>
>>>>>>      /*
>>>>>> +    * Reinsert back the bad job here - now it's safe as
>>>>>> +    * drm_sched_get_cleanup_job cannot race against us and 
>>>>>> release the
>>>>>> +    * bad job at this point - we parked (waited for) any in 
>>>>>> progress
>>>>>> +    * (earlier) cleanups and drm_sched_get_cleanup_job will not 
>>>>>> be called
>>>>>> +    * now until the scheduler thread is unparked.
>>>>>> +    */
>>>>>> +   if (bad && bad->sched == sched)
>>>>>> +           /*
>>>>>> +            * Add at the head of the queue to reflect it was the 
>>>>>> earliest
>>>>>> +            * job extracted.
>>>>>> +            */
>>>>>> +           list_add(&bad->node, &sched->ring_mirror_list);
>>>>>> +
>>>>>> +   /*
>>>>>>       * Iterate the job list from later to  earlier one and 
>>>>>> either deactive
>>>>>>       * their HW callbacks or remove them from mirror list if 
>>>>>> they already
>>>>>>       * signaled.
>>>>> _______________________________________________
>>>>> dri-devel mailing list
>>>>> dri-devel@lists.freedesktop.org
>>>>> https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Flists.freedesktop.org%2Fmailman%2Flistinfo%2Fdri-devel&amp;data=02%7C01%7Cchristian.koenig%40amd.com%7Ce88b51a2443741b0b56f08d7ab13da74%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C637165974076779365&amp;sdata=L7Hin%2Faw7vK9IYBaZn%2BVmWZKzjqTYBsvJ%2BIL80qB3M4%3D&amp;reserved=0 
>>>>>
>>>> _______________________________________________
>>>> amd-gfx mailing list
>>>> amd-gfx@lists.freedesktop.org
>>>> https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Flists.freedesktop.org%2Fmailman%2Flistinfo%2Famd-gfx&amp;data=02%7C01%7Cchristian.koenig%40amd.com%7Ce88b51a2443741b0b56f08d7ab13da74%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C637165974076779365&amp;sdata=94EyD8X91MT5IVE8TN9%2FRYed8aIX6tN1Pvl8LJBkCeU%3D&amp;reserved=0 
>>>>
>>> _______________________________________________
>>> dri-devel mailing list
>>> dri-devel@lists.freedesktop.org
>>> https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Flists.freedesktop.org%2Fmailman%2Flistinfo%2Fdri-devel&amp;data=02%7C01%7Cchristian.koenig%40amd.com%7Ce88b51a2443741b0b56f08d7ab13da74%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C637165974076779365&amp;sdata=L7Hin%2Faw7vK9IYBaZn%2BVmWZKzjqTYBsvJ%2BIL80qB3M4%3D&amp;reserved=0 
>>>
>
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 125+ messages in thread

* Re: [PATCH v4] drm/scheduler: Avoid accessing freed bad job.
@ 2020-02-06 15:49             ` Andrey Grodzovsky
  0 siblings, 0 replies; 125+ messages in thread
From: Andrey Grodzovsky @ 2020-02-06 15:49 UTC (permalink / raw)
  To: Christian König, Alex Deucher
  Cc: Emily Deng, steven.price, amd-gfx list,
	Maling list - DRI developers, Lucas Stach


On 2/6/20 9:51 AM, Christian König wrote:
> Am 06.02.20 um 15:49 schrieb Alex Deucher:
>> On Thu, Feb 6, 2020 at 6:50 AM Christian König
>> <ckoenig.leichtzumerken@gmail.com> wrote:
>>> Am 06.02.20 um 12:10 schrieb Lucas Stach:
>>>> Hi all,
>>>>
>>>> On Mi, 2020-02-05 at 19:24 +0100, Lucas Stach wrote:
>>>>> Hi Andrey,
>>>>>
>>>>> This commit breaks all drivers, which may bail out of the timeout
>>>>> processing as they wish to extend the timeout (etnaviv, v3d).
>>>>>
>>>>> Those drivers currently just return from the timeout handler before
>>>>> calling drm_sched_stop(), which means with this commit applied we are
>>>>> removing the first job from the ring_mirror_list, but never put it
>>>>> back. This leads to jobs getting lost from the ring mirror, which 
>>>>> then
>>>>> causes quite a bit of fallout like unsignaled fences.
>>>>>
>>>>> Not sure yet what to do about it, we can either add a function to add
>>>>> the job back to the ring_mirror if the driver wants to extend the
>>>>> timeout, or we could look for another way to stop
>>>>> drm_sched_cleanup_jobs from freeing jobs that are currently in 
>>>>> timeout
>>>>> processing.
>>>> So after thinking about this a bit more my opinion is that we need to
>>>> revert this change for now and go back to the drawing board for the
>>>> scheduler timeout handling.
>>>>
>>>> Right now this starts to feel like a big midlayer mistake with all the
>>>> very intricate intertwining between the drivers and the scheduler. The
>>>> rules on when it's safe to manipulate the ring mirror and when
>>>> completed jobs are signaled and freed are not really well specified.
>>>> The fact that we need to mutate state in order to get rid of races
>>>> instead of having a single big "timeout processing is owner of the
>>>> scheduler state for now" is a big fat warning sign IMHO.
>>> Yes, that strongly feels like a hack to me as well. But I didn't had
>>> time and still haven't to take a closer look and suggest something 
>>> better.
>>>
>> In that case, can someone send me a revert?
>
> Well a revert would break our driver.
>
> The real solution is that somebody needs to sit down, gather ALL the 
> requirements and then come up with a solution which is clean and works 
> for everyone.
>
> Christian.


I can to take on this as indeed our general design on this becomes more 
and more entangled as GPU reset scenarios grow in complexity (at least 
in AMD driver). Currently I am on a high priority internal task which 
should take me around a week or 2 to finish and after that I can get to it.

Regarding temporary solution  - I looked into v3d and etnaviv use cases 
and we in AMD actually face the same scenario where we decide to skip HW 
reset if the guilty job did finish by the time we are processing the 
timeout  (see amdgpu_device_gpu_recover and skip_hw_reset goto) - the 
difference is we always call drm_sched_stop/start irrespectively of 
whether we are going to actually HW reset or not (same as extend 
timeout). I wonder if something like this can be done also for ve3 and 
etnaviv ?

Andrey


>
>>
>> Alex
>>
>>
>>> Christian.
>>>
>>>> It took me far longer than I'd like to admit to understand the failure
>>>> mode with fences not getting signaled after a GPU hang. The back and
>>>> forth between scheduler and driver code makes things really hard to
>>>> follow.
>>>>
>>>> Regards,
>>>> Lucas
>>>>
>>>>> Regards,
>>>>> Lucas
>>>>>
>>>>> On Mo, 2019-11-25 at 15:51 -0500, Andrey Grodzovsky wrote:
>>>>>> Problem:
>>>>>> Due to a race between drm_sched_cleanup_jobs in sched thread and
>>>>>> drm_sched_job_timedout in timeout work there is a possiblity that
>>>>>> bad job was already freed while still being accessed from the
>>>>>> timeout thread.
>>>>>>
>>>>>> Fix:
>>>>>> Instead of just peeking at the bad job in the mirror list
>>>>>> remove it from the list under lock and then put it back later when
>>>>>> we are garanteed no race with main sched thread is possible which
>>>>>> is after the thread is parked.
>>>>>>
>>>>>> v2: Lock around processing ring_mirror_list in 
>>>>>> drm_sched_cleanup_jobs.
>>>>>>
>>>>>> v3: Rebase on top of drm-misc-next. v2 is not needed anymore as
>>>>>> drm_sched_get_cleanup_job already has a lock there.
>>>>>>
>>>>>> v4: Fix comments to relfect latest code in drm-misc.
>>>>>>
>>>>>> Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
>>>>>> Reviewed-by: Christian König <christian.koenig@amd.com>
>>>>>> Tested-by: Emily Deng <Emily.Deng@amd.com>
>>>>>> ---
>>>>>>    drivers/gpu/drm/scheduler/sched_main.c | 27 
>>>>>> +++++++++++++++++++++++++++
>>>>>>    1 file changed, 27 insertions(+)
>>>>>>
>>>>>> diff --git a/drivers/gpu/drm/scheduler/sched_main.c 
>>>>>> b/drivers/gpu/drm/scheduler/sched_main.c
>>>>>> index 6774955..1bf9c40 100644
>>>>>> --- a/drivers/gpu/drm/scheduler/sched_main.c
>>>>>> +++ b/drivers/gpu/drm/scheduler/sched_main.c
>>>>>> @@ -284,10 +284,21 @@ static void drm_sched_job_timedout(struct 
>>>>>> work_struct *work)
>>>>>>      unsigned long flags;
>>>>>>
>>>>>>      sched = container_of(work, struct drm_gpu_scheduler, 
>>>>>> work_tdr.work);
>>>>>> +
>>>>>> +   /* Protects against concurrent deletion in 
>>>>>> drm_sched_get_cleanup_job */
>>>>>> +   spin_lock_irqsave(&sched->job_list_lock, flags);
>>>>>>      job = list_first_entry_or_null(&sched->ring_mirror_list,
>>>>>>                                     struct drm_sched_job, node);
>>>>>>
>>>>>>      if (job) {
>>>>>> +           /*
>>>>>> +            * Remove the bad job so it cannot be freed by 
>>>>>> concurrent
>>>>>> +            * drm_sched_cleanup_jobs. It will be reinserted back 
>>>>>> after sched->thread
>>>>>> +            * is parked at which point it's safe.
>>>>>> +            */
>>>>>> +           list_del_init(&job->node);
>>>>>> + spin_unlock_irqrestore(&sched->job_list_lock, flags);
>>>>>> +
>>>>>> job->sched->ops->timedout_job(job);
>>>>>>
>>>>>>              /*
>>>>>> @@ -298,6 +309,8 @@ static void drm_sched_job_timedout(struct 
>>>>>> work_struct *work)
>>>>>> job->sched->ops->free_job(job);
>>>>>>                      sched->free_guilty = false;
>>>>>>              }
>>>>>> +   } else {
>>>>>> + spin_unlock_irqrestore(&sched->job_list_lock, flags);
>>>>>>      }
>>>>>>
>>>>>>      spin_lock_irqsave(&sched->job_list_lock, flags);
>>>>>> @@ -370,6 +383,20 @@ void drm_sched_stop(struct drm_gpu_scheduler 
>>>>>> *sched, struct drm_sched_job *bad)
>>>>>>      kthread_park(sched->thread);
>>>>>>
>>>>>>      /*
>>>>>> +    * Reinsert back the bad job here - now it's safe as
>>>>>> +    * drm_sched_get_cleanup_job cannot race against us and 
>>>>>> release the
>>>>>> +    * bad job at this point - we parked (waited for) any in 
>>>>>> progress
>>>>>> +    * (earlier) cleanups and drm_sched_get_cleanup_job will not 
>>>>>> be called
>>>>>> +    * now until the scheduler thread is unparked.
>>>>>> +    */
>>>>>> +   if (bad && bad->sched == sched)
>>>>>> +           /*
>>>>>> +            * Add at the head of the queue to reflect it was the 
>>>>>> earliest
>>>>>> +            * job extracted.
>>>>>> +            */
>>>>>> +           list_add(&bad->node, &sched->ring_mirror_list);
>>>>>> +
>>>>>> +   /*
>>>>>>       * Iterate the job list from later to  earlier one and 
>>>>>> either deactive
>>>>>>       * their HW callbacks or remove them from mirror list if 
>>>>>> they already
>>>>>>       * signaled.
>>>>> _______________________________________________
>>>>> dri-devel mailing list
>>>>> dri-devel@lists.freedesktop.org
>>>>> https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Flists.freedesktop.org%2Fmailman%2Flistinfo%2Fdri-devel&amp;data=02%7C01%7Cchristian.koenig%40amd.com%7Ce88b51a2443741b0b56f08d7ab13da74%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C637165974076779365&amp;sdata=L7Hin%2Faw7vK9IYBaZn%2BVmWZKzjqTYBsvJ%2BIL80qB3M4%3D&amp;reserved=0 
>>>>>
>>>> _______________________________________________
>>>> amd-gfx mailing list
>>>> amd-gfx@lists.freedesktop.org
>>>> https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Flists.freedesktop.org%2Fmailman%2Flistinfo%2Famd-gfx&amp;data=02%7C01%7Cchristian.koenig%40amd.com%7Ce88b51a2443741b0b56f08d7ab13da74%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C637165974076779365&amp;sdata=94EyD8X91MT5IVE8TN9%2FRYed8aIX6tN1Pvl8LJBkCeU%3D&amp;reserved=0 
>>>>
>>> _______________________________________________
>>> dri-devel mailing list
>>> dri-devel@lists.freedesktop.org
>>> https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Flists.freedesktop.org%2Fmailman%2Flistinfo%2Fdri-devel&amp;data=02%7C01%7Cchristian.koenig%40amd.com%7Ce88b51a2443741b0b56f08d7ab13da74%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C637165974076779365&amp;sdata=L7Hin%2Faw7vK9IYBaZn%2BVmWZKzjqTYBsvJ%2BIL80qB3M4%3D&amp;reserved=0 
>>>
>
_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 125+ messages in thread

* Re: [PATCH v4] drm/scheduler: Avoid accessing freed bad job.
  2020-02-06 14:51           ` Christian König
@ 2020-02-07 15:26             ` Daniel Vetter
  -1 siblings, 0 replies; 125+ messages in thread
From: Daniel Vetter @ 2020-02-07 15:26 UTC (permalink / raw)
  To: Christian König, Dave Airlie
  Cc: Emily Deng, Maling list - DRI developers, amd-gfx list, Steven Price

On Thu, Feb 6, 2020 at 3:51 PM Christian König <christian.koenig@amd.com> wrote:
>
> Am 06.02.20 um 15:49 schrieb Alex Deucher:
> > On Thu, Feb 6, 2020 at 6:50 AM Christian König
> > <ckoenig.leichtzumerken@gmail.com> wrote:
> >> Am 06.02.20 um 12:10 schrieb Lucas Stach:
> >>> Hi all,
> >>>
> >>> On Mi, 2020-02-05 at 19:24 +0100, Lucas Stach wrote:
> >>>> Hi Andrey,
> >>>>
> >>>> This commit breaks all drivers, which may bail out of the timeout
> >>>> processing as they wish to extend the timeout (etnaviv, v3d).
> >>>>
> >>>> Those drivers currently just return from the timeout handler before
> >>>> calling drm_sched_stop(), which means with this commit applied we are
> >>>> removing the first job from the ring_mirror_list, but never put it
> >>>> back. This leads to jobs getting lost from the ring mirror, which then
> >>>> causes quite a bit of fallout like unsignaled fences.
> >>>>
> >>>> Not sure yet what to do about it, we can either add a function to add
> >>>> the job back to the ring_mirror if the driver wants to extend the
> >>>> timeout, or we could look for another way to stop
> >>>> drm_sched_cleanup_jobs from freeing jobs that are currently in timeout
> >>>> processing.
> >>> So after thinking about this a bit more my opinion is that we need to
> >>> revert this change for now and go back to the drawing board for the
> >>> scheduler timeout handling.
> >>>
> >>> Right now this starts to feel like a big midlayer mistake with all the
> >>> very intricate intertwining between the drivers and the scheduler. The
> >>> rules on when it's safe to manipulate the ring mirror and when
> >>> completed jobs are signaled and freed are not really well specified.
> >>> The fact that we need to mutate state in order to get rid of races
> >>> instead of having a single big "timeout processing is owner of the
> >>> scheduler state for now" is a big fat warning sign IMHO.
> >> Yes, that strongly feels like a hack to me as well. But I didn't had
> >> time and still haven't to take a closer look and suggest something better.
> >>
> > In that case, can someone send me a revert?
>
> Well a revert would break our driver.
>
> The real solution is that somebody needs to sit down, gather ALL the
> requirements and then come up with a solution which is clean and works
> for everyone.

Uh generally oldest regression wins. As much as it sucks, but if we
don't do that then there's just too much room for arguing and maybe it
gets fixed in the next big rework ...
-Daniel

>
> Christian.
>
> >
> > Alex
> >
> >
> >> Christian.
> >>
> >>> It took me far longer than I'd like to admit to understand the failure
> >>> mode with fences not getting signaled after a GPU hang. The back and
> >>> forth between scheduler and driver code makes things really hard to
> >>> follow.
> >>>
> >>> Regards,
> >>> Lucas
> >>>
> >>>> Regards,
> >>>> Lucas
> >>>>
> >>>> On Mo, 2019-11-25 at 15:51 -0500, Andrey Grodzovsky wrote:
> >>>>> Problem:
> >>>>> Due to a race between drm_sched_cleanup_jobs in sched thread and
> >>>>> drm_sched_job_timedout in timeout work there is a possiblity that
> >>>>> bad job was already freed while still being accessed from the
> >>>>> timeout thread.
> >>>>>
> >>>>> Fix:
> >>>>> Instead of just peeking at the bad job in the mirror list
> >>>>> remove it from the list under lock and then put it back later when
> >>>>> we are garanteed no race with main sched thread is possible which
> >>>>> is after the thread is parked.
> >>>>>
> >>>>> v2: Lock around processing ring_mirror_list in drm_sched_cleanup_jobs.
> >>>>>
> >>>>> v3: Rebase on top of drm-misc-next. v2 is not needed anymore as
> >>>>> drm_sched_get_cleanup_job already has a lock there.
> >>>>>
> >>>>> v4: Fix comments to relfect latest code in drm-misc.
> >>>>>
> >>>>> Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
> >>>>> Reviewed-by: Christian König <christian.koenig@amd.com>
> >>>>> Tested-by: Emily Deng <Emily.Deng@amd.com>
> >>>>> ---
> >>>>>    drivers/gpu/drm/scheduler/sched_main.c | 27 +++++++++++++++++++++++++++
> >>>>>    1 file changed, 27 insertions(+)
> >>>>>
> >>>>> diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c
> >>>>> index 6774955..1bf9c40 100644
> >>>>> --- a/drivers/gpu/drm/scheduler/sched_main.c
> >>>>> +++ b/drivers/gpu/drm/scheduler/sched_main.c
> >>>>> @@ -284,10 +284,21 @@ static void drm_sched_job_timedout(struct work_struct *work)
> >>>>>      unsigned long flags;
> >>>>>
> >>>>>      sched = container_of(work, struct drm_gpu_scheduler, work_tdr.work);
> >>>>> +
> >>>>> +   /* Protects against concurrent deletion in drm_sched_get_cleanup_job */
> >>>>> +   spin_lock_irqsave(&sched->job_list_lock, flags);
> >>>>>      job = list_first_entry_or_null(&sched->ring_mirror_list,
> >>>>>                                     struct drm_sched_job, node);
> >>>>>
> >>>>>      if (job) {
> >>>>> +           /*
> >>>>> +            * Remove the bad job so it cannot be freed by concurrent
> >>>>> +            * drm_sched_cleanup_jobs. It will be reinserted back after sched->thread
> >>>>> +            * is parked at which point it's safe.
> >>>>> +            */
> >>>>> +           list_del_init(&job->node);
> >>>>> +           spin_unlock_irqrestore(&sched->job_list_lock, flags);
> >>>>> +
> >>>>>              job->sched->ops->timedout_job(job);
> >>>>>
> >>>>>              /*
> >>>>> @@ -298,6 +309,8 @@ static void drm_sched_job_timedout(struct work_struct *work)
> >>>>>                      job->sched->ops->free_job(job);
> >>>>>                      sched->free_guilty = false;
> >>>>>              }
> >>>>> +   } else {
> >>>>> +           spin_unlock_irqrestore(&sched->job_list_lock, flags);
> >>>>>      }
> >>>>>
> >>>>>      spin_lock_irqsave(&sched->job_list_lock, flags);
> >>>>> @@ -370,6 +383,20 @@ void drm_sched_stop(struct drm_gpu_scheduler *sched, struct drm_sched_job *bad)
> >>>>>      kthread_park(sched->thread);
> >>>>>
> >>>>>      /*
> >>>>> +    * Reinsert back the bad job here - now it's safe as
> >>>>> +    * drm_sched_get_cleanup_job cannot race against us and release the
> >>>>> +    * bad job at this point - we parked (waited for) any in progress
> >>>>> +    * (earlier) cleanups and drm_sched_get_cleanup_job will not be called
> >>>>> +    * now until the scheduler thread is unparked.
> >>>>> +    */
> >>>>> +   if (bad && bad->sched == sched)
> >>>>> +           /*
> >>>>> +            * Add at the head of the queue to reflect it was the earliest
> >>>>> +            * job extracted.
> >>>>> +            */
> >>>>> +           list_add(&bad->node, &sched->ring_mirror_list);
> >>>>> +
> >>>>> +   /*
> >>>>>       * Iterate the job list from later to  earlier one and either deactive
> >>>>>       * their HW callbacks or remove them from mirror list if they already
> >>>>>       * signaled.
> >>>> _______________________________________________
> >>>> dri-devel mailing list
> >>>> dri-devel@lists.freedesktop.org
> >>>> https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Flists.freedesktop.org%2Fmailman%2Flistinfo%2Fdri-devel&amp;data=02%7C01%7Cchristian.koenig%40amd.com%7Ce88b51a2443741b0b56f08d7ab13da74%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C637165974076779365&amp;sdata=L7Hin%2Faw7vK9IYBaZn%2BVmWZKzjqTYBsvJ%2BIL80qB3M4%3D&amp;reserved=0
> >>> _______________________________________________
> >>> amd-gfx mailing list
> >>> amd-gfx@lists.freedesktop.org
> >>> https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Flists.freedesktop.org%2Fmailman%2Flistinfo%2Famd-gfx&amp;data=02%7C01%7Cchristian.koenig%40amd.com%7Ce88b51a2443741b0b56f08d7ab13da74%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C637165974076779365&amp;sdata=94EyD8X91MT5IVE8TN9%2FRYed8aIX6tN1Pvl8LJBkCeU%3D&amp;reserved=0
> >> _______________________________________________
> >> dri-devel mailing list
> >> dri-devel@lists.freedesktop.org
> >> https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Flists.freedesktop.org%2Fmailman%2Flistinfo%2Fdri-devel&amp;data=02%7C01%7Cchristian.koenig%40amd.com%7Ce88b51a2443741b0b56f08d7ab13da74%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C637165974076779365&amp;sdata=L7Hin%2Faw7vK9IYBaZn%2BVmWZKzjqTYBsvJ%2BIL80qB3M4%3D&amp;reserved=0
>
> _______________________________________________
> dri-devel mailing list
> dri-devel@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/dri-devel



-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 125+ messages in thread

* Re: [PATCH v4] drm/scheduler: Avoid accessing freed bad job.
@ 2020-02-07 15:26             ` Daniel Vetter
  0 siblings, 0 replies; 125+ messages in thread
From: Daniel Vetter @ 2020-02-07 15:26 UTC (permalink / raw)
  To: Christian König, Dave Airlie
  Cc: Alex Deucher, Emily Deng, Maling list - DRI developers,
	amd-gfx list, Steven Price

On Thu, Feb 6, 2020 at 3:51 PM Christian König <christian.koenig@amd.com> wrote:
>
> Am 06.02.20 um 15:49 schrieb Alex Deucher:
> > On Thu, Feb 6, 2020 at 6:50 AM Christian König
> > <ckoenig.leichtzumerken@gmail.com> wrote:
> >> Am 06.02.20 um 12:10 schrieb Lucas Stach:
> >>> Hi all,
> >>>
> >>> On Mi, 2020-02-05 at 19:24 +0100, Lucas Stach wrote:
> >>>> Hi Andrey,
> >>>>
> >>>> This commit breaks all drivers, which may bail out of the timeout
> >>>> processing as they wish to extend the timeout (etnaviv, v3d).
> >>>>
> >>>> Those drivers currently just return from the timeout handler before
> >>>> calling drm_sched_stop(), which means with this commit applied we are
> >>>> removing the first job from the ring_mirror_list, but never put it
> >>>> back. This leads to jobs getting lost from the ring mirror, which then
> >>>> causes quite a bit of fallout like unsignaled fences.
> >>>>
> >>>> Not sure yet what to do about it, we can either add a function to add
> >>>> the job back to the ring_mirror if the driver wants to extend the
> >>>> timeout, or we could look for another way to stop
> >>>> drm_sched_cleanup_jobs from freeing jobs that are currently in timeout
> >>>> processing.
> >>> So after thinking about this a bit more my opinion is that we need to
> >>> revert this change for now and go back to the drawing board for the
> >>> scheduler timeout handling.
> >>>
> >>> Right now this starts to feel like a big midlayer mistake with all the
> >>> very intricate intertwining between the drivers and the scheduler. The
> >>> rules on when it's safe to manipulate the ring mirror and when
> >>> completed jobs are signaled and freed are not really well specified.
> >>> The fact that we need to mutate state in order to get rid of races
> >>> instead of having a single big "timeout processing is owner of the
> >>> scheduler state for now" is a big fat warning sign IMHO.
> >> Yes, that strongly feels like a hack to me as well. But I didn't had
> >> time and still haven't to take a closer look and suggest something better.
> >>
> > In that case, can someone send me a revert?
>
> Well a revert would break our driver.
>
> The real solution is that somebody needs to sit down, gather ALL the
> requirements and then come up with a solution which is clean and works
> for everyone.

Uh generally oldest regression wins. As much as it sucks, but if we
don't do that then there's just too much room for arguing and maybe it
gets fixed in the next big rework ...
-Daniel

>
> Christian.
>
> >
> > Alex
> >
> >
> >> Christian.
> >>
> >>> It took me far longer than I'd like to admit to understand the failure
> >>> mode with fences not getting signaled after a GPU hang. The back and
> >>> forth between scheduler and driver code makes things really hard to
> >>> follow.
> >>>
> >>> Regards,
> >>> Lucas
> >>>
> >>>> Regards,
> >>>> Lucas
> >>>>
> >>>> On Mo, 2019-11-25 at 15:51 -0500, Andrey Grodzovsky wrote:
> >>>>> Problem:
> >>>>> Due to a race between drm_sched_cleanup_jobs in sched thread and
> >>>>> drm_sched_job_timedout in timeout work there is a possiblity that
> >>>>> bad job was already freed while still being accessed from the
> >>>>> timeout thread.
> >>>>>
> >>>>> Fix:
> >>>>> Instead of just peeking at the bad job in the mirror list
> >>>>> remove it from the list under lock and then put it back later when
> >>>>> we are garanteed no race with main sched thread is possible which
> >>>>> is after the thread is parked.
> >>>>>
> >>>>> v2: Lock around processing ring_mirror_list in drm_sched_cleanup_jobs.
> >>>>>
> >>>>> v3: Rebase on top of drm-misc-next. v2 is not needed anymore as
> >>>>> drm_sched_get_cleanup_job already has a lock there.
> >>>>>
> >>>>> v4: Fix comments to relfect latest code in drm-misc.
> >>>>>
> >>>>> Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
> >>>>> Reviewed-by: Christian König <christian.koenig@amd.com>
> >>>>> Tested-by: Emily Deng <Emily.Deng@amd.com>
> >>>>> ---
> >>>>>    drivers/gpu/drm/scheduler/sched_main.c | 27 +++++++++++++++++++++++++++
> >>>>>    1 file changed, 27 insertions(+)
> >>>>>
> >>>>> diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c
> >>>>> index 6774955..1bf9c40 100644
> >>>>> --- a/drivers/gpu/drm/scheduler/sched_main.c
> >>>>> +++ b/drivers/gpu/drm/scheduler/sched_main.c
> >>>>> @@ -284,10 +284,21 @@ static void drm_sched_job_timedout(struct work_struct *work)
> >>>>>      unsigned long flags;
> >>>>>
> >>>>>      sched = container_of(work, struct drm_gpu_scheduler, work_tdr.work);
> >>>>> +
> >>>>> +   /* Protects against concurrent deletion in drm_sched_get_cleanup_job */
> >>>>> +   spin_lock_irqsave(&sched->job_list_lock, flags);
> >>>>>      job = list_first_entry_or_null(&sched->ring_mirror_list,
> >>>>>                                     struct drm_sched_job, node);
> >>>>>
> >>>>>      if (job) {
> >>>>> +           /*
> >>>>> +            * Remove the bad job so it cannot be freed by concurrent
> >>>>> +            * drm_sched_cleanup_jobs. It will be reinserted back after sched->thread
> >>>>> +            * is parked at which point it's safe.
> >>>>> +            */
> >>>>> +           list_del_init(&job->node);
> >>>>> +           spin_unlock_irqrestore(&sched->job_list_lock, flags);
> >>>>> +
> >>>>>              job->sched->ops->timedout_job(job);
> >>>>>
> >>>>>              /*
> >>>>> @@ -298,6 +309,8 @@ static void drm_sched_job_timedout(struct work_struct *work)
> >>>>>                      job->sched->ops->free_job(job);
> >>>>>                      sched->free_guilty = false;
> >>>>>              }
> >>>>> +   } else {
> >>>>> +           spin_unlock_irqrestore(&sched->job_list_lock, flags);
> >>>>>      }
> >>>>>
> >>>>>      spin_lock_irqsave(&sched->job_list_lock, flags);
> >>>>> @@ -370,6 +383,20 @@ void drm_sched_stop(struct drm_gpu_scheduler *sched, struct drm_sched_job *bad)
> >>>>>      kthread_park(sched->thread);
> >>>>>
> >>>>>      /*
> >>>>> +    * Reinsert back the bad job here - now it's safe as
> >>>>> +    * drm_sched_get_cleanup_job cannot race against us and release the
> >>>>> +    * bad job at this point - we parked (waited for) any in progress
> >>>>> +    * (earlier) cleanups and drm_sched_get_cleanup_job will not be called
> >>>>> +    * now until the scheduler thread is unparked.
> >>>>> +    */
> >>>>> +   if (bad && bad->sched == sched)
> >>>>> +           /*
> >>>>> +            * Add at the head of the queue to reflect it was the earliest
> >>>>> +            * job extracted.
> >>>>> +            */
> >>>>> +           list_add(&bad->node, &sched->ring_mirror_list);
> >>>>> +
> >>>>> +   /*
> >>>>>       * Iterate the job list from later to  earlier one and either deactive
> >>>>>       * their HW callbacks or remove them from mirror list if they already
> >>>>>       * signaled.
> >>>> _______________________________________________
> >>>> dri-devel mailing list
> >>>> dri-devel@lists.freedesktop.org
> >>>> https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Flists.freedesktop.org%2Fmailman%2Flistinfo%2Fdri-devel&amp;data=02%7C01%7Cchristian.koenig%40amd.com%7Ce88b51a2443741b0b56f08d7ab13da74%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C637165974076779365&amp;sdata=L7Hin%2Faw7vK9IYBaZn%2BVmWZKzjqTYBsvJ%2BIL80qB3M4%3D&amp;reserved=0
> >>> _______________________________________________
> >>> amd-gfx mailing list
> >>> amd-gfx@lists.freedesktop.org
> >>> https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Flists.freedesktop.org%2Fmailman%2Flistinfo%2Famd-gfx&amp;data=02%7C01%7Cchristian.koenig%40amd.com%7Ce88b51a2443741b0b56f08d7ab13da74%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C637165974076779365&amp;sdata=94EyD8X91MT5IVE8TN9%2FRYed8aIX6tN1Pvl8LJBkCeU%3D&amp;reserved=0
> >> _______________________________________________
> >> dri-devel mailing list
> >> dri-devel@lists.freedesktop.org
> >> https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Flists.freedesktop.org%2Fmailman%2Flistinfo%2Fdri-devel&amp;data=02%7C01%7Cchristian.koenig%40amd.com%7Ce88b51a2443741b0b56f08d7ab13da74%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C637165974076779365&amp;sdata=L7Hin%2Faw7vK9IYBaZn%2BVmWZKzjqTYBsvJ%2BIL80qB3M4%3D&amp;reserved=0
>
> _______________________________________________
> dri-devel mailing list
> dri-devel@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/dri-devel



-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch
_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 125+ messages in thread

* Re: [PATCH v4] drm/scheduler: Avoid accessing freed bad job.
  2020-02-06 15:49             ` Andrey Grodzovsky
@ 2020-02-10 16:55               ` Andrey Grodzovsky
  -1 siblings, 0 replies; 125+ messages in thread
From: Andrey Grodzovsky @ 2020-02-10 16:55 UTC (permalink / raw)
  To: Christian König, Alex Deucher
  Cc: Emily Deng, steven.price, amd-gfx list, Maling list - DRI developers


[-- Attachment #1.1: Type: text/plain, Size: 1346 bytes --]

Lucas - Ping on my question and also I attached this temporary solution 
for etnaviv to clarify my point. If that something acceptable for now at 
least i can do the same for v3d where it requires a bit more code changes.

Andrey

On 2/6/20 10:49 AM, Andrey Grodzovsky wrote:
>> Well a revert would break our driver.
>>
>> The real solution is that somebody needs to sit down, gather ALL the 
>> requirements and then come up with a solution which is clean and 
>> works for everyone.
>>
>> Christian.
>
>
> I can to take on this as indeed our general design on this becomes 
> more and more entangled as GPU reset scenarios grow in complexity (at 
> least in AMD driver). Currently I am on a high priority internal task 
> which should take me around a week or 2 to finish and after that I can 
> get to it.
>
> Regarding temporary solution  - I looked into v3d and etnaviv use 
> cases and we in AMD actually face the same scenario where we decide to 
> skip HW reset if the guilty job did finish by the time we are 
> processing the timeout  (see amdgpu_device_gpu_recover and 
> skip_hw_reset goto) - the difference is we always call 
> drm_sched_stop/start irrespectively of whether we are going to 
> actually HW reset or not (same as extend timeout). I wonder if 
> something like this can be done also for ve3 and etnaviv ?
>
> Andrey 

[-- Attachment #1.2: Type: text/html, Size: 1965 bytes --]

[-- Attachment #2: 0001-drm-etnaviv-Always-execute-sched-stop-and-start.patch --]
[-- Type: text/x-patch, Size: 1811 bytes --]

From c3fa87856608463f14dddb03346c31054f3137c9 Mon Sep 17 00:00:00 2001
From: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Date: Mon, 10 Feb 2020 11:44:39 -0500
Subject: drm/etnaviv: Always execute sched stop and start.

During job timeout always stop and restart the scheduler even
if no HW resetis taking place.

Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
---
 drivers/gpu/drm/etnaviv/etnaviv_sched.c | 15 ++++++++++-----
 1 file changed, 10 insertions(+), 5 deletions(-)

diff --git a/drivers/gpu/drm/etnaviv/etnaviv_sched.c b/drivers/gpu/drm/etnaviv/etnaviv_sched.c
index 4e3e95d..270caa8 100644
--- a/drivers/gpu/drm/etnaviv/etnaviv_sched.c
+++ b/drivers/gpu/drm/etnaviv/etnaviv_sched.c
@@ -89,12 +89,17 @@ static void etnaviv_sched_timedout_job(struct drm_sched_job *sched_job)
 	u32 dma_addr;
 	int change;
 
+
+
+	/* block scheduler */
+	drm_sched_stop(&gpu->sched, sched_job);
+
 	/*
 	 * If the GPU managed to complete this jobs fence, the timout is
 	 * spurious. Bail out.
 	 */
 	if (dma_fence_is_signaled(submit->out_fence))
-		return;
+		goto skip_hw_reset;
 
 	/*
 	 * If the GPU is still making forward progress on the front-end (which
@@ -105,12 +110,9 @@ static void etnaviv_sched_timedout_job(struct drm_sched_job *sched_job)
 	change = dma_addr - gpu->hangcheck_dma_addr;
 	if (change < 0 || change > 16) {
 		gpu->hangcheck_dma_addr = dma_addr;
-		return;
+		goto skip_hw_reset;
 	}
 
-	/* block scheduler */
-	drm_sched_stop(&gpu->sched, sched_job);
-
 	if(sched_job)
 		drm_sched_increase_karma(sched_job);
 
@@ -120,6 +122,9 @@ static void etnaviv_sched_timedout_job(struct drm_sched_job *sched_job)
 
 	drm_sched_resubmit_jobs(&gpu->sched);
 
+
+skip_hw_reset:
+
 	/* restart scheduler after GPU is usable again */
 	drm_sched_start(&gpu->sched, true);
 }
-- 
2.7.4


[-- Attachment #3: Type: text/plain, Size: 160 bytes --]

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply related	[flat|nested] 125+ messages in thread

* Re: [PATCH v4] drm/scheduler: Avoid accessing freed bad job.
@ 2020-02-10 16:55               ` Andrey Grodzovsky
  0 siblings, 0 replies; 125+ messages in thread
From: Andrey Grodzovsky @ 2020-02-10 16:55 UTC (permalink / raw)
  To: Christian König, Alex Deucher
  Cc: Emily Deng, steven.price, amd-gfx list,
	Maling list - DRI developers, Lucas Stach


[-- Attachment #1.1: Type: text/plain, Size: 1346 bytes --]

Lucas - Ping on my question and also I attached this temporary solution 
for etnaviv to clarify my point. If that something acceptable for now at 
least i can do the same for v3d where it requires a bit more code changes.

Andrey

On 2/6/20 10:49 AM, Andrey Grodzovsky wrote:
>> Well a revert would break our driver.
>>
>> The real solution is that somebody needs to sit down, gather ALL the 
>> requirements and then come up with a solution which is clean and 
>> works for everyone.
>>
>> Christian.
>
>
> I can to take on this as indeed our general design on this becomes 
> more and more entangled as GPU reset scenarios grow in complexity (at 
> least in AMD driver). Currently I am on a high priority internal task 
> which should take me around a week or 2 to finish and after that I can 
> get to it.
>
> Regarding temporary solution  - I looked into v3d and etnaviv use 
> cases and we in AMD actually face the same scenario where we decide to 
> skip HW reset if the guilty job did finish by the time we are 
> processing the timeout  (see amdgpu_device_gpu_recover and 
> skip_hw_reset goto) - the difference is we always call 
> drm_sched_stop/start irrespectively of whether we are going to 
> actually HW reset or not (same as extend timeout). I wonder if 
> something like this can be done also for ve3 and etnaviv ?
>
> Andrey 

[-- Attachment #1.2: Type: text/html, Size: 1965 bytes --]

[-- Attachment #2: 0001-drm-etnaviv-Always-execute-sched-stop-and-start.patch --]
[-- Type: text/x-patch, Size: 1811 bytes --]

From c3fa87856608463f14dddb03346c31054f3137c9 Mon Sep 17 00:00:00 2001
From: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Date: Mon, 10 Feb 2020 11:44:39 -0500
Subject: drm/etnaviv: Always execute sched stop and start.

During job timeout always stop and restart the scheduler even
if no HW resetis taking place.

Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
---
 drivers/gpu/drm/etnaviv/etnaviv_sched.c | 15 ++++++++++-----
 1 file changed, 10 insertions(+), 5 deletions(-)

diff --git a/drivers/gpu/drm/etnaviv/etnaviv_sched.c b/drivers/gpu/drm/etnaviv/etnaviv_sched.c
index 4e3e95d..270caa8 100644
--- a/drivers/gpu/drm/etnaviv/etnaviv_sched.c
+++ b/drivers/gpu/drm/etnaviv/etnaviv_sched.c
@@ -89,12 +89,17 @@ static void etnaviv_sched_timedout_job(struct drm_sched_job *sched_job)
 	u32 dma_addr;
 	int change;
 
+
+
+	/* block scheduler */
+	drm_sched_stop(&gpu->sched, sched_job);
+
 	/*
 	 * If the GPU managed to complete this jobs fence, the timout is
 	 * spurious. Bail out.
 	 */
 	if (dma_fence_is_signaled(submit->out_fence))
-		return;
+		goto skip_hw_reset;
 
 	/*
 	 * If the GPU is still making forward progress on the front-end (which
@@ -105,12 +110,9 @@ static void etnaviv_sched_timedout_job(struct drm_sched_job *sched_job)
 	change = dma_addr - gpu->hangcheck_dma_addr;
 	if (change < 0 || change > 16) {
 		gpu->hangcheck_dma_addr = dma_addr;
-		return;
+		goto skip_hw_reset;
 	}
 
-	/* block scheduler */
-	drm_sched_stop(&gpu->sched, sched_job);
-
 	if(sched_job)
 		drm_sched_increase_karma(sched_job);
 
@@ -120,6 +122,9 @@ static void etnaviv_sched_timedout_job(struct drm_sched_job *sched_job)
 
 	drm_sched_resubmit_jobs(&gpu->sched);
 
+
+skip_hw_reset:
+
 	/* restart scheduler after GPU is usable again */
 	drm_sched_start(&gpu->sched, true);
 }
-- 
2.7.4


[-- Attachment #3: Type: text/plain, Size: 154 bytes --]

_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply related	[flat|nested] 125+ messages in thread

* Re: [PATCH v4] drm/scheduler: Avoid accessing freed bad job.
  2020-02-10 16:55               ` Andrey Grodzovsky
@ 2020-02-10 21:50                 ` Luben Tuikov
  -1 siblings, 0 replies; 125+ messages in thread
From: Luben Tuikov @ 2020-02-10 21:50 UTC (permalink / raw)
  To: Andrey Grodzovsky, Christian König, Alex Deucher, Lucas Stach
  Cc: Emily Deng, Maling list - DRI developers, amd-gfx list, steven.price

Hi Lucas,

Thank you for bringing awareness of this issue, publicly.

As soon as this patch showed up back in November of 2019,
I objected to it, privately.

I suggested to instead use a _list_ to store the "state" of
all jobs of the same state. Then, at any time, timeout interrupt
or whatever, we can atomically (irq spinlock) move the timeout/bad
job to the timedout/cleanup/bad job list, and wake someone up
to deal with that list asynchronously, and return from the interrupt/etc.
immediately.

Then in due time, if any more interrupts or whatnot take place,
the job will either be in the timeout list or not. If it it,
then the instigator backs off as someone else (the list handler) will/is
awake and handling it (obviously a state variable may be kept as well).

This draws somewhat from my days with iSCSI, SCSI and SAS, 15 years ago,
where a device can complete a job (task) at anytime regardless
of what the SCSI layer "thinks" the task's state is: timed-out, aborted,
whatever. It is a very simple and elegant solution which generalizes
well.

Regards,
Luben

On 2020-02-10 11:55 a.m., Andrey Grodzovsky wrote:
> Lucas - Ping on my question and also I attached this temporary solution for etnaviv to clarify my point. If that something acceptable for now at least i can do the same for v3d where it requires a bit more code changes.
> 
> Andrey
> 
> On 2/6/20 10:49 AM, Andrey Grodzovsky wrote:
>>> Well a revert would break our driver.
>>>
>>> The real solution is that somebody needs to sit down, gather ALL the requirements and then come up with a solution which is clean and works for everyone.
>>>
>>> Christian.
>>
>>
>> I can to take on this as indeed our general design on this becomes more and more entangled as GPU reset scenarios grow in complexity (at least in AMD driver). Currently I am on a high priority internal task which should take me around a week or 2 to finish and after that I can get to it.
>>
>> Regarding temporary solution  - I looked into v3d and etnaviv use cases and we in AMD actually face the same scenario where we decide to skip HW reset if the guilty job did finish by the time we are processing the timeout  (see amdgpu_device_gpu_recover and skip_hw_reset goto) - the difference is we always call drm_sched_stop/start irrespectively of whether we are going to actually HW reset or not (same as extend timeout). I wonder if something like this can be done also for ve3 and etnaviv ?
>>
>> Andrey 
> 
> _______________________________________________
> amd-gfx mailing list
> amd-gfx@lists.freedesktop.org
> https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Flists.freedesktop.org%2Fmailman%2Flistinfo%2Famd-gfx&amp;data=02%7C01%7Cluben.tuikov%40amd.com%7Cce97bc29988e4068ef8108d7ae4a043d%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C637169505277381327&amp;sdata=FyV0q3y5uWPwBgJF5QZLWARcXau916EUcYez2VA%2FqRA%3D&amp;reserved=0
> 

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 125+ messages in thread

* Re: [PATCH v4] drm/scheduler: Avoid accessing freed bad job.
@ 2020-02-10 21:50                 ` Luben Tuikov
  0 siblings, 0 replies; 125+ messages in thread
From: Luben Tuikov @ 2020-02-10 21:50 UTC (permalink / raw)
  To: Andrey Grodzovsky, Christian König, Alex Deucher, Lucas Stach
  Cc: Emily Deng, Maling list - DRI developers, amd-gfx list, steven.price

Hi Lucas,

Thank you for bringing awareness of this issue, publicly.

As soon as this patch showed up back in November of 2019,
I objected to it, privately.

I suggested to instead use a _list_ to store the "state" of
all jobs of the same state. Then, at any time, timeout interrupt
or whatever, we can atomically (irq spinlock) move the timeout/bad
job to the timedout/cleanup/bad job list, and wake someone up
to deal with that list asynchronously, and return from the interrupt/etc.
immediately.

Then in due time, if any more interrupts or whatnot take place,
the job will either be in the timeout list or not. If it it,
then the instigator backs off as someone else (the list handler) will/is
awake and handling it (obviously a state variable may be kept as well).

This draws somewhat from my days with iSCSI, SCSI and SAS, 15 years ago,
where a device can complete a job (task) at anytime regardless
of what the SCSI layer "thinks" the task's state is: timed-out, aborted,
whatever. It is a very simple and elegant solution which generalizes
well.

Regards,
Luben

On 2020-02-10 11:55 a.m., Andrey Grodzovsky wrote:
> Lucas - Ping on my question and also I attached this temporary solution for etnaviv to clarify my point. If that something acceptable for now at least i can do the same for v3d where it requires a bit more code changes.
> 
> Andrey
> 
> On 2/6/20 10:49 AM, Andrey Grodzovsky wrote:
>>> Well a revert would break our driver.
>>>
>>> The real solution is that somebody needs to sit down, gather ALL the requirements and then come up with a solution which is clean and works for everyone.
>>>
>>> Christian.
>>
>>
>> I can to take on this as indeed our general design on this becomes more and more entangled as GPU reset scenarios grow in complexity (at least in AMD driver). Currently I am on a high priority internal task which should take me around a week or 2 to finish and after that I can get to it.
>>
>> Regarding temporary solution  - I looked into v3d and etnaviv use cases and we in AMD actually face the same scenario where we decide to skip HW reset if the guilty job did finish by the time we are processing the timeout  (see amdgpu_device_gpu_recover and skip_hw_reset goto) - the difference is we always call drm_sched_stop/start irrespectively of whether we are going to actually HW reset or not (same as extend timeout). I wonder if something like this can be done also for ve3 and etnaviv ?
>>
>> Andrey 
> 
> _______________________________________________
> amd-gfx mailing list
> amd-gfx@lists.freedesktop.org
> https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Flists.freedesktop.org%2Fmailman%2Flistinfo%2Famd-gfx&amp;data=02%7C01%7Cluben.tuikov%40amd.com%7Cce97bc29988e4068ef8108d7ae4a043d%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C637169505277381327&amp;sdata=FyV0q3y5uWPwBgJF5QZLWARcXau916EUcYez2VA%2FqRA%3D&amp;reserved=0
> 

_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 125+ messages in thread

* Re: [PATCH v4] drm/scheduler: Avoid accessing freed bad job.
  2020-02-10 21:50                 ` Luben Tuikov
@ 2020-02-11 15:55                   ` Andrey Grodzovsky
  -1 siblings, 0 replies; 125+ messages in thread
From: Andrey Grodzovsky @ 2020-02-11 15:55 UTC (permalink / raw)
  To: Luben Tuikov, Christian König, Alex Deucher, Lucas Stach
  Cc: Emily Deng, Maling list - DRI developers, amd-gfx list, steven.price

On 2/10/20 4:50 PM, Luben Tuikov wrote:
> Hi Lucas,
>
> Thank you for bringing awareness of this issue, publicly.
>
> As soon as this patch showed up back in November of 2019,
> I objected to it, privately.


I didn't find this objection in my mail actually


>
> I suggested to instead use a _list_ to store the "state" of
> all jobs of the same state. Then, at any time, timeout interrupt
> or whatever, we can atomically (irq spinlock) move the timeout/bad
> job to the timedout/cleanup/bad job list, and wake someone up
> to deal with that list asynchronously, and return from the interrupt/etc.
> immediately.


Sounds a good idea to me, i think enough for us to have 2 lists, timeout 
list for jobs scheduled to HW and not yet completed (completion fence 
signaled) and cleanup list for those that did complete. This should give 
alternative solution to the race condition this patch was addressing 
without causing the break the Lucas reported. If no one objects I think 
i can try implement it.

Andrey


>
> Then in due time, if any more interrupts or whatnot take place,
> the job will either be in the timeout list or not. If it it,
> then the instigator backs off as someone else (the list handler) will/is
> awake and handling it (obviously a state variable may be kept as well).
>
> This draws somewhat from my days with iSCSI, SCSI and SAS, 15 years ago,
> where a device can complete a job (task) at anytime regardless
> of what the SCSI layer "thinks" the task's state is: timed-out, aborted,
> whatever. It is a very simple and elegant solution which generalizes
> well.
>
> Regards,
> Luben
>
> On 2020-02-10 11:55 a.m., Andrey Grodzovsky wrote:
>> Lucas - Ping on my question and also I attached this temporary solution for etnaviv to clarify my point. If that something acceptable for now at least i can do the same for v3d where it requires a bit more code changes.
>>
>> Andrey
>>
>> On 2/6/20 10:49 AM, Andrey Grodzovsky wrote:
>>>> Well a revert would break our driver.
>>>>
>>>> The real solution is that somebody needs to sit down, gather ALL the requirements and then come up with a solution which is clean and works for everyone.
>>>>
>>>> Christian.
>>>
>>> I can to take on this as indeed our general design on this becomes more and more entangled as GPU reset scenarios grow in complexity (at least in AMD driver). Currently I am on a high priority internal task which should take me around a week or 2 to finish and after that I can get to it.
>>>
>>> Regarding temporary solution  - I looked into v3d and etnaviv use cases and we in AMD actually face the same scenario where we decide to skip HW reset if the guilty job did finish by the time we are processing the timeout  (see amdgpu_device_gpu_recover and skip_hw_reset goto) - the difference is we always call drm_sched_stop/start irrespectively of whether we are going to actually HW reset or not (same as extend timeout). I wonder if something like this can be done also for ve3 and etnaviv ?
>>>
>>> Andrey
>> _______________________________________________
>> amd-gfx mailing list
>> amd-gfx@lists.freedesktop.org
>> https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Flists.freedesktop.org%2Fmailman%2Flistinfo%2Famd-gfx&amp;data=02%7C01%7Cluben.tuikov%40amd.com%7Cce97bc29988e4068ef8108d7ae4a043d%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C637169505277381327&amp;sdata=FyV0q3y5uWPwBgJF5QZLWARcXau916EUcYez2VA%2FqRA%3D&amp;reserved=0
>>
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 125+ messages in thread

* Re: [PATCH v4] drm/scheduler: Avoid accessing freed bad job.
@ 2020-02-11 15:55                   ` Andrey Grodzovsky
  0 siblings, 0 replies; 125+ messages in thread
From: Andrey Grodzovsky @ 2020-02-11 15:55 UTC (permalink / raw)
  To: Luben Tuikov, Christian König, Alex Deucher, Lucas Stach
  Cc: Emily Deng, Maling list - DRI developers, amd-gfx list, steven.price

On 2/10/20 4:50 PM, Luben Tuikov wrote:
> Hi Lucas,
>
> Thank you for bringing awareness of this issue, publicly.
>
> As soon as this patch showed up back in November of 2019,
> I objected to it, privately.


I didn't find this objection in my mail actually


>
> I suggested to instead use a _list_ to store the "state" of
> all jobs of the same state. Then, at any time, timeout interrupt
> or whatever, we can atomically (irq spinlock) move the timeout/bad
> job to the timedout/cleanup/bad job list, and wake someone up
> to deal with that list asynchronously, and return from the interrupt/etc.
> immediately.


Sounds a good idea to me, i think enough for us to have 2 lists, timeout 
list for jobs scheduled to HW and not yet completed (completion fence 
signaled) and cleanup list for those that did complete. This should give 
alternative solution to the race condition this patch was addressing 
without causing the break the Lucas reported. If no one objects I think 
i can try implement it.

Andrey


>
> Then in due time, if any more interrupts or whatnot take place,
> the job will either be in the timeout list or not. If it it,
> then the instigator backs off as someone else (the list handler) will/is
> awake and handling it (obviously a state variable may be kept as well).
>
> This draws somewhat from my days with iSCSI, SCSI and SAS, 15 years ago,
> where a device can complete a job (task) at anytime regardless
> of what the SCSI layer "thinks" the task's state is: timed-out, aborted,
> whatever. It is a very simple and elegant solution which generalizes
> well.
>
> Regards,
> Luben
>
> On 2020-02-10 11:55 a.m., Andrey Grodzovsky wrote:
>> Lucas - Ping on my question and also I attached this temporary solution for etnaviv to clarify my point. If that something acceptable for now at least i can do the same for v3d where it requires a bit more code changes.
>>
>> Andrey
>>
>> On 2/6/20 10:49 AM, Andrey Grodzovsky wrote:
>>>> Well a revert would break our driver.
>>>>
>>>> The real solution is that somebody needs to sit down, gather ALL the requirements and then come up with a solution which is clean and works for everyone.
>>>>
>>>> Christian.
>>>
>>> I can to take on this as indeed our general design on this becomes more and more entangled as GPU reset scenarios grow in complexity (at least in AMD driver). Currently I am on a high priority internal task which should take me around a week or 2 to finish and after that I can get to it.
>>>
>>> Regarding temporary solution  - I looked into v3d and etnaviv use cases and we in AMD actually face the same scenario where we decide to skip HW reset if the guilty job did finish by the time we are processing the timeout  (see amdgpu_device_gpu_recover and skip_hw_reset goto) - the difference is we always call drm_sched_stop/start irrespectively of whether we are going to actually HW reset or not (same as extend timeout). I wonder if something like this can be done also for ve3 and etnaviv ?
>>>
>>> Andrey
>> _______________________________________________
>> amd-gfx mailing list
>> amd-gfx@lists.freedesktop.org
>> https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Flists.freedesktop.org%2Fmailman%2Flistinfo%2Famd-gfx&amp;data=02%7C01%7Cluben.tuikov%40amd.com%7Cce97bc29988e4068ef8108d7ae4a043d%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C637169505277381327&amp;sdata=FyV0q3y5uWPwBgJF5QZLWARcXau916EUcYez2VA%2FqRA%3D&amp;reserved=0
>>
_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 125+ messages in thread

* Re: [PATCH v4] drm/scheduler: Avoid accessing freed bad job.
  2020-02-11 15:55                   ` Andrey Grodzovsky
@ 2020-02-11 21:27                     ` Andrey Grodzovsky
  -1 siblings, 0 replies; 125+ messages in thread
From: Andrey Grodzovsky @ 2020-02-11 21:27 UTC (permalink / raw)
  To: Luben Tuikov, Christian König, Alex Deucher, Lucas Stach
  Cc: Emily Deng, amd-gfx list, Maling list - DRI developers, steven.price


On 2/11/20 10:55 AM, Andrey Grodzovsky wrote:
> On 2/10/20 4:50 PM, Luben Tuikov wrote:
>> Hi Lucas,
>>
>> Thank you for bringing awareness of this issue, publicly.
>>
>> As soon as this patch showed up back in November of 2019,
>> I objected to it, privately.
>
>
> I didn't find this objection in my mail actually
>
>
>>
>> I suggested to instead use a _list_ to store the "state" of
>> all jobs of the same state. Then, at any time, timeout interrupt
>> or whatever, we can atomically (irq spinlock) move the timeout/bad
>> job to the timedout/cleanup/bad job list, and wake someone up
>> to deal with that list asynchronously, and return from the 
>> interrupt/etc.
>> immediately.
>
>
> Sounds a good idea to me, i think enough for us to have 2 lists, 
> timeout list for jobs scheduled to HW and not yet completed 
> (completion fence signaled) and cleanup list for those that did 
> complete. This should give alternative solution to the race condition 
> this patch was addressing without causing the break the Lucas 
> reported. If no one objects I think i can try implement it.
>
> Andrey


Thinking more i realize Luben is right about having also bad job list as 
this is needed for normal job competition (by fence callback from 
amdgpu_fence_process)  and you need to decide if you move it to cleanup 
list from timeout list or not. If it's already in bad job list - meaning 
that it's being processed by GPU recovery code you don't touch it, 
otherwise you move it to cleanup list where it will be freed eventually 
by invocation of drm_sched_get_cleanup_job.

Andrey


>
>
>>
>> Then in due time, if any more interrupts or whatnot take place,
>> the job will either be in the timeout list or not. If it it,
>> then the instigator backs off as someone else (the list handler) will/is
>> awake and handling it (obviously a state variable may be kept as well).
>>
>> This draws somewhat from my days with iSCSI, SCSI and SAS, 15 years ago,
>> where a device can complete a job (task) at anytime regardless
>> of what the SCSI layer "thinks" the task's state is: timed-out, aborted,
>> whatever. It is a very simple and elegant solution which generalizes
>> well.
>>
>> Regards,
>> Luben
>>
>> On 2020-02-10 11:55 a.m., Andrey Grodzovsky wrote:
>>> Lucas - Ping on my question and also I attached this temporary 
>>> solution for etnaviv to clarify my point. If that something 
>>> acceptable for now at least i can do the same for v3d where it 
>>> requires a bit more code changes.
>>>
>>> Andrey
>>>
>>> On 2/6/20 10:49 AM, Andrey Grodzovsky wrote:
>>>>> Well a revert would break our driver.
>>>>>
>>>>> The real solution is that somebody needs to sit down, gather ALL 
>>>>> the requirements and then come up with a solution which is clean 
>>>>> and works for everyone.
>>>>>
>>>>> Christian.
>>>>
>>>> I can to take on this as indeed our general design on this becomes 
>>>> more and more entangled as GPU reset scenarios grow in complexity 
>>>> (at least in AMD driver). Currently I am on a high priority 
>>>> internal task which should take me around a week or 2 to finish and 
>>>> after that I can get to it.
>>>>
>>>> Regarding temporary solution  - I looked into v3d and etnaviv use 
>>>> cases and we in AMD actually face the same scenario where we decide 
>>>> to skip HW reset if the guilty job did finish by the time we are 
>>>> processing the timeout  (see amdgpu_device_gpu_recover and 
>>>> skip_hw_reset goto) - the difference is we always call 
>>>> drm_sched_stop/start irrespectively of whether we are going to 
>>>> actually HW reset or not (same as extend timeout). I wonder if 
>>>> something like this can be done also for ve3 and etnaviv ?
>>>>
>>>> Andrey
>>> _______________________________________________
>>> amd-gfx mailing list
>>> amd-gfx@lists.freedesktop.org
>>> https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Flists.freedesktop.org%2Fmailman%2Flistinfo%2Famd-gfx&amp;data=02%7C01%7Candrey.grodzovsky%40amd.com%7Cef96617d23a54fe9b6ef08d7af0ac9db%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C637170333200621550&amp;sdata=wa7Eh3bdi%2BthYjjZF2yeTvTjNRipGPqVA%2FGQt0QL7R8%3D&amp;reserved=0 
>>>
>>>
> _______________________________________________
> amd-gfx mailing list
> amd-gfx@lists.freedesktop.org
> https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Flists.freedesktop.org%2Fmailman%2Flistinfo%2Famd-gfx&amp;data=02%7C01%7Candrey.grodzovsky%40amd.com%7Cef96617d23a54fe9b6ef08d7af0ac9db%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C637170333200621550&amp;sdata=wa7Eh3bdi%2BthYjjZF2yeTvTjNRipGPqVA%2FGQt0QL7R8%3D&amp;reserved=0 
>
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 125+ messages in thread

* Re: [PATCH v4] drm/scheduler: Avoid accessing freed bad job.
@ 2020-02-11 21:27                     ` Andrey Grodzovsky
  0 siblings, 0 replies; 125+ messages in thread
From: Andrey Grodzovsky @ 2020-02-11 21:27 UTC (permalink / raw)
  To: Luben Tuikov, Christian König, Alex Deucher, Lucas Stach
  Cc: Emily Deng, amd-gfx list, Maling list - DRI developers, steven.price


On 2/11/20 10:55 AM, Andrey Grodzovsky wrote:
> On 2/10/20 4:50 PM, Luben Tuikov wrote:
>> Hi Lucas,
>>
>> Thank you for bringing awareness of this issue, publicly.
>>
>> As soon as this patch showed up back in November of 2019,
>> I objected to it, privately.
>
>
> I didn't find this objection in my mail actually
>
>
>>
>> I suggested to instead use a _list_ to store the "state" of
>> all jobs of the same state. Then, at any time, timeout interrupt
>> or whatever, we can atomically (irq spinlock) move the timeout/bad
>> job to the timedout/cleanup/bad job list, and wake someone up
>> to deal with that list asynchronously, and return from the 
>> interrupt/etc.
>> immediately.
>
>
> Sounds a good idea to me, i think enough for us to have 2 lists, 
> timeout list for jobs scheduled to HW and not yet completed 
> (completion fence signaled) and cleanup list for those that did 
> complete. This should give alternative solution to the race condition 
> this patch was addressing without causing the break the Lucas 
> reported. If no one objects I think i can try implement it.
>
> Andrey


Thinking more i realize Luben is right about having also bad job list as 
this is needed for normal job competition (by fence callback from 
amdgpu_fence_process)  and you need to decide if you move it to cleanup 
list from timeout list or not. If it's already in bad job list - meaning 
that it's being processed by GPU recovery code you don't touch it, 
otherwise you move it to cleanup list where it will be freed eventually 
by invocation of drm_sched_get_cleanup_job.

Andrey


>
>
>>
>> Then in due time, if any more interrupts or whatnot take place,
>> the job will either be in the timeout list or not. If it it,
>> then the instigator backs off as someone else (the list handler) will/is
>> awake and handling it (obviously a state variable may be kept as well).
>>
>> This draws somewhat from my days with iSCSI, SCSI and SAS, 15 years ago,
>> where a device can complete a job (task) at anytime regardless
>> of what the SCSI layer "thinks" the task's state is: timed-out, aborted,
>> whatever. It is a very simple and elegant solution which generalizes
>> well.
>>
>> Regards,
>> Luben
>>
>> On 2020-02-10 11:55 a.m., Andrey Grodzovsky wrote:
>>> Lucas - Ping on my question and also I attached this temporary 
>>> solution for etnaviv to clarify my point. If that something 
>>> acceptable for now at least i can do the same for v3d where it 
>>> requires a bit more code changes.
>>>
>>> Andrey
>>>
>>> On 2/6/20 10:49 AM, Andrey Grodzovsky wrote:
>>>>> Well a revert would break our driver.
>>>>>
>>>>> The real solution is that somebody needs to sit down, gather ALL 
>>>>> the requirements and then come up with a solution which is clean 
>>>>> and works for everyone.
>>>>>
>>>>> Christian.
>>>>
>>>> I can to take on this as indeed our general design on this becomes 
>>>> more and more entangled as GPU reset scenarios grow in complexity 
>>>> (at least in AMD driver). Currently I am on a high priority 
>>>> internal task which should take me around a week or 2 to finish and 
>>>> after that I can get to it.
>>>>
>>>> Regarding temporary solution  - I looked into v3d and etnaviv use 
>>>> cases and we in AMD actually face the same scenario where we decide 
>>>> to skip HW reset if the guilty job did finish by the time we are 
>>>> processing the timeout  (see amdgpu_device_gpu_recover and 
>>>> skip_hw_reset goto) - the difference is we always call 
>>>> drm_sched_stop/start irrespectively of whether we are going to 
>>>> actually HW reset or not (same as extend timeout). I wonder if 
>>>> something like this can be done also for ve3 and etnaviv ?
>>>>
>>>> Andrey
>>> _______________________________________________
>>> amd-gfx mailing list
>>> amd-gfx@lists.freedesktop.org
>>> https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Flists.freedesktop.org%2Fmailman%2Flistinfo%2Famd-gfx&amp;data=02%7C01%7Candrey.grodzovsky%40amd.com%7Cef96617d23a54fe9b6ef08d7af0ac9db%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C637170333200621550&amp;sdata=wa7Eh3bdi%2BthYjjZF2yeTvTjNRipGPqVA%2FGQt0QL7R8%3D&amp;reserved=0 
>>>
>>>
> _______________________________________________
> amd-gfx mailing list
> amd-gfx@lists.freedesktop.org
> https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Flists.freedesktop.org%2Fmailman%2Flistinfo%2Famd-gfx&amp;data=02%7C01%7Candrey.grodzovsky%40amd.com%7Cef96617d23a54fe9b6ef08d7af0ac9db%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C637170333200621550&amp;sdata=wa7Eh3bdi%2BthYjjZF2yeTvTjNRipGPqVA%2FGQt0QL7R8%3D&amp;reserved=0 
>
_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 125+ messages in thread

* Re: [PATCH v4] drm/scheduler: Avoid accessing freed bad job.
  2020-02-11 21:27                     ` Andrey Grodzovsky
@ 2020-02-12  0:53                       ` Luben Tuikov
  -1 siblings, 0 replies; 125+ messages in thread
From: Luben Tuikov @ 2020-02-12  0:53 UTC (permalink / raw)
  To: Andrey Grodzovsky, Christian König, Alex Deucher, Lucas Stach
  Cc: Emily Deng, amd-gfx list, Maling list - DRI developers, steven.price

On 2020-02-11 4:27 p.m., Andrey Grodzovsky wrote:
> 
> On 2/11/20 10:55 AM, Andrey Grodzovsky wrote:
>> On 2/10/20 4:50 PM, Luben Tuikov wrote:
>>> Hi Lucas,
>>>
>>> Thank you for bringing awareness of this issue, publicly.
>>>
>>> As soon as this patch showed up back in November of 2019,
>>> I objected to it, privately.
>>
>>
>> I didn't find this objection in my mail actually

Yes, I didn't send it to you.

>>> I suggested to instead use a _list_ to store the "state" of
>>> all jobs of the same state. Then, at any time, timeout interrupt
>>> or whatever, we can atomically (irq spinlock) move the timeout/bad
>>> job to the timedout/cleanup/bad job list, and wake someone up
>>> to deal with that list asynchronously, and return from the 
>>> interrupt/etc.
>>> immediately.
>>
>>
>> Sounds a good idea to me, i think enough for us to have 2 lists, 
>> timeout list for jobs scheduled to HW and not yet completed 
>> (completion fence signaled) and cleanup list for those that did 
>> complete. This should give alternative solution to the race condition 
>> this patch was addressing without causing the break the Lucas 
>> reported. If no one objects I think i can try implement it.
>>
>> Andrey
> 
> 
> Thinking more i realize Luben is right about having also bad job list as 
> this is needed for normal job competition (by fence callback from 
> amdgpu_fence_process)  and you need to decide if you move it to cleanup 
> list from timeout list or not. If it's already in bad job list - meaning 
> that it's being processed by GPU recovery code you don't touch it, 
> otherwise you move it to cleanup list where it will be freed eventually 
> by invocation of drm_sched_get_cleanup_job.

Yep...

Perhaps fewer lists, than "timeout", "bad" and "cleanup" could be had.
I'd also name the "bad" list as "recovery" list, as that is what would
be done to commands on that list.

"Timeout" is a status "timed-out", so perhaps just set the timeout
flag and move it to a "done" list. (Note that the command can still
complete asynchronously while on that list and while it has status
"timed-out'.)

The idea is that,
1) it avoid contention and races when more than one context
   can update the job at the same time, and
2) easy to process all jobs of a certain state and/or
   move them around, etc.

Let's discuss it and come up with a plan. :-)

Regards,
Luben




> 
> Andrey
> 
> 
>>
>>
>>>
>>> Then in due time, if any more interrupts or whatnot take place,
>>> the job will either be in the timeout list or not. If it it,
>>> then the instigator backs off as someone else (the list handler) will/is
>>> awake and handling it (obviously a state variable may be kept as well).
>>>
>>> This draws somewhat from my days with iSCSI, SCSI and SAS, 15 years ago,
>>> where a device can complete a job (task) at anytime regardless
>>> of what the SCSI layer "thinks" the task's state is: timed-out, aborted,
>>> whatever. It is a very simple and elegant solution which generalizes
>>> well.
>>>
>>> Regards,
>>> Luben
>>>
>>> On 2020-02-10 11:55 a.m., Andrey Grodzovsky wrote:
>>>> Lucas - Ping on my question and also I attached this temporary 
>>>> solution for etnaviv to clarify my point. If that something 
>>>> acceptable for now at least i can do the same for v3d where it 
>>>> requires a bit more code changes.
>>>>
>>>> Andrey
>>>>
>>>> On 2/6/20 10:49 AM, Andrey Grodzovsky wrote:
>>>>>> Well a revert would break our driver.
>>>>>>
>>>>>> The real solution is that somebody needs to sit down, gather ALL 
>>>>>> the requirements and then come up with a solution which is clean 
>>>>>> and works for everyone.
>>>>>>
>>>>>> Christian.
>>>>>
>>>>> I can to take on this as indeed our general design on this becomes 
>>>>> more and more entangled as GPU reset scenarios grow in complexity 
>>>>> (at least in AMD driver). Currently I am on a high priority 
>>>>> internal task which should take me around a week or 2 to finish and 
>>>>> after that I can get to it.
>>>>>
>>>>> Regarding temporary solution  - I looked into v3d and etnaviv use 
>>>>> cases and we in AMD actually face the same scenario where we decide 
>>>>> to skip HW reset if the guilty job did finish by the time we are 
>>>>> processing the timeout  (see amdgpu_device_gpu_recover and 
>>>>> skip_hw_reset goto) - the difference is we always call 
>>>>> drm_sched_stop/start irrespectively of whether we are going to 
>>>>> actually HW reset or not (same as extend timeout). I wonder if 
>>>>> something like this can be done also for ve3 and etnaviv ?
>>>>>
>>>>> Andrey
>>>> _______________________________________________
>>>> amd-gfx mailing list
>>>> amd-gfx@lists.freedesktop.org
>>>> https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Flists.freedesktop.org%2Fmailman%2Flistinfo%2Famd-gfx&amp;data=02%7C01%7Candrey.grodzovsky%40amd.com%7Cef96617d23a54fe9b6ef08d7af0ac9db%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C637170333200621550&amp;sdata=wa7Eh3bdi%2BthYjjZF2yeTvTjNRipGPqVA%2FGQt0QL7R8%3D&amp;reserved=0 
>>>>
>>>>
>> _______________________________________________
>> amd-gfx mailing list
>> amd-gfx@lists.freedesktop.org
>> https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Flists.freedesktop.org%2Fmailman%2Flistinfo%2Famd-gfx&amp;data=02%7C01%7Candrey.grodzovsky%40amd.com%7Cef96617d23a54fe9b6ef08d7af0ac9db%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C637170333200621550&amp;sdata=wa7Eh3bdi%2BthYjjZF2yeTvTjNRipGPqVA%2FGQt0QL7R8%3D&amp;reserved=0 
>>

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 125+ messages in thread

* Re: [PATCH v4] drm/scheduler: Avoid accessing freed bad job.
@ 2020-02-12  0:53                       ` Luben Tuikov
  0 siblings, 0 replies; 125+ messages in thread
From: Luben Tuikov @ 2020-02-12  0:53 UTC (permalink / raw)
  To: Andrey Grodzovsky, Christian König, Alex Deucher, Lucas Stach
  Cc: Emily Deng, amd-gfx list, Maling list - DRI developers, steven.price

On 2020-02-11 4:27 p.m., Andrey Grodzovsky wrote:
> 
> On 2/11/20 10:55 AM, Andrey Grodzovsky wrote:
>> On 2/10/20 4:50 PM, Luben Tuikov wrote:
>>> Hi Lucas,
>>>
>>> Thank you for bringing awareness of this issue, publicly.
>>>
>>> As soon as this patch showed up back in November of 2019,
>>> I objected to it, privately.
>>
>>
>> I didn't find this objection in my mail actually

Yes, I didn't send it to you.

>>> I suggested to instead use a _list_ to store the "state" of
>>> all jobs of the same state. Then, at any time, timeout interrupt
>>> or whatever, we can atomically (irq spinlock) move the timeout/bad
>>> job to the timedout/cleanup/bad job list, and wake someone up
>>> to deal with that list asynchronously, and return from the 
>>> interrupt/etc.
>>> immediately.
>>
>>
>> Sounds a good idea to me, i think enough for us to have 2 lists, 
>> timeout list for jobs scheduled to HW and not yet completed 
>> (completion fence signaled) and cleanup list for those that did 
>> complete. This should give alternative solution to the race condition 
>> this patch was addressing without causing the break the Lucas 
>> reported. If no one objects I think i can try implement it.
>>
>> Andrey
> 
> 
> Thinking more i realize Luben is right about having also bad job list as 
> this is needed for normal job competition (by fence callback from 
> amdgpu_fence_process)  and you need to decide if you move it to cleanup 
> list from timeout list or not. If it's already in bad job list - meaning 
> that it's being processed by GPU recovery code you don't touch it, 
> otherwise you move it to cleanup list where it will be freed eventually 
> by invocation of drm_sched_get_cleanup_job.

Yep...

Perhaps fewer lists, than "timeout", "bad" and "cleanup" could be had.
I'd also name the "bad" list as "recovery" list, as that is what would
be done to commands on that list.

"Timeout" is a status "timed-out", so perhaps just set the timeout
flag and move it to a "done" list. (Note that the command can still
complete asynchronously while on that list and while it has status
"timed-out'.)

The idea is that,
1) it avoid contention and races when more than one context
   can update the job at the same time, and
2) easy to process all jobs of a certain state and/or
   move them around, etc.

Let's discuss it and come up with a plan. :-)

Regards,
Luben




> 
> Andrey
> 
> 
>>
>>
>>>
>>> Then in due time, if any more interrupts or whatnot take place,
>>> the job will either be in the timeout list or not. If it it,
>>> then the instigator backs off as someone else (the list handler) will/is
>>> awake and handling it (obviously a state variable may be kept as well).
>>>
>>> This draws somewhat from my days with iSCSI, SCSI and SAS, 15 years ago,
>>> where a device can complete a job (task) at anytime regardless
>>> of what the SCSI layer "thinks" the task's state is: timed-out, aborted,
>>> whatever. It is a very simple and elegant solution which generalizes
>>> well.
>>>
>>> Regards,
>>> Luben
>>>
>>> On 2020-02-10 11:55 a.m., Andrey Grodzovsky wrote:
>>>> Lucas - Ping on my question and also I attached this temporary 
>>>> solution for etnaviv to clarify my point. If that something 
>>>> acceptable for now at least i can do the same for v3d where it 
>>>> requires a bit more code changes.
>>>>
>>>> Andrey
>>>>
>>>> On 2/6/20 10:49 AM, Andrey Grodzovsky wrote:
>>>>>> Well a revert would break our driver.
>>>>>>
>>>>>> The real solution is that somebody needs to sit down, gather ALL 
>>>>>> the requirements and then come up with a solution which is clean 
>>>>>> and works for everyone.
>>>>>>
>>>>>> Christian.
>>>>>
>>>>> I can to take on this as indeed our general design on this becomes 
>>>>> more and more entangled as GPU reset scenarios grow in complexity 
>>>>> (at least in AMD driver). Currently I am on a high priority 
>>>>> internal task which should take me around a week or 2 to finish and 
>>>>> after that I can get to it.
>>>>>
>>>>> Regarding temporary solution  - I looked into v3d and etnaviv use 
>>>>> cases and we in AMD actually face the same scenario where we decide 
>>>>> to skip HW reset if the guilty job did finish by the time we are 
>>>>> processing the timeout  (see amdgpu_device_gpu_recover and 
>>>>> skip_hw_reset goto) - the difference is we always call 
>>>>> drm_sched_stop/start irrespectively of whether we are going to 
>>>>> actually HW reset or not (same as extend timeout). I wonder if 
>>>>> something like this can be done also for ve3 and etnaviv ?
>>>>>
>>>>> Andrey
>>>> _______________________________________________
>>>> amd-gfx mailing list
>>>> amd-gfx@lists.freedesktop.org
>>>> https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Flists.freedesktop.org%2Fmailman%2Flistinfo%2Famd-gfx&amp;data=02%7C01%7Candrey.grodzovsky%40amd.com%7Cef96617d23a54fe9b6ef08d7af0ac9db%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C637170333200621550&amp;sdata=wa7Eh3bdi%2BthYjjZF2yeTvTjNRipGPqVA%2FGQt0QL7R8%3D&amp;reserved=0 
>>>>
>>>>
>> _______________________________________________
>> amd-gfx mailing list
>> amd-gfx@lists.freedesktop.org
>> https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Flists.freedesktop.org%2Fmailman%2Flistinfo%2Famd-gfx&amp;data=02%7C01%7Candrey.grodzovsky%40amd.com%7Cef96617d23a54fe9b6ef08d7af0ac9db%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C637170333200621550&amp;sdata=wa7Eh3bdi%2BthYjjZF2yeTvTjNRipGPqVA%2FGQt0QL7R8%3D&amp;reserved=0 
>>

_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 125+ messages in thread

* Re: [PATCH v4] drm/scheduler: Avoid accessing freed bad job.
  2020-02-12  0:53                       ` Luben Tuikov
@ 2020-02-12 16:33                         ` Andrey Grodzovsky
  -1 siblings, 0 replies; 125+ messages in thread
From: Andrey Grodzovsky @ 2020-02-12 16:33 UTC (permalink / raw)
  To: Luben Tuikov, Christian König, Alex Deucher, Lucas Stach
  Cc: Emily Deng, amd-gfx list, Maling list - DRI developers, steven.price


On 2/11/20 7:53 PM, Luben Tuikov wrote:
> On 2020-02-11 4:27 p.m., Andrey Grodzovsky wrote:
>> On 2/11/20 10:55 AM, Andrey Grodzovsky wrote:
>>> On 2/10/20 4:50 PM, Luben Tuikov wrote:
>>>> Hi Lucas,
>>>>
>>>> Thank you for bringing awareness of this issue, publicly.
>>>>
>>>> As soon as this patch showed up back in November of 2019,
>>>> I objected to it, privately.
>>>
>>> I didn't find this objection in my mail actually
> Yes, I didn't send it to you.
>
>>>> I suggested to instead use a _list_ to store the "state" of
>>>> all jobs of the same state. Then, at any time, timeout interrupt
>>>> or whatever, we can atomically (irq spinlock) move the timeout/bad
>>>> job to the timedout/cleanup/bad job list, and wake someone up
>>>> to deal with that list asynchronously, and return from the
>>>> interrupt/etc.
>>>> immediately.
>>>
>>> Sounds a good idea to me, i think enough for us to have 2 lists,
>>> timeout list for jobs scheduled to HW and not yet completed
>>> (completion fence signaled) and cleanup list for those that did
>>> complete. This should give alternative solution to the race condition
>>> this patch was addressing without causing the break the Lucas
>>> reported. If no one objects I think i can try implement it.
>>>
>>> Andrey
>>
>> Thinking more i realize Luben is right about having also bad job list as
>> this is needed for normal job competition (by fence callback from
>> amdgpu_fence_process)  and you need to decide if you move it to cleanup
>> list from timeout list or not. If it's already in bad job list - meaning
>> that it's being processed by GPU recovery code you don't touch it,
>> otherwise you move it to cleanup list where it will be freed eventually
>> by invocation of drm_sched_get_cleanup_job.
> Yep...
>
> Perhaps fewer lists, than "timeout", "bad" and "cleanup" could be had.
> I'd also name the "bad" list as "recovery" list, as that is what would
> be done to commands on that list.
>
> "Timeout" is a status "timed-out", so perhaps just set the timeout
> flag and move it to a "done" list. (Note that the command can still
> complete asynchronously while on that list and while it has status
> "timed-out'.)
>
> The idea is that,
> 1) it avoid contention and races when more than one context
>     can update the job at the same time, and
> 2) easy to process all jobs of a certain state and/or
>     move them around, etc.
>
> Let's discuss it and come up with a plan. :-)
>
> Regards,
> Luben


Sure, let me maybe come up with a draft patch so we have more concrete 
stuff to discuss and review.

Andrey



>
>
>
>
>> Andrey
>>
>>
>>>
>>>> Then in due time, if any more interrupts or whatnot take place,
>>>> the job will either be in the timeout list or not. If it it,
>>>> then the instigator backs off as someone else (the list handler) will/is
>>>> awake and handling it (obviously a state variable may be kept as well).
>>>>
>>>> This draws somewhat from my days with iSCSI, SCSI and SAS, 15 years ago,
>>>> where a device can complete a job (task) at anytime regardless
>>>> of what the SCSI layer "thinks" the task's state is: timed-out, aborted,
>>>> whatever. It is a very simple and elegant solution which generalizes
>>>> well.
>>>>
>>>> Regards,
>>>> Luben
>>>>
>>>> On 2020-02-10 11:55 a.m., Andrey Grodzovsky wrote:
>>>>> Lucas - Ping on my question and also I attached this temporary
>>>>> solution for etnaviv to clarify my point. If that something
>>>>> acceptable for now at least i can do the same for v3d where it
>>>>> requires a bit more code changes.
>>>>>
>>>>> Andrey
>>>>>
>>>>> On 2/6/20 10:49 AM, Andrey Grodzovsky wrote:
>>>>>>> Well a revert would break our driver.
>>>>>>>
>>>>>>> The real solution is that somebody needs to sit down, gather ALL
>>>>>>> the requirements and then come up with a solution which is clean
>>>>>>> and works for everyone.
>>>>>>>
>>>>>>> Christian.
>>>>>> I can to take on this as indeed our general design on this becomes
>>>>>> more and more entangled as GPU reset scenarios grow in complexity
>>>>>> (at least in AMD driver). Currently I am on a high priority
>>>>>> internal task which should take me around a week or 2 to finish and
>>>>>> after that I can get to it.
>>>>>>
>>>>>> Regarding temporary solution  - I looked into v3d and etnaviv use
>>>>>> cases and we in AMD actually face the same scenario where we decide
>>>>>> to skip HW reset if the guilty job did finish by the time we are
>>>>>> processing the timeout  (see amdgpu_device_gpu_recover and
>>>>>> skip_hw_reset goto) - the difference is we always call
>>>>>> drm_sched_stop/start irrespectively of whether we are going to
>>>>>> actually HW reset or not (same as extend timeout). I wonder if
>>>>>> something like this can be done also for ve3 and etnaviv ?
>>>>>>
>>>>>> Andrey
>>>>> _______________________________________________
>>>>> amd-gfx mailing list
>>>>> amd-gfx@lists.freedesktop.org
>>>>> https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Flists.freedesktop.org%2Fmailman%2Flistinfo%2Famd-gfx&amp;data=02%7C01%7Candrey.grodzovsky%40amd.com%7Cef96617d23a54fe9b6ef08d7af0ac9db%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C637170333200621550&amp;sdata=wa7Eh3bdi%2BthYjjZF2yeTvTjNRipGPqVA%2FGQt0QL7R8%3D&amp;reserved=0
>>>>>
>>>>>
>>> _______________________________________________
>>> amd-gfx mailing list
>>> amd-gfx@lists.freedesktop.org
>>> https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Flists.freedesktop.org%2Fmailman%2Flistinfo%2Famd-gfx&amp;data=02%7C01%7Candrey.grodzovsky%40amd.com%7Cef96617d23a54fe9b6ef08d7af0ac9db%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C637170333200621550&amp;sdata=wa7Eh3bdi%2BthYjjZF2yeTvTjNRipGPqVA%2FGQt0QL7R8%3D&amp;reserved=0
>>>
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 125+ messages in thread

* Re: [PATCH v4] drm/scheduler: Avoid accessing freed bad job.
@ 2020-02-12 16:33                         ` Andrey Grodzovsky
  0 siblings, 0 replies; 125+ messages in thread
From: Andrey Grodzovsky @ 2020-02-12 16:33 UTC (permalink / raw)
  To: Luben Tuikov, Christian König, Alex Deucher, Lucas Stach
  Cc: Emily Deng, amd-gfx list, Maling list - DRI developers, steven.price


On 2/11/20 7:53 PM, Luben Tuikov wrote:
> On 2020-02-11 4:27 p.m., Andrey Grodzovsky wrote:
>> On 2/11/20 10:55 AM, Andrey Grodzovsky wrote:
>>> On 2/10/20 4:50 PM, Luben Tuikov wrote:
>>>> Hi Lucas,
>>>>
>>>> Thank you for bringing awareness of this issue, publicly.
>>>>
>>>> As soon as this patch showed up back in November of 2019,
>>>> I objected to it, privately.
>>>
>>> I didn't find this objection in my mail actually
> Yes, I didn't send it to you.
>
>>>> I suggested to instead use a _list_ to store the "state" of
>>>> all jobs of the same state. Then, at any time, timeout interrupt
>>>> or whatever, we can atomically (irq spinlock) move the timeout/bad
>>>> job to the timedout/cleanup/bad job list, and wake someone up
>>>> to deal with that list asynchronously, and return from the
>>>> interrupt/etc.
>>>> immediately.
>>>
>>> Sounds a good idea to me, i think enough for us to have 2 lists,
>>> timeout list for jobs scheduled to HW and not yet completed
>>> (completion fence signaled) and cleanup list for those that did
>>> complete. This should give alternative solution to the race condition
>>> this patch was addressing without causing the break the Lucas
>>> reported. If no one objects I think i can try implement it.
>>>
>>> Andrey
>>
>> Thinking more i realize Luben is right about having also bad job list as
>> this is needed for normal job competition (by fence callback from
>> amdgpu_fence_process)  and you need to decide if you move it to cleanup
>> list from timeout list or not. If it's already in bad job list - meaning
>> that it's being processed by GPU recovery code you don't touch it,
>> otherwise you move it to cleanup list where it will be freed eventually
>> by invocation of drm_sched_get_cleanup_job.
> Yep...
>
> Perhaps fewer lists, than "timeout", "bad" and "cleanup" could be had.
> I'd also name the "bad" list as "recovery" list, as that is what would
> be done to commands on that list.
>
> "Timeout" is a status "timed-out", so perhaps just set the timeout
> flag and move it to a "done" list. (Note that the command can still
> complete asynchronously while on that list and while it has status
> "timed-out'.)
>
> The idea is that,
> 1) it avoid contention and races when more than one context
>     can update the job at the same time, and
> 2) easy to process all jobs of a certain state and/or
>     move them around, etc.
>
> Let's discuss it and come up with a plan. :-)
>
> Regards,
> Luben


Sure, let me maybe come up with a draft patch so we have more concrete 
stuff to discuss and review.

Andrey



>
>
>
>
>> Andrey
>>
>>
>>>
>>>> Then in due time, if any more interrupts or whatnot take place,
>>>> the job will either be in the timeout list or not. If it it,
>>>> then the instigator backs off as someone else (the list handler) will/is
>>>> awake and handling it (obviously a state variable may be kept as well).
>>>>
>>>> This draws somewhat from my days with iSCSI, SCSI and SAS, 15 years ago,
>>>> where a device can complete a job (task) at anytime regardless
>>>> of what the SCSI layer "thinks" the task's state is: timed-out, aborted,
>>>> whatever. It is a very simple and elegant solution which generalizes
>>>> well.
>>>>
>>>> Regards,
>>>> Luben
>>>>
>>>> On 2020-02-10 11:55 a.m., Andrey Grodzovsky wrote:
>>>>> Lucas - Ping on my question and also I attached this temporary
>>>>> solution for etnaviv to clarify my point. If that something
>>>>> acceptable for now at least i can do the same for v3d where it
>>>>> requires a bit more code changes.
>>>>>
>>>>> Andrey
>>>>>
>>>>> On 2/6/20 10:49 AM, Andrey Grodzovsky wrote:
>>>>>>> Well a revert would break our driver.
>>>>>>>
>>>>>>> The real solution is that somebody needs to sit down, gather ALL
>>>>>>> the requirements and then come up with a solution which is clean
>>>>>>> and works for everyone.
>>>>>>>
>>>>>>> Christian.
>>>>>> I can to take on this as indeed our general design on this becomes
>>>>>> more and more entangled as GPU reset scenarios grow in complexity
>>>>>> (at least in AMD driver). Currently I am on a high priority
>>>>>> internal task which should take me around a week or 2 to finish and
>>>>>> after that I can get to it.
>>>>>>
>>>>>> Regarding temporary solution  - I looked into v3d and etnaviv use
>>>>>> cases and we in AMD actually face the same scenario where we decide
>>>>>> to skip HW reset if the guilty job did finish by the time we are
>>>>>> processing the timeout  (see amdgpu_device_gpu_recover and
>>>>>> skip_hw_reset goto) - the difference is we always call
>>>>>> drm_sched_stop/start irrespectively of whether we are going to
>>>>>> actually HW reset or not (same as extend timeout). I wonder if
>>>>>> something like this can be done also for ve3 and etnaviv ?
>>>>>>
>>>>>> Andrey
>>>>> _______________________________________________
>>>>> amd-gfx mailing list
>>>>> amd-gfx@lists.freedesktop.org
>>>>> https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Flists.freedesktop.org%2Fmailman%2Flistinfo%2Famd-gfx&amp;data=02%7C01%7Candrey.grodzovsky%40amd.com%7Cef96617d23a54fe9b6ef08d7af0ac9db%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C637170333200621550&amp;sdata=wa7Eh3bdi%2BthYjjZF2yeTvTjNRipGPqVA%2FGQt0QL7R8%3D&amp;reserved=0
>>>>>
>>>>>
>>> _______________________________________________
>>> amd-gfx mailing list
>>> amd-gfx@lists.freedesktop.org
>>> https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Flists.freedesktop.org%2Fmailman%2Flistinfo%2Famd-gfx&amp;data=02%7C01%7Candrey.grodzovsky%40amd.com%7Cef96617d23a54fe9b6ef08d7af0ac9db%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C637170333200621550&amp;sdata=wa7Eh3bdi%2BthYjjZF2yeTvTjNRipGPqVA%2FGQt0QL7R8%3D&amp;reserved=0
>>>
_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 125+ messages in thread

* Re: [PATCH v4] drm/scheduler: Avoid accessing freed bad job.
  2020-02-12 16:33                         ` Andrey Grodzovsky
@ 2020-07-21 11:03                           ` Lucas Stach
  -1 siblings, 0 replies; 125+ messages in thread
From: Lucas Stach @ 2020-07-21 11:03 UTC (permalink / raw)
  To: Andrey Grodzovsky, Luben Tuikov, Christian König, Alex Deucher
  Cc: Emily Deng, amd-gfx list, Maling list - DRI developers, steven.price

Hi Andrey,

Am Mittwoch, den 12.02.2020, 11:33 -0500 schrieb Andrey Grodzovsky:
> On 2/11/20 7:53 PM, Luben Tuikov wrote:
> > On 2020-02-11 4:27 p.m., Andrey Grodzovsky wrote:
> > > On 2/11/20 10:55 AM, Andrey Grodzovsky wrote:
> > > > On 2/10/20 4:50 PM, Luben Tuikov wrote:
> > > > > Hi Lucas,
> > > > > 
> > > > > Thank you for bringing awareness of this issue, publicly.
> > > > > 
> > > > > As soon as this patch showed up back in November of 2019,
> > > > > I objected to it, privately.
> > > > 
> > > > I didn't find this objection in my mail actually
> > Yes, I didn't send it to you.
> > 
> > > > > I suggested to instead use a _list_ to store the "state" of
> > > > > all jobs of the same state. Then, at any time, timeout interrupt
> > > > > or whatever, we can atomically (irq spinlock) move the timeout/bad
> > > > > job to the timedout/cleanup/bad job list, and wake someone up
> > > > > to deal with that list asynchronously, and return from the
> > > > > interrupt/etc.
> > > > > immediately.
> > > > 
> > > > Sounds a good idea to me, i think enough for us to have 2 lists,
> > > > timeout list for jobs scheduled to HW and not yet completed
> > > > (completion fence signaled) and cleanup list for those that did
> > > > complete. This should give alternative solution to the race condition
> > > > this patch was addressing without causing the break the Lucas
> > > > reported. If no one objects I think i can try implement it.
> > > > 
> > > > Andrey
> > > 
> > > Thinking more i realize Luben is right about having also bad job list as
> > > this is needed for normal job competition (by fence callback from
> > > amdgpu_fence_process)  and you need to decide if you move it to cleanup
> > > list from timeout list or not. If it's already in bad job list - meaning
> > > that it's being processed by GPU recovery code you don't touch it,
> > > otherwise you move it to cleanup list where it will be freed eventually
> > > by invocation of drm_sched_get_cleanup_job.
> > Yep...
> > 
> > Perhaps fewer lists, than "timeout", "bad" and "cleanup" could be had.
> > I'd also name the "bad" list as "recovery" list, as that is what would
> > be done to commands on that list.
> > 
> > "Timeout" is a status "timed-out", so perhaps just set the timeout
> > flag and move it to a "done" list. (Note that the command can still
> > complete asynchronously while on that list and while it has status
> > "timed-out'.)
> > 
> > The idea is that,
> > 1) it avoid contention and races when more than one context
> >     can update the job at the same time, and
> > 2) easy to process all jobs of a certain state and/or
> >     move them around, etc.
> > 
> > Let's discuss it and come up with a plan. :-)
> > 
> > Regards,
> > Luben
> 
> Sure, let me maybe come up with a draft patch so we have more concrete 
> stuff to discuss and review.

It seems we all dropped the ball on this one. I believe this is still
an open issue. Has there been any progress from your side on fixing
this?

Regards,
Lucas

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 125+ messages in thread

* Re: [PATCH v4] drm/scheduler: Avoid accessing freed bad job.
@ 2020-07-21 11:03                           ` Lucas Stach
  0 siblings, 0 replies; 125+ messages in thread
From: Lucas Stach @ 2020-07-21 11:03 UTC (permalink / raw)
  To: Andrey Grodzovsky, Luben Tuikov, Christian König, Alex Deucher
  Cc: Emily Deng, amd-gfx list, Maling list - DRI developers, steven.price

Hi Andrey,

Am Mittwoch, den 12.02.2020, 11:33 -0500 schrieb Andrey Grodzovsky:
> On 2/11/20 7:53 PM, Luben Tuikov wrote:
> > On 2020-02-11 4:27 p.m., Andrey Grodzovsky wrote:
> > > On 2/11/20 10:55 AM, Andrey Grodzovsky wrote:
> > > > On 2/10/20 4:50 PM, Luben Tuikov wrote:
> > > > > Hi Lucas,
> > > > > 
> > > > > Thank you for bringing awareness of this issue, publicly.
> > > > > 
> > > > > As soon as this patch showed up back in November of 2019,
> > > > > I objected to it, privately.
> > > > 
> > > > I didn't find this objection in my mail actually
> > Yes, I didn't send it to you.
> > 
> > > > > I suggested to instead use a _list_ to store the "state" of
> > > > > all jobs of the same state. Then, at any time, timeout interrupt
> > > > > or whatever, we can atomically (irq spinlock) move the timeout/bad
> > > > > job to the timedout/cleanup/bad job list, and wake someone up
> > > > > to deal with that list asynchronously, and return from the
> > > > > interrupt/etc.
> > > > > immediately.
> > > > 
> > > > Sounds a good idea to me, i think enough for us to have 2 lists,
> > > > timeout list for jobs scheduled to HW and not yet completed
> > > > (completion fence signaled) and cleanup list for those that did
> > > > complete. This should give alternative solution to the race condition
> > > > this patch was addressing without causing the break the Lucas
> > > > reported. If no one objects I think i can try implement it.
> > > > 
> > > > Andrey
> > > 
> > > Thinking more i realize Luben is right about having also bad job list as
> > > this is needed for normal job competition (by fence callback from
> > > amdgpu_fence_process)  and you need to decide if you move it to cleanup
> > > list from timeout list or not. If it's already in bad job list - meaning
> > > that it's being processed by GPU recovery code you don't touch it,
> > > otherwise you move it to cleanup list where it will be freed eventually
> > > by invocation of drm_sched_get_cleanup_job.
> > Yep...
> > 
> > Perhaps fewer lists, than "timeout", "bad" and "cleanup" could be had.
> > I'd also name the "bad" list as "recovery" list, as that is what would
> > be done to commands on that list.
> > 
> > "Timeout" is a status "timed-out", so perhaps just set the timeout
> > flag and move it to a "done" list. (Note that the command can still
> > complete asynchronously while on that list and while it has status
> > "timed-out'.)
> > 
> > The idea is that,
> > 1) it avoid contention and races when more than one context
> >     can update the job at the same time, and
> > 2) easy to process all jobs of a certain state and/or
> >     move them around, etc.
> > 
> > Let's discuss it and come up with a plan. :-)
> > 
> > Regards,
> > Luben
> 
> Sure, let me maybe come up with a draft patch so we have more concrete 
> stuff to discuss and review.

It seems we all dropped the ball on this one. I believe this is still
an open issue. Has there been any progress from your side on fixing
this?

Regards,
Lucas

_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 125+ messages in thread

* Re: [PATCH v4] drm/scheduler: Avoid accessing freed bad job.
  2020-07-21 11:03                           ` Lucas Stach
@ 2020-07-21 13:36                             ` Andrey Grodzovsky
  -1 siblings, 0 replies; 125+ messages in thread
From: Andrey Grodzovsky @ 2020-07-21 13:36 UTC (permalink / raw)
  To: Lucas Stach, Luben Tuikov, Christian König, Alex Deucher
  Cc: Emily Deng, amd-gfx list, Maling list - DRI developers, steven.price

Lucas, Luben picked the work on this a few month ago as I was diverted to a 
different project.

Luben, can you update Lucas please ?

Andrey

On 7/21/20 7:03 AM, Lucas Stach wrote:
> It seems we all dropped the ball on this one. I believe this is still
> an open issue. Has there been any progress from your side on fixing
> this?
>
> Regards,
> Lucas
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 125+ messages in thread

* Re: [PATCH v4] drm/scheduler: Avoid accessing freed bad job.
@ 2020-07-21 13:36                             ` Andrey Grodzovsky
  0 siblings, 0 replies; 125+ messages in thread
From: Andrey Grodzovsky @ 2020-07-21 13:36 UTC (permalink / raw)
  To: Lucas Stach, Luben Tuikov, Christian König, Alex Deucher
  Cc: Emily Deng, amd-gfx list, Maling list - DRI developers, steven.price

Lucas, Luben picked the work on this a few month ago as I was diverted to a 
different project.

Luben, can you update Lucas please ?

Andrey

On 7/21/20 7:03 AM, Lucas Stach wrote:
> It seems we all dropped the ball on this one. I believe this is still
> an open issue. Has there been any progress from your side on fixing
> this?
>
> Regards,
> Lucas
_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 125+ messages in thread

* Re: [PATCH v4] drm/scheduler: Avoid accessing freed bad job.
  2020-07-21 13:36                             ` Andrey Grodzovsky
@ 2020-07-21 13:39                               ` Christian König
  -1 siblings, 0 replies; 125+ messages in thread
From: Christian König @ 2020-07-21 13:39 UTC (permalink / raw)
  To: Andrey Grodzovsky, Lucas Stach, Luben Tuikov, Alex Deucher
  Cc: Emily Deng, amd-gfx list, Maling list - DRI developers, steven.price

Luben had a good idea how to tackle the whole job handling.

Andrey/Lucas can you work with Luben to get this cleaned up because 
there are a lot of requirements on this which not only come from AMD.

Thanks,
Christian.

Am 21.07.20 um 15:36 schrieb Andrey Grodzovsky:
> Lucas, Luben picked the work on this a few month ago as I was diverted 
> to a different project.
>
> Luben, can you update Lucas please ?
>
> Andrey
>
> On 7/21/20 7:03 AM, Lucas Stach wrote:
>> It seems we all dropped the ball on this one. I believe this is still
>> an open issue. Has there been any progress from your side on fixing
>> this?
>>
>> Regards,
>> Lucas

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 125+ messages in thread

* Re: [PATCH v4] drm/scheduler: Avoid accessing freed bad job.
@ 2020-07-21 13:39                               ` Christian König
  0 siblings, 0 replies; 125+ messages in thread
From: Christian König @ 2020-07-21 13:39 UTC (permalink / raw)
  To: Andrey Grodzovsky, Lucas Stach, Luben Tuikov, Alex Deucher
  Cc: Emily Deng, amd-gfx list, Maling list - DRI developers, steven.price

Luben had a good idea how to tackle the whole job handling.

Andrey/Lucas can you work with Luben to get this cleaned up because 
there are a lot of requirements on this which not only come from AMD.

Thanks,
Christian.

Am 21.07.20 um 15:36 schrieb Andrey Grodzovsky:
> Lucas, Luben picked the work on this a few month ago as I was diverted 
> to a different project.
>
> Luben, can you update Lucas please ?
>
> Andrey
>
> On 7/21/20 7:03 AM, Lucas Stach wrote:
>> It seems we all dropped the ball on this one. I believe this is still
>> an open issue. Has there been any progress from your side on fixing
>> this?
>>
>> Regards,
>> Lucas

_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 125+ messages in thread

* Re: [PATCH v4] drm/scheduler: Avoid accessing freed bad job.
  2020-07-21 13:39                               ` Christian König
@ 2020-07-21 13:42                                 ` Andrey Grodzovsky
  -1 siblings, 0 replies; 125+ messages in thread
From: Andrey Grodzovsky @ 2020-07-21 13:42 UTC (permalink / raw)
  To: Christian König, Lucas Stach, Luben Tuikov, Alex Deucher
  Cc: Emily Deng, amd-gfx list, Maling list - DRI developers, steven.price

Christian, I would want this very much but unfortunately I am on a strict 
schedule for an internal project currently and hence will not be able to 
actively participate. I will do my best to answer any questions Luben might have 
about current implementation.

Andrey

On 7/21/20 9:39 AM, Christian König wrote:
> Luben had a good idea how to tackle the whole job handling.
>
> Andrey/Lucas can you work with Luben to get this cleaned up because there are 
> a lot of requirements on this which not only come from AMD.
>
> Thanks,
> Christian.
>
> Am 21.07.20 um 15:36 schrieb Andrey Grodzovsky:
>> Lucas, Luben picked the work on this a few month ago as I was diverted to a 
>> different project.
>>
>> Luben, can you update Lucas please ?
>>
>> Andrey
>>
>> On 7/21/20 7:03 AM, Lucas Stach wrote:
>>> It seems we all dropped the ball on this one. I believe this is still
>>> an open issue. Has there been any progress from your side on fixing
>>> this?
>>>
>>> Regards,
>>> Lucas
>
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 125+ messages in thread

* Re: [PATCH v4] drm/scheduler: Avoid accessing freed bad job.
@ 2020-07-21 13:42                                 ` Andrey Grodzovsky
  0 siblings, 0 replies; 125+ messages in thread
From: Andrey Grodzovsky @ 2020-07-21 13:42 UTC (permalink / raw)
  To: Christian König, Lucas Stach, Luben Tuikov, Alex Deucher
  Cc: Emily Deng, amd-gfx list, Maling list - DRI developers, steven.price

Christian, I would want this very much but unfortunately I am on a strict 
schedule for an internal project currently and hence will not be able to 
actively participate. I will do my best to answer any questions Luben might have 
about current implementation.

Andrey

On 7/21/20 9:39 AM, Christian König wrote:
> Luben had a good idea how to tackle the whole job handling.
>
> Andrey/Lucas can you work with Luben to get this cleaned up because there are 
> a lot of requirements on this which not only come from AMD.
>
> Thanks,
> Christian.
>
> Am 21.07.20 um 15:36 schrieb Andrey Grodzovsky:
>> Lucas, Luben picked the work on this a few month ago as I was diverted to a 
>> different project.
>>
>> Luben, can you update Lucas please ?
>>
>> Andrey
>>
>> On 7/21/20 7:03 AM, Lucas Stach wrote:
>>> It seems we all dropped the ball on this one. I believe this is still
>>> an open issue. Has there been any progress from your side on fixing
>>> this?
>>>
>>> Regards,
>>> Lucas
>
_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 125+ messages in thread

* Re: [PATCH v4] drm/scheduler: Avoid accessing freed bad job.
  2020-07-21 13:42                                 ` Andrey Grodzovsky
@ 2020-07-21 18:29                                   ` Luben Tuikov
  -1 siblings, 0 replies; 125+ messages in thread
From: Luben Tuikov @ 2020-07-21 18:29 UTC (permalink / raw)
  To: Andrey Grodzovsky, Christian König, Lucas Stach, Alex Deucher
  Cc: Emily Deng, amd-gfx list, Maling list - DRI developers, steven.price

Hi Lucas,

Thank you for following up on this. Some things have slowed down,
given the world pandemic we've been experiencing this year.

I've had the design ready and half of it implemented and committed
into a branch. Just as per what I wrote earlier this year on this thread.

I need to finish the rest which isn't big, but does need
some unravelling of the current code. Then I need testing,
which I suppose a number of people can help, so long as
they can make a frame time out and kick in the timeout handler.

I'll have more details in a few weeks.

Regards,
Luben

On 2020-07-21 9:42 a.m., Andrey Grodzovsky wrote:
> Christian, I would want this very much but unfortunately I am on a strict 
> schedule for an internal project currently and hence will not be able to 
> actively participate. I will do my best to answer any questions Luben might have 
> about current implementation.
> 
> Andrey
> 
> On 7/21/20 9:39 AM, Christian König wrote:
>> Luben had a good idea how to tackle the whole job handling.
>>
>> Andrey/Lucas can you work with Luben to get this cleaned up because there are 
>> a lot of requirements on this which not only come from AMD.
>>
>> Thanks,
>> Christian.
>>
>> Am 21.07.20 um 15:36 schrieb Andrey Grodzovsky:
>>> Lucas, Luben picked the work on this a few month ago as I was diverted to a 
>>> different project.
>>>
>>> Luben, can you update Lucas please ?
>>>
>>> Andrey
>>>
>>> On 7/21/20 7:03 AM, Lucas Stach wrote:
>>>> It seems we all dropped the ball on this one. I believe this is still
>>>> an open issue. Has there been any progress from your side on fixing
>>>> this?
>>>>
>>>> Regards,
>>>> Lucas
>>

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 125+ messages in thread

* Re: [PATCH v4] drm/scheduler: Avoid accessing freed bad job.
@ 2020-07-21 18:29                                   ` Luben Tuikov
  0 siblings, 0 replies; 125+ messages in thread
From: Luben Tuikov @ 2020-07-21 18:29 UTC (permalink / raw)
  To: Andrey Grodzovsky, Christian König, Lucas Stach, Alex Deucher
  Cc: Emily Deng, amd-gfx list, Maling list - DRI developers, steven.price

Hi Lucas,

Thank you for following up on this. Some things have slowed down,
given the world pandemic we've been experiencing this year.

I've had the design ready and half of it implemented and committed
into a branch. Just as per what I wrote earlier this year on this thread.

I need to finish the rest which isn't big, but does need
some unravelling of the current code. Then I need testing,
which I suppose a number of people can help, so long as
they can make a frame time out and kick in the timeout handler.

I'll have more details in a few weeks.

Regards,
Luben

On 2020-07-21 9:42 a.m., Andrey Grodzovsky wrote:
> Christian, I would want this very much but unfortunately I am on a strict 
> schedule for an internal project currently and hence will not be able to 
> actively participate. I will do my best to answer any questions Luben might have 
> about current implementation.
> 
> Andrey
> 
> On 7/21/20 9:39 AM, Christian König wrote:
>> Luben had a good idea how to tackle the whole job handling.
>>
>> Andrey/Lucas can you work with Luben to get this cleaned up because there are 
>> a lot of requirements on this which not only come from AMD.
>>
>> Thanks,
>> Christian.
>>
>> Am 21.07.20 um 15:36 schrieb Andrey Grodzovsky:
>>> Lucas, Luben picked the work on this a few month ago as I was diverted to a 
>>> different project.
>>>
>>> Luben, can you update Lucas please ?
>>>
>>> Andrey
>>>
>>> On 7/21/20 7:03 AM, Lucas Stach wrote:
>>>> It seems we all dropped the ball on this one. I believe this is still
>>>> an open issue. Has there been any progress from your side on fixing
>>>> this?
>>>>
>>>> Regards,
>>>> Lucas
>>

_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 125+ messages in thread

* [PATCH 0/6] Allow to extend the timeout without jobs disappearing
  2020-07-21 18:29                                   ` Luben Tuikov
@ 2020-11-25  3:17                                     ` Luben Tuikov
  -1 siblings, 0 replies; 125+ messages in thread
From: Luben Tuikov @ 2020-11-25  3:17 UTC (permalink / raw)
  To: Andrey Grodzovsky, Christian König, Lucas Stach, Alexander Deucher
  Cc: Emily Deng, Luben Tuikov, amd-gfx, dri-devel, steven.price

Hi guys,

This series of patches implements a pending list for
jobs which are in the hardware, and a done list for
tasks which are done and need to be freed.

It implements a second thread, dedicated to freeing
tasks from the done list. The main scheduler thread no
longer frees (cleans up) done tasks by polling the head
of the pending list (drm_sched_get_cleanup_task() is
now gone)--it only pushes tasks down to the GPU. As
tasks complete and call their DRM callback, their
fences are signalled and tasks are queued to the done
list and the done thread woken up to free them. This
can take place concurrently with the main scheduler
thread pushing tasks down to the GPU.

When a task times out, the timeout function prototype
now is made to return a value back to DRM. The reason
for this is that the GPU driver has intimate knowledge
of the hardware and can pass back information to DRM on
what to do. Whether to attempt to abort the task (by
say calling a driver abort function, etc., as the
implementation dictates), or whether the task needs
more time. Note that the task is not moved away from
the pending list, unless it is no longer in the GPU.
(The pending list holds tasks which are pending from
DRM's point of view, i.e. the GPU has control over
them--that could be things like DMA is active, CU's are
active, for the task, etc.)

The idea really is that what DRM wants to know is
whether the task is in the GPU or not. So now
drm_sched_backend_ops::timedout_job() returns
0 of the task is no longer with the GPU, or 1
if the task needs more time.

Tested up to patch 5. Running with patch 6 seems to
make X/GDM just sleep, and I'm looking into this now.

This series applies to drm-misc-next.

Luben Tuikov (6):
  drm/scheduler: "node" --> "list"
  gpu/drm: ring_mirror_list --> pending_list
  drm/scheduler: Job timeout handler returns status
  drm/scheduler: Essentialize the job done callback
  drm/amdgpu: Don't hardcode thread name length
  drm/sched: Make use of a "done" thread

 drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c |   6 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_device.c  |   4 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_job.c     |   8 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h    |   2 +-
 drivers/gpu/drm/scheduler/sched_main.c      | 275 ++++++++++----------
 include/drm/gpu_scheduler.h                 |  43 ++-
 6 files changed, 186 insertions(+), 152 deletions(-)

-- 
2.29.2.154.g7f7ebe054a

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 125+ messages in thread

* [PATCH 0/6] Allow to extend the timeout without jobs disappearing
@ 2020-11-25  3:17                                     ` Luben Tuikov
  0 siblings, 0 replies; 125+ messages in thread
From: Luben Tuikov @ 2020-11-25  3:17 UTC (permalink / raw)
  To: Andrey Grodzovsky, Christian König, Lucas Stach, Alexander Deucher
  Cc: Emily Deng, Luben Tuikov, amd-gfx, dri-devel, steven.price

Hi guys,

This series of patches implements a pending list for
jobs which are in the hardware, and a done list for
tasks which are done and need to be freed.

It implements a second thread, dedicated to freeing
tasks from the done list. The main scheduler thread no
longer frees (cleans up) done tasks by polling the head
of the pending list (drm_sched_get_cleanup_task() is
now gone)--it only pushes tasks down to the GPU. As
tasks complete and call their DRM callback, their
fences are signalled and tasks are queued to the done
list and the done thread woken up to free them. This
can take place concurrently with the main scheduler
thread pushing tasks down to the GPU.

When a task times out, the timeout function prototype
now is made to return a value back to DRM. The reason
for this is that the GPU driver has intimate knowledge
of the hardware and can pass back information to DRM on
what to do. Whether to attempt to abort the task (by
say calling a driver abort function, etc., as the
implementation dictates), or whether the task needs
more time. Note that the task is not moved away from
the pending list, unless it is no longer in the GPU.
(The pending list holds tasks which are pending from
DRM's point of view, i.e. the GPU has control over
them--that could be things like DMA is active, CU's are
active, for the task, etc.)

The idea really is that what DRM wants to know is
whether the task is in the GPU or not. So now
drm_sched_backend_ops::timedout_job() returns
0 of the task is no longer with the GPU, or 1
if the task needs more time.

Tested up to patch 5. Running with patch 6 seems to
make X/GDM just sleep, and I'm looking into this now.

This series applies to drm-misc-next.

Luben Tuikov (6):
  drm/scheduler: "node" --> "list"
  gpu/drm: ring_mirror_list --> pending_list
  drm/scheduler: Job timeout handler returns status
  drm/scheduler: Essentialize the job done callback
  drm/amdgpu: Don't hardcode thread name length
  drm/sched: Make use of a "done" thread

 drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c |   6 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_device.c  |   4 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_job.c     |   8 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h    |   2 +-
 drivers/gpu/drm/scheduler/sched_main.c      | 275 ++++++++++----------
 include/drm/gpu_scheduler.h                 |  43 ++-
 6 files changed, 186 insertions(+), 152 deletions(-)

-- 
2.29.2.154.g7f7ebe054a

_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 125+ messages in thread

* [PATCH 1/6] drm/scheduler: "node" --> "list"
  2020-11-25  3:17                                     ` Luben Tuikov
@ 2020-11-25  3:17                                       ` Luben Tuikov
  -1 siblings, 0 replies; 125+ messages in thread
From: Luben Tuikov @ 2020-11-25  3:17 UTC (permalink / raw)
  To: Andrey Grodzovsky, Christian König, Lucas Stach, Alexander Deucher
  Cc: Emily Deng, Luben Tuikov, amd-gfx, dri-devel, steven.price

Rename "node" to "list" in struct drm_sched_job,
in order to make it consistent with what we see
being used throughout gpu_scheduler.h, for
instance in struct drm_sched_entity, as well as
the rest of DRM and the kernel.

Signed-off-by: Luben Tuikov <luben.tuikov@amd.com>
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c |  6 +++---
 drivers/gpu/drm/amd/amdgpu/amdgpu_device.c  |  2 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_job.c     |  2 +-
 drivers/gpu/drm/scheduler/sched_main.c      | 23 +++++++++++----------
 include/drm/gpu_scheduler.h                 |  4 ++--
 5 files changed, 19 insertions(+), 18 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c
index 5c1f3725c741..8358cae0b5a4 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c
@@ -1427,7 +1427,7 @@ static void amdgpu_ib_preempt_job_recovery(struct drm_gpu_scheduler *sched)
 	struct dma_fence *fence;
 
 	spin_lock(&sched->job_list_lock);
-	list_for_each_entry(s_job, &sched->ring_mirror_list, node) {
+	list_for_each_entry(s_job, &sched->ring_mirror_list, list) {
 		fence = sched->ops->run_job(s_job);
 		dma_fence_put(fence);
 	}
@@ -1459,10 +1459,10 @@ static void amdgpu_ib_preempt_mark_partial_job(struct amdgpu_ring *ring)
 
 no_preempt:
 	spin_lock(&sched->job_list_lock);
-	list_for_each_entry_safe(s_job, tmp, &sched->ring_mirror_list, node) {
+	list_for_each_entry_safe(s_job, tmp, &sched->ring_mirror_list, list) {
 		if (dma_fence_is_signaled(&s_job->s_fence->finished)) {
 			/* remove job from ring_mirror_list */
-			list_del_init(&s_job->node);
+			list_del_init(&s_job->list);
 			sched->ops->free_job(s_job);
 			continue;
 		}
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
index 7560b05e4ac1..4df6de81cd41 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
@@ -4128,7 +4128,7 @@ bool amdgpu_device_has_job_running(struct amdgpu_device *adev)
 
 		spin_lock(&ring->sched.job_list_lock);
 		job = list_first_entry_or_null(&ring->sched.ring_mirror_list,
-				struct drm_sched_job, node);
+				struct drm_sched_job, list);
 		spin_unlock(&ring->sched.job_list_lock);
 		if (job)
 			return true;
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
index dcfe8a3b03ff..aca52a46b93d 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
@@ -271,7 +271,7 @@ void amdgpu_job_stop_all_jobs_on_sched(struct drm_gpu_scheduler *sched)
 	}
 
 	/* Signal all jobs already scheduled to HW */
-	list_for_each_entry(s_job, &sched->ring_mirror_list, node) {
+	list_for_each_entry(s_job, &sched->ring_mirror_list, list) {
 		struct drm_sched_fence *s_fence = s_job->s_fence;
 
 		dma_fence_set_error(&s_fence->finished, -EHWPOISON);
diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c
index c6332d75025e..c52eba407ebd 100644
--- a/drivers/gpu/drm/scheduler/sched_main.c
+++ b/drivers/gpu/drm/scheduler/sched_main.c
@@ -272,7 +272,7 @@ static void drm_sched_job_begin(struct drm_sched_job *s_job)
 	struct drm_gpu_scheduler *sched = s_job->sched;
 
 	spin_lock(&sched->job_list_lock);
-	list_add_tail(&s_job->node, &sched->ring_mirror_list);
+	list_add_tail(&s_job->list, &sched->ring_mirror_list);
 	drm_sched_start_timeout(sched);
 	spin_unlock(&sched->job_list_lock);
 }
@@ -287,7 +287,7 @@ static void drm_sched_job_timedout(struct work_struct *work)
 	/* Protects against concurrent deletion in drm_sched_get_cleanup_job */
 	spin_lock(&sched->job_list_lock);
 	job = list_first_entry_or_null(&sched->ring_mirror_list,
-				       struct drm_sched_job, node);
+				       struct drm_sched_job, list);
 
 	if (job) {
 		/*
@@ -295,7 +295,7 @@ static void drm_sched_job_timedout(struct work_struct *work)
 		 * drm_sched_cleanup_jobs. It will be reinserted back after sched->thread
 		 * is parked at which point it's safe.
 		 */
-		list_del_init(&job->node);
+		list_del_init(&job->list);
 		spin_unlock(&sched->job_list_lock);
 
 		job->sched->ops->timedout_job(job);
@@ -392,7 +392,7 @@ void drm_sched_stop(struct drm_gpu_scheduler *sched, struct drm_sched_job *bad)
 		 * Add at the head of the queue to reflect it was the earliest
 		 * job extracted.
 		 */
-		list_add(&bad->node, &sched->ring_mirror_list);
+		list_add(&bad->list, &sched->ring_mirror_list);
 
 	/*
 	 * Iterate the job list from later to  earlier one and either deactive
@@ -400,7 +400,8 @@ void drm_sched_stop(struct drm_gpu_scheduler *sched, struct drm_sched_job *bad)
 	 * signaled.
 	 * This iteration is thread safe as sched thread is stopped.
 	 */
-	list_for_each_entry_safe_reverse(s_job, tmp, &sched->ring_mirror_list, node) {
+	list_for_each_entry_safe_reverse(s_job, tmp, &sched->ring_mirror_list,
+					 list) {
 		if (s_job->s_fence->parent &&
 		    dma_fence_remove_callback(s_job->s_fence->parent,
 					      &s_job->cb)) {
@@ -411,7 +412,7 @@ void drm_sched_stop(struct drm_gpu_scheduler *sched, struct drm_sched_job *bad)
 			 * Locking here is for concurrent resume timeout
 			 */
 			spin_lock(&sched->job_list_lock);
-			list_del_init(&s_job->node);
+			list_del_init(&s_job->list);
 			spin_unlock(&sched->job_list_lock);
 
 			/*
@@ -462,7 +463,7 @@ void drm_sched_start(struct drm_gpu_scheduler *sched, bool full_recovery)
 	 * so no new jobs are being inserted or removed. Also concurrent
 	 * GPU recovers can't run in parallel.
 	 */
-	list_for_each_entry_safe(s_job, tmp, &sched->ring_mirror_list, node) {
+	list_for_each_entry_safe(s_job, tmp, &sched->ring_mirror_list, list) {
 		struct dma_fence *fence = s_job->s_fence->parent;
 
 		atomic_inc(&sched->hw_rq_count);
@@ -505,7 +506,7 @@ void drm_sched_resubmit_jobs(struct drm_gpu_scheduler *sched)
 	bool found_guilty = false;
 	struct dma_fence *fence;
 
-	list_for_each_entry_safe(s_job, tmp, &sched->ring_mirror_list, node) {
+	list_for_each_entry_safe(s_job, tmp, &sched->ring_mirror_list, list) {
 		struct drm_sched_fence *s_fence = s_job->s_fence;
 
 		if (!found_guilty && atomic_read(&s_job->karma) > sched->hang_limit) {
@@ -565,7 +566,7 @@ int drm_sched_job_init(struct drm_sched_job *job,
 		return -ENOMEM;
 	job->id = atomic64_inc_return(&sched->job_id_count);
 
-	INIT_LIST_HEAD(&job->node);
+	INIT_LIST_HEAD(&job->list);
 
 	return 0;
 }
@@ -684,11 +685,11 @@ drm_sched_get_cleanup_job(struct drm_gpu_scheduler *sched)
 	spin_lock(&sched->job_list_lock);
 
 	job = list_first_entry_or_null(&sched->ring_mirror_list,
-				       struct drm_sched_job, node);
+				       struct drm_sched_job, list);
 
 	if (job && dma_fence_is_signaled(&job->s_fence->finished)) {
 		/* remove job from ring_mirror_list */
-		list_del_init(&job->node);
+		list_del_init(&job->list);
 	} else {
 		job = NULL;
 		/* queue timeout for next job */
diff --git a/include/drm/gpu_scheduler.h b/include/drm/gpu_scheduler.h
index 92436553fd6a..3add0072bd37 100644
--- a/include/drm/gpu_scheduler.h
+++ b/include/drm/gpu_scheduler.h
@@ -189,14 +189,14 @@ struct drm_sched_fence *to_drm_sched_fence(struct dma_fence *f);
  */
 struct drm_sched_job {
 	struct spsc_node		queue_node;
+	struct list_head		list;
 	struct drm_gpu_scheduler	*sched;
 	struct drm_sched_fence		*s_fence;
 	struct dma_fence_cb		finish_cb;
-	struct list_head		node;
 	uint64_t			id;
 	atomic_t			karma;
 	enum drm_sched_priority		s_priority;
-	struct drm_sched_entity  *entity;
+	struct drm_sched_entity         *entity;
 	struct dma_fence_cb		cb;
 };
 
-- 
2.29.2.154.g7f7ebe054a

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply related	[flat|nested] 125+ messages in thread

* [PATCH 1/6] drm/scheduler: "node" --> "list"
@ 2020-11-25  3:17                                       ` Luben Tuikov
  0 siblings, 0 replies; 125+ messages in thread
From: Luben Tuikov @ 2020-11-25  3:17 UTC (permalink / raw)
  To: Andrey Grodzovsky, Christian König, Lucas Stach, Alexander Deucher
  Cc: Emily Deng, Luben Tuikov, amd-gfx, dri-devel, steven.price

Rename "node" to "list" in struct drm_sched_job,
in order to make it consistent with what we see
being used throughout gpu_scheduler.h, for
instance in struct drm_sched_entity, as well as
the rest of DRM and the kernel.

Signed-off-by: Luben Tuikov <luben.tuikov@amd.com>
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c |  6 +++---
 drivers/gpu/drm/amd/amdgpu/amdgpu_device.c  |  2 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_job.c     |  2 +-
 drivers/gpu/drm/scheduler/sched_main.c      | 23 +++++++++++----------
 include/drm/gpu_scheduler.h                 |  4 ++--
 5 files changed, 19 insertions(+), 18 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c
index 5c1f3725c741..8358cae0b5a4 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c
@@ -1427,7 +1427,7 @@ static void amdgpu_ib_preempt_job_recovery(struct drm_gpu_scheduler *sched)
 	struct dma_fence *fence;
 
 	spin_lock(&sched->job_list_lock);
-	list_for_each_entry(s_job, &sched->ring_mirror_list, node) {
+	list_for_each_entry(s_job, &sched->ring_mirror_list, list) {
 		fence = sched->ops->run_job(s_job);
 		dma_fence_put(fence);
 	}
@@ -1459,10 +1459,10 @@ static void amdgpu_ib_preempt_mark_partial_job(struct amdgpu_ring *ring)
 
 no_preempt:
 	spin_lock(&sched->job_list_lock);
-	list_for_each_entry_safe(s_job, tmp, &sched->ring_mirror_list, node) {
+	list_for_each_entry_safe(s_job, tmp, &sched->ring_mirror_list, list) {
 		if (dma_fence_is_signaled(&s_job->s_fence->finished)) {
 			/* remove job from ring_mirror_list */
-			list_del_init(&s_job->node);
+			list_del_init(&s_job->list);
 			sched->ops->free_job(s_job);
 			continue;
 		}
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
index 7560b05e4ac1..4df6de81cd41 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
@@ -4128,7 +4128,7 @@ bool amdgpu_device_has_job_running(struct amdgpu_device *adev)
 
 		spin_lock(&ring->sched.job_list_lock);
 		job = list_first_entry_or_null(&ring->sched.ring_mirror_list,
-				struct drm_sched_job, node);
+				struct drm_sched_job, list);
 		spin_unlock(&ring->sched.job_list_lock);
 		if (job)
 			return true;
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
index dcfe8a3b03ff..aca52a46b93d 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
@@ -271,7 +271,7 @@ void amdgpu_job_stop_all_jobs_on_sched(struct drm_gpu_scheduler *sched)
 	}
 
 	/* Signal all jobs already scheduled to HW */
-	list_for_each_entry(s_job, &sched->ring_mirror_list, node) {
+	list_for_each_entry(s_job, &sched->ring_mirror_list, list) {
 		struct drm_sched_fence *s_fence = s_job->s_fence;
 
 		dma_fence_set_error(&s_fence->finished, -EHWPOISON);
diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c
index c6332d75025e..c52eba407ebd 100644
--- a/drivers/gpu/drm/scheduler/sched_main.c
+++ b/drivers/gpu/drm/scheduler/sched_main.c
@@ -272,7 +272,7 @@ static void drm_sched_job_begin(struct drm_sched_job *s_job)
 	struct drm_gpu_scheduler *sched = s_job->sched;
 
 	spin_lock(&sched->job_list_lock);
-	list_add_tail(&s_job->node, &sched->ring_mirror_list);
+	list_add_tail(&s_job->list, &sched->ring_mirror_list);
 	drm_sched_start_timeout(sched);
 	spin_unlock(&sched->job_list_lock);
 }
@@ -287,7 +287,7 @@ static void drm_sched_job_timedout(struct work_struct *work)
 	/* Protects against concurrent deletion in drm_sched_get_cleanup_job */
 	spin_lock(&sched->job_list_lock);
 	job = list_first_entry_or_null(&sched->ring_mirror_list,
-				       struct drm_sched_job, node);
+				       struct drm_sched_job, list);
 
 	if (job) {
 		/*
@@ -295,7 +295,7 @@ static void drm_sched_job_timedout(struct work_struct *work)
 		 * drm_sched_cleanup_jobs. It will be reinserted back after sched->thread
 		 * is parked at which point it's safe.
 		 */
-		list_del_init(&job->node);
+		list_del_init(&job->list);
 		spin_unlock(&sched->job_list_lock);
 
 		job->sched->ops->timedout_job(job);
@@ -392,7 +392,7 @@ void drm_sched_stop(struct drm_gpu_scheduler *sched, struct drm_sched_job *bad)
 		 * Add at the head of the queue to reflect it was the earliest
 		 * job extracted.
 		 */
-		list_add(&bad->node, &sched->ring_mirror_list);
+		list_add(&bad->list, &sched->ring_mirror_list);
 
 	/*
 	 * Iterate the job list from later to  earlier one and either deactive
@@ -400,7 +400,8 @@ void drm_sched_stop(struct drm_gpu_scheduler *sched, struct drm_sched_job *bad)
 	 * signaled.
 	 * This iteration is thread safe as sched thread is stopped.
 	 */
-	list_for_each_entry_safe_reverse(s_job, tmp, &sched->ring_mirror_list, node) {
+	list_for_each_entry_safe_reverse(s_job, tmp, &sched->ring_mirror_list,
+					 list) {
 		if (s_job->s_fence->parent &&
 		    dma_fence_remove_callback(s_job->s_fence->parent,
 					      &s_job->cb)) {
@@ -411,7 +412,7 @@ void drm_sched_stop(struct drm_gpu_scheduler *sched, struct drm_sched_job *bad)
 			 * Locking here is for concurrent resume timeout
 			 */
 			spin_lock(&sched->job_list_lock);
-			list_del_init(&s_job->node);
+			list_del_init(&s_job->list);
 			spin_unlock(&sched->job_list_lock);
 
 			/*
@@ -462,7 +463,7 @@ void drm_sched_start(struct drm_gpu_scheduler *sched, bool full_recovery)
 	 * so no new jobs are being inserted or removed. Also concurrent
 	 * GPU recovers can't run in parallel.
 	 */
-	list_for_each_entry_safe(s_job, tmp, &sched->ring_mirror_list, node) {
+	list_for_each_entry_safe(s_job, tmp, &sched->ring_mirror_list, list) {
 		struct dma_fence *fence = s_job->s_fence->parent;
 
 		atomic_inc(&sched->hw_rq_count);
@@ -505,7 +506,7 @@ void drm_sched_resubmit_jobs(struct drm_gpu_scheduler *sched)
 	bool found_guilty = false;
 	struct dma_fence *fence;
 
-	list_for_each_entry_safe(s_job, tmp, &sched->ring_mirror_list, node) {
+	list_for_each_entry_safe(s_job, tmp, &sched->ring_mirror_list, list) {
 		struct drm_sched_fence *s_fence = s_job->s_fence;
 
 		if (!found_guilty && atomic_read(&s_job->karma) > sched->hang_limit) {
@@ -565,7 +566,7 @@ int drm_sched_job_init(struct drm_sched_job *job,
 		return -ENOMEM;
 	job->id = atomic64_inc_return(&sched->job_id_count);
 
-	INIT_LIST_HEAD(&job->node);
+	INIT_LIST_HEAD(&job->list);
 
 	return 0;
 }
@@ -684,11 +685,11 @@ drm_sched_get_cleanup_job(struct drm_gpu_scheduler *sched)
 	spin_lock(&sched->job_list_lock);
 
 	job = list_first_entry_or_null(&sched->ring_mirror_list,
-				       struct drm_sched_job, node);
+				       struct drm_sched_job, list);
 
 	if (job && dma_fence_is_signaled(&job->s_fence->finished)) {
 		/* remove job from ring_mirror_list */
-		list_del_init(&job->node);
+		list_del_init(&job->list);
 	} else {
 		job = NULL;
 		/* queue timeout for next job */
diff --git a/include/drm/gpu_scheduler.h b/include/drm/gpu_scheduler.h
index 92436553fd6a..3add0072bd37 100644
--- a/include/drm/gpu_scheduler.h
+++ b/include/drm/gpu_scheduler.h
@@ -189,14 +189,14 @@ struct drm_sched_fence *to_drm_sched_fence(struct dma_fence *f);
  */
 struct drm_sched_job {
 	struct spsc_node		queue_node;
+	struct list_head		list;
 	struct drm_gpu_scheduler	*sched;
 	struct drm_sched_fence		*s_fence;
 	struct dma_fence_cb		finish_cb;
-	struct list_head		node;
 	uint64_t			id;
 	atomic_t			karma;
 	enum drm_sched_priority		s_priority;
-	struct drm_sched_entity  *entity;
+	struct drm_sched_entity         *entity;
 	struct dma_fence_cb		cb;
 };
 
-- 
2.29.2.154.g7f7ebe054a

_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply related	[flat|nested] 125+ messages in thread

* [PATCH 2/6] gpu/drm: ring_mirror_list --> pending_list
  2020-11-25  3:17                                     ` Luben Tuikov
@ 2020-11-25  3:17                                       ` Luben Tuikov
  -1 siblings, 0 replies; 125+ messages in thread
From: Luben Tuikov @ 2020-11-25  3:17 UTC (permalink / raw)
  To: Andrey Grodzovsky, Christian König, Lucas Stach, Alexander Deucher
  Cc: Emily Deng, Luben Tuikov, amd-gfx, dri-devel, steven.price

Rename "ring_mirror_list" to "pending_list",
to describe what something is, not what it does,
how it's used, or how the hardware implements it.

This also abstracts the actual hardware
implementation, i.e. how the low-level driver
communicates with the device it drives, ring, CAM,
etc., shouldn't be exposed to DRM.

The pending_list keeps jobs submitted, which are
out of our control. Usually this means they are
pending execution status in hardware, but the
latter definition is a more general (inclusive)
definition.

Signed-off-by: Luben Tuikov <luben.tuikov@amd.com>
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c |  4 +--
 drivers/gpu/drm/amd/amdgpu/amdgpu_device.c  |  4 +--
 drivers/gpu/drm/amd/amdgpu/amdgpu_job.c     |  2 +-
 drivers/gpu/drm/scheduler/sched_main.c      | 34 ++++++++++-----------
 include/drm/gpu_scheduler.h                 | 10 +++---
 5 files changed, 27 insertions(+), 27 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c
index 8358cae0b5a4..db77a5bdfa45 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c
@@ -1427,7 +1427,7 @@ static void amdgpu_ib_preempt_job_recovery(struct drm_gpu_scheduler *sched)
 	struct dma_fence *fence;
 
 	spin_lock(&sched->job_list_lock);
-	list_for_each_entry(s_job, &sched->ring_mirror_list, list) {
+	list_for_each_entry(s_job, &sched->pending_list, list) {
 		fence = sched->ops->run_job(s_job);
 		dma_fence_put(fence);
 	}
@@ -1459,7 +1459,7 @@ static void amdgpu_ib_preempt_mark_partial_job(struct amdgpu_ring *ring)
 
 no_preempt:
 	spin_lock(&sched->job_list_lock);
-	list_for_each_entry_safe(s_job, tmp, &sched->ring_mirror_list, list) {
+	list_for_each_entry_safe(s_job, tmp, &sched->pending_list, list) {
 		if (dma_fence_is_signaled(&s_job->s_fence->finished)) {
 			/* remove job from ring_mirror_list */
 			list_del_init(&s_job->list);
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
index 4df6de81cd41..fbae600aa5f9 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
@@ -4127,8 +4127,8 @@ bool amdgpu_device_has_job_running(struct amdgpu_device *adev)
 			continue;
 
 		spin_lock(&ring->sched.job_list_lock);
-		job = list_first_entry_or_null(&ring->sched.ring_mirror_list,
-				struct drm_sched_job, list);
+		job = list_first_entry_or_null(&ring->sched.pending_list,
+					       struct drm_sched_job, list);
 		spin_unlock(&ring->sched.job_list_lock);
 		if (job)
 			return true;
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
index aca52a46b93d..ff48101bab55 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
@@ -271,7 +271,7 @@ void amdgpu_job_stop_all_jobs_on_sched(struct drm_gpu_scheduler *sched)
 	}
 
 	/* Signal all jobs already scheduled to HW */
-	list_for_each_entry(s_job, &sched->ring_mirror_list, list) {
+	list_for_each_entry(s_job, &sched->pending_list, list) {
 		struct drm_sched_fence *s_fence = s_job->s_fence;
 
 		dma_fence_set_error(&s_fence->finished, -EHWPOISON);
diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c
index c52eba407ebd..b694df12aaba 100644
--- a/drivers/gpu/drm/scheduler/sched_main.c
+++ b/drivers/gpu/drm/scheduler/sched_main.c
@@ -198,7 +198,7 @@ EXPORT_SYMBOL(drm_sched_dependency_optimized);
 static void drm_sched_start_timeout(struct drm_gpu_scheduler *sched)
 {
 	if (sched->timeout != MAX_SCHEDULE_TIMEOUT &&
-	    !list_empty(&sched->ring_mirror_list))
+	    !list_empty(&sched->pending_list))
 		schedule_delayed_work(&sched->work_tdr, sched->timeout);
 }
 
@@ -258,7 +258,7 @@ void drm_sched_resume_timeout(struct drm_gpu_scheduler *sched,
 {
 	spin_lock(&sched->job_list_lock);
 
-	if (list_empty(&sched->ring_mirror_list))
+	if (list_empty(&sched->pending_list))
 		cancel_delayed_work(&sched->work_tdr);
 	else
 		mod_delayed_work(system_wq, &sched->work_tdr, remaining);
@@ -272,7 +272,7 @@ static void drm_sched_job_begin(struct drm_sched_job *s_job)
 	struct drm_gpu_scheduler *sched = s_job->sched;
 
 	spin_lock(&sched->job_list_lock);
-	list_add_tail(&s_job->list, &sched->ring_mirror_list);
+	list_add_tail(&s_job->list, &sched->pending_list);
 	drm_sched_start_timeout(sched);
 	spin_unlock(&sched->job_list_lock);
 }
@@ -286,7 +286,7 @@ static void drm_sched_job_timedout(struct work_struct *work)
 
 	/* Protects against concurrent deletion in drm_sched_get_cleanup_job */
 	spin_lock(&sched->job_list_lock);
-	job = list_first_entry_or_null(&sched->ring_mirror_list,
+	job = list_first_entry_or_null(&sched->pending_list,
 				       struct drm_sched_job, list);
 
 	if (job) {
@@ -371,7 +371,7 @@ EXPORT_SYMBOL(drm_sched_increase_karma);
  * Stop the scheduler and also removes and frees all completed jobs.
  * Note: bad job will not be freed as it might be used later and so it's
  * callers responsibility to release it manually if it's not part of the
- * mirror list any more.
+ * pending list any more.
  *
  */
 void drm_sched_stop(struct drm_gpu_scheduler *sched, struct drm_sched_job *bad)
@@ -392,15 +392,15 @@ void drm_sched_stop(struct drm_gpu_scheduler *sched, struct drm_sched_job *bad)
 		 * Add at the head of the queue to reflect it was the earliest
 		 * job extracted.
 		 */
-		list_add(&bad->list, &sched->ring_mirror_list);
+		list_add(&bad->list, &sched->pending_list);
 
 	/*
 	 * Iterate the job list from later to  earlier one and either deactive
-	 * their HW callbacks or remove them from mirror list if they already
+	 * their HW callbacks or remove them from pending list if they already
 	 * signaled.
 	 * This iteration is thread safe as sched thread is stopped.
 	 */
-	list_for_each_entry_safe_reverse(s_job, tmp, &sched->ring_mirror_list,
+	list_for_each_entry_safe_reverse(s_job, tmp, &sched->pending_list,
 					 list) {
 		if (s_job->s_fence->parent &&
 		    dma_fence_remove_callback(s_job->s_fence->parent,
@@ -408,7 +408,7 @@ void drm_sched_stop(struct drm_gpu_scheduler *sched, struct drm_sched_job *bad)
 			atomic_dec(&sched->hw_rq_count);
 		} else {
 			/*
-			 * remove job from ring_mirror_list.
+			 * remove job from pending_list.
 			 * Locking here is for concurrent resume timeout
 			 */
 			spin_lock(&sched->job_list_lock);
@@ -463,7 +463,7 @@ void drm_sched_start(struct drm_gpu_scheduler *sched, bool full_recovery)
 	 * so no new jobs are being inserted or removed. Also concurrent
 	 * GPU recovers can't run in parallel.
 	 */
-	list_for_each_entry_safe(s_job, tmp, &sched->ring_mirror_list, list) {
+	list_for_each_entry_safe(s_job, tmp, &sched->pending_list, list) {
 		struct dma_fence *fence = s_job->s_fence->parent;
 
 		atomic_inc(&sched->hw_rq_count);
@@ -494,7 +494,7 @@ void drm_sched_start(struct drm_gpu_scheduler *sched, bool full_recovery)
 EXPORT_SYMBOL(drm_sched_start);
 
 /**
- * drm_sched_resubmit_jobs - helper to relunch job from mirror ring list
+ * drm_sched_resubmit_jobs - helper to relunch job from pending ring list
  *
  * @sched: scheduler instance
  *
@@ -506,7 +506,7 @@ void drm_sched_resubmit_jobs(struct drm_gpu_scheduler *sched)
 	bool found_guilty = false;
 	struct dma_fence *fence;
 
-	list_for_each_entry_safe(s_job, tmp, &sched->ring_mirror_list, list) {
+	list_for_each_entry_safe(s_job, tmp, &sched->pending_list, list) {
 		struct drm_sched_fence *s_fence = s_job->s_fence;
 
 		if (!found_guilty && atomic_read(&s_job->karma) > sched->hang_limit) {
@@ -665,7 +665,7 @@ static void drm_sched_process_job(struct dma_fence *f, struct dma_fence_cb *cb)
  *
  * @sched: scheduler instance
  *
- * Returns the next finished job from the mirror list (if there is one)
+ * Returns the next finished job from the pending list (if there is one)
  * ready for it to be destroyed.
  */
 static struct drm_sched_job *
@@ -675,7 +675,7 @@ drm_sched_get_cleanup_job(struct drm_gpu_scheduler *sched)
 
 	/*
 	 * Don't destroy jobs while the timeout worker is running  OR thread
-	 * is being parked and hence assumed to not touch ring_mirror_list
+	 * is being parked and hence assumed to not touch pending_list
 	 */
 	if ((sched->timeout != MAX_SCHEDULE_TIMEOUT &&
 	    !cancel_delayed_work(&sched->work_tdr)) ||
@@ -684,11 +684,11 @@ drm_sched_get_cleanup_job(struct drm_gpu_scheduler *sched)
 
 	spin_lock(&sched->job_list_lock);
 
-	job = list_first_entry_or_null(&sched->ring_mirror_list,
+	job = list_first_entry_or_null(&sched->pending_list,
 				       struct drm_sched_job, list);
 
 	if (job && dma_fence_is_signaled(&job->s_fence->finished)) {
-		/* remove job from ring_mirror_list */
+		/* remove job from pending_list */
 		list_del_init(&job->list);
 	} else {
 		job = NULL;
@@ -858,7 +858,7 @@ int drm_sched_init(struct drm_gpu_scheduler *sched,
 
 	init_waitqueue_head(&sched->wake_up_worker);
 	init_waitqueue_head(&sched->job_scheduled);
-	INIT_LIST_HEAD(&sched->ring_mirror_list);
+	INIT_LIST_HEAD(&sched->pending_list);
 	spin_lock_init(&sched->job_list_lock);
 	atomic_set(&sched->hw_rq_count, 0);
 	INIT_DELAYED_WORK(&sched->work_tdr, drm_sched_job_timedout);
diff --git a/include/drm/gpu_scheduler.h b/include/drm/gpu_scheduler.h
index 3add0072bd37..2e0c368e19f6 100644
--- a/include/drm/gpu_scheduler.h
+++ b/include/drm/gpu_scheduler.h
@@ -174,7 +174,7 @@ struct drm_sched_fence *to_drm_sched_fence(struct dma_fence *f);
  * @sched: the scheduler instance on which this job is scheduled.
  * @s_fence: contains the fences for the scheduling of job.
  * @finish_cb: the callback for the finished fence.
- * @node: used to append this struct to the @drm_gpu_scheduler.ring_mirror_list.
+ * @node: used to append this struct to the @drm_gpu_scheduler.pending_list.
  * @id: a unique id assigned to each job scheduled on the scheduler.
  * @karma: increment on every hang caused by this job. If this exceeds the hang
  *         limit of the scheduler then the job is marked guilty and will not
@@ -203,7 +203,7 @@ struct drm_sched_job {
 static inline bool drm_sched_invalidate_job(struct drm_sched_job *s_job,
 					    int threshold)
 {
-	return (s_job && atomic_inc_return(&s_job->karma) > threshold);
+	return s_job && atomic_inc_return(&s_job->karma) > threshold;
 }
 
 /**
@@ -260,8 +260,8 @@ struct drm_sched_backend_ops {
  * @work_tdr: schedules a delayed call to @drm_sched_job_timedout after the
  *            timeout interval is over.
  * @thread: the kthread on which the scheduler which run.
- * @ring_mirror_list: the list of jobs which are currently in the job queue.
- * @job_list_lock: lock to protect the ring_mirror_list.
+ * @pending_list: the list of jobs which are currently in the job queue.
+ * @job_list_lock: lock to protect the pending_list.
  * @hang_limit: once the hangs by a job crosses this limit then it is marked
  *              guilty and it will be considered for scheduling further.
  * @score: score to help loadbalancer pick a idle sched
@@ -282,7 +282,7 @@ struct drm_gpu_scheduler {
 	atomic64_t			job_id_count;
 	struct delayed_work		work_tdr;
 	struct task_struct		*thread;
-	struct list_head		ring_mirror_list;
+	struct list_head		pending_list;
 	spinlock_t			job_list_lock;
 	int				hang_limit;
 	atomic_t                        score;
-- 
2.29.2.154.g7f7ebe054a

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply related	[flat|nested] 125+ messages in thread

* [PATCH 2/6] gpu/drm: ring_mirror_list --> pending_list
@ 2020-11-25  3:17                                       ` Luben Tuikov
  0 siblings, 0 replies; 125+ messages in thread
From: Luben Tuikov @ 2020-11-25  3:17 UTC (permalink / raw)
  To: Andrey Grodzovsky, Christian König, Lucas Stach, Alexander Deucher
  Cc: Emily Deng, Luben Tuikov, amd-gfx, dri-devel, steven.price

Rename "ring_mirror_list" to "pending_list",
to describe what something is, not what it does,
how it's used, or how the hardware implements it.

This also abstracts the actual hardware
implementation, i.e. how the low-level driver
communicates with the device it drives, ring, CAM,
etc., shouldn't be exposed to DRM.

The pending_list keeps jobs submitted, which are
out of our control. Usually this means they are
pending execution status in hardware, but the
latter definition is a more general (inclusive)
definition.

Signed-off-by: Luben Tuikov <luben.tuikov@amd.com>
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c |  4 +--
 drivers/gpu/drm/amd/amdgpu/amdgpu_device.c  |  4 +--
 drivers/gpu/drm/amd/amdgpu/amdgpu_job.c     |  2 +-
 drivers/gpu/drm/scheduler/sched_main.c      | 34 ++++++++++-----------
 include/drm/gpu_scheduler.h                 | 10 +++---
 5 files changed, 27 insertions(+), 27 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c
index 8358cae0b5a4..db77a5bdfa45 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c
@@ -1427,7 +1427,7 @@ static void amdgpu_ib_preempt_job_recovery(struct drm_gpu_scheduler *sched)
 	struct dma_fence *fence;
 
 	spin_lock(&sched->job_list_lock);
-	list_for_each_entry(s_job, &sched->ring_mirror_list, list) {
+	list_for_each_entry(s_job, &sched->pending_list, list) {
 		fence = sched->ops->run_job(s_job);
 		dma_fence_put(fence);
 	}
@@ -1459,7 +1459,7 @@ static void amdgpu_ib_preempt_mark_partial_job(struct amdgpu_ring *ring)
 
 no_preempt:
 	spin_lock(&sched->job_list_lock);
-	list_for_each_entry_safe(s_job, tmp, &sched->ring_mirror_list, list) {
+	list_for_each_entry_safe(s_job, tmp, &sched->pending_list, list) {
 		if (dma_fence_is_signaled(&s_job->s_fence->finished)) {
 			/* remove job from ring_mirror_list */
 			list_del_init(&s_job->list);
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
index 4df6de81cd41..fbae600aa5f9 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
@@ -4127,8 +4127,8 @@ bool amdgpu_device_has_job_running(struct amdgpu_device *adev)
 			continue;
 
 		spin_lock(&ring->sched.job_list_lock);
-		job = list_first_entry_or_null(&ring->sched.ring_mirror_list,
-				struct drm_sched_job, list);
+		job = list_first_entry_or_null(&ring->sched.pending_list,
+					       struct drm_sched_job, list);
 		spin_unlock(&ring->sched.job_list_lock);
 		if (job)
 			return true;
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
index aca52a46b93d..ff48101bab55 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
@@ -271,7 +271,7 @@ void amdgpu_job_stop_all_jobs_on_sched(struct drm_gpu_scheduler *sched)
 	}
 
 	/* Signal all jobs already scheduled to HW */
-	list_for_each_entry(s_job, &sched->ring_mirror_list, list) {
+	list_for_each_entry(s_job, &sched->pending_list, list) {
 		struct drm_sched_fence *s_fence = s_job->s_fence;
 
 		dma_fence_set_error(&s_fence->finished, -EHWPOISON);
diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c
index c52eba407ebd..b694df12aaba 100644
--- a/drivers/gpu/drm/scheduler/sched_main.c
+++ b/drivers/gpu/drm/scheduler/sched_main.c
@@ -198,7 +198,7 @@ EXPORT_SYMBOL(drm_sched_dependency_optimized);
 static void drm_sched_start_timeout(struct drm_gpu_scheduler *sched)
 {
 	if (sched->timeout != MAX_SCHEDULE_TIMEOUT &&
-	    !list_empty(&sched->ring_mirror_list))
+	    !list_empty(&sched->pending_list))
 		schedule_delayed_work(&sched->work_tdr, sched->timeout);
 }
 
@@ -258,7 +258,7 @@ void drm_sched_resume_timeout(struct drm_gpu_scheduler *sched,
 {
 	spin_lock(&sched->job_list_lock);
 
-	if (list_empty(&sched->ring_mirror_list))
+	if (list_empty(&sched->pending_list))
 		cancel_delayed_work(&sched->work_tdr);
 	else
 		mod_delayed_work(system_wq, &sched->work_tdr, remaining);
@@ -272,7 +272,7 @@ static void drm_sched_job_begin(struct drm_sched_job *s_job)
 	struct drm_gpu_scheduler *sched = s_job->sched;
 
 	spin_lock(&sched->job_list_lock);
-	list_add_tail(&s_job->list, &sched->ring_mirror_list);
+	list_add_tail(&s_job->list, &sched->pending_list);
 	drm_sched_start_timeout(sched);
 	spin_unlock(&sched->job_list_lock);
 }
@@ -286,7 +286,7 @@ static void drm_sched_job_timedout(struct work_struct *work)
 
 	/* Protects against concurrent deletion in drm_sched_get_cleanup_job */
 	spin_lock(&sched->job_list_lock);
-	job = list_first_entry_or_null(&sched->ring_mirror_list,
+	job = list_first_entry_or_null(&sched->pending_list,
 				       struct drm_sched_job, list);
 
 	if (job) {
@@ -371,7 +371,7 @@ EXPORT_SYMBOL(drm_sched_increase_karma);
  * Stop the scheduler and also removes and frees all completed jobs.
  * Note: bad job will not be freed as it might be used later and so it's
  * callers responsibility to release it manually if it's not part of the
- * mirror list any more.
+ * pending list any more.
  *
  */
 void drm_sched_stop(struct drm_gpu_scheduler *sched, struct drm_sched_job *bad)
@@ -392,15 +392,15 @@ void drm_sched_stop(struct drm_gpu_scheduler *sched, struct drm_sched_job *bad)
 		 * Add at the head of the queue to reflect it was the earliest
 		 * job extracted.
 		 */
-		list_add(&bad->list, &sched->ring_mirror_list);
+		list_add(&bad->list, &sched->pending_list);
 
 	/*
 	 * Iterate the job list from later to  earlier one and either deactive
-	 * their HW callbacks or remove them from mirror list if they already
+	 * their HW callbacks or remove them from pending list if they already
 	 * signaled.
 	 * This iteration is thread safe as sched thread is stopped.
 	 */
-	list_for_each_entry_safe_reverse(s_job, tmp, &sched->ring_mirror_list,
+	list_for_each_entry_safe_reverse(s_job, tmp, &sched->pending_list,
 					 list) {
 		if (s_job->s_fence->parent &&
 		    dma_fence_remove_callback(s_job->s_fence->parent,
@@ -408,7 +408,7 @@ void drm_sched_stop(struct drm_gpu_scheduler *sched, struct drm_sched_job *bad)
 			atomic_dec(&sched->hw_rq_count);
 		} else {
 			/*
-			 * remove job from ring_mirror_list.
+			 * remove job from pending_list.
 			 * Locking here is for concurrent resume timeout
 			 */
 			spin_lock(&sched->job_list_lock);
@@ -463,7 +463,7 @@ void drm_sched_start(struct drm_gpu_scheduler *sched, bool full_recovery)
 	 * so no new jobs are being inserted or removed. Also concurrent
 	 * GPU recovers can't run in parallel.
 	 */
-	list_for_each_entry_safe(s_job, tmp, &sched->ring_mirror_list, list) {
+	list_for_each_entry_safe(s_job, tmp, &sched->pending_list, list) {
 		struct dma_fence *fence = s_job->s_fence->parent;
 
 		atomic_inc(&sched->hw_rq_count);
@@ -494,7 +494,7 @@ void drm_sched_start(struct drm_gpu_scheduler *sched, bool full_recovery)
 EXPORT_SYMBOL(drm_sched_start);
 
 /**
- * drm_sched_resubmit_jobs - helper to relunch job from mirror ring list
+ * drm_sched_resubmit_jobs - helper to relunch job from pending ring list
  *
  * @sched: scheduler instance
  *
@@ -506,7 +506,7 @@ void drm_sched_resubmit_jobs(struct drm_gpu_scheduler *sched)
 	bool found_guilty = false;
 	struct dma_fence *fence;
 
-	list_for_each_entry_safe(s_job, tmp, &sched->ring_mirror_list, list) {
+	list_for_each_entry_safe(s_job, tmp, &sched->pending_list, list) {
 		struct drm_sched_fence *s_fence = s_job->s_fence;
 
 		if (!found_guilty && atomic_read(&s_job->karma) > sched->hang_limit) {
@@ -665,7 +665,7 @@ static void drm_sched_process_job(struct dma_fence *f, struct dma_fence_cb *cb)
  *
  * @sched: scheduler instance
  *
- * Returns the next finished job from the mirror list (if there is one)
+ * Returns the next finished job from the pending list (if there is one)
  * ready for it to be destroyed.
  */
 static struct drm_sched_job *
@@ -675,7 +675,7 @@ drm_sched_get_cleanup_job(struct drm_gpu_scheduler *sched)
 
 	/*
 	 * Don't destroy jobs while the timeout worker is running  OR thread
-	 * is being parked and hence assumed to not touch ring_mirror_list
+	 * is being parked and hence assumed to not touch pending_list
 	 */
 	if ((sched->timeout != MAX_SCHEDULE_TIMEOUT &&
 	    !cancel_delayed_work(&sched->work_tdr)) ||
@@ -684,11 +684,11 @@ drm_sched_get_cleanup_job(struct drm_gpu_scheduler *sched)
 
 	spin_lock(&sched->job_list_lock);
 
-	job = list_first_entry_or_null(&sched->ring_mirror_list,
+	job = list_first_entry_or_null(&sched->pending_list,
 				       struct drm_sched_job, list);
 
 	if (job && dma_fence_is_signaled(&job->s_fence->finished)) {
-		/* remove job from ring_mirror_list */
+		/* remove job from pending_list */
 		list_del_init(&job->list);
 	} else {
 		job = NULL;
@@ -858,7 +858,7 @@ int drm_sched_init(struct drm_gpu_scheduler *sched,
 
 	init_waitqueue_head(&sched->wake_up_worker);
 	init_waitqueue_head(&sched->job_scheduled);
-	INIT_LIST_HEAD(&sched->ring_mirror_list);
+	INIT_LIST_HEAD(&sched->pending_list);
 	spin_lock_init(&sched->job_list_lock);
 	atomic_set(&sched->hw_rq_count, 0);
 	INIT_DELAYED_WORK(&sched->work_tdr, drm_sched_job_timedout);
diff --git a/include/drm/gpu_scheduler.h b/include/drm/gpu_scheduler.h
index 3add0072bd37..2e0c368e19f6 100644
--- a/include/drm/gpu_scheduler.h
+++ b/include/drm/gpu_scheduler.h
@@ -174,7 +174,7 @@ struct drm_sched_fence *to_drm_sched_fence(struct dma_fence *f);
  * @sched: the scheduler instance on which this job is scheduled.
  * @s_fence: contains the fences for the scheduling of job.
  * @finish_cb: the callback for the finished fence.
- * @node: used to append this struct to the @drm_gpu_scheduler.ring_mirror_list.
+ * @node: used to append this struct to the @drm_gpu_scheduler.pending_list.
  * @id: a unique id assigned to each job scheduled on the scheduler.
  * @karma: increment on every hang caused by this job. If this exceeds the hang
  *         limit of the scheduler then the job is marked guilty and will not
@@ -203,7 +203,7 @@ struct drm_sched_job {
 static inline bool drm_sched_invalidate_job(struct drm_sched_job *s_job,
 					    int threshold)
 {
-	return (s_job && atomic_inc_return(&s_job->karma) > threshold);
+	return s_job && atomic_inc_return(&s_job->karma) > threshold;
 }
 
 /**
@@ -260,8 +260,8 @@ struct drm_sched_backend_ops {
  * @work_tdr: schedules a delayed call to @drm_sched_job_timedout after the
  *            timeout interval is over.
  * @thread: the kthread on which the scheduler which run.
- * @ring_mirror_list: the list of jobs which are currently in the job queue.
- * @job_list_lock: lock to protect the ring_mirror_list.
+ * @pending_list: the list of jobs which are currently in the job queue.
+ * @job_list_lock: lock to protect the pending_list.
  * @hang_limit: once the hangs by a job crosses this limit then it is marked
  *              guilty and it will be considered for scheduling further.
  * @score: score to help loadbalancer pick a idle sched
@@ -282,7 +282,7 @@ struct drm_gpu_scheduler {
 	atomic64_t			job_id_count;
 	struct delayed_work		work_tdr;
 	struct task_struct		*thread;
-	struct list_head		ring_mirror_list;
+	struct list_head		pending_list;
 	spinlock_t			job_list_lock;
 	int				hang_limit;
 	atomic_t                        score;
-- 
2.29.2.154.g7f7ebe054a

_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply related	[flat|nested] 125+ messages in thread

* [PATCH 3/6] drm/scheduler: Job timeout handler returns status
  2020-11-25  3:17                                     ` Luben Tuikov
@ 2020-11-25  3:17                                       ` Luben Tuikov
  -1 siblings, 0 replies; 125+ messages in thread
From: Luben Tuikov @ 2020-11-25  3:17 UTC (permalink / raw)
  To: Andrey Grodzovsky, Christian König, Lucas Stach, Alexander Deucher
  Cc: Emily Deng, Luben Tuikov, amd-gfx, dri-devel, steven.price

The job timeout handler now returns status
indicating back to the DRM layer whether the job
was successfully cancelled or whether more time
should be given to the job to complete.

Signed-off-by: Luben Tuikov <luben.tuikov@amd.com>
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_job.c |  6 ++++--
 include/drm/gpu_scheduler.h             | 13 ++++++++++---
 2 files changed, 14 insertions(+), 5 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
index ff48101bab55..81b73790ecc6 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
@@ -28,7 +28,7 @@
 #include "amdgpu.h"
 #include "amdgpu_trace.h"
 
-static void amdgpu_job_timedout(struct drm_sched_job *s_job)
+static int amdgpu_job_timedout(struct drm_sched_job *s_job)
 {
 	struct amdgpu_ring *ring = to_amdgpu_ring(s_job->sched);
 	struct amdgpu_job *job = to_amdgpu_job(s_job);
@@ -41,7 +41,7 @@ static void amdgpu_job_timedout(struct drm_sched_job *s_job)
 	    amdgpu_ring_soft_recovery(ring, job->vmid, s_job->s_fence->parent)) {
 		DRM_ERROR("ring %s timeout, but soft recovered\n",
 			  s_job->sched->name);
-		return;
+		return 0;
 	}
 
 	amdgpu_vm_get_task_info(ring->adev, job->pasid, &ti);
@@ -53,10 +53,12 @@ static void amdgpu_job_timedout(struct drm_sched_job *s_job)
 
 	if (amdgpu_device_should_recover_gpu(ring->adev)) {
 		amdgpu_device_gpu_recover(ring->adev, job);
+		return 0;
 	} else {
 		drm_sched_suspend_timeout(&ring->sched);
 		if (amdgpu_sriov_vf(adev))
 			adev->virt.tdr_debug = true;
+		return 1;
 	}
 }
 
diff --git a/include/drm/gpu_scheduler.h b/include/drm/gpu_scheduler.h
index 2e0c368e19f6..61f7121e1c19 100644
--- a/include/drm/gpu_scheduler.h
+++ b/include/drm/gpu_scheduler.h
@@ -230,10 +230,17 @@ struct drm_sched_backend_ops {
 	struct dma_fence *(*run_job)(struct drm_sched_job *sched_job);
 
 	/**
-         * @timedout_job: Called when a job has taken too long to execute,
-         * to trigger GPU recovery.
+	 * @timedout_job: Called when a job has taken too long to execute,
+	 * to trigger GPU recovery.
+	 *
+	 * Return 0, if the job has been aborted successfully and will
+	 * never be heard of from the device. Return non-zero if the
+	 * job wasn't able to be aborted, i.e. if more time should be
+	 * given to this job. The result is not "bool" as this
+	 * function is not a predicate, although its result may seem
+	 * as one.
 	 */
-	void (*timedout_job)(struct drm_sched_job *sched_job);
+	int (*timedout_job)(struct drm_sched_job *sched_job);
 
 	/**
          * @free_job: Called once the job's finished fence has been signaled
-- 
2.29.2.154.g7f7ebe054a

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply related	[flat|nested] 125+ messages in thread

* [PATCH 3/6] drm/scheduler: Job timeout handler returns status
@ 2020-11-25  3:17                                       ` Luben Tuikov
  0 siblings, 0 replies; 125+ messages in thread
From: Luben Tuikov @ 2020-11-25  3:17 UTC (permalink / raw)
  To: Andrey Grodzovsky, Christian König, Lucas Stach, Alexander Deucher
  Cc: Emily Deng, Luben Tuikov, amd-gfx, dri-devel, steven.price

The job timeout handler now returns status
indicating back to the DRM layer whether the job
was successfully cancelled or whether more time
should be given to the job to complete.

Signed-off-by: Luben Tuikov <luben.tuikov@amd.com>
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_job.c |  6 ++++--
 include/drm/gpu_scheduler.h             | 13 ++++++++++---
 2 files changed, 14 insertions(+), 5 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
index ff48101bab55..81b73790ecc6 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
@@ -28,7 +28,7 @@
 #include "amdgpu.h"
 #include "amdgpu_trace.h"
 
-static void amdgpu_job_timedout(struct drm_sched_job *s_job)
+static int amdgpu_job_timedout(struct drm_sched_job *s_job)
 {
 	struct amdgpu_ring *ring = to_amdgpu_ring(s_job->sched);
 	struct amdgpu_job *job = to_amdgpu_job(s_job);
@@ -41,7 +41,7 @@ static void amdgpu_job_timedout(struct drm_sched_job *s_job)
 	    amdgpu_ring_soft_recovery(ring, job->vmid, s_job->s_fence->parent)) {
 		DRM_ERROR("ring %s timeout, but soft recovered\n",
 			  s_job->sched->name);
-		return;
+		return 0;
 	}
 
 	amdgpu_vm_get_task_info(ring->adev, job->pasid, &ti);
@@ -53,10 +53,12 @@ static void amdgpu_job_timedout(struct drm_sched_job *s_job)
 
 	if (amdgpu_device_should_recover_gpu(ring->adev)) {
 		amdgpu_device_gpu_recover(ring->adev, job);
+		return 0;
 	} else {
 		drm_sched_suspend_timeout(&ring->sched);
 		if (amdgpu_sriov_vf(adev))
 			adev->virt.tdr_debug = true;
+		return 1;
 	}
 }
 
diff --git a/include/drm/gpu_scheduler.h b/include/drm/gpu_scheduler.h
index 2e0c368e19f6..61f7121e1c19 100644
--- a/include/drm/gpu_scheduler.h
+++ b/include/drm/gpu_scheduler.h
@@ -230,10 +230,17 @@ struct drm_sched_backend_ops {
 	struct dma_fence *(*run_job)(struct drm_sched_job *sched_job);
 
 	/**
-         * @timedout_job: Called when a job has taken too long to execute,
-         * to trigger GPU recovery.
+	 * @timedout_job: Called when a job has taken too long to execute,
+	 * to trigger GPU recovery.
+	 *
+	 * Return 0, if the job has been aborted successfully and will
+	 * never be heard of from the device. Return non-zero if the
+	 * job wasn't able to be aborted, i.e. if more time should be
+	 * given to this job. The result is not "bool" as this
+	 * function is not a predicate, although its result may seem
+	 * as one.
 	 */
-	void (*timedout_job)(struct drm_sched_job *sched_job);
+	int (*timedout_job)(struct drm_sched_job *sched_job);
 
 	/**
          * @free_job: Called once the job's finished fence has been signaled
-- 
2.29.2.154.g7f7ebe054a

_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply related	[flat|nested] 125+ messages in thread

* [PATCH 4/6] drm/scheduler: Essentialize the job done callback
  2020-11-25  3:17                                     ` Luben Tuikov
@ 2020-11-25  3:17                                       ` Luben Tuikov
  -1 siblings, 0 replies; 125+ messages in thread
From: Luben Tuikov @ 2020-11-25  3:17 UTC (permalink / raw)
  To: Andrey Grodzovsky, Christian König, Lucas Stach, Alexander Deucher
  Cc: Emily Deng, Luben Tuikov, amd-gfx, dri-devel, steven.price

The job done callback is called from various
places, in two ways: in job done role, and
as a fence callback role.

Essentialize the callback to an atom
function to just complete the job,
and into a second function as a prototype
of fence callback which calls to complete
the job.

This is used in latter patches by the completion
code.

Signed-off-by: Luben Tuikov <luben.tuikov@amd.com>
---
 drivers/gpu/drm/scheduler/sched_main.c | 73 ++++++++++++++------------
 1 file changed, 40 insertions(+), 33 deletions(-)

diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c
index b694df12aaba..3eb7618a627d 100644
--- a/drivers/gpu/drm/scheduler/sched_main.c
+++ b/drivers/gpu/drm/scheduler/sched_main.c
@@ -60,8 +60,6 @@
 #define to_drm_sched_job(sched_job)		\
 		container_of((sched_job), struct drm_sched_job, queue_node)
 
-static void drm_sched_process_job(struct dma_fence *f, struct dma_fence_cb *cb);
-
 /**
  * drm_sched_rq_init - initialize a given run queue struct
  *
@@ -162,6 +160,40 @@ drm_sched_rq_select_entity(struct drm_sched_rq *rq)
 	return NULL;
 }
 
+/**
+ * drm_sched_job_done - complete a job
+ * @s_job: pointer to the job which is done
+ *
+ * Finish the job's fence and wake up the worker thread.
+ */
+static void drm_sched_job_done(struct drm_sched_job *s_job)
+{
+	struct drm_sched_fence *s_fence = s_job->s_fence;
+	struct drm_gpu_scheduler *sched = s_fence->sched;
+
+	atomic_dec(&sched->hw_rq_count);
+	atomic_dec(&sched->score);
+
+	trace_drm_sched_process_job(s_fence);
+
+	dma_fence_get(&s_fence->finished);
+	drm_sched_fence_finished(s_fence);
+	dma_fence_put(&s_fence->finished);
+	wake_up_interruptible(&sched->wake_up_worker);
+}
+
+/**
+ * drm_sched_job_done_cb - the callback for a done job
+ * @f: fence
+ * @cb: fence callbacks
+ */
+static void drm_sched_job_done_cb(struct dma_fence *f, struct dma_fence_cb *cb)
+{
+	struct drm_sched_job *s_job = container_of(cb, struct drm_sched_job, cb);
+
+	drm_sched_job_done(s_job);
+}
+
 /**
  * drm_sched_dependency_optimized
  *
@@ -473,14 +505,14 @@ void drm_sched_start(struct drm_gpu_scheduler *sched, bool full_recovery)
 
 		if (fence) {
 			r = dma_fence_add_callback(fence, &s_job->cb,
-						   drm_sched_process_job);
+						   drm_sched_job_done_cb);
 			if (r == -ENOENT)
-				drm_sched_process_job(fence, &s_job->cb);
+				drm_sched_job_done(s_job);
 			else if (r)
 				DRM_ERROR("fence add callback failed (%d)\n",
 					  r);
 		} else
-			drm_sched_process_job(NULL, &s_job->cb);
+			drm_sched_job_done(s_job);
 	}
 
 	if (full_recovery) {
@@ -635,31 +667,6 @@ drm_sched_select_entity(struct drm_gpu_scheduler *sched)
 	return entity;
 }
 
-/**
- * drm_sched_process_job - process a job
- *
- * @f: fence
- * @cb: fence callbacks
- *
- * Called after job has finished execution.
- */
-static void drm_sched_process_job(struct dma_fence *f, struct dma_fence_cb *cb)
-{
-	struct drm_sched_job *s_job = container_of(cb, struct drm_sched_job, cb);
-	struct drm_sched_fence *s_fence = s_job->s_fence;
-	struct drm_gpu_scheduler *sched = s_fence->sched;
-
-	atomic_dec(&sched->hw_rq_count);
-	atomic_dec(&sched->score);
-
-	trace_drm_sched_process_job(s_fence);
-
-	dma_fence_get(&s_fence->finished);
-	drm_sched_fence_finished(s_fence);
-	dma_fence_put(&s_fence->finished);
-	wake_up_interruptible(&sched->wake_up_worker);
-}
-
 /**
  * drm_sched_get_cleanup_job - fetch the next finished job to be destroyed
  *
@@ -809,9 +816,9 @@ static int drm_sched_main(void *param)
 		if (!IS_ERR_OR_NULL(fence)) {
 			s_fence->parent = dma_fence_get(fence);
 			r = dma_fence_add_callback(fence, &sched_job->cb,
-						   drm_sched_process_job);
+						   drm_sched_job_done_cb);
 			if (r == -ENOENT)
-				drm_sched_process_job(fence, &sched_job->cb);
+				drm_sched_job_done(sched_job);
 			else if (r)
 				DRM_ERROR("fence add callback failed (%d)\n",
 					  r);
@@ -820,7 +827,7 @@ static int drm_sched_main(void *param)
 			if (IS_ERR(fence))
 				dma_fence_set_error(&s_fence->finished, PTR_ERR(fence));
 
-			drm_sched_process_job(NULL, &sched_job->cb);
+			drm_sched_job_done(sched_job);
 		}
 
 		wake_up(&sched->job_scheduled);
-- 
2.29.2.154.g7f7ebe054a

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply related	[flat|nested] 125+ messages in thread

* [PATCH 4/6] drm/scheduler: Essentialize the job done callback
@ 2020-11-25  3:17                                       ` Luben Tuikov
  0 siblings, 0 replies; 125+ messages in thread
From: Luben Tuikov @ 2020-11-25  3:17 UTC (permalink / raw)
  To: Andrey Grodzovsky, Christian König, Lucas Stach, Alexander Deucher
  Cc: Emily Deng, Luben Tuikov, amd-gfx, dri-devel, steven.price

The job done callback is called from various
places, in two ways: in job done role, and
as a fence callback role.

Essentialize the callback to an atom
function to just complete the job,
and into a second function as a prototype
of fence callback which calls to complete
the job.

This is used in latter patches by the completion
code.

Signed-off-by: Luben Tuikov <luben.tuikov@amd.com>
---
 drivers/gpu/drm/scheduler/sched_main.c | 73 ++++++++++++++------------
 1 file changed, 40 insertions(+), 33 deletions(-)

diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c
index b694df12aaba..3eb7618a627d 100644
--- a/drivers/gpu/drm/scheduler/sched_main.c
+++ b/drivers/gpu/drm/scheduler/sched_main.c
@@ -60,8 +60,6 @@
 #define to_drm_sched_job(sched_job)		\
 		container_of((sched_job), struct drm_sched_job, queue_node)
 
-static void drm_sched_process_job(struct dma_fence *f, struct dma_fence_cb *cb);
-
 /**
  * drm_sched_rq_init - initialize a given run queue struct
  *
@@ -162,6 +160,40 @@ drm_sched_rq_select_entity(struct drm_sched_rq *rq)
 	return NULL;
 }
 
+/**
+ * drm_sched_job_done - complete a job
+ * @s_job: pointer to the job which is done
+ *
+ * Finish the job's fence and wake up the worker thread.
+ */
+static void drm_sched_job_done(struct drm_sched_job *s_job)
+{
+	struct drm_sched_fence *s_fence = s_job->s_fence;
+	struct drm_gpu_scheduler *sched = s_fence->sched;
+
+	atomic_dec(&sched->hw_rq_count);
+	atomic_dec(&sched->score);
+
+	trace_drm_sched_process_job(s_fence);
+
+	dma_fence_get(&s_fence->finished);
+	drm_sched_fence_finished(s_fence);
+	dma_fence_put(&s_fence->finished);
+	wake_up_interruptible(&sched->wake_up_worker);
+}
+
+/**
+ * drm_sched_job_done_cb - the callback for a done job
+ * @f: fence
+ * @cb: fence callbacks
+ */
+static void drm_sched_job_done_cb(struct dma_fence *f, struct dma_fence_cb *cb)
+{
+	struct drm_sched_job *s_job = container_of(cb, struct drm_sched_job, cb);
+
+	drm_sched_job_done(s_job);
+}
+
 /**
  * drm_sched_dependency_optimized
  *
@@ -473,14 +505,14 @@ void drm_sched_start(struct drm_gpu_scheduler *sched, bool full_recovery)
 
 		if (fence) {
 			r = dma_fence_add_callback(fence, &s_job->cb,
-						   drm_sched_process_job);
+						   drm_sched_job_done_cb);
 			if (r == -ENOENT)
-				drm_sched_process_job(fence, &s_job->cb);
+				drm_sched_job_done(s_job);
 			else if (r)
 				DRM_ERROR("fence add callback failed (%d)\n",
 					  r);
 		} else
-			drm_sched_process_job(NULL, &s_job->cb);
+			drm_sched_job_done(s_job);
 	}
 
 	if (full_recovery) {
@@ -635,31 +667,6 @@ drm_sched_select_entity(struct drm_gpu_scheduler *sched)
 	return entity;
 }
 
-/**
- * drm_sched_process_job - process a job
- *
- * @f: fence
- * @cb: fence callbacks
- *
- * Called after job has finished execution.
- */
-static void drm_sched_process_job(struct dma_fence *f, struct dma_fence_cb *cb)
-{
-	struct drm_sched_job *s_job = container_of(cb, struct drm_sched_job, cb);
-	struct drm_sched_fence *s_fence = s_job->s_fence;
-	struct drm_gpu_scheduler *sched = s_fence->sched;
-
-	atomic_dec(&sched->hw_rq_count);
-	atomic_dec(&sched->score);
-
-	trace_drm_sched_process_job(s_fence);
-
-	dma_fence_get(&s_fence->finished);
-	drm_sched_fence_finished(s_fence);
-	dma_fence_put(&s_fence->finished);
-	wake_up_interruptible(&sched->wake_up_worker);
-}
-
 /**
  * drm_sched_get_cleanup_job - fetch the next finished job to be destroyed
  *
@@ -809,9 +816,9 @@ static int drm_sched_main(void *param)
 		if (!IS_ERR_OR_NULL(fence)) {
 			s_fence->parent = dma_fence_get(fence);
 			r = dma_fence_add_callback(fence, &sched_job->cb,
-						   drm_sched_process_job);
+						   drm_sched_job_done_cb);
 			if (r == -ENOENT)
-				drm_sched_process_job(fence, &sched_job->cb);
+				drm_sched_job_done(sched_job);
 			else if (r)
 				DRM_ERROR("fence add callback failed (%d)\n",
 					  r);
@@ -820,7 +827,7 @@ static int drm_sched_main(void *param)
 			if (IS_ERR(fence))
 				dma_fence_set_error(&s_fence->finished, PTR_ERR(fence));
 
-			drm_sched_process_job(NULL, &sched_job->cb);
+			drm_sched_job_done(sched_job);
 		}
 
 		wake_up(&sched->job_scheduled);
-- 
2.29.2.154.g7f7ebe054a

_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply related	[flat|nested] 125+ messages in thread

* [PATCH 5/6] drm/amdgpu: Don't hardcode thread name length
  2020-11-25  3:17                                     ` Luben Tuikov
@ 2020-11-25  3:17                                       ` Luben Tuikov
  -1 siblings, 0 replies; 125+ messages in thread
From: Luben Tuikov @ 2020-11-25  3:17 UTC (permalink / raw)
  To: Andrey Grodzovsky, Christian König, Lucas Stach, Alexander Deucher
  Cc: Emily Deng, Luben Tuikov, amd-gfx, dri-devel, steven.price

Introduce a macro DRM_THREAD_NAME_LEN
and use that to define ring name size,
instead of hardcoding it to 16.

Signed-off-by: Luben Tuikov <luben.tuikov@amd.com>
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h | 2 +-
 include/drm/gpu_scheduler.h              | 2 ++
 2 files changed, 3 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
index 7112137689db..bbd46c6dec65 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
@@ -230,7 +230,7 @@ struct amdgpu_ring {
 	unsigned		wptr_offs;
 	unsigned		fence_offs;
 	uint64_t		current_ctx;
-	char			name[16];
+	char			name[DRM_THREAD_NAME_LEN];
 	u32                     trail_seq;
 	unsigned		trail_fence_offs;
 	u64			trail_fence_gpu_addr;
diff --git a/include/drm/gpu_scheduler.h b/include/drm/gpu_scheduler.h
index 61f7121e1c19..3a5686c3b5e9 100644
--- a/include/drm/gpu_scheduler.h
+++ b/include/drm/gpu_scheduler.h
@@ -30,6 +30,8 @@
 
 #define MAX_WAIT_SCHED_ENTITY_Q_EMPTY msecs_to_jiffies(1000)
 
+#define DRM_THREAD_NAME_LEN     TASK_COMM_LEN
+
 struct drm_gpu_scheduler;
 struct drm_sched_rq;
 
-- 
2.29.2.154.g7f7ebe054a

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply related	[flat|nested] 125+ messages in thread

* [PATCH 5/6] drm/amdgpu: Don't hardcode thread name length
@ 2020-11-25  3:17                                       ` Luben Tuikov
  0 siblings, 0 replies; 125+ messages in thread
From: Luben Tuikov @ 2020-11-25  3:17 UTC (permalink / raw)
  To: Andrey Grodzovsky, Christian König, Lucas Stach, Alexander Deucher
  Cc: Emily Deng, Luben Tuikov, amd-gfx, dri-devel, steven.price

Introduce a macro DRM_THREAD_NAME_LEN
and use that to define ring name size,
instead of hardcoding it to 16.

Signed-off-by: Luben Tuikov <luben.tuikov@amd.com>
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h | 2 +-
 include/drm/gpu_scheduler.h              | 2 ++
 2 files changed, 3 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
index 7112137689db..bbd46c6dec65 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
@@ -230,7 +230,7 @@ struct amdgpu_ring {
 	unsigned		wptr_offs;
 	unsigned		fence_offs;
 	uint64_t		current_ctx;
-	char			name[16];
+	char			name[DRM_THREAD_NAME_LEN];
 	u32                     trail_seq;
 	unsigned		trail_fence_offs;
 	u64			trail_fence_gpu_addr;
diff --git a/include/drm/gpu_scheduler.h b/include/drm/gpu_scheduler.h
index 61f7121e1c19..3a5686c3b5e9 100644
--- a/include/drm/gpu_scheduler.h
+++ b/include/drm/gpu_scheduler.h
@@ -30,6 +30,8 @@
 
 #define MAX_WAIT_SCHED_ENTITY_Q_EMPTY msecs_to_jiffies(1000)
 
+#define DRM_THREAD_NAME_LEN     TASK_COMM_LEN
+
 struct drm_gpu_scheduler;
 struct drm_sched_rq;
 
-- 
2.29.2.154.g7f7ebe054a

_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply related	[flat|nested] 125+ messages in thread

* [PATCH 6/6] drm/sched: Make use of a "done" thread
  2020-11-25  3:17                                     ` Luben Tuikov
@ 2020-11-25  3:17                                       ` Luben Tuikov
  -1 siblings, 0 replies; 125+ messages in thread
From: Luben Tuikov @ 2020-11-25  3:17 UTC (permalink / raw)
  To: Andrey Grodzovsky, Christian König, Lucas Stach, Alexander Deucher
  Cc: Emily Deng, Luben Tuikov, amd-gfx, dri-devel, steven.price

Add a "done" list to which all completed jobs are added
to be freed. The drm_sched_job_done() callback is the
producer of jobs to this list.

Add a "done" thread which consumes from the done list
and frees up jobs. Now, the main scheduler thread only
pushes jobs to the GPU and the "done" thread frees them
up, on the way out of the GPU when they've completed
execution.

Make use of the status returned by the GPU driver
timeout handler to decide whether to leave the job in
the pending list, or to send it off to the done list.
If a job is done, it is added to the done list and the
done thread woken up. If a job needs more time, it is
left on the pending list and the timeout timer
restarted.

Eliminate the polling mechanism of picking out done
jobs from the pending list, i.e. eliminate
drm_sched_get_cleanup_job(). Now the main scheduler
thread only pushes jobs down to the GPU.

Various other optimizations to the GPU scheduler
and job recovery are possible with this format.

Signed-off-by: Luben Tuikov <luben.tuikov@amd.com>
---
 drivers/gpu/drm/scheduler/sched_main.c | 173 +++++++++++++------------
 include/drm/gpu_scheduler.h            |  14 ++
 2 files changed, 101 insertions(+), 86 deletions(-)

diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c
index 3eb7618a627d..289ae68cd97f 100644
--- a/drivers/gpu/drm/scheduler/sched_main.c
+++ b/drivers/gpu/drm/scheduler/sched_main.c
@@ -164,7 +164,8 @@ drm_sched_rq_select_entity(struct drm_sched_rq *rq)
  * drm_sched_job_done - complete a job
  * @s_job: pointer to the job which is done
  *
- * Finish the job's fence and wake up the worker thread.
+ * Finish the job's fence, move it to the done list,
+ * and wake up the done thread.
  */
 static void drm_sched_job_done(struct drm_sched_job *s_job)
 {
@@ -179,7 +180,12 @@ static void drm_sched_job_done(struct drm_sched_job *s_job)
 	dma_fence_get(&s_fence->finished);
 	drm_sched_fence_finished(s_fence);
 	dma_fence_put(&s_fence->finished);
-	wake_up_interruptible(&sched->wake_up_worker);
+
+	spin_lock(&sched->job_list_lock);
+	list_move(&s_job->list, &sched->done_list);
+	spin_unlock(&sched->job_list_lock);
+
+	wake_up_interruptible(&sched->done_wait_q);
 }
 
 /**
@@ -221,11 +227,10 @@ bool drm_sched_dependency_optimized(struct dma_fence* fence,
 EXPORT_SYMBOL(drm_sched_dependency_optimized);
 
 /**
- * drm_sched_start_timeout - start timeout for reset worker
- *
- * @sched: scheduler instance to start the worker for
+ * drm_sched_start_timeout - start a timeout timer
+ * @sched: scheduler instance whose job we're timing
  *
- * Start the timeout for the given scheduler.
+ * Start a timeout timer for the given scheduler.
  */
 static void drm_sched_start_timeout(struct drm_gpu_scheduler *sched)
 {
@@ -305,8 +310,8 @@ static void drm_sched_job_begin(struct drm_sched_job *s_job)
 
 	spin_lock(&sched->job_list_lock);
 	list_add_tail(&s_job->list, &sched->pending_list);
-	drm_sched_start_timeout(sched);
 	spin_unlock(&sched->job_list_lock);
+	drm_sched_start_timeout(sched);
 }
 
 static void drm_sched_job_timedout(struct work_struct *work)
@@ -316,37 +321,30 @@ static void drm_sched_job_timedout(struct work_struct *work)
 
 	sched = container_of(work, struct drm_gpu_scheduler, work_tdr.work);
 
-	/* Protects against concurrent deletion in drm_sched_get_cleanup_job */
 	spin_lock(&sched->job_list_lock);
 	job = list_first_entry_or_null(&sched->pending_list,
 				       struct drm_sched_job, list);
+	spin_unlock(&sched->job_list_lock);
 
 	if (job) {
-		/*
-		 * Remove the bad job so it cannot be freed by concurrent
-		 * drm_sched_cleanup_jobs. It will be reinserted back after sched->thread
-		 * is parked at which point it's safe.
-		 */
-		list_del_init(&job->list);
-		spin_unlock(&sched->job_list_lock);
+		int res;
 
-		job->sched->ops->timedout_job(job);
+		job->job_status |= DRM_JOB_STATUS_TIMEOUT;
+		res = job->sched->ops->timedout_job(job);
+		if (res == 0) {
+			/* The job is out of the device.
+			 */
+			spin_lock(&sched->job_list_lock);
+			list_move(&job->list, &sched->done_list);
+			spin_unlock(&sched->job_list_lock);
 
-		/*
-		 * Guilty job did complete and hence needs to be manually removed
-		 * See drm_sched_stop doc.
-		 */
-		if (sched->free_guilty) {
-			job->sched->ops->free_job(job);
-			sched->free_guilty = false;
+			wake_up_interruptible(&sched->done_wait_q);
+		} else {
+			/* The job needs more time.
+			 */
+			drm_sched_start_timeout(sched);
 		}
-	} else {
-		spin_unlock(&sched->job_list_lock);
 	}
-
-	spin_lock(&sched->job_list_lock);
-	drm_sched_start_timeout(sched);
-	spin_unlock(&sched->job_list_lock);
 }
 
  /**
@@ -511,15 +509,13 @@ void drm_sched_start(struct drm_gpu_scheduler *sched, bool full_recovery)
 			else if (r)
 				DRM_ERROR("fence add callback failed (%d)\n",
 					  r);
-		} else
+		} else {
 			drm_sched_job_done(s_job);
+		}
 	}
 
-	if (full_recovery) {
-		spin_lock(&sched->job_list_lock);
+	if (full_recovery)
 		drm_sched_start_timeout(sched);
-		spin_unlock(&sched->job_list_lock);
-	}
 
 	kthread_unpark(sched->thread);
 }
@@ -667,47 +663,6 @@ drm_sched_select_entity(struct drm_gpu_scheduler *sched)
 	return entity;
 }
 
-/**
- * drm_sched_get_cleanup_job - fetch the next finished job to be destroyed
- *
- * @sched: scheduler instance
- *
- * Returns the next finished job from the pending list (if there is one)
- * ready for it to be destroyed.
- */
-static struct drm_sched_job *
-drm_sched_get_cleanup_job(struct drm_gpu_scheduler *sched)
-{
-	struct drm_sched_job *job;
-
-	/*
-	 * Don't destroy jobs while the timeout worker is running  OR thread
-	 * is being parked and hence assumed to not touch pending_list
-	 */
-	if ((sched->timeout != MAX_SCHEDULE_TIMEOUT &&
-	    !cancel_delayed_work(&sched->work_tdr)) ||
-	    kthread_should_park())
-		return NULL;
-
-	spin_lock(&sched->job_list_lock);
-
-	job = list_first_entry_or_null(&sched->pending_list,
-				       struct drm_sched_job, list);
-
-	if (job && dma_fence_is_signaled(&job->s_fence->finished)) {
-		/* remove job from pending_list */
-		list_del_init(&job->list);
-	} else {
-		job = NULL;
-		/* queue timeout for next job */
-		drm_sched_start_timeout(sched);
-	}
-
-	spin_unlock(&sched->job_list_lock);
-
-	return job;
-}
-
 /**
  * drm_sched_pick_best - Get a drm sched from a sched_list with the least load
  * @sched_list: list of drm_gpu_schedulers
@@ -761,6 +716,44 @@ static bool drm_sched_blocked(struct drm_gpu_scheduler *sched)
 	return false;
 }
 
+/**
+ * drm_sched_done - free done tasks
+ * @param: pointer to a scheduler instance
+ *
+ * Returns 0.
+ */
+static int drm_sched_done(void *param)
+{
+	struct drm_gpu_scheduler *sched = param;
+
+	do {
+		LIST_HEAD(done_q);
+
+		wait_event_interruptible(sched->done_wait_q,
+					 kthread_should_stop() ||
+					 !list_empty(&sched->done_list));
+
+		spin_lock(&sched->job_list_lock);
+		list_splice_init(&sched->done_list, &done_q);
+		spin_unlock(&sched->job_list_lock);
+
+		if (list_empty(&done_q))
+			continue;
+
+		while (!list_empty(&done_q)) {
+			struct drm_sched_job *job;
+
+			job = list_first_entry(&done_q,
+					       struct drm_sched_job,
+					       list);
+			list_del_init(&job->list);
+			sched->ops->free_job(job);
+		}
+	} while (!kthread_should_stop());
+
+	return 0;
+}
+
 /**
  * drm_sched_main - main scheduler thread
  *
@@ -770,7 +763,7 @@ static bool drm_sched_blocked(struct drm_gpu_scheduler *sched)
  */
 static int drm_sched_main(void *param)
 {
-	struct drm_gpu_scheduler *sched = (struct drm_gpu_scheduler *)param;
+	struct drm_gpu_scheduler *sched = param;
 	int r;
 
 	sched_set_fifo_low(current);
@@ -780,20 +773,12 @@ static int drm_sched_main(void *param)
 		struct drm_sched_fence *s_fence;
 		struct drm_sched_job *sched_job;
 		struct dma_fence *fence;
-		struct drm_sched_job *cleanup_job = NULL;
 
 		wait_event_interruptible(sched->wake_up_worker,
-					 (cleanup_job = drm_sched_get_cleanup_job(sched)) ||
 					 (!drm_sched_blocked(sched) &&
 					  (entity = drm_sched_select_entity(sched))) ||
 					 kthread_should_stop());
 
-		if (cleanup_job) {
-			sched->ops->free_job(cleanup_job);
-			/* queue timeout for next job */
-			drm_sched_start_timeout(sched);
-		}
-
 		if (!entity)
 			continue;
 
@@ -820,8 +805,7 @@ static int drm_sched_main(void *param)
 			if (r == -ENOENT)
 				drm_sched_job_done(sched_job);
 			else if (r)
-				DRM_ERROR("fence add callback failed (%d)\n",
-					  r);
+				DRM_ERROR("fence add callback failed (%d)\n", r);
 			dma_fence_put(fence);
 		} else {
 			if (IS_ERR(fence))
@@ -865,7 +849,9 @@ int drm_sched_init(struct drm_gpu_scheduler *sched,
 
 	init_waitqueue_head(&sched->wake_up_worker);
 	init_waitqueue_head(&sched->job_scheduled);
+	init_waitqueue_head(&sched->done_wait_q);
 	INIT_LIST_HEAD(&sched->pending_list);
+	INIT_LIST_HEAD(&sched->done_list);
 	spin_lock_init(&sched->job_list_lock);
 	atomic_set(&sched->hw_rq_count, 0);
 	INIT_DELAYED_WORK(&sched->work_tdr, drm_sched_job_timedout);
@@ -881,6 +867,21 @@ int drm_sched_init(struct drm_gpu_scheduler *sched,
 		return ret;
 	}
 
+	snprintf(sched->thread_done_name, DRM_THREAD_NAME_LEN, "%s%s",
+		 sched->name, "-done");
+	sched->thread_done_name[DRM_THREAD_NAME_LEN - 1] = '\0';
+	sched->thread_done = kthread_run(drm_sched_done, sched,
+					 sched->thread_done_name);
+	if (IS_ERR(sched->thread_done)) {
+		ret = kthread_stop(sched->thread);
+		if (!ret) {
+			/* free_kthread_struct(sched->thread); */
+			sched->thread = NULL;
+		}
+		DRM_ERROR("Failed to start thread %s", sched->thread_done_name);
+		return ret;
+	}
+
 	sched->ready = true;
 	return 0;
 }
diff --git a/include/drm/gpu_scheduler.h b/include/drm/gpu_scheduler.h
index 3a5686c3b5e9..b282d6158b50 100644
--- a/include/drm/gpu_scheduler.h
+++ b/include/drm/gpu_scheduler.h
@@ -169,6 +169,12 @@ struct drm_sched_fence {
 
 struct drm_sched_fence *to_drm_sched_fence(struct dma_fence *f);
 
+enum drm_job_status {
+	DRM_JOB_STATUS_NONE    = 0 << 0,
+	DRM_JOB_STATUS_DONE    = 1 << 0,
+	DRM_JOB_STATUS_TIMEOUT = 1 << 1,
+};
+
 /**
  * struct drm_sched_job - A job to be run by an entity.
  *
@@ -198,6 +204,7 @@ struct drm_sched_job {
 	uint64_t			id;
 	atomic_t			karma;
 	enum drm_sched_priority		s_priority;
+	enum drm_job_status             job_status;
 	struct drm_sched_entity         *entity;
 	struct dma_fence_cb		cb;
 };
@@ -284,15 +291,22 @@ struct drm_gpu_scheduler {
 	uint32_t			hw_submission_limit;
 	long				timeout;
 	const char			*name;
+	char                            thread_done_name[DRM_THREAD_NAME_LEN];
+
 	struct drm_sched_rq		sched_rq[DRM_SCHED_PRIORITY_COUNT];
 	wait_queue_head_t		wake_up_worker;
 	wait_queue_head_t		job_scheduled;
+	wait_queue_head_t               done_wait_q;
 	atomic_t			hw_rq_count;
 	atomic64_t			job_id_count;
 	struct delayed_work		work_tdr;
 	struct task_struct		*thread;
+	struct task_struct		*thread_done;
+
 	struct list_head		pending_list;
+	struct list_head                done_list;
 	spinlock_t			job_list_lock;
+
 	int				hang_limit;
 	atomic_t                        score;
 	bool				ready;
-- 
2.29.2.154.g7f7ebe054a

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply related	[flat|nested] 125+ messages in thread

* [PATCH 6/6] drm/sched: Make use of a "done" thread
@ 2020-11-25  3:17                                       ` Luben Tuikov
  0 siblings, 0 replies; 125+ messages in thread
From: Luben Tuikov @ 2020-11-25  3:17 UTC (permalink / raw)
  To: Andrey Grodzovsky, Christian König, Lucas Stach, Alexander Deucher
  Cc: Emily Deng, Luben Tuikov, amd-gfx, dri-devel, steven.price

Add a "done" list to which all completed jobs are added
to be freed. The drm_sched_job_done() callback is the
producer of jobs to this list.

Add a "done" thread which consumes from the done list
and frees up jobs. Now, the main scheduler thread only
pushes jobs to the GPU and the "done" thread frees them
up, on the way out of the GPU when they've completed
execution.

Make use of the status returned by the GPU driver
timeout handler to decide whether to leave the job in
the pending list, or to send it off to the done list.
If a job is done, it is added to the done list and the
done thread woken up. If a job needs more time, it is
left on the pending list and the timeout timer
restarted.

Eliminate the polling mechanism of picking out done
jobs from the pending list, i.e. eliminate
drm_sched_get_cleanup_job(). Now the main scheduler
thread only pushes jobs down to the GPU.

Various other optimizations to the GPU scheduler
and job recovery are possible with this format.

Signed-off-by: Luben Tuikov <luben.tuikov@amd.com>
---
 drivers/gpu/drm/scheduler/sched_main.c | 173 +++++++++++++------------
 include/drm/gpu_scheduler.h            |  14 ++
 2 files changed, 101 insertions(+), 86 deletions(-)

diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c
index 3eb7618a627d..289ae68cd97f 100644
--- a/drivers/gpu/drm/scheduler/sched_main.c
+++ b/drivers/gpu/drm/scheduler/sched_main.c
@@ -164,7 +164,8 @@ drm_sched_rq_select_entity(struct drm_sched_rq *rq)
  * drm_sched_job_done - complete a job
  * @s_job: pointer to the job which is done
  *
- * Finish the job's fence and wake up the worker thread.
+ * Finish the job's fence, move it to the done list,
+ * and wake up the done thread.
  */
 static void drm_sched_job_done(struct drm_sched_job *s_job)
 {
@@ -179,7 +180,12 @@ static void drm_sched_job_done(struct drm_sched_job *s_job)
 	dma_fence_get(&s_fence->finished);
 	drm_sched_fence_finished(s_fence);
 	dma_fence_put(&s_fence->finished);
-	wake_up_interruptible(&sched->wake_up_worker);
+
+	spin_lock(&sched->job_list_lock);
+	list_move(&s_job->list, &sched->done_list);
+	spin_unlock(&sched->job_list_lock);
+
+	wake_up_interruptible(&sched->done_wait_q);
 }
 
 /**
@@ -221,11 +227,10 @@ bool drm_sched_dependency_optimized(struct dma_fence* fence,
 EXPORT_SYMBOL(drm_sched_dependency_optimized);
 
 /**
- * drm_sched_start_timeout - start timeout for reset worker
- *
- * @sched: scheduler instance to start the worker for
+ * drm_sched_start_timeout - start a timeout timer
+ * @sched: scheduler instance whose job we're timing
  *
- * Start the timeout for the given scheduler.
+ * Start a timeout timer for the given scheduler.
  */
 static void drm_sched_start_timeout(struct drm_gpu_scheduler *sched)
 {
@@ -305,8 +310,8 @@ static void drm_sched_job_begin(struct drm_sched_job *s_job)
 
 	spin_lock(&sched->job_list_lock);
 	list_add_tail(&s_job->list, &sched->pending_list);
-	drm_sched_start_timeout(sched);
 	spin_unlock(&sched->job_list_lock);
+	drm_sched_start_timeout(sched);
 }
 
 static void drm_sched_job_timedout(struct work_struct *work)
@@ -316,37 +321,30 @@ static void drm_sched_job_timedout(struct work_struct *work)
 
 	sched = container_of(work, struct drm_gpu_scheduler, work_tdr.work);
 
-	/* Protects against concurrent deletion in drm_sched_get_cleanup_job */
 	spin_lock(&sched->job_list_lock);
 	job = list_first_entry_or_null(&sched->pending_list,
 				       struct drm_sched_job, list);
+	spin_unlock(&sched->job_list_lock);
 
 	if (job) {
-		/*
-		 * Remove the bad job so it cannot be freed by concurrent
-		 * drm_sched_cleanup_jobs. It will be reinserted back after sched->thread
-		 * is parked at which point it's safe.
-		 */
-		list_del_init(&job->list);
-		spin_unlock(&sched->job_list_lock);
+		int res;
 
-		job->sched->ops->timedout_job(job);
+		job->job_status |= DRM_JOB_STATUS_TIMEOUT;
+		res = job->sched->ops->timedout_job(job);
+		if (res == 0) {
+			/* The job is out of the device.
+			 */
+			spin_lock(&sched->job_list_lock);
+			list_move(&job->list, &sched->done_list);
+			spin_unlock(&sched->job_list_lock);
 
-		/*
-		 * Guilty job did complete and hence needs to be manually removed
-		 * See drm_sched_stop doc.
-		 */
-		if (sched->free_guilty) {
-			job->sched->ops->free_job(job);
-			sched->free_guilty = false;
+			wake_up_interruptible(&sched->done_wait_q);
+		} else {
+			/* The job needs more time.
+			 */
+			drm_sched_start_timeout(sched);
 		}
-	} else {
-		spin_unlock(&sched->job_list_lock);
 	}
-
-	spin_lock(&sched->job_list_lock);
-	drm_sched_start_timeout(sched);
-	spin_unlock(&sched->job_list_lock);
 }
 
  /**
@@ -511,15 +509,13 @@ void drm_sched_start(struct drm_gpu_scheduler *sched, bool full_recovery)
 			else if (r)
 				DRM_ERROR("fence add callback failed (%d)\n",
 					  r);
-		} else
+		} else {
 			drm_sched_job_done(s_job);
+		}
 	}
 
-	if (full_recovery) {
-		spin_lock(&sched->job_list_lock);
+	if (full_recovery)
 		drm_sched_start_timeout(sched);
-		spin_unlock(&sched->job_list_lock);
-	}
 
 	kthread_unpark(sched->thread);
 }
@@ -667,47 +663,6 @@ drm_sched_select_entity(struct drm_gpu_scheduler *sched)
 	return entity;
 }
 
-/**
- * drm_sched_get_cleanup_job - fetch the next finished job to be destroyed
- *
- * @sched: scheduler instance
- *
- * Returns the next finished job from the pending list (if there is one)
- * ready for it to be destroyed.
- */
-static struct drm_sched_job *
-drm_sched_get_cleanup_job(struct drm_gpu_scheduler *sched)
-{
-	struct drm_sched_job *job;
-
-	/*
-	 * Don't destroy jobs while the timeout worker is running  OR thread
-	 * is being parked and hence assumed to not touch pending_list
-	 */
-	if ((sched->timeout != MAX_SCHEDULE_TIMEOUT &&
-	    !cancel_delayed_work(&sched->work_tdr)) ||
-	    kthread_should_park())
-		return NULL;
-
-	spin_lock(&sched->job_list_lock);
-
-	job = list_first_entry_or_null(&sched->pending_list,
-				       struct drm_sched_job, list);
-
-	if (job && dma_fence_is_signaled(&job->s_fence->finished)) {
-		/* remove job from pending_list */
-		list_del_init(&job->list);
-	} else {
-		job = NULL;
-		/* queue timeout for next job */
-		drm_sched_start_timeout(sched);
-	}
-
-	spin_unlock(&sched->job_list_lock);
-
-	return job;
-}
-
 /**
  * drm_sched_pick_best - Get a drm sched from a sched_list with the least load
  * @sched_list: list of drm_gpu_schedulers
@@ -761,6 +716,44 @@ static bool drm_sched_blocked(struct drm_gpu_scheduler *sched)
 	return false;
 }
 
+/**
+ * drm_sched_done - free done tasks
+ * @param: pointer to a scheduler instance
+ *
+ * Returns 0.
+ */
+static int drm_sched_done(void *param)
+{
+	struct drm_gpu_scheduler *sched = param;
+
+	do {
+		LIST_HEAD(done_q);
+
+		wait_event_interruptible(sched->done_wait_q,
+					 kthread_should_stop() ||
+					 !list_empty(&sched->done_list));
+
+		spin_lock(&sched->job_list_lock);
+		list_splice_init(&sched->done_list, &done_q);
+		spin_unlock(&sched->job_list_lock);
+
+		if (list_empty(&done_q))
+			continue;
+
+		while (!list_empty(&done_q)) {
+			struct drm_sched_job *job;
+
+			job = list_first_entry(&done_q,
+					       struct drm_sched_job,
+					       list);
+			list_del_init(&job->list);
+			sched->ops->free_job(job);
+		}
+	} while (!kthread_should_stop());
+
+	return 0;
+}
+
 /**
  * drm_sched_main - main scheduler thread
  *
@@ -770,7 +763,7 @@ static bool drm_sched_blocked(struct drm_gpu_scheduler *sched)
  */
 static int drm_sched_main(void *param)
 {
-	struct drm_gpu_scheduler *sched = (struct drm_gpu_scheduler *)param;
+	struct drm_gpu_scheduler *sched = param;
 	int r;
 
 	sched_set_fifo_low(current);
@@ -780,20 +773,12 @@ static int drm_sched_main(void *param)
 		struct drm_sched_fence *s_fence;
 		struct drm_sched_job *sched_job;
 		struct dma_fence *fence;
-		struct drm_sched_job *cleanup_job = NULL;
 
 		wait_event_interruptible(sched->wake_up_worker,
-					 (cleanup_job = drm_sched_get_cleanup_job(sched)) ||
 					 (!drm_sched_blocked(sched) &&
 					  (entity = drm_sched_select_entity(sched))) ||
 					 kthread_should_stop());
 
-		if (cleanup_job) {
-			sched->ops->free_job(cleanup_job);
-			/* queue timeout for next job */
-			drm_sched_start_timeout(sched);
-		}
-
 		if (!entity)
 			continue;
 
@@ -820,8 +805,7 @@ static int drm_sched_main(void *param)
 			if (r == -ENOENT)
 				drm_sched_job_done(sched_job);
 			else if (r)
-				DRM_ERROR("fence add callback failed (%d)\n",
-					  r);
+				DRM_ERROR("fence add callback failed (%d)\n", r);
 			dma_fence_put(fence);
 		} else {
 			if (IS_ERR(fence))
@@ -865,7 +849,9 @@ int drm_sched_init(struct drm_gpu_scheduler *sched,
 
 	init_waitqueue_head(&sched->wake_up_worker);
 	init_waitqueue_head(&sched->job_scheduled);
+	init_waitqueue_head(&sched->done_wait_q);
 	INIT_LIST_HEAD(&sched->pending_list);
+	INIT_LIST_HEAD(&sched->done_list);
 	spin_lock_init(&sched->job_list_lock);
 	atomic_set(&sched->hw_rq_count, 0);
 	INIT_DELAYED_WORK(&sched->work_tdr, drm_sched_job_timedout);
@@ -881,6 +867,21 @@ int drm_sched_init(struct drm_gpu_scheduler *sched,
 		return ret;
 	}
 
+	snprintf(sched->thread_done_name, DRM_THREAD_NAME_LEN, "%s%s",
+		 sched->name, "-done");
+	sched->thread_done_name[DRM_THREAD_NAME_LEN - 1] = '\0';
+	sched->thread_done = kthread_run(drm_sched_done, sched,
+					 sched->thread_done_name);
+	if (IS_ERR(sched->thread_done)) {
+		ret = kthread_stop(sched->thread);
+		if (!ret) {
+			/* free_kthread_struct(sched->thread); */
+			sched->thread = NULL;
+		}
+		DRM_ERROR("Failed to start thread %s", sched->thread_done_name);
+		return ret;
+	}
+
 	sched->ready = true;
 	return 0;
 }
diff --git a/include/drm/gpu_scheduler.h b/include/drm/gpu_scheduler.h
index 3a5686c3b5e9..b282d6158b50 100644
--- a/include/drm/gpu_scheduler.h
+++ b/include/drm/gpu_scheduler.h
@@ -169,6 +169,12 @@ struct drm_sched_fence {
 
 struct drm_sched_fence *to_drm_sched_fence(struct dma_fence *f);
 
+enum drm_job_status {
+	DRM_JOB_STATUS_NONE    = 0 << 0,
+	DRM_JOB_STATUS_DONE    = 1 << 0,
+	DRM_JOB_STATUS_TIMEOUT = 1 << 1,
+};
+
 /**
  * struct drm_sched_job - A job to be run by an entity.
  *
@@ -198,6 +204,7 @@ struct drm_sched_job {
 	uint64_t			id;
 	atomic_t			karma;
 	enum drm_sched_priority		s_priority;
+	enum drm_job_status             job_status;
 	struct drm_sched_entity         *entity;
 	struct dma_fence_cb		cb;
 };
@@ -284,15 +291,22 @@ struct drm_gpu_scheduler {
 	uint32_t			hw_submission_limit;
 	long				timeout;
 	const char			*name;
+	char                            thread_done_name[DRM_THREAD_NAME_LEN];
+
 	struct drm_sched_rq		sched_rq[DRM_SCHED_PRIORITY_COUNT];
 	wait_queue_head_t		wake_up_worker;
 	wait_queue_head_t		job_scheduled;
+	wait_queue_head_t               done_wait_q;
 	atomic_t			hw_rq_count;
 	atomic64_t			job_id_count;
 	struct delayed_work		work_tdr;
 	struct task_struct		*thread;
+	struct task_struct		*thread_done;
+
 	struct list_head		pending_list;
+	struct list_head                done_list;
 	spinlock_t			job_list_lock;
+
 	int				hang_limit;
 	atomic_t                        score;
 	bool				ready;
-- 
2.29.2.154.g7f7ebe054a

_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply related	[flat|nested] 125+ messages in thread

* Re: [PATCH 3/6] drm/scheduler: Job timeout handler returns status
  2020-11-25  3:17                                       ` Luben Tuikov
  (?)
@ 2020-11-25  4:41                                         ` kernel test robot
  -1 siblings, 0 replies; 125+ messages in thread
From: kernel test robot @ 2020-11-25  4:41 UTC (permalink / raw)
  To: Luben Tuikov, Andrey Grodzovsky, Christian König,
	Lucas Stach, Alexander Deucher
  Cc: kbuild-all, dri-devel, steven.price, Emily Deng, Luben Tuikov, amd-gfx

[-- Attachment #1: Type: text/plain, Size: 3615 bytes --]

Hi Luben,

Thank you for the patch! Yet something to improve:

[auto build test ERROR on linus/master]
[also build test ERROR on v5.10-rc5 next-20201124]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch]

url:    https://github.com/0day-ci/linux/commits/Luben-Tuikov/Allow-to-extend-the-timeout-without-jobs-disappearing/20201125-111945
base:   https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git 127c501a03d5db8b833e953728d3bcf53c8832a9
config: nds32-randconfig-s032-20201125 (attached as .config)
compiler: nds32le-linux-gcc (GCC) 9.3.0
reproduce:
        wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
        chmod +x ~/bin/make.cross
        # apt-get install sparse
        # sparse version: v0.6.3-151-g540c2c4b-dirty
        # https://github.com/0day-ci/linux/commit/14b618148200370c3b43498550534c17d50218fc
        git remote add linux-review https://github.com/0day-ci/linux
        git fetch --no-tags linux-review Luben-Tuikov/Allow-to-extend-the-timeout-without-jobs-disappearing/20201125-111945
        git checkout 14b618148200370c3b43498550534c17d50218fc
        # save the attached .config to linux build tree
        COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-9.3.0 make.cross C=1 CF='-fdiagnostic-prefix -D__CHECK_ENDIAN__' ARCH=nds32 

If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot <lkp@intel.com>

All errors (new ones prefixed by >>):

>> drivers/gpu/drm/etnaviv/etnaviv_sched.c:140:18: error: initialization of 'int (*)(struct drm_sched_job *)' from incompatible pointer type 'void (*)(struct drm_sched_job *)' [-Werror=incompatible-pointer-types]
     140 |  .timedout_job = etnaviv_sched_timedout_job,
         |                  ^~~~~~~~~~~~~~~~~~~~~~~~~~
   drivers/gpu/drm/etnaviv/etnaviv_sched.c:140:18: note: (near initialization for 'etnaviv_sched_ops.timedout_job')
   cc1: some warnings being treated as errors
--
   drivers/gpu/drm/lima/lima_sched.c: In function 'lima_sched_run_job':
   drivers/gpu/drm/lima/lima_sched.c:226:20: warning: variable 'ret' set but not used [-Wunused-but-set-variable]
     226 |  struct dma_fence *ret;
         |                    ^~~
   drivers/gpu/drm/lima/lima_sched.c: At top level:
>> drivers/gpu/drm/lima/lima_sched.c:472:18: error: initialization of 'int (*)(struct drm_sched_job *)' from incompatible pointer type 'void (*)(struct drm_sched_job *)' [-Werror=incompatible-pointer-types]
     472 |  .timedout_job = lima_sched_timedout_job,
         |                  ^~~~~~~~~~~~~~~~~~~~~~~
   drivers/gpu/drm/lima/lima_sched.c:472:18: note: (near initialization for 'lima_sched_ops.timedout_job')
   cc1: some warnings being treated as errors

vim +140 drivers/gpu/drm/etnaviv/etnaviv_sched.c

e93b6deeb45a78 Lucas Stach 2017-12-04  136  
e93b6deeb45a78 Lucas Stach 2017-12-04  137  static const struct drm_sched_backend_ops etnaviv_sched_ops = {
e93b6deeb45a78 Lucas Stach 2017-12-04  138  	.dependency = etnaviv_sched_dependency,
e93b6deeb45a78 Lucas Stach 2017-12-04  139  	.run_job = etnaviv_sched_run_job,
e93b6deeb45a78 Lucas Stach 2017-12-04 @140  	.timedout_job = etnaviv_sched_timedout_job,
e93b6deeb45a78 Lucas Stach 2017-12-04  141  	.free_job = etnaviv_sched_free_job,
e93b6deeb45a78 Lucas Stach 2017-12-04  142  };
e93b6deeb45a78 Lucas Stach 2017-12-04  143  

---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-all@lists.01.org

[-- Attachment #2: .config.gz --]
[-- Type: application/gzip, Size: 32859 bytes --]

[-- Attachment #3: Type: text/plain, Size: 160 bytes --]

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 125+ messages in thread

* Re: [PATCH 3/6] drm/scheduler: Job timeout handler returns status
@ 2020-11-25  4:41                                         ` kernel test robot
  0 siblings, 0 replies; 125+ messages in thread
From: kernel test robot @ 2020-11-25  4:41 UTC (permalink / raw)
  To: Luben Tuikov, Andrey Grodzovsky, Christian König,
	Lucas Stach, Alexander Deucher
  Cc: kbuild-all, dri-devel, steven.price, Emily Deng, Luben Tuikov, amd-gfx

[-- Attachment #1: Type: text/plain, Size: 3615 bytes --]

Hi Luben,

Thank you for the patch! Yet something to improve:

[auto build test ERROR on linus/master]
[also build test ERROR on v5.10-rc5 next-20201124]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch]

url:    https://github.com/0day-ci/linux/commits/Luben-Tuikov/Allow-to-extend-the-timeout-without-jobs-disappearing/20201125-111945
base:   https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git 127c501a03d5db8b833e953728d3bcf53c8832a9
config: nds32-randconfig-s032-20201125 (attached as .config)
compiler: nds32le-linux-gcc (GCC) 9.3.0
reproduce:
        wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
        chmod +x ~/bin/make.cross
        # apt-get install sparse
        # sparse version: v0.6.3-151-g540c2c4b-dirty
        # https://github.com/0day-ci/linux/commit/14b618148200370c3b43498550534c17d50218fc
        git remote add linux-review https://github.com/0day-ci/linux
        git fetch --no-tags linux-review Luben-Tuikov/Allow-to-extend-the-timeout-without-jobs-disappearing/20201125-111945
        git checkout 14b618148200370c3b43498550534c17d50218fc
        # save the attached .config to linux build tree
        COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-9.3.0 make.cross C=1 CF='-fdiagnostic-prefix -D__CHECK_ENDIAN__' ARCH=nds32 

If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot <lkp@intel.com>

All errors (new ones prefixed by >>):

>> drivers/gpu/drm/etnaviv/etnaviv_sched.c:140:18: error: initialization of 'int (*)(struct drm_sched_job *)' from incompatible pointer type 'void (*)(struct drm_sched_job *)' [-Werror=incompatible-pointer-types]
     140 |  .timedout_job = etnaviv_sched_timedout_job,
         |                  ^~~~~~~~~~~~~~~~~~~~~~~~~~
   drivers/gpu/drm/etnaviv/etnaviv_sched.c:140:18: note: (near initialization for 'etnaviv_sched_ops.timedout_job')
   cc1: some warnings being treated as errors
--
   drivers/gpu/drm/lima/lima_sched.c: In function 'lima_sched_run_job':
   drivers/gpu/drm/lima/lima_sched.c:226:20: warning: variable 'ret' set but not used [-Wunused-but-set-variable]
     226 |  struct dma_fence *ret;
         |                    ^~~
   drivers/gpu/drm/lima/lima_sched.c: At top level:
>> drivers/gpu/drm/lima/lima_sched.c:472:18: error: initialization of 'int (*)(struct drm_sched_job *)' from incompatible pointer type 'void (*)(struct drm_sched_job *)' [-Werror=incompatible-pointer-types]
     472 |  .timedout_job = lima_sched_timedout_job,
         |                  ^~~~~~~~~~~~~~~~~~~~~~~
   drivers/gpu/drm/lima/lima_sched.c:472:18: note: (near initialization for 'lima_sched_ops.timedout_job')
   cc1: some warnings being treated as errors

vim +140 drivers/gpu/drm/etnaviv/etnaviv_sched.c

e93b6deeb45a78 Lucas Stach 2017-12-04  136  
e93b6deeb45a78 Lucas Stach 2017-12-04  137  static const struct drm_sched_backend_ops etnaviv_sched_ops = {
e93b6deeb45a78 Lucas Stach 2017-12-04  138  	.dependency = etnaviv_sched_dependency,
e93b6deeb45a78 Lucas Stach 2017-12-04  139  	.run_job = etnaviv_sched_run_job,
e93b6deeb45a78 Lucas Stach 2017-12-04 @140  	.timedout_job = etnaviv_sched_timedout_job,
e93b6deeb45a78 Lucas Stach 2017-12-04  141  	.free_job = etnaviv_sched_free_job,
e93b6deeb45a78 Lucas Stach 2017-12-04  142  };
e93b6deeb45a78 Lucas Stach 2017-12-04  143  

---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-all@lists.01.org

[-- Attachment #2: .config.gz --]
[-- Type: application/gzip, Size: 32859 bytes --]

[-- Attachment #3: Type: text/plain, Size: 154 bytes --]

_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 125+ messages in thread

* Re: [PATCH 3/6] drm/scheduler: Job timeout handler returns status
@ 2020-11-25  4:41                                         ` kernel test robot
  0 siblings, 0 replies; 125+ messages in thread
From: kernel test robot @ 2020-11-25  4:41 UTC (permalink / raw)
  To: kbuild-all

[-- Attachment #1: Type: text/plain, Size: 3679 bytes --]

Hi Luben,

Thank you for the patch! Yet something to improve:

[auto build test ERROR on linus/master]
[also build test ERROR on v5.10-rc5 next-20201124]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch]

url:    https://github.com/0day-ci/linux/commits/Luben-Tuikov/Allow-to-extend-the-timeout-without-jobs-disappearing/20201125-111945
base:   https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git 127c501a03d5db8b833e953728d3bcf53c8832a9
config: nds32-randconfig-s032-20201125 (attached as .config)
compiler: nds32le-linux-gcc (GCC) 9.3.0
reproduce:
        wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
        chmod +x ~/bin/make.cross
        # apt-get install sparse
        # sparse version: v0.6.3-151-g540c2c4b-dirty
        # https://github.com/0day-ci/linux/commit/14b618148200370c3b43498550534c17d50218fc
        git remote add linux-review https://github.com/0day-ci/linux
        git fetch --no-tags linux-review Luben-Tuikov/Allow-to-extend-the-timeout-without-jobs-disappearing/20201125-111945
        git checkout 14b618148200370c3b43498550534c17d50218fc
        # save the attached .config to linux build tree
        COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-9.3.0 make.cross C=1 CF='-fdiagnostic-prefix -D__CHECK_ENDIAN__' ARCH=nds32 

If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot <lkp@intel.com>

All errors (new ones prefixed by >>):

>> drivers/gpu/drm/etnaviv/etnaviv_sched.c:140:18: error: initialization of 'int (*)(struct drm_sched_job *)' from incompatible pointer type 'void (*)(struct drm_sched_job *)' [-Werror=incompatible-pointer-types]
     140 |  .timedout_job = etnaviv_sched_timedout_job,
         |                  ^~~~~~~~~~~~~~~~~~~~~~~~~~
   drivers/gpu/drm/etnaviv/etnaviv_sched.c:140:18: note: (near initialization for 'etnaviv_sched_ops.timedout_job')
   cc1: some warnings being treated as errors
--
   drivers/gpu/drm/lima/lima_sched.c: In function 'lima_sched_run_job':
   drivers/gpu/drm/lima/lima_sched.c:226:20: warning: variable 'ret' set but not used [-Wunused-but-set-variable]
     226 |  struct dma_fence *ret;
         |                    ^~~
   drivers/gpu/drm/lima/lima_sched.c: At top level:
>> drivers/gpu/drm/lima/lima_sched.c:472:18: error: initialization of 'int (*)(struct drm_sched_job *)' from incompatible pointer type 'void (*)(struct drm_sched_job *)' [-Werror=incompatible-pointer-types]
     472 |  .timedout_job = lima_sched_timedout_job,
         |                  ^~~~~~~~~~~~~~~~~~~~~~~
   drivers/gpu/drm/lima/lima_sched.c:472:18: note: (near initialization for 'lima_sched_ops.timedout_job')
   cc1: some warnings being treated as errors

vim +140 drivers/gpu/drm/etnaviv/etnaviv_sched.c

e93b6deeb45a78 Lucas Stach 2017-12-04  136  
e93b6deeb45a78 Lucas Stach 2017-12-04  137  static const struct drm_sched_backend_ops etnaviv_sched_ops = {
e93b6deeb45a78 Lucas Stach 2017-12-04  138  	.dependency = etnaviv_sched_dependency,
e93b6deeb45a78 Lucas Stach 2017-12-04  139  	.run_job = etnaviv_sched_run_job,
e93b6deeb45a78 Lucas Stach 2017-12-04 @140  	.timedout_job = etnaviv_sched_timedout_job,
e93b6deeb45a78 Lucas Stach 2017-12-04  141  	.free_job = etnaviv_sched_free_job,
e93b6deeb45a78 Lucas Stach 2017-12-04  142  };
e93b6deeb45a78 Lucas Stach 2017-12-04  143  

---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-all(a)lists.01.org

[-- Attachment #2: config.gz --]
[-- Type: application/gzip, Size: 32859 bytes --]

^ permalink raw reply	[flat|nested] 125+ messages in thread

* Re: [PATCH 1/6] drm/scheduler: "node" --> "list"
  2020-11-25  3:17                                       ` Luben Tuikov
@ 2020-11-25  9:44                                         ` Christian König
  -1 siblings, 0 replies; 125+ messages in thread
From: Christian König @ 2020-11-25  9:44 UTC (permalink / raw)
  To: Luben Tuikov, Andrey Grodzovsky, Lucas Stach, Alexander Deucher
  Cc: Emily Deng, amd-gfx, dri-devel, steven.price

Am 25.11.20 um 04:17 schrieb Luben Tuikov:
> Rename "node" to "list" in struct drm_sched_job,
> in order to make it consistent with what we see
> being used throughout gpu_scheduler.h, for
> instance in struct drm_sched_entity, as well as
> the rest of DRM and the kernel.
>
> Signed-off-by: Luben Tuikov <luben.tuikov@amd.com>

Reviewed-by: Christian König <christian.koenig@amd.com>

> ---
>   drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c |  6 +++---
>   drivers/gpu/drm/amd/amdgpu/amdgpu_device.c  |  2 +-
>   drivers/gpu/drm/amd/amdgpu/amdgpu_job.c     |  2 +-
>   drivers/gpu/drm/scheduler/sched_main.c      | 23 +++++++++++----------
>   include/drm/gpu_scheduler.h                 |  4 ++--
>   5 files changed, 19 insertions(+), 18 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c
> index 5c1f3725c741..8358cae0b5a4 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c
> @@ -1427,7 +1427,7 @@ static void amdgpu_ib_preempt_job_recovery(struct drm_gpu_scheduler *sched)
>   	struct dma_fence *fence;
>   
>   	spin_lock(&sched->job_list_lock);
> -	list_for_each_entry(s_job, &sched->ring_mirror_list, node) {
> +	list_for_each_entry(s_job, &sched->ring_mirror_list, list) {
>   		fence = sched->ops->run_job(s_job);
>   		dma_fence_put(fence);
>   	}
> @@ -1459,10 +1459,10 @@ static void amdgpu_ib_preempt_mark_partial_job(struct amdgpu_ring *ring)
>   
>   no_preempt:
>   	spin_lock(&sched->job_list_lock);
> -	list_for_each_entry_safe(s_job, tmp, &sched->ring_mirror_list, node) {
> +	list_for_each_entry_safe(s_job, tmp, &sched->ring_mirror_list, list) {
>   		if (dma_fence_is_signaled(&s_job->s_fence->finished)) {
>   			/* remove job from ring_mirror_list */
> -			list_del_init(&s_job->node);
> +			list_del_init(&s_job->list);
>   			sched->ops->free_job(s_job);
>   			continue;
>   		}
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> index 7560b05e4ac1..4df6de81cd41 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> @@ -4128,7 +4128,7 @@ bool amdgpu_device_has_job_running(struct amdgpu_device *adev)
>   
>   		spin_lock(&ring->sched.job_list_lock);
>   		job = list_first_entry_or_null(&ring->sched.ring_mirror_list,
> -				struct drm_sched_job, node);
> +				struct drm_sched_job, list);
>   		spin_unlock(&ring->sched.job_list_lock);
>   		if (job)
>   			return true;
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
> index dcfe8a3b03ff..aca52a46b93d 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
> @@ -271,7 +271,7 @@ void amdgpu_job_stop_all_jobs_on_sched(struct drm_gpu_scheduler *sched)
>   	}
>   
>   	/* Signal all jobs already scheduled to HW */
> -	list_for_each_entry(s_job, &sched->ring_mirror_list, node) {
> +	list_for_each_entry(s_job, &sched->ring_mirror_list, list) {
>   		struct drm_sched_fence *s_fence = s_job->s_fence;
>   
>   		dma_fence_set_error(&s_fence->finished, -EHWPOISON);
> diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c
> index c6332d75025e..c52eba407ebd 100644
> --- a/drivers/gpu/drm/scheduler/sched_main.c
> +++ b/drivers/gpu/drm/scheduler/sched_main.c
> @@ -272,7 +272,7 @@ static void drm_sched_job_begin(struct drm_sched_job *s_job)
>   	struct drm_gpu_scheduler *sched = s_job->sched;
>   
>   	spin_lock(&sched->job_list_lock);
> -	list_add_tail(&s_job->node, &sched->ring_mirror_list);
> +	list_add_tail(&s_job->list, &sched->ring_mirror_list);
>   	drm_sched_start_timeout(sched);
>   	spin_unlock(&sched->job_list_lock);
>   }
> @@ -287,7 +287,7 @@ static void drm_sched_job_timedout(struct work_struct *work)
>   	/* Protects against concurrent deletion in drm_sched_get_cleanup_job */
>   	spin_lock(&sched->job_list_lock);
>   	job = list_first_entry_or_null(&sched->ring_mirror_list,
> -				       struct drm_sched_job, node);
> +				       struct drm_sched_job, list);
>   
>   	if (job) {
>   		/*
> @@ -295,7 +295,7 @@ static void drm_sched_job_timedout(struct work_struct *work)
>   		 * drm_sched_cleanup_jobs. It will be reinserted back after sched->thread
>   		 * is parked at which point it's safe.
>   		 */
> -		list_del_init(&job->node);
> +		list_del_init(&job->list);
>   		spin_unlock(&sched->job_list_lock);
>   
>   		job->sched->ops->timedout_job(job);
> @@ -392,7 +392,7 @@ void drm_sched_stop(struct drm_gpu_scheduler *sched, struct drm_sched_job *bad)
>   		 * Add at the head of the queue to reflect it was the earliest
>   		 * job extracted.
>   		 */
> -		list_add(&bad->node, &sched->ring_mirror_list);
> +		list_add(&bad->list, &sched->ring_mirror_list);
>   
>   	/*
>   	 * Iterate the job list from later to  earlier one and either deactive
> @@ -400,7 +400,8 @@ void drm_sched_stop(struct drm_gpu_scheduler *sched, struct drm_sched_job *bad)
>   	 * signaled.
>   	 * This iteration is thread safe as sched thread is stopped.
>   	 */
> -	list_for_each_entry_safe_reverse(s_job, tmp, &sched->ring_mirror_list, node) {
> +	list_for_each_entry_safe_reverse(s_job, tmp, &sched->ring_mirror_list,
> +					 list) {
>   		if (s_job->s_fence->parent &&
>   		    dma_fence_remove_callback(s_job->s_fence->parent,
>   					      &s_job->cb)) {
> @@ -411,7 +412,7 @@ void drm_sched_stop(struct drm_gpu_scheduler *sched, struct drm_sched_job *bad)
>   			 * Locking here is for concurrent resume timeout
>   			 */
>   			spin_lock(&sched->job_list_lock);
> -			list_del_init(&s_job->node);
> +			list_del_init(&s_job->list);
>   			spin_unlock(&sched->job_list_lock);
>   
>   			/*
> @@ -462,7 +463,7 @@ void drm_sched_start(struct drm_gpu_scheduler *sched, bool full_recovery)
>   	 * so no new jobs are being inserted or removed. Also concurrent
>   	 * GPU recovers can't run in parallel.
>   	 */
> -	list_for_each_entry_safe(s_job, tmp, &sched->ring_mirror_list, node) {
> +	list_for_each_entry_safe(s_job, tmp, &sched->ring_mirror_list, list) {
>   		struct dma_fence *fence = s_job->s_fence->parent;
>   
>   		atomic_inc(&sched->hw_rq_count);
> @@ -505,7 +506,7 @@ void drm_sched_resubmit_jobs(struct drm_gpu_scheduler *sched)
>   	bool found_guilty = false;
>   	struct dma_fence *fence;
>   
> -	list_for_each_entry_safe(s_job, tmp, &sched->ring_mirror_list, node) {
> +	list_for_each_entry_safe(s_job, tmp, &sched->ring_mirror_list, list) {
>   		struct drm_sched_fence *s_fence = s_job->s_fence;
>   
>   		if (!found_guilty && atomic_read(&s_job->karma) > sched->hang_limit) {
> @@ -565,7 +566,7 @@ int drm_sched_job_init(struct drm_sched_job *job,
>   		return -ENOMEM;
>   	job->id = atomic64_inc_return(&sched->job_id_count);
>   
> -	INIT_LIST_HEAD(&job->node);
> +	INIT_LIST_HEAD(&job->list);
>   
>   	return 0;
>   }
> @@ -684,11 +685,11 @@ drm_sched_get_cleanup_job(struct drm_gpu_scheduler *sched)
>   	spin_lock(&sched->job_list_lock);
>   
>   	job = list_first_entry_or_null(&sched->ring_mirror_list,
> -				       struct drm_sched_job, node);
> +				       struct drm_sched_job, list);
>   
>   	if (job && dma_fence_is_signaled(&job->s_fence->finished)) {
>   		/* remove job from ring_mirror_list */
> -		list_del_init(&job->node);
> +		list_del_init(&job->list);
>   	} else {
>   		job = NULL;
>   		/* queue timeout for next job */
> diff --git a/include/drm/gpu_scheduler.h b/include/drm/gpu_scheduler.h
> index 92436553fd6a..3add0072bd37 100644
> --- a/include/drm/gpu_scheduler.h
> +++ b/include/drm/gpu_scheduler.h
> @@ -189,14 +189,14 @@ struct drm_sched_fence *to_drm_sched_fence(struct dma_fence *f);
>    */
>   struct drm_sched_job {
>   	struct spsc_node		queue_node;
> +	struct list_head		list;
>   	struct drm_gpu_scheduler	*sched;
>   	struct drm_sched_fence		*s_fence;
>   	struct dma_fence_cb		finish_cb;
> -	struct list_head		node;
>   	uint64_t			id;
>   	atomic_t			karma;
>   	enum drm_sched_priority		s_priority;
> -	struct drm_sched_entity  *entity;
> +	struct drm_sched_entity         *entity;
>   	struct dma_fence_cb		cb;
>   };
>   

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 125+ messages in thread

* Re: [PATCH 1/6] drm/scheduler: "node" --> "list"
@ 2020-11-25  9:44                                         ` Christian König
  0 siblings, 0 replies; 125+ messages in thread
From: Christian König @ 2020-11-25  9:44 UTC (permalink / raw)
  To: Luben Tuikov, Andrey Grodzovsky, Lucas Stach, Alexander Deucher
  Cc: Emily Deng, amd-gfx, dri-devel, steven.price

Am 25.11.20 um 04:17 schrieb Luben Tuikov:
> Rename "node" to "list" in struct drm_sched_job,
> in order to make it consistent with what we see
> being used throughout gpu_scheduler.h, for
> instance in struct drm_sched_entity, as well as
> the rest of DRM and the kernel.
>
> Signed-off-by: Luben Tuikov <luben.tuikov@amd.com>

Reviewed-by: Christian König <christian.koenig@amd.com>

> ---
>   drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c |  6 +++---
>   drivers/gpu/drm/amd/amdgpu/amdgpu_device.c  |  2 +-
>   drivers/gpu/drm/amd/amdgpu/amdgpu_job.c     |  2 +-
>   drivers/gpu/drm/scheduler/sched_main.c      | 23 +++++++++++----------
>   include/drm/gpu_scheduler.h                 |  4 ++--
>   5 files changed, 19 insertions(+), 18 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c
> index 5c1f3725c741..8358cae0b5a4 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c
> @@ -1427,7 +1427,7 @@ static void amdgpu_ib_preempt_job_recovery(struct drm_gpu_scheduler *sched)
>   	struct dma_fence *fence;
>   
>   	spin_lock(&sched->job_list_lock);
> -	list_for_each_entry(s_job, &sched->ring_mirror_list, node) {
> +	list_for_each_entry(s_job, &sched->ring_mirror_list, list) {
>   		fence = sched->ops->run_job(s_job);
>   		dma_fence_put(fence);
>   	}
> @@ -1459,10 +1459,10 @@ static void amdgpu_ib_preempt_mark_partial_job(struct amdgpu_ring *ring)
>   
>   no_preempt:
>   	spin_lock(&sched->job_list_lock);
> -	list_for_each_entry_safe(s_job, tmp, &sched->ring_mirror_list, node) {
> +	list_for_each_entry_safe(s_job, tmp, &sched->ring_mirror_list, list) {
>   		if (dma_fence_is_signaled(&s_job->s_fence->finished)) {
>   			/* remove job from ring_mirror_list */
> -			list_del_init(&s_job->node);
> +			list_del_init(&s_job->list);
>   			sched->ops->free_job(s_job);
>   			continue;
>   		}
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> index 7560b05e4ac1..4df6de81cd41 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> @@ -4128,7 +4128,7 @@ bool amdgpu_device_has_job_running(struct amdgpu_device *adev)
>   
>   		spin_lock(&ring->sched.job_list_lock);
>   		job = list_first_entry_or_null(&ring->sched.ring_mirror_list,
> -				struct drm_sched_job, node);
> +				struct drm_sched_job, list);
>   		spin_unlock(&ring->sched.job_list_lock);
>   		if (job)
>   			return true;
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
> index dcfe8a3b03ff..aca52a46b93d 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
> @@ -271,7 +271,7 @@ void amdgpu_job_stop_all_jobs_on_sched(struct drm_gpu_scheduler *sched)
>   	}
>   
>   	/* Signal all jobs already scheduled to HW */
> -	list_for_each_entry(s_job, &sched->ring_mirror_list, node) {
> +	list_for_each_entry(s_job, &sched->ring_mirror_list, list) {
>   		struct drm_sched_fence *s_fence = s_job->s_fence;
>   
>   		dma_fence_set_error(&s_fence->finished, -EHWPOISON);
> diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c
> index c6332d75025e..c52eba407ebd 100644
> --- a/drivers/gpu/drm/scheduler/sched_main.c
> +++ b/drivers/gpu/drm/scheduler/sched_main.c
> @@ -272,7 +272,7 @@ static void drm_sched_job_begin(struct drm_sched_job *s_job)
>   	struct drm_gpu_scheduler *sched = s_job->sched;
>   
>   	spin_lock(&sched->job_list_lock);
> -	list_add_tail(&s_job->node, &sched->ring_mirror_list);
> +	list_add_tail(&s_job->list, &sched->ring_mirror_list);
>   	drm_sched_start_timeout(sched);
>   	spin_unlock(&sched->job_list_lock);
>   }
> @@ -287,7 +287,7 @@ static void drm_sched_job_timedout(struct work_struct *work)
>   	/* Protects against concurrent deletion in drm_sched_get_cleanup_job */
>   	spin_lock(&sched->job_list_lock);
>   	job = list_first_entry_or_null(&sched->ring_mirror_list,
> -				       struct drm_sched_job, node);
> +				       struct drm_sched_job, list);
>   
>   	if (job) {
>   		/*
> @@ -295,7 +295,7 @@ static void drm_sched_job_timedout(struct work_struct *work)
>   		 * drm_sched_cleanup_jobs. It will be reinserted back after sched->thread
>   		 * is parked at which point it's safe.
>   		 */
> -		list_del_init(&job->node);
> +		list_del_init(&job->list);
>   		spin_unlock(&sched->job_list_lock);
>   
>   		job->sched->ops->timedout_job(job);
> @@ -392,7 +392,7 @@ void drm_sched_stop(struct drm_gpu_scheduler *sched, struct drm_sched_job *bad)
>   		 * Add at the head of the queue to reflect it was the earliest
>   		 * job extracted.
>   		 */
> -		list_add(&bad->node, &sched->ring_mirror_list);
> +		list_add(&bad->list, &sched->ring_mirror_list);
>   
>   	/*
>   	 * Iterate the job list from later to  earlier one and either deactive
> @@ -400,7 +400,8 @@ void drm_sched_stop(struct drm_gpu_scheduler *sched, struct drm_sched_job *bad)
>   	 * signaled.
>   	 * This iteration is thread safe as sched thread is stopped.
>   	 */
> -	list_for_each_entry_safe_reverse(s_job, tmp, &sched->ring_mirror_list, node) {
> +	list_for_each_entry_safe_reverse(s_job, tmp, &sched->ring_mirror_list,
> +					 list) {
>   		if (s_job->s_fence->parent &&
>   		    dma_fence_remove_callback(s_job->s_fence->parent,
>   					      &s_job->cb)) {
> @@ -411,7 +412,7 @@ void drm_sched_stop(struct drm_gpu_scheduler *sched, struct drm_sched_job *bad)
>   			 * Locking here is for concurrent resume timeout
>   			 */
>   			spin_lock(&sched->job_list_lock);
> -			list_del_init(&s_job->node);
> +			list_del_init(&s_job->list);
>   			spin_unlock(&sched->job_list_lock);
>   
>   			/*
> @@ -462,7 +463,7 @@ void drm_sched_start(struct drm_gpu_scheduler *sched, bool full_recovery)
>   	 * so no new jobs are being inserted or removed. Also concurrent
>   	 * GPU recovers can't run in parallel.
>   	 */
> -	list_for_each_entry_safe(s_job, tmp, &sched->ring_mirror_list, node) {
> +	list_for_each_entry_safe(s_job, tmp, &sched->ring_mirror_list, list) {
>   		struct dma_fence *fence = s_job->s_fence->parent;
>   
>   		atomic_inc(&sched->hw_rq_count);
> @@ -505,7 +506,7 @@ void drm_sched_resubmit_jobs(struct drm_gpu_scheduler *sched)
>   	bool found_guilty = false;
>   	struct dma_fence *fence;
>   
> -	list_for_each_entry_safe(s_job, tmp, &sched->ring_mirror_list, node) {
> +	list_for_each_entry_safe(s_job, tmp, &sched->ring_mirror_list, list) {
>   		struct drm_sched_fence *s_fence = s_job->s_fence;
>   
>   		if (!found_guilty && atomic_read(&s_job->karma) > sched->hang_limit) {
> @@ -565,7 +566,7 @@ int drm_sched_job_init(struct drm_sched_job *job,
>   		return -ENOMEM;
>   	job->id = atomic64_inc_return(&sched->job_id_count);
>   
> -	INIT_LIST_HEAD(&job->node);
> +	INIT_LIST_HEAD(&job->list);
>   
>   	return 0;
>   }
> @@ -684,11 +685,11 @@ drm_sched_get_cleanup_job(struct drm_gpu_scheduler *sched)
>   	spin_lock(&sched->job_list_lock);
>   
>   	job = list_first_entry_or_null(&sched->ring_mirror_list,
> -				       struct drm_sched_job, node);
> +				       struct drm_sched_job, list);
>   
>   	if (job && dma_fence_is_signaled(&job->s_fence->finished)) {
>   		/* remove job from ring_mirror_list */
> -		list_del_init(&job->node);
> +		list_del_init(&job->list);
>   	} else {
>   		job = NULL;
>   		/* queue timeout for next job */
> diff --git a/include/drm/gpu_scheduler.h b/include/drm/gpu_scheduler.h
> index 92436553fd6a..3add0072bd37 100644
> --- a/include/drm/gpu_scheduler.h
> +++ b/include/drm/gpu_scheduler.h
> @@ -189,14 +189,14 @@ struct drm_sched_fence *to_drm_sched_fence(struct dma_fence *f);
>    */
>   struct drm_sched_job {
>   	struct spsc_node		queue_node;
> +	struct list_head		list;
>   	struct drm_gpu_scheduler	*sched;
>   	struct drm_sched_fence		*s_fence;
>   	struct dma_fence_cb		finish_cb;
> -	struct list_head		node;
>   	uint64_t			id;
>   	atomic_t			karma;
>   	enum drm_sched_priority		s_priority;
> -	struct drm_sched_entity  *entity;
> +	struct drm_sched_entity         *entity;
>   	struct dma_fence_cb		cb;
>   };
>   

_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 125+ messages in thread

* Re: [PATCH 2/6] gpu/drm: ring_mirror_list --> pending_list
  2020-11-25  3:17                                       ` Luben Tuikov
@ 2020-11-25  9:47                                         ` Christian König
  -1 siblings, 0 replies; 125+ messages in thread
From: Christian König @ 2020-11-25  9:47 UTC (permalink / raw)
  To: Luben Tuikov, Andrey Grodzovsky, Lucas Stach, Alexander Deucher
  Cc: Emily Deng, amd-gfx, dri-devel, steven.price

Am 25.11.20 um 04:17 schrieb Luben Tuikov:
> Rename "ring_mirror_list" to "pending_list",
> to describe what something is, not what it does,
> how it's used, or how the hardware implements it.
>
> This also abstracts the actual hardware
> implementation, i.e. how the low-level driver
> communicates with the device it drives, ring, CAM,
> etc., shouldn't be exposed to DRM.
>
> The pending_list keeps jobs submitted, which are
> out of our control. Usually this means they are
> pending execution status in hardware, but the
> latter definition is a more general (inclusive)
> definition.
>
> Signed-off-by: Luben Tuikov <luben.tuikov@amd.com>

In general the rename is a good idea, but I think we should try to 
remove this linked list in general.

As the original name described this is essentially a ring buffer, the is 
no reason I can see to use a linked list here except for the add/remove 
madness we currently have.

Anyway patch is Acked-by: Christian König <christian.koenig@amd.com> for 
now.

Regards,
Christian.

> ---
>   drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c |  4 +--
>   drivers/gpu/drm/amd/amdgpu/amdgpu_device.c  |  4 +--
>   drivers/gpu/drm/amd/amdgpu/amdgpu_job.c     |  2 +-
>   drivers/gpu/drm/scheduler/sched_main.c      | 34 ++++++++++-----------
>   include/drm/gpu_scheduler.h                 | 10 +++---
>   5 files changed, 27 insertions(+), 27 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c
> index 8358cae0b5a4..db77a5bdfa45 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c
> @@ -1427,7 +1427,7 @@ static void amdgpu_ib_preempt_job_recovery(struct drm_gpu_scheduler *sched)
>   	struct dma_fence *fence;
>   
>   	spin_lock(&sched->job_list_lock);
> -	list_for_each_entry(s_job, &sched->ring_mirror_list, list) {
> +	list_for_each_entry(s_job, &sched->pending_list, list) {
>   		fence = sched->ops->run_job(s_job);
>   		dma_fence_put(fence);
>   	}
> @@ -1459,7 +1459,7 @@ static void amdgpu_ib_preempt_mark_partial_job(struct amdgpu_ring *ring)
>   
>   no_preempt:
>   	spin_lock(&sched->job_list_lock);
> -	list_for_each_entry_safe(s_job, tmp, &sched->ring_mirror_list, list) {
> +	list_for_each_entry_safe(s_job, tmp, &sched->pending_list, list) {
>   		if (dma_fence_is_signaled(&s_job->s_fence->finished)) {
>   			/* remove job from ring_mirror_list */
>   			list_del_init(&s_job->list);
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> index 4df6de81cd41..fbae600aa5f9 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> @@ -4127,8 +4127,8 @@ bool amdgpu_device_has_job_running(struct amdgpu_device *adev)
>   			continue;
>   
>   		spin_lock(&ring->sched.job_list_lock);
> -		job = list_first_entry_or_null(&ring->sched.ring_mirror_list,
> -				struct drm_sched_job, list);
> +		job = list_first_entry_or_null(&ring->sched.pending_list,
> +					       struct drm_sched_job, list);
>   		spin_unlock(&ring->sched.job_list_lock);
>   		if (job)
>   			return true;
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
> index aca52a46b93d..ff48101bab55 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
> @@ -271,7 +271,7 @@ void amdgpu_job_stop_all_jobs_on_sched(struct drm_gpu_scheduler *sched)
>   	}
>   
>   	/* Signal all jobs already scheduled to HW */
> -	list_for_each_entry(s_job, &sched->ring_mirror_list, list) {
> +	list_for_each_entry(s_job, &sched->pending_list, list) {
>   		struct drm_sched_fence *s_fence = s_job->s_fence;
>   
>   		dma_fence_set_error(&s_fence->finished, -EHWPOISON);
> diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c
> index c52eba407ebd..b694df12aaba 100644
> --- a/drivers/gpu/drm/scheduler/sched_main.c
> +++ b/drivers/gpu/drm/scheduler/sched_main.c
> @@ -198,7 +198,7 @@ EXPORT_SYMBOL(drm_sched_dependency_optimized);
>   static void drm_sched_start_timeout(struct drm_gpu_scheduler *sched)
>   {
>   	if (sched->timeout != MAX_SCHEDULE_TIMEOUT &&
> -	    !list_empty(&sched->ring_mirror_list))
> +	    !list_empty(&sched->pending_list))
>   		schedule_delayed_work(&sched->work_tdr, sched->timeout);
>   }
>   
> @@ -258,7 +258,7 @@ void drm_sched_resume_timeout(struct drm_gpu_scheduler *sched,
>   {
>   	spin_lock(&sched->job_list_lock);
>   
> -	if (list_empty(&sched->ring_mirror_list))
> +	if (list_empty(&sched->pending_list))
>   		cancel_delayed_work(&sched->work_tdr);
>   	else
>   		mod_delayed_work(system_wq, &sched->work_tdr, remaining);
> @@ -272,7 +272,7 @@ static void drm_sched_job_begin(struct drm_sched_job *s_job)
>   	struct drm_gpu_scheduler *sched = s_job->sched;
>   
>   	spin_lock(&sched->job_list_lock);
> -	list_add_tail(&s_job->list, &sched->ring_mirror_list);
> +	list_add_tail(&s_job->list, &sched->pending_list);
>   	drm_sched_start_timeout(sched);
>   	spin_unlock(&sched->job_list_lock);
>   }
> @@ -286,7 +286,7 @@ static void drm_sched_job_timedout(struct work_struct *work)
>   
>   	/* Protects against concurrent deletion in drm_sched_get_cleanup_job */
>   	spin_lock(&sched->job_list_lock);
> -	job = list_first_entry_or_null(&sched->ring_mirror_list,
> +	job = list_first_entry_or_null(&sched->pending_list,
>   				       struct drm_sched_job, list);
>   
>   	if (job) {
> @@ -371,7 +371,7 @@ EXPORT_SYMBOL(drm_sched_increase_karma);
>    * Stop the scheduler and also removes and frees all completed jobs.
>    * Note: bad job will not be freed as it might be used later and so it's
>    * callers responsibility to release it manually if it's not part of the
> - * mirror list any more.
> + * pending list any more.
>    *
>    */
>   void drm_sched_stop(struct drm_gpu_scheduler *sched, struct drm_sched_job *bad)
> @@ -392,15 +392,15 @@ void drm_sched_stop(struct drm_gpu_scheduler *sched, struct drm_sched_job *bad)
>   		 * Add at the head of the queue to reflect it was the earliest
>   		 * job extracted.
>   		 */
> -		list_add(&bad->list, &sched->ring_mirror_list);
> +		list_add(&bad->list, &sched->pending_list);
>   
>   	/*
>   	 * Iterate the job list from later to  earlier one and either deactive
> -	 * their HW callbacks or remove them from mirror list if they already
> +	 * their HW callbacks or remove them from pending list if they already
>   	 * signaled.
>   	 * This iteration is thread safe as sched thread is stopped.
>   	 */
> -	list_for_each_entry_safe_reverse(s_job, tmp, &sched->ring_mirror_list,
> +	list_for_each_entry_safe_reverse(s_job, tmp, &sched->pending_list,
>   					 list) {
>   		if (s_job->s_fence->parent &&
>   		    dma_fence_remove_callback(s_job->s_fence->parent,
> @@ -408,7 +408,7 @@ void drm_sched_stop(struct drm_gpu_scheduler *sched, struct drm_sched_job *bad)
>   			atomic_dec(&sched->hw_rq_count);
>   		} else {
>   			/*
> -			 * remove job from ring_mirror_list.
> +			 * remove job from pending_list.
>   			 * Locking here is for concurrent resume timeout
>   			 */
>   			spin_lock(&sched->job_list_lock);
> @@ -463,7 +463,7 @@ void drm_sched_start(struct drm_gpu_scheduler *sched, bool full_recovery)
>   	 * so no new jobs are being inserted or removed. Also concurrent
>   	 * GPU recovers can't run in parallel.
>   	 */
> -	list_for_each_entry_safe(s_job, tmp, &sched->ring_mirror_list, list) {
> +	list_for_each_entry_safe(s_job, tmp, &sched->pending_list, list) {
>   		struct dma_fence *fence = s_job->s_fence->parent;
>   
>   		atomic_inc(&sched->hw_rq_count);
> @@ -494,7 +494,7 @@ void drm_sched_start(struct drm_gpu_scheduler *sched, bool full_recovery)
>   EXPORT_SYMBOL(drm_sched_start);
>   
>   /**
> - * drm_sched_resubmit_jobs - helper to relunch job from mirror ring list
> + * drm_sched_resubmit_jobs - helper to relunch job from pending ring list
>    *
>    * @sched: scheduler instance
>    *
> @@ -506,7 +506,7 @@ void drm_sched_resubmit_jobs(struct drm_gpu_scheduler *sched)
>   	bool found_guilty = false;
>   	struct dma_fence *fence;
>   
> -	list_for_each_entry_safe(s_job, tmp, &sched->ring_mirror_list, list) {
> +	list_for_each_entry_safe(s_job, tmp, &sched->pending_list, list) {
>   		struct drm_sched_fence *s_fence = s_job->s_fence;
>   
>   		if (!found_guilty && atomic_read(&s_job->karma) > sched->hang_limit) {
> @@ -665,7 +665,7 @@ static void drm_sched_process_job(struct dma_fence *f, struct dma_fence_cb *cb)
>    *
>    * @sched: scheduler instance
>    *
> - * Returns the next finished job from the mirror list (if there is one)
> + * Returns the next finished job from the pending list (if there is one)
>    * ready for it to be destroyed.
>    */
>   static struct drm_sched_job *
> @@ -675,7 +675,7 @@ drm_sched_get_cleanup_job(struct drm_gpu_scheduler *sched)
>   
>   	/*
>   	 * Don't destroy jobs while the timeout worker is running  OR thread
> -	 * is being parked and hence assumed to not touch ring_mirror_list
> +	 * is being parked and hence assumed to not touch pending_list
>   	 */
>   	if ((sched->timeout != MAX_SCHEDULE_TIMEOUT &&
>   	    !cancel_delayed_work(&sched->work_tdr)) ||
> @@ -684,11 +684,11 @@ drm_sched_get_cleanup_job(struct drm_gpu_scheduler *sched)
>   
>   	spin_lock(&sched->job_list_lock);
>   
> -	job = list_first_entry_or_null(&sched->ring_mirror_list,
> +	job = list_first_entry_or_null(&sched->pending_list,
>   				       struct drm_sched_job, list);
>   
>   	if (job && dma_fence_is_signaled(&job->s_fence->finished)) {
> -		/* remove job from ring_mirror_list */
> +		/* remove job from pending_list */
>   		list_del_init(&job->list);
>   	} else {
>   		job = NULL;
> @@ -858,7 +858,7 @@ int drm_sched_init(struct drm_gpu_scheduler *sched,
>   
>   	init_waitqueue_head(&sched->wake_up_worker);
>   	init_waitqueue_head(&sched->job_scheduled);
> -	INIT_LIST_HEAD(&sched->ring_mirror_list);
> +	INIT_LIST_HEAD(&sched->pending_list);
>   	spin_lock_init(&sched->job_list_lock);
>   	atomic_set(&sched->hw_rq_count, 0);
>   	INIT_DELAYED_WORK(&sched->work_tdr, drm_sched_job_timedout);
> diff --git a/include/drm/gpu_scheduler.h b/include/drm/gpu_scheduler.h
> index 3add0072bd37..2e0c368e19f6 100644
> --- a/include/drm/gpu_scheduler.h
> +++ b/include/drm/gpu_scheduler.h
> @@ -174,7 +174,7 @@ struct drm_sched_fence *to_drm_sched_fence(struct dma_fence *f);
>    * @sched: the scheduler instance on which this job is scheduled.
>    * @s_fence: contains the fences for the scheduling of job.
>    * @finish_cb: the callback for the finished fence.
> - * @node: used to append this struct to the @drm_gpu_scheduler.ring_mirror_list.
> + * @node: used to append this struct to the @drm_gpu_scheduler.pending_list.
>    * @id: a unique id assigned to each job scheduled on the scheduler.
>    * @karma: increment on every hang caused by this job. If this exceeds the hang
>    *         limit of the scheduler then the job is marked guilty and will not
> @@ -203,7 +203,7 @@ struct drm_sched_job {
>   static inline bool drm_sched_invalidate_job(struct drm_sched_job *s_job,
>   					    int threshold)
>   {
> -	return (s_job && atomic_inc_return(&s_job->karma) > threshold);
> +	return s_job && atomic_inc_return(&s_job->karma) > threshold;
>   }
>   
>   /**
> @@ -260,8 +260,8 @@ struct drm_sched_backend_ops {
>    * @work_tdr: schedules a delayed call to @drm_sched_job_timedout after the
>    *            timeout interval is over.
>    * @thread: the kthread on which the scheduler which run.
> - * @ring_mirror_list: the list of jobs which are currently in the job queue.
> - * @job_list_lock: lock to protect the ring_mirror_list.
> + * @pending_list: the list of jobs which are currently in the job queue.
> + * @job_list_lock: lock to protect the pending_list.
>    * @hang_limit: once the hangs by a job crosses this limit then it is marked
>    *              guilty and it will be considered for scheduling further.
>    * @score: score to help loadbalancer pick a idle sched
> @@ -282,7 +282,7 @@ struct drm_gpu_scheduler {
>   	atomic64_t			job_id_count;
>   	struct delayed_work		work_tdr;
>   	struct task_struct		*thread;
> -	struct list_head		ring_mirror_list;
> +	struct list_head		pending_list;
>   	spinlock_t			job_list_lock;
>   	int				hang_limit;
>   	atomic_t                        score;

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 125+ messages in thread

* Re: [PATCH 2/6] gpu/drm: ring_mirror_list --> pending_list
@ 2020-11-25  9:47                                         ` Christian König
  0 siblings, 0 replies; 125+ messages in thread
From: Christian König @ 2020-11-25  9:47 UTC (permalink / raw)
  To: Luben Tuikov, Andrey Grodzovsky, Lucas Stach, Alexander Deucher
  Cc: Emily Deng, amd-gfx, dri-devel, steven.price

Am 25.11.20 um 04:17 schrieb Luben Tuikov:
> Rename "ring_mirror_list" to "pending_list",
> to describe what something is, not what it does,
> how it's used, or how the hardware implements it.
>
> This also abstracts the actual hardware
> implementation, i.e. how the low-level driver
> communicates with the device it drives, ring, CAM,
> etc., shouldn't be exposed to DRM.
>
> The pending_list keeps jobs submitted, which are
> out of our control. Usually this means they are
> pending execution status in hardware, but the
> latter definition is a more general (inclusive)
> definition.
>
> Signed-off-by: Luben Tuikov <luben.tuikov@amd.com>

In general the rename is a good idea, but I think we should try to 
remove this linked list in general.

As the original name described this is essentially a ring buffer, the is 
no reason I can see to use a linked list here except for the add/remove 
madness we currently have.

Anyway patch is Acked-by: Christian König <christian.koenig@amd.com> for 
now.

Regards,
Christian.

> ---
>   drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c |  4 +--
>   drivers/gpu/drm/amd/amdgpu/amdgpu_device.c  |  4 +--
>   drivers/gpu/drm/amd/amdgpu/amdgpu_job.c     |  2 +-
>   drivers/gpu/drm/scheduler/sched_main.c      | 34 ++++++++++-----------
>   include/drm/gpu_scheduler.h                 | 10 +++---
>   5 files changed, 27 insertions(+), 27 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c
> index 8358cae0b5a4..db77a5bdfa45 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c
> @@ -1427,7 +1427,7 @@ static void amdgpu_ib_preempt_job_recovery(struct drm_gpu_scheduler *sched)
>   	struct dma_fence *fence;
>   
>   	spin_lock(&sched->job_list_lock);
> -	list_for_each_entry(s_job, &sched->ring_mirror_list, list) {
> +	list_for_each_entry(s_job, &sched->pending_list, list) {
>   		fence = sched->ops->run_job(s_job);
>   		dma_fence_put(fence);
>   	}
> @@ -1459,7 +1459,7 @@ static void amdgpu_ib_preempt_mark_partial_job(struct amdgpu_ring *ring)
>   
>   no_preempt:
>   	spin_lock(&sched->job_list_lock);
> -	list_for_each_entry_safe(s_job, tmp, &sched->ring_mirror_list, list) {
> +	list_for_each_entry_safe(s_job, tmp, &sched->pending_list, list) {
>   		if (dma_fence_is_signaled(&s_job->s_fence->finished)) {
>   			/* remove job from ring_mirror_list */
>   			list_del_init(&s_job->list);
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> index 4df6de81cd41..fbae600aa5f9 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> @@ -4127,8 +4127,8 @@ bool amdgpu_device_has_job_running(struct amdgpu_device *adev)
>   			continue;
>   
>   		spin_lock(&ring->sched.job_list_lock);
> -		job = list_first_entry_or_null(&ring->sched.ring_mirror_list,
> -				struct drm_sched_job, list);
> +		job = list_first_entry_or_null(&ring->sched.pending_list,
> +					       struct drm_sched_job, list);
>   		spin_unlock(&ring->sched.job_list_lock);
>   		if (job)
>   			return true;
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
> index aca52a46b93d..ff48101bab55 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
> @@ -271,7 +271,7 @@ void amdgpu_job_stop_all_jobs_on_sched(struct drm_gpu_scheduler *sched)
>   	}
>   
>   	/* Signal all jobs already scheduled to HW */
> -	list_for_each_entry(s_job, &sched->ring_mirror_list, list) {
> +	list_for_each_entry(s_job, &sched->pending_list, list) {
>   		struct drm_sched_fence *s_fence = s_job->s_fence;
>   
>   		dma_fence_set_error(&s_fence->finished, -EHWPOISON);
> diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c
> index c52eba407ebd..b694df12aaba 100644
> --- a/drivers/gpu/drm/scheduler/sched_main.c
> +++ b/drivers/gpu/drm/scheduler/sched_main.c
> @@ -198,7 +198,7 @@ EXPORT_SYMBOL(drm_sched_dependency_optimized);
>   static void drm_sched_start_timeout(struct drm_gpu_scheduler *sched)
>   {
>   	if (sched->timeout != MAX_SCHEDULE_TIMEOUT &&
> -	    !list_empty(&sched->ring_mirror_list))
> +	    !list_empty(&sched->pending_list))
>   		schedule_delayed_work(&sched->work_tdr, sched->timeout);
>   }
>   
> @@ -258,7 +258,7 @@ void drm_sched_resume_timeout(struct drm_gpu_scheduler *sched,
>   {
>   	spin_lock(&sched->job_list_lock);
>   
> -	if (list_empty(&sched->ring_mirror_list))
> +	if (list_empty(&sched->pending_list))
>   		cancel_delayed_work(&sched->work_tdr);
>   	else
>   		mod_delayed_work(system_wq, &sched->work_tdr, remaining);
> @@ -272,7 +272,7 @@ static void drm_sched_job_begin(struct drm_sched_job *s_job)
>   	struct drm_gpu_scheduler *sched = s_job->sched;
>   
>   	spin_lock(&sched->job_list_lock);
> -	list_add_tail(&s_job->list, &sched->ring_mirror_list);
> +	list_add_tail(&s_job->list, &sched->pending_list);
>   	drm_sched_start_timeout(sched);
>   	spin_unlock(&sched->job_list_lock);
>   }
> @@ -286,7 +286,7 @@ static void drm_sched_job_timedout(struct work_struct *work)
>   
>   	/* Protects against concurrent deletion in drm_sched_get_cleanup_job */
>   	spin_lock(&sched->job_list_lock);
> -	job = list_first_entry_or_null(&sched->ring_mirror_list,
> +	job = list_first_entry_or_null(&sched->pending_list,
>   				       struct drm_sched_job, list);
>   
>   	if (job) {
> @@ -371,7 +371,7 @@ EXPORT_SYMBOL(drm_sched_increase_karma);
>    * Stop the scheduler and also removes and frees all completed jobs.
>    * Note: bad job will not be freed as it might be used later and so it's
>    * callers responsibility to release it manually if it's not part of the
> - * mirror list any more.
> + * pending list any more.
>    *
>    */
>   void drm_sched_stop(struct drm_gpu_scheduler *sched, struct drm_sched_job *bad)
> @@ -392,15 +392,15 @@ void drm_sched_stop(struct drm_gpu_scheduler *sched, struct drm_sched_job *bad)
>   		 * Add at the head of the queue to reflect it was the earliest
>   		 * job extracted.
>   		 */
> -		list_add(&bad->list, &sched->ring_mirror_list);
> +		list_add(&bad->list, &sched->pending_list);
>   
>   	/*
>   	 * Iterate the job list from later to  earlier one and either deactive
> -	 * their HW callbacks or remove them from mirror list if they already
> +	 * their HW callbacks or remove them from pending list if they already
>   	 * signaled.
>   	 * This iteration is thread safe as sched thread is stopped.
>   	 */
> -	list_for_each_entry_safe_reverse(s_job, tmp, &sched->ring_mirror_list,
> +	list_for_each_entry_safe_reverse(s_job, tmp, &sched->pending_list,
>   					 list) {
>   		if (s_job->s_fence->parent &&
>   		    dma_fence_remove_callback(s_job->s_fence->parent,
> @@ -408,7 +408,7 @@ void drm_sched_stop(struct drm_gpu_scheduler *sched, struct drm_sched_job *bad)
>   			atomic_dec(&sched->hw_rq_count);
>   		} else {
>   			/*
> -			 * remove job from ring_mirror_list.
> +			 * remove job from pending_list.
>   			 * Locking here is for concurrent resume timeout
>   			 */
>   			spin_lock(&sched->job_list_lock);
> @@ -463,7 +463,7 @@ void drm_sched_start(struct drm_gpu_scheduler *sched, bool full_recovery)
>   	 * so no new jobs are being inserted or removed. Also concurrent
>   	 * GPU recovers can't run in parallel.
>   	 */
> -	list_for_each_entry_safe(s_job, tmp, &sched->ring_mirror_list, list) {
> +	list_for_each_entry_safe(s_job, tmp, &sched->pending_list, list) {
>   		struct dma_fence *fence = s_job->s_fence->parent;
>   
>   		atomic_inc(&sched->hw_rq_count);
> @@ -494,7 +494,7 @@ void drm_sched_start(struct drm_gpu_scheduler *sched, bool full_recovery)
>   EXPORT_SYMBOL(drm_sched_start);
>   
>   /**
> - * drm_sched_resubmit_jobs - helper to relunch job from mirror ring list
> + * drm_sched_resubmit_jobs - helper to relunch job from pending ring list
>    *
>    * @sched: scheduler instance
>    *
> @@ -506,7 +506,7 @@ void drm_sched_resubmit_jobs(struct drm_gpu_scheduler *sched)
>   	bool found_guilty = false;
>   	struct dma_fence *fence;
>   
> -	list_for_each_entry_safe(s_job, tmp, &sched->ring_mirror_list, list) {
> +	list_for_each_entry_safe(s_job, tmp, &sched->pending_list, list) {
>   		struct drm_sched_fence *s_fence = s_job->s_fence;
>   
>   		if (!found_guilty && atomic_read(&s_job->karma) > sched->hang_limit) {
> @@ -665,7 +665,7 @@ static void drm_sched_process_job(struct dma_fence *f, struct dma_fence_cb *cb)
>    *
>    * @sched: scheduler instance
>    *
> - * Returns the next finished job from the mirror list (if there is one)
> + * Returns the next finished job from the pending list (if there is one)
>    * ready for it to be destroyed.
>    */
>   static struct drm_sched_job *
> @@ -675,7 +675,7 @@ drm_sched_get_cleanup_job(struct drm_gpu_scheduler *sched)
>   
>   	/*
>   	 * Don't destroy jobs while the timeout worker is running  OR thread
> -	 * is being parked and hence assumed to not touch ring_mirror_list
> +	 * is being parked and hence assumed to not touch pending_list
>   	 */
>   	if ((sched->timeout != MAX_SCHEDULE_TIMEOUT &&
>   	    !cancel_delayed_work(&sched->work_tdr)) ||
> @@ -684,11 +684,11 @@ drm_sched_get_cleanup_job(struct drm_gpu_scheduler *sched)
>   
>   	spin_lock(&sched->job_list_lock);
>   
> -	job = list_first_entry_or_null(&sched->ring_mirror_list,
> +	job = list_first_entry_or_null(&sched->pending_list,
>   				       struct drm_sched_job, list);
>   
>   	if (job && dma_fence_is_signaled(&job->s_fence->finished)) {
> -		/* remove job from ring_mirror_list */
> +		/* remove job from pending_list */
>   		list_del_init(&job->list);
>   	} else {
>   		job = NULL;
> @@ -858,7 +858,7 @@ int drm_sched_init(struct drm_gpu_scheduler *sched,
>   
>   	init_waitqueue_head(&sched->wake_up_worker);
>   	init_waitqueue_head(&sched->job_scheduled);
> -	INIT_LIST_HEAD(&sched->ring_mirror_list);
> +	INIT_LIST_HEAD(&sched->pending_list);
>   	spin_lock_init(&sched->job_list_lock);
>   	atomic_set(&sched->hw_rq_count, 0);
>   	INIT_DELAYED_WORK(&sched->work_tdr, drm_sched_job_timedout);
> diff --git a/include/drm/gpu_scheduler.h b/include/drm/gpu_scheduler.h
> index 3add0072bd37..2e0c368e19f6 100644
> --- a/include/drm/gpu_scheduler.h
> +++ b/include/drm/gpu_scheduler.h
> @@ -174,7 +174,7 @@ struct drm_sched_fence *to_drm_sched_fence(struct dma_fence *f);
>    * @sched: the scheduler instance on which this job is scheduled.
>    * @s_fence: contains the fences for the scheduling of job.
>    * @finish_cb: the callback for the finished fence.
> - * @node: used to append this struct to the @drm_gpu_scheduler.ring_mirror_list.
> + * @node: used to append this struct to the @drm_gpu_scheduler.pending_list.
>    * @id: a unique id assigned to each job scheduled on the scheduler.
>    * @karma: increment on every hang caused by this job. If this exceeds the hang
>    *         limit of the scheduler then the job is marked guilty and will not
> @@ -203,7 +203,7 @@ struct drm_sched_job {
>   static inline bool drm_sched_invalidate_job(struct drm_sched_job *s_job,
>   					    int threshold)
>   {
> -	return (s_job && atomic_inc_return(&s_job->karma) > threshold);
> +	return s_job && atomic_inc_return(&s_job->karma) > threshold;
>   }
>   
>   /**
> @@ -260,8 +260,8 @@ struct drm_sched_backend_ops {
>    * @work_tdr: schedules a delayed call to @drm_sched_job_timedout after the
>    *            timeout interval is over.
>    * @thread: the kthread on which the scheduler which run.
> - * @ring_mirror_list: the list of jobs which are currently in the job queue.
> - * @job_list_lock: lock to protect the ring_mirror_list.
> + * @pending_list: the list of jobs which are currently in the job queue.
> + * @job_list_lock: lock to protect the pending_list.
>    * @hang_limit: once the hangs by a job crosses this limit then it is marked
>    *              guilty and it will be considered for scheduling further.
>    * @score: score to help loadbalancer pick a idle sched
> @@ -282,7 +282,7 @@ struct drm_gpu_scheduler {
>   	atomic64_t			job_id_count;
>   	struct delayed_work		work_tdr;
>   	struct task_struct		*thread;
> -	struct list_head		ring_mirror_list;
> +	struct list_head		pending_list;
>   	spinlock_t			job_list_lock;
>   	int				hang_limit;
>   	atomic_t                        score;

_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 125+ messages in thread

* Re: [PATCH 3/6] drm/scheduler: Job timeout handler returns status
  2020-11-25  3:17                                       ` Luben Tuikov
@ 2020-11-25  9:50                                         ` Christian König
  -1 siblings, 0 replies; 125+ messages in thread
From: Christian König @ 2020-11-25  9:50 UTC (permalink / raw)
  To: Luben Tuikov, Andrey Grodzovsky, Lucas Stach, Alexander Deucher
  Cc: Emily Deng, amd-gfx, dri-devel, steven.price

Am 25.11.20 um 04:17 schrieb Luben Tuikov:
> The job timeout handler now returns status
> indicating back to the DRM layer whether the job
> was successfully cancelled or whether more time
> should be given to the job to complete.
>
> Signed-off-by: Luben Tuikov <luben.tuikov@amd.com>
> ---
>   drivers/gpu/drm/amd/amdgpu/amdgpu_job.c |  6 ++++--
>   include/drm/gpu_scheduler.h             | 13 ++++++++++---
>   2 files changed, 14 insertions(+), 5 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
> index ff48101bab55..81b73790ecc6 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
> @@ -28,7 +28,7 @@
>   #include "amdgpu.h"
>   #include "amdgpu_trace.h"
>   
> -static void amdgpu_job_timedout(struct drm_sched_job *s_job)
> +static int amdgpu_job_timedout(struct drm_sched_job *s_job)
>   {
>   	struct amdgpu_ring *ring = to_amdgpu_ring(s_job->sched);
>   	struct amdgpu_job *job = to_amdgpu_job(s_job);
> @@ -41,7 +41,7 @@ static void amdgpu_job_timedout(struct drm_sched_job *s_job)
>   	    amdgpu_ring_soft_recovery(ring, job->vmid, s_job->s_fence->parent)) {
>   		DRM_ERROR("ring %s timeout, but soft recovered\n",
>   			  s_job->sched->name);
> -		return;
> +		return 0;
>   	}
>   
>   	amdgpu_vm_get_task_info(ring->adev, job->pasid, &ti);
> @@ -53,10 +53,12 @@ static void amdgpu_job_timedout(struct drm_sched_job *s_job)
>   
>   	if (amdgpu_device_should_recover_gpu(ring->adev)) {
>   		amdgpu_device_gpu_recover(ring->adev, job);
> +		return 0;
>   	} else {
>   		drm_sched_suspend_timeout(&ring->sched);
>   		if (amdgpu_sriov_vf(adev))
>   			adev->virt.tdr_debug = true;
> +		return 1;
>   	}
>   }
>   
> diff --git a/include/drm/gpu_scheduler.h b/include/drm/gpu_scheduler.h
> index 2e0c368e19f6..61f7121e1c19 100644
> --- a/include/drm/gpu_scheduler.h
> +++ b/include/drm/gpu_scheduler.h
> @@ -230,10 +230,17 @@ struct drm_sched_backend_ops {
>   	struct dma_fence *(*run_job)(struct drm_sched_job *sched_job);
>   
>   	/**
> -         * @timedout_job: Called when a job has taken too long to execute,
> -         * to trigger GPU recovery.
> +	 * @timedout_job: Called when a job has taken too long to execute,
> +	 * to trigger GPU recovery.
> +	 *
> +	 * Return 0, if the job has been aborted successfully and will
> +	 * never be heard of from the device. Return non-zero if the
> +	 * job wasn't able to be aborted, i.e. if more time should be
> +	 * given to this job. The result is not "bool" as this
> +	 * function is not a predicate, although its result may seem
> +	 * as one.

I think the whole approach of timing out a job needs to be rethinked. 
What's timing out here is the hardware engine, not the job.

So we should also not have the job as parameter here. Maybe we should 
make that the fence we are waiting for instead.

>   	 */
> -	void (*timedout_job)(struct drm_sched_job *sched_job);
> +	int (*timedout_job)(struct drm_sched_job *sched_job);

I would either return an error code, boolean or enum here. But not use a 
number without a define.

Regards,
Christian.

>   
>   	/**
>            * @free_job: Called once the job's finished fence has been signaled

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 125+ messages in thread

* Re: [PATCH 3/6] drm/scheduler: Job timeout handler returns status
@ 2020-11-25  9:50                                         ` Christian König
  0 siblings, 0 replies; 125+ messages in thread
From: Christian König @ 2020-11-25  9:50 UTC (permalink / raw)
  To: Luben Tuikov, Andrey Grodzovsky, Lucas Stach, Alexander Deucher
  Cc: Emily Deng, amd-gfx, dri-devel, steven.price

Am 25.11.20 um 04:17 schrieb Luben Tuikov:
> The job timeout handler now returns status
> indicating back to the DRM layer whether the job
> was successfully cancelled or whether more time
> should be given to the job to complete.
>
> Signed-off-by: Luben Tuikov <luben.tuikov@amd.com>
> ---
>   drivers/gpu/drm/amd/amdgpu/amdgpu_job.c |  6 ++++--
>   include/drm/gpu_scheduler.h             | 13 ++++++++++---
>   2 files changed, 14 insertions(+), 5 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
> index ff48101bab55..81b73790ecc6 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
> @@ -28,7 +28,7 @@
>   #include "amdgpu.h"
>   #include "amdgpu_trace.h"
>   
> -static void amdgpu_job_timedout(struct drm_sched_job *s_job)
> +static int amdgpu_job_timedout(struct drm_sched_job *s_job)
>   {
>   	struct amdgpu_ring *ring = to_amdgpu_ring(s_job->sched);
>   	struct amdgpu_job *job = to_amdgpu_job(s_job);
> @@ -41,7 +41,7 @@ static void amdgpu_job_timedout(struct drm_sched_job *s_job)
>   	    amdgpu_ring_soft_recovery(ring, job->vmid, s_job->s_fence->parent)) {
>   		DRM_ERROR("ring %s timeout, but soft recovered\n",
>   			  s_job->sched->name);
> -		return;
> +		return 0;
>   	}
>   
>   	amdgpu_vm_get_task_info(ring->adev, job->pasid, &ti);
> @@ -53,10 +53,12 @@ static void amdgpu_job_timedout(struct drm_sched_job *s_job)
>   
>   	if (amdgpu_device_should_recover_gpu(ring->adev)) {
>   		amdgpu_device_gpu_recover(ring->adev, job);
> +		return 0;
>   	} else {
>   		drm_sched_suspend_timeout(&ring->sched);
>   		if (amdgpu_sriov_vf(adev))
>   			adev->virt.tdr_debug = true;
> +		return 1;
>   	}
>   }
>   
> diff --git a/include/drm/gpu_scheduler.h b/include/drm/gpu_scheduler.h
> index 2e0c368e19f6..61f7121e1c19 100644
> --- a/include/drm/gpu_scheduler.h
> +++ b/include/drm/gpu_scheduler.h
> @@ -230,10 +230,17 @@ struct drm_sched_backend_ops {
>   	struct dma_fence *(*run_job)(struct drm_sched_job *sched_job);
>   
>   	/**
> -         * @timedout_job: Called when a job has taken too long to execute,
> -         * to trigger GPU recovery.
> +	 * @timedout_job: Called when a job has taken too long to execute,
> +	 * to trigger GPU recovery.
> +	 *
> +	 * Return 0, if the job has been aborted successfully and will
> +	 * never be heard of from the device. Return non-zero if the
> +	 * job wasn't able to be aborted, i.e. if more time should be
> +	 * given to this job. The result is not "bool" as this
> +	 * function is not a predicate, although its result may seem
> +	 * as one.

I think the whole approach of timing out a job needs to be rethinked. 
What's timing out here is the hardware engine, not the job.

So we should also not have the job as parameter here. Maybe we should 
make that the fence we are waiting for instead.

>   	 */
> -	void (*timedout_job)(struct drm_sched_job *sched_job);
> +	int (*timedout_job)(struct drm_sched_job *sched_job);

I would either return an error code, boolean or enum here. But not use a 
number without a define.

Regards,
Christian.

>   
>   	/**
>            * @free_job: Called once the job's finished fence has been signaled

_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 125+ messages in thread

* Re: [PATCH 4/6] drm/scheduler: Essentialize the job done callback
  2020-11-25  3:17                                       ` Luben Tuikov
@ 2020-11-25  9:51                                         ` Christian König
  -1 siblings, 0 replies; 125+ messages in thread
From: Christian König @ 2020-11-25  9:51 UTC (permalink / raw)
  To: Luben Tuikov, Andrey Grodzovsky, Lucas Stach, Alexander Deucher
  Cc: Emily Deng, amd-gfx, dri-devel, steven.price

Am 25.11.20 um 04:17 schrieb Luben Tuikov:
> The job done callback is called from various
> places, in two ways: in job done role, and
> as a fence callback role.
>
> Essentialize the callback to an atom
> function to just complete the job,
> and into a second function as a prototype
> of fence callback which calls to complete
> the job.
>
> This is used in latter patches by the completion
> code.
>
> Signed-off-by: Luben Tuikov <luben.tuikov@amd.com>

Reviewed-by: Christian König <christian.koenig@amd.com>

> ---
>   drivers/gpu/drm/scheduler/sched_main.c | 73 ++++++++++++++------------
>   1 file changed, 40 insertions(+), 33 deletions(-)
>
> diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c
> index b694df12aaba..3eb7618a627d 100644
> --- a/drivers/gpu/drm/scheduler/sched_main.c
> +++ b/drivers/gpu/drm/scheduler/sched_main.c
> @@ -60,8 +60,6 @@
>   #define to_drm_sched_job(sched_job)		\
>   		container_of((sched_job), struct drm_sched_job, queue_node)
>   
> -static void drm_sched_process_job(struct dma_fence *f, struct dma_fence_cb *cb);
> -
>   /**
>    * drm_sched_rq_init - initialize a given run queue struct
>    *
> @@ -162,6 +160,40 @@ drm_sched_rq_select_entity(struct drm_sched_rq *rq)
>   	return NULL;
>   }
>   
> +/**
> + * drm_sched_job_done - complete a job
> + * @s_job: pointer to the job which is done
> + *
> + * Finish the job's fence and wake up the worker thread.
> + */
> +static void drm_sched_job_done(struct drm_sched_job *s_job)
> +{
> +	struct drm_sched_fence *s_fence = s_job->s_fence;
> +	struct drm_gpu_scheduler *sched = s_fence->sched;
> +
> +	atomic_dec(&sched->hw_rq_count);
> +	atomic_dec(&sched->score);
> +
> +	trace_drm_sched_process_job(s_fence);
> +
> +	dma_fence_get(&s_fence->finished);
> +	drm_sched_fence_finished(s_fence);
> +	dma_fence_put(&s_fence->finished);
> +	wake_up_interruptible(&sched->wake_up_worker);
> +}
> +
> +/**
> + * drm_sched_job_done_cb - the callback for a done job
> + * @f: fence
> + * @cb: fence callbacks
> + */
> +static void drm_sched_job_done_cb(struct dma_fence *f, struct dma_fence_cb *cb)
> +{
> +	struct drm_sched_job *s_job = container_of(cb, struct drm_sched_job, cb);
> +
> +	drm_sched_job_done(s_job);
> +}
> +
>   /**
>    * drm_sched_dependency_optimized
>    *
> @@ -473,14 +505,14 @@ void drm_sched_start(struct drm_gpu_scheduler *sched, bool full_recovery)
>   
>   		if (fence) {
>   			r = dma_fence_add_callback(fence, &s_job->cb,
> -						   drm_sched_process_job);
> +						   drm_sched_job_done_cb);
>   			if (r == -ENOENT)
> -				drm_sched_process_job(fence, &s_job->cb);
> +				drm_sched_job_done(s_job);
>   			else if (r)
>   				DRM_ERROR("fence add callback failed (%d)\n",
>   					  r);
>   		} else
> -			drm_sched_process_job(NULL, &s_job->cb);
> +			drm_sched_job_done(s_job);
>   	}
>   
>   	if (full_recovery) {
> @@ -635,31 +667,6 @@ drm_sched_select_entity(struct drm_gpu_scheduler *sched)
>   	return entity;
>   }
>   
> -/**
> - * drm_sched_process_job - process a job
> - *
> - * @f: fence
> - * @cb: fence callbacks
> - *
> - * Called after job has finished execution.
> - */
> -static void drm_sched_process_job(struct dma_fence *f, struct dma_fence_cb *cb)
> -{
> -	struct drm_sched_job *s_job = container_of(cb, struct drm_sched_job, cb);
> -	struct drm_sched_fence *s_fence = s_job->s_fence;
> -	struct drm_gpu_scheduler *sched = s_fence->sched;
> -
> -	atomic_dec(&sched->hw_rq_count);
> -	atomic_dec(&sched->score);
> -
> -	trace_drm_sched_process_job(s_fence);
> -
> -	dma_fence_get(&s_fence->finished);
> -	drm_sched_fence_finished(s_fence);
> -	dma_fence_put(&s_fence->finished);
> -	wake_up_interruptible(&sched->wake_up_worker);
> -}
> -
>   /**
>    * drm_sched_get_cleanup_job - fetch the next finished job to be destroyed
>    *
> @@ -809,9 +816,9 @@ static int drm_sched_main(void *param)
>   		if (!IS_ERR_OR_NULL(fence)) {
>   			s_fence->parent = dma_fence_get(fence);
>   			r = dma_fence_add_callback(fence, &sched_job->cb,
> -						   drm_sched_process_job);
> +						   drm_sched_job_done_cb);
>   			if (r == -ENOENT)
> -				drm_sched_process_job(fence, &sched_job->cb);
> +				drm_sched_job_done(sched_job);
>   			else if (r)
>   				DRM_ERROR("fence add callback failed (%d)\n",
>   					  r);
> @@ -820,7 +827,7 @@ static int drm_sched_main(void *param)
>   			if (IS_ERR(fence))
>   				dma_fence_set_error(&s_fence->finished, PTR_ERR(fence));
>   
> -			drm_sched_process_job(NULL, &sched_job->cb);
> +			drm_sched_job_done(sched_job);
>   		}
>   
>   		wake_up(&sched->job_scheduled);

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 125+ messages in thread

* Re: [PATCH 4/6] drm/scheduler: Essentialize the job done callback
@ 2020-11-25  9:51                                         ` Christian König
  0 siblings, 0 replies; 125+ messages in thread
From: Christian König @ 2020-11-25  9:51 UTC (permalink / raw)
  To: Luben Tuikov, Andrey Grodzovsky, Lucas Stach, Alexander Deucher
  Cc: Emily Deng, amd-gfx, dri-devel, steven.price

Am 25.11.20 um 04:17 schrieb Luben Tuikov:
> The job done callback is called from various
> places, in two ways: in job done role, and
> as a fence callback role.
>
> Essentialize the callback to an atom
> function to just complete the job,
> and into a second function as a prototype
> of fence callback which calls to complete
> the job.
>
> This is used in latter patches by the completion
> code.
>
> Signed-off-by: Luben Tuikov <luben.tuikov@amd.com>

Reviewed-by: Christian König <christian.koenig@amd.com>

> ---
>   drivers/gpu/drm/scheduler/sched_main.c | 73 ++++++++++++++------------
>   1 file changed, 40 insertions(+), 33 deletions(-)
>
> diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c
> index b694df12aaba..3eb7618a627d 100644
> --- a/drivers/gpu/drm/scheduler/sched_main.c
> +++ b/drivers/gpu/drm/scheduler/sched_main.c
> @@ -60,8 +60,6 @@
>   #define to_drm_sched_job(sched_job)		\
>   		container_of((sched_job), struct drm_sched_job, queue_node)
>   
> -static void drm_sched_process_job(struct dma_fence *f, struct dma_fence_cb *cb);
> -
>   /**
>    * drm_sched_rq_init - initialize a given run queue struct
>    *
> @@ -162,6 +160,40 @@ drm_sched_rq_select_entity(struct drm_sched_rq *rq)
>   	return NULL;
>   }
>   
> +/**
> + * drm_sched_job_done - complete a job
> + * @s_job: pointer to the job which is done
> + *
> + * Finish the job's fence and wake up the worker thread.
> + */
> +static void drm_sched_job_done(struct drm_sched_job *s_job)
> +{
> +	struct drm_sched_fence *s_fence = s_job->s_fence;
> +	struct drm_gpu_scheduler *sched = s_fence->sched;
> +
> +	atomic_dec(&sched->hw_rq_count);
> +	atomic_dec(&sched->score);
> +
> +	trace_drm_sched_process_job(s_fence);
> +
> +	dma_fence_get(&s_fence->finished);
> +	drm_sched_fence_finished(s_fence);
> +	dma_fence_put(&s_fence->finished);
> +	wake_up_interruptible(&sched->wake_up_worker);
> +}
> +
> +/**
> + * drm_sched_job_done_cb - the callback for a done job
> + * @f: fence
> + * @cb: fence callbacks
> + */
> +static void drm_sched_job_done_cb(struct dma_fence *f, struct dma_fence_cb *cb)
> +{
> +	struct drm_sched_job *s_job = container_of(cb, struct drm_sched_job, cb);
> +
> +	drm_sched_job_done(s_job);
> +}
> +
>   /**
>    * drm_sched_dependency_optimized
>    *
> @@ -473,14 +505,14 @@ void drm_sched_start(struct drm_gpu_scheduler *sched, bool full_recovery)
>   
>   		if (fence) {
>   			r = dma_fence_add_callback(fence, &s_job->cb,
> -						   drm_sched_process_job);
> +						   drm_sched_job_done_cb);
>   			if (r == -ENOENT)
> -				drm_sched_process_job(fence, &s_job->cb);
> +				drm_sched_job_done(s_job);
>   			else if (r)
>   				DRM_ERROR("fence add callback failed (%d)\n",
>   					  r);
>   		} else
> -			drm_sched_process_job(NULL, &s_job->cb);
> +			drm_sched_job_done(s_job);
>   	}
>   
>   	if (full_recovery) {
> @@ -635,31 +667,6 @@ drm_sched_select_entity(struct drm_gpu_scheduler *sched)
>   	return entity;
>   }
>   
> -/**
> - * drm_sched_process_job - process a job
> - *
> - * @f: fence
> - * @cb: fence callbacks
> - *
> - * Called after job has finished execution.
> - */
> -static void drm_sched_process_job(struct dma_fence *f, struct dma_fence_cb *cb)
> -{
> -	struct drm_sched_job *s_job = container_of(cb, struct drm_sched_job, cb);
> -	struct drm_sched_fence *s_fence = s_job->s_fence;
> -	struct drm_gpu_scheduler *sched = s_fence->sched;
> -
> -	atomic_dec(&sched->hw_rq_count);
> -	atomic_dec(&sched->score);
> -
> -	trace_drm_sched_process_job(s_fence);
> -
> -	dma_fence_get(&s_fence->finished);
> -	drm_sched_fence_finished(s_fence);
> -	dma_fence_put(&s_fence->finished);
> -	wake_up_interruptible(&sched->wake_up_worker);
> -}
> -
>   /**
>    * drm_sched_get_cleanup_job - fetch the next finished job to be destroyed
>    *
> @@ -809,9 +816,9 @@ static int drm_sched_main(void *param)
>   		if (!IS_ERR_OR_NULL(fence)) {
>   			s_fence->parent = dma_fence_get(fence);
>   			r = dma_fence_add_callback(fence, &sched_job->cb,
> -						   drm_sched_process_job);
> +						   drm_sched_job_done_cb);
>   			if (r == -ENOENT)
> -				drm_sched_process_job(fence, &sched_job->cb);
> +				drm_sched_job_done(sched_job);
>   			else if (r)
>   				DRM_ERROR("fence add callback failed (%d)\n",
>   					  r);
> @@ -820,7 +827,7 @@ static int drm_sched_main(void *param)
>   			if (IS_ERR(fence))
>   				dma_fence_set_error(&s_fence->finished, PTR_ERR(fence));
>   
> -			drm_sched_process_job(NULL, &sched_job->cb);
> +			drm_sched_job_done(sched_job);
>   		}
>   
>   		wake_up(&sched->job_scheduled);

_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 125+ messages in thread

* Re: [PATCH 5/6] drm/amdgpu: Don't hardcode thread name length
  2020-11-25  3:17                                       ` Luben Tuikov
@ 2020-11-25  9:55                                         ` Christian König
  -1 siblings, 0 replies; 125+ messages in thread
From: Christian König @ 2020-11-25  9:55 UTC (permalink / raw)
  To: Luben Tuikov, Andrey Grodzovsky, Lucas Stach, Alexander Deucher
  Cc: Emily Deng, amd-gfx, dri-devel, steven.price

Am 25.11.20 um 04:17 schrieb Luben Tuikov:
> Introduce a macro DRM_THREAD_NAME_LEN
> and use that to define ring name size,
> instead of hardcoding it to 16.
>
> Signed-off-by: Luben Tuikov <luben.tuikov@amd.com>
> ---
>   drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h | 2 +-
>   include/drm/gpu_scheduler.h              | 2 ++
>   2 files changed, 3 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
> index 7112137689db..bbd46c6dec65 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
> @@ -230,7 +230,7 @@ struct amdgpu_ring {
>   	unsigned		wptr_offs;
>   	unsigned		fence_offs;
>   	uint64_t		current_ctx;
> -	char			name[16];
> +	char			name[DRM_THREAD_NAME_LEN];
>   	u32                     trail_seq;
>   	unsigned		trail_fence_offs;
>   	u64			trail_fence_gpu_addr;
> diff --git a/include/drm/gpu_scheduler.h b/include/drm/gpu_scheduler.h
> index 61f7121e1c19..3a5686c3b5e9 100644
> --- a/include/drm/gpu_scheduler.h
> +++ b/include/drm/gpu_scheduler.h
> @@ -30,6 +30,8 @@
>   
>   #define MAX_WAIT_SCHED_ENTITY_Q_EMPTY msecs_to_jiffies(1000)
>   
> +#define DRM_THREAD_NAME_LEN     TASK_COMM_LEN
> +

The thread name is an amdgpu specific thing. I don't think we should 
have that in the scheduler.

And why do you use TASK_COMM_LEN here? That is completely unrelated stuff.

Regards,
Christian.

>   struct drm_gpu_scheduler;
>   struct drm_sched_rq;
>   

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 125+ messages in thread

* Re: [PATCH 5/6] drm/amdgpu: Don't hardcode thread name length
@ 2020-11-25  9:55                                         ` Christian König
  0 siblings, 0 replies; 125+ messages in thread
From: Christian König @ 2020-11-25  9:55 UTC (permalink / raw)
  To: Luben Tuikov, Andrey Grodzovsky, Lucas Stach, Alexander Deucher
  Cc: Emily Deng, amd-gfx, dri-devel, steven.price

Am 25.11.20 um 04:17 schrieb Luben Tuikov:
> Introduce a macro DRM_THREAD_NAME_LEN
> and use that to define ring name size,
> instead of hardcoding it to 16.
>
> Signed-off-by: Luben Tuikov <luben.tuikov@amd.com>
> ---
>   drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h | 2 +-
>   include/drm/gpu_scheduler.h              | 2 ++
>   2 files changed, 3 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
> index 7112137689db..bbd46c6dec65 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
> @@ -230,7 +230,7 @@ struct amdgpu_ring {
>   	unsigned		wptr_offs;
>   	unsigned		fence_offs;
>   	uint64_t		current_ctx;
> -	char			name[16];
> +	char			name[DRM_THREAD_NAME_LEN];
>   	u32                     trail_seq;
>   	unsigned		trail_fence_offs;
>   	u64			trail_fence_gpu_addr;
> diff --git a/include/drm/gpu_scheduler.h b/include/drm/gpu_scheduler.h
> index 61f7121e1c19..3a5686c3b5e9 100644
> --- a/include/drm/gpu_scheduler.h
> +++ b/include/drm/gpu_scheduler.h
> @@ -30,6 +30,8 @@
>   
>   #define MAX_WAIT_SCHED_ENTITY_Q_EMPTY msecs_to_jiffies(1000)
>   
> +#define DRM_THREAD_NAME_LEN     TASK_COMM_LEN
> +

The thread name is an amdgpu specific thing. I don't think we should 
have that in the scheduler.

And why do you use TASK_COMM_LEN here? That is completely unrelated stuff.

Regards,
Christian.

>   struct drm_gpu_scheduler;
>   struct drm_sched_rq;
>   

_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 125+ messages in thread

* Re: [PATCH 6/6] drm/sched: Make use of a "done" thread
  2020-11-25  3:17                                       ` Luben Tuikov
@ 2020-11-25 10:10                                         ` Christian König
  -1 siblings, 0 replies; 125+ messages in thread
From: Christian König @ 2020-11-25 10:10 UTC (permalink / raw)
  To: Luben Tuikov, Andrey Grodzovsky, Lucas Stach, Alexander Deucher
  Cc: Emily Deng, amd-gfx, dri-devel, steven.price

Am 25.11.20 um 04:17 schrieb Luben Tuikov:
> Add a "done" list to which all completed jobs are added
> to be freed. The drm_sched_job_done() callback is the
> producer of jobs to this list.
>
> Add a "done" thread which consumes from the done list
> and frees up jobs. Now, the main scheduler thread only
> pushes jobs to the GPU and the "done" thread frees them
> up, on the way out of the GPU when they've completed
> execution.

Well there are quite a number of problems in this patch.

 From the design I think we should be getting rid of the linked list and 
not extend its use. And we also don't want to offload the freeing of 
jobs into a different thread because that could potentially mean that 
this is executed on a different CPU.

Then one obvious problem seems to be that you don't take into account 
that we moved the job freeing into the scheduler thread to make sure 
that this is suspended while the scheduler thread is stopped. This 
behavior is now completely gone, e.g. the delete thread keeps running 
while the scheduler thread is stopped.

A few more comments below.

> Make use of the status returned by the GPU driver
> timeout handler to decide whether to leave the job in
> the pending list, or to send it off to the done list.
> If a job is done, it is added to the done list and the
> done thread woken up. If a job needs more time, it is
> left on the pending list and the timeout timer
> restarted.
>
> Eliminate the polling mechanism of picking out done
> jobs from the pending list, i.e. eliminate
> drm_sched_get_cleanup_job(). Now the main scheduler
> thread only pushes jobs down to the GPU.
>
> Various other optimizations to the GPU scheduler
> and job recovery are possible with this format.
>
> Signed-off-by: Luben Tuikov <luben.tuikov@amd.com>
> ---
>   drivers/gpu/drm/scheduler/sched_main.c | 173 +++++++++++++------------
>   include/drm/gpu_scheduler.h            |  14 ++
>   2 files changed, 101 insertions(+), 86 deletions(-)
>
> diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c
> index 3eb7618a627d..289ae68cd97f 100644
> --- a/drivers/gpu/drm/scheduler/sched_main.c
> +++ b/drivers/gpu/drm/scheduler/sched_main.c
> @@ -164,7 +164,8 @@ drm_sched_rq_select_entity(struct drm_sched_rq *rq)
>    * drm_sched_job_done - complete a job
>    * @s_job: pointer to the job which is done
>    *
> - * Finish the job's fence and wake up the worker thread.
> + * Finish the job's fence, move it to the done list,
> + * and wake up the done thread.
>    */
>   static void drm_sched_job_done(struct drm_sched_job *s_job)
>   {
> @@ -179,7 +180,12 @@ static void drm_sched_job_done(struct drm_sched_job *s_job)
>   	dma_fence_get(&s_fence->finished);
>   	drm_sched_fence_finished(s_fence);
>   	dma_fence_put(&s_fence->finished);
> -	wake_up_interruptible(&sched->wake_up_worker);
> +
> +	spin_lock(&sched->job_list_lock);
> +	list_move(&s_job->list, &sched->done_list);
> +	spin_unlock(&sched->job_list_lock);
> +
> +	wake_up_interruptible(&sched->done_wait_q);

How is the worker thread then woken up to push new jobs to the hardware?

>   }
>   
>   /**
> @@ -221,11 +227,10 @@ bool drm_sched_dependency_optimized(struct dma_fence* fence,
>   EXPORT_SYMBOL(drm_sched_dependency_optimized);
>   
>   /**
> - * drm_sched_start_timeout - start timeout for reset worker
> - *
> - * @sched: scheduler instance to start the worker for
> + * drm_sched_start_timeout - start a timeout timer
> + * @sched: scheduler instance whose job we're timing
>    *
> - * Start the timeout for the given scheduler.
> + * Start a timeout timer for the given scheduler.
>    */
>   static void drm_sched_start_timeout(struct drm_gpu_scheduler *sched)
>   {
> @@ -305,8 +310,8 @@ static void drm_sched_job_begin(struct drm_sched_job *s_job)
>   
>   	spin_lock(&sched->job_list_lock);
>   	list_add_tail(&s_job->list, &sched->pending_list);
> -	drm_sched_start_timeout(sched);
>   	spin_unlock(&sched->job_list_lock);
> +	drm_sched_start_timeout(sched);

This looks wrong, the drm_sched_start_timeout() function used to need 
the lock. Why should that have changed?

>   }
>   
>   static void drm_sched_job_timedout(struct work_struct *work)
> @@ -316,37 +321,30 @@ static void drm_sched_job_timedout(struct work_struct *work)
>   
>   	sched = container_of(work, struct drm_gpu_scheduler, work_tdr.work);
>   
> -	/* Protects against concurrent deletion in drm_sched_get_cleanup_job */
>   	spin_lock(&sched->job_list_lock);
>   	job = list_first_entry_or_null(&sched->pending_list,
>   				       struct drm_sched_job, list);
> +	spin_unlock(&sched->job_list_lock);
>   
>   	if (job) {
> -		/*
> -		 * Remove the bad job so it cannot be freed by concurrent
> -		 * drm_sched_cleanup_jobs. It will be reinserted back after sched->thread
> -		 * is parked at which point it's safe.
> -		 */
> -		list_del_init(&job->list);
> -		spin_unlock(&sched->job_list_lock);
> +		int res;
>   
> -		job->sched->ops->timedout_job(job);
> +		job->job_status |= DRM_JOB_STATUS_TIMEOUT;
> +		res = job->sched->ops->timedout_job(job);
> +		if (res == 0) {
> +			/* The job is out of the device.
> +			 */
> +			spin_lock(&sched->job_list_lock);
> +			list_move(&job->list, &sched->done_list);
> +			spin_unlock(&sched->job_list_lock);
>   
> -		/*
> -		 * Guilty job did complete and hence needs to be manually removed
> -		 * See drm_sched_stop doc.
> -		 */
> -		if (sched->free_guilty) {
> -			job->sched->ops->free_job(job);
> -			sched->free_guilty = false;
> +			wake_up_interruptible(&sched->done_wait_q);
> +		} else {
> +			/* The job needs more time.
> +			 */
> +			drm_sched_start_timeout(sched);
>   		}
> -	} else {
> -		spin_unlock(&sched->job_list_lock);
>   	}
> -
> -	spin_lock(&sched->job_list_lock);
> -	drm_sched_start_timeout(sched);
> -	spin_unlock(&sched->job_list_lock);
>   }
>   
>    /**
> @@ -511,15 +509,13 @@ void drm_sched_start(struct drm_gpu_scheduler *sched, bool full_recovery)
>   			else if (r)
>   				DRM_ERROR("fence add callback failed (%d)\n",
>   					  r);
> -		} else
> +		} else {
>   			drm_sched_job_done(s_job);
> +		}
>   	}
>   
> -	if (full_recovery) {
> -		spin_lock(&sched->job_list_lock);
> +	if (full_recovery)
>   		drm_sched_start_timeout(sched);
> -		spin_unlock(&sched->job_list_lock);

Same here.

Regards,
Christian.

> -	}
>   
>   	kthread_unpark(sched->thread);
>   }
> @@ -667,47 +663,6 @@ drm_sched_select_entity(struct drm_gpu_scheduler *sched)
>   	return entity;
>   }
>   
> -/**
> - * drm_sched_get_cleanup_job - fetch the next finished job to be destroyed
> - *
> - * @sched: scheduler instance
> - *
> - * Returns the next finished job from the pending list (if there is one)
> - * ready for it to be destroyed.
> - */
> -static struct drm_sched_job *
> -drm_sched_get_cleanup_job(struct drm_gpu_scheduler *sched)
> -{
> -	struct drm_sched_job *job;
> -
> -	/*
> -	 * Don't destroy jobs while the timeout worker is running  OR thread
> -	 * is being parked and hence assumed to not touch pending_list
> -	 */
> -	if ((sched->timeout != MAX_SCHEDULE_TIMEOUT &&
> -	    !cancel_delayed_work(&sched->work_tdr)) ||
> -	    kthread_should_park())
> -		return NULL;
> -
> -	spin_lock(&sched->job_list_lock);
> -
> -	job = list_first_entry_or_null(&sched->pending_list,
> -				       struct drm_sched_job, list);
> -
> -	if (job && dma_fence_is_signaled(&job->s_fence->finished)) {
> -		/* remove job from pending_list */
> -		list_del_init(&job->list);
> -	} else {
> -		job = NULL;
> -		/* queue timeout for next job */
> -		drm_sched_start_timeout(sched);
> -	}
> -
> -	spin_unlock(&sched->job_list_lock);
> -
> -	return job;
> -}
> -
>   /**
>    * drm_sched_pick_best - Get a drm sched from a sched_list with the least load
>    * @sched_list: list of drm_gpu_schedulers
> @@ -761,6 +716,44 @@ static bool drm_sched_blocked(struct drm_gpu_scheduler *sched)
>   	return false;
>   }
>   
> +/**
> + * drm_sched_done - free done tasks
> + * @param: pointer to a scheduler instance
> + *
> + * Returns 0.
> + */
> +static int drm_sched_done(void *param)
> +{
> +	struct drm_gpu_scheduler *sched = param;
> +
> +	do {
> +		LIST_HEAD(done_q);
> +
> +		wait_event_interruptible(sched->done_wait_q,
> +					 kthread_should_stop() ||
> +					 !list_empty(&sched->done_list));
> +
> +		spin_lock(&sched->job_list_lock);
> +		list_splice_init(&sched->done_list, &done_q);
> +		spin_unlock(&sched->job_list_lock);
> +
> +		if (list_empty(&done_q))
> +			continue;
> +
> +		while (!list_empty(&done_q)) {
> +			struct drm_sched_job *job;
> +
> +			job = list_first_entry(&done_q,
> +					       struct drm_sched_job,
> +					       list);
> +			list_del_init(&job->list);
> +			sched->ops->free_job(job);
> +		}
> +	} while (!kthread_should_stop());
> +
> +	return 0;
> +}
> +
>   /**
>    * drm_sched_main - main scheduler thread
>    *
> @@ -770,7 +763,7 @@ static bool drm_sched_blocked(struct drm_gpu_scheduler *sched)
>    */
>   static int drm_sched_main(void *param)
>   {
> -	struct drm_gpu_scheduler *sched = (struct drm_gpu_scheduler *)param;
> +	struct drm_gpu_scheduler *sched = param;
>   	int r;
>   
>   	sched_set_fifo_low(current);
> @@ -780,20 +773,12 @@ static int drm_sched_main(void *param)
>   		struct drm_sched_fence *s_fence;
>   		struct drm_sched_job *sched_job;
>   		struct dma_fence *fence;
> -		struct drm_sched_job *cleanup_job = NULL;
>   
>   		wait_event_interruptible(sched->wake_up_worker,
> -					 (cleanup_job = drm_sched_get_cleanup_job(sched)) ||
>   					 (!drm_sched_blocked(sched) &&
>   					  (entity = drm_sched_select_entity(sched))) ||
>   					 kthread_should_stop());
>   
> -		if (cleanup_job) {
> -			sched->ops->free_job(cleanup_job);
> -			/* queue timeout for next job */
> -			drm_sched_start_timeout(sched);
> -		}
> -
>   		if (!entity)
>   			continue;
>   
> @@ -820,8 +805,7 @@ static int drm_sched_main(void *param)
>   			if (r == -ENOENT)
>   				drm_sched_job_done(sched_job);
>   			else if (r)
> -				DRM_ERROR("fence add callback failed (%d)\n",
> -					  r);
> +				DRM_ERROR("fence add callback failed (%d)\n", r);
>   			dma_fence_put(fence);
>   		} else {
>   			if (IS_ERR(fence))
> @@ -865,7 +849,9 @@ int drm_sched_init(struct drm_gpu_scheduler *sched,
>   
>   	init_waitqueue_head(&sched->wake_up_worker);
>   	init_waitqueue_head(&sched->job_scheduled);
> +	init_waitqueue_head(&sched->done_wait_q);
>   	INIT_LIST_HEAD(&sched->pending_list);
> +	INIT_LIST_HEAD(&sched->done_list);
>   	spin_lock_init(&sched->job_list_lock);
>   	atomic_set(&sched->hw_rq_count, 0);
>   	INIT_DELAYED_WORK(&sched->work_tdr, drm_sched_job_timedout);
> @@ -881,6 +867,21 @@ int drm_sched_init(struct drm_gpu_scheduler *sched,
>   		return ret;
>   	}
>   
> +	snprintf(sched->thread_done_name, DRM_THREAD_NAME_LEN, "%s%s",
> +		 sched->name, "-done");
> +	sched->thread_done_name[DRM_THREAD_NAME_LEN - 1] = '\0';
> +	sched->thread_done = kthread_run(drm_sched_done, sched,
> +					 sched->thread_done_name);
> +	if (IS_ERR(sched->thread_done)) {
> +		ret = kthread_stop(sched->thread);
> +		if (!ret) {
> +			/* free_kthread_struct(sched->thread); */
> +			sched->thread = NULL;
> +		}
> +		DRM_ERROR("Failed to start thread %s", sched->thread_done_name);
> +		return ret;
> +	}
> +
>   	sched->ready = true;
>   	return 0;
>   }
> diff --git a/include/drm/gpu_scheduler.h b/include/drm/gpu_scheduler.h
> index 3a5686c3b5e9..b282d6158b50 100644
> --- a/include/drm/gpu_scheduler.h
> +++ b/include/drm/gpu_scheduler.h
> @@ -169,6 +169,12 @@ struct drm_sched_fence {
>   
>   struct drm_sched_fence *to_drm_sched_fence(struct dma_fence *f);
>   
> +enum drm_job_status {
> +	DRM_JOB_STATUS_NONE    = 0 << 0,
> +	DRM_JOB_STATUS_DONE    = 1 << 0,
> +	DRM_JOB_STATUS_TIMEOUT = 1 << 1,
> +};
> +
>   /**
>    * struct drm_sched_job - A job to be run by an entity.
>    *
> @@ -198,6 +204,7 @@ struct drm_sched_job {
>   	uint64_t			id;
>   	atomic_t			karma;
>   	enum drm_sched_priority		s_priority;
> +	enum drm_job_status             job_status;
>   	struct drm_sched_entity         *entity;
>   	struct dma_fence_cb		cb;
>   };
> @@ -284,15 +291,22 @@ struct drm_gpu_scheduler {
>   	uint32_t			hw_submission_limit;
>   	long				timeout;
>   	const char			*name;
> +	char                            thread_done_name[DRM_THREAD_NAME_LEN];
> +
>   	struct drm_sched_rq		sched_rq[DRM_SCHED_PRIORITY_COUNT];
>   	wait_queue_head_t		wake_up_worker;
>   	wait_queue_head_t		job_scheduled;
> +	wait_queue_head_t               done_wait_q;
>   	atomic_t			hw_rq_count;
>   	atomic64_t			job_id_count;
>   	struct delayed_work		work_tdr;
>   	struct task_struct		*thread;
> +	struct task_struct		*thread_done;
> +
>   	struct list_head		pending_list;
> +	struct list_head                done_list;
>   	spinlock_t			job_list_lock;
> +
>   	int				hang_limit;
>   	atomic_t                        score;
>   	bool				ready;

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 125+ messages in thread

* Re: [PATCH 6/6] drm/sched: Make use of a "done" thread
@ 2020-11-25 10:10                                         ` Christian König
  0 siblings, 0 replies; 125+ messages in thread
From: Christian König @ 2020-11-25 10:10 UTC (permalink / raw)
  To: Luben Tuikov, Andrey Grodzovsky, Lucas Stach, Alexander Deucher
  Cc: Emily Deng, amd-gfx, dri-devel, steven.price

Am 25.11.20 um 04:17 schrieb Luben Tuikov:
> Add a "done" list to which all completed jobs are added
> to be freed. The drm_sched_job_done() callback is the
> producer of jobs to this list.
>
> Add a "done" thread which consumes from the done list
> and frees up jobs. Now, the main scheduler thread only
> pushes jobs to the GPU and the "done" thread frees them
> up, on the way out of the GPU when they've completed
> execution.

Well there are quite a number of problems in this patch.

 From the design I think we should be getting rid of the linked list and 
not extend its use. And we also don't want to offload the freeing of 
jobs into a different thread because that could potentially mean that 
this is executed on a different CPU.

Then one obvious problem seems to be that you don't take into account 
that we moved the job freeing into the scheduler thread to make sure 
that this is suspended while the scheduler thread is stopped. This 
behavior is now completely gone, e.g. the delete thread keeps running 
while the scheduler thread is stopped.

A few more comments below.

> Make use of the status returned by the GPU driver
> timeout handler to decide whether to leave the job in
> the pending list, or to send it off to the done list.
> If a job is done, it is added to the done list and the
> done thread woken up. If a job needs more time, it is
> left on the pending list and the timeout timer
> restarted.
>
> Eliminate the polling mechanism of picking out done
> jobs from the pending list, i.e. eliminate
> drm_sched_get_cleanup_job(). Now the main scheduler
> thread only pushes jobs down to the GPU.
>
> Various other optimizations to the GPU scheduler
> and job recovery are possible with this format.
>
> Signed-off-by: Luben Tuikov <luben.tuikov@amd.com>
> ---
>   drivers/gpu/drm/scheduler/sched_main.c | 173 +++++++++++++------------
>   include/drm/gpu_scheduler.h            |  14 ++
>   2 files changed, 101 insertions(+), 86 deletions(-)
>
> diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c
> index 3eb7618a627d..289ae68cd97f 100644
> --- a/drivers/gpu/drm/scheduler/sched_main.c
> +++ b/drivers/gpu/drm/scheduler/sched_main.c
> @@ -164,7 +164,8 @@ drm_sched_rq_select_entity(struct drm_sched_rq *rq)
>    * drm_sched_job_done - complete a job
>    * @s_job: pointer to the job which is done
>    *
> - * Finish the job's fence and wake up the worker thread.
> + * Finish the job's fence, move it to the done list,
> + * and wake up the done thread.
>    */
>   static void drm_sched_job_done(struct drm_sched_job *s_job)
>   {
> @@ -179,7 +180,12 @@ static void drm_sched_job_done(struct drm_sched_job *s_job)
>   	dma_fence_get(&s_fence->finished);
>   	drm_sched_fence_finished(s_fence);
>   	dma_fence_put(&s_fence->finished);
> -	wake_up_interruptible(&sched->wake_up_worker);
> +
> +	spin_lock(&sched->job_list_lock);
> +	list_move(&s_job->list, &sched->done_list);
> +	spin_unlock(&sched->job_list_lock);
> +
> +	wake_up_interruptible(&sched->done_wait_q);

How is the worker thread then woken up to push new jobs to the hardware?

>   }
>   
>   /**
> @@ -221,11 +227,10 @@ bool drm_sched_dependency_optimized(struct dma_fence* fence,
>   EXPORT_SYMBOL(drm_sched_dependency_optimized);
>   
>   /**
> - * drm_sched_start_timeout - start timeout for reset worker
> - *
> - * @sched: scheduler instance to start the worker for
> + * drm_sched_start_timeout - start a timeout timer
> + * @sched: scheduler instance whose job we're timing
>    *
> - * Start the timeout for the given scheduler.
> + * Start a timeout timer for the given scheduler.
>    */
>   static void drm_sched_start_timeout(struct drm_gpu_scheduler *sched)
>   {
> @@ -305,8 +310,8 @@ static void drm_sched_job_begin(struct drm_sched_job *s_job)
>   
>   	spin_lock(&sched->job_list_lock);
>   	list_add_tail(&s_job->list, &sched->pending_list);
> -	drm_sched_start_timeout(sched);
>   	spin_unlock(&sched->job_list_lock);
> +	drm_sched_start_timeout(sched);

This looks wrong, the drm_sched_start_timeout() function used to need 
the lock. Why should that have changed?

>   }
>   
>   static void drm_sched_job_timedout(struct work_struct *work)
> @@ -316,37 +321,30 @@ static void drm_sched_job_timedout(struct work_struct *work)
>   
>   	sched = container_of(work, struct drm_gpu_scheduler, work_tdr.work);
>   
> -	/* Protects against concurrent deletion in drm_sched_get_cleanup_job */
>   	spin_lock(&sched->job_list_lock);
>   	job = list_first_entry_or_null(&sched->pending_list,
>   				       struct drm_sched_job, list);
> +	spin_unlock(&sched->job_list_lock);
>   
>   	if (job) {
> -		/*
> -		 * Remove the bad job so it cannot be freed by concurrent
> -		 * drm_sched_cleanup_jobs. It will be reinserted back after sched->thread
> -		 * is parked at which point it's safe.
> -		 */
> -		list_del_init(&job->list);
> -		spin_unlock(&sched->job_list_lock);
> +		int res;
>   
> -		job->sched->ops->timedout_job(job);
> +		job->job_status |= DRM_JOB_STATUS_TIMEOUT;
> +		res = job->sched->ops->timedout_job(job);
> +		if (res == 0) {
> +			/* The job is out of the device.
> +			 */
> +			spin_lock(&sched->job_list_lock);
> +			list_move(&job->list, &sched->done_list);
> +			spin_unlock(&sched->job_list_lock);
>   
> -		/*
> -		 * Guilty job did complete and hence needs to be manually removed
> -		 * See drm_sched_stop doc.
> -		 */
> -		if (sched->free_guilty) {
> -			job->sched->ops->free_job(job);
> -			sched->free_guilty = false;
> +			wake_up_interruptible(&sched->done_wait_q);
> +		} else {
> +			/* The job needs more time.
> +			 */
> +			drm_sched_start_timeout(sched);
>   		}
> -	} else {
> -		spin_unlock(&sched->job_list_lock);
>   	}
> -
> -	spin_lock(&sched->job_list_lock);
> -	drm_sched_start_timeout(sched);
> -	spin_unlock(&sched->job_list_lock);
>   }
>   
>    /**
> @@ -511,15 +509,13 @@ void drm_sched_start(struct drm_gpu_scheduler *sched, bool full_recovery)
>   			else if (r)
>   				DRM_ERROR("fence add callback failed (%d)\n",
>   					  r);
> -		} else
> +		} else {
>   			drm_sched_job_done(s_job);
> +		}
>   	}
>   
> -	if (full_recovery) {
> -		spin_lock(&sched->job_list_lock);
> +	if (full_recovery)
>   		drm_sched_start_timeout(sched);
> -		spin_unlock(&sched->job_list_lock);

Same here.

Regards,
Christian.

> -	}
>   
>   	kthread_unpark(sched->thread);
>   }
> @@ -667,47 +663,6 @@ drm_sched_select_entity(struct drm_gpu_scheduler *sched)
>   	return entity;
>   }
>   
> -/**
> - * drm_sched_get_cleanup_job - fetch the next finished job to be destroyed
> - *
> - * @sched: scheduler instance
> - *
> - * Returns the next finished job from the pending list (if there is one)
> - * ready for it to be destroyed.
> - */
> -static struct drm_sched_job *
> -drm_sched_get_cleanup_job(struct drm_gpu_scheduler *sched)
> -{
> -	struct drm_sched_job *job;
> -
> -	/*
> -	 * Don't destroy jobs while the timeout worker is running  OR thread
> -	 * is being parked and hence assumed to not touch pending_list
> -	 */
> -	if ((sched->timeout != MAX_SCHEDULE_TIMEOUT &&
> -	    !cancel_delayed_work(&sched->work_tdr)) ||
> -	    kthread_should_park())
> -		return NULL;
> -
> -	spin_lock(&sched->job_list_lock);
> -
> -	job = list_first_entry_or_null(&sched->pending_list,
> -				       struct drm_sched_job, list);
> -
> -	if (job && dma_fence_is_signaled(&job->s_fence->finished)) {
> -		/* remove job from pending_list */
> -		list_del_init(&job->list);
> -	} else {
> -		job = NULL;
> -		/* queue timeout for next job */
> -		drm_sched_start_timeout(sched);
> -	}
> -
> -	spin_unlock(&sched->job_list_lock);
> -
> -	return job;
> -}
> -
>   /**
>    * drm_sched_pick_best - Get a drm sched from a sched_list with the least load
>    * @sched_list: list of drm_gpu_schedulers
> @@ -761,6 +716,44 @@ static bool drm_sched_blocked(struct drm_gpu_scheduler *sched)
>   	return false;
>   }
>   
> +/**
> + * drm_sched_done - free done tasks
> + * @param: pointer to a scheduler instance
> + *
> + * Returns 0.
> + */
> +static int drm_sched_done(void *param)
> +{
> +	struct drm_gpu_scheduler *sched = param;
> +
> +	do {
> +		LIST_HEAD(done_q);
> +
> +		wait_event_interruptible(sched->done_wait_q,
> +					 kthread_should_stop() ||
> +					 !list_empty(&sched->done_list));
> +
> +		spin_lock(&sched->job_list_lock);
> +		list_splice_init(&sched->done_list, &done_q);
> +		spin_unlock(&sched->job_list_lock);
> +
> +		if (list_empty(&done_q))
> +			continue;
> +
> +		while (!list_empty(&done_q)) {
> +			struct drm_sched_job *job;
> +
> +			job = list_first_entry(&done_q,
> +					       struct drm_sched_job,
> +					       list);
> +			list_del_init(&job->list);
> +			sched->ops->free_job(job);
> +		}
> +	} while (!kthread_should_stop());
> +
> +	return 0;
> +}
> +
>   /**
>    * drm_sched_main - main scheduler thread
>    *
> @@ -770,7 +763,7 @@ static bool drm_sched_blocked(struct drm_gpu_scheduler *sched)
>    */
>   static int drm_sched_main(void *param)
>   {
> -	struct drm_gpu_scheduler *sched = (struct drm_gpu_scheduler *)param;
> +	struct drm_gpu_scheduler *sched = param;
>   	int r;
>   
>   	sched_set_fifo_low(current);
> @@ -780,20 +773,12 @@ static int drm_sched_main(void *param)
>   		struct drm_sched_fence *s_fence;
>   		struct drm_sched_job *sched_job;
>   		struct dma_fence *fence;
> -		struct drm_sched_job *cleanup_job = NULL;
>   
>   		wait_event_interruptible(sched->wake_up_worker,
> -					 (cleanup_job = drm_sched_get_cleanup_job(sched)) ||
>   					 (!drm_sched_blocked(sched) &&
>   					  (entity = drm_sched_select_entity(sched))) ||
>   					 kthread_should_stop());
>   
> -		if (cleanup_job) {
> -			sched->ops->free_job(cleanup_job);
> -			/* queue timeout for next job */
> -			drm_sched_start_timeout(sched);
> -		}
> -
>   		if (!entity)
>   			continue;
>   
> @@ -820,8 +805,7 @@ static int drm_sched_main(void *param)
>   			if (r == -ENOENT)
>   				drm_sched_job_done(sched_job);
>   			else if (r)
> -				DRM_ERROR("fence add callback failed (%d)\n",
> -					  r);
> +				DRM_ERROR("fence add callback failed (%d)\n", r);
>   			dma_fence_put(fence);
>   		} else {
>   			if (IS_ERR(fence))
> @@ -865,7 +849,9 @@ int drm_sched_init(struct drm_gpu_scheduler *sched,
>   
>   	init_waitqueue_head(&sched->wake_up_worker);
>   	init_waitqueue_head(&sched->job_scheduled);
> +	init_waitqueue_head(&sched->done_wait_q);
>   	INIT_LIST_HEAD(&sched->pending_list);
> +	INIT_LIST_HEAD(&sched->done_list);
>   	spin_lock_init(&sched->job_list_lock);
>   	atomic_set(&sched->hw_rq_count, 0);
>   	INIT_DELAYED_WORK(&sched->work_tdr, drm_sched_job_timedout);
> @@ -881,6 +867,21 @@ int drm_sched_init(struct drm_gpu_scheduler *sched,
>   		return ret;
>   	}
>   
> +	snprintf(sched->thread_done_name, DRM_THREAD_NAME_LEN, "%s%s",
> +		 sched->name, "-done");
> +	sched->thread_done_name[DRM_THREAD_NAME_LEN - 1] = '\0';
> +	sched->thread_done = kthread_run(drm_sched_done, sched,
> +					 sched->thread_done_name);
> +	if (IS_ERR(sched->thread_done)) {
> +		ret = kthread_stop(sched->thread);
> +		if (!ret) {
> +			/* free_kthread_struct(sched->thread); */
> +			sched->thread = NULL;
> +		}
> +		DRM_ERROR("Failed to start thread %s", sched->thread_done_name);
> +		return ret;
> +	}
> +
>   	sched->ready = true;
>   	return 0;
>   }
> diff --git a/include/drm/gpu_scheduler.h b/include/drm/gpu_scheduler.h
> index 3a5686c3b5e9..b282d6158b50 100644
> --- a/include/drm/gpu_scheduler.h
> +++ b/include/drm/gpu_scheduler.h
> @@ -169,6 +169,12 @@ struct drm_sched_fence {
>   
>   struct drm_sched_fence *to_drm_sched_fence(struct dma_fence *f);
>   
> +enum drm_job_status {
> +	DRM_JOB_STATUS_NONE    = 0 << 0,
> +	DRM_JOB_STATUS_DONE    = 1 << 0,
> +	DRM_JOB_STATUS_TIMEOUT = 1 << 1,
> +};
> +
>   /**
>    * struct drm_sched_job - A job to be run by an entity.
>    *
> @@ -198,6 +204,7 @@ struct drm_sched_job {
>   	uint64_t			id;
>   	atomic_t			karma;
>   	enum drm_sched_priority		s_priority;
> +	enum drm_job_status             job_status;
>   	struct drm_sched_entity         *entity;
>   	struct dma_fence_cb		cb;
>   };
> @@ -284,15 +291,22 @@ struct drm_gpu_scheduler {
>   	uint32_t			hw_submission_limit;
>   	long				timeout;
>   	const char			*name;
> +	char                            thread_done_name[DRM_THREAD_NAME_LEN];
> +
>   	struct drm_sched_rq		sched_rq[DRM_SCHED_PRIORITY_COUNT];
>   	wait_queue_head_t		wake_up_worker;
>   	wait_queue_head_t		job_scheduled;
> +	wait_queue_head_t               done_wait_q;
>   	atomic_t			hw_rq_count;
>   	atomic64_t			job_id_count;
>   	struct delayed_work		work_tdr;
>   	struct task_struct		*thread;
> +	struct task_struct		*thread_done;
> +
>   	struct list_head		pending_list;
> +	struct list_head                done_list;
>   	spinlock_t			job_list_lock;
> +
>   	int				hang_limit;
>   	atomic_t                        score;
>   	bool				ready;

_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 125+ messages in thread

* Re: [PATCH 3/6] drm/scheduler: Job timeout handler returns status
  2020-11-25  3:17                                       ` Luben Tuikov
@ 2020-11-25 11:04                                         ` Steven Price
  -1 siblings, 0 replies; 125+ messages in thread
From: Steven Price @ 2020-11-25 11:04 UTC (permalink / raw)
  To: Luben Tuikov, Andrey Grodzovsky, Christian König,
	Lucas Stach, Alexander Deucher
  Cc: Emily Deng, amd-gfx, dri-devel

On 25/11/2020 03:17, Luben Tuikov wrote:
> The job timeout handler now returns status
> indicating back to the DRM layer whether the job
> was successfully cancelled or whether more time
> should be given to the job to complete.

I'm not sure I understand in what circumstances you would want to give 
the job more time to complete. Could you expand on that?

One thing we're missing at the moment in Panfrost is the ability to 
suspend ("soft stop" is the Mali jargon) a job and pick something else 
to run. The propitiatory driver stack uses this to avoid timing out long 
running jobs while still allowing other processes to have time on the 
GPU. But this interface as it stands doesn't seem to provide that.

As the kernel test robot has already pointed out - you'll need to at the 
very least update the other uses of this interface.

Steve

> 
> Signed-off-by: Luben Tuikov <luben.tuikov@amd.com>
> ---
>   drivers/gpu/drm/amd/amdgpu/amdgpu_job.c |  6 ++++--
>   include/drm/gpu_scheduler.h             | 13 ++++++++++---
>   2 files changed, 14 insertions(+), 5 deletions(-)
> 
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
> index ff48101bab55..81b73790ecc6 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
> @@ -28,7 +28,7 @@
>   #include "amdgpu.h"
>   #include "amdgpu_trace.h"
>   
> -static void amdgpu_job_timedout(struct drm_sched_job *s_job)
> +static int amdgpu_job_timedout(struct drm_sched_job *s_job)
>   {
>   	struct amdgpu_ring *ring = to_amdgpu_ring(s_job->sched);
>   	struct amdgpu_job *job = to_amdgpu_job(s_job);
> @@ -41,7 +41,7 @@ static void amdgpu_job_timedout(struct drm_sched_job *s_job)
>   	    amdgpu_ring_soft_recovery(ring, job->vmid, s_job->s_fence->parent)) {
>   		DRM_ERROR("ring %s timeout, but soft recovered\n",
>   			  s_job->sched->name);
> -		return;
> +		return 0;
>   	}
>   
>   	amdgpu_vm_get_task_info(ring->adev, job->pasid, &ti);
> @@ -53,10 +53,12 @@ static void amdgpu_job_timedout(struct drm_sched_job *s_job)
>   
>   	if (amdgpu_device_should_recover_gpu(ring->adev)) {
>   		amdgpu_device_gpu_recover(ring->adev, job);
> +		return 0;
>   	} else {
>   		drm_sched_suspend_timeout(&ring->sched);
>   		if (amdgpu_sriov_vf(adev))
>   			adev->virt.tdr_debug = true;
> +		return 1;
>   	}
>   }
>   
> diff --git a/include/drm/gpu_scheduler.h b/include/drm/gpu_scheduler.h
> index 2e0c368e19f6..61f7121e1c19 100644
> --- a/include/drm/gpu_scheduler.h
> +++ b/include/drm/gpu_scheduler.h
> @@ -230,10 +230,17 @@ struct drm_sched_backend_ops {
>   	struct dma_fence *(*run_job)(struct drm_sched_job *sched_job);
>   
>   	/**
> -         * @timedout_job: Called when a job has taken too long to execute,
> -         * to trigger GPU recovery.
> +	 * @timedout_job: Called when a job has taken too long to execute,
> +	 * to trigger GPU recovery.
> +	 *
> +	 * Return 0, if the job has been aborted successfully and will
> +	 * never be heard of from the device. Return non-zero if the
> +	 * job wasn't able to be aborted, i.e. if more time should be
> +	 * given to this job. The result is not "bool" as this
> +	 * function is not a predicate, although its result may seem
> +	 * as one.
>   	 */
> -	void (*timedout_job)(struct drm_sched_job *sched_job);
> +	int (*timedout_job)(struct drm_sched_job *sched_job);
>   
>   	/**
>            * @free_job: Called once the job's finished fence has been signaled
> 

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 125+ messages in thread

* Re: [PATCH 3/6] drm/scheduler: Job timeout handler returns status
@ 2020-11-25 11:04                                         ` Steven Price
  0 siblings, 0 replies; 125+ messages in thread
From: Steven Price @ 2020-11-25 11:04 UTC (permalink / raw)
  To: Luben Tuikov, Andrey Grodzovsky, Christian König,
	Lucas Stach, Alexander Deucher
  Cc: Emily Deng, amd-gfx, dri-devel

On 25/11/2020 03:17, Luben Tuikov wrote:
> The job timeout handler now returns status
> indicating back to the DRM layer whether the job
> was successfully cancelled or whether more time
> should be given to the job to complete.

I'm not sure I understand in what circumstances you would want to give 
the job more time to complete. Could you expand on that?

One thing we're missing at the moment in Panfrost is the ability to 
suspend ("soft stop" is the Mali jargon) a job and pick something else 
to run. The propitiatory driver stack uses this to avoid timing out long 
running jobs while still allowing other processes to have time on the 
GPU. But this interface as it stands doesn't seem to provide that.

As the kernel test robot has already pointed out - you'll need to at the 
very least update the other uses of this interface.

Steve

> 
> Signed-off-by: Luben Tuikov <luben.tuikov@amd.com>
> ---
>   drivers/gpu/drm/amd/amdgpu/amdgpu_job.c |  6 ++++--
>   include/drm/gpu_scheduler.h             | 13 ++++++++++---
>   2 files changed, 14 insertions(+), 5 deletions(-)
> 
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
> index ff48101bab55..81b73790ecc6 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
> @@ -28,7 +28,7 @@
>   #include "amdgpu.h"
>   #include "amdgpu_trace.h"
>   
> -static void amdgpu_job_timedout(struct drm_sched_job *s_job)
> +static int amdgpu_job_timedout(struct drm_sched_job *s_job)
>   {
>   	struct amdgpu_ring *ring = to_amdgpu_ring(s_job->sched);
>   	struct amdgpu_job *job = to_amdgpu_job(s_job);
> @@ -41,7 +41,7 @@ static void amdgpu_job_timedout(struct drm_sched_job *s_job)
>   	    amdgpu_ring_soft_recovery(ring, job->vmid, s_job->s_fence->parent)) {
>   		DRM_ERROR("ring %s timeout, but soft recovered\n",
>   			  s_job->sched->name);
> -		return;
> +		return 0;
>   	}
>   
>   	amdgpu_vm_get_task_info(ring->adev, job->pasid, &ti);
> @@ -53,10 +53,12 @@ static void amdgpu_job_timedout(struct drm_sched_job *s_job)
>   
>   	if (amdgpu_device_should_recover_gpu(ring->adev)) {
>   		amdgpu_device_gpu_recover(ring->adev, job);
> +		return 0;
>   	} else {
>   		drm_sched_suspend_timeout(&ring->sched);
>   		if (amdgpu_sriov_vf(adev))
>   			adev->virt.tdr_debug = true;
> +		return 1;
>   	}
>   }
>   
> diff --git a/include/drm/gpu_scheduler.h b/include/drm/gpu_scheduler.h
> index 2e0c368e19f6..61f7121e1c19 100644
> --- a/include/drm/gpu_scheduler.h
> +++ b/include/drm/gpu_scheduler.h
> @@ -230,10 +230,17 @@ struct drm_sched_backend_ops {
>   	struct dma_fence *(*run_job)(struct drm_sched_job *sched_job);
>   
>   	/**
> -         * @timedout_job: Called when a job has taken too long to execute,
> -         * to trigger GPU recovery.
> +	 * @timedout_job: Called when a job has taken too long to execute,
> +	 * to trigger GPU recovery.
> +	 *
> +	 * Return 0, if the job has been aborted successfully and will
> +	 * never be heard of from the device. Return non-zero if the
> +	 * job wasn't able to be aborted, i.e. if more time should be
> +	 * given to this job. The result is not "bool" as this
> +	 * function is not a predicate, although its result may seem
> +	 * as one.
>   	 */
> -	void (*timedout_job)(struct drm_sched_job *sched_job);
> +	int (*timedout_job)(struct drm_sched_job *sched_job);
>   
>   	/**
>            * @free_job: Called once the job's finished fence has been signaled
> 

_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 125+ messages in thread

* Re: [PATCH 6/6] drm/sched: Make use of a "done" thread
  2020-11-25  3:17                                       ` Luben Tuikov
@ 2020-11-25 11:09                                         ` Steven Price
  -1 siblings, 0 replies; 125+ messages in thread
From: Steven Price @ 2020-11-25 11:09 UTC (permalink / raw)
  To: Luben Tuikov, Andrey Grodzovsky, Christian König,
	Lucas Stach, Alexander Deucher
  Cc: Emily Deng, amd-gfx, dri-devel

On 25/11/2020 03:17, Luben Tuikov wrote:
> Add a "done" list to which all completed jobs are added
> to be freed. The drm_sched_job_done() callback is the
> producer of jobs to this list.
> 
> Add a "done" thread which consumes from the done list
> and frees up jobs. Now, the main scheduler thread only
> pushes jobs to the GPU and the "done" thread frees them
> up, on the way out of the GPU when they've completed
> execution.

Generally I'd be in favour of a "done thread" as I think there are some 
murky corners of Panfrost's locking that would be helped by deferring 
the free_job() callback.

But I think you're trying to do too much in one patch here. And as 
Christian has pointed out there's some dodgy looking changes to locking 
which aren't explained.

Steve

> 
> Make use of the status returned by the GPU driver
> timeout handler to decide whether to leave the job in
> the pending list, or to send it off to the done list.
> If a job is done, it is added to the done list and the
> done thread woken up. If a job needs more time, it is
> left on the pending list and the timeout timer
> restarted.
> 
> Eliminate the polling mechanism of picking out done
> jobs from the pending list, i.e. eliminate
> drm_sched_get_cleanup_job(). Now the main scheduler
> thread only pushes jobs down to the GPU.
> 
> Various other optimizations to the GPU scheduler
> and job recovery are possible with this format.
> 
> Signed-off-by: Luben Tuikov <luben.tuikov@amd.com>
> ---
>   drivers/gpu/drm/scheduler/sched_main.c | 173 +++++++++++++------------
>   include/drm/gpu_scheduler.h            |  14 ++
>   2 files changed, 101 insertions(+), 86 deletions(-)
> 
> diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c
> index 3eb7618a627d..289ae68cd97f 100644
> --- a/drivers/gpu/drm/scheduler/sched_main.c
> +++ b/drivers/gpu/drm/scheduler/sched_main.c
> @@ -164,7 +164,8 @@ drm_sched_rq_select_entity(struct drm_sched_rq *rq)
>    * drm_sched_job_done - complete a job
>    * @s_job: pointer to the job which is done
>    *
> - * Finish the job's fence and wake up the worker thread.
> + * Finish the job's fence, move it to the done list,
> + * and wake up the done thread.
>    */
>   static void drm_sched_job_done(struct drm_sched_job *s_job)
>   {
> @@ -179,7 +180,12 @@ static void drm_sched_job_done(struct drm_sched_job *s_job)
>   	dma_fence_get(&s_fence->finished);
>   	drm_sched_fence_finished(s_fence);
>   	dma_fence_put(&s_fence->finished);
> -	wake_up_interruptible(&sched->wake_up_worker);
> +
> +	spin_lock(&sched->job_list_lock);
> +	list_move(&s_job->list, &sched->done_list);
> +	spin_unlock(&sched->job_list_lock);
> +
> +	wake_up_interruptible(&sched->done_wait_q);
>   }
>   
>   /**
> @@ -221,11 +227,10 @@ bool drm_sched_dependency_optimized(struct dma_fence* fence,
>   EXPORT_SYMBOL(drm_sched_dependency_optimized);
>   
>   /**
> - * drm_sched_start_timeout - start timeout for reset worker
> - *
> - * @sched: scheduler instance to start the worker for
> + * drm_sched_start_timeout - start a timeout timer
> + * @sched: scheduler instance whose job we're timing
>    *
> - * Start the timeout for the given scheduler.
> + * Start a timeout timer for the given scheduler.
>    */
>   static void drm_sched_start_timeout(struct drm_gpu_scheduler *sched)
>   {
> @@ -305,8 +310,8 @@ static void drm_sched_job_begin(struct drm_sched_job *s_job)
>   
>   	spin_lock(&sched->job_list_lock);
>   	list_add_tail(&s_job->list, &sched->pending_list);
> -	drm_sched_start_timeout(sched);
>   	spin_unlock(&sched->job_list_lock);
> +	drm_sched_start_timeout(sched);
>   }
>   
>   static void drm_sched_job_timedout(struct work_struct *work)
> @@ -316,37 +321,30 @@ static void drm_sched_job_timedout(struct work_struct *work)
>   
>   	sched = container_of(work, struct drm_gpu_scheduler, work_tdr.work);
>   
> -	/* Protects against concurrent deletion in drm_sched_get_cleanup_job */
>   	spin_lock(&sched->job_list_lock);
>   	job = list_first_entry_or_null(&sched->pending_list,
>   				       struct drm_sched_job, list);
> +	spin_unlock(&sched->job_list_lock);
>   
>   	if (job) {
> -		/*
> -		 * Remove the bad job so it cannot be freed by concurrent
> -		 * drm_sched_cleanup_jobs. It will be reinserted back after sched->thread
> -		 * is parked at which point it's safe.
> -		 */
> -		list_del_init(&job->list);
> -		spin_unlock(&sched->job_list_lock);
> +		int res;
>   
> -		job->sched->ops->timedout_job(job);
> +		job->job_status |= DRM_JOB_STATUS_TIMEOUT;
> +		res = job->sched->ops->timedout_job(job);
> +		if (res == 0) {
> +			/* The job is out of the device.
> +			 */
> +			spin_lock(&sched->job_list_lock);
> +			list_move(&job->list, &sched->done_list);
> +			spin_unlock(&sched->job_list_lock);
>   
> -		/*
> -		 * Guilty job did complete and hence needs to be manually removed
> -		 * See drm_sched_stop doc.
> -		 */
> -		if (sched->free_guilty) {
> -			job->sched->ops->free_job(job);
> -			sched->free_guilty = false;
> +			wake_up_interruptible(&sched->done_wait_q);
> +		} else {
> +			/* The job needs more time.
> +			 */
> +			drm_sched_start_timeout(sched);
>   		}
> -	} else {
> -		spin_unlock(&sched->job_list_lock);
>   	}
> -
> -	spin_lock(&sched->job_list_lock);
> -	drm_sched_start_timeout(sched);
> -	spin_unlock(&sched->job_list_lock);
>   }
>   
>    /**
> @@ -511,15 +509,13 @@ void drm_sched_start(struct drm_gpu_scheduler *sched, bool full_recovery)
>   			else if (r)
>   				DRM_ERROR("fence add callback failed (%d)\n",
>   					  r);
> -		} else
> +		} else {
>   			drm_sched_job_done(s_job);
> +		}
>   	}
>   
> -	if (full_recovery) {
> -		spin_lock(&sched->job_list_lock);
> +	if (full_recovery)
>   		drm_sched_start_timeout(sched);
> -		spin_unlock(&sched->job_list_lock);
> -	}
>   
>   	kthread_unpark(sched->thread);
>   }
> @@ -667,47 +663,6 @@ drm_sched_select_entity(struct drm_gpu_scheduler *sched)
>   	return entity;
>   }
>   
> -/**
> - * drm_sched_get_cleanup_job - fetch the next finished job to be destroyed
> - *
> - * @sched: scheduler instance
> - *
> - * Returns the next finished job from the pending list (if there is one)
> - * ready for it to be destroyed.
> - */
> -static struct drm_sched_job *
> -drm_sched_get_cleanup_job(struct drm_gpu_scheduler *sched)
> -{
> -	struct drm_sched_job *job;
> -
> -	/*
> -	 * Don't destroy jobs while the timeout worker is running  OR thread
> -	 * is being parked and hence assumed to not touch pending_list
> -	 */
> -	if ((sched->timeout != MAX_SCHEDULE_TIMEOUT &&
> -	    !cancel_delayed_work(&sched->work_tdr)) ||
> -	    kthread_should_park())
> -		return NULL;
> -
> -	spin_lock(&sched->job_list_lock);
> -
> -	job = list_first_entry_or_null(&sched->pending_list,
> -				       struct drm_sched_job, list);
> -
> -	if (job && dma_fence_is_signaled(&job->s_fence->finished)) {
> -		/* remove job from pending_list */
> -		list_del_init(&job->list);
> -	} else {
> -		job = NULL;
> -		/* queue timeout for next job */
> -		drm_sched_start_timeout(sched);
> -	}
> -
> -	spin_unlock(&sched->job_list_lock);
> -
> -	return job;
> -}
> -
>   /**
>    * drm_sched_pick_best - Get a drm sched from a sched_list with the least load
>    * @sched_list: list of drm_gpu_schedulers
> @@ -761,6 +716,44 @@ static bool drm_sched_blocked(struct drm_gpu_scheduler *sched)
>   	return false;
>   }
>   
> +/**
> + * drm_sched_done - free done tasks
> + * @param: pointer to a scheduler instance
> + *
> + * Returns 0.
> + */
> +static int drm_sched_done(void *param)
> +{
> +	struct drm_gpu_scheduler *sched = param;
> +
> +	do {
> +		LIST_HEAD(done_q);
> +
> +		wait_event_interruptible(sched->done_wait_q,
> +					 kthread_should_stop() ||
> +					 !list_empty(&sched->done_list));
> +
> +		spin_lock(&sched->job_list_lock);
> +		list_splice_init(&sched->done_list, &done_q);
> +		spin_unlock(&sched->job_list_lock);
> +
> +		if (list_empty(&done_q))
> +			continue;
> +
> +		while (!list_empty(&done_q)) {
> +			struct drm_sched_job *job;
> +
> +			job = list_first_entry(&done_q,
> +					       struct drm_sched_job,
> +					       list);
> +			list_del_init(&job->list);
> +			sched->ops->free_job(job);
> +		}
> +	} while (!kthread_should_stop());
> +
> +	return 0;
> +}
> +
>   /**
>    * drm_sched_main - main scheduler thread
>    *
> @@ -770,7 +763,7 @@ static bool drm_sched_blocked(struct drm_gpu_scheduler *sched)
>    */
>   static int drm_sched_main(void *param)
>   {
> -	struct drm_gpu_scheduler *sched = (struct drm_gpu_scheduler *)param;
> +	struct drm_gpu_scheduler *sched = param;
>   	int r;
>   
>   	sched_set_fifo_low(current);
> @@ -780,20 +773,12 @@ static int drm_sched_main(void *param)
>   		struct drm_sched_fence *s_fence;
>   		struct drm_sched_job *sched_job;
>   		struct dma_fence *fence;
> -		struct drm_sched_job *cleanup_job = NULL;
>   
>   		wait_event_interruptible(sched->wake_up_worker,
> -					 (cleanup_job = drm_sched_get_cleanup_job(sched)) ||
>   					 (!drm_sched_blocked(sched) &&
>   					  (entity = drm_sched_select_entity(sched))) ||
>   					 kthread_should_stop());
>   
> -		if (cleanup_job) {
> -			sched->ops->free_job(cleanup_job);
> -			/* queue timeout for next job */
> -			drm_sched_start_timeout(sched);
> -		}
> -
>   		if (!entity)
>   			continue;
>   
> @@ -820,8 +805,7 @@ static int drm_sched_main(void *param)
>   			if (r == -ENOENT)
>   				drm_sched_job_done(sched_job);
>   			else if (r)
> -				DRM_ERROR("fence add callback failed (%d)\n",
> -					  r);
> +				DRM_ERROR("fence add callback failed (%d)\n", r);
>   			dma_fence_put(fence);
>   		} else {
>   			if (IS_ERR(fence))
> @@ -865,7 +849,9 @@ int drm_sched_init(struct drm_gpu_scheduler *sched,
>   
>   	init_waitqueue_head(&sched->wake_up_worker);
>   	init_waitqueue_head(&sched->job_scheduled);
> +	init_waitqueue_head(&sched->done_wait_q);
>   	INIT_LIST_HEAD(&sched->pending_list);
> +	INIT_LIST_HEAD(&sched->done_list);
>   	spin_lock_init(&sched->job_list_lock);
>   	atomic_set(&sched->hw_rq_count, 0);
>   	INIT_DELAYED_WORK(&sched->work_tdr, drm_sched_job_timedout);
> @@ -881,6 +867,21 @@ int drm_sched_init(struct drm_gpu_scheduler *sched,
>   		return ret;
>   	}
>   
> +	snprintf(sched->thread_done_name, DRM_THREAD_NAME_LEN, "%s%s",
> +		 sched->name, "-done");
> +	sched->thread_done_name[DRM_THREAD_NAME_LEN - 1] = '\0';
> +	sched->thread_done = kthread_run(drm_sched_done, sched,
> +					 sched->thread_done_name);
> +	if (IS_ERR(sched->thread_done)) {
> +		ret = kthread_stop(sched->thread);
> +		if (!ret) {
> +			/* free_kthread_struct(sched->thread); */
> +			sched->thread = NULL;
> +		}
> +		DRM_ERROR("Failed to start thread %s", sched->thread_done_name);
> +		return ret;
> +	}
> +
>   	sched->ready = true;
>   	return 0;
>   }
> diff --git a/include/drm/gpu_scheduler.h b/include/drm/gpu_scheduler.h
> index 3a5686c3b5e9..b282d6158b50 100644
> --- a/include/drm/gpu_scheduler.h
> +++ b/include/drm/gpu_scheduler.h
> @@ -169,6 +169,12 @@ struct drm_sched_fence {
>   
>   struct drm_sched_fence *to_drm_sched_fence(struct dma_fence *f);
>   
> +enum drm_job_status {
> +	DRM_JOB_STATUS_NONE    = 0 << 0,
> +	DRM_JOB_STATUS_DONE    = 1 << 0,
> +	DRM_JOB_STATUS_TIMEOUT = 1 << 1,
> +};
> +
>   /**
>    * struct drm_sched_job - A job to be run by an entity.
>    *
> @@ -198,6 +204,7 @@ struct drm_sched_job {
>   	uint64_t			id;
>   	atomic_t			karma;
>   	enum drm_sched_priority		s_priority;
> +	enum drm_job_status             job_status;
>   	struct drm_sched_entity         *entity;
>   	struct dma_fence_cb		cb;
>   };
> @@ -284,15 +291,22 @@ struct drm_gpu_scheduler {
>   	uint32_t			hw_submission_limit;
>   	long				timeout;
>   	const char			*name;
> +	char                            thread_done_name[DRM_THREAD_NAME_LEN];
> +
>   	struct drm_sched_rq		sched_rq[DRM_SCHED_PRIORITY_COUNT];
>   	wait_queue_head_t		wake_up_worker;
>   	wait_queue_head_t		job_scheduled;
> +	wait_queue_head_t               done_wait_q;
>   	atomic_t			hw_rq_count;
>   	atomic64_t			job_id_count;
>   	struct delayed_work		work_tdr;
>   	struct task_struct		*thread;
> +	struct task_struct		*thread_done;
> +
>   	struct list_head		pending_list;
> +	struct list_head                done_list;
>   	spinlock_t			job_list_lock;
> +
>   	int				hang_limit;
>   	atomic_t                        score;
>   	bool				ready;
> 

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 125+ messages in thread

* Re: [PATCH 6/6] drm/sched: Make use of a "done" thread
@ 2020-11-25 11:09                                         ` Steven Price
  0 siblings, 0 replies; 125+ messages in thread
From: Steven Price @ 2020-11-25 11:09 UTC (permalink / raw)
  To: Luben Tuikov, Andrey Grodzovsky, Christian König,
	Lucas Stach, Alexander Deucher
  Cc: Emily Deng, amd-gfx, dri-devel

On 25/11/2020 03:17, Luben Tuikov wrote:
> Add a "done" list to which all completed jobs are added
> to be freed. The drm_sched_job_done() callback is the
> producer of jobs to this list.
> 
> Add a "done" thread which consumes from the done list
> and frees up jobs. Now, the main scheduler thread only
> pushes jobs to the GPU and the "done" thread frees them
> up, on the way out of the GPU when they've completed
> execution.

Generally I'd be in favour of a "done thread" as I think there are some 
murky corners of Panfrost's locking that would be helped by deferring 
the free_job() callback.

But I think you're trying to do too much in one patch here. And as 
Christian has pointed out there's some dodgy looking changes to locking 
which aren't explained.

Steve

> 
> Make use of the status returned by the GPU driver
> timeout handler to decide whether to leave the job in
> the pending list, or to send it off to the done list.
> If a job is done, it is added to the done list and the
> done thread woken up. If a job needs more time, it is
> left on the pending list and the timeout timer
> restarted.
> 
> Eliminate the polling mechanism of picking out done
> jobs from the pending list, i.e. eliminate
> drm_sched_get_cleanup_job(). Now the main scheduler
> thread only pushes jobs down to the GPU.
> 
> Various other optimizations to the GPU scheduler
> and job recovery are possible with this format.
> 
> Signed-off-by: Luben Tuikov <luben.tuikov@amd.com>
> ---
>   drivers/gpu/drm/scheduler/sched_main.c | 173 +++++++++++++------------
>   include/drm/gpu_scheduler.h            |  14 ++
>   2 files changed, 101 insertions(+), 86 deletions(-)
> 
> diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c
> index 3eb7618a627d..289ae68cd97f 100644
> --- a/drivers/gpu/drm/scheduler/sched_main.c
> +++ b/drivers/gpu/drm/scheduler/sched_main.c
> @@ -164,7 +164,8 @@ drm_sched_rq_select_entity(struct drm_sched_rq *rq)
>    * drm_sched_job_done - complete a job
>    * @s_job: pointer to the job which is done
>    *
> - * Finish the job's fence and wake up the worker thread.
> + * Finish the job's fence, move it to the done list,
> + * and wake up the done thread.
>    */
>   static void drm_sched_job_done(struct drm_sched_job *s_job)
>   {
> @@ -179,7 +180,12 @@ static void drm_sched_job_done(struct drm_sched_job *s_job)
>   	dma_fence_get(&s_fence->finished);
>   	drm_sched_fence_finished(s_fence);
>   	dma_fence_put(&s_fence->finished);
> -	wake_up_interruptible(&sched->wake_up_worker);
> +
> +	spin_lock(&sched->job_list_lock);
> +	list_move(&s_job->list, &sched->done_list);
> +	spin_unlock(&sched->job_list_lock);
> +
> +	wake_up_interruptible(&sched->done_wait_q);
>   }
>   
>   /**
> @@ -221,11 +227,10 @@ bool drm_sched_dependency_optimized(struct dma_fence* fence,
>   EXPORT_SYMBOL(drm_sched_dependency_optimized);
>   
>   /**
> - * drm_sched_start_timeout - start timeout for reset worker
> - *
> - * @sched: scheduler instance to start the worker for
> + * drm_sched_start_timeout - start a timeout timer
> + * @sched: scheduler instance whose job we're timing
>    *
> - * Start the timeout for the given scheduler.
> + * Start a timeout timer for the given scheduler.
>    */
>   static void drm_sched_start_timeout(struct drm_gpu_scheduler *sched)
>   {
> @@ -305,8 +310,8 @@ static void drm_sched_job_begin(struct drm_sched_job *s_job)
>   
>   	spin_lock(&sched->job_list_lock);
>   	list_add_tail(&s_job->list, &sched->pending_list);
> -	drm_sched_start_timeout(sched);
>   	spin_unlock(&sched->job_list_lock);
> +	drm_sched_start_timeout(sched);
>   }
>   
>   static void drm_sched_job_timedout(struct work_struct *work)
> @@ -316,37 +321,30 @@ static void drm_sched_job_timedout(struct work_struct *work)
>   
>   	sched = container_of(work, struct drm_gpu_scheduler, work_tdr.work);
>   
> -	/* Protects against concurrent deletion in drm_sched_get_cleanup_job */
>   	spin_lock(&sched->job_list_lock);
>   	job = list_first_entry_or_null(&sched->pending_list,
>   				       struct drm_sched_job, list);
> +	spin_unlock(&sched->job_list_lock);
>   
>   	if (job) {
> -		/*
> -		 * Remove the bad job so it cannot be freed by concurrent
> -		 * drm_sched_cleanup_jobs. It will be reinserted back after sched->thread
> -		 * is parked at which point it's safe.
> -		 */
> -		list_del_init(&job->list);
> -		spin_unlock(&sched->job_list_lock);
> +		int res;
>   
> -		job->sched->ops->timedout_job(job);
> +		job->job_status |= DRM_JOB_STATUS_TIMEOUT;
> +		res = job->sched->ops->timedout_job(job);
> +		if (res == 0) {
> +			/* The job is out of the device.
> +			 */
> +			spin_lock(&sched->job_list_lock);
> +			list_move(&job->list, &sched->done_list);
> +			spin_unlock(&sched->job_list_lock);
>   
> -		/*
> -		 * Guilty job did complete and hence needs to be manually removed
> -		 * See drm_sched_stop doc.
> -		 */
> -		if (sched->free_guilty) {
> -			job->sched->ops->free_job(job);
> -			sched->free_guilty = false;
> +			wake_up_interruptible(&sched->done_wait_q);
> +		} else {
> +			/* The job needs more time.
> +			 */
> +			drm_sched_start_timeout(sched);
>   		}
> -	} else {
> -		spin_unlock(&sched->job_list_lock);
>   	}
> -
> -	spin_lock(&sched->job_list_lock);
> -	drm_sched_start_timeout(sched);
> -	spin_unlock(&sched->job_list_lock);
>   }
>   
>    /**
> @@ -511,15 +509,13 @@ void drm_sched_start(struct drm_gpu_scheduler *sched, bool full_recovery)
>   			else if (r)
>   				DRM_ERROR("fence add callback failed (%d)\n",
>   					  r);
> -		} else
> +		} else {
>   			drm_sched_job_done(s_job);
> +		}
>   	}
>   
> -	if (full_recovery) {
> -		spin_lock(&sched->job_list_lock);
> +	if (full_recovery)
>   		drm_sched_start_timeout(sched);
> -		spin_unlock(&sched->job_list_lock);
> -	}
>   
>   	kthread_unpark(sched->thread);
>   }
> @@ -667,47 +663,6 @@ drm_sched_select_entity(struct drm_gpu_scheduler *sched)
>   	return entity;
>   }
>   
> -/**
> - * drm_sched_get_cleanup_job - fetch the next finished job to be destroyed
> - *
> - * @sched: scheduler instance
> - *
> - * Returns the next finished job from the pending list (if there is one)
> - * ready for it to be destroyed.
> - */
> -static struct drm_sched_job *
> -drm_sched_get_cleanup_job(struct drm_gpu_scheduler *sched)
> -{
> -	struct drm_sched_job *job;
> -
> -	/*
> -	 * Don't destroy jobs while the timeout worker is running  OR thread
> -	 * is being parked and hence assumed to not touch pending_list
> -	 */
> -	if ((sched->timeout != MAX_SCHEDULE_TIMEOUT &&
> -	    !cancel_delayed_work(&sched->work_tdr)) ||
> -	    kthread_should_park())
> -		return NULL;
> -
> -	spin_lock(&sched->job_list_lock);
> -
> -	job = list_first_entry_or_null(&sched->pending_list,
> -				       struct drm_sched_job, list);
> -
> -	if (job && dma_fence_is_signaled(&job->s_fence->finished)) {
> -		/* remove job from pending_list */
> -		list_del_init(&job->list);
> -	} else {
> -		job = NULL;
> -		/* queue timeout for next job */
> -		drm_sched_start_timeout(sched);
> -	}
> -
> -	spin_unlock(&sched->job_list_lock);
> -
> -	return job;
> -}
> -
>   /**
>    * drm_sched_pick_best - Get a drm sched from a sched_list with the least load
>    * @sched_list: list of drm_gpu_schedulers
> @@ -761,6 +716,44 @@ static bool drm_sched_blocked(struct drm_gpu_scheduler *sched)
>   	return false;
>   }
>   
> +/**
> + * drm_sched_done - free done tasks
> + * @param: pointer to a scheduler instance
> + *
> + * Returns 0.
> + */
> +static int drm_sched_done(void *param)
> +{
> +	struct drm_gpu_scheduler *sched = param;
> +
> +	do {
> +		LIST_HEAD(done_q);
> +
> +		wait_event_interruptible(sched->done_wait_q,
> +					 kthread_should_stop() ||
> +					 !list_empty(&sched->done_list));
> +
> +		spin_lock(&sched->job_list_lock);
> +		list_splice_init(&sched->done_list, &done_q);
> +		spin_unlock(&sched->job_list_lock);
> +
> +		if (list_empty(&done_q))
> +			continue;
> +
> +		while (!list_empty(&done_q)) {
> +			struct drm_sched_job *job;
> +
> +			job = list_first_entry(&done_q,
> +					       struct drm_sched_job,
> +					       list);
> +			list_del_init(&job->list);
> +			sched->ops->free_job(job);
> +		}
> +	} while (!kthread_should_stop());
> +
> +	return 0;
> +}
> +
>   /**
>    * drm_sched_main - main scheduler thread
>    *
> @@ -770,7 +763,7 @@ static bool drm_sched_blocked(struct drm_gpu_scheduler *sched)
>    */
>   static int drm_sched_main(void *param)
>   {
> -	struct drm_gpu_scheduler *sched = (struct drm_gpu_scheduler *)param;
> +	struct drm_gpu_scheduler *sched = param;
>   	int r;
>   
>   	sched_set_fifo_low(current);
> @@ -780,20 +773,12 @@ static int drm_sched_main(void *param)
>   		struct drm_sched_fence *s_fence;
>   		struct drm_sched_job *sched_job;
>   		struct dma_fence *fence;
> -		struct drm_sched_job *cleanup_job = NULL;
>   
>   		wait_event_interruptible(sched->wake_up_worker,
> -					 (cleanup_job = drm_sched_get_cleanup_job(sched)) ||
>   					 (!drm_sched_blocked(sched) &&
>   					  (entity = drm_sched_select_entity(sched))) ||
>   					 kthread_should_stop());
>   
> -		if (cleanup_job) {
> -			sched->ops->free_job(cleanup_job);
> -			/* queue timeout for next job */
> -			drm_sched_start_timeout(sched);
> -		}
> -
>   		if (!entity)
>   			continue;
>   
> @@ -820,8 +805,7 @@ static int drm_sched_main(void *param)
>   			if (r == -ENOENT)
>   				drm_sched_job_done(sched_job);
>   			else if (r)
> -				DRM_ERROR("fence add callback failed (%d)\n",
> -					  r);
> +				DRM_ERROR("fence add callback failed (%d)\n", r);
>   			dma_fence_put(fence);
>   		} else {
>   			if (IS_ERR(fence))
> @@ -865,7 +849,9 @@ int drm_sched_init(struct drm_gpu_scheduler *sched,
>   
>   	init_waitqueue_head(&sched->wake_up_worker);
>   	init_waitqueue_head(&sched->job_scheduled);
> +	init_waitqueue_head(&sched->done_wait_q);
>   	INIT_LIST_HEAD(&sched->pending_list);
> +	INIT_LIST_HEAD(&sched->done_list);
>   	spin_lock_init(&sched->job_list_lock);
>   	atomic_set(&sched->hw_rq_count, 0);
>   	INIT_DELAYED_WORK(&sched->work_tdr, drm_sched_job_timedout);
> @@ -881,6 +867,21 @@ int drm_sched_init(struct drm_gpu_scheduler *sched,
>   		return ret;
>   	}
>   
> +	snprintf(sched->thread_done_name, DRM_THREAD_NAME_LEN, "%s%s",
> +		 sched->name, "-done");
> +	sched->thread_done_name[DRM_THREAD_NAME_LEN - 1] = '\0';
> +	sched->thread_done = kthread_run(drm_sched_done, sched,
> +					 sched->thread_done_name);
> +	if (IS_ERR(sched->thread_done)) {
> +		ret = kthread_stop(sched->thread);
> +		if (!ret) {
> +			/* free_kthread_struct(sched->thread); */
> +			sched->thread = NULL;
> +		}
> +		DRM_ERROR("Failed to start thread %s", sched->thread_done_name);
> +		return ret;
> +	}
> +
>   	sched->ready = true;
>   	return 0;
>   }
> diff --git a/include/drm/gpu_scheduler.h b/include/drm/gpu_scheduler.h
> index 3a5686c3b5e9..b282d6158b50 100644
> --- a/include/drm/gpu_scheduler.h
> +++ b/include/drm/gpu_scheduler.h
> @@ -169,6 +169,12 @@ struct drm_sched_fence {
>   
>   struct drm_sched_fence *to_drm_sched_fence(struct dma_fence *f);
>   
> +enum drm_job_status {
> +	DRM_JOB_STATUS_NONE    = 0 << 0,
> +	DRM_JOB_STATUS_DONE    = 1 << 0,
> +	DRM_JOB_STATUS_TIMEOUT = 1 << 1,
> +};
> +
>   /**
>    * struct drm_sched_job - A job to be run by an entity.
>    *
> @@ -198,6 +204,7 @@ struct drm_sched_job {
>   	uint64_t			id;
>   	atomic_t			karma;
>   	enum drm_sched_priority		s_priority;
> +	enum drm_job_status             job_status;
>   	struct drm_sched_entity         *entity;
>   	struct dma_fence_cb		cb;
>   };
> @@ -284,15 +291,22 @@ struct drm_gpu_scheduler {
>   	uint32_t			hw_submission_limit;
>   	long				timeout;
>   	const char			*name;
> +	char                            thread_done_name[DRM_THREAD_NAME_LEN];
> +
>   	struct drm_sched_rq		sched_rq[DRM_SCHED_PRIORITY_COUNT];
>   	wait_queue_head_t		wake_up_worker;
>   	wait_queue_head_t		job_scheduled;
> +	wait_queue_head_t               done_wait_q;
>   	atomic_t			hw_rq_count;
>   	atomic64_t			job_id_count;
>   	struct delayed_work		work_tdr;
>   	struct task_struct		*thread;
> +	struct task_struct		*thread_done;
> +
>   	struct list_head		pending_list;
> +	struct list_head                done_list;
>   	spinlock_t			job_list_lock;
> +
>   	int				hang_limit;
>   	atomic_t                        score;
>   	bool				ready;
> 

_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 125+ messages in thread

* Re: [PATCH 3/6] drm/scheduler: Job timeout handler returns status
  2020-11-25 11:04                                         ` Steven Price
@ 2020-11-25 11:15                                           ` Lucas Stach
  -1 siblings, 0 replies; 125+ messages in thread
From: Lucas Stach @ 2020-11-25 11:15 UTC (permalink / raw)
  To: Steven Price, Luben Tuikov, Andrey Grodzovsky,
	Christian König, Alexander Deucher
  Cc: Emily Deng, amd-gfx, dri-devel

Am Mittwoch, den 25.11.2020, 11:04 +0000 schrieb Steven Price:
> On 25/11/2020 03:17, Luben Tuikov wrote:
> > The job timeout handler now returns status
> > indicating back to the DRM layer whether the job
> > was successfully cancelled or whether more time
> > should be given to the job to complete.
> 
> I'm not sure I understand in what circumstances you would want to give 
> the job more time to complete. Could you expand on that?

On etnaviv we don't have the ability to preempt a running job, but we
can look at the GPU state to determine if it's still making progress
with the current job, so we want to extend the timeout in that case to
not kill a long running but valid job.

Regards,
Lucas

> One thing we're missing at the moment in Panfrost is the ability to 
> suspend ("soft stop" is the Mali jargon) a job and pick something else 
> to run. The propitiatory driver stack uses this to avoid timing out long 
> running jobs while still allowing other processes to have time on the 
> GPU. But this interface as it stands doesn't seem to provide that.
> 
> As the kernel test robot has already pointed out - you'll need to at the 
> very least update the other uses of this interface.
> 
> Steve
> 
> > Signed-off-by: Luben Tuikov <luben.tuikov@amd.com>
> > ---
> >   drivers/gpu/drm/amd/amdgpu/amdgpu_job.c |  6 ++++--
> >   include/drm/gpu_scheduler.h             | 13 ++++++++++---
> >   2 files changed, 14 insertions(+), 5 deletions(-)
> > 
> > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
> > index ff48101bab55..81b73790ecc6 100644
> > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
> > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
> > @@ -28,7 +28,7 @@
> >   #include "amdgpu.h"
> >   #include "amdgpu_trace.h"
> >   
> > -static void amdgpu_job_timedout(struct drm_sched_job *s_job)
> > +static int amdgpu_job_timedout(struct drm_sched_job *s_job)
> >   {
> >   	struct amdgpu_ring *ring = to_amdgpu_ring(s_job->sched);
> >   	struct amdgpu_job *job = to_amdgpu_job(s_job);
> > @@ -41,7 +41,7 @@ static void amdgpu_job_timedout(struct drm_sched_job *s_job)
> >   	    amdgpu_ring_soft_recovery(ring, job->vmid, s_job->s_fence->parent)) {
> >   		DRM_ERROR("ring %s timeout, but soft recovered\n",
> >   			  s_job->sched->name);
> > -		return;
> > +		return 0;
> >   	}
> >   
> >   	amdgpu_vm_get_task_info(ring->adev, job->pasid, &ti);
> > @@ -53,10 +53,12 @@ static void amdgpu_job_timedout(struct drm_sched_job *s_job)
> >   
> >   	if (amdgpu_device_should_recover_gpu(ring->adev)) {
> >   		amdgpu_device_gpu_recover(ring->adev, job);
> > +		return 0;
> >   	} else {
> >   		drm_sched_suspend_timeout(&ring->sched);
> >   		if (amdgpu_sriov_vf(adev))
> >   			adev->virt.tdr_debug = true;
> > +		return 1;
> >   	}
> >   }
> >   
> > diff --git a/include/drm/gpu_scheduler.h b/include/drm/gpu_scheduler.h
> > index 2e0c368e19f6..61f7121e1c19 100644
> > --- a/include/drm/gpu_scheduler.h
> > +++ b/include/drm/gpu_scheduler.h
> > @@ -230,10 +230,17 @@ struct drm_sched_backend_ops {
> >   	struct dma_fence *(*run_job)(struct drm_sched_job *sched_job);
> >   
> >   	/**
> > -         * @timedout_job: Called when a job has taken too long to execute,
> > -         * to trigger GPU recovery.
> > +	 * @timedout_job: Called when a job has taken too long to execute,
> > +	 * to trigger GPU recovery.
> > +	 *
> > +	 * Return 0, if the job has been aborted successfully and will
> > +	 * never be heard of from the device. Return non-zero if the
> > +	 * job wasn't able to be aborted, i.e. if more time should be
> > +	 * given to this job. The result is not "bool" as this
> > +	 * function is not a predicate, although its result may seem
> > +	 * as one.
> >   	 */
> > -	void (*timedout_job)(struct drm_sched_job *sched_job);
> > +	int (*timedout_job)(struct drm_sched_job *sched_job);
> >   
> >   	/**
> >            * @free_job: Called once the job's finished fence has been signaled
> > 

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 125+ messages in thread

* Re: [PATCH 3/6] drm/scheduler: Job timeout handler returns status
@ 2020-11-25 11:15                                           ` Lucas Stach
  0 siblings, 0 replies; 125+ messages in thread
From: Lucas Stach @ 2020-11-25 11:15 UTC (permalink / raw)
  To: Steven Price, Luben Tuikov, Andrey Grodzovsky,
	Christian König, Alexander Deucher
  Cc: Emily Deng, amd-gfx, dri-devel

Am Mittwoch, den 25.11.2020, 11:04 +0000 schrieb Steven Price:
> On 25/11/2020 03:17, Luben Tuikov wrote:
> > The job timeout handler now returns status
> > indicating back to the DRM layer whether the job
> > was successfully cancelled or whether more time
> > should be given to the job to complete.
> 
> I'm not sure I understand in what circumstances you would want to give 
> the job more time to complete. Could you expand on that?

On etnaviv we don't have the ability to preempt a running job, but we
can look at the GPU state to determine if it's still making progress
with the current job, so we want to extend the timeout in that case to
not kill a long running but valid job.

Regards,
Lucas

> One thing we're missing at the moment in Panfrost is the ability to 
> suspend ("soft stop" is the Mali jargon) a job and pick something else 
> to run. The propitiatory driver stack uses this to avoid timing out long 
> running jobs while still allowing other processes to have time on the 
> GPU. But this interface as it stands doesn't seem to provide that.
> 
> As the kernel test robot has already pointed out - you'll need to at the 
> very least update the other uses of this interface.
> 
> Steve
> 
> > Signed-off-by: Luben Tuikov <luben.tuikov@amd.com>
> > ---
> >   drivers/gpu/drm/amd/amdgpu/amdgpu_job.c |  6 ++++--
> >   include/drm/gpu_scheduler.h             | 13 ++++++++++---
> >   2 files changed, 14 insertions(+), 5 deletions(-)
> > 
> > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
> > index ff48101bab55..81b73790ecc6 100644
> > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
> > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
> > @@ -28,7 +28,7 @@
> >   #include "amdgpu.h"
> >   #include "amdgpu_trace.h"
> >   
> > -static void amdgpu_job_timedout(struct drm_sched_job *s_job)
> > +static int amdgpu_job_timedout(struct drm_sched_job *s_job)
> >   {
> >   	struct amdgpu_ring *ring = to_amdgpu_ring(s_job->sched);
> >   	struct amdgpu_job *job = to_amdgpu_job(s_job);
> > @@ -41,7 +41,7 @@ static void amdgpu_job_timedout(struct drm_sched_job *s_job)
> >   	    amdgpu_ring_soft_recovery(ring, job->vmid, s_job->s_fence->parent)) {
> >   		DRM_ERROR("ring %s timeout, but soft recovered\n",
> >   			  s_job->sched->name);
> > -		return;
> > +		return 0;
> >   	}
> >   
> >   	amdgpu_vm_get_task_info(ring->adev, job->pasid, &ti);
> > @@ -53,10 +53,12 @@ static void amdgpu_job_timedout(struct drm_sched_job *s_job)
> >   
> >   	if (amdgpu_device_should_recover_gpu(ring->adev)) {
> >   		amdgpu_device_gpu_recover(ring->adev, job);
> > +		return 0;
> >   	} else {
> >   		drm_sched_suspend_timeout(&ring->sched);
> >   		if (amdgpu_sriov_vf(adev))
> >   			adev->virt.tdr_debug = true;
> > +		return 1;
> >   	}
> >   }
> >   
> > diff --git a/include/drm/gpu_scheduler.h b/include/drm/gpu_scheduler.h
> > index 2e0c368e19f6..61f7121e1c19 100644
> > --- a/include/drm/gpu_scheduler.h
> > +++ b/include/drm/gpu_scheduler.h
> > @@ -230,10 +230,17 @@ struct drm_sched_backend_ops {
> >   	struct dma_fence *(*run_job)(struct drm_sched_job *sched_job);
> >   
> >   	/**
> > -         * @timedout_job: Called when a job has taken too long to execute,
> > -         * to trigger GPU recovery.
> > +	 * @timedout_job: Called when a job has taken too long to execute,
> > +	 * to trigger GPU recovery.
> > +	 *
> > +	 * Return 0, if the job has been aborted successfully and will
> > +	 * never be heard of from the device. Return non-zero if the
> > +	 * job wasn't able to be aborted, i.e. if more time should be
> > +	 * given to this job. The result is not "bool" as this
> > +	 * function is not a predicate, although its result may seem
> > +	 * as one.
> >   	 */
> > -	void (*timedout_job)(struct drm_sched_job *sched_job);
> > +	int (*timedout_job)(struct drm_sched_job *sched_job);
> >   
> >   	/**
> >            * @free_job: Called once the job's finished fence has been signaled
> > 

_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 125+ messages in thread

* Re: [PATCH 3/6] drm/scheduler: Job timeout handler returns status
  2020-11-25 11:15                                           ` Lucas Stach
@ 2020-11-25 11:22                                             ` Steven Price
  -1 siblings, 0 replies; 125+ messages in thread
From: Steven Price @ 2020-11-25 11:22 UTC (permalink / raw)
  To: Lucas Stach, Luben Tuikov, Andrey Grodzovsky,
	Christian König, Alexander Deucher
  Cc: Emily Deng, amd-gfx, dri-devel

On 25/11/2020 11:15, Lucas Stach wrote:
> Am Mittwoch, den 25.11.2020, 11:04 +0000 schrieb Steven Price:
>> On 25/11/2020 03:17, Luben Tuikov wrote:
>>> The job timeout handler now returns status
>>> indicating back to the DRM layer whether the job
>>> was successfully cancelled or whether more time
>>> should be given to the job to complete.
>>
>> I'm not sure I understand in what circumstances you would want to give
>> the job more time to complete. Could you expand on that?
> 
> On etnaviv we don't have the ability to preempt a running job, but we
> can look at the GPU state to determine if it's still making progress
> with the current job, so we want to extend the timeout in that case to
> not kill a long running but valid job.

Ok, fair enough. Although from my experience (on Mali) jobs very rarely 
"get stuck" it's just that their run time can be excessive[1] causing 
other processes to not make forward progress. So I'd expect the timeout 
to be set based on how long a job can run before you need to stop it to 
allow other processes to run their jobs.

But I'm not familiar with etnaviv so perhaps stuck jobs are actually a 
thing there.

Thanks,

Steve

[1] Also on Mali it's quite possible to create an infinite duration job 
which appears to be making forward progress, so in that case our measure 
of progress isn't useful against these malicious jobs.

> Regards,
> Lucas
> 
>> One thing we're missing at the moment in Panfrost is the ability to
>> suspend ("soft stop" is the Mali jargon) a job and pick something else
>> to run. The propitiatory driver stack uses this to avoid timing out long
>> running jobs while still allowing other processes to have time on the
>> GPU. But this interface as it stands doesn't seem to provide that.
>>
>> As the kernel test robot has already pointed out - you'll need to at the
>> very least update the other uses of this interface.
>>
>> Steve
>>
>>> Signed-off-by: Luben Tuikov <luben.tuikov@amd.com>
>>> ---
>>>    drivers/gpu/drm/amd/amdgpu/amdgpu_job.c |  6 ++++--
>>>    include/drm/gpu_scheduler.h             | 13 ++++++++++---
>>>    2 files changed, 14 insertions(+), 5 deletions(-)
>>>
>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
>>> index ff48101bab55..81b73790ecc6 100644
>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
>>> @@ -28,7 +28,7 @@
>>>    #include "amdgpu.h"
>>>    #include "amdgpu_trace.h"
>>>    
>>> -static void amdgpu_job_timedout(struct drm_sched_job *s_job)
>>> +static int amdgpu_job_timedout(struct drm_sched_job *s_job)
>>>    {
>>>    	struct amdgpu_ring *ring = to_amdgpu_ring(s_job->sched);
>>>    	struct amdgpu_job *job = to_amdgpu_job(s_job);
>>> @@ -41,7 +41,7 @@ static void amdgpu_job_timedout(struct drm_sched_job *s_job)
>>>    	    amdgpu_ring_soft_recovery(ring, job->vmid, s_job->s_fence->parent)) {
>>>    		DRM_ERROR("ring %s timeout, but soft recovered\n",
>>>    			  s_job->sched->name);
>>> -		return;
>>> +		return 0;
>>>    	}
>>>    
>>>    	amdgpu_vm_get_task_info(ring->adev, job->pasid, &ti);
>>> @@ -53,10 +53,12 @@ static void amdgpu_job_timedout(struct drm_sched_job *s_job)
>>>    
>>>    	if (amdgpu_device_should_recover_gpu(ring->adev)) {
>>>    		amdgpu_device_gpu_recover(ring->adev, job);
>>> +		return 0;
>>>    	} else {
>>>    		drm_sched_suspend_timeout(&ring->sched);
>>>    		if (amdgpu_sriov_vf(adev))
>>>    			adev->virt.tdr_debug = true;
>>> +		return 1;
>>>    	}
>>>    }
>>>    
>>> diff --git a/include/drm/gpu_scheduler.h b/include/drm/gpu_scheduler.h
>>> index 2e0c368e19f6..61f7121e1c19 100644
>>> --- a/include/drm/gpu_scheduler.h
>>> +++ b/include/drm/gpu_scheduler.h
>>> @@ -230,10 +230,17 @@ struct drm_sched_backend_ops {
>>>    	struct dma_fence *(*run_job)(struct drm_sched_job *sched_job);
>>>    
>>>    	/**
>>> -         * @timedout_job: Called when a job has taken too long to execute,
>>> -         * to trigger GPU recovery.
>>> +	 * @timedout_job: Called when a job has taken too long to execute,
>>> +	 * to trigger GPU recovery.
>>> +	 *
>>> +	 * Return 0, if the job has been aborted successfully and will
>>> +	 * never be heard of from the device. Return non-zero if the
>>> +	 * job wasn't able to be aborted, i.e. if more time should be
>>> +	 * given to this job. The result is not "bool" as this
>>> +	 * function is not a predicate, although its result may seem
>>> +	 * as one.
>>>    	 */
>>> -	void (*timedout_job)(struct drm_sched_job *sched_job);
>>> +	int (*timedout_job)(struct drm_sched_job *sched_job);
>>>    
>>>    	/**
>>>             * @free_job: Called once the job's finished fence has been signaled
>>>
> 

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 125+ messages in thread

* Re: [PATCH 3/6] drm/scheduler: Job timeout handler returns status
@ 2020-11-25 11:22                                             ` Steven Price
  0 siblings, 0 replies; 125+ messages in thread
From: Steven Price @ 2020-11-25 11:22 UTC (permalink / raw)
  To: Lucas Stach, Luben Tuikov, Andrey Grodzovsky,
	Christian König, Alexander Deucher
  Cc: Emily Deng, amd-gfx, dri-devel

On 25/11/2020 11:15, Lucas Stach wrote:
> Am Mittwoch, den 25.11.2020, 11:04 +0000 schrieb Steven Price:
>> On 25/11/2020 03:17, Luben Tuikov wrote:
>>> The job timeout handler now returns status
>>> indicating back to the DRM layer whether the job
>>> was successfully cancelled or whether more time
>>> should be given to the job to complete.
>>
>> I'm not sure I understand in what circumstances you would want to give
>> the job more time to complete. Could you expand on that?
> 
> On etnaviv we don't have the ability to preempt a running job, but we
> can look at the GPU state to determine if it's still making progress
> with the current job, so we want to extend the timeout in that case to
> not kill a long running but valid job.

Ok, fair enough. Although from my experience (on Mali) jobs very rarely 
"get stuck" it's just that their run time can be excessive[1] causing 
other processes to not make forward progress. So I'd expect the timeout 
to be set based on how long a job can run before you need to stop it to 
allow other processes to run their jobs.

But I'm not familiar with etnaviv so perhaps stuck jobs are actually a 
thing there.

Thanks,

Steve

[1] Also on Mali it's quite possible to create an infinite duration job 
which appears to be making forward progress, so in that case our measure 
of progress isn't useful against these malicious jobs.

> Regards,
> Lucas
> 
>> One thing we're missing at the moment in Panfrost is the ability to
>> suspend ("soft stop" is the Mali jargon) a job and pick something else
>> to run. The propitiatory driver stack uses this to avoid timing out long
>> running jobs while still allowing other processes to have time on the
>> GPU. But this interface as it stands doesn't seem to provide that.
>>
>> As the kernel test robot has already pointed out - you'll need to at the
>> very least update the other uses of this interface.
>>
>> Steve
>>
>>> Signed-off-by: Luben Tuikov <luben.tuikov@amd.com>
>>> ---
>>>    drivers/gpu/drm/amd/amdgpu/amdgpu_job.c |  6 ++++--
>>>    include/drm/gpu_scheduler.h             | 13 ++++++++++---
>>>    2 files changed, 14 insertions(+), 5 deletions(-)
>>>
>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
>>> index ff48101bab55..81b73790ecc6 100644
>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
>>> @@ -28,7 +28,7 @@
>>>    #include "amdgpu.h"
>>>    #include "amdgpu_trace.h"
>>>    
>>> -static void amdgpu_job_timedout(struct drm_sched_job *s_job)
>>> +static int amdgpu_job_timedout(struct drm_sched_job *s_job)
>>>    {
>>>    	struct amdgpu_ring *ring = to_amdgpu_ring(s_job->sched);
>>>    	struct amdgpu_job *job = to_amdgpu_job(s_job);
>>> @@ -41,7 +41,7 @@ static void amdgpu_job_timedout(struct drm_sched_job *s_job)
>>>    	    amdgpu_ring_soft_recovery(ring, job->vmid, s_job->s_fence->parent)) {
>>>    		DRM_ERROR("ring %s timeout, but soft recovered\n",
>>>    			  s_job->sched->name);
>>> -		return;
>>> +		return 0;
>>>    	}
>>>    
>>>    	amdgpu_vm_get_task_info(ring->adev, job->pasid, &ti);
>>> @@ -53,10 +53,12 @@ static void amdgpu_job_timedout(struct drm_sched_job *s_job)
>>>    
>>>    	if (amdgpu_device_should_recover_gpu(ring->adev)) {
>>>    		amdgpu_device_gpu_recover(ring->adev, job);
>>> +		return 0;
>>>    	} else {
>>>    		drm_sched_suspend_timeout(&ring->sched);
>>>    		if (amdgpu_sriov_vf(adev))
>>>    			adev->virt.tdr_debug = true;
>>> +		return 1;
>>>    	}
>>>    }
>>>    
>>> diff --git a/include/drm/gpu_scheduler.h b/include/drm/gpu_scheduler.h
>>> index 2e0c368e19f6..61f7121e1c19 100644
>>> --- a/include/drm/gpu_scheduler.h
>>> +++ b/include/drm/gpu_scheduler.h
>>> @@ -230,10 +230,17 @@ struct drm_sched_backend_ops {
>>>    	struct dma_fence *(*run_job)(struct drm_sched_job *sched_job);
>>>    
>>>    	/**
>>> -         * @timedout_job: Called when a job has taken too long to execute,
>>> -         * to trigger GPU recovery.
>>> +	 * @timedout_job: Called when a job has taken too long to execute,
>>> +	 * to trigger GPU recovery.
>>> +	 *
>>> +	 * Return 0, if the job has been aborted successfully and will
>>> +	 * never be heard of from the device. Return non-zero if the
>>> +	 * job wasn't able to be aborted, i.e. if more time should be
>>> +	 * given to this job. The result is not "bool" as this
>>> +	 * function is not a predicate, although its result may seem
>>> +	 * as one.
>>>    	 */
>>> -	void (*timedout_job)(struct drm_sched_job *sched_job);
>>> +	int (*timedout_job)(struct drm_sched_job *sched_job);
>>>    
>>>    	/**
>>>             * @free_job: Called once the job's finished fence has been signaled
>>>
> 

_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 125+ messages in thread

* Re: [PATCH 3/6] drm/scheduler: Job timeout handler returns status
  2020-11-25 11:22                                             ` Steven Price
@ 2020-11-25 11:47                                               ` Lucas Stach
  -1 siblings, 0 replies; 125+ messages in thread
From: Lucas Stach @ 2020-11-25 11:47 UTC (permalink / raw)
  To: Steven Price, Luben Tuikov, Andrey Grodzovsky,
	Christian König, Alexander Deucher
  Cc: Emily Deng, amd-gfx, dri-devel

Am Mittwoch, den 25.11.2020, 11:22 +0000 schrieb Steven Price:
> On 25/11/2020 11:15, Lucas Stach wrote:
> > Am Mittwoch, den 25.11.2020, 11:04 +0000 schrieb Steven Price:
> > > On 25/11/2020 03:17, Luben Tuikov wrote:
> > > > The job timeout handler now returns status
> > > > indicating back to the DRM layer whether the job
> > > > was successfully cancelled or whether more time
> > > > should be given to the job to complete.
> > > 
> > > I'm not sure I understand in what circumstances you would want to give
> > > the job more time to complete. Could you expand on that?
> > 
> > On etnaviv we don't have the ability to preempt a running job, but we
> > can look at the GPU state to determine if it's still making progress
> > with the current job, so we want to extend the timeout in that case to
> > not kill a long running but valid job.
> 
> Ok, fair enough. Although from my experience (on Mali) jobs very rarely 
> "get stuck" it's just that their run time can be excessive[1] causing 
> other processes to not make forward progress. So I'd expect the timeout 
> to be set based on how long a job can run before you need to stop it to 
> allow other processes to run their jobs.

Yea, we might want to kill the job eventually, but people tend to get
very angry if their use-case gets broken just because the userspace
driver manages to put enough blits in one job to run over the 500ms
timeout we allow for a job and the kernel then just hard-kills the job.

In an ideal world we would just preempt the job and allow something
else to run for a while, but without proper preemption support in HW
that's not an option right now.

> But I'm not familiar with etnaviv so perhaps stuck jobs are actually a 
> thing there.

It happens from time to time when our understanding of the HW isn't
complete and the userspace driver manages to create command streams
with missing semaphores between HW engines. ;)

Regards,
Lucas

> Thanks,
> 
> Steve
> 
> [1] Also on Mali it's quite possible to create an infinite duration job 
> which appears to be making forward progress, so in that case our measure 
> of progress isn't useful against these malicious jobs.
> 
> > Regards,
> > Lucas
> > 
> > > One thing we're missing at the moment in Panfrost is the ability to
> > > suspend ("soft stop" is the Mali jargon) a job and pick something else
> > > to run. The propitiatory driver stack uses this to avoid timing out long
> > > running jobs while still allowing other processes to have time on the
> > > GPU. But this interface as it stands doesn't seem to provide that.
> > > 
> > > As the kernel test robot has already pointed out - you'll need to at the
> > > very least update the other uses of this interface.
> > > 
> > > Steve
> > > 
> > > > Signed-off-by: Luben Tuikov <luben.tuikov@amd.com>
> > > > ---
> > > >    drivers/gpu/drm/amd/amdgpu/amdgpu_job.c |  6 ++++--
> > > >    include/drm/gpu_scheduler.h             | 13 ++++++++++---
> > > >    2 files changed, 14 insertions(+), 5 deletions(-)
> > > > 
> > > > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
> > > > index ff48101bab55..81b73790ecc6 100644
> > > > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
> > > > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
> > > > @@ -28,7 +28,7 @@
> > > >    #include "amdgpu.h"
> > > >    #include "amdgpu_trace.h"
> > > >    
> > > > -static void amdgpu_job_timedout(struct drm_sched_job *s_job)
> > > > +static int amdgpu_job_timedout(struct drm_sched_job *s_job)
> > > >    {
> > > >    	struct amdgpu_ring *ring = to_amdgpu_ring(s_job->sched);
> > > >    	struct amdgpu_job *job = to_amdgpu_job(s_job);
> > > > @@ -41,7 +41,7 @@ static void amdgpu_job_timedout(struct drm_sched_job *s_job)
> > > >    	    amdgpu_ring_soft_recovery(ring, job->vmid, s_job->s_fence->parent)) {
> > > >    		DRM_ERROR("ring %s timeout, but soft recovered\n",
> > > >    			  s_job->sched->name);
> > > > -		return;
> > > > +		return 0;
> > > >    	}
> > > >    
> > > >    	amdgpu_vm_get_task_info(ring->adev, job->pasid, &ti);
> > > > @@ -53,10 +53,12 @@ static void amdgpu_job_timedout(struct drm_sched_job *s_job)
> > > >    
> > > >    	if (amdgpu_device_should_recover_gpu(ring->adev)) {
> > > >    		amdgpu_device_gpu_recover(ring->adev, job);
> > > > +		return 0;
> > > >    	} else {
> > > >    		drm_sched_suspend_timeout(&ring->sched);
> > > >    		if (amdgpu_sriov_vf(adev))
> > > >    			adev->virt.tdr_debug = true;
> > > > +		return 1;
> > > >    	}
> > > >    }
> > > >    
> > > > diff --git a/include/drm/gpu_scheduler.h b/include/drm/gpu_scheduler.h
> > > > index 2e0c368e19f6..61f7121e1c19 100644
> > > > --- a/include/drm/gpu_scheduler.h
> > > > +++ b/include/drm/gpu_scheduler.h
> > > > @@ -230,10 +230,17 @@ struct drm_sched_backend_ops {
> > > >    	struct dma_fence *(*run_job)(struct drm_sched_job *sched_job);
> > > >    
> > > >    	/**
> > > > -         * @timedout_job: Called when a job has taken too long to execute,
> > > > -         * to trigger GPU recovery.
> > > > +	 * @timedout_job: Called when a job has taken too long to execute,
> > > > +	 * to trigger GPU recovery.
> > > > +	 *
> > > > +	 * Return 0, if the job has been aborted successfully and will
> > > > +	 * never be heard of from the device. Return non-zero if the
> > > > +	 * job wasn't able to be aborted, i.e. if more time should be
> > > > +	 * given to this job. The result is not "bool" as this
> > > > +	 * function is not a predicate, although its result may seem
> > > > +	 * as one.
> > > >    	 */
> > > > -	void (*timedout_job)(struct drm_sched_job *sched_job);
> > > > +	int (*timedout_job)(struct drm_sched_job *sched_job);
> > > >    
> > > >    	/**
> > > >             * @free_job: Called once the job's finished fence has been signaled
> > > > 

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 125+ messages in thread

* Re: [PATCH 3/6] drm/scheduler: Job timeout handler returns status
@ 2020-11-25 11:47                                               ` Lucas Stach
  0 siblings, 0 replies; 125+ messages in thread
From: Lucas Stach @ 2020-11-25 11:47 UTC (permalink / raw)
  To: Steven Price, Luben Tuikov, Andrey Grodzovsky,
	Christian König, Alexander Deucher
  Cc: Emily Deng, amd-gfx, dri-devel

Am Mittwoch, den 25.11.2020, 11:22 +0000 schrieb Steven Price:
> On 25/11/2020 11:15, Lucas Stach wrote:
> > Am Mittwoch, den 25.11.2020, 11:04 +0000 schrieb Steven Price:
> > > On 25/11/2020 03:17, Luben Tuikov wrote:
> > > > The job timeout handler now returns status
> > > > indicating back to the DRM layer whether the job
> > > > was successfully cancelled or whether more time
> > > > should be given to the job to complete.
> > > 
> > > I'm not sure I understand in what circumstances you would want to give
> > > the job more time to complete. Could you expand on that?
> > 
> > On etnaviv we don't have the ability to preempt a running job, but we
> > can look at the GPU state to determine if it's still making progress
> > with the current job, so we want to extend the timeout in that case to
> > not kill a long running but valid job.
> 
> Ok, fair enough. Although from my experience (on Mali) jobs very rarely 
> "get stuck" it's just that their run time can be excessive[1] causing 
> other processes to not make forward progress. So I'd expect the timeout 
> to be set based on how long a job can run before you need to stop it to 
> allow other processes to run their jobs.

Yea, we might want to kill the job eventually, but people tend to get
very angry if their use-case gets broken just because the userspace
driver manages to put enough blits in one job to run over the 500ms
timeout we allow for a job and the kernel then just hard-kills the job.

In an ideal world we would just preempt the job and allow something
else to run for a while, but without proper preemption support in HW
that's not an option right now.

> But I'm not familiar with etnaviv so perhaps stuck jobs are actually a 
> thing there.

It happens from time to time when our understanding of the HW isn't
complete and the userspace driver manages to create command streams
with missing semaphores between HW engines. ;)

Regards,
Lucas

> Thanks,
> 
> Steve
> 
> [1] Also on Mali it's quite possible to create an infinite duration job 
> which appears to be making forward progress, so in that case our measure 
> of progress isn't useful against these malicious jobs.
> 
> > Regards,
> > Lucas
> > 
> > > One thing we're missing at the moment in Panfrost is the ability to
> > > suspend ("soft stop" is the Mali jargon) a job and pick something else
> > > to run. The propitiatory driver stack uses this to avoid timing out long
> > > running jobs while still allowing other processes to have time on the
> > > GPU. But this interface as it stands doesn't seem to provide that.
> > > 
> > > As the kernel test robot has already pointed out - you'll need to at the
> > > very least update the other uses of this interface.
> > > 
> > > Steve
> > > 
> > > > Signed-off-by: Luben Tuikov <luben.tuikov@amd.com>
> > > > ---
> > > >    drivers/gpu/drm/amd/amdgpu/amdgpu_job.c |  6 ++++--
> > > >    include/drm/gpu_scheduler.h             | 13 ++++++++++---
> > > >    2 files changed, 14 insertions(+), 5 deletions(-)
> > > > 
> > > > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
> > > > index ff48101bab55..81b73790ecc6 100644
> > > > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
> > > > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
> > > > @@ -28,7 +28,7 @@
> > > >    #include "amdgpu.h"
> > > >    #include "amdgpu_trace.h"
> > > >    
> > > > -static void amdgpu_job_timedout(struct drm_sched_job *s_job)
> > > > +static int amdgpu_job_timedout(struct drm_sched_job *s_job)
> > > >    {
> > > >    	struct amdgpu_ring *ring = to_amdgpu_ring(s_job->sched);
> > > >    	struct amdgpu_job *job = to_amdgpu_job(s_job);
> > > > @@ -41,7 +41,7 @@ static void amdgpu_job_timedout(struct drm_sched_job *s_job)
> > > >    	    amdgpu_ring_soft_recovery(ring, job->vmid, s_job->s_fence->parent)) {
> > > >    		DRM_ERROR("ring %s timeout, but soft recovered\n",
> > > >    			  s_job->sched->name);
> > > > -		return;
> > > > +		return 0;
> > > >    	}
> > > >    
> > > >    	amdgpu_vm_get_task_info(ring->adev, job->pasid, &ti);
> > > > @@ -53,10 +53,12 @@ static void amdgpu_job_timedout(struct drm_sched_job *s_job)
> > > >    
> > > >    	if (amdgpu_device_should_recover_gpu(ring->adev)) {
> > > >    		amdgpu_device_gpu_recover(ring->adev, job);
> > > > +		return 0;
> > > >    	} else {
> > > >    		drm_sched_suspend_timeout(&ring->sched);
> > > >    		if (amdgpu_sriov_vf(adev))
> > > >    			adev->virt.tdr_debug = true;
> > > > +		return 1;
> > > >    	}
> > > >    }
> > > >    
> > > > diff --git a/include/drm/gpu_scheduler.h b/include/drm/gpu_scheduler.h
> > > > index 2e0c368e19f6..61f7121e1c19 100644
> > > > --- a/include/drm/gpu_scheduler.h
> > > > +++ b/include/drm/gpu_scheduler.h
> > > > @@ -230,10 +230,17 @@ struct drm_sched_backend_ops {
> > > >    	struct dma_fence *(*run_job)(struct drm_sched_job *sched_job);
> > > >    
> > > >    	/**
> > > > -         * @timedout_job: Called when a job has taken too long to execute,
> > > > -         * to trigger GPU recovery.
> > > > +	 * @timedout_job: Called when a job has taken too long to execute,
> > > > +	 * to trigger GPU recovery.
> > > > +	 *
> > > > +	 * Return 0, if the job has been aborted successfully and will
> > > > +	 * never be heard of from the device. Return non-zero if the
> > > > +	 * job wasn't able to be aborted, i.e. if more time should be
> > > > +	 * given to this job. The result is not "bool" as this
> > > > +	 * function is not a predicate, although its result may seem
> > > > +	 * as one.
> > > >    	 */
> > > > -	void (*timedout_job)(struct drm_sched_job *sched_job);
> > > > +	int (*timedout_job)(struct drm_sched_job *sched_job);
> > > >    
> > > >    	/**
> > > >             * @free_job: Called once the job's finished fence has been signaled
> > > > 

_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 125+ messages in thread

* Re: [PATCH 3/6] drm/scheduler: Job timeout handler returns status
  2020-11-25 11:04                                         ` Steven Price
@ 2020-11-25 12:41                                           ` Christian König
  -1 siblings, 0 replies; 125+ messages in thread
From: Christian König @ 2020-11-25 12:41 UTC (permalink / raw)
  To: Steven Price, Luben Tuikov, Andrey Grodzovsky, Lucas Stach,
	Alexander Deucher
  Cc: Emily Deng, amd-gfx, dri-devel

Am 25.11.20 um 12:04 schrieb Steven Price:
> On 25/11/2020 03:17, Luben Tuikov wrote:
>> The job timeout handler now returns status
>> indicating back to the DRM layer whether the job
>> was successfully cancelled or whether more time
>> should be given to the job to complete.
>
> I'm not sure I understand in what circumstances you would want to give 
> the job more time to complete. Could you expand on that?
>
> One thing we're missing at the moment in Panfrost is the ability to 
> suspend ("soft stop" is the Mali jargon) a job and pick something else 
> to run. The propitiatory driver stack uses this to avoid timing out 
> long running jobs while still allowing other processes to have time on 
> the GPU. But this interface as it stands doesn't seem to provide that.

On AMD hardware we call this IB preemption and it is indeed not handled 
very well by the scheduler at the moment.

See how the amdgpu code messes with the preempted IBs to restart them 
for example.

Christian.

>
> As the kernel test robot has already pointed out - you'll need to at 
> the very least update the other uses of this interface.
>
> Steve
>
>>
>> Signed-off-by: Luben Tuikov <luben.tuikov@amd.com>
>> ---
>>   drivers/gpu/drm/amd/amdgpu/amdgpu_job.c |  6 ++++--
>>   include/drm/gpu_scheduler.h             | 13 ++++++++++---
>>   2 files changed, 14 insertions(+), 5 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c 
>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
>> index ff48101bab55..81b73790ecc6 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
>> @@ -28,7 +28,7 @@
>>   #include "amdgpu.h"
>>   #include "amdgpu_trace.h"
>>   -static void amdgpu_job_timedout(struct drm_sched_job *s_job)
>> +static int amdgpu_job_timedout(struct drm_sched_job *s_job)
>>   {
>>       struct amdgpu_ring *ring = to_amdgpu_ring(s_job->sched);
>>       struct amdgpu_job *job = to_amdgpu_job(s_job);
>> @@ -41,7 +41,7 @@ static void amdgpu_job_timedout(struct 
>> drm_sched_job *s_job)
>>           amdgpu_ring_soft_recovery(ring, job->vmid, 
>> s_job->s_fence->parent)) {
>>           DRM_ERROR("ring %s timeout, but soft recovered\n",
>>                 s_job->sched->name);
>> -        return;
>> +        return 0;
>>       }
>>         amdgpu_vm_get_task_info(ring->adev, job->pasid, &ti);
>> @@ -53,10 +53,12 @@ static void amdgpu_job_timedout(struct 
>> drm_sched_job *s_job)
>>         if (amdgpu_device_should_recover_gpu(ring->adev)) {
>>           amdgpu_device_gpu_recover(ring->adev, job);
>> +        return 0;
>>       } else {
>>           drm_sched_suspend_timeout(&ring->sched);
>>           if (amdgpu_sriov_vf(adev))
>>               adev->virt.tdr_debug = true;
>> +        return 1;
>>       }
>>   }
>>   diff --git a/include/drm/gpu_scheduler.h b/include/drm/gpu_scheduler.h
>> index 2e0c368e19f6..61f7121e1c19 100644
>> --- a/include/drm/gpu_scheduler.h
>> +++ b/include/drm/gpu_scheduler.h
>> @@ -230,10 +230,17 @@ struct drm_sched_backend_ops {
>>       struct dma_fence *(*run_job)(struct drm_sched_job *sched_job);
>>         /**
>> -         * @timedout_job: Called when a job has taken too long to 
>> execute,
>> -         * to trigger GPU recovery.
>> +     * @timedout_job: Called when a job has taken too long to execute,
>> +     * to trigger GPU recovery.
>> +     *
>> +     * Return 0, if the job has been aborted successfully and will
>> +     * never be heard of from the device. Return non-zero if the
>> +     * job wasn't able to be aborted, i.e. if more time should be
>> +     * given to this job. The result is not "bool" as this
>> +     * function is not a predicate, although its result may seem
>> +     * as one.
>>        */
>> -    void (*timedout_job)(struct drm_sched_job *sched_job);
>> +    int (*timedout_job)(struct drm_sched_job *sched_job);
>>         /**
>>            * @free_job: Called once the job's finished fence has been 
>> signaled
>>
>

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 125+ messages in thread

* Re: [PATCH 3/6] drm/scheduler: Job timeout handler returns status
@ 2020-11-25 12:41                                           ` Christian König
  0 siblings, 0 replies; 125+ messages in thread
From: Christian König @ 2020-11-25 12:41 UTC (permalink / raw)
  To: Steven Price, Luben Tuikov, Andrey Grodzovsky, Lucas Stach,
	Alexander Deucher
  Cc: Emily Deng, amd-gfx, dri-devel

Am 25.11.20 um 12:04 schrieb Steven Price:
> On 25/11/2020 03:17, Luben Tuikov wrote:
>> The job timeout handler now returns status
>> indicating back to the DRM layer whether the job
>> was successfully cancelled or whether more time
>> should be given to the job to complete.
>
> I'm not sure I understand in what circumstances you would want to give 
> the job more time to complete. Could you expand on that?
>
> One thing we're missing at the moment in Panfrost is the ability to 
> suspend ("soft stop" is the Mali jargon) a job and pick something else 
> to run. The propitiatory driver stack uses this to avoid timing out 
> long running jobs while still allowing other processes to have time on 
> the GPU. But this interface as it stands doesn't seem to provide that.

On AMD hardware we call this IB preemption and it is indeed not handled 
very well by the scheduler at the moment.

See how the amdgpu code messes with the preempted IBs to restart them 
for example.

Christian.

>
> As the kernel test robot has already pointed out - you'll need to at 
> the very least update the other uses of this interface.
>
> Steve
>
>>
>> Signed-off-by: Luben Tuikov <luben.tuikov@amd.com>
>> ---
>>   drivers/gpu/drm/amd/amdgpu/amdgpu_job.c |  6 ++++--
>>   include/drm/gpu_scheduler.h             | 13 ++++++++++---
>>   2 files changed, 14 insertions(+), 5 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c 
>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
>> index ff48101bab55..81b73790ecc6 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
>> @@ -28,7 +28,7 @@
>>   #include "amdgpu.h"
>>   #include "amdgpu_trace.h"
>>   -static void amdgpu_job_timedout(struct drm_sched_job *s_job)
>> +static int amdgpu_job_timedout(struct drm_sched_job *s_job)
>>   {
>>       struct amdgpu_ring *ring = to_amdgpu_ring(s_job->sched);
>>       struct amdgpu_job *job = to_amdgpu_job(s_job);
>> @@ -41,7 +41,7 @@ static void amdgpu_job_timedout(struct 
>> drm_sched_job *s_job)
>>           amdgpu_ring_soft_recovery(ring, job->vmid, 
>> s_job->s_fence->parent)) {
>>           DRM_ERROR("ring %s timeout, but soft recovered\n",
>>                 s_job->sched->name);
>> -        return;
>> +        return 0;
>>       }
>>         amdgpu_vm_get_task_info(ring->adev, job->pasid, &ti);
>> @@ -53,10 +53,12 @@ static void amdgpu_job_timedout(struct 
>> drm_sched_job *s_job)
>>         if (amdgpu_device_should_recover_gpu(ring->adev)) {
>>           amdgpu_device_gpu_recover(ring->adev, job);
>> +        return 0;
>>       } else {
>>           drm_sched_suspend_timeout(&ring->sched);
>>           if (amdgpu_sriov_vf(adev))
>>               adev->virt.tdr_debug = true;
>> +        return 1;
>>       }
>>   }
>>   diff --git a/include/drm/gpu_scheduler.h b/include/drm/gpu_scheduler.h
>> index 2e0c368e19f6..61f7121e1c19 100644
>> --- a/include/drm/gpu_scheduler.h
>> +++ b/include/drm/gpu_scheduler.h
>> @@ -230,10 +230,17 @@ struct drm_sched_backend_ops {
>>       struct dma_fence *(*run_job)(struct drm_sched_job *sched_job);
>>         /**
>> -         * @timedout_job: Called when a job has taken too long to 
>> execute,
>> -         * to trigger GPU recovery.
>> +     * @timedout_job: Called when a job has taken too long to execute,
>> +     * to trigger GPU recovery.
>> +     *
>> +     * Return 0, if the job has been aborted successfully and will
>> +     * never be heard of from the device. Return non-zero if the
>> +     * job wasn't able to be aborted, i.e. if more time should be
>> +     * given to this job. The result is not "bool" as this
>> +     * function is not a predicate, although its result may seem
>> +     * as one.
>>        */
>> -    void (*timedout_job)(struct drm_sched_job *sched_job);
>> +    int (*timedout_job)(struct drm_sched_job *sched_job);
>>         /**
>>            * @free_job: Called once the job's finished fence has been 
>> signaled
>>
>

_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 125+ messages in thread

* Re: [PATCH 2/6] gpu/drm: ring_mirror_list --> pending_list
  2020-11-25  9:47                                         ` Christian König
@ 2020-11-25 16:42                                           ` Luben Tuikov
  -1 siblings, 0 replies; 125+ messages in thread
From: Luben Tuikov @ 2020-11-25 16:42 UTC (permalink / raw)
  To: Christian König, Andrey Grodzovsky, Lucas Stach, Alexander Deucher
  Cc: Emily Deng, amd-gfx, dri-devel, steven.price

On 2020-11-25 04:47, Christian König wrote:
> Am 25.11.20 um 04:17 schrieb Luben Tuikov:
>> Rename "ring_mirror_list" to "pending_list",
>> to describe what something is, not what it does,
>> how it's used, or how the hardware implements it.
>>
>> This also abstracts the actual hardware
>> implementation, i.e. how the low-level driver
>> communicates with the device it drives, ring, CAM,
>> etc., shouldn't be exposed to DRM.
>>
>> The pending_list keeps jobs submitted, which are
>> out of our control. Usually this means they are
>> pending execution status in hardware, but the
>> latter definition is a more general (inclusive)
>> definition.
>>
>> Signed-off-by: Luben Tuikov <luben.tuikov@amd.com>
> 
> In general the rename is a good idea, but I think we should try to 
> remove this linked list in general.
> 
> As the original name described this is essentially a ring buffer, the is 
> no reason I can see to use a linked list here except for the add/remove 
> madness we currently have.
> 
> Anyway patch is Acked-by: Christian König <christian.koenig@amd.com> for 
> now.

Thanks for the Ack Christian.

Well this list is there now and I don't want to change too many
things or this patch would get out of hand.

Yeah, in the future, perhaps we can overhaul and change this. For now
this is a minimal rename-only patch.

Thanks,
Luben

> 
> Regards,
> Christian.
> 
>> ---
>>   drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c |  4 +--
>>   drivers/gpu/drm/amd/amdgpu/amdgpu_device.c  |  4 +--
>>   drivers/gpu/drm/amd/amdgpu/amdgpu_job.c     |  2 +-
>>   drivers/gpu/drm/scheduler/sched_main.c      | 34 ++++++++++-----------
>>   include/drm/gpu_scheduler.h                 | 10 +++---
>>   5 files changed, 27 insertions(+), 27 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c
>> index 8358cae0b5a4..db77a5bdfa45 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c
>> @@ -1427,7 +1427,7 @@ static void amdgpu_ib_preempt_job_recovery(struct drm_gpu_scheduler *sched)
>>   	struct dma_fence *fence;
>>   
>>   	spin_lock(&sched->job_list_lock);
>> -	list_for_each_entry(s_job, &sched->ring_mirror_list, list) {
>> +	list_for_each_entry(s_job, &sched->pending_list, list) {
>>   		fence = sched->ops->run_job(s_job);
>>   		dma_fence_put(fence);
>>   	}
>> @@ -1459,7 +1459,7 @@ static void amdgpu_ib_preempt_mark_partial_job(struct amdgpu_ring *ring)
>>   
>>   no_preempt:
>>   	spin_lock(&sched->job_list_lock);
>> -	list_for_each_entry_safe(s_job, tmp, &sched->ring_mirror_list, list) {
>> +	list_for_each_entry_safe(s_job, tmp, &sched->pending_list, list) {
>>   		if (dma_fence_is_signaled(&s_job->s_fence->finished)) {
>>   			/* remove job from ring_mirror_list */
>>   			list_del_init(&s_job->list);
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>> index 4df6de81cd41..fbae600aa5f9 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>> @@ -4127,8 +4127,8 @@ bool amdgpu_device_has_job_running(struct amdgpu_device *adev)
>>   			continue;
>>   
>>   		spin_lock(&ring->sched.job_list_lock);
>> -		job = list_first_entry_or_null(&ring->sched.ring_mirror_list,
>> -				struct drm_sched_job, list);
>> +		job = list_first_entry_or_null(&ring->sched.pending_list,
>> +					       struct drm_sched_job, list);
>>   		spin_unlock(&ring->sched.job_list_lock);
>>   		if (job)
>>   			return true;
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
>> index aca52a46b93d..ff48101bab55 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
>> @@ -271,7 +271,7 @@ void amdgpu_job_stop_all_jobs_on_sched(struct drm_gpu_scheduler *sched)
>>   	}
>>   
>>   	/* Signal all jobs already scheduled to HW */
>> -	list_for_each_entry(s_job, &sched->ring_mirror_list, list) {
>> +	list_for_each_entry(s_job, &sched->pending_list, list) {
>>   		struct drm_sched_fence *s_fence = s_job->s_fence;
>>   
>>   		dma_fence_set_error(&s_fence->finished, -EHWPOISON);
>> diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c
>> index c52eba407ebd..b694df12aaba 100644
>> --- a/drivers/gpu/drm/scheduler/sched_main.c
>> +++ b/drivers/gpu/drm/scheduler/sched_main.c
>> @@ -198,7 +198,7 @@ EXPORT_SYMBOL(drm_sched_dependency_optimized);
>>   static void drm_sched_start_timeout(struct drm_gpu_scheduler *sched)
>>   {
>>   	if (sched->timeout != MAX_SCHEDULE_TIMEOUT &&
>> -	    !list_empty(&sched->ring_mirror_list))
>> +	    !list_empty(&sched->pending_list))
>>   		schedule_delayed_work(&sched->work_tdr, sched->timeout);
>>   }
>>   
>> @@ -258,7 +258,7 @@ void drm_sched_resume_timeout(struct drm_gpu_scheduler *sched,
>>   {
>>   	spin_lock(&sched->job_list_lock);
>>   
>> -	if (list_empty(&sched->ring_mirror_list))
>> +	if (list_empty(&sched->pending_list))
>>   		cancel_delayed_work(&sched->work_tdr);
>>   	else
>>   		mod_delayed_work(system_wq, &sched->work_tdr, remaining);
>> @@ -272,7 +272,7 @@ static void drm_sched_job_begin(struct drm_sched_job *s_job)
>>   	struct drm_gpu_scheduler *sched = s_job->sched;
>>   
>>   	spin_lock(&sched->job_list_lock);
>> -	list_add_tail(&s_job->list, &sched->ring_mirror_list);
>> +	list_add_tail(&s_job->list, &sched->pending_list);
>>   	drm_sched_start_timeout(sched);
>>   	spin_unlock(&sched->job_list_lock);
>>   }
>> @@ -286,7 +286,7 @@ static void drm_sched_job_timedout(struct work_struct *work)
>>   
>>   	/* Protects against concurrent deletion in drm_sched_get_cleanup_job */
>>   	spin_lock(&sched->job_list_lock);
>> -	job = list_first_entry_or_null(&sched->ring_mirror_list,
>> +	job = list_first_entry_or_null(&sched->pending_list,
>>   				       struct drm_sched_job, list);
>>   
>>   	if (job) {
>> @@ -371,7 +371,7 @@ EXPORT_SYMBOL(drm_sched_increase_karma);
>>    * Stop the scheduler and also removes and frees all completed jobs.
>>    * Note: bad job will not be freed as it might be used later and so it's
>>    * callers responsibility to release it manually if it's not part of the
>> - * mirror list any more.
>> + * pending list any more.
>>    *
>>    */
>>   void drm_sched_stop(struct drm_gpu_scheduler *sched, struct drm_sched_job *bad)
>> @@ -392,15 +392,15 @@ void drm_sched_stop(struct drm_gpu_scheduler *sched, struct drm_sched_job *bad)
>>   		 * Add at the head of the queue to reflect it was the earliest
>>   		 * job extracted.
>>   		 */
>> -		list_add(&bad->list, &sched->ring_mirror_list);
>> +		list_add(&bad->list, &sched->pending_list);
>>   
>>   	/*
>>   	 * Iterate the job list from later to  earlier one and either deactive
>> -	 * their HW callbacks or remove them from mirror list if they already
>> +	 * their HW callbacks or remove them from pending list if they already
>>   	 * signaled.
>>   	 * This iteration is thread safe as sched thread is stopped.
>>   	 */
>> -	list_for_each_entry_safe_reverse(s_job, tmp, &sched->ring_mirror_list,
>> +	list_for_each_entry_safe_reverse(s_job, tmp, &sched->pending_list,
>>   					 list) {
>>   		if (s_job->s_fence->parent &&
>>   		    dma_fence_remove_callback(s_job->s_fence->parent,
>> @@ -408,7 +408,7 @@ void drm_sched_stop(struct drm_gpu_scheduler *sched, struct drm_sched_job *bad)
>>   			atomic_dec(&sched->hw_rq_count);
>>   		} else {
>>   			/*
>> -			 * remove job from ring_mirror_list.
>> +			 * remove job from pending_list.
>>   			 * Locking here is for concurrent resume timeout
>>   			 */
>>   			spin_lock(&sched->job_list_lock);
>> @@ -463,7 +463,7 @@ void drm_sched_start(struct drm_gpu_scheduler *sched, bool full_recovery)
>>   	 * so no new jobs are being inserted or removed. Also concurrent
>>   	 * GPU recovers can't run in parallel.
>>   	 */
>> -	list_for_each_entry_safe(s_job, tmp, &sched->ring_mirror_list, list) {
>> +	list_for_each_entry_safe(s_job, tmp, &sched->pending_list, list) {
>>   		struct dma_fence *fence = s_job->s_fence->parent;
>>   
>>   		atomic_inc(&sched->hw_rq_count);
>> @@ -494,7 +494,7 @@ void drm_sched_start(struct drm_gpu_scheduler *sched, bool full_recovery)
>>   EXPORT_SYMBOL(drm_sched_start);
>>   
>>   /**
>> - * drm_sched_resubmit_jobs - helper to relunch job from mirror ring list
>> + * drm_sched_resubmit_jobs - helper to relunch job from pending ring list
>>    *
>>    * @sched: scheduler instance
>>    *
>> @@ -506,7 +506,7 @@ void drm_sched_resubmit_jobs(struct drm_gpu_scheduler *sched)
>>   	bool found_guilty = false;
>>   	struct dma_fence *fence;
>>   
>> -	list_for_each_entry_safe(s_job, tmp, &sched->ring_mirror_list, list) {
>> +	list_for_each_entry_safe(s_job, tmp, &sched->pending_list, list) {
>>   		struct drm_sched_fence *s_fence = s_job->s_fence;
>>   
>>   		if (!found_guilty && atomic_read(&s_job->karma) > sched->hang_limit) {
>> @@ -665,7 +665,7 @@ static void drm_sched_process_job(struct dma_fence *f, struct dma_fence_cb *cb)
>>    *
>>    * @sched: scheduler instance
>>    *
>> - * Returns the next finished job from the mirror list (if there is one)
>> + * Returns the next finished job from the pending list (if there is one)
>>    * ready for it to be destroyed.
>>    */
>>   static struct drm_sched_job *
>> @@ -675,7 +675,7 @@ drm_sched_get_cleanup_job(struct drm_gpu_scheduler *sched)
>>   
>>   	/*
>>   	 * Don't destroy jobs while the timeout worker is running  OR thread
>> -	 * is being parked and hence assumed to not touch ring_mirror_list
>> +	 * is being parked and hence assumed to not touch pending_list
>>   	 */
>>   	if ((sched->timeout != MAX_SCHEDULE_TIMEOUT &&
>>   	    !cancel_delayed_work(&sched->work_tdr)) ||
>> @@ -684,11 +684,11 @@ drm_sched_get_cleanup_job(struct drm_gpu_scheduler *sched)
>>   
>>   	spin_lock(&sched->job_list_lock);
>>   
>> -	job = list_first_entry_or_null(&sched->ring_mirror_list,
>> +	job = list_first_entry_or_null(&sched->pending_list,
>>   				       struct drm_sched_job, list);
>>   
>>   	if (job && dma_fence_is_signaled(&job->s_fence->finished)) {
>> -		/* remove job from ring_mirror_list */
>> +		/* remove job from pending_list */
>>   		list_del_init(&job->list);
>>   	} else {
>>   		job = NULL;
>> @@ -858,7 +858,7 @@ int drm_sched_init(struct drm_gpu_scheduler *sched,
>>   
>>   	init_waitqueue_head(&sched->wake_up_worker);
>>   	init_waitqueue_head(&sched->job_scheduled);
>> -	INIT_LIST_HEAD(&sched->ring_mirror_list);
>> +	INIT_LIST_HEAD(&sched->pending_list);
>>   	spin_lock_init(&sched->job_list_lock);
>>   	atomic_set(&sched->hw_rq_count, 0);
>>   	INIT_DELAYED_WORK(&sched->work_tdr, drm_sched_job_timedout);
>> diff --git a/include/drm/gpu_scheduler.h b/include/drm/gpu_scheduler.h
>> index 3add0072bd37..2e0c368e19f6 100644
>> --- a/include/drm/gpu_scheduler.h
>> +++ b/include/drm/gpu_scheduler.h
>> @@ -174,7 +174,7 @@ struct drm_sched_fence *to_drm_sched_fence(struct dma_fence *f);
>>    * @sched: the scheduler instance on which this job is scheduled.
>>    * @s_fence: contains the fences for the scheduling of job.
>>    * @finish_cb: the callback for the finished fence.
>> - * @node: used to append this struct to the @drm_gpu_scheduler.ring_mirror_list.
>> + * @node: used to append this struct to the @drm_gpu_scheduler.pending_list.
>>    * @id: a unique id assigned to each job scheduled on the scheduler.
>>    * @karma: increment on every hang caused by this job. If this exceeds the hang
>>    *         limit of the scheduler then the job is marked guilty and will not
>> @@ -203,7 +203,7 @@ struct drm_sched_job {
>>   static inline bool drm_sched_invalidate_job(struct drm_sched_job *s_job,
>>   					    int threshold)
>>   {
>> -	return (s_job && atomic_inc_return(&s_job->karma) > threshold);
>> +	return s_job && atomic_inc_return(&s_job->karma) > threshold;
>>   }
>>   
>>   /**
>> @@ -260,8 +260,8 @@ struct drm_sched_backend_ops {
>>    * @work_tdr: schedules a delayed call to @drm_sched_job_timedout after the
>>    *            timeout interval is over.
>>    * @thread: the kthread on which the scheduler which run.
>> - * @ring_mirror_list: the list of jobs which are currently in the job queue.
>> - * @job_list_lock: lock to protect the ring_mirror_list.
>> + * @pending_list: the list of jobs which are currently in the job queue.
>> + * @job_list_lock: lock to protect the pending_list.
>>    * @hang_limit: once the hangs by a job crosses this limit then it is marked
>>    *              guilty and it will be considered for scheduling further.
>>    * @score: score to help loadbalancer pick a idle sched
>> @@ -282,7 +282,7 @@ struct drm_gpu_scheduler {
>>   	atomic64_t			job_id_count;
>>   	struct delayed_work		work_tdr;
>>   	struct task_struct		*thread;
>> -	struct list_head		ring_mirror_list;
>> +	struct list_head		pending_list;
>>   	spinlock_t			job_list_lock;
>>   	int				hang_limit;
>>   	atomic_t                        score;
> 

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 125+ messages in thread

* Re: [PATCH 2/6] gpu/drm: ring_mirror_list --> pending_list
@ 2020-11-25 16:42                                           ` Luben Tuikov
  0 siblings, 0 replies; 125+ messages in thread
From: Luben Tuikov @ 2020-11-25 16:42 UTC (permalink / raw)
  To: Christian König, Andrey Grodzovsky, Lucas Stach, Alexander Deucher
  Cc: Emily Deng, amd-gfx, dri-devel, steven.price

On 2020-11-25 04:47, Christian König wrote:
> Am 25.11.20 um 04:17 schrieb Luben Tuikov:
>> Rename "ring_mirror_list" to "pending_list",
>> to describe what something is, not what it does,
>> how it's used, or how the hardware implements it.
>>
>> This also abstracts the actual hardware
>> implementation, i.e. how the low-level driver
>> communicates with the device it drives, ring, CAM,
>> etc., shouldn't be exposed to DRM.
>>
>> The pending_list keeps jobs submitted, which are
>> out of our control. Usually this means they are
>> pending execution status in hardware, but the
>> latter definition is a more general (inclusive)
>> definition.
>>
>> Signed-off-by: Luben Tuikov <luben.tuikov@amd.com>
> 
> In general the rename is a good idea, but I think we should try to 
> remove this linked list in general.
> 
> As the original name described this is essentially a ring buffer, the is 
> no reason I can see to use a linked list here except for the add/remove 
> madness we currently have.
> 
> Anyway patch is Acked-by: Christian König <christian.koenig@amd.com> for 
> now.

Thanks for the Ack Christian.

Well this list is there now and I don't want to change too many
things or this patch would get out of hand.

Yeah, in the future, perhaps we can overhaul and change this. For now
this is a minimal rename-only patch.

Thanks,
Luben

> 
> Regards,
> Christian.
> 
>> ---
>>   drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c |  4 +--
>>   drivers/gpu/drm/amd/amdgpu/amdgpu_device.c  |  4 +--
>>   drivers/gpu/drm/amd/amdgpu/amdgpu_job.c     |  2 +-
>>   drivers/gpu/drm/scheduler/sched_main.c      | 34 ++++++++++-----------
>>   include/drm/gpu_scheduler.h                 | 10 +++---
>>   5 files changed, 27 insertions(+), 27 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c
>> index 8358cae0b5a4..db77a5bdfa45 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c
>> @@ -1427,7 +1427,7 @@ static void amdgpu_ib_preempt_job_recovery(struct drm_gpu_scheduler *sched)
>>   	struct dma_fence *fence;
>>   
>>   	spin_lock(&sched->job_list_lock);
>> -	list_for_each_entry(s_job, &sched->ring_mirror_list, list) {
>> +	list_for_each_entry(s_job, &sched->pending_list, list) {
>>   		fence = sched->ops->run_job(s_job);
>>   		dma_fence_put(fence);
>>   	}
>> @@ -1459,7 +1459,7 @@ static void amdgpu_ib_preempt_mark_partial_job(struct amdgpu_ring *ring)
>>   
>>   no_preempt:
>>   	spin_lock(&sched->job_list_lock);
>> -	list_for_each_entry_safe(s_job, tmp, &sched->ring_mirror_list, list) {
>> +	list_for_each_entry_safe(s_job, tmp, &sched->pending_list, list) {
>>   		if (dma_fence_is_signaled(&s_job->s_fence->finished)) {
>>   			/* remove job from ring_mirror_list */
>>   			list_del_init(&s_job->list);
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>> index 4df6de81cd41..fbae600aa5f9 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>> @@ -4127,8 +4127,8 @@ bool amdgpu_device_has_job_running(struct amdgpu_device *adev)
>>   			continue;
>>   
>>   		spin_lock(&ring->sched.job_list_lock);
>> -		job = list_first_entry_or_null(&ring->sched.ring_mirror_list,
>> -				struct drm_sched_job, list);
>> +		job = list_first_entry_or_null(&ring->sched.pending_list,
>> +					       struct drm_sched_job, list);
>>   		spin_unlock(&ring->sched.job_list_lock);
>>   		if (job)
>>   			return true;
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
>> index aca52a46b93d..ff48101bab55 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
>> @@ -271,7 +271,7 @@ void amdgpu_job_stop_all_jobs_on_sched(struct drm_gpu_scheduler *sched)
>>   	}
>>   
>>   	/* Signal all jobs already scheduled to HW */
>> -	list_for_each_entry(s_job, &sched->ring_mirror_list, list) {
>> +	list_for_each_entry(s_job, &sched->pending_list, list) {
>>   		struct drm_sched_fence *s_fence = s_job->s_fence;
>>   
>>   		dma_fence_set_error(&s_fence->finished, -EHWPOISON);
>> diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c
>> index c52eba407ebd..b694df12aaba 100644
>> --- a/drivers/gpu/drm/scheduler/sched_main.c
>> +++ b/drivers/gpu/drm/scheduler/sched_main.c
>> @@ -198,7 +198,7 @@ EXPORT_SYMBOL(drm_sched_dependency_optimized);
>>   static void drm_sched_start_timeout(struct drm_gpu_scheduler *sched)
>>   {
>>   	if (sched->timeout != MAX_SCHEDULE_TIMEOUT &&
>> -	    !list_empty(&sched->ring_mirror_list))
>> +	    !list_empty(&sched->pending_list))
>>   		schedule_delayed_work(&sched->work_tdr, sched->timeout);
>>   }
>>   
>> @@ -258,7 +258,7 @@ void drm_sched_resume_timeout(struct drm_gpu_scheduler *sched,
>>   {
>>   	spin_lock(&sched->job_list_lock);
>>   
>> -	if (list_empty(&sched->ring_mirror_list))
>> +	if (list_empty(&sched->pending_list))
>>   		cancel_delayed_work(&sched->work_tdr);
>>   	else
>>   		mod_delayed_work(system_wq, &sched->work_tdr, remaining);
>> @@ -272,7 +272,7 @@ static void drm_sched_job_begin(struct drm_sched_job *s_job)
>>   	struct drm_gpu_scheduler *sched = s_job->sched;
>>   
>>   	spin_lock(&sched->job_list_lock);
>> -	list_add_tail(&s_job->list, &sched->ring_mirror_list);
>> +	list_add_tail(&s_job->list, &sched->pending_list);
>>   	drm_sched_start_timeout(sched);
>>   	spin_unlock(&sched->job_list_lock);
>>   }
>> @@ -286,7 +286,7 @@ static void drm_sched_job_timedout(struct work_struct *work)
>>   
>>   	/* Protects against concurrent deletion in drm_sched_get_cleanup_job */
>>   	spin_lock(&sched->job_list_lock);
>> -	job = list_first_entry_or_null(&sched->ring_mirror_list,
>> +	job = list_first_entry_or_null(&sched->pending_list,
>>   				       struct drm_sched_job, list);
>>   
>>   	if (job) {
>> @@ -371,7 +371,7 @@ EXPORT_SYMBOL(drm_sched_increase_karma);
>>    * Stop the scheduler and also removes and frees all completed jobs.
>>    * Note: bad job will not be freed as it might be used later and so it's
>>    * callers responsibility to release it manually if it's not part of the
>> - * mirror list any more.
>> + * pending list any more.
>>    *
>>    */
>>   void drm_sched_stop(struct drm_gpu_scheduler *sched, struct drm_sched_job *bad)
>> @@ -392,15 +392,15 @@ void drm_sched_stop(struct drm_gpu_scheduler *sched, struct drm_sched_job *bad)
>>   		 * Add at the head of the queue to reflect it was the earliest
>>   		 * job extracted.
>>   		 */
>> -		list_add(&bad->list, &sched->ring_mirror_list);
>> +		list_add(&bad->list, &sched->pending_list);
>>   
>>   	/*
>>   	 * Iterate the job list from later to  earlier one and either deactive
>> -	 * their HW callbacks or remove them from mirror list if they already
>> +	 * their HW callbacks or remove them from pending list if they already
>>   	 * signaled.
>>   	 * This iteration is thread safe as sched thread is stopped.
>>   	 */
>> -	list_for_each_entry_safe_reverse(s_job, tmp, &sched->ring_mirror_list,
>> +	list_for_each_entry_safe_reverse(s_job, tmp, &sched->pending_list,
>>   					 list) {
>>   		if (s_job->s_fence->parent &&
>>   		    dma_fence_remove_callback(s_job->s_fence->parent,
>> @@ -408,7 +408,7 @@ void drm_sched_stop(struct drm_gpu_scheduler *sched, struct drm_sched_job *bad)
>>   			atomic_dec(&sched->hw_rq_count);
>>   		} else {
>>   			/*
>> -			 * remove job from ring_mirror_list.
>> +			 * remove job from pending_list.
>>   			 * Locking here is for concurrent resume timeout
>>   			 */
>>   			spin_lock(&sched->job_list_lock);
>> @@ -463,7 +463,7 @@ void drm_sched_start(struct drm_gpu_scheduler *sched, bool full_recovery)
>>   	 * so no new jobs are being inserted or removed. Also concurrent
>>   	 * GPU recovers can't run in parallel.
>>   	 */
>> -	list_for_each_entry_safe(s_job, tmp, &sched->ring_mirror_list, list) {
>> +	list_for_each_entry_safe(s_job, tmp, &sched->pending_list, list) {
>>   		struct dma_fence *fence = s_job->s_fence->parent;
>>   
>>   		atomic_inc(&sched->hw_rq_count);
>> @@ -494,7 +494,7 @@ void drm_sched_start(struct drm_gpu_scheduler *sched, bool full_recovery)
>>   EXPORT_SYMBOL(drm_sched_start);
>>   
>>   /**
>> - * drm_sched_resubmit_jobs - helper to relunch job from mirror ring list
>> + * drm_sched_resubmit_jobs - helper to relunch job from pending ring list
>>    *
>>    * @sched: scheduler instance
>>    *
>> @@ -506,7 +506,7 @@ void drm_sched_resubmit_jobs(struct drm_gpu_scheduler *sched)
>>   	bool found_guilty = false;
>>   	struct dma_fence *fence;
>>   
>> -	list_for_each_entry_safe(s_job, tmp, &sched->ring_mirror_list, list) {
>> +	list_for_each_entry_safe(s_job, tmp, &sched->pending_list, list) {
>>   		struct drm_sched_fence *s_fence = s_job->s_fence;
>>   
>>   		if (!found_guilty && atomic_read(&s_job->karma) > sched->hang_limit) {
>> @@ -665,7 +665,7 @@ static void drm_sched_process_job(struct dma_fence *f, struct dma_fence_cb *cb)
>>    *
>>    * @sched: scheduler instance
>>    *
>> - * Returns the next finished job from the mirror list (if there is one)
>> + * Returns the next finished job from the pending list (if there is one)
>>    * ready for it to be destroyed.
>>    */
>>   static struct drm_sched_job *
>> @@ -675,7 +675,7 @@ drm_sched_get_cleanup_job(struct drm_gpu_scheduler *sched)
>>   
>>   	/*
>>   	 * Don't destroy jobs while the timeout worker is running  OR thread
>> -	 * is being parked and hence assumed to not touch ring_mirror_list
>> +	 * is being parked and hence assumed to not touch pending_list
>>   	 */
>>   	if ((sched->timeout != MAX_SCHEDULE_TIMEOUT &&
>>   	    !cancel_delayed_work(&sched->work_tdr)) ||
>> @@ -684,11 +684,11 @@ drm_sched_get_cleanup_job(struct drm_gpu_scheduler *sched)
>>   
>>   	spin_lock(&sched->job_list_lock);
>>   
>> -	job = list_first_entry_or_null(&sched->ring_mirror_list,
>> +	job = list_first_entry_or_null(&sched->pending_list,
>>   				       struct drm_sched_job, list);
>>   
>>   	if (job && dma_fence_is_signaled(&job->s_fence->finished)) {
>> -		/* remove job from ring_mirror_list */
>> +		/* remove job from pending_list */
>>   		list_del_init(&job->list);
>>   	} else {
>>   		job = NULL;
>> @@ -858,7 +858,7 @@ int drm_sched_init(struct drm_gpu_scheduler *sched,
>>   
>>   	init_waitqueue_head(&sched->wake_up_worker);
>>   	init_waitqueue_head(&sched->job_scheduled);
>> -	INIT_LIST_HEAD(&sched->ring_mirror_list);
>> +	INIT_LIST_HEAD(&sched->pending_list);
>>   	spin_lock_init(&sched->job_list_lock);
>>   	atomic_set(&sched->hw_rq_count, 0);
>>   	INIT_DELAYED_WORK(&sched->work_tdr, drm_sched_job_timedout);
>> diff --git a/include/drm/gpu_scheduler.h b/include/drm/gpu_scheduler.h
>> index 3add0072bd37..2e0c368e19f6 100644
>> --- a/include/drm/gpu_scheduler.h
>> +++ b/include/drm/gpu_scheduler.h
>> @@ -174,7 +174,7 @@ struct drm_sched_fence *to_drm_sched_fence(struct dma_fence *f);
>>    * @sched: the scheduler instance on which this job is scheduled.
>>    * @s_fence: contains the fences for the scheduling of job.
>>    * @finish_cb: the callback for the finished fence.
>> - * @node: used to append this struct to the @drm_gpu_scheduler.ring_mirror_list.
>> + * @node: used to append this struct to the @drm_gpu_scheduler.pending_list.
>>    * @id: a unique id assigned to each job scheduled on the scheduler.
>>    * @karma: increment on every hang caused by this job. If this exceeds the hang
>>    *         limit of the scheduler then the job is marked guilty and will not
>> @@ -203,7 +203,7 @@ struct drm_sched_job {
>>   static inline bool drm_sched_invalidate_job(struct drm_sched_job *s_job,
>>   					    int threshold)
>>   {
>> -	return (s_job && atomic_inc_return(&s_job->karma) > threshold);
>> +	return s_job && atomic_inc_return(&s_job->karma) > threshold;
>>   }
>>   
>>   /**
>> @@ -260,8 +260,8 @@ struct drm_sched_backend_ops {
>>    * @work_tdr: schedules a delayed call to @drm_sched_job_timedout after the
>>    *            timeout interval is over.
>>    * @thread: the kthread on which the scheduler which run.
>> - * @ring_mirror_list: the list of jobs which are currently in the job queue.
>> - * @job_list_lock: lock to protect the ring_mirror_list.
>> + * @pending_list: the list of jobs which are currently in the job queue.
>> + * @job_list_lock: lock to protect the pending_list.
>>    * @hang_limit: once the hangs by a job crosses this limit then it is marked
>>    *              guilty and it will be considered for scheduling further.
>>    * @score: score to help loadbalancer pick a idle sched
>> @@ -282,7 +282,7 @@ struct drm_gpu_scheduler {
>>   	atomic64_t			job_id_count;
>>   	struct delayed_work		work_tdr;
>>   	struct task_struct		*thread;
>> -	struct list_head		ring_mirror_list;
>> +	struct list_head		pending_list;
>>   	spinlock_t			job_list_lock;
>>   	int				hang_limit;
>>   	atomic_t                        score;
> 

_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 125+ messages in thread

* Re: [PATCH 3/6] drm/scheduler: Job timeout handler returns status
  2020-11-25  9:50                                         ` Christian König
@ 2020-11-25 16:48                                           ` Luben Tuikov
  -1 siblings, 0 replies; 125+ messages in thread
From: Luben Tuikov @ 2020-11-25 16:48 UTC (permalink / raw)
  To: Christian König, Andrey Grodzovsky, Lucas Stach, Alexander Deucher
  Cc: Emily Deng, amd-gfx, dri-devel, steven.price

On 2020-11-25 04:50, Christian König wrote:
> Am 25.11.20 um 04:17 schrieb Luben Tuikov:
>> The job timeout handler now returns status
>> indicating back to the DRM layer whether the job
>> was successfully cancelled or whether more time
>> should be given to the job to complete.
>>
>> Signed-off-by: Luben Tuikov <luben.tuikov@amd.com>
>> ---
>>   drivers/gpu/drm/amd/amdgpu/amdgpu_job.c |  6 ++++--
>>   include/drm/gpu_scheduler.h             | 13 ++++++++++---
>>   2 files changed, 14 insertions(+), 5 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
>> index ff48101bab55..81b73790ecc6 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
>> @@ -28,7 +28,7 @@
>>   #include "amdgpu.h"
>>   #include "amdgpu_trace.h"
>>   
>> -static void amdgpu_job_timedout(struct drm_sched_job *s_job)
>> +static int amdgpu_job_timedout(struct drm_sched_job *s_job)
>>   {
>>   	struct amdgpu_ring *ring = to_amdgpu_ring(s_job->sched);
>>   	struct amdgpu_job *job = to_amdgpu_job(s_job);
>> @@ -41,7 +41,7 @@ static void amdgpu_job_timedout(struct drm_sched_job *s_job)
>>   	    amdgpu_ring_soft_recovery(ring, job->vmid, s_job->s_fence->parent)) {
>>   		DRM_ERROR("ring %s timeout, but soft recovered\n",
>>   			  s_job->sched->name);
>> -		return;
>> +		return 0;
>>   	}
>>   
>>   	amdgpu_vm_get_task_info(ring->adev, job->pasid, &ti);
>> @@ -53,10 +53,12 @@ static void amdgpu_job_timedout(struct drm_sched_job *s_job)
>>   
>>   	if (amdgpu_device_should_recover_gpu(ring->adev)) {
>>   		amdgpu_device_gpu_recover(ring->adev, job);
>> +		return 0;
>>   	} else {
>>   		drm_sched_suspend_timeout(&ring->sched);
>>   		if (amdgpu_sriov_vf(adev))
>>   			adev->virt.tdr_debug = true;
>> +		return 1;
>>   	}
>>   }
>>   
>> diff --git a/include/drm/gpu_scheduler.h b/include/drm/gpu_scheduler.h
>> index 2e0c368e19f6..61f7121e1c19 100644
>> --- a/include/drm/gpu_scheduler.h
>> +++ b/include/drm/gpu_scheduler.h
>> @@ -230,10 +230,17 @@ struct drm_sched_backend_ops {
>>   	struct dma_fence *(*run_job)(struct drm_sched_job *sched_job);
>>   
>>   	/**
>> -         * @timedout_job: Called when a job has taken too long to execute,
>> -         * to trigger GPU recovery.
>> +	 * @timedout_job: Called when a job has taken too long to execute,
>> +	 * to trigger GPU recovery.
>> +	 *
>> +	 * Return 0, if the job has been aborted successfully and will
>> +	 * never be heard of from the device. Return non-zero if the
>> +	 * job wasn't able to be aborted, i.e. if more time should be
>> +	 * given to this job. The result is not "bool" as this
>> +	 * function is not a predicate, although its result may seem
>> +	 * as one.
> 
> I think the whole approach of timing out a job needs to be rethinked. 
> What's timing out here is the hardware engine, not the job.
> 
> So we should also not have the job as parameter here. Maybe we should 
> make that the fence we are waiting for instead.

Yes, I wanted this patch to be minimal, and not to disrupt
too many things.

Yes, in the future we can totally revamp this, but this
is a minimal patch.

> 
>>   	 */
>> -	void (*timedout_job)(struct drm_sched_job *sched_job);
>> +	int (*timedout_job)(struct drm_sched_job *sched_job);
> 
> I would either return an error code, boolean or enum here. But not use a 
> number without a define.

Yes, that's a great idea--I'll make the change now, and resubmit.

Regards,
Luben

> 
> Regards,
> Christian.
> 
>>   
>>   	/**
>>            * @free_job: Called once the job's finished fence has been signaled
> 

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 125+ messages in thread

* Re: [PATCH 3/6] drm/scheduler: Job timeout handler returns status
@ 2020-11-25 16:48                                           ` Luben Tuikov
  0 siblings, 0 replies; 125+ messages in thread
From: Luben Tuikov @ 2020-11-25 16:48 UTC (permalink / raw)
  To: Christian König, Andrey Grodzovsky, Lucas Stach, Alexander Deucher
  Cc: Emily Deng, amd-gfx, dri-devel, steven.price

On 2020-11-25 04:50, Christian König wrote:
> Am 25.11.20 um 04:17 schrieb Luben Tuikov:
>> The job timeout handler now returns status
>> indicating back to the DRM layer whether the job
>> was successfully cancelled or whether more time
>> should be given to the job to complete.
>>
>> Signed-off-by: Luben Tuikov <luben.tuikov@amd.com>
>> ---
>>   drivers/gpu/drm/amd/amdgpu/amdgpu_job.c |  6 ++++--
>>   include/drm/gpu_scheduler.h             | 13 ++++++++++---
>>   2 files changed, 14 insertions(+), 5 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
>> index ff48101bab55..81b73790ecc6 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
>> @@ -28,7 +28,7 @@
>>   #include "amdgpu.h"
>>   #include "amdgpu_trace.h"
>>   
>> -static void amdgpu_job_timedout(struct drm_sched_job *s_job)
>> +static int amdgpu_job_timedout(struct drm_sched_job *s_job)
>>   {
>>   	struct amdgpu_ring *ring = to_amdgpu_ring(s_job->sched);
>>   	struct amdgpu_job *job = to_amdgpu_job(s_job);
>> @@ -41,7 +41,7 @@ static void amdgpu_job_timedout(struct drm_sched_job *s_job)
>>   	    amdgpu_ring_soft_recovery(ring, job->vmid, s_job->s_fence->parent)) {
>>   		DRM_ERROR("ring %s timeout, but soft recovered\n",
>>   			  s_job->sched->name);
>> -		return;
>> +		return 0;
>>   	}
>>   
>>   	amdgpu_vm_get_task_info(ring->adev, job->pasid, &ti);
>> @@ -53,10 +53,12 @@ static void amdgpu_job_timedout(struct drm_sched_job *s_job)
>>   
>>   	if (amdgpu_device_should_recover_gpu(ring->adev)) {
>>   		amdgpu_device_gpu_recover(ring->adev, job);
>> +		return 0;
>>   	} else {
>>   		drm_sched_suspend_timeout(&ring->sched);
>>   		if (amdgpu_sriov_vf(adev))
>>   			adev->virt.tdr_debug = true;
>> +		return 1;
>>   	}
>>   }
>>   
>> diff --git a/include/drm/gpu_scheduler.h b/include/drm/gpu_scheduler.h
>> index 2e0c368e19f6..61f7121e1c19 100644
>> --- a/include/drm/gpu_scheduler.h
>> +++ b/include/drm/gpu_scheduler.h
>> @@ -230,10 +230,17 @@ struct drm_sched_backend_ops {
>>   	struct dma_fence *(*run_job)(struct drm_sched_job *sched_job);
>>   
>>   	/**
>> -         * @timedout_job: Called when a job has taken too long to execute,
>> -         * to trigger GPU recovery.
>> +	 * @timedout_job: Called when a job has taken too long to execute,
>> +	 * to trigger GPU recovery.
>> +	 *
>> +	 * Return 0, if the job has been aborted successfully and will
>> +	 * never be heard of from the device. Return non-zero if the
>> +	 * job wasn't able to be aborted, i.e. if more time should be
>> +	 * given to this job. The result is not "bool" as this
>> +	 * function is not a predicate, although its result may seem
>> +	 * as one.
> 
> I think the whole approach of timing out a job needs to be rethinked. 
> What's timing out here is the hardware engine, not the job.
> 
> So we should also not have the job as parameter here. Maybe we should 
> make that the fence we are waiting for instead.

Yes, I wanted this patch to be minimal, and not to disrupt
too many things.

Yes, in the future we can totally revamp this, but this
is a minimal patch.

> 
>>   	 */
>> -	void (*timedout_job)(struct drm_sched_job *sched_job);
>> +	int (*timedout_job)(struct drm_sched_job *sched_job);
> 
> I would either return an error code, boolean or enum here. But not use a 
> number without a define.

Yes, that's a great idea--I'll make the change now, and resubmit.

Regards,
Luben

> 
> Regards,
> Christian.
> 
>>   
>>   	/**
>>            * @free_job: Called once the job's finished fence has been signaled
> 

_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 125+ messages in thread

* Re: [PATCH 5/6] drm/amdgpu: Don't hardcode thread name length
  2020-11-25  9:55                                         ` Christian König
@ 2020-11-25 17:01                                           ` Luben Tuikov
  -1 siblings, 0 replies; 125+ messages in thread
From: Luben Tuikov @ 2020-11-25 17:01 UTC (permalink / raw)
  To: Christian König, Andrey Grodzovsky, Lucas Stach, Alexander Deucher
  Cc: Emily Deng, amd-gfx, dri-devel, steven.price

On 2020-11-25 04:55, Christian König wrote:
> Am 25.11.20 um 04:17 schrieb Luben Tuikov:
>> Introduce a macro DRM_THREAD_NAME_LEN
>> and use that to define ring name size,
>> instead of hardcoding it to 16.
>>
>> Signed-off-by: Luben Tuikov <luben.tuikov@amd.com>
>> ---
>>   drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h | 2 +-
>>   include/drm/gpu_scheduler.h              | 2 ++
>>   2 files changed, 3 insertions(+), 1 deletion(-)
>>
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
>> index 7112137689db..bbd46c6dec65 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
>> @@ -230,7 +230,7 @@ struct amdgpu_ring {
>>   	unsigned		wptr_offs;
>>   	unsigned		fence_offs;
>>   	uint64_t		current_ctx;
>> -	char			name[16];
>> +	char			name[DRM_THREAD_NAME_LEN];
>>   	u32                     trail_seq;
>>   	unsigned		trail_fence_offs;
>>   	u64			trail_fence_gpu_addr;
>> diff --git a/include/drm/gpu_scheduler.h b/include/drm/gpu_scheduler.h
>> index 61f7121e1c19..3a5686c3b5e9 100644
>> --- a/include/drm/gpu_scheduler.h
>> +++ b/include/drm/gpu_scheduler.h
>> @@ -30,6 +30,8 @@
>>   
>>   #define MAX_WAIT_SCHED_ENTITY_Q_EMPTY msecs_to_jiffies(1000)
>>   
>> +#define DRM_THREAD_NAME_LEN     TASK_COMM_LEN
>> +
> 
> The thread name is an amdgpu specific thing. I don't think we should 
> have that in the scheduler.

I need it in DRM when creating the done thread from the name
of the main scheduler thread. Since DRM creates threads,
the main scheduler thread and the done thread, it would
be good to have a preliminary limit to the name string.

> 
> And why do you use TASK_COMM_LEN here? That is completely unrelated stuff.

If you trace down into the kernel, TASK_COMM_LEN seems to be used in
snprintf() when naming a kernel thread, and its value is 16--same
as the one used in amdgpu.

So the size of the name string transitions from amdgpu to DRM to kernel
proper, where amdgpu and kernel proper set it to max 16, but DRM doesn't
give it a limit.

Sure, I can remove it from DRM, and just use a local limit
when snprintf() the name when creating a tread, possibly
using TASK_COMM_LEN. (That's in the next patch.)

Would that be better? I can do that in v2 of this patchset.

Thanks,
Luben

> 
> Regards,
> Christian.
> 
>>   struct drm_gpu_scheduler;
>>   struct drm_sched_rq;
>>   
> 

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 125+ messages in thread

* Re: [PATCH 5/6] drm/amdgpu: Don't hardcode thread name length
@ 2020-11-25 17:01                                           ` Luben Tuikov
  0 siblings, 0 replies; 125+ messages in thread
From: Luben Tuikov @ 2020-11-25 17:01 UTC (permalink / raw)
  To: Christian König, Andrey Grodzovsky, Lucas Stach, Alexander Deucher
  Cc: Emily Deng, amd-gfx, dri-devel, steven.price

On 2020-11-25 04:55, Christian König wrote:
> Am 25.11.20 um 04:17 schrieb Luben Tuikov:
>> Introduce a macro DRM_THREAD_NAME_LEN
>> and use that to define ring name size,
>> instead of hardcoding it to 16.
>>
>> Signed-off-by: Luben Tuikov <luben.tuikov@amd.com>
>> ---
>>   drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h | 2 +-
>>   include/drm/gpu_scheduler.h              | 2 ++
>>   2 files changed, 3 insertions(+), 1 deletion(-)
>>
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
>> index 7112137689db..bbd46c6dec65 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
>> @@ -230,7 +230,7 @@ struct amdgpu_ring {
>>   	unsigned		wptr_offs;
>>   	unsigned		fence_offs;
>>   	uint64_t		current_ctx;
>> -	char			name[16];
>> +	char			name[DRM_THREAD_NAME_LEN];
>>   	u32                     trail_seq;
>>   	unsigned		trail_fence_offs;
>>   	u64			trail_fence_gpu_addr;
>> diff --git a/include/drm/gpu_scheduler.h b/include/drm/gpu_scheduler.h
>> index 61f7121e1c19..3a5686c3b5e9 100644
>> --- a/include/drm/gpu_scheduler.h
>> +++ b/include/drm/gpu_scheduler.h
>> @@ -30,6 +30,8 @@
>>   
>>   #define MAX_WAIT_SCHED_ENTITY_Q_EMPTY msecs_to_jiffies(1000)
>>   
>> +#define DRM_THREAD_NAME_LEN     TASK_COMM_LEN
>> +
> 
> The thread name is an amdgpu specific thing. I don't think we should 
> have that in the scheduler.

I need it in DRM when creating the done thread from the name
of the main scheduler thread. Since DRM creates threads,
the main scheduler thread and the done thread, it would
be good to have a preliminary limit to the name string.

> 
> And why do you use TASK_COMM_LEN here? That is completely unrelated stuff.

If you trace down into the kernel, TASK_COMM_LEN seems to be used in
snprintf() when naming a kernel thread, and its value is 16--same
as the one used in amdgpu.

So the size of the name string transitions from amdgpu to DRM to kernel
proper, where amdgpu and kernel proper set it to max 16, but DRM doesn't
give it a limit.

Sure, I can remove it from DRM, and just use a local limit
when snprintf() the name when creating a tread, possibly
using TASK_COMM_LEN. (That's in the next patch.)

Would that be better? I can do that in v2 of this patchset.

Thanks,
Luben

> 
> Regards,
> Christian.
> 
>>   struct drm_gpu_scheduler;
>>   struct drm_sched_rq;
>>   
> 

_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 125+ messages in thread

* Re: [PATCH 6/6] drm/sched: Make use of a "done" thread
  2020-11-25 10:10                                         ` Christian König
@ 2020-11-26  0:24                                           ` Luben Tuikov
  -1 siblings, 0 replies; 125+ messages in thread
From: Luben Tuikov @ 2020-11-26  0:24 UTC (permalink / raw)
  To: Christian König, Andrey Grodzovsky, Lucas Stach, Alexander Deucher
  Cc: Emily Deng, amd-gfx, dri-devel, steven.price

On 2020-11-25 05:10, Christian König wrote:
> Am 25.11.20 um 04:17 schrieb Luben Tuikov:
>> Add a "done" list to which all completed jobs are added
>> to be freed. The drm_sched_job_done() callback is the
>> producer of jobs to this list.
>>
>> Add a "done" thread which consumes from the done list
>> and frees up jobs. Now, the main scheduler thread only
>> pushes jobs to the GPU and the "done" thread frees them
>> up, on the way out of the GPU when they've completed
>> execution.
> 
> Well there are quite a number of problems in this patch.
> 
>  From the design I think we should be getting rid of the linked list and

Sure, we can do this in a separate future patch. I'd imagine it'll
touch a lot of places and I didn't want this patch and this series
of patches to get out of hand, by changing too many things.

Here in this patch I wanted to change as little as possible.

> not extend its use. And we also don't want to offload the freeing of 
> jobs into a different thread because that could potentially mean that 
> this is executed on a different CPU.

Yes, of course it could.

From my experience working with hardware, I always envision work
being done by small units, in a pipeline, concurrently, while all
of them working concurrently, all the time.

It's hard to go back to unitary processing. :-)

> 
> Then one obvious problem seems to be that you don't take into account 
> that we moved the job freeing into the scheduler thread to make sure 
> that this is suspended while the scheduler thread is stopped. 

I don't understand what "this" refers to in "that this is suspended
while the scheduler thread is stopped."

> This 
> behavior is now completely gone, e.g. the delete thread keeps running 
> while the scheduler thread is stopped.

Yes, indeed, that is the case and intentional.

There seems to be no requirement to have to stop the main
scheduler thread, which pushes tasks down to the GPU,
so that we can free jobs. In other words, both
threads can work concurrently, one pushing jobs down
to the GPU, while the other freeing done jobs coming
out of the GPU.

If this concurrency is something you don't like,
then no problem, we can keep them interlocked in one
thread as before.

> 
> A few more comments below.
> 
>> Make use of the status returned by the GPU driver
>> timeout handler to decide whether to leave the job in
>> the pending list, or to send it off to the done list.
>> If a job is done, it is added to the done list and the
>> done thread woken up. If a job needs more time, it is
>> left on the pending list and the timeout timer
>> restarted.
>>
>> Eliminate the polling mechanism of picking out done
>> jobs from the pending list, i.e. eliminate
>> drm_sched_get_cleanup_job(). Now the main scheduler
>> thread only pushes jobs down to the GPU.
>>
>> Various other optimizations to the GPU scheduler
>> and job recovery are possible with this format.
>>
>> Signed-off-by: Luben Tuikov <luben.tuikov@amd.com>
>> ---
>>   drivers/gpu/drm/scheduler/sched_main.c | 173 +++++++++++++------------
>>   include/drm/gpu_scheduler.h            |  14 ++
>>   2 files changed, 101 insertions(+), 86 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c
>> index 3eb7618a627d..289ae68cd97f 100644
>> --- a/drivers/gpu/drm/scheduler/sched_main.c
>> +++ b/drivers/gpu/drm/scheduler/sched_main.c
>> @@ -164,7 +164,8 @@ drm_sched_rq_select_entity(struct drm_sched_rq *rq)
>>    * drm_sched_job_done - complete a job
>>    * @s_job: pointer to the job which is done
>>    *
>> - * Finish the job's fence and wake up the worker thread.
>> + * Finish the job's fence, move it to the done list,
>> + * and wake up the done thread.
>>    */
>>   static void drm_sched_job_done(struct drm_sched_job *s_job)
>>   {
>> @@ -179,7 +180,12 @@ static void drm_sched_job_done(struct drm_sched_job *s_job)
>>   	dma_fence_get(&s_fence->finished);
>>   	drm_sched_fence_finished(s_fence);
>>   	dma_fence_put(&s_fence->finished);
>> -	wake_up_interruptible(&sched->wake_up_worker);
>> +
>> +	spin_lock(&sched->job_list_lock);
>> +	list_move(&s_job->list, &sched->done_list);
>> +	spin_unlock(&sched->job_list_lock);
>> +
>> +	wake_up_interruptible(&sched->done_wait_q);
> 
> How is the worker thread then woken up to push new jobs to the hardware?

A-ha! Thank you Christian for bringing this up--perhaps that is what
the problem is I was seeing on my test machine, which I described
in the cover letter 0/6, that X/GDM just sleeping in wait.

So, I'd imagined that whomever pushed jobs down to DRM, i.e.
the producer of jobs, also did a "up"/"wake-up" of the main
scheduler thread, so that the main scheduler thread would
then wake up and "schedule" tasks down into the GPU. It seems
I've only "imagined" :-) such concurrency and the the main scheduler
thread needs to be woken up to poll? I'll try this next.
Thanks for the tip Christian!

> 
>>   }
>>   
>>   /**
>> @@ -221,11 +227,10 @@ bool drm_sched_dependency_optimized(struct dma_fence* fence,
>>   EXPORT_SYMBOL(drm_sched_dependency_optimized);
>>   
>>   /**
>> - * drm_sched_start_timeout - start timeout for reset worker
>> - *
>> - * @sched: scheduler instance to start the worker for
>> + * drm_sched_start_timeout - start a timeout timer
>> + * @sched: scheduler instance whose job we're timing
>>    *
>> - * Start the timeout for the given scheduler.
>> + * Start a timeout timer for the given scheduler.
>>    */
>>   static void drm_sched_start_timeout(struct drm_gpu_scheduler *sched)
>>   {
>> @@ -305,8 +310,8 @@ static void drm_sched_job_begin(struct drm_sched_job *s_job)
>>   
>>   	spin_lock(&sched->job_list_lock);
>>   	list_add_tail(&s_job->list, &sched->pending_list);
>> -	drm_sched_start_timeout(sched);
>>   	spin_unlock(&sched->job_list_lock);
>> +	drm_sched_start_timeout(sched);
> 
> This looks wrong, the drm_sched_start_timeout() function used to need 
> the lock. Why should that have changed?

I'd originally removed the check in drm_sched_start_timeout(),
of whether the "pending_list" is empty, because the use
of that function became more _deterministic_, with this patch.
By this I mean that the timeout timer is now started
_only_ when we push down new jobs.

But then I noticed that "full recovery" business in
in drm_sched_start(), which calls drm_sched_start_timeout(),
and put that !list_empty() check back in, and I seem to have
forgotten to move this back inside the lock.

I'll move it back in, no problem, thanks for catching this.

Regards,
Luben

> 
>>   }
>>   
>>   static void drm_sched_job_timedout(struct work_struct *work)
>> @@ -316,37 +321,30 @@ static void drm_sched_job_timedout(struct work_struct *work)
>>   
>>   	sched = container_of(work, struct drm_gpu_scheduler, work_tdr.work);
>>   
>> -	/* Protects against concurrent deletion in drm_sched_get_cleanup_job */
>>   	spin_lock(&sched->job_list_lock);
>>   	job = list_first_entry_or_null(&sched->pending_list,
>>   				       struct drm_sched_job, list);
>> +	spin_unlock(&sched->job_list_lock);
>>   
>>   	if (job) {
>> -		/*
>> -		 * Remove the bad job so it cannot be freed by concurrent
>> -		 * drm_sched_cleanup_jobs. It will be reinserted back after sched->thread
>> -		 * is parked at which point it's safe.
>> -		 */
>> -		list_del_init(&job->list);
>> -		spin_unlock(&sched->job_list_lock);
>> +		int res;
>>   
>> -		job->sched->ops->timedout_job(job);
>> +		job->job_status |= DRM_JOB_STATUS_TIMEOUT;
>> +		res = job->sched->ops->timedout_job(job);
>> +		if (res == 0) {
>> +			/* The job is out of the device.
>> +			 */
>> +			spin_lock(&sched->job_list_lock);
>> +			list_move(&job->list, &sched->done_list);
>> +			spin_unlock(&sched->job_list_lock);
>>   
>> -		/*
>> -		 * Guilty job did complete and hence needs to be manually removed
>> -		 * See drm_sched_stop doc.
>> -		 */
>> -		if (sched->free_guilty) {
>> -			job->sched->ops->free_job(job);
>> -			sched->free_guilty = false;
>> +			wake_up_interruptible(&sched->done_wait_q);
>> +		} else {
>> +			/* The job needs more time.
>> +			 */
>> +			drm_sched_start_timeout(sched);
>>   		}
>> -	} else {
>> -		spin_unlock(&sched->job_list_lock);
>>   	}
>> -
>> -	spin_lock(&sched->job_list_lock);
>> -	drm_sched_start_timeout(sched);
>> -	spin_unlock(&sched->job_list_lock);
>>   }
>>   
>>    /**
>> @@ -511,15 +509,13 @@ void drm_sched_start(struct drm_gpu_scheduler *sched, bool full_recovery)
>>   			else if (r)
>>   				DRM_ERROR("fence add callback failed (%d)\n",
>>   					  r);
>> -		} else
>> +		} else {
>>   			drm_sched_job_done(s_job);
>> +		}
>>   	}
>>   
>> -	if (full_recovery) {
>> -		spin_lock(&sched->job_list_lock);
>> +	if (full_recovery)
>>   		drm_sched_start_timeout(sched);
>> -		spin_unlock(&sched->job_list_lock);
> 
> Same here.
> 
> Regards,
> Christian.
> 
>> -	}
>>   
>>   	kthread_unpark(sched->thread);
>>   }
>> @@ -667,47 +663,6 @@ drm_sched_select_entity(struct drm_gpu_scheduler *sched)
>>   	return entity;
>>   }
>>   
>> -/**
>> - * drm_sched_get_cleanup_job - fetch the next finished job to be destroyed
>> - *
>> - * @sched: scheduler instance
>> - *
>> - * Returns the next finished job from the pending list (if there is one)
>> - * ready for it to be destroyed.
>> - */
>> -static struct drm_sched_job *
>> -drm_sched_get_cleanup_job(struct drm_gpu_scheduler *sched)
>> -{
>> -	struct drm_sched_job *job;
>> -
>> -	/*
>> -	 * Don't destroy jobs while the timeout worker is running  OR thread
>> -	 * is being parked and hence assumed to not touch pending_list
>> -	 */
>> -	if ((sched->timeout != MAX_SCHEDULE_TIMEOUT &&
>> -	    !cancel_delayed_work(&sched->work_tdr)) ||
>> -	    kthread_should_park())
>> -		return NULL;
>> -
>> -	spin_lock(&sched->job_list_lock);
>> -
>> -	job = list_first_entry_or_null(&sched->pending_list,
>> -				       struct drm_sched_job, list);
>> -
>> -	if (job && dma_fence_is_signaled(&job->s_fence->finished)) {
>> -		/* remove job from pending_list */
>> -		list_del_init(&job->list);
>> -	} else {
>> -		job = NULL;
>> -		/* queue timeout for next job */
>> -		drm_sched_start_timeout(sched);
>> -	}
>> -
>> -	spin_unlock(&sched->job_list_lock);
>> -
>> -	return job;
>> -}
>> -
>>   /**
>>    * drm_sched_pick_best - Get a drm sched from a sched_list with the least load
>>    * @sched_list: list of drm_gpu_schedulers
>> @@ -761,6 +716,44 @@ static bool drm_sched_blocked(struct drm_gpu_scheduler *sched)
>>   	return false;
>>   }
>>   
>> +/**
>> + * drm_sched_done - free done tasks
>> + * @param: pointer to a scheduler instance
>> + *
>> + * Returns 0.
>> + */
>> +static int drm_sched_done(void *param)
>> +{
>> +	struct drm_gpu_scheduler *sched = param;
>> +
>> +	do {
>> +		LIST_HEAD(done_q);
>> +
>> +		wait_event_interruptible(sched->done_wait_q,
>> +					 kthread_should_stop() ||
>> +					 !list_empty(&sched->done_list));
>> +
>> +		spin_lock(&sched->job_list_lock);
>> +		list_splice_init(&sched->done_list, &done_q);
>> +		spin_unlock(&sched->job_list_lock);
>> +
>> +		if (list_empty(&done_q))
>> +			continue;
>> +
>> +		while (!list_empty(&done_q)) {
>> +			struct drm_sched_job *job;
>> +
>> +			job = list_first_entry(&done_q,
>> +					       struct drm_sched_job,
>> +					       list);
>> +			list_del_init(&job->list);
>> +			sched->ops->free_job(job);
>> +		}
>> +	} while (!kthread_should_stop());
>> +
>> +	return 0;
>> +}
>> +
>>   /**
>>    * drm_sched_main - main scheduler thread
>>    *
>> @@ -770,7 +763,7 @@ static bool drm_sched_blocked(struct drm_gpu_scheduler *sched)
>>    */
>>   static int drm_sched_main(void *param)
>>   {
>> -	struct drm_gpu_scheduler *sched = (struct drm_gpu_scheduler *)param;
>> +	struct drm_gpu_scheduler *sched = param;
>>   	int r;
>>   
>>   	sched_set_fifo_low(current);
>> @@ -780,20 +773,12 @@ static int drm_sched_main(void *param)
>>   		struct drm_sched_fence *s_fence;
>>   		struct drm_sched_job *sched_job;
>>   		struct dma_fence *fence;
>> -		struct drm_sched_job *cleanup_job = NULL;
>>   
>>   		wait_event_interruptible(sched->wake_up_worker,
>> -					 (cleanup_job = drm_sched_get_cleanup_job(sched)) ||
>>   					 (!drm_sched_blocked(sched) &&
>>   					  (entity = drm_sched_select_entity(sched))) ||
>>   					 kthread_should_stop());
>>   
>> -		if (cleanup_job) {
>> -			sched->ops->free_job(cleanup_job);
>> -			/* queue timeout for next job */
>> -			drm_sched_start_timeout(sched);
>> -		}
>> -
>>   		if (!entity)
>>   			continue;
>>   
>> @@ -820,8 +805,7 @@ static int drm_sched_main(void *param)
>>   			if (r == -ENOENT)
>>   				drm_sched_job_done(sched_job);
>>   			else if (r)
>> -				DRM_ERROR("fence add callback failed (%d)\n",
>> -					  r);
>> +				DRM_ERROR("fence add callback failed (%d)\n", r);
>>   			dma_fence_put(fence);
>>   		} else {
>>   			if (IS_ERR(fence))
>> @@ -865,7 +849,9 @@ int drm_sched_init(struct drm_gpu_scheduler *sched,
>>   
>>   	init_waitqueue_head(&sched->wake_up_worker);
>>   	init_waitqueue_head(&sched->job_scheduled);
>> +	init_waitqueue_head(&sched->done_wait_q);
>>   	INIT_LIST_HEAD(&sched->pending_list);
>> +	INIT_LIST_HEAD(&sched->done_list);
>>   	spin_lock_init(&sched->job_list_lock);
>>   	atomic_set(&sched->hw_rq_count, 0);
>>   	INIT_DELAYED_WORK(&sched->work_tdr, drm_sched_job_timedout);
>> @@ -881,6 +867,21 @@ int drm_sched_init(struct drm_gpu_scheduler *sched,
>>   		return ret;
>>   	}
>>   
>> +	snprintf(sched->thread_done_name, DRM_THREAD_NAME_LEN, "%s%s",
>> +		 sched->name, "-done");
>> +	sched->thread_done_name[DRM_THREAD_NAME_LEN - 1] = '\0';
>> +	sched->thread_done = kthread_run(drm_sched_done, sched,
>> +					 sched->thread_done_name);
>> +	if (IS_ERR(sched->thread_done)) {
>> +		ret = kthread_stop(sched->thread);
>> +		if (!ret) {
>> +			/* free_kthread_struct(sched->thread); */
>> +			sched->thread = NULL;
>> +		}
>> +		DRM_ERROR("Failed to start thread %s", sched->thread_done_name);
>> +		return ret;
>> +	}
>> +
>>   	sched->ready = true;
>>   	return 0;
>>   }
>> diff --git a/include/drm/gpu_scheduler.h b/include/drm/gpu_scheduler.h
>> index 3a5686c3b5e9..b282d6158b50 100644
>> --- a/include/drm/gpu_scheduler.h
>> +++ b/include/drm/gpu_scheduler.h
>> @@ -169,6 +169,12 @@ struct drm_sched_fence {
>>   
>>   struct drm_sched_fence *to_drm_sched_fence(struct dma_fence *f);
>>   
>> +enum drm_job_status {
>> +	DRM_JOB_STATUS_NONE    = 0 << 0,
>> +	DRM_JOB_STATUS_DONE    = 1 << 0,
>> +	DRM_JOB_STATUS_TIMEOUT = 1 << 1,
>> +};
>> +
>>   /**
>>    * struct drm_sched_job - A job to be run by an entity.
>>    *
>> @@ -198,6 +204,7 @@ struct drm_sched_job {
>>   	uint64_t			id;
>>   	atomic_t			karma;
>>   	enum drm_sched_priority		s_priority;
>> +	enum drm_job_status             job_status;
>>   	struct drm_sched_entity         *entity;
>>   	struct dma_fence_cb		cb;
>>   };
>> @@ -284,15 +291,22 @@ struct drm_gpu_scheduler {
>>   	uint32_t			hw_submission_limit;
>>   	long				timeout;
>>   	const char			*name;
>> +	char                            thread_done_name[DRM_THREAD_NAME_LEN];
>> +
>>   	struct drm_sched_rq		sched_rq[DRM_SCHED_PRIORITY_COUNT];
>>   	wait_queue_head_t		wake_up_worker;
>>   	wait_queue_head_t		job_scheduled;
>> +	wait_queue_head_t               done_wait_q;
>>   	atomic_t			hw_rq_count;
>>   	atomic64_t			job_id_count;
>>   	struct delayed_work		work_tdr;
>>   	struct task_struct		*thread;
>> +	struct task_struct		*thread_done;
>> +
>>   	struct list_head		pending_list;
>> +	struct list_head                done_list;
>>   	spinlock_t			job_list_lock;
>> +
>>   	int				hang_limit;
>>   	atomic_t                        score;
>>   	bool				ready;
> 

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 125+ messages in thread

* Re: [PATCH 6/6] drm/sched: Make use of a "done" thread
@ 2020-11-26  0:24                                           ` Luben Tuikov
  0 siblings, 0 replies; 125+ messages in thread
From: Luben Tuikov @ 2020-11-26  0:24 UTC (permalink / raw)
  To: Christian König, Andrey Grodzovsky, Lucas Stach, Alexander Deucher
  Cc: Emily Deng, amd-gfx, dri-devel, steven.price

On 2020-11-25 05:10, Christian König wrote:
> Am 25.11.20 um 04:17 schrieb Luben Tuikov:
>> Add a "done" list to which all completed jobs are added
>> to be freed. The drm_sched_job_done() callback is the
>> producer of jobs to this list.
>>
>> Add a "done" thread which consumes from the done list
>> and frees up jobs. Now, the main scheduler thread only
>> pushes jobs to the GPU and the "done" thread frees them
>> up, on the way out of the GPU when they've completed
>> execution.
> 
> Well there are quite a number of problems in this patch.
> 
>  From the design I think we should be getting rid of the linked list and

Sure, we can do this in a separate future patch. I'd imagine it'll
touch a lot of places and I didn't want this patch and this series
of patches to get out of hand, by changing too many things.

Here in this patch I wanted to change as little as possible.

> not extend its use. And we also don't want to offload the freeing of 
> jobs into a different thread because that could potentially mean that 
> this is executed on a different CPU.

Yes, of course it could.

From my experience working with hardware, I always envision work
being done by small units, in a pipeline, concurrently, while all
of them working concurrently, all the time.

It's hard to go back to unitary processing. :-)

> 
> Then one obvious problem seems to be that you don't take into account 
> that we moved the job freeing into the scheduler thread to make sure 
> that this is suspended while the scheduler thread is stopped. 

I don't understand what "this" refers to in "that this is suspended
while the scheduler thread is stopped."

> This 
> behavior is now completely gone, e.g. the delete thread keeps running 
> while the scheduler thread is stopped.

Yes, indeed, that is the case and intentional.

There seems to be no requirement to have to stop the main
scheduler thread, which pushes tasks down to the GPU,
so that we can free jobs. In other words, both
threads can work concurrently, one pushing jobs down
to the GPU, while the other freeing done jobs coming
out of the GPU.

If this concurrency is something you don't like,
then no problem, we can keep them interlocked in one
thread as before.

> 
> A few more comments below.
> 
>> Make use of the status returned by the GPU driver
>> timeout handler to decide whether to leave the job in
>> the pending list, or to send it off to the done list.
>> If a job is done, it is added to the done list and the
>> done thread woken up. If a job needs more time, it is
>> left on the pending list and the timeout timer
>> restarted.
>>
>> Eliminate the polling mechanism of picking out done
>> jobs from the pending list, i.e. eliminate
>> drm_sched_get_cleanup_job(). Now the main scheduler
>> thread only pushes jobs down to the GPU.
>>
>> Various other optimizations to the GPU scheduler
>> and job recovery are possible with this format.
>>
>> Signed-off-by: Luben Tuikov <luben.tuikov@amd.com>
>> ---
>>   drivers/gpu/drm/scheduler/sched_main.c | 173 +++++++++++++------------
>>   include/drm/gpu_scheduler.h            |  14 ++
>>   2 files changed, 101 insertions(+), 86 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c
>> index 3eb7618a627d..289ae68cd97f 100644
>> --- a/drivers/gpu/drm/scheduler/sched_main.c
>> +++ b/drivers/gpu/drm/scheduler/sched_main.c
>> @@ -164,7 +164,8 @@ drm_sched_rq_select_entity(struct drm_sched_rq *rq)
>>    * drm_sched_job_done - complete a job
>>    * @s_job: pointer to the job which is done
>>    *
>> - * Finish the job's fence and wake up the worker thread.
>> + * Finish the job's fence, move it to the done list,
>> + * and wake up the done thread.
>>    */
>>   static void drm_sched_job_done(struct drm_sched_job *s_job)
>>   {
>> @@ -179,7 +180,12 @@ static void drm_sched_job_done(struct drm_sched_job *s_job)
>>   	dma_fence_get(&s_fence->finished);
>>   	drm_sched_fence_finished(s_fence);
>>   	dma_fence_put(&s_fence->finished);
>> -	wake_up_interruptible(&sched->wake_up_worker);
>> +
>> +	spin_lock(&sched->job_list_lock);
>> +	list_move(&s_job->list, &sched->done_list);
>> +	spin_unlock(&sched->job_list_lock);
>> +
>> +	wake_up_interruptible(&sched->done_wait_q);
> 
> How is the worker thread then woken up to push new jobs to the hardware?

A-ha! Thank you Christian for bringing this up--perhaps that is what
the problem is I was seeing on my test machine, which I described
in the cover letter 0/6, that X/GDM just sleeping in wait.

So, I'd imagined that whomever pushed jobs down to DRM, i.e.
the producer of jobs, also did a "up"/"wake-up" of the main
scheduler thread, so that the main scheduler thread would
then wake up and "schedule" tasks down into the GPU. It seems
I've only "imagined" :-) such concurrency and the the main scheduler
thread needs to be woken up to poll? I'll try this next.
Thanks for the tip Christian!

> 
>>   }
>>   
>>   /**
>> @@ -221,11 +227,10 @@ bool drm_sched_dependency_optimized(struct dma_fence* fence,
>>   EXPORT_SYMBOL(drm_sched_dependency_optimized);
>>   
>>   /**
>> - * drm_sched_start_timeout - start timeout for reset worker
>> - *
>> - * @sched: scheduler instance to start the worker for
>> + * drm_sched_start_timeout - start a timeout timer
>> + * @sched: scheduler instance whose job we're timing
>>    *
>> - * Start the timeout for the given scheduler.
>> + * Start a timeout timer for the given scheduler.
>>    */
>>   static void drm_sched_start_timeout(struct drm_gpu_scheduler *sched)
>>   {
>> @@ -305,8 +310,8 @@ static void drm_sched_job_begin(struct drm_sched_job *s_job)
>>   
>>   	spin_lock(&sched->job_list_lock);
>>   	list_add_tail(&s_job->list, &sched->pending_list);
>> -	drm_sched_start_timeout(sched);
>>   	spin_unlock(&sched->job_list_lock);
>> +	drm_sched_start_timeout(sched);
> 
> This looks wrong, the drm_sched_start_timeout() function used to need 
> the lock. Why should that have changed?

I'd originally removed the check in drm_sched_start_timeout(),
of whether the "pending_list" is empty, because the use
of that function became more _deterministic_, with this patch.
By this I mean that the timeout timer is now started
_only_ when we push down new jobs.

But then I noticed that "full recovery" business in
in drm_sched_start(), which calls drm_sched_start_timeout(),
and put that !list_empty() check back in, and I seem to have
forgotten to move this back inside the lock.

I'll move it back in, no problem, thanks for catching this.

Regards,
Luben

> 
>>   }
>>   
>>   static void drm_sched_job_timedout(struct work_struct *work)
>> @@ -316,37 +321,30 @@ static void drm_sched_job_timedout(struct work_struct *work)
>>   
>>   	sched = container_of(work, struct drm_gpu_scheduler, work_tdr.work);
>>   
>> -	/* Protects against concurrent deletion in drm_sched_get_cleanup_job */
>>   	spin_lock(&sched->job_list_lock);
>>   	job = list_first_entry_or_null(&sched->pending_list,
>>   				       struct drm_sched_job, list);
>> +	spin_unlock(&sched->job_list_lock);
>>   
>>   	if (job) {
>> -		/*
>> -		 * Remove the bad job so it cannot be freed by concurrent
>> -		 * drm_sched_cleanup_jobs. It will be reinserted back after sched->thread
>> -		 * is parked at which point it's safe.
>> -		 */
>> -		list_del_init(&job->list);
>> -		spin_unlock(&sched->job_list_lock);
>> +		int res;
>>   
>> -		job->sched->ops->timedout_job(job);
>> +		job->job_status |= DRM_JOB_STATUS_TIMEOUT;
>> +		res = job->sched->ops->timedout_job(job);
>> +		if (res == 0) {
>> +			/* The job is out of the device.
>> +			 */
>> +			spin_lock(&sched->job_list_lock);
>> +			list_move(&job->list, &sched->done_list);
>> +			spin_unlock(&sched->job_list_lock);
>>   
>> -		/*
>> -		 * Guilty job did complete and hence needs to be manually removed
>> -		 * See drm_sched_stop doc.
>> -		 */
>> -		if (sched->free_guilty) {
>> -			job->sched->ops->free_job(job);
>> -			sched->free_guilty = false;
>> +			wake_up_interruptible(&sched->done_wait_q);
>> +		} else {
>> +			/* The job needs more time.
>> +			 */
>> +			drm_sched_start_timeout(sched);
>>   		}
>> -	} else {
>> -		spin_unlock(&sched->job_list_lock);
>>   	}
>> -
>> -	spin_lock(&sched->job_list_lock);
>> -	drm_sched_start_timeout(sched);
>> -	spin_unlock(&sched->job_list_lock);
>>   }
>>   
>>    /**
>> @@ -511,15 +509,13 @@ void drm_sched_start(struct drm_gpu_scheduler *sched, bool full_recovery)
>>   			else if (r)
>>   				DRM_ERROR("fence add callback failed (%d)\n",
>>   					  r);
>> -		} else
>> +		} else {
>>   			drm_sched_job_done(s_job);
>> +		}
>>   	}
>>   
>> -	if (full_recovery) {
>> -		spin_lock(&sched->job_list_lock);
>> +	if (full_recovery)
>>   		drm_sched_start_timeout(sched);
>> -		spin_unlock(&sched->job_list_lock);
> 
> Same here.
> 
> Regards,
> Christian.
> 
>> -	}
>>   
>>   	kthread_unpark(sched->thread);
>>   }
>> @@ -667,47 +663,6 @@ drm_sched_select_entity(struct drm_gpu_scheduler *sched)
>>   	return entity;
>>   }
>>   
>> -/**
>> - * drm_sched_get_cleanup_job - fetch the next finished job to be destroyed
>> - *
>> - * @sched: scheduler instance
>> - *
>> - * Returns the next finished job from the pending list (if there is one)
>> - * ready for it to be destroyed.
>> - */
>> -static struct drm_sched_job *
>> -drm_sched_get_cleanup_job(struct drm_gpu_scheduler *sched)
>> -{
>> -	struct drm_sched_job *job;
>> -
>> -	/*
>> -	 * Don't destroy jobs while the timeout worker is running  OR thread
>> -	 * is being parked and hence assumed to not touch pending_list
>> -	 */
>> -	if ((sched->timeout != MAX_SCHEDULE_TIMEOUT &&
>> -	    !cancel_delayed_work(&sched->work_tdr)) ||
>> -	    kthread_should_park())
>> -		return NULL;
>> -
>> -	spin_lock(&sched->job_list_lock);
>> -
>> -	job = list_first_entry_or_null(&sched->pending_list,
>> -				       struct drm_sched_job, list);
>> -
>> -	if (job && dma_fence_is_signaled(&job->s_fence->finished)) {
>> -		/* remove job from pending_list */
>> -		list_del_init(&job->list);
>> -	} else {
>> -		job = NULL;
>> -		/* queue timeout for next job */
>> -		drm_sched_start_timeout(sched);
>> -	}
>> -
>> -	spin_unlock(&sched->job_list_lock);
>> -
>> -	return job;
>> -}
>> -
>>   /**
>>    * drm_sched_pick_best - Get a drm sched from a sched_list with the least load
>>    * @sched_list: list of drm_gpu_schedulers
>> @@ -761,6 +716,44 @@ static bool drm_sched_blocked(struct drm_gpu_scheduler *sched)
>>   	return false;
>>   }
>>   
>> +/**
>> + * drm_sched_done - free done tasks
>> + * @param: pointer to a scheduler instance
>> + *
>> + * Returns 0.
>> + */
>> +static int drm_sched_done(void *param)
>> +{
>> +	struct drm_gpu_scheduler *sched = param;
>> +
>> +	do {
>> +		LIST_HEAD(done_q);
>> +
>> +		wait_event_interruptible(sched->done_wait_q,
>> +					 kthread_should_stop() ||
>> +					 !list_empty(&sched->done_list));
>> +
>> +		spin_lock(&sched->job_list_lock);
>> +		list_splice_init(&sched->done_list, &done_q);
>> +		spin_unlock(&sched->job_list_lock);
>> +
>> +		if (list_empty(&done_q))
>> +			continue;
>> +
>> +		while (!list_empty(&done_q)) {
>> +			struct drm_sched_job *job;
>> +
>> +			job = list_first_entry(&done_q,
>> +					       struct drm_sched_job,
>> +					       list);
>> +			list_del_init(&job->list);
>> +			sched->ops->free_job(job);
>> +		}
>> +	} while (!kthread_should_stop());
>> +
>> +	return 0;
>> +}
>> +
>>   /**
>>    * drm_sched_main - main scheduler thread
>>    *
>> @@ -770,7 +763,7 @@ static bool drm_sched_blocked(struct drm_gpu_scheduler *sched)
>>    */
>>   static int drm_sched_main(void *param)
>>   {
>> -	struct drm_gpu_scheduler *sched = (struct drm_gpu_scheduler *)param;
>> +	struct drm_gpu_scheduler *sched = param;
>>   	int r;
>>   
>>   	sched_set_fifo_low(current);
>> @@ -780,20 +773,12 @@ static int drm_sched_main(void *param)
>>   		struct drm_sched_fence *s_fence;
>>   		struct drm_sched_job *sched_job;
>>   		struct dma_fence *fence;
>> -		struct drm_sched_job *cleanup_job = NULL;
>>   
>>   		wait_event_interruptible(sched->wake_up_worker,
>> -					 (cleanup_job = drm_sched_get_cleanup_job(sched)) ||
>>   					 (!drm_sched_blocked(sched) &&
>>   					  (entity = drm_sched_select_entity(sched))) ||
>>   					 kthread_should_stop());
>>   
>> -		if (cleanup_job) {
>> -			sched->ops->free_job(cleanup_job);
>> -			/* queue timeout for next job */
>> -			drm_sched_start_timeout(sched);
>> -		}
>> -
>>   		if (!entity)
>>   			continue;
>>   
>> @@ -820,8 +805,7 @@ static int drm_sched_main(void *param)
>>   			if (r == -ENOENT)
>>   				drm_sched_job_done(sched_job);
>>   			else if (r)
>> -				DRM_ERROR("fence add callback failed (%d)\n",
>> -					  r);
>> +				DRM_ERROR("fence add callback failed (%d)\n", r);
>>   			dma_fence_put(fence);
>>   		} else {
>>   			if (IS_ERR(fence))
>> @@ -865,7 +849,9 @@ int drm_sched_init(struct drm_gpu_scheduler *sched,
>>   
>>   	init_waitqueue_head(&sched->wake_up_worker);
>>   	init_waitqueue_head(&sched->job_scheduled);
>> +	init_waitqueue_head(&sched->done_wait_q);
>>   	INIT_LIST_HEAD(&sched->pending_list);
>> +	INIT_LIST_HEAD(&sched->done_list);
>>   	spin_lock_init(&sched->job_list_lock);
>>   	atomic_set(&sched->hw_rq_count, 0);
>>   	INIT_DELAYED_WORK(&sched->work_tdr, drm_sched_job_timedout);
>> @@ -881,6 +867,21 @@ int drm_sched_init(struct drm_gpu_scheduler *sched,
>>   		return ret;
>>   	}
>>   
>> +	snprintf(sched->thread_done_name, DRM_THREAD_NAME_LEN, "%s%s",
>> +		 sched->name, "-done");
>> +	sched->thread_done_name[DRM_THREAD_NAME_LEN - 1] = '\0';
>> +	sched->thread_done = kthread_run(drm_sched_done, sched,
>> +					 sched->thread_done_name);
>> +	if (IS_ERR(sched->thread_done)) {
>> +		ret = kthread_stop(sched->thread);
>> +		if (!ret) {
>> +			/* free_kthread_struct(sched->thread); */
>> +			sched->thread = NULL;
>> +		}
>> +		DRM_ERROR("Failed to start thread %s", sched->thread_done_name);
>> +		return ret;
>> +	}
>> +
>>   	sched->ready = true;
>>   	return 0;
>>   }
>> diff --git a/include/drm/gpu_scheduler.h b/include/drm/gpu_scheduler.h
>> index 3a5686c3b5e9..b282d6158b50 100644
>> --- a/include/drm/gpu_scheduler.h
>> +++ b/include/drm/gpu_scheduler.h
>> @@ -169,6 +169,12 @@ struct drm_sched_fence {
>>   
>>   struct drm_sched_fence *to_drm_sched_fence(struct dma_fence *f);
>>   
>> +enum drm_job_status {
>> +	DRM_JOB_STATUS_NONE    = 0 << 0,
>> +	DRM_JOB_STATUS_DONE    = 1 << 0,
>> +	DRM_JOB_STATUS_TIMEOUT = 1 << 1,
>> +};
>> +
>>   /**
>>    * struct drm_sched_job - A job to be run by an entity.
>>    *
>> @@ -198,6 +204,7 @@ struct drm_sched_job {
>>   	uint64_t			id;
>>   	atomic_t			karma;
>>   	enum drm_sched_priority		s_priority;
>> +	enum drm_job_status             job_status;
>>   	struct drm_sched_entity         *entity;
>>   	struct dma_fence_cb		cb;
>>   };
>> @@ -284,15 +291,22 @@ struct drm_gpu_scheduler {
>>   	uint32_t			hw_submission_limit;
>>   	long				timeout;
>>   	const char			*name;
>> +	char                            thread_done_name[DRM_THREAD_NAME_LEN];
>> +
>>   	struct drm_sched_rq		sched_rq[DRM_SCHED_PRIORITY_COUNT];
>>   	wait_queue_head_t		wake_up_worker;
>>   	wait_queue_head_t		job_scheduled;
>> +	wait_queue_head_t               done_wait_q;
>>   	atomic_t			hw_rq_count;
>>   	atomic64_t			job_id_count;
>>   	struct delayed_work		work_tdr;
>>   	struct task_struct		*thread;
>> +	struct task_struct		*thread_done;
>> +
>>   	struct list_head		pending_list;
>> +	struct list_head                done_list;
>>   	spinlock_t			job_list_lock;
>> +
>>   	int				hang_limit;
>>   	atomic_t                        score;
>>   	bool				ready;
> 

_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 125+ messages in thread

* Re: [PATCH 6/6] drm/sched: Make use of a "done" thread
  2020-11-25 11:09                                         ` Steven Price
@ 2020-11-26  0:30                                           ` Luben Tuikov
  -1 siblings, 0 replies; 125+ messages in thread
From: Luben Tuikov @ 2020-11-26  0:30 UTC (permalink / raw)
  To: Steven Price, Andrey Grodzovsky, Christian König,
	Lucas Stach, Alexander Deucher
  Cc: Emily Deng, amd-gfx, dri-devel

On 2020-11-25 06:09, Steven Price wrote:
> On 25/11/2020 03:17, Luben Tuikov wrote:
>> Add a "done" list to which all completed jobs are added
>> to be freed. The drm_sched_job_done() callback is the
>> producer of jobs to this list.
>>
>> Add a "done" thread which consumes from the done list
>> and frees up jobs. Now, the main scheduler thread only
>> pushes jobs to the GPU and the "done" thread frees them
>> up, on the way out of the GPU when they've completed
>> execution.
> 
> Generally I'd be in favour of a "done thread" as I think there are some 
> murky corners of Panfrost's locking that would be helped by deferring 
> the free_job() callback.

Check my response to his email.

It seems you're okay with a separate thread, when both threads
could be working concurrently, and Christian wants
a single thread doing all this. You should probably address
this in a follow-up to his email, so this can be hashed out.

> 
> But I think you're trying to do too much in one patch here. And as 
> Christian has pointed out there's some dodgy looking changes to locking 
> which aren't explained.

I've addressed this in my response to his email, check it out.

So, if you're in favour of a separate thread working concurrently,
please follow up to his email, so this can be hashed out.

Thanks and Regards,
Luben

> 
> Steve
> 
>>
>> Make use of the status returned by the GPU driver
>> timeout handler to decide whether to leave the job in
>> the pending list, or to send it off to the done list.
>> If a job is done, it is added to the done list and the
>> done thread woken up. If a job needs more time, it is
>> left on the pending list and the timeout timer
>> restarted.
>>
>> Eliminate the polling mechanism of picking out done
>> jobs from the pending list, i.e. eliminate
>> drm_sched_get_cleanup_job(). Now the main scheduler
>> thread only pushes jobs down to the GPU.
>>
>> Various other optimizations to the GPU scheduler
>> and job recovery are possible with this format.
>>
>> Signed-off-by: Luben Tuikov <luben.tuikov@amd.com>
>> ---
>>   drivers/gpu/drm/scheduler/sched_main.c | 173 +++++++++++++------------
>>   include/drm/gpu_scheduler.h            |  14 ++
>>   2 files changed, 101 insertions(+), 86 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c
>> index 3eb7618a627d..289ae68cd97f 100644
>> --- a/drivers/gpu/drm/scheduler/sched_main.c
>> +++ b/drivers/gpu/drm/scheduler/sched_main.c
>> @@ -164,7 +164,8 @@ drm_sched_rq_select_entity(struct drm_sched_rq *rq)
>>    * drm_sched_job_done - complete a job
>>    * @s_job: pointer to the job which is done
>>    *
>> - * Finish the job's fence and wake up the worker thread.
>> + * Finish the job's fence, move it to the done list,
>> + * and wake up the done thread.
>>    */
>>   static void drm_sched_job_done(struct drm_sched_job *s_job)
>>   {
>> @@ -179,7 +180,12 @@ static void drm_sched_job_done(struct drm_sched_job *s_job)
>>   	dma_fence_get(&s_fence->finished);
>>   	drm_sched_fence_finished(s_fence);
>>   	dma_fence_put(&s_fence->finished);
>> -	wake_up_interruptible(&sched->wake_up_worker);
>> +
>> +	spin_lock(&sched->job_list_lock);
>> +	list_move(&s_job->list, &sched->done_list);
>> +	spin_unlock(&sched->job_list_lock);
>> +
>> +	wake_up_interruptible(&sched->done_wait_q);
>>   }
>>   
>>   /**
>> @@ -221,11 +227,10 @@ bool drm_sched_dependency_optimized(struct dma_fence* fence,
>>   EXPORT_SYMBOL(drm_sched_dependency_optimized);
>>   
>>   /**
>> - * drm_sched_start_timeout - start timeout for reset worker
>> - *
>> - * @sched: scheduler instance to start the worker for
>> + * drm_sched_start_timeout - start a timeout timer
>> + * @sched: scheduler instance whose job we're timing
>>    *
>> - * Start the timeout for the given scheduler.
>> + * Start a timeout timer for the given scheduler.
>>    */
>>   static void drm_sched_start_timeout(struct drm_gpu_scheduler *sched)
>>   {
>> @@ -305,8 +310,8 @@ static void drm_sched_job_begin(struct drm_sched_job *s_job)
>>   
>>   	spin_lock(&sched->job_list_lock);
>>   	list_add_tail(&s_job->list, &sched->pending_list);
>> -	drm_sched_start_timeout(sched);
>>   	spin_unlock(&sched->job_list_lock);
>> +	drm_sched_start_timeout(sched);
>>   }
>>   
>>   static void drm_sched_job_timedout(struct work_struct *work)
>> @@ -316,37 +321,30 @@ static void drm_sched_job_timedout(struct work_struct *work)
>>   
>>   	sched = container_of(work, struct drm_gpu_scheduler, work_tdr.work);
>>   
>> -	/* Protects against concurrent deletion in drm_sched_get_cleanup_job */
>>   	spin_lock(&sched->job_list_lock);
>>   	job = list_first_entry_or_null(&sched->pending_list,
>>   				       struct drm_sched_job, list);
>> +	spin_unlock(&sched->job_list_lock);
>>   
>>   	if (job) {
>> -		/*
>> -		 * Remove the bad job so it cannot be freed by concurrent
>> -		 * drm_sched_cleanup_jobs. It will be reinserted back after sched->thread
>> -		 * is parked at which point it's safe.
>> -		 */
>> -		list_del_init(&job->list);
>> -		spin_unlock(&sched->job_list_lock);
>> +		int res;
>>   
>> -		job->sched->ops->timedout_job(job);
>> +		job->job_status |= DRM_JOB_STATUS_TIMEOUT;
>> +		res = job->sched->ops->timedout_job(job);
>> +		if (res == 0) {
>> +			/* The job is out of the device.
>> +			 */
>> +			spin_lock(&sched->job_list_lock);
>> +			list_move(&job->list, &sched->done_list);
>> +			spin_unlock(&sched->job_list_lock);
>>   
>> -		/*
>> -		 * Guilty job did complete and hence needs to be manually removed
>> -		 * See drm_sched_stop doc.
>> -		 */
>> -		if (sched->free_guilty) {
>> -			job->sched->ops->free_job(job);
>> -			sched->free_guilty = false;
>> +			wake_up_interruptible(&sched->done_wait_q);
>> +		} else {
>> +			/* The job needs more time.
>> +			 */
>> +			drm_sched_start_timeout(sched);
>>   		}
>> -	} else {
>> -		spin_unlock(&sched->job_list_lock);
>>   	}
>> -
>> -	spin_lock(&sched->job_list_lock);
>> -	drm_sched_start_timeout(sched);
>> -	spin_unlock(&sched->job_list_lock);
>>   }
>>   
>>    /**
>> @@ -511,15 +509,13 @@ void drm_sched_start(struct drm_gpu_scheduler *sched, bool full_recovery)
>>   			else if (r)
>>   				DRM_ERROR("fence add callback failed (%d)\n",
>>   					  r);
>> -		} else
>> +		} else {
>>   			drm_sched_job_done(s_job);
>> +		}
>>   	}
>>   
>> -	if (full_recovery) {
>> -		spin_lock(&sched->job_list_lock);
>> +	if (full_recovery)
>>   		drm_sched_start_timeout(sched);
>> -		spin_unlock(&sched->job_list_lock);
>> -	}
>>   
>>   	kthread_unpark(sched->thread);
>>   }
>> @@ -667,47 +663,6 @@ drm_sched_select_entity(struct drm_gpu_scheduler *sched)
>>   	return entity;
>>   }
>>   
>> -/**
>> - * drm_sched_get_cleanup_job - fetch the next finished job to be destroyed
>> - *
>> - * @sched: scheduler instance
>> - *
>> - * Returns the next finished job from the pending list (if there is one)
>> - * ready for it to be destroyed.
>> - */
>> -static struct drm_sched_job *
>> -drm_sched_get_cleanup_job(struct drm_gpu_scheduler *sched)
>> -{
>> -	struct drm_sched_job *job;
>> -
>> -	/*
>> -	 * Don't destroy jobs while the timeout worker is running  OR thread
>> -	 * is being parked and hence assumed to not touch pending_list
>> -	 */
>> -	if ((sched->timeout != MAX_SCHEDULE_TIMEOUT &&
>> -	    !cancel_delayed_work(&sched->work_tdr)) ||
>> -	    kthread_should_park())
>> -		return NULL;
>> -
>> -	spin_lock(&sched->job_list_lock);
>> -
>> -	job = list_first_entry_or_null(&sched->pending_list,
>> -				       struct drm_sched_job, list);
>> -
>> -	if (job && dma_fence_is_signaled(&job->s_fence->finished)) {
>> -		/* remove job from pending_list */
>> -		list_del_init(&job->list);
>> -	} else {
>> -		job = NULL;
>> -		/* queue timeout for next job */
>> -		drm_sched_start_timeout(sched);
>> -	}
>> -
>> -	spin_unlock(&sched->job_list_lock);
>> -
>> -	return job;
>> -}
>> -
>>   /**
>>    * drm_sched_pick_best - Get a drm sched from a sched_list with the least load
>>    * @sched_list: list of drm_gpu_schedulers
>> @@ -761,6 +716,44 @@ static bool drm_sched_blocked(struct drm_gpu_scheduler *sched)
>>   	return false;
>>   }
>>   
>> +/**
>> + * drm_sched_done - free done tasks
>> + * @param: pointer to a scheduler instance
>> + *
>> + * Returns 0.
>> + */
>> +static int drm_sched_done(void *param)
>> +{
>> +	struct drm_gpu_scheduler *sched = param;
>> +
>> +	do {
>> +		LIST_HEAD(done_q);
>> +
>> +		wait_event_interruptible(sched->done_wait_q,
>> +					 kthread_should_stop() ||
>> +					 !list_empty(&sched->done_list));
>> +
>> +		spin_lock(&sched->job_list_lock);
>> +		list_splice_init(&sched->done_list, &done_q);
>> +		spin_unlock(&sched->job_list_lock);
>> +
>> +		if (list_empty(&done_q))
>> +			continue;
>> +
>> +		while (!list_empty(&done_q)) {
>> +			struct drm_sched_job *job;
>> +
>> +			job = list_first_entry(&done_q,
>> +					       struct drm_sched_job,
>> +					       list);
>> +			list_del_init(&job->list);
>> +			sched->ops->free_job(job);
>> +		}
>> +	} while (!kthread_should_stop());
>> +
>> +	return 0;
>> +}
>> +
>>   /**
>>    * drm_sched_main - main scheduler thread
>>    *
>> @@ -770,7 +763,7 @@ static bool drm_sched_blocked(struct drm_gpu_scheduler *sched)
>>    */
>>   static int drm_sched_main(void *param)
>>   {
>> -	struct drm_gpu_scheduler *sched = (struct drm_gpu_scheduler *)param;
>> +	struct drm_gpu_scheduler *sched = param;
>>   	int r;
>>   
>>   	sched_set_fifo_low(current);
>> @@ -780,20 +773,12 @@ static int drm_sched_main(void *param)
>>   		struct drm_sched_fence *s_fence;
>>   		struct drm_sched_job *sched_job;
>>   		struct dma_fence *fence;
>> -		struct drm_sched_job *cleanup_job = NULL;
>>   
>>   		wait_event_interruptible(sched->wake_up_worker,
>> -					 (cleanup_job = drm_sched_get_cleanup_job(sched)) ||
>>   					 (!drm_sched_blocked(sched) &&
>>   					  (entity = drm_sched_select_entity(sched))) ||
>>   					 kthread_should_stop());
>>   
>> -		if (cleanup_job) {
>> -			sched->ops->free_job(cleanup_job);
>> -			/* queue timeout for next job */
>> -			drm_sched_start_timeout(sched);
>> -		}
>> -
>>   		if (!entity)
>>   			continue;
>>   
>> @@ -820,8 +805,7 @@ static int drm_sched_main(void *param)
>>   			if (r == -ENOENT)
>>   				drm_sched_job_done(sched_job);
>>   			else if (r)
>> -				DRM_ERROR("fence add callback failed (%d)\n",
>> -					  r);
>> +				DRM_ERROR("fence add callback failed (%d)\n", r);
>>   			dma_fence_put(fence);
>>   		} else {
>>   			if (IS_ERR(fence))
>> @@ -865,7 +849,9 @@ int drm_sched_init(struct drm_gpu_scheduler *sched,
>>   
>>   	init_waitqueue_head(&sched->wake_up_worker);
>>   	init_waitqueue_head(&sched->job_scheduled);
>> +	init_waitqueue_head(&sched->done_wait_q);
>>   	INIT_LIST_HEAD(&sched->pending_list);
>> +	INIT_LIST_HEAD(&sched->done_list);
>>   	spin_lock_init(&sched->job_list_lock);
>>   	atomic_set(&sched->hw_rq_count, 0);
>>   	INIT_DELAYED_WORK(&sched->work_tdr, drm_sched_job_timedout);
>> @@ -881,6 +867,21 @@ int drm_sched_init(struct drm_gpu_scheduler *sched,
>>   		return ret;
>>   	}
>>   
>> +	snprintf(sched->thread_done_name, DRM_THREAD_NAME_LEN, "%s%s",
>> +		 sched->name, "-done");
>> +	sched->thread_done_name[DRM_THREAD_NAME_LEN - 1] = '\0';
>> +	sched->thread_done = kthread_run(drm_sched_done, sched,
>> +					 sched->thread_done_name);
>> +	if (IS_ERR(sched->thread_done)) {
>> +		ret = kthread_stop(sched->thread);
>> +		if (!ret) {
>> +			/* free_kthread_struct(sched->thread); */
>> +			sched->thread = NULL;
>> +		}
>> +		DRM_ERROR("Failed to start thread %s", sched->thread_done_name);
>> +		return ret;
>> +	}
>> +
>>   	sched->ready = true;
>>   	return 0;
>>   }
>> diff --git a/include/drm/gpu_scheduler.h b/include/drm/gpu_scheduler.h
>> index 3a5686c3b5e9..b282d6158b50 100644
>> --- a/include/drm/gpu_scheduler.h
>> +++ b/include/drm/gpu_scheduler.h
>> @@ -169,6 +169,12 @@ struct drm_sched_fence {
>>   
>>   struct drm_sched_fence *to_drm_sched_fence(struct dma_fence *f);
>>   
>> +enum drm_job_status {
>> +	DRM_JOB_STATUS_NONE    = 0 << 0,
>> +	DRM_JOB_STATUS_DONE    = 1 << 0,
>> +	DRM_JOB_STATUS_TIMEOUT = 1 << 1,
>> +};
>> +
>>   /**
>>    * struct drm_sched_job - A job to be run by an entity.
>>    *
>> @@ -198,6 +204,7 @@ struct drm_sched_job {
>>   	uint64_t			id;
>>   	atomic_t			karma;
>>   	enum drm_sched_priority		s_priority;
>> +	enum drm_job_status             job_status;
>>   	struct drm_sched_entity         *entity;
>>   	struct dma_fence_cb		cb;
>>   };
>> @@ -284,15 +291,22 @@ struct drm_gpu_scheduler {
>>   	uint32_t			hw_submission_limit;
>>   	long				timeout;
>>   	const char			*name;
>> +	char                            thread_done_name[DRM_THREAD_NAME_LEN];
>> +
>>   	struct drm_sched_rq		sched_rq[DRM_SCHED_PRIORITY_COUNT];
>>   	wait_queue_head_t		wake_up_worker;
>>   	wait_queue_head_t		job_scheduled;
>> +	wait_queue_head_t               done_wait_q;
>>   	atomic_t			hw_rq_count;
>>   	atomic64_t			job_id_count;
>>   	struct delayed_work		work_tdr;
>>   	struct task_struct		*thread;
>> +	struct task_struct		*thread_done;
>> +
>>   	struct list_head		pending_list;
>> +	struct list_head                done_list;
>>   	spinlock_t			job_list_lock;
>> +
>>   	int				hang_limit;
>>   	atomic_t                        score;
>>   	bool				ready;
>>
> 

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 125+ messages in thread

* Re: [PATCH 6/6] drm/sched: Make use of a "done" thread
@ 2020-11-26  0:30                                           ` Luben Tuikov
  0 siblings, 0 replies; 125+ messages in thread
From: Luben Tuikov @ 2020-11-26  0:30 UTC (permalink / raw)
  To: Steven Price, Andrey Grodzovsky, Christian König,
	Lucas Stach, Alexander Deucher
  Cc: Emily Deng, amd-gfx, dri-devel

On 2020-11-25 06:09, Steven Price wrote:
> On 25/11/2020 03:17, Luben Tuikov wrote:
>> Add a "done" list to which all completed jobs are added
>> to be freed. The drm_sched_job_done() callback is the
>> producer of jobs to this list.
>>
>> Add a "done" thread which consumes from the done list
>> and frees up jobs. Now, the main scheduler thread only
>> pushes jobs to the GPU and the "done" thread frees them
>> up, on the way out of the GPU when they've completed
>> execution.
> 
> Generally I'd be in favour of a "done thread" as I think there are some 
> murky corners of Panfrost's locking that would be helped by deferring 
> the free_job() callback.

Check my response to his email.

It seems you're okay with a separate thread, when both threads
could be working concurrently, and Christian wants
a single thread doing all this. You should probably address
this in a follow-up to his email, so this can be hashed out.

> 
> But I think you're trying to do too much in one patch here. And as 
> Christian has pointed out there's some dodgy looking changes to locking 
> which aren't explained.

I've addressed this in my response to his email, check it out.

So, if you're in favour of a separate thread working concurrently,
please follow up to his email, so this can be hashed out.

Thanks and Regards,
Luben

> 
> Steve
> 
>>
>> Make use of the status returned by the GPU driver
>> timeout handler to decide whether to leave the job in
>> the pending list, or to send it off to the done list.
>> If a job is done, it is added to the done list and the
>> done thread woken up. If a job needs more time, it is
>> left on the pending list and the timeout timer
>> restarted.
>>
>> Eliminate the polling mechanism of picking out done
>> jobs from the pending list, i.e. eliminate
>> drm_sched_get_cleanup_job(). Now the main scheduler
>> thread only pushes jobs down to the GPU.
>>
>> Various other optimizations to the GPU scheduler
>> and job recovery are possible with this format.
>>
>> Signed-off-by: Luben Tuikov <luben.tuikov@amd.com>
>> ---
>>   drivers/gpu/drm/scheduler/sched_main.c | 173 +++++++++++++------------
>>   include/drm/gpu_scheduler.h            |  14 ++
>>   2 files changed, 101 insertions(+), 86 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c
>> index 3eb7618a627d..289ae68cd97f 100644
>> --- a/drivers/gpu/drm/scheduler/sched_main.c
>> +++ b/drivers/gpu/drm/scheduler/sched_main.c
>> @@ -164,7 +164,8 @@ drm_sched_rq_select_entity(struct drm_sched_rq *rq)
>>    * drm_sched_job_done - complete a job
>>    * @s_job: pointer to the job which is done
>>    *
>> - * Finish the job's fence and wake up the worker thread.
>> + * Finish the job's fence, move it to the done list,
>> + * and wake up the done thread.
>>    */
>>   static void drm_sched_job_done(struct drm_sched_job *s_job)
>>   {
>> @@ -179,7 +180,12 @@ static void drm_sched_job_done(struct drm_sched_job *s_job)
>>   	dma_fence_get(&s_fence->finished);
>>   	drm_sched_fence_finished(s_fence);
>>   	dma_fence_put(&s_fence->finished);
>> -	wake_up_interruptible(&sched->wake_up_worker);
>> +
>> +	spin_lock(&sched->job_list_lock);
>> +	list_move(&s_job->list, &sched->done_list);
>> +	spin_unlock(&sched->job_list_lock);
>> +
>> +	wake_up_interruptible(&sched->done_wait_q);
>>   }
>>   
>>   /**
>> @@ -221,11 +227,10 @@ bool drm_sched_dependency_optimized(struct dma_fence* fence,
>>   EXPORT_SYMBOL(drm_sched_dependency_optimized);
>>   
>>   /**
>> - * drm_sched_start_timeout - start timeout for reset worker
>> - *
>> - * @sched: scheduler instance to start the worker for
>> + * drm_sched_start_timeout - start a timeout timer
>> + * @sched: scheduler instance whose job we're timing
>>    *
>> - * Start the timeout for the given scheduler.
>> + * Start a timeout timer for the given scheduler.
>>    */
>>   static void drm_sched_start_timeout(struct drm_gpu_scheduler *sched)
>>   {
>> @@ -305,8 +310,8 @@ static void drm_sched_job_begin(struct drm_sched_job *s_job)
>>   
>>   	spin_lock(&sched->job_list_lock);
>>   	list_add_tail(&s_job->list, &sched->pending_list);
>> -	drm_sched_start_timeout(sched);
>>   	spin_unlock(&sched->job_list_lock);
>> +	drm_sched_start_timeout(sched);
>>   }
>>   
>>   static void drm_sched_job_timedout(struct work_struct *work)
>> @@ -316,37 +321,30 @@ static void drm_sched_job_timedout(struct work_struct *work)
>>   
>>   	sched = container_of(work, struct drm_gpu_scheduler, work_tdr.work);
>>   
>> -	/* Protects against concurrent deletion in drm_sched_get_cleanup_job */
>>   	spin_lock(&sched->job_list_lock);
>>   	job = list_first_entry_or_null(&sched->pending_list,
>>   				       struct drm_sched_job, list);
>> +	spin_unlock(&sched->job_list_lock);
>>   
>>   	if (job) {
>> -		/*
>> -		 * Remove the bad job so it cannot be freed by concurrent
>> -		 * drm_sched_cleanup_jobs. It will be reinserted back after sched->thread
>> -		 * is parked at which point it's safe.
>> -		 */
>> -		list_del_init(&job->list);
>> -		spin_unlock(&sched->job_list_lock);
>> +		int res;
>>   
>> -		job->sched->ops->timedout_job(job);
>> +		job->job_status |= DRM_JOB_STATUS_TIMEOUT;
>> +		res = job->sched->ops->timedout_job(job);
>> +		if (res == 0) {
>> +			/* The job is out of the device.
>> +			 */
>> +			spin_lock(&sched->job_list_lock);
>> +			list_move(&job->list, &sched->done_list);
>> +			spin_unlock(&sched->job_list_lock);
>>   
>> -		/*
>> -		 * Guilty job did complete and hence needs to be manually removed
>> -		 * See drm_sched_stop doc.
>> -		 */
>> -		if (sched->free_guilty) {
>> -			job->sched->ops->free_job(job);
>> -			sched->free_guilty = false;
>> +			wake_up_interruptible(&sched->done_wait_q);
>> +		} else {
>> +			/* The job needs more time.
>> +			 */
>> +			drm_sched_start_timeout(sched);
>>   		}
>> -	} else {
>> -		spin_unlock(&sched->job_list_lock);
>>   	}
>> -
>> -	spin_lock(&sched->job_list_lock);
>> -	drm_sched_start_timeout(sched);
>> -	spin_unlock(&sched->job_list_lock);
>>   }
>>   
>>    /**
>> @@ -511,15 +509,13 @@ void drm_sched_start(struct drm_gpu_scheduler *sched, bool full_recovery)
>>   			else if (r)
>>   				DRM_ERROR("fence add callback failed (%d)\n",
>>   					  r);
>> -		} else
>> +		} else {
>>   			drm_sched_job_done(s_job);
>> +		}
>>   	}
>>   
>> -	if (full_recovery) {
>> -		spin_lock(&sched->job_list_lock);
>> +	if (full_recovery)
>>   		drm_sched_start_timeout(sched);
>> -		spin_unlock(&sched->job_list_lock);
>> -	}
>>   
>>   	kthread_unpark(sched->thread);
>>   }
>> @@ -667,47 +663,6 @@ drm_sched_select_entity(struct drm_gpu_scheduler *sched)
>>   	return entity;
>>   }
>>   
>> -/**
>> - * drm_sched_get_cleanup_job - fetch the next finished job to be destroyed
>> - *
>> - * @sched: scheduler instance
>> - *
>> - * Returns the next finished job from the pending list (if there is one)
>> - * ready for it to be destroyed.
>> - */
>> -static struct drm_sched_job *
>> -drm_sched_get_cleanup_job(struct drm_gpu_scheduler *sched)
>> -{
>> -	struct drm_sched_job *job;
>> -
>> -	/*
>> -	 * Don't destroy jobs while the timeout worker is running  OR thread
>> -	 * is being parked and hence assumed to not touch pending_list
>> -	 */
>> -	if ((sched->timeout != MAX_SCHEDULE_TIMEOUT &&
>> -	    !cancel_delayed_work(&sched->work_tdr)) ||
>> -	    kthread_should_park())
>> -		return NULL;
>> -
>> -	spin_lock(&sched->job_list_lock);
>> -
>> -	job = list_first_entry_or_null(&sched->pending_list,
>> -				       struct drm_sched_job, list);
>> -
>> -	if (job && dma_fence_is_signaled(&job->s_fence->finished)) {
>> -		/* remove job from pending_list */
>> -		list_del_init(&job->list);
>> -	} else {
>> -		job = NULL;
>> -		/* queue timeout for next job */
>> -		drm_sched_start_timeout(sched);
>> -	}
>> -
>> -	spin_unlock(&sched->job_list_lock);
>> -
>> -	return job;
>> -}
>> -
>>   /**
>>    * drm_sched_pick_best - Get a drm sched from a sched_list with the least load
>>    * @sched_list: list of drm_gpu_schedulers
>> @@ -761,6 +716,44 @@ static bool drm_sched_blocked(struct drm_gpu_scheduler *sched)
>>   	return false;
>>   }
>>   
>> +/**
>> + * drm_sched_done - free done tasks
>> + * @param: pointer to a scheduler instance
>> + *
>> + * Returns 0.
>> + */
>> +static int drm_sched_done(void *param)
>> +{
>> +	struct drm_gpu_scheduler *sched = param;
>> +
>> +	do {
>> +		LIST_HEAD(done_q);
>> +
>> +		wait_event_interruptible(sched->done_wait_q,
>> +					 kthread_should_stop() ||
>> +					 !list_empty(&sched->done_list));
>> +
>> +		spin_lock(&sched->job_list_lock);
>> +		list_splice_init(&sched->done_list, &done_q);
>> +		spin_unlock(&sched->job_list_lock);
>> +
>> +		if (list_empty(&done_q))
>> +			continue;
>> +
>> +		while (!list_empty(&done_q)) {
>> +			struct drm_sched_job *job;
>> +
>> +			job = list_first_entry(&done_q,
>> +					       struct drm_sched_job,
>> +					       list);
>> +			list_del_init(&job->list);
>> +			sched->ops->free_job(job);
>> +		}
>> +	} while (!kthread_should_stop());
>> +
>> +	return 0;
>> +}
>> +
>>   /**
>>    * drm_sched_main - main scheduler thread
>>    *
>> @@ -770,7 +763,7 @@ static bool drm_sched_blocked(struct drm_gpu_scheduler *sched)
>>    */
>>   static int drm_sched_main(void *param)
>>   {
>> -	struct drm_gpu_scheduler *sched = (struct drm_gpu_scheduler *)param;
>> +	struct drm_gpu_scheduler *sched = param;
>>   	int r;
>>   
>>   	sched_set_fifo_low(current);
>> @@ -780,20 +773,12 @@ static int drm_sched_main(void *param)
>>   		struct drm_sched_fence *s_fence;
>>   		struct drm_sched_job *sched_job;
>>   		struct dma_fence *fence;
>> -		struct drm_sched_job *cleanup_job = NULL;
>>   
>>   		wait_event_interruptible(sched->wake_up_worker,
>> -					 (cleanup_job = drm_sched_get_cleanup_job(sched)) ||
>>   					 (!drm_sched_blocked(sched) &&
>>   					  (entity = drm_sched_select_entity(sched))) ||
>>   					 kthread_should_stop());
>>   
>> -		if (cleanup_job) {
>> -			sched->ops->free_job(cleanup_job);
>> -			/* queue timeout for next job */
>> -			drm_sched_start_timeout(sched);
>> -		}
>> -
>>   		if (!entity)
>>   			continue;
>>   
>> @@ -820,8 +805,7 @@ static int drm_sched_main(void *param)
>>   			if (r == -ENOENT)
>>   				drm_sched_job_done(sched_job);
>>   			else if (r)
>> -				DRM_ERROR("fence add callback failed (%d)\n",
>> -					  r);
>> +				DRM_ERROR("fence add callback failed (%d)\n", r);
>>   			dma_fence_put(fence);
>>   		} else {
>>   			if (IS_ERR(fence))
>> @@ -865,7 +849,9 @@ int drm_sched_init(struct drm_gpu_scheduler *sched,
>>   
>>   	init_waitqueue_head(&sched->wake_up_worker);
>>   	init_waitqueue_head(&sched->job_scheduled);
>> +	init_waitqueue_head(&sched->done_wait_q);
>>   	INIT_LIST_HEAD(&sched->pending_list);
>> +	INIT_LIST_HEAD(&sched->done_list);
>>   	spin_lock_init(&sched->job_list_lock);
>>   	atomic_set(&sched->hw_rq_count, 0);
>>   	INIT_DELAYED_WORK(&sched->work_tdr, drm_sched_job_timedout);
>> @@ -881,6 +867,21 @@ int drm_sched_init(struct drm_gpu_scheduler *sched,
>>   		return ret;
>>   	}
>>   
>> +	snprintf(sched->thread_done_name, DRM_THREAD_NAME_LEN, "%s%s",
>> +		 sched->name, "-done");
>> +	sched->thread_done_name[DRM_THREAD_NAME_LEN - 1] = '\0';
>> +	sched->thread_done = kthread_run(drm_sched_done, sched,
>> +					 sched->thread_done_name);
>> +	if (IS_ERR(sched->thread_done)) {
>> +		ret = kthread_stop(sched->thread);
>> +		if (!ret) {
>> +			/* free_kthread_struct(sched->thread); */
>> +			sched->thread = NULL;
>> +		}
>> +		DRM_ERROR("Failed to start thread %s", sched->thread_done_name);
>> +		return ret;
>> +	}
>> +
>>   	sched->ready = true;
>>   	return 0;
>>   }
>> diff --git a/include/drm/gpu_scheduler.h b/include/drm/gpu_scheduler.h
>> index 3a5686c3b5e9..b282d6158b50 100644
>> --- a/include/drm/gpu_scheduler.h
>> +++ b/include/drm/gpu_scheduler.h
>> @@ -169,6 +169,12 @@ struct drm_sched_fence {
>>   
>>   struct drm_sched_fence *to_drm_sched_fence(struct dma_fence *f);
>>   
>> +enum drm_job_status {
>> +	DRM_JOB_STATUS_NONE    = 0 << 0,
>> +	DRM_JOB_STATUS_DONE    = 1 << 0,
>> +	DRM_JOB_STATUS_TIMEOUT = 1 << 1,
>> +};
>> +
>>   /**
>>    * struct drm_sched_job - A job to be run by an entity.
>>    *
>> @@ -198,6 +204,7 @@ struct drm_sched_job {
>>   	uint64_t			id;
>>   	atomic_t			karma;
>>   	enum drm_sched_priority		s_priority;
>> +	enum drm_job_status             job_status;
>>   	struct drm_sched_entity         *entity;
>>   	struct dma_fence_cb		cb;
>>   };
>> @@ -284,15 +291,22 @@ struct drm_gpu_scheduler {
>>   	uint32_t			hw_submission_limit;
>>   	long				timeout;
>>   	const char			*name;
>> +	char                            thread_done_name[DRM_THREAD_NAME_LEN];
>> +
>>   	struct drm_sched_rq		sched_rq[DRM_SCHED_PRIORITY_COUNT];
>>   	wait_queue_head_t		wake_up_worker;
>>   	wait_queue_head_t		job_scheduled;
>> +	wait_queue_head_t               done_wait_q;
>>   	atomic_t			hw_rq_count;
>>   	atomic64_t			job_id_count;
>>   	struct delayed_work		work_tdr;
>>   	struct task_struct		*thread;
>> +	struct task_struct		*thread_done;
>> +
>>   	struct list_head		pending_list;
>> +	struct list_head                done_list;
>>   	spinlock_t			job_list_lock;
>> +
>>   	int				hang_limit;
>>   	atomic_t                        score;
>>   	bool				ready;
>>
> 

_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 125+ messages in thread

* Re: [PATCH 5/6] drm/amdgpu: Don't hardcode thread name length
  2020-11-25 17:01                                           ` Luben Tuikov
@ 2020-11-26  8:11                                             ` Christian König
  -1 siblings, 0 replies; 125+ messages in thread
From: Christian König @ 2020-11-26  8:11 UTC (permalink / raw)
  To: Luben Tuikov, Christian König, Andrey Grodzovsky,
	Lucas Stach, Alexander Deucher
  Cc: Emily Deng, dri-devel, amd-gfx, steven.price

Am 25.11.20 um 18:01 schrieb Luben Tuikov:
> On 2020-11-25 04:55, Christian König wrote:
>> Am 25.11.20 um 04:17 schrieb Luben Tuikov:
>>> Introduce a macro DRM_THREAD_NAME_LEN
>>> and use that to define ring name size,
>>> instead of hardcoding it to 16.
>>>
>>> Signed-off-by: Luben Tuikov <luben.tuikov@amd.com>
>>> ---
>>>    drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h | 2 +-
>>>    include/drm/gpu_scheduler.h              | 2 ++
>>>    2 files changed, 3 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
>>> index 7112137689db..bbd46c6dec65 100644
>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
>>> @@ -230,7 +230,7 @@ struct amdgpu_ring {
>>>    	unsigned		wptr_offs;
>>>    	unsigned		fence_offs;
>>>    	uint64_t		current_ctx;
>>> -	char			name[16];
>>> +	char			name[DRM_THREAD_NAME_LEN];
>>>    	u32                     trail_seq;
>>>    	unsigned		trail_fence_offs;
>>>    	u64			trail_fence_gpu_addr;
>>> diff --git a/include/drm/gpu_scheduler.h b/include/drm/gpu_scheduler.h
>>> index 61f7121e1c19..3a5686c3b5e9 100644
>>> --- a/include/drm/gpu_scheduler.h
>>> +++ b/include/drm/gpu_scheduler.h
>>> @@ -30,6 +30,8 @@
>>>    
>>>    #define MAX_WAIT_SCHED_ENTITY_Q_EMPTY msecs_to_jiffies(1000)
>>>    
>>> +#define DRM_THREAD_NAME_LEN     TASK_COMM_LEN
>>> +
>> The thread name is an amdgpu specific thing. I don't think we should
>> have that in the scheduler.
> I need it in DRM when creating the done thread from the name
> of the main scheduler thread. Since DRM creates threads,
> the main scheduler thread and the done thread, it would
> be good to have a preliminary limit to the name string.
>
>> And why do you use TASK_COMM_LEN here? That is completely unrelated stuff.
> If you trace down into the kernel, TASK_COMM_LEN seems to be used in
> snprintf() when naming a kernel thread, and its value is 16--same
> as the one used in amdgpu.

Oh, that's new to me. In my memory this name was used as a filename in 
debugfs only.

>
> So the size of the name string transitions from amdgpu to DRM to kernel
> proper, where amdgpu and kernel proper set it to max 16, but DRM doesn't
> give it a limit.
>
> Sure, I can remove it from DRM, and just use a local limit
> when snprintf() the name when creating a tread, possibly
> using TASK_COMM_LEN. (That's in the next patch.)

Yeah, just use TASK_COMM_LEN directly where appropriate.

Regards,
Christian.

>
> Would that be better? I can do that in v2 of this patchset.
>
> Thanks,
> Luben
>
>> Regards,
>> Christian.
>>
>>>    struct drm_gpu_scheduler;
>>>    struct drm_sched_rq;
>>>    
> _______________________________________________
> amd-gfx mailing list
> amd-gfx@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/amd-gfx

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 125+ messages in thread

* Re: [PATCH 5/6] drm/amdgpu: Don't hardcode thread name length
@ 2020-11-26  8:11                                             ` Christian König
  0 siblings, 0 replies; 125+ messages in thread
From: Christian König @ 2020-11-26  8:11 UTC (permalink / raw)
  To: Luben Tuikov, Christian König, Andrey Grodzovsky,
	Lucas Stach, Alexander Deucher
  Cc: Emily Deng, dri-devel, amd-gfx, steven.price

Am 25.11.20 um 18:01 schrieb Luben Tuikov:
> On 2020-11-25 04:55, Christian König wrote:
>> Am 25.11.20 um 04:17 schrieb Luben Tuikov:
>>> Introduce a macro DRM_THREAD_NAME_LEN
>>> and use that to define ring name size,
>>> instead of hardcoding it to 16.
>>>
>>> Signed-off-by: Luben Tuikov <luben.tuikov@amd.com>
>>> ---
>>>    drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h | 2 +-
>>>    include/drm/gpu_scheduler.h              | 2 ++
>>>    2 files changed, 3 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
>>> index 7112137689db..bbd46c6dec65 100644
>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
>>> @@ -230,7 +230,7 @@ struct amdgpu_ring {
>>>    	unsigned		wptr_offs;
>>>    	unsigned		fence_offs;
>>>    	uint64_t		current_ctx;
>>> -	char			name[16];
>>> +	char			name[DRM_THREAD_NAME_LEN];
>>>    	u32                     trail_seq;
>>>    	unsigned		trail_fence_offs;
>>>    	u64			trail_fence_gpu_addr;
>>> diff --git a/include/drm/gpu_scheduler.h b/include/drm/gpu_scheduler.h
>>> index 61f7121e1c19..3a5686c3b5e9 100644
>>> --- a/include/drm/gpu_scheduler.h
>>> +++ b/include/drm/gpu_scheduler.h
>>> @@ -30,6 +30,8 @@
>>>    
>>>    #define MAX_WAIT_SCHED_ENTITY_Q_EMPTY msecs_to_jiffies(1000)
>>>    
>>> +#define DRM_THREAD_NAME_LEN     TASK_COMM_LEN
>>> +
>> The thread name is an amdgpu specific thing. I don't think we should
>> have that in the scheduler.
> I need it in DRM when creating the done thread from the name
> of the main scheduler thread. Since DRM creates threads,
> the main scheduler thread and the done thread, it would
> be good to have a preliminary limit to the name string.
>
>> And why do you use TASK_COMM_LEN here? That is completely unrelated stuff.
> If you trace down into the kernel, TASK_COMM_LEN seems to be used in
> snprintf() when naming a kernel thread, and its value is 16--same
> as the one used in amdgpu.

Oh, that's new to me. In my memory this name was used as a filename in 
debugfs only.

>
> So the size of the name string transitions from amdgpu to DRM to kernel
> proper, where amdgpu and kernel proper set it to max 16, but DRM doesn't
> give it a limit.
>
> Sure, I can remove it from DRM, and just use a local limit
> when snprintf() the name when creating a tread, possibly
> using TASK_COMM_LEN. (That's in the next patch.)

Yeah, just use TASK_COMM_LEN directly where appropriate.

Regards,
Christian.

>
> Would that be better? I can do that in v2 of this patchset.
>
> Thanks,
> Luben
>
>> Regards,
>> Christian.
>>
>>>    struct drm_gpu_scheduler;
>>>    struct drm_sched_rq;
>>>    
> _______________________________________________
> amd-gfx mailing list
> amd-gfx@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/amd-gfx

_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 125+ messages in thread

* Re: [PATCH 3/6] drm/scheduler: Job timeout handler returns status
  2020-11-25  3:17                                       ` Luben Tuikov
@ 2020-11-26 15:06                                         ` Andrey Grodzovsky
  -1 siblings, 0 replies; 125+ messages in thread
From: Andrey Grodzovsky @ 2020-11-26 15:06 UTC (permalink / raw)
  To: Luben Tuikov, Christian König, Lucas Stach, Alexander Deucher
  Cc: Emily Deng, amd-gfx, dri-devel, steven.price


On 11/24/20 10:17 PM, Luben Tuikov wrote:
> The job timeout handler now returns status
> indicating back to the DRM layer whether the job
> was successfully cancelled or whether more time
> should be given to the job to complete.
>
> Signed-off-by: Luben Tuikov <luben.tuikov@amd.com>
> ---
>   drivers/gpu/drm/amd/amdgpu/amdgpu_job.c |  6 ++++--
>   include/drm/gpu_scheduler.h             | 13 ++++++++++---
>   2 files changed, 14 insertions(+), 5 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
> index ff48101bab55..81b73790ecc6 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
> @@ -28,7 +28,7 @@
>   #include "amdgpu.h"
>   #include "amdgpu_trace.h"
>   
> -static void amdgpu_job_timedout(struct drm_sched_job *s_job)
> +static int amdgpu_job_timedout(struct drm_sched_job *s_job)
>   {
>   	struct amdgpu_ring *ring = to_amdgpu_ring(s_job->sched);
>   	struct amdgpu_job *job = to_amdgpu_job(s_job);
> @@ -41,7 +41,7 @@ static void amdgpu_job_timedout(struct drm_sched_job *s_job)
>   	    amdgpu_ring_soft_recovery(ring, job->vmid, s_job->s_fence->parent)) {
>   		DRM_ERROR("ring %s timeout, but soft recovered\n",
>   			  s_job->sched->name);
> -		return;
> +		return 0;
>   	}
>   
>   	amdgpu_vm_get_task_info(ring->adev, job->pasid, &ti);
> @@ -53,10 +53,12 @@ static void amdgpu_job_timedout(struct drm_sched_job *s_job)
>   
>   	if (amdgpu_device_should_recover_gpu(ring->adev)) {
>   		amdgpu_device_gpu_recover(ring->adev, job);
> +		return 0;


For amdgpu specifically - not that amdgpu_device_gpu_recover returns a value 
which is 0 for successful GPU reset
meaning we reset the GPU and resubmitted to HW the job that triggered the 
timeout to HW (guilty).
It means the job is still should be considered part of pending list and so a non 
zero value
should be returned. I think only if we reset the GPU and don't submit back the 
guilty job then
it can be considered 'aborted' - but I don't think we even do this.

Andrey


>   	} else {
>   		drm_sched_suspend_timeout(&ring->sched);
>   		if (amdgpu_sriov_vf(adev))
>   			adev->virt.tdr_debug = true;
> +		return 1;
>   	}
>   }
>   
> diff --git a/include/drm/gpu_scheduler.h b/include/drm/gpu_scheduler.h
> index 2e0c368e19f6..61f7121e1c19 100644
> --- a/include/drm/gpu_scheduler.h
> +++ b/include/drm/gpu_scheduler.h
> @@ -230,10 +230,17 @@ struct drm_sched_backend_ops {
>   	struct dma_fence *(*run_job)(struct drm_sched_job *sched_job);
>   
>   	/**
> -         * @timedout_job: Called when a job has taken too long to execute,
> -         * to trigger GPU recovery.
> +	 * @timedout_job: Called when a job has taken too long to execute,
> +	 * to trigger GPU recovery.
> +	 *
> +	 * Return 0, if the job has been aborted successfully and will
> +	 * never be heard of from the device. Return non-zero if the
> +	 * job wasn't able to be aborted, i.e. if more time should be
> +	 * given to this job. The result is not "bool" as this
> +	 * function is not a predicate, although its result may seem
> +	 * as one.
>   	 */
> -	void (*timedout_job)(struct drm_sched_job *sched_job);
> +	int (*timedout_job)(struct drm_sched_job *sched_job);
>   
>   	/**
>            * @free_job: Called once the job's finished fence has been signaled
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 125+ messages in thread

* Re: [PATCH 3/6] drm/scheduler: Job timeout handler returns status
@ 2020-11-26 15:06                                         ` Andrey Grodzovsky
  0 siblings, 0 replies; 125+ messages in thread
From: Andrey Grodzovsky @ 2020-11-26 15:06 UTC (permalink / raw)
  To: Luben Tuikov, Christian König, Lucas Stach, Alexander Deucher
  Cc: Emily Deng, amd-gfx, dri-devel, steven.price


On 11/24/20 10:17 PM, Luben Tuikov wrote:
> The job timeout handler now returns status
> indicating back to the DRM layer whether the job
> was successfully cancelled or whether more time
> should be given to the job to complete.
>
> Signed-off-by: Luben Tuikov <luben.tuikov@amd.com>
> ---
>   drivers/gpu/drm/amd/amdgpu/amdgpu_job.c |  6 ++++--
>   include/drm/gpu_scheduler.h             | 13 ++++++++++---
>   2 files changed, 14 insertions(+), 5 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
> index ff48101bab55..81b73790ecc6 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
> @@ -28,7 +28,7 @@
>   #include "amdgpu.h"
>   #include "amdgpu_trace.h"
>   
> -static void amdgpu_job_timedout(struct drm_sched_job *s_job)
> +static int amdgpu_job_timedout(struct drm_sched_job *s_job)
>   {
>   	struct amdgpu_ring *ring = to_amdgpu_ring(s_job->sched);
>   	struct amdgpu_job *job = to_amdgpu_job(s_job);
> @@ -41,7 +41,7 @@ static void amdgpu_job_timedout(struct drm_sched_job *s_job)
>   	    amdgpu_ring_soft_recovery(ring, job->vmid, s_job->s_fence->parent)) {
>   		DRM_ERROR("ring %s timeout, but soft recovered\n",
>   			  s_job->sched->name);
> -		return;
> +		return 0;
>   	}
>   
>   	amdgpu_vm_get_task_info(ring->adev, job->pasid, &ti);
> @@ -53,10 +53,12 @@ static void amdgpu_job_timedout(struct drm_sched_job *s_job)
>   
>   	if (amdgpu_device_should_recover_gpu(ring->adev)) {
>   		amdgpu_device_gpu_recover(ring->adev, job);
> +		return 0;


For amdgpu specifically - not that amdgpu_device_gpu_recover returns a value 
which is 0 for successful GPU reset
meaning we reset the GPU and resubmitted to HW the job that triggered the 
timeout to HW (guilty).
It means the job is still should be considered part of pending list and so a non 
zero value
should be returned. I think only if we reset the GPU and don't submit back the 
guilty job then
it can be considered 'aborted' - but I don't think we even do this.

Andrey


>   	} else {
>   		drm_sched_suspend_timeout(&ring->sched);
>   		if (amdgpu_sriov_vf(adev))
>   			adev->virt.tdr_debug = true;
> +		return 1;
>   	}
>   }
>   
> diff --git a/include/drm/gpu_scheduler.h b/include/drm/gpu_scheduler.h
> index 2e0c368e19f6..61f7121e1c19 100644
> --- a/include/drm/gpu_scheduler.h
> +++ b/include/drm/gpu_scheduler.h
> @@ -230,10 +230,17 @@ struct drm_sched_backend_ops {
>   	struct dma_fence *(*run_job)(struct drm_sched_job *sched_job);
>   
>   	/**
> -         * @timedout_job: Called when a job has taken too long to execute,
> -         * to trigger GPU recovery.
> +	 * @timedout_job: Called when a job has taken too long to execute,
> +	 * to trigger GPU recovery.
> +	 *
> +	 * Return 0, if the job has been aborted successfully and will
> +	 * never be heard of from the device. Return non-zero if the
> +	 * job wasn't able to be aborted, i.e. if more time should be
> +	 * given to this job. The result is not "bool" as this
> +	 * function is not a predicate, although its result may seem
> +	 * as one.
>   	 */
> -	void (*timedout_job)(struct drm_sched_job *sched_job);
> +	int (*timedout_job)(struct drm_sched_job *sched_job);
>   
>   	/**
>            * @free_job: Called once the job's finished fence has been signaled
_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 125+ messages in thread

end of thread, other threads:[~2020-11-26 15:06 UTC | newest]

Thread overview: 125+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-11-25 20:51 [PATCH v4] drm/scheduler: Avoid accessing freed bad job Andrey Grodzovsky
2019-11-25 20:51 ` Andrey Grodzovsky
2019-11-25 20:51 ` Andrey Grodzovsky
2019-11-25 21:44 ` Deng, Emily
2019-11-25 21:44   ` Deng, Emily
2019-11-25 21:44   ` Deng, Emily
2019-11-26  0:09   ` Grodzovsky, Andrey
2019-11-26  0:09     ` Grodzovsky, Andrey
2019-11-26  0:09     ` Grodzovsky, Andrey
     [not found]     ` <MWHPR12MB1453C6FC45A83482232CA3EDEA450-Gy0DoCVfaSWZBIDmKHdw+wdYzm3356FpvxpqHgZTriW3zl9H0oFU5g@public.gmane.org>
2019-11-26 15:36       ` Deucher, Alexander
2019-11-26 15:36         ` Deucher, Alexander
2019-11-26 15:36         ` Deucher, Alexander
2019-11-26 15:37 ` Andrey Grodzovsky
2019-11-26 15:37   ` Andrey Grodzovsky
     [not found]   ` <b8b716a7-e235-38b2-ea6d-0a21881fa64e-5C7GfCeVMHo@public.gmane.org>
2019-11-27  0:41     ` Deng, Emily
2019-11-27  0:41       ` Deng, Emily
2019-11-27  0:41       ` Deng, Emily
     [not found]       ` <MN2PR12MB2975CA8858F21FDF325C33FE8F440-rweVpJHSKToFlvJWC7EAqwdYzm3356FpvxpqHgZTriW3zl9H0oFU5g@public.gmane.org>
2019-12-02 19:24         ` Deng, Emily
2019-12-02 19:24           ` Deng, Emily
2019-12-02 19:24           ` Deng, Emily
2019-12-03 19:10           ` Andrey Grodzovsky
2019-12-03 19:10             ` Andrey Grodzovsky
2019-12-03 19:44             ` Deucher, Alexander
2019-12-03 19:44               ` Deucher, Alexander
2019-12-03 19:57               ` Andrey Grodzovsky
2019-12-03 19:57                 ` Andrey Grodzovsky
2019-12-03 19:59                 ` Deucher, Alexander
2019-12-03 19:59                   ` Deucher, Alexander
2019-12-03 20:32                   ` Andrey Grodzovsky
2019-12-03 20:32                     ` Andrey Grodzovsky
2019-12-03 20:58                     ` Deng, Emily
2019-12-03 20:58                       ` Deng, Emily
2019-12-03 19:53             ` Deng, Emily
2019-12-03 19:53               ` Deng, Emily
2020-02-05 18:24 ` Lucas Stach
2020-02-05 18:24   ` Lucas Stach
2020-02-06 11:10   ` Lucas Stach
2020-02-06 11:10     ` Lucas Stach
2020-02-06 11:49     ` Christian König
2020-02-06 11:49       ` Christian König
2020-02-06 14:49       ` Alex Deucher
2020-02-06 14:49         ` Alex Deucher
2020-02-06 14:51         ` Christian König
2020-02-06 14:51           ` Christian König
2020-02-06 15:49           ` Andrey Grodzovsky
2020-02-06 15:49             ` Andrey Grodzovsky
2020-02-10 16:55             ` Andrey Grodzovsky
2020-02-10 16:55               ` Andrey Grodzovsky
2020-02-10 21:50               ` Luben Tuikov
2020-02-10 21:50                 ` Luben Tuikov
2020-02-11 15:55                 ` Andrey Grodzovsky
2020-02-11 15:55                   ` Andrey Grodzovsky
2020-02-11 21:27                   ` Andrey Grodzovsky
2020-02-11 21:27                     ` Andrey Grodzovsky
2020-02-12  0:53                     ` Luben Tuikov
2020-02-12  0:53                       ` Luben Tuikov
2020-02-12 16:33                       ` Andrey Grodzovsky
2020-02-12 16:33                         ` Andrey Grodzovsky
2020-07-21 11:03                         ` Lucas Stach
2020-07-21 11:03                           ` Lucas Stach
2020-07-21 13:36                           ` Andrey Grodzovsky
2020-07-21 13:36                             ` Andrey Grodzovsky
2020-07-21 13:39                             ` Christian König
2020-07-21 13:39                               ` Christian König
2020-07-21 13:42                               ` Andrey Grodzovsky
2020-07-21 13:42                                 ` Andrey Grodzovsky
2020-07-21 18:29                                 ` Luben Tuikov
2020-07-21 18:29                                   ` Luben Tuikov
2020-11-25  3:17                                   ` [PATCH 0/6] Allow to extend the timeout without jobs disappearing Luben Tuikov
2020-11-25  3:17                                     ` Luben Tuikov
2020-11-25  3:17                                     ` [PATCH 1/6] drm/scheduler: "node" --> "list" Luben Tuikov
2020-11-25  3:17                                       ` Luben Tuikov
2020-11-25  9:44                                       ` Christian König
2020-11-25  9:44                                         ` Christian König
2020-11-25  3:17                                     ` [PATCH 2/6] gpu/drm: ring_mirror_list --> pending_list Luben Tuikov
2020-11-25  3:17                                       ` Luben Tuikov
2020-11-25  9:47                                       ` Christian König
2020-11-25  9:47                                         ` Christian König
2020-11-25 16:42                                         ` Luben Tuikov
2020-11-25 16:42                                           ` Luben Tuikov
2020-11-25  3:17                                     ` [PATCH 3/6] drm/scheduler: Job timeout handler returns status Luben Tuikov
2020-11-25  3:17                                       ` Luben Tuikov
2020-11-25  4:41                                       ` kernel test robot
2020-11-25  4:41                                         ` kernel test robot
2020-11-25  4:41                                         ` kernel test robot
2020-11-25  9:50                                       ` Christian König
2020-11-25  9:50                                         ` Christian König
2020-11-25 16:48                                         ` Luben Tuikov
2020-11-25 16:48                                           ` Luben Tuikov
2020-11-25 11:04                                       ` Steven Price
2020-11-25 11:04                                         ` Steven Price
2020-11-25 11:15                                         ` Lucas Stach
2020-11-25 11:15                                           ` Lucas Stach
2020-11-25 11:22                                           ` Steven Price
2020-11-25 11:22                                             ` Steven Price
2020-11-25 11:47                                             ` Lucas Stach
2020-11-25 11:47                                               ` Lucas Stach
2020-11-25 12:41                                         ` Christian König
2020-11-25 12:41                                           ` Christian König
2020-11-26 15:06                                       ` Andrey Grodzovsky
2020-11-26 15:06                                         ` Andrey Grodzovsky
2020-11-25  3:17                                     ` [PATCH 4/6] drm/scheduler: Essentialize the job done callback Luben Tuikov
2020-11-25  3:17                                       ` Luben Tuikov
2020-11-25  9:51                                       ` Christian König
2020-11-25  9:51                                         ` Christian König
2020-11-25  3:17                                     ` [PATCH 5/6] drm/amdgpu: Don't hardcode thread name length Luben Tuikov
2020-11-25  3:17                                       ` Luben Tuikov
2020-11-25  9:55                                       ` Christian König
2020-11-25  9:55                                         ` Christian König
2020-11-25 17:01                                         ` Luben Tuikov
2020-11-25 17:01                                           ` Luben Tuikov
2020-11-26  8:11                                           ` Christian König
2020-11-26  8:11                                             ` Christian König
2020-11-25  3:17                                     ` [PATCH 6/6] drm/sched: Make use of a "done" thread Luben Tuikov
2020-11-25  3:17                                       ` Luben Tuikov
2020-11-25 10:10                                       ` Christian König
2020-11-25 10:10                                         ` Christian König
2020-11-26  0:24                                         ` Luben Tuikov
2020-11-26  0:24                                           ` Luben Tuikov
2020-11-25 11:09                                       ` Steven Price
2020-11-25 11:09                                         ` Steven Price
2020-11-26  0:30                                         ` Luben Tuikov
2020-11-26  0:30                                           ` Luben Tuikov
2020-02-07 15:26           ` [PATCH v4] drm/scheduler: Avoid accessing freed bad job Daniel Vetter
2020-02-07 15:26             ` Daniel Vetter

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.