linux-kernel.vger.kernel.org archive mirror
* drm_sched with panfrost crash on T820
@ 2019-09-27  8:12 Neil Armstrong
  2019-09-27  9:55 ` Steven Price
  2019-09-27 20:55 ` Grodzovsky, Andrey
  0 siblings, 2 replies; 21+ messages in thread
From: Neil Armstrong @ 2019-09-27  8:12 UTC (permalink / raw)
  To: daniel, airlied, Christian König
  Cc: Erico Nunes, Andrey Grodzovsky, linux-kernel, steven.price,
	dri-devel, Rob Herring, Tomeu Vizoso,
	open list:ARM/Amlogic Meson...

Hi Christian,

In v5.3, running dEQP triggers the following kernel crash :

[   20.224982] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000038
[...]
[   20.291064] Hardware name: Khadas VIM2 (DT)
[   20.295217] Workqueue: events drm_sched_job_timedout
[...]
[   20.304867] pc : drm_sched_increase_karma+0x5c/0xf0
[   20.309696] lr : drm_sched_increase_karma+0x44/0xf0
[...]
[   20.396720] Call trace:
[   20.399138]  drm_sched_increase_karma+0x5c/0xf0
[   20.403623]  panfrost_job_timedout+0x12c/0x1e0
[   20.408021]  drm_sched_job_timedout+0x48/0xa0
[   20.412336]  process_one_work+0x1e0/0x320
[   20.416300]  worker_thread+0x40/0x450
[   20.419924]  kthread+0x124/0x128
[   20.423116]  ret_from_fork+0x10/0x18
[   20.426653] Code: f9400001 540001c0 f9400a83 f9402402 (f9401c64)
[   20.432690] ---[ end trace bd02f890139096a7 ]---

Which never happens, at all, on v5.2.

I did a (very) long (7 days, ~100 runs) bisect run using our LAVA lab (thanks tomeu !), but
bisecting was not easy since the bad commit landed on drm-misc-next after v5.1-rc6, and
then v5.2-rc1 was backmerged into drm-misc-next at:
[1] 374ed5429346 Merge drm/drm-next into drm-misc-next

Thus bisecting between [1] and v5.2-rc1 leads to commits based on v5.2-rc1... where panfrost was
not enabled in the Khadas VIM2 DT.

Anyway, I managed to identify 3 possibly breaking commits :
[2] 290764af7e36 drm/sched: Keep s_fence->parent pointer
[3] 5918045c4ed4 drm/scheduler: rework job destruction
[4] a5343b8a2ca5 drm/scheduler: Add flag to hint the release of guilty job.

But [1] and [2] don't crash the same way :
[   16.257912] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000060
[...]
[   16.308307] CPU: 4 PID: 80 Comm: kworker/4:1 Not tainted 5.1.0-rc2-01185-g290764af7e36-dirty #378
[   16.317099] Hardware name: Khadas VIM2 (DT)
[...]
[   16.330907] pc : refcount_sub_and_test_checked+0x4/0xb0
[   16.336078] lr : refcount_dec_and_test_checked+0x14/0x20
[...]
[   16.423533] Process kworker/4:1 (pid: 80, stack limit = 0x(____ptrval____))
[   16.430431] Call trace:
[   16.432851]  refcount_sub_and_test_checked+0x4/0xb0
[   16.437681]  drm_sched_job_cleanup+0x24/0x58
[   16.441908]  panfrost_job_free+0x14/0x28
[   16.445787]  drm_sched_job_timedout+0x6c/0xa0
[   16.450102]  process_one_work+0x1e0/0x320
[   16.454067]  worker_thread+0x40/0x450
[   16.457690]  kthread+0x124/0x128
[   16.460882]  ret_from_fork+0x10/0x18
[   16.464421] Code: 52800000 d65f03c0 d503201f aa0103e3 (b9400021)
[   16.470456] ---[ end trace 39a67412ee1b64b5 ]---

and [3] fails like on v5.3 (in drm_sched_increase_karma):
[   33.830080] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000038
[...]
[   33.871946] Internal error: Oops: 96000004 [#1] PREEMPT SMP
[   33.877450] Modules linked in:
[   33.880474] CPU: 6 PID: 81 Comm: kworker/6:1 Not tainted 5.1.0-rc2-01186-ga5343b8a2ca5-dirty #380
[   33.889265] Hardware name: Khadas VIM2 (DT)
[   33.893419] Workqueue: events drm_sched_job_timedout
[...]
[   33.903069] pc : drm_sched_increase_karma+0x5c/0xf0
[   33.907898] lr : drm_sched_increase_karma+0x44/0xf0
[...]
[   33.994924] Process kworker/6:1 (pid: 81, stack limit = 0x(____ptrval____))
[   34.001822] Call trace:
[   34.004242]  drm_sched_increase_karma+0x5c/0xf0
[   34.008726]  panfrost_job_timedout+0x12c/0x1e0
[   34.013122]  drm_sched_job_timedout+0x48/0xa0
[   34.017438]  process_one_work+0x1e0/0x320
[   34.021402]  worker_thread+0x40/0x450
[   34.025026]  kthread+0x124/0x128
[   34.028218]  ret_from_fork+0x10/0x18
[   34.031755] Code: f9400001 540001c0 f9400a83 f9402402 (f9401c64)
[   34.037792] ---[ end trace be3fd6f77f4df267 ]---


When I revert [3] on [1], I get the same crash as [2], meaning
the commit [3] masks the failure [2] introduced.

Do you know how to solve this ?

Thanks,
Neil


* Re: drm_sched with panfrost crash on T820
  2019-09-27  8:12 drm_sched with panfrost crash on T820 Neil Armstrong
@ 2019-09-27  9:55 ` Steven Price
  2019-09-27 10:48   ` Steven Price
  2019-09-27 20:55 ` Grodzovsky, Andrey
  1 sibling, 1 reply; 21+ messages in thread
From: Steven Price @ 2019-09-27  9:55 UTC (permalink / raw)
  To: Neil Armstrong, daniel, airlied, Christian König
  Cc: Tomeu Vizoso, linux-kernel, dri-devel,
	open list:ARM/Amlogic Meson...,
	Erico Nunes, Rob Herring

On 27/09/2019 09:12, Neil Armstrong wrote:
> Hi Christian,
> 
> In v5.3, running dEQP triggers the following kernel crash :
> 
> [   20.224982] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000038
> [...]
> [   20.291064] Hardware name: Khadas VIM2 (DT)
> [   20.295217] Workqueue: events drm_sched_job_timedout
> [...]
> [   20.304867] pc : drm_sched_increase_karma+0x5c/0xf0
> [   20.309696] lr : drm_sched_increase_karma+0x44/0xf0
> [...]
> [   20.396720] Call trace:
> [   20.399138]  drm_sched_increase_karma+0x5c/0xf0
> [   20.403623]  panfrost_job_timedout+0x12c/0x1e0
> [   20.408021]  drm_sched_job_timedout+0x48/0xa0
> [   20.412336]  process_one_work+0x1e0/0x320
> [   20.416300]  worker_thread+0x40/0x450
> [   20.419924]  kthread+0x124/0x128
> [   20.423116]  ret_from_fork+0x10/0x18
> [   20.426653] Code: f9400001 540001c0 f9400a83 f9402402 (f9401c64)
> [   20.432690] ---[ end trace bd02f890139096a7 ]---
> 
> Which never happens, at all, on v5.2.
> 
> I did a (very) long (7 days, ~100 runs) bisect run using our LAVA lab (thanks tomeu !), but
> bisecting was not easy since the bad commit landed on drm-misc-next after v5.1-rc6, and
> then v5.2-rc1 was backmerged into drm-misc-next at:
> [1] 374ed5429346 Merge drm/drm-next into drm-misc-next
> 
> Thus bisecting between [1] and v5.2-rc1 leads to commits based on v5.2-rc1... where panfrost was
> not enabled in the Khadas VIM2 DT.
> 
> Anyway, I managed to identify 3 possibly breaking commits :
> [2] 290764af7e36 drm/sched: Keep s_fence->parent pointer
> [3] 5918045c4ed4 drm/scheduler: rework job destruction
> [4] a5343b8a2ca5 drm/scheduler: Add flag to hint the release of guilty job.

My suspicion is that this is really a bug in Panfrost rather than the scheduler.

I can see the following sequence:

1. Jobs submitted to *both* slot 0 and slot 1 on the hardware.

2. Hardware 'hangs' (most likely an unhandled page fault)

3. The reset timeouts occur. Because Panfrost uses a separate scheduler
for each slot this means we have two threads racing to kill the
respective jobs.

4. Each thread calls drm_sched_job_timedout which calls
panfrost_job_timedout().

5. Both threads calling panfrost_job_timedout() execute:

	for (i = 0; i < NUM_JOB_SLOTS; i++)
		drm_sched_stop(&pfdev->js->queue[i].sched, sched_job);

6. drm_sched_stop() will free the job from the other scheduler. So both
jobs get freed (and sched->free_guilty is set).

7. Depending on the exact race either drm_sched_increase_karma()
dereferences the freed job, or drm_sched_job_timedout() attempts to
double-free the job (because free_guilty is set).

Now there is a lock (pfdev->reset_lock) held during steps 5-6, but the
problem is that the job which has timed out can be freed by the other
scheduler's timeout before/after that critical section.
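
For reference, the free in step 6 happens inside drm_sched_stop(): it walks
the stopped scheduler's ring_mirror_list and frees every job it drops from
the list, keeping only the job passed in as 'bad' alive. Roughly (paraphrased
and trimmed from the v5.3-era sched_main.c, so treat it as a sketch rather
than the exact upstream code):

	list_for_each_entry_safe_reverse(s_job, tmp,
					 &sched->ring_mirror_list, node) {
		if (s_job->s_fence->parent &&
		    dma_fence_remove_callback(s_job->s_fence->parent,
					      &s_job->cb)) {
			/* HW fence callback deactivated, job stays on the list */
			atomic_dec(&sched->hw_rq_count);
		} else {
			/* Job already completed on the HW (or had no HW
			 * fence): drop it from the list ... */
			list_del_init(&s_job->node);
			/* ... and free it, unless it is the 'bad' job */
			if (bad != s_job)
				sched->ops->free_job(s_job);
			else
				sched->free_guilty = true;
		}
	}

So the slot 0 thread's drm_sched_stop() on the slot 1 scheduler can free slot
1's timed-out job (which is not 'bad' from slot 0's point of view), and vice
versa, which is step 6 above.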

One obvious issue with the DRM scheduler is that there is a call to
cancel_delayed_work() in drm_sched_stop() which to me looks like it
should be cancel_delayed_work_sync() to ensure that the timeout handling
has completed.

However in the above scenario a _sync() variety would then cause a
deadlock (one thread has pfdev->reset_lock and is waiting for the other
thread which is trying to take the lock).
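
To make that concrete, a hypothetical interleaving with both slots timing out
at once (thread A handling slot 0, thread B handling slot 1):

	A: panfrost_job_timedout()
	A:   mutex_lock(&pfdev->reset_lock)         <- A now owns the lock
	B: panfrost_job_timedout()
	B:   mutex_lock(&pfdev->reset_lock)         <- B blocks waiting for A
	A:   drm_sched_stop(slot 1 scheduler, ...)
	A:     cancel_delayed_work_sync(&sched->work_tdr)
	         <- waits for B's timeout handler to finish, but B is waiting
	            for the lock that A holds: deadlock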

So we need to update Panfrost so that it can coordinate the reset
between schedulers. Can you try something like the following (untested):

----8<---
diff --git a/drivers/gpu/drm/panfrost/panfrost_device.h b/drivers/gpu/drm/panfrost/panfrost_device.h
index f503c566e99f..6441c7fba6c4 100644
--- a/drivers/gpu/drm/panfrost/panfrost_device.h
+++ b/drivers/gpu/drm/panfrost/panfrost_device.h
@@ -99,6 +99,8 @@ struct panfrost_device {
 		unsigned long cur_volt;
 		struct panfrost_devfreq_slot slot[NUM_JOB_SLOTS];
 	} devfreq;
+
+	bool is_resetting;
 };
 
 struct panfrost_mmu {
diff --git a/drivers/gpu/drm/panfrost/panfrost_job.c b/drivers/gpu/drm/panfrost/panfrost_job.c
index 05c85f45a0de..8b935680c066 100644
--- a/drivers/gpu/drm/panfrost/panfrost_job.c
+++ b/drivers/gpu/drm/panfrost/panfrost_job.c
@@ -388,6 +388,10 @@ static void panfrost_job_timedout(struct drm_sched_job *sched_job)
 
 	mutex_lock(&pfdev->reset_lock);
 
+	if (pfdev->is_resetting)
+		goto out;
+	pfdev->is_resetting = true;
+
 	for (i = 0; i < NUM_JOB_SLOTS; i++)
 		drm_sched_stop(&pfdev->js->queue[i].sched, sched_job);
 
@@ -406,6 +410,8 @@ static void panfrost_job_timedout(struct drm_sched_job *sched_job)
 	for (i = 0; i < NUM_JOB_SLOTS; i++)
 		drm_sched_start(&pfdev->js->queue[i].sched, true);
 
+	pfdev->is_resetting = false;
+out:
 	mutex_unlock(&pfdev->reset_lock);
 }
 


* Re: drm_sched with panfrost crash on T820
  2019-09-27  9:55 ` Steven Price
@ 2019-09-27 10:48   ` Steven Price
  2019-09-27 11:27     ` Neil Armstrong
  0 siblings, 1 reply; 21+ messages in thread
From: Steven Price @ 2019-09-27 10:48 UTC (permalink / raw)
  To: Neil Armstrong, daniel, airlied, Christian König
  Cc: Tomeu Vizoso, linux-kernel, dri-devel,
	open list:ARM/Amlogic Meson...,
	Erico Nunes, Rob Herring

On 27/09/2019 10:55, Steven Price wrote:
[...]
> One obvious issue with the DRM scheduler is that there is a call to
> cancel_delayed_work() in drm_sched_stop() which to me looks like it
> should be cancel_delayed_work_sync() to ensure that the timeout handling
> has completed.
> 
> However in the above scenario a _sync() variety would then cause a
> deadlock (one thread has pfdev->reset_lock and is waiting for the other
> thread which is trying to take the lock).
> 
> So we need to update Panfrost so that it can coordinate the reset
> between schedulers. Can you try something like the following (untested):

And actually testing it I of course discover it doesn't quite work. We
do need the cancel_delayed_work_sync() in the DRM scheduler (when
stopping a different scheduler) and we need to avoid holding the
reset_lock during the drm_sched_stop() call to prevent deadlocking with
another thread handling a timeout.

Can you give the following patch a spin? I don't have a great
reproduction case, so it would be good to get some confidence it fixes
the problem.

----8<----
From 521a286789260197ae94f698932ebf369efc45ad Mon Sep 17 00:00:00 2001
From: Steven Price <steven.price@arm.com>
Date: Fri, 27 Sep 2019 11:42:40 +0100
Subject: [PATCH] drm/panfrost: Handle resetting on timeout better

Panfrost uses multiple schedulers (one for each slot, so 2 in reality),
and on a timeout has to stop all the schedulers to safely perform a
reset. However more than one scheduler can trigger a timeout at the same
time. This race condition results in jobs being freed while they are
still in use.

Modify drm_sched_stop() to call cancel_delayed_work_sync() when stopping
a different scheduler to the one belonging to the passed in job.
panfrost_job_timedout() is also modified to only allow one thread at a
time to handle the reset. Any subsequent threads simply return assuming
that the first thread will handle the situation.

Signed-off-by: Steven Price <steven.price@arm.com>
---
 drivers/gpu/drm/panfrost/panfrost_device.h |  2 ++
 drivers/gpu/drm/panfrost/panfrost_job.c    | 11 ++++++++++-
 drivers/gpu/drm/scheduler/sched_main.c     |  5 ++++-
 3 files changed, 16 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/panfrost/panfrost_device.h b/drivers/gpu/drm/panfrost/panfrost_device.h
index f503c566e99f..6441c7fba6c4 100644
--- a/drivers/gpu/drm/panfrost/panfrost_device.h
+++ b/drivers/gpu/drm/panfrost/panfrost_device.h
@@ -99,6 +99,8 @@ struct panfrost_device {
 		unsigned long cur_volt;
 		struct panfrost_devfreq_slot slot[NUM_JOB_SLOTS];
 	} devfreq;
+
+	bool is_resetting;
 };
 
 struct panfrost_mmu {
diff --git a/drivers/gpu/drm/panfrost/panfrost_job.c b/drivers/gpu/drm/panfrost/panfrost_job.c
index 05c85f45a0de..1b2019e08b43 100644
--- a/drivers/gpu/drm/panfrost/panfrost_job.c
+++ b/drivers/gpu/drm/panfrost/panfrost_job.c
@@ -388,13 +388,21 @@ static void panfrost_job_timedout(struct drm_sched_job *sched_job)
 
 	mutex_lock(&pfdev->reset_lock);
 
+	if (pfdev->is_resetting) {
+		mutex_unlock(&pfdev->reset_lock);
+		return;
+	}
+	pfdev->is_resetting = true;
+
+	mutex_unlock(&pfdev->reset_lock);
+
 	for (i = 0; i < NUM_JOB_SLOTS; i++)
 		drm_sched_stop(&pfdev->js->queue[i].sched, sched_job);
 
 	if (sched_job)
 		drm_sched_increase_karma(sched_job);
 
-	/* panfrost_core_dump(pfdev); */
+	mutex_lock(&pfdev->reset_lock);
 
 	panfrost_devfreq_record_transition(pfdev, js);
 	panfrost_device_reset(pfdev);
@@ -406,6 +414,7 @@ static void panfrost_job_timedout(struct drm_sched_job *sched_job)
 	for (i = 0; i < NUM_JOB_SLOTS; i++)
 		drm_sched_start(&pfdev->js->queue[i].sched, true);
 
+	pfdev->is_resetting = false;
 	mutex_unlock(&pfdev->reset_lock);
 }
 
diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c
index 148468447ba9..bc6d1862ec8a 100644
--- a/drivers/gpu/drm/scheduler/sched_main.c
+++ b/drivers/gpu/drm/scheduler/sched_main.c
@@ -415,7 +415,10 @@ void drm_sched_stop(struct drm_gpu_scheduler *sched, struct drm_sched_job *bad)
 	 * this TDR finished and before the newly restarted jobs had a
 	 * chance to complete.
 	 */
-	cancel_delayed_work(&sched->work_tdr);
+	if (bad->sched != sched)
+		cancel_delayed_work_sync(&sched->work_tdr);
+	else
+		cancel_delayed_work(&sched->work_tdr);
 }
 
 EXPORT_SYMBOL(drm_sched_stop);
-- 
2.20.1



* Re: drm_sched with panfrost crash on T820
  2019-09-27 10:48   ` Steven Price
@ 2019-09-27 11:27     ` Neil Armstrong
  2019-09-27 11:48       ` Neil Armstrong
  0 siblings, 1 reply; 21+ messages in thread
From: Neil Armstrong @ 2019-09-27 11:27 UTC (permalink / raw)
  To: Steven Price, daniel, airlied, Christian König
  Cc: Tomeu Vizoso, linux-kernel, dri-devel,
	open list:ARM/Amlogic Meson...,
	Erico Nunes, Rob Herring

Hi Steven,

Thanks for your prompt reaction !

On 27/09/2019 12:48, Steven Price wrote:
> On 27/09/2019 10:55, Steven Price wrote:
> [...]
>> One obvious issue with the DRM scheduler is that there is a call to
>> cancel_delayed_work() in drm_sched_stop() which to me looks like it
>> should be cancel_delayed_work_sync() to ensure that the timeout handling
>> has completed.
>>
>> However in the above scenario a _sync() variety would then cause a
>> deadlock (one thread has pfdev->reset_lock and is waiting for the other
>> thread which is trying to take the lock).
>>
>> So we need to update Panfrost so that it can coordinate the reset
>> between schedulers. Can you try something like the following (untested):
> 
> And actually testing it I of course discover it doesn't quite work. We
> do need the cancel_delayed_work_sync() in the DRM scheduler (when
> stopping a different scheduler) and we need to avoid holding the
> reset_lock during the drm_sched_stop() call to prevent deadlocking with
> another thread handling a timeout.

Yep the first patch wasn't fixing the issue all the time.

> 
> Can you give the following patch a spin? I don't have a great
> reproduction case, so it would be good to get some confidence it fixes
> the problem.

Running it right now.

Thanks,
Neil

> 
> ----8<----
> From 521a286789260197ae94f698932ebf369efc45ad Mon Sep 17 00:00:00 2001
> From: Steven Price <steven.price@arm.com>
> Date: Fri, 27 Sep 2019 11:42:40 +0100
> Subject: [PATCH] drm/panfrost: Handle resetting on timeout better
> 
> Panfrost uses multiple schedulers (one for each slot, so 2 in reality),
> and on a timeout has to stop all the schedulers to safely perform a
> reset. However more than one scheduler can trigger a timeout at the same
> time. This race condition results in jobs being freed while they are
> still in use.
> 
> Modify drm_sched_stop() to call cancel_delayed_work_sync() when stopping
> a different scheduler to the one belonging to the passed in job.
> panfrost_job_timedout() is also modified to only allow one thread at a
> time to handle the reset. Any subsequent threads simply return assuming
> that the first thread will handle the situation.
> 
> Signed-off-by: Steven Price <steven.price@arm.com>
> ---
>  drivers/gpu/drm/panfrost/panfrost_device.h |  2 ++
>  drivers/gpu/drm/panfrost/panfrost_job.c    | 11 ++++++++++-
>  drivers/gpu/drm/scheduler/sched_main.c     |  5 ++++-
>  3 files changed, 16 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/gpu/drm/panfrost/panfrost_device.h b/drivers/gpu/drm/panfrost/panfrost_device.h
> index f503c566e99f..6441c7fba6c4 100644
> --- a/drivers/gpu/drm/panfrost/panfrost_device.h
> +++ b/drivers/gpu/drm/panfrost/panfrost_device.h
> @@ -99,6 +99,8 @@ struct panfrost_device {
>  		unsigned long cur_volt;
>  		struct panfrost_devfreq_slot slot[NUM_JOB_SLOTS];
>  	} devfreq;
> +
> +	bool is_resetting;
>  };
>  
>  struct panfrost_mmu {
> diff --git a/drivers/gpu/drm/panfrost/panfrost_job.c b/drivers/gpu/drm/panfrost/panfrost_job.c
> index 05c85f45a0de..1b2019e08b43 100644
> --- a/drivers/gpu/drm/panfrost/panfrost_job.c
> +++ b/drivers/gpu/drm/panfrost/panfrost_job.c
> @@ -388,13 +388,21 @@ static void panfrost_job_timedout(struct drm_sched_job *sched_job)
>  
>  	mutex_lock(&pfdev->reset_lock);
>  
> +	if (pfdev->is_resetting) {
> +		mutex_unlock(&pfdev->reset_lock);
> +		return;
> +	}
> +	pfdev->is_resetting = true;
> +
> +	mutex_unlock(&pfdev->reset_lock);
> +
>  	for (i = 0; i < NUM_JOB_SLOTS; i++)
>  		drm_sched_stop(&pfdev->js->queue[i].sched, sched_job);
>  
>  	if (sched_job)
>  		drm_sched_increase_karma(sched_job);
>  
> -	/* panfrost_core_dump(pfdev); */
> +	mutex_lock(&pfdev->reset_lock);
>  
>  	panfrost_devfreq_record_transition(pfdev, js);
>  	panfrost_device_reset(pfdev);
> @@ -406,6 +414,7 @@ static void panfrost_job_timedout(struct drm_sched_job *sched_job)
>  	for (i = 0; i < NUM_JOB_SLOTS; i++)
>  		drm_sched_start(&pfdev->js->queue[i].sched, true);
>  
> +	pfdev->is_resetting = false;
>  	mutex_unlock(&pfdev->reset_lock);
>  }
>  
> diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c
> index 148468447ba9..bc6d1862ec8a 100644
> --- a/drivers/gpu/drm/scheduler/sched_main.c
> +++ b/drivers/gpu/drm/scheduler/sched_main.c
> @@ -415,7 +415,10 @@ void drm_sched_stop(struct drm_gpu_scheduler *sched, struct drm_sched_job *bad)
>  	 * this TDR finished and before the newly restarted jobs had a
>  	 * chance to complete.
>  	 */
> -	cancel_delayed_work(&sched->work_tdr);
> +	if (bad->sched != sched)
> +		cancel_delayed_work_sync(&sched->work_tdr);
> +	else
> +		cancel_delayed_work(&sched->work_tdr);
>  }
>  
>  EXPORT_SYMBOL(drm_sched_stop);
> 



* Re: drm_sched with panfrost crash on T820
  2019-09-27 11:27     ` Neil Armstrong
@ 2019-09-27 11:48       ` Neil Armstrong
  2019-09-27 15:00         ` Steven Price
  0 siblings, 1 reply; 21+ messages in thread
From: Neil Armstrong @ 2019-09-27 11:48 UTC (permalink / raw)
  To: Steven Price, daniel, airlied, Christian König
  Cc: Tomeu Vizoso, linux-kernel, dri-devel,
	open list:ARM/Amlogic Meson...,
	Erico Nunes, Rob Herring

Hi,

On 27/09/2019 13:27, Neil Armstrong wrote:
> Hi Steven,
> 
> Thanks for your prompt reaction !
> 
> On 27/09/2019 12:48, Steven Price wrote:
>> On 27/09/2019 10:55, Steven Price wrote:
>> [...]
>>> One obvious issue with the DRM scheduler is that there is a call to
>>> cancel_delayed_work() in drm_sched_stop() which to me looks like it
>>> should be cancel_delayed_work_sync() to ensure that the timeout handling
>>> has completed.
>>>
>>> However in the above scenario a _sync() variety would then cause a
>>> deadlock (one thread has pfdev->reset_lock and is waiting for the other
>>> thread which is trying to take the lock).
>>>
>>> So we need to update Panfrost so that it can coordinate the reset
>>> between schedulers. Can you try something like the following (untested):
>>
>> And actually testing it I of course discover it doesn't quite work. We
>> do need the cancel_delayed_work_sync() in the DRM scheduler (when
>> stopping a different scheduler) and we need to avoid holding the
>> reset_lock during the drm_sched_stop() call to prevent deadlocking with
>> another thread handling a timeout.
> 
> Yep the first patch wasn't fixing the issue all the time.
> 
>>
>> Can you give the following patch a spin? I don't have a great
>> reproduction case, so it would be good to get some confidence it fixes
>> the problem.
> 
> Running it right now.

First run gave me (while applying on v5.3):
[  307.969230] Unable to handle kernel NULL pointer dereference at virtual address 00000000000000c0
[...]
[  308.029358] Hardware name: Khadas VIM2 (DT)
[  308.033510] Workqueue: events drm_sched_job_timedout
[  308.038416] pstate: 80000005 (Nzcv daif -PAN -UAO)
[  308.043160] pc : drm_sched_start+0x88/0x138
[  308.047300] lr : drm_sched_start+0xb0/0x138
[...]
[  308.133635] Call trace:
[  308.136052]  drm_sched_start+0x88/0x138
[  308.139847]  panfrost_job_timedout+0x1cc/0x208
[  308.144245]  drm_sched_job_timedout+0x44/0xa8
[  308.148560]  process_one_work+0x1e0/0x320
[  308.152524]  worker_thread+0x40/0x450
[  308.156149]  kthread+0x124/0x128
[  308.159342]  ret_from_fork+0x10/0x18
[  308.162879] Code: 54000280 f9400862 52800020 aa1a03e1 (f940605b)
[  308.168914] ---[ end trace 256b7f5faec101d2 ]---

Bad pointer seems to be struct dma_fence *fence = s_job->s_fence->parent that
could be a NULL return from panfrost_job_run().
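
For context, the loop in drm_sched_start() looks roughly like this
(paraphrased and trimmed from the v5.3-era sched_main.c, so double-check the
exact code before drawing conclusions):

	list_for_each_entry_safe(s_job, tmp, &sched->ring_mirror_list, node) {
		struct dma_fence *fence = s_job->s_fence->parent;

		/* ... */

		if (fence)
			r = dma_fence_add_callback(fence, &s_job->cb,
						   drm_sched_process_job);
		else
			drm_sched_process_job(NULL, &s_job->cb);
	}

The first statement in the loop body is the s_job->s_fence->parent load.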

Neil
> 
> Thanks,
> Neil
> 
>>
>> ----8<----
>> From 521a286789260197ae94f698932ebf369efc45ad Mon Sep 17 00:00:00 2001
>> From: Steven Price <steven.price@arm.com>
>> Date: Fri, 27 Sep 2019 11:42:40 +0100
>> Subject: [PATCH] drm/panfrost: Handle resetting on timeout better
>>
>> Panfrost uses multiple schedulers (one for each slot, so 2 in reality),
>> and on a timeout has to stop all the schedulers to safely perform a
>> reset. However more than one scheduler can trigger a timeout at the same
>> time. This race condition results in jobs being freed while they are
>> still in use.
>>
>> Modify drm_sched_stop() to call cancel_delayed_work_sync() when stopping
>> a different scheduler to the one belonging to the passed in job.
>> panfrost_job_timedout() is also modified to only allow one thread at a
>> time to handle the reset. Any subsequent threads simply return assuming
>> that the first thread will handle the situation.
>>
>> Signed-off-by: Steven Price <steven.price@arm.com>
>> ---
>>  drivers/gpu/drm/panfrost/panfrost_device.h |  2 ++
>>  drivers/gpu/drm/panfrost/panfrost_job.c    | 11 ++++++++++-
>>  drivers/gpu/drm/scheduler/sched_main.c     |  5 ++++-
>>  3 files changed, 16 insertions(+), 2 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/panfrost/panfrost_device.h b/drivers/gpu/drm/panfrost/panfrost_device.h
>> index f503c566e99f..6441c7fba6c4 100644
>> --- a/drivers/gpu/drm/panfrost/panfrost_device.h
>> +++ b/drivers/gpu/drm/panfrost/panfrost_device.h
>> @@ -99,6 +99,8 @@ struct panfrost_device {
>>  		unsigned long cur_volt;
>>  		struct panfrost_devfreq_slot slot[NUM_JOB_SLOTS];
>>  	} devfreq;
>> +
>> +	bool is_resetting;
>>  };
>>  
>>  struct panfrost_mmu {
>> diff --git a/drivers/gpu/drm/panfrost/panfrost_job.c b/drivers/gpu/drm/panfrost/panfrost_job.c
>> index 05c85f45a0de..1b2019e08b43 100644
>> --- a/drivers/gpu/drm/panfrost/panfrost_job.c
>> +++ b/drivers/gpu/drm/panfrost/panfrost_job.c
>> @@ -388,13 +388,21 @@ static void panfrost_job_timedout(struct drm_sched_job *sched_job)
>>  
>>  	mutex_lock(&pfdev->reset_lock);
>>  
>> +	if (pfdev->is_resetting) {
>> +		mutex_unlock(&pfdev->reset_lock);
>> +		return;
>> +	}
>> +	pfdev->is_resetting = true;
>> +
>> +	mutex_unlock(&pfdev->reset_lock);
>> +
>>  	for (i = 0; i < NUM_JOB_SLOTS; i++)
>>  		drm_sched_stop(&pfdev->js->queue[i].sched, sched_job);
>>  
>>  	if (sched_job)
>>  		drm_sched_increase_karma(sched_job);
>>  
>> -	/* panfrost_core_dump(pfdev); */
>> +	mutex_lock(&pfdev->reset_lock);
>>  
>>  	panfrost_devfreq_record_transition(pfdev, js);
>>  	panfrost_device_reset(pfdev);
>> @@ -406,6 +414,7 @@ static void panfrost_job_timedout(struct drm_sched_job *sched_job)
>>  	for (i = 0; i < NUM_JOB_SLOTS; i++)
>>  		drm_sched_start(&pfdev->js->queue[i].sched, true);
>>  
>> +	pfdev->is_resetting = false;
>>  	mutex_unlock(&pfdev->reset_lock);
>>  }
>>  
>> diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c
>> index 148468447ba9..bc6d1862ec8a 100644
>> --- a/drivers/gpu/drm/scheduler/sched_main.c
>> +++ b/drivers/gpu/drm/scheduler/sched_main.c
>> @@ -415,7 +415,10 @@ void drm_sched_stop(struct drm_gpu_scheduler *sched, struct drm_sched_job *bad)
>>  	 * this TDR finished and before the newly restarted jobs had a
>>  	 * chance to complete.
>>  	 */
>> -	cancel_delayed_work(&sched->work_tdr);
>> +	if (bad->sched != sched)
>> +		cancel_delayed_work_sync(&sched->work_tdr);
>> +	else
>> +		cancel_delayed_work(&sched->work_tdr);
>>  }
>>  
>>  EXPORT_SYMBOL(drm_sched_stop);
>>
> 



* Re: drm_sched with panfrost crash on T820
  2019-09-27 11:48       ` Neil Armstrong
@ 2019-09-27 15:00         ` Steven Price
  2019-09-27 15:20           ` Neil Armstrong
  2019-09-30 13:18           ` Neil Armstrong
  0 siblings, 2 replies; 21+ messages in thread
From: Steven Price @ 2019-09-27 15:00 UTC (permalink / raw)
  To: Neil Armstrong, daniel, airlied, Christian König
  Cc: Tomeu Vizoso, linux-kernel, dri-devel,
	open list:ARM/Amlogic Meson...,
	Erico Nunes

On 27/09/2019 12:48, Neil Armstrong wrote:
> Hi,
> 
> On 27/09/2019 13:27, Neil Armstrong wrote:
>> Hi Steven,
>>
>> Thanks for your prompt reaction !
>>
>> On 27/09/2019 12:48, Steven Price wrote:
>>> On 27/09/2019 10:55, Steven Price wrote:
>>> [...]
>>>> One obvious issue with the DRM scheduler is that there is a call to
>>>> cancel_delayed_work() in drm_sched_stop() which to me looks like it
>>>> should be cancel_delayed_work_sync() to ensure that the timeout handling
>>>> has completed.
>>>>
>>>> However in the above scenario a _sync() variety would then cause a
>>>> deadlock (one thread has pfdev->reset_lock and is waiting for the other
>>>> thread which is trying to take the lock).
>>>>
>>>> So we need to update Panfrost so that it can coordinate the reset
>>>> between schedulers. Can you try something like the following (untested):
>>>
>>> And actually testing it I of course discover it doesn't quite work. We
>>> do need the cancel_delayed_work_sync() in the DRM scheduler (when
>>> stopping a different scheduler) and we need to avoid holding the
>>> reset_lock during the drm_sched_stop() call to prevent deadlocking with
>>> another thread handling a timeout.
>>
>> Yep the first patch wasn't fixing the issue all the time.
>>
>>>
>>> Can you give the following patch a spin? I don't have a great
>>> reproduction case, so it would be good to get some confidence it fixes
>>> the problem.
>>
>> Running it right now.
> 
> First run gave me (while applying on v5.3):
> [  307.969230] Unable to handle kernel NULL pointer dereference at virtual address 00000000000000c0
> [...]
> [  308.029358] Hardware name: Khadas VIM2 (DT)
> [  308.033510] Workqueue: events drm_sched_job_timedout
> [  308.038416] pstate: 80000005 (Nzcv daif -PAN -UAO)
> [  308.043160] pc : drm_sched_start+0x88/0x138
> [  308.047300] lr : drm_sched_start+0xb0/0x138
> [...]
> [  308.133635] Call trace:
> [  308.136052]  drm_sched_start+0x88/0x138
> [  308.139847]  panfrost_job_timedout+0x1cc/0x208
> [  308.144245]  drm_sched_job_timedout+0x44/0xa8
> [  308.148560]  process_one_work+0x1e0/0x320
> [  308.152524]  worker_thread+0x40/0x450
> [  308.156149]  kthread+0x124/0x128
> [  308.159342]  ret_from_fork+0x10/0x18
> [  308.162879] Code: 54000280 f9400862 52800020 aa1a03e1 (f940605b)
> [  308.168914] ---[ end trace 256b7f5faec101d2 ]---
> 
> Bad pointer seems to be struct dma_fence *fence = s_job->s_fence->parent that
> could be a NULL return from panfrost_job_run().

I haven't managed to reproduce this locally, admittedly the test case I was
using before relies on changes in drm-misc-next (HEAPs specifically), so
it might be a bug present in v5.3 which isn't present in drm-misc-next.

From the code dumped we have:

   0:	54000280 	b.eq	50 <.text+0x50>  // b.none
   4:	f9400862 	ldr	x2, [x3, #16]
   8:	52800020 	mov	w0, #0x1                   	// #1
   c:	aa1a03e1 	mov	x1, x26
  10:	f940605b 	ldr	x27, [x2, #192]

Which looks like the expression s_job->s_fence->parent, so it looks like
s_job->s_fence == NULL. Which looks to me like drm_sched_job_cleanup()
has been called on the job. But I can't work out why.
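
For reference, drm_sched_job_cleanup() clears that pointer; at this point in
the tree it is essentially just:

	void drm_sched_job_cleanup(struct drm_sched_job *job)
	{
		dma_fence_put(&job->s_fence->finished);
		job->s_fence = NULL;
	}

so any later s_job->s_fence-> dereference from the timeout path faults at a
small offset from NULL, which is consistent with the fault addresses above.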

Steve

> Neil
>>
>> Thanks,
>> Neil
>>
>>>
>>> ----8<----
>>> From 521a286789260197ae94f698932ebf369efc45ad Mon Sep 17 00:00:00 2001
>>> From: Steven Price <steven.price@arm.com>
>>> Date: Fri, 27 Sep 2019 11:42:40 +0100
>>> Subject: [PATCH] drm/panfrost: Handle resetting on timeout better
>>>
>>> Panfrost uses multiple schedulers (one for each slot, so 2 in reality),
>>> and on a timeout has to stop all the schedulers to safely perform a
>>> reset. However more than one scheduler can trigger a timeout at the same
>>> time. This race condition results in jobs being freed while they are
>>> still in use.
>>>
>>> Modify drm_sched_stop() to call cancel_delayed_work_sync() when stopping
>>> a different scheduler to the one belonging to the passed in job.
>>> panfrost_job_timedout() is also modified to only allow one thread at a
>>> time to handle the reset. Any subsequent threads simply return assuming
>>> that the first thread will handle the situation.
>>>
>>> Signed-off-by: Steven Price <steven.price@arm.com>
>>> ---
>>>  drivers/gpu/drm/panfrost/panfrost_device.h |  2 ++
>>>  drivers/gpu/drm/panfrost/panfrost_job.c    | 11 ++++++++++-
>>>  drivers/gpu/drm/scheduler/sched_main.c     |  5 ++++-
>>>  3 files changed, 16 insertions(+), 2 deletions(-)
>>>
>>> diff --git a/drivers/gpu/drm/panfrost/panfrost_device.h b/drivers/gpu/drm/panfrost/panfrost_device.h
>>> index f503c566e99f..6441c7fba6c4 100644
>>> --- a/drivers/gpu/drm/panfrost/panfrost_device.h
>>> +++ b/drivers/gpu/drm/panfrost/panfrost_device.h
>>> @@ -99,6 +99,8 @@ struct panfrost_device {
>>>  		unsigned long cur_volt;
>>>  		struct panfrost_devfreq_slot slot[NUM_JOB_SLOTS];
>>>  	} devfreq;
>>> +
>>> +	bool is_resetting;
>>>  };
>>>  
>>>  struct panfrost_mmu {
>>> diff --git a/drivers/gpu/drm/panfrost/panfrost_job.c b/drivers/gpu/drm/panfrost/panfrost_job.c
>>> index 05c85f45a0de..1b2019e08b43 100644
>>> --- a/drivers/gpu/drm/panfrost/panfrost_job.c
>>> +++ b/drivers/gpu/drm/panfrost/panfrost_job.c
>>> @@ -388,13 +388,21 @@ static void panfrost_job_timedout(struct drm_sched_job *sched_job)
>>>  
>>>  	mutex_lock(&pfdev->reset_lock);
>>>  
>>> +	if (pfdev->is_resetting) {
>>> +		mutex_unlock(&pfdev->reset_lock);
>>> +		return;
>>> +	}
>>> +	pfdev->is_resetting = true;
>>> +
>>> +	mutex_unlock(&pfdev->reset_lock);
>>> +
>>>  	for (i = 0; i < NUM_JOB_SLOTS; i++)
>>>  		drm_sched_stop(&pfdev->js->queue[i].sched, sched_job);
>>>  
>>>  	if (sched_job)
>>>  		drm_sched_increase_karma(sched_job);
>>>  
>>> -	/* panfrost_core_dump(pfdev); */
>>> +	mutex_lock(&pfdev->reset_lock);
>>>  
>>>  	panfrost_devfreq_record_transition(pfdev, js);
>>>  	panfrost_device_reset(pfdev);
>>> @@ -406,6 +414,7 @@ static void panfrost_job_timedout(struct drm_sched_job *sched_job)
>>>  	for (i = 0; i < NUM_JOB_SLOTS; i++)
>>>  		drm_sched_start(&pfdev->js->queue[i].sched, true);
>>>  
>>> +	pfdev->is_resetting = false;
>>>  	mutex_unlock(&pfdev->reset_lock);
>>>  }
>>>  
>>> diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c
>>> index 148468447ba9..bc6d1862ec8a 100644
>>> --- a/drivers/gpu/drm/scheduler/sched_main.c
>>> +++ b/drivers/gpu/drm/scheduler/sched_main.c
>>> @@ -415,7 +415,10 @@ void drm_sched_stop(struct drm_gpu_scheduler *sched, struct drm_sched_job *bad)
>>>  	 * this TDR finished and before the newly restarted jobs had a
>>>  	 * chance to complete.
>>>  	 */
>>> -	cancel_delayed_work(&sched->work_tdr);
>>> +	if (bad->sched != sched)
>>> +		cancel_delayed_work_sync(&sched->work_tdr);
>>> +	else
>>> +		cancel_delayed_work(&sched->work_tdr);
>>>  }
>>>  
>>>  EXPORT_SYMBOL(drm_sched_stop);
>>>
>>
> 
> 



* Re: drm_sched with panfrost crash on T820
  2019-09-27 15:00         ` Steven Price
@ 2019-09-27 15:20           ` Neil Armstrong
  2019-09-30 13:18           ` Neil Armstrong
  1 sibling, 0 replies; 21+ messages in thread
From: Neil Armstrong @ 2019-09-27 15:20 UTC (permalink / raw)
  To: Steven Price, daniel, airlied, Christian König
  Cc: Tomeu Vizoso, linux-kernel, dri-devel,
	open list:ARM/Amlogic Meson...,
	Erico Nunes

On 27/09/2019 17:00, Steven Price wrote:
> On 27/09/2019 12:48, Neil Armstrong wrote:
>> Hi,
>>
>> On 27/09/2019 13:27, Neil Armstrong wrote:
>>> Hi Steven,
>>>
>>> Thanks for your prompt reaction !
>>>
>>> On 27/09/2019 12:48, Steven Price wrote:
>>>> On 27/09/2019 10:55, Steven Price wrote:
>>>> [...]
>>>>> One obvious issue with the DRM scheduler is that there is a call to
>>>>> cancel_delayed_work() in drm_sched_stop() which to me looks like it
>>>>> should be cancel_delayed_work_sync() to ensure that the timeout handling
>>>>> has completed.
>>>>>
>>>>> However in the above scenario a _sync() variety would then cause a
>>>>> deadlock (one thread has pfdev->reset_lock and is waiting for the other
>>>>> thread which is trying to take the lock).
>>>>>
>>>>> So we need to update Panfrost so that it can coordinate the reset
>>>>> between schedulers. Can you try something like the following (untested):
>>>>
>>>> And actually testing it I of course discover it doesn't quite work. We
>>>> do need the cancel_delayed_work_sync() in the DRM scheduler (when
>>>> stopping a different scheduler) and we need to avoid holding the
>>>> reset_lock during the drm_sched_stop() call to prevent deadlocking with
>>>> another thread handling a timeout.
>>>
>>> Yep the first patch wasn't fixing the issue all the time.
>>>
>>>>
>>>> Can you give the following patch a spin? I don't have a great
>>>> reproduction case, so it would be good to get some confidence it fixes
>>>> the problem.
>>>
>>> Running it right now.
>>
>> First run gave me (while applying on v5.3):
>> [  307.969230] Unable to handle kernel NULL pointer dereference at virtual address 00000000000000c0
>> [...]
>> [  308.029358] Hardware name: Khadas VIM2 (DT)
>> [  308.033510] Workqueue: events drm_sched_job_timedout
>> [  308.038416] pstate: 80000005 (Nzcv daif -PAN -UAO)
>> [  308.043160] pc : drm_sched_start+0x88/0x138
>> [  308.047300] lr : drm_sched_start+0xb0/0x138
>> [...]
>> [  308.133635] Call trace:
>> [  308.136052]  drm_sched_start+0x88/0x138
>> [  308.139847]  panfrost_job_timedout+0x1cc/0x208
>> [  308.144245]  drm_sched_job_timedout+0x44/0xa8
>> [  308.148560]  process_one_work+0x1e0/0x320
>> [  308.152524]  worker_thread+0x40/0x450
>> [  308.156149]  kthread+0x124/0x128
>> [  308.159342]  ret_from_fork+0x10/0x18
>> [  308.162879] Code: 54000280 f9400862 52800020 aa1a03e1 (f940605b)
>> [  308.168914] ---[ end trace 256b7f5faec101d2 ]---
>>
>> Bad pointer seems to be struct dma_fence *fence = s_job->s_fence->parent that
>> could be a NULL return from panfrost_job_run().
> 
> I haven't managed to reproduce this locally, admittedly the test case I was
> using before relies on changes in drm-misc-next (HEAPs specifically), so
> it might be a bug present in v5.3 which isn't present in drm-misc-next.

I have a hard time reproducing it... anyway the original regression
doesn't happen anymore.

I'll continue running it a few times this week-end.

Thanks,
Neil

> 
> From the code dumped we have:
> 
>    0:	54000280 	b.eq	50 <.text+0x50>  // b.none
>    4:	f9400862 	ldr	x2, [x3, #16]
>    8:	52800020 	mov	w0, #0x1                   	// #1
>    c:	aa1a03e1 	mov	x1, x26
>   10:	f940605b 	ldr	x27, [x2, #192]
> 
> Which looks like the expression s_job->s_fence->parent, so it looks like
> s_job->s_fence == NULL. Which looks to me like drm_sched_job_cleanup()
> has been called on the job. But I can't work out why.
> 
> Steve
> 
>> Neil
>>>
>>> Thanks,
>>> Neil
>>>
>>>>
>>>> ----8<----
>>>> From 521a286789260197ae94f698932ebf369efc45ad Mon Sep 17 00:00:00 2001
>>>> From: Steven Price <steven.price@arm.com>
>>>> Date: Fri, 27 Sep 2019 11:42:40 +0100
>>>> Subject: [PATCH] drm/panfrost: Handle resetting on timeout better
>>>>
>>>> Panfrost uses multiple schedulers (one for each slot, so 2 in reality),
>>>> and on a timeout has to stop all the schedulers to safely perform a
>>>> reset. However more than one scheduler can trigger a timeout at the same
>>>> time. This race condition results in jobs being freed while they are
>>>> still in use.
>>>>
>>>> Modify drm_sched_stop() to call cancel_delayed_work_sync() when stopping
>>>> a different scheduler to the one belonging to the passed in job.
>>>> panfrost_job_timedout() is also modified to only allow one thread at a
>>>> time to handle the reset. Any subsequent threads simply return assuming
>>>> that the first thread will handle the situation.
>>>>
>>>> Signed-off-by: Steven Price <steven.price@arm.com>
>>>> ---
>>>>  drivers/gpu/drm/panfrost/panfrost_device.h |  2 ++
>>>>  drivers/gpu/drm/panfrost/panfrost_job.c    | 11 ++++++++++-
>>>>  drivers/gpu/drm/scheduler/sched_main.c     |  5 ++++-
>>>>  3 files changed, 16 insertions(+), 2 deletions(-)
>>>>
>>>> diff --git a/drivers/gpu/drm/panfrost/panfrost_device.h b/drivers/gpu/drm/panfrost/panfrost_device.h
>>>> index f503c566e99f..6441c7fba6c4 100644
>>>> --- a/drivers/gpu/drm/panfrost/panfrost_device.h
>>>> +++ b/drivers/gpu/drm/panfrost/panfrost_device.h
>>>> @@ -99,6 +99,8 @@ struct panfrost_device {
>>>>  		unsigned long cur_volt;
>>>>  		struct panfrost_devfreq_slot slot[NUM_JOB_SLOTS];
>>>>  	} devfreq;
>>>> +
>>>> +	bool is_resetting;
>>>>  };
>>>>  
>>>>  struct panfrost_mmu {
>>>> diff --git a/drivers/gpu/drm/panfrost/panfrost_job.c b/drivers/gpu/drm/panfrost/panfrost_job.c
>>>> index 05c85f45a0de..1b2019e08b43 100644
>>>> --- a/drivers/gpu/drm/panfrost/panfrost_job.c
>>>> +++ b/drivers/gpu/drm/panfrost/panfrost_job.c
>>>> @@ -388,13 +388,21 @@ static void panfrost_job_timedout(struct drm_sched_job *sched_job)
>>>>  
>>>>  	mutex_lock(&pfdev->reset_lock);
>>>>  
>>>> +	if (pfdev->is_resetting) {
>>>> +		mutex_unlock(&pfdev->reset_lock);
>>>> +		return;
>>>> +	}
>>>> +	pfdev->is_resetting = true;
>>>> +
>>>> +	mutex_unlock(&pfdev->reset_lock);
>>>> +
>>>>  	for (i = 0; i < NUM_JOB_SLOTS; i++)
>>>>  		drm_sched_stop(&pfdev->js->queue[i].sched, sched_job);
>>>>  
>>>>  	if (sched_job)
>>>>  		drm_sched_increase_karma(sched_job);
>>>>  
>>>> -	/* panfrost_core_dump(pfdev); */
>>>> +	mutex_lock(&pfdev->reset_lock);
>>>>  
>>>>  	panfrost_devfreq_record_transition(pfdev, js);
>>>>  	panfrost_device_reset(pfdev);
>>>> @@ -406,6 +414,7 @@ static void panfrost_job_timedout(struct drm_sched_job *sched_job)
>>>>  	for (i = 0; i < NUM_JOB_SLOTS; i++)
>>>>  		drm_sched_start(&pfdev->js->queue[i].sched, true);
>>>>  
>>>> +	pfdev->is_resetting = false;
>>>>  	mutex_unlock(&pfdev->reset_lock);
>>>>  }
>>>>  
>>>> diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c
>>>> index 148468447ba9..bc6d1862ec8a 100644
>>>> --- a/drivers/gpu/drm/scheduler/sched_main.c
>>>> +++ b/drivers/gpu/drm/scheduler/sched_main.c
>>>> @@ -415,7 +415,10 @@ void drm_sched_stop(struct drm_gpu_scheduler *sched, struct drm_sched_job *bad)
>>>>  	 * this TDR finished and before the newly restarted jobs had a
>>>>  	 * chance to complete.
>>>>  	 */
>>>> -	cancel_delayed_work(&sched->work_tdr);
>>>> +	if (bad->sched != sched)
>>>> +		cancel_delayed_work_sync(&sched->work_tdr);
>>>> +	else
>>>> +		cancel_delayed_work(&sched->work_tdr);
>>>>  }
>>>>  
>>>>  EXPORT_SYMBOL(drm_sched_stop);
>>>>
>>>
>>
>>
> 



* Re: drm_sched with panfrost crash on T820
  2019-09-27  8:12 drm_sched with panfrost crash on T820 Neil Armstrong
  2019-09-27  9:55 ` Steven Price
@ 2019-09-27 20:55 ` Grodzovsky, Andrey
  2019-09-30  9:17   ` Neil Armstrong
       [not found]   ` <20190930145228.14000-1-hdanton@sina.com>
  1 sibling, 2 replies; 21+ messages in thread
From: Grodzovsky, Andrey @ 2019-09-27 20:55 UTC (permalink / raw)
  To: Neil Armstrong, daniel, airlied, Koenig, Christian
  Cc: Erico Nunes, linux-kernel, steven.price, dri-devel, Rob Herring,
	Tomeu Vizoso, open list:ARM/Amlogic Meson...

Can you please use addr2line or gdb to pinpoint where in
drm_sched_increase_karma you hit the NULL ptr ? It looks like it's the
guilty job, but I'd like to be sure.
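
(For example, something like this from the build directory should do it,
assuming the kernel was built with debug info:

	$ ./scripts/faddr2line vmlinux drm_sched_increase_karma+0x5c/0xf0

or the equivalent addr2line call on the PC value from the oops.)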

Andrey

On 9/27/19 4:12 AM, Neil Armstrong wrote:
> Hi Christian,
>
> In v5.3, running dEQP triggers the following kernel crash :
>
> [   20.224982] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000038
> [...]
> [   20.291064] Hardware name: Khadas VIM2 (DT)
> [   20.295217] Workqueue: events drm_sched_job_timedout
> [...]
> [   20.304867] pc : drm_sched_increase_karma+0x5c/0xf0
> [   20.309696] lr : drm_sched_increase_karma+0x44/0xf0
> [...]
> [   20.396720] Call trace:
> [   20.399138]  drm_sched_increase_karma+0x5c/0xf0
> [   20.403623]  panfrost_job_timedout+0x12c/0x1e0
> [   20.408021]  drm_sched_job_timedout+0x48/0xa0
> [   20.412336]  process_one_work+0x1e0/0x320
> [   20.416300]  worker_thread+0x40/0x450
> [   20.419924]  kthread+0x124/0x128
> [   20.423116]  ret_from_fork+0x10/0x18
> [   20.426653] Code: f9400001 540001c0 f9400a83 f9402402 (f9401c64)
> [   20.432690] ---[ end trace bd02f890139096a7 ]---
>
> Which never happens, at all, on v5.2.
>
> I did a (very) long (7 days, ~100 runs) bisect run using our LAVA lab (thanks tomeu !), but
> bisecting was not easy since the bad commit landed on drm-misc-next after v5.1-rc6, and
> then v5.2-rc1 was backmerged into drm-misc-next at:
> [1] 374ed5429346 Merge drm/drm-next into drm-misc-next
>
> Thus bisecting between [1] and v5.2-rc1 leads to commits based on v5.2-rc1... where panfrost was
> not enabled in the Khadas VIM2 DT.
>
> Anyway, I managed to identify 3 possibly breaking commits :
> [2] 290764af7e36 drm/sched: Keep s_fence->parent pointer
> [3] 5918045c4ed4 drm/scheduler: rework job destruction
> [4] a5343b8a2ca5 drm/scheduler: Add flag to hint the release of guilty job.
>
> But [1] and [2] don't crash the same way :
> [   16.257912] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000060
> [...]
> [   16.308307] CPU: 4 PID: 80 Comm: kworker/4:1 Not tainted 5.1.0-rc2-01185-g290764af7e36-dirty #378
> [   16.317099] Hardware name: Khadas VIM2 (DT)
> [...]
> [   16.330907] pc : refcount_sub_and_test_checked+0x4/0xb0
> [   16.336078] lr : refcount_dec_and_test_checked+0x14/0x20
> [...]
> [   16.423533] Process kworker/4:1 (pid: 80, stack limit = 0x(____ptrval____))
> [   16.430431] Call trace:
> [   16.432851]  refcount_sub_and_test_checked+0x4/0xb0
> [   16.437681]  drm_sched_job_cleanup+0x24/0x58
> [   16.441908]  panfrost_job_free+0x14/0x28
> [   16.445787]  drm_sched_job_timedout+0x6c/0xa0
> [   16.450102]  process_one_work+0x1e0/0x320
> [   16.454067]  worker_thread+0x40/0x450
> [   16.457690]  kthread+0x124/0x128
> [   16.460882]  ret_from_fork+0x10/0x18
> [   16.464421] Code: 52800000 d65f03c0 d503201f aa0103e3 (b9400021)
> [   16.470456] ---[ end trace 39a67412ee1b64b5 ]---
>
> and [3] fails like on v5.3 (in drm_sched_increase_karma):
> [   33.830080] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000038
> [...]
> [   33.871946] Internal error: Oops: 96000004 [#1] PREEMPT SMP
> [   33.877450] Modules linked in:
> [   33.880474] CPU: 6 PID: 81 Comm: kworker/6:1 Not tainted 5.1.0-rc2-01186-ga5343b8a2ca5-dirty #380
> [   33.889265] Hardware name: Khadas VIM2 (DT)
> [   33.893419] Workqueue: events drm_sched_job_timedout
> [...]
> [   33.903069] pc : drm_sched_increase_karma+0x5c/0xf0
> [   33.907898] lr : drm_sched_increase_karma+0x44/0xf0
> [...]
> [   33.994924] Process kworker/6:1 (pid: 81, stack limit = 0x(____ptrval____))
> [   34.001822] Call trace:
> [   34.004242]  drm_sched_increase_karma+0x5c/0xf0
> [   34.008726]  panfrost_job_timedout+0x12c/0x1e0
> [   34.013122]  drm_sched_job_timedout+0x48/0xa0
> [   34.017438]  process_one_work+0x1e0/0x320
> [   34.021402]  worker_thread+0x40/0x450
> [   34.025026]  kthread+0x124/0x128
> [   34.028218]  ret_from_fork+0x10/0x18
> [   34.031755] Code: f9400001 540001c0 f9400a83 f9402402 (f9401c64)
> [   34.037792] ---[ end trace be3fd6f77f4df267 ]---
>
>
> When I revert [3] on [1], I get the same crash as [2], meaning
> the commit [3] masks the failure [2] introduced.
>
> Do you know how to solve this ?
>
> Thanks,
> Neil


* Re: drm_sched with panfrost crash on T820
  2019-09-27 20:55 ` Grodzovsky, Andrey
@ 2019-09-30  9:17   ` Neil Armstrong
  2019-10-02 16:53     ` Grodzovsky, Andrey
       [not found]   ` <20190930145228.14000-1-hdanton@sina.com>
  1 sibling, 1 reply; 21+ messages in thread
From: Neil Armstrong @ 2019-09-30  9:17 UTC (permalink / raw)
  To: Grodzovsky, Andrey, daniel, airlied, Koenig, Christian
  Cc: Erico Nunes, linux-kernel, steven.price, dri-devel, Rob Herring,
	Tomeu Vizoso, open list:ARM/Amlogic Meson...

Hi Andrey,

On 27/09/2019 22:55, Grodzovsky, Andrey wrote:
> Can you please use addr2line or gdb to pinpoint where in 
> drm_sched_increase_karma you hit the NULL ptr ? It looks like the guilty 
> job, but to be sure.

Did a new run from 5.3:

[   35.971972] Call trace:
[   35.974391]  drm_sched_increase_karma+0x5c/0xf0	ffff000010667f38	FFFF000010667F94	drivers/gpu/drm/scheduler/sched_main.c:335


The crashing line is :
                                if (bad->s_fence->scheduled.context ==
                                    entity->fence_context) {

Doesn't seem related to guilty job.
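
For context, that line sits in the per-runqueue entity walk (paraphrased from
the v5.3 sched_main.c around line 335, so treat the details as approximate):

	for (i = DRM_SCHED_PRIORITY_MIN; i < DRM_SCHED_PRIORITY_KERNEL; i++) {
		struct drm_sched_rq *rq = &sched->sched_rq[i];

		spin_lock(&rq->lock);
		list_for_each_entry_safe(entity, tmp, &rq->entities, list) {
			if (bad->s_fence->scheduled.context ==	/* <- faults */
			    entity->fence_context) {
				if (atomic_read(&bad->karma) >
				    bad->sched->hang_limit)
					if (entity->guilty)
						atomic_set(entity->guilty, 1);
				break;
			}
		}
		spin_unlock(&rq->lock);
	}

The only pointers dereferenced on the crashing line are bad, bad->s_fence and
entity.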

Neil

> 
> Andrey
> 
> On 9/27/19 4:12 AM, Neil Armstrong wrote:
>> Hi Christian,
>>
>> In v5.3, running dEQP triggers the following kernel crash :
>>
>> [   20.224982] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000038
>> [...]
>> [   20.291064] Hardware name: Khadas VIM2 (DT)
>> [   20.295217] Workqueue: events drm_sched_job_timedout
>> [...]
>> [   20.304867] pc : drm_sched_increase_karma+0x5c/0xf0
>> [   20.309696] lr : drm_sched_increase_karma+0x44/0xf0
>> [...]
>> [   20.396720] Call trace:
>> [   20.399138]  drm_sched_increase_karma+0x5c/0xf0
>> [   20.403623]  panfrost_job_timedout+0x12c/0x1e0
>> [   20.408021]  drm_sched_job_timedout+0x48/0xa0
>> [   20.412336]  process_one_work+0x1e0/0x320
>> [   20.416300]  worker_thread+0x40/0x450
>> [   20.419924]  kthread+0x124/0x128
>> [   20.423116]  ret_from_fork+0x10/0x18
>> [   20.426653] Code: f9400001 540001c0 f9400a83 f9402402 (f9401c64)
>> [   20.432690] ---[ end trace bd02f890139096a7 ]---
>>
>> Which never happens, at all, on v5.2.
>>
>> I did a (very) long (7 days, ~100 runs) bisect run using our LAVA lab (thanks tomeu !), but
>> bisecting was not easy since the bad commit landed on drm-misc-next after v5.1-rc6, and
>> then v5.2-rc1 was backmerged into drm-misc-next at:
>> [1] 374ed5429346 Merge drm/drm-next into drm-misc-next
>>
>> Thus bisecting between [1] and v5.2-rc1 leads to commits based on v5.2-rc1... where panfrost was
>> not enabled in the Khadas VIM2 DT.
>>
>> Anyway, I managed to identify 3 possibly breaking commits :
>> [2] 290764af7e36 drm/sched: Keep s_fence->parent pointer
>> [3] 5918045c4ed4 drm/scheduler: rework job destruction
>> [4] a5343b8a2ca5 drm/scheduler: Add flag to hint the release of guilty job.
>>
>> But [1] and [2] don't crash the same way :
>> [   16.257912] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000060
>> [...]
>> [   16.308307] CPU: 4 PID: 80 Comm: kworker/4:1 Not tainted 5.1.0-rc2-01185-g290764af7e36-dirty #378
>> [   16.317099] Hardware name: Khadas VIM2 (DT)
>> [...]
>> [   16.330907] pc : refcount_sub_and_test_checked+0x4/0xb0
>> [   16.336078] lr : refcount_dec_and_test_checked+0x14/0x20
>> [...]
>> [   16.423533] Process kworker/4:1 (pid: 80, stack limit = 0x(____ptrval____))
>> [   16.430431] Call trace:
>> [   16.432851]  refcount_sub_and_test_checked+0x4/0xb0
>> [   16.437681]  drm_sched_job_cleanup+0x24/0x58
>> [   16.441908]  panfrost_job_free+0x14/0x28
>> [   16.445787]  drm_sched_job_timedout+0x6c/0xa0
>> [   16.450102]  process_one_work+0x1e0/0x320
>> [   16.454067]  worker_thread+0x40/0x450
>> [   16.457690]  kthread+0x124/0x128
>> [   16.460882]  ret_from_fork+0x10/0x18
>> [   16.464421] Code: 52800000 d65f03c0 d503201f aa0103e3 (b9400021)
>> [   16.470456] ---[ end trace 39a67412ee1b64b5 ]---
>>
>> and [3] fails like on v5.3 (in drm_sched_increase_karma):
>> [   33.830080] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000038
>> [...]
>> [   33.871946] Internal error: Oops: 96000004 [#1] PREEMPT SMP
>> [   33.877450] Modules linked in:
>> [   33.880474] CPU: 6 PID: 81 Comm: kworker/6:1 Not tainted 5.1.0-rc2-01186-ga5343b8a2ca5-dirty #380
>> [   33.889265] Hardware name: Khadas VIM2 (DT)
>> [   33.893419] Workqueue: events drm_sched_job_timedout
>> [...]
>> [   33.903069] pc : drm_sched_increase_karma+0x5c/0xf0
>> [   33.907898] lr : drm_sched_increase_karma+0x44/0xf0
>> [...]
>> [   33.994924] Process kworker/6:1 (pid: 81, stack limit = 0x(____ptrval____))
>> [   34.001822] Call trace:
>> [   34.004242]  drm_sched_increase_karma+0x5c/0xf0
>> [   34.008726]  panfrost_job_timedout+0x12c/0x1e0
>> [   34.013122]  drm_sched_job_timedout+0x48/0xa0
>> [   34.017438]  process_one_work+0x1e0/0x320
>> [   34.021402]  worker_thread+0x40/0x450
>> [   34.025026]  kthread+0x124/0x128
>> [   34.028218]  ret_from_fork+0x10/0x18
>> [   34.031755] Code: f9400001 540001c0 f9400a83 f9402402 (f9401c64)
>> [   34.037792] ---[ end trace be3fd6f77f4df267 ]---
>>
>>
>> When I revert [3] on [1], I get the same crash as [2], meaning
>> the commit [3] masks the failure [2] introduced.
>>
>> Do you know how to solve this ?
>>
>> Thanks,
>> Neil



* Re: drm_sched with panfrost crash on T820
  2019-09-27 15:00         ` Steven Price
  2019-09-27 15:20           ` Neil Armstrong
@ 2019-09-30 13:18           ` Neil Armstrong
  1 sibling, 0 replies; 21+ messages in thread
From: Neil Armstrong @ 2019-09-30 13:18 UTC (permalink / raw)
  To: Steven Price, daniel, airlied, Christian König
  Cc: Tomeu Vizoso, linux-kernel, dri-devel,
	open list:ARM/Amlogic Meson...,
	Erico Nunes

Hi Steven,

On 27/09/2019 17:00, Steven Price wrote:
> On 27/09/2019 12:48, Neil Armstrong wrote:
>> Hi,
>>
>> On 27/09/2019 13:27, Neil Armstrong wrote:
>>> Hi Steven,
>>>
>>> Thanks for your prompt reaction !
>>>
>>> On 27/09/2019 12:48, Steven Price wrote:
>>>> On 27/09/2019 10:55, Steven Price wrote:
>>>> [...]
>>>>> One obvious issue with the DRM scheduler is that there is a call to
>>>>> cancel_delayed_work() in drm_sched_stop() which to me looks like it
>>>>> should be cancel_delayed_work_sync() to ensure that the timeout handling
>>>>> has completed.
>>>>>
>>>>> However in the above scenario a _sync() variety would then cause a
>>>>> deadlock (one thread has pfdev->reset_lock and is waiting for the other
>>>>> thread which is trying to take the lock).
>>>>>
>>>>> So we need to update Panfrost so that it can coordinate the reset
>>>>> between schedulers. Can you try something like the following (untested):
>>>>
>>>> And actually testing it I of course discover it doesn't quite work. We
>>>> do need the cancel_delayed_work_sync() in the DRM scheduler (when
>>>> stopping a different scheduler) and we need to avoid holding the
>>>> reset_lock during the drm_sched_stop() call to prevent deadlocking with
>>>> another thread handling a timeout.
>>>
>>> Yep the first patch wasn't fixing the issue all the time.
>>>
>>>>
>>>> Can you give the following patch a spin? I don't have a great
>>>> reproduction case, so it would be good to get some confidence it fixes
>>>> the problem.
>>>

[...]

I ran 15 jobs with your patch and none failed, so you can add my:
Tested-by: Neil Armstrong <narmstrong@baylibre.com>

No idea how this drm_sched_start() failure came about, but I wasn't able
to reproduce it at all...

Thanks,
Neil

> 
> Steve
> 
>> Neil
>>>
>>> Thanks,
>>> Neil
>>>
>>>>
>>>> ----8<----
>>>> From 521a286789260197ae94f698932ebf369efc45ad Mon Sep 17 00:00:00 2001
>>>> From: Steven Price <steven.price@arm.com>
>>>> Date: Fri, 27 Sep 2019 11:42:40 +0100
>>>> Subject: [PATCH] drm/panfrost: Handle resetting on timeout better
>>>>
>>>> Panfrost uses multiple schedulers (one for each slot, so 2 in reality),
>>>> and on a timeout has to stop all the schedulers to safely perform a
>>>> reset. However more than one scheduler can trigger a timeout at the same
>>>> time. This race condition results in jobs being freed while they are
>>>> still in use.
>>>>
>>>> Modify drm_sched_stop() to call cancel_delayed_work_sync() when stopping
>>>> a different scheduler to the one belonging to the passed in job.
>>>> panfrost_job_timedout() is also modified to only allow one thread at a
>>>> time to handle the reset. Any subsequent threads simply return assuming
>>>> that the first thread will handle the situation.
>>>>
>>>> Signed-off-by: Steven Price <steven.price@arm.com>
>>>> ---
>>>>  drivers/gpu/drm/panfrost/panfrost_device.h |  2 ++
>>>>  drivers/gpu/drm/panfrost/panfrost_job.c    | 11 ++++++++++-
>>>>  drivers/gpu/drm/scheduler/sched_main.c     |  5 ++++-
>>>>  3 files changed, 16 insertions(+), 2 deletions(-)
>>>>
>>>> diff --git a/drivers/gpu/drm/panfrost/panfrost_device.h b/drivers/gpu/drm/panfrost/panfrost_device.h
>>>> index f503c566e99f..6441c7fba6c4 100644
>>>> --- a/drivers/gpu/drm/panfrost/panfrost_device.h
>>>> +++ b/drivers/gpu/drm/panfrost/panfrost_device.h
>>>> @@ -99,6 +99,8 @@ struct panfrost_device {
>>>>  		unsigned long cur_volt;
>>>>  		struct panfrost_devfreq_slot slot[NUM_JOB_SLOTS];
>>>>  	} devfreq;
>>>> +
>>>> +	bool is_resetting;
>>>>  };
>>>>  
>>>>  struct panfrost_mmu {
>>>> diff --git a/drivers/gpu/drm/panfrost/panfrost_job.c b/drivers/gpu/drm/panfrost/panfrost_job.c
>>>> index 05c85f45a0de..1b2019e08b43 100644
>>>> --- a/drivers/gpu/drm/panfrost/panfrost_job.c
>>>> +++ b/drivers/gpu/drm/panfrost/panfrost_job.c
>>>> @@ -388,13 +388,21 @@ static void panfrost_job_timedout(struct drm_sched_job *sched_job)
>>>>  
>>>>  	mutex_lock(&pfdev->reset_lock);
>>>>  
>>>> +	if (pfdev->is_resetting) {
>>>> +		mutex_unlock(&pfdev->reset_lock);
>>>> +		return;
>>>> +	}
>>>> +	pfdev->is_resetting = true;
>>>> +
>>>> +	mutex_unlock(&pfdev->reset_lock);
>>>> +
>>>>  	for (i = 0; i < NUM_JOB_SLOTS; i++)
>>>>  		drm_sched_stop(&pfdev->js->queue[i].sched, sched_job);
>>>>  
>>>>  	if (sched_job)
>>>>  		drm_sched_increase_karma(sched_job);
>>>>  
>>>> -	/* panfrost_core_dump(pfdev); */
>>>> +	mutex_lock(&pfdev->reset_lock);
>>>>  
>>>>  	panfrost_devfreq_record_transition(pfdev, js);
>>>>  	panfrost_device_reset(pfdev);
>>>> @@ -406,6 +414,7 @@ static void panfrost_job_timedout(struct drm_sched_job *sched_job)
>>>>  	for (i = 0; i < NUM_JOB_SLOTS; i++)
>>>>  		drm_sched_start(&pfdev->js->queue[i].sched, true);
>>>>  
>>>> +	pfdev->is_resetting = false;
>>>>  	mutex_unlock(&pfdev->reset_lock);
>>>>  }
>>>>  
>>>> diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c
>>>> index 148468447ba9..bc6d1862ec8a 100644
>>>> --- a/drivers/gpu/drm/scheduler/sched_main.c
>>>> +++ b/drivers/gpu/drm/scheduler/sched_main.c
>>>> @@ -415,7 +415,10 @@ void drm_sched_stop(struct drm_gpu_scheduler *sched, struct drm_sched_job *bad)
>>>>  	 * this TDR finished and before the newly restarted jobs had a
>>>>  	 * chance to complete.
>>>>  	 */
>>>> -	cancel_delayed_work(&sched->work_tdr);
>>>> +	if (bad->sched != sched)
>>>> +		cancel_delayed_work_sync(&sched->work_tdr);
>>>> +	else
>>>> +		cancel_delayed_work(&sched->work_tdr);
>>>>  }
>>>>  
>>>>  EXPORT_SYMBOL(drm_sched_stop);
>>>>
>>>
>>
>> _______________________________________________
>> dri-devel mailing list
>> dri-devel@lists.freedesktop.org
>> https://lists.freedesktop.org/mailman/listinfo/dri-devel
>>
> 



* Re: drm_sched with panfrost crash on T820
       [not found]   ` <20190930145228.14000-1-hdanton@sina.com>
@ 2019-10-02 14:40     ` Grodzovsky, Andrey
  2019-10-02 14:44       ` Neil Armstrong
  2019-10-03  8:34       ` Neil Armstrong
  0 siblings, 2 replies; 21+ messages in thread
From: Grodzovsky, Andrey @ 2019-10-02 14:40 UTC (permalink / raw)
  To: Hillf Danton, Neil Armstrong
  Cc: daniel, airlied, Koenig, Christian, Erico Nunes, linux-kernel,
	steven.price, dri-devel, Rob Herring, Tomeu Vizoso,
	open list:ARM/Amlogic Meson...


On 9/30/19 10:52 AM, Hillf Danton wrote:
> On Mon, 30 Sep 2019 11:17:45 +0200 Neil Armstrong wrote:
>> Did a new run from 5.3:
>>
>> [   35.971972] Call trace:
>> [   35.974391]  drm_sched_increase_karma+0x5c/0xf0
>> 			ffff000010667f38	FFFF000010667F94
>> 			drivers/gpu/drm/scheduler/sched_main.c:335
>>
>> The crashing line is :
>>                                  if (bad->s_fence->scheduled.context ==
>>                                      entity->fence_context) {
>>
>> Doesn't seem related to guilty job.
> Bail out if s_fence is no longer fresh.
>
> --- a/drivers/gpu/drm/scheduler/sched_main.c
> +++ b/drivers/gpu/drm/scheduler/sched_main.c
> @@ -333,6 +333,10 @@ void drm_sched_increase_karma(struct drm
>   
>   			spin_lock(&rq->lock);
>   			list_for_each_entry_safe(entity, tmp, &rq->entities, list) {
> +				if (!smp_load_acquire(&bad->s_fence)) {
> +					spin_unlock(&rq->lock);
> +					return;
> +				}
>   				if (bad->s_fence->scheduled.context ==
>   				    entity->fence_context) {
>   					if (atomic_read(&bad->karma) >
> @@ -543,7 +547,7 @@ EXPORT_SYMBOL(drm_sched_job_init);
>   void drm_sched_job_cleanup(struct drm_sched_job *job)
>   {
>   	dma_fence_put(&job->s_fence->finished);
> -	job->s_fence = NULL;
> +	smp_store_release(&job->s_fence, 0);
>   }
>   EXPORT_SYMBOL(drm_sched_job_cleanup);


Does this change help the problem? Note that drm_sched_job_cleanup() is 
called from the scheduler thread, which is stopped at all times while the 
work_tdr worker is running, and in any case the 'bad' job is still in the 
ring_mirror_list while it is being accessed from 
drm_sched_increase_karma(), so I don't think drm_sched_job_cleanup() can be 
called for it BEFORE or while drm_sched_increase_karma() is executed.

Andrey


>   
> --
>


* Re: drm_sched with panfrost crash on T820
  2019-10-02 14:40     ` Grodzovsky, Andrey
@ 2019-10-02 14:44       ` Neil Armstrong
  2019-10-03  8:34       ` Neil Armstrong
  1 sibling, 0 replies; 21+ messages in thread
From: Neil Armstrong @ 2019-10-02 14:44 UTC (permalink / raw)
  To: Grodzovsky, Andrey, Hillf Danton
  Cc: daniel, airlied, Koenig, Christian, Erico Nunes, linux-kernel,
	steven.price, dri-devel, Rob Herring, Tomeu Vizoso,
	open list:ARM/Amlogic Meson...

Hi Andrey,


On 02/10/2019 16:40, Grodzovsky, Andrey wrote:
> 
> On 9/30/19 10:52 AM, Hillf Danton wrote:
>> On Mon, 30 Sep 2019 11:17:45 +0200 Neil Armstrong wrote:
>>> Did a new run from 5.3:
>>>
>>> [   35.971972] Call trace:
>>> [   35.974391]  drm_sched_increase_karma+0x5c/0xf0
>>> 			ffff000010667f38	FFFF000010667F94
>>> 			drivers/gpu/drm/scheduler/sched_main.c:335
>>>
>>> The crashing line is :
>>>                                  if (bad->s_fence->scheduled.context ==
>>>                                      entity->fence_context) {
>>>
>>> Doesn't seem related to guilty job.
>> Bail out if s_fence is no longer fresh.
>>
>> --- a/drivers/gpu/drm/scheduler/sched_main.c
>> +++ b/drivers/gpu/drm/scheduler/sched_main.c
>> @@ -333,6 +333,10 @@ void drm_sched_increase_karma(struct drm
>>   
>>   			spin_lock(&rq->lock);
>>   			list_for_each_entry_safe(entity, tmp, &rq->entities, list) {
>> +				if (!smp_load_acquire(&bad->s_fence)) {
>> +					spin_unlock(&rq->lock);
>> +					return;
>> +				}
>>   				if (bad->s_fence->scheduled.context ==
>>   				    entity->fence_context) {
>>   					if (atomic_read(&bad->karma) >
>> @@ -543,7 +547,7 @@ EXPORT_SYMBOL(drm_sched_job_init);
>>   void drm_sched_job_cleanup(struct drm_sched_job *job)
>>   {
>>   	dma_fence_put(&job->s_fence->finished);
>> -	job->s_fence = NULL;
>> +	smp_store_release(&job->s_fence, 0);
>>   }
>>   EXPORT_SYMBOL(drm_sched_job_cleanup);
> 
> 
> Does this change help the problem ? Note that drm_sched_job_cleanup is 
> called from scheduler thread which is stopped at all times when work_tdr 
> thread is running and anyway the 'bad' job is still in the 
> ring_mirror_list while it's being accessed from  
> drm_sched_increase_karma so I don't think drm_sched_job_cleanup can be 
> called for it BEFORE or while drm_sched_increase_karma is executed.

I'm running it right now, will post results when finished.

Thanks,
Neil

> 
> Andrey
> 
> 
>>   
>> --
>>



* Re: drm_sched with panfrost crash on T820
  2019-09-30  9:17   ` Neil Armstrong
@ 2019-10-02 16:53     ` Grodzovsky, Andrey
  2019-10-03  8:36       ` Neil Armstrong
  0 siblings, 1 reply; 21+ messages in thread
From: Grodzovsky, Andrey @ 2019-10-02 16:53 UTC (permalink / raw)
  To: Neil Armstrong, daniel, airlied, Koenig, Christian
  Cc: Erico Nunes, linux-kernel, steven.price, dri-devel, Rob Herring,
	Tomeu Vizoso, open list:ARM/Amlogic Meson...


On 9/30/19 5:17 AM, Neil Armstrong wrote:
> Hi Andrey,
>
> On 27/09/2019 22:55, Grodzovsky, Andrey wrote:
>> Can you please use addr2line or gdb to pinpoint where in
>> drm_sched_increase_karma you hit the NULL ptr ? It looks like the guilty
>> job, but to be sure.
> Did a new run from 5.3:
>
> [   35.971972] Call trace:
> [   35.974391]  drm_sched_increase_karma+0x5c/0xf0	ffff000010667f38	FFFF000010667F94	drivers/gpu/drm/scheduler/sched_main.c:335
>
>
> The crashing line is :
>                                  if (bad->s_fence->scheduled.context ==
>                                      entity->fence_context) {
>
> Doesn't seem related to guilty job.
>
> Neil


Thanks Neil, by guilty I meant the 'bad' job. I reviewed the code and 
can't see anything suspicious for now. To help clarify, could you please 
provide an ftrace log for this? All the dma_fence and gpu_scheduler traces 
can help. I usually just set them all up in one line using the trace-cmd 
utility like this before starting the run. If you have any relevant 
traces in panfrost they can also be useful.

sudo trace-cmd start -e dma_fence -e gpu_scheduler

Andrey


>
>> Andrey
>>
>> On 9/27/19 4:12 AM, Neil Armstrong wrote:
>>> Hi Christian,
>>>
>>> In v5.3, running dEQP triggers the following kernel crash :
>>>
>>> [   20.224982] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000038
>>> [...]
>>> [   20.291064] Hardware name: Khadas VIM2 (DT)
>>> [   20.295217] Workqueue: events drm_sched_job_timedout
>>> [...]
>>> [   20.304867] pc : drm_sched_increase_karma+0x5c/0xf0
>>> [   20.309696] lr : drm_sched_increase_karma+0x44/0xf0
>>> [...]
>>> [   20.396720] Call trace:
>>> [   20.399138]  drm_sched_increase_karma+0x5c/0xf0
>>> [   20.403623]  panfrost_job_timedout+0x12c/0x1e0
>>> [   20.408021]  drm_sched_job_timedout+0x48/0xa0
>>> [   20.412336]  process_one_work+0x1e0/0x320
>>> [   20.416300]  worker_thread+0x40/0x450
>>> [   20.419924]  kthread+0x124/0x128
>>> [   20.423116]  ret_from_fork+0x10/0x18
>>> [   20.426653] Code: f9400001 540001c0 f9400a83 f9402402 (f9401c64)
>>> [   20.432690] ---[ end trace bd02f890139096a7 ]---
>>>
>>> Which never happens, at all, on v5.2.
>>>
>>> I did a (very) long (7 days, ~100runs) bisect run using our LAVA lab (thanks tomeu !), but
>>> bisecting was not easy since the bad commit landed on drm-misc-next after v5.1-rc6, and
>>> then v5.2-rc1 was backmerged into drm-misc-next at:
>>> [1] 374ed5429346 Merge drm/drm-next into drm-misc-next
>>>
>>> Thus bisecting between [1] ang v5.2-rc1 leads to commit based on v5.2-rc1... where panfrost was
>>> not enabled in the Khadas VIM2 DT.
>>>
>>> Anyway, I managed to identify 3 possibly breaking commits :
>>> [2] 290764af7e36 drm/sched: Keep s_fence->parent pointer
>>> [3] 5918045c4ed4 drm/scheduler: rework job destruction
>>> [4] a5343b8a2ca5 drm/scheduler: Add flag to hint the release of guilty job.
>>>
>>> But [1] and [2] doesn't crash the same way :
>>> [   16.257912] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000060
>>> [...]
>>> [   16.308307] CPU: 4 PID: 80 Comm: kworker/4:1 Not tainted 5.1.0-rc2-01185-g290764af7e36-dirty #378
>>> [   16.317099] Hardware name: Khadas VIM2 (DT)
>>> [...])
>>> [   16.330907] pc : refcount_sub_and_test_checked+0x4/0xb0
>>> [   16.336078] lr : refcount_dec_and_test_checked+0x14/0x20
>>> [...]
>>> [   16.423533] Process kworker/4:1 (pid: 80, stack limit = 0x(____ptrval____))
>>> [   16.430431] Call trace:
>>> [   16.432851]  refcount_sub_and_test_checked+0x4/0xb0
>>> [   16.437681]  drm_sched_job_cleanup+0x24/0x58
>>> [   16.441908]  panfrost_job_free+0x14/0x28
>>> [   16.445787]  drm_sched_job_timedout+0x6c/0xa0
>>> [   16.450102]  process_one_work+0x1e0/0x320
>>> [   16.454067]  worker_thread+0x40/0x450
>>> [   16.457690]  kthread+0x124/0x128
>>> [   16.460882]  ret_from_fork+0x10/0x18
>>> [   16.464421] Code: 52800000 d65f03c0 d503201f aa0103e3 (b9400021)
>>> [   16.470456] ---[ end trace 39a67412ee1b64b5 ]---
>>>
>>> and [3] fails like on v5.3 (in drm_sched_increase_karma):
>>> [   33.830080] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000038
>>> [...]
>>> [   33.871946] Internal error: Oops: 96000004 [#1] PREEMPT SMP
>>> [   33.877450] Modules linked in:
>>> [   33.880474] CPU: 6 PID: 81 Comm: kworker/6:1 Not tainted 5.1.0-rc2-01186-ga5343b8a2ca5-dirty #380
>>> [   33.889265] Hardware name: Khadas VIM2 (DT)
>>> [   33.893419] Workqueue: events drm_sched_job_timedout
>>> [...]
>>> [   33.903069] pc : drm_sched_increase_karma+0x5c/0xf0
>>> [   33.907898] lr : drm_sched_increase_karma+0x44/0xf0
>>> [...]
>>> [   33.994924] Process kworker/6:1 (pid: 81, stack limit = 0x(____ptrval____))
>>> [   34.001822] Call trace:
>>> [   34.004242]  drm_sched_increase_karma+0x5c/0xf0
>>> [   34.008726]  panfrost_job_timedout+0x12c/0x1e0
>>> [   34.013122]  drm_sched_job_timedout+0x48/0xa0
>>> [   34.017438]  process_one_work+0x1e0/0x320
>>> [   34.021402]  worker_thread+0x40/0x450
>>> [   34.025026]  kthread+0x124/0x128
>>> [   34.028218]  ret_from_fork+0x10/0x18
>>> [   34.031755] Code: f9400001 540001c0 f9400a83 f9402402 (f9401c64)
>>> [   34.037792] ---[ end trace be3fd6f77f4df267 ]---
>>>
>>>
>>> When I revert [3] on [1], i get the same crash as [2], meaning
>>> the commit [3] masks the failure [2] introduced.
>>>
>>> Do you know how to solve this ?
>>>
>>> Thanks,
>>> Neil


* Re: drm_sched with panfrost crash on T820
  2019-10-02 14:40     ` Grodzovsky, Andrey
  2019-10-02 14:44       ` Neil Armstrong
@ 2019-10-03  8:34       ` Neil Armstrong
  2019-10-04 14:53         ` Grodzovsky, Andrey
  1 sibling, 1 reply; 21+ messages in thread
From: Neil Armstrong @ 2019-10-03  8:34 UTC (permalink / raw)
  To: Grodzovsky, Andrey, Hillf Danton
  Cc: daniel, airlied, Koenig, Christian, Erico Nunes, linux-kernel,
	steven.price, dri-devel, Rob Herring, Tomeu Vizoso,
	open list:ARM/Amlogic Meson...

Hi Andrey,

Le 02/10/2019 à 16:40, Grodzovsky, Andrey a écrit :
> 
> On 9/30/19 10:52 AM, Hillf Danton wrote:
>> On Mon, 30 Sep 2019 11:17:45 +0200 Neil Armstrong wrote:
>>> Did a new run from 5.3:
>>>
>>> [   35.971972] Call trace:
>>> [   35.974391]  drm_sched_increase_karma+0x5c/0xf0
>>> 			ffff000010667f38	FFFF000010667F94
>>> 			drivers/gpu/drm/scheduler/sched_main.c:335
>>>
>>> The crashing line is :
>>>                                  if (bad->s_fence->scheduled.context ==
>>>                                      entity->fence_context) {
>>>
>>> Doesn't seem related to guilty job.
>> Bail out if s_fence is no longer fresh.
>>
>> --- a/drivers/gpu/drm/scheduler/sched_main.c
>> +++ b/drivers/gpu/drm/scheduler/sched_main.c
>> @@ -333,6 +333,10 @@ void drm_sched_increase_karma(struct drm
>>   
>>   			spin_lock(&rq->lock);
>>   			list_for_each_entry_safe(entity, tmp, &rq->entities, list) {
>> +				if (!smp_load_acquire(&bad->s_fence)) {
>> +					spin_unlock(&rq->lock);
>> +					return;
>> +				}
>>   				if (bad->s_fence->scheduled.context ==
>>   				    entity->fence_context) {
>>   					if (atomic_read(&bad->karma) >
>> @@ -543,7 +547,7 @@ EXPORT_SYMBOL(drm_sched_job_init);
>>   void drm_sched_job_cleanup(struct drm_sched_job *job)
>>   {
>>   	dma_fence_put(&job->s_fence->finished);
>> -	job->s_fence = NULL;
>> +	smp_store_release(&job->s_fence, 0);
>>   }
>>   EXPORT_SYMBOL(drm_sched_job_cleanup);
> 

This fixed the problem on the 10 CI runs.

Neil

> 
> Does this change help the problem ? Note that drm_sched_job_cleanup is 
> called from scheduler thread which is stopped at all times when work_tdr 
> thread is running and anyway the 'bad' job is still in the 
> ring_mirror_list while it's being accessed from  
> drm_sched_increase_karma so I don't think drm_sched_job_cleanup can be 
> called for it BEFORE or while drm_sched_increase_karma is executed.
> 
> Andrey
> 
> 
>>   
>> --
>>


* Re: drm_sched with panfrost crash on T820
  2019-10-02 16:53     ` Grodzovsky, Andrey
@ 2019-10-03  8:36       ` Neil Armstrong
  0 siblings, 0 replies; 21+ messages in thread
From: Neil Armstrong @ 2019-10-03  8:36 UTC (permalink / raw)
  To: Grodzovsky, Andrey, daniel, airlied, Koenig, Christian
  Cc: Erico Nunes, linux-kernel, steven.price, dri-devel, Rob Herring,
	Tomeu Vizoso, open list:ARM/Amlogic Meson...



Le 02/10/2019 à 18:53, Grodzovsky, Andrey a écrit :
> 
> On 9/30/19 5:17 AM, Neil Armstrong wrote:
>> Hi Andrey,
>>
>> On 27/09/2019 22:55, Grodzovsky, Andrey wrote:
>>> Can you please use addr2line or gdb to pinpoint where in
>>> drm_sched_increase_karma you hit the NULL ptr ? It looks like the guilty
>>> job, but to be sure.
>> Did a new run from 5.3:
>>
>> [   35.971972] Call trace:
>> [   35.974391]  drm_sched_increase_karma+0x5c/0xf0	ffff000010667f38	FFFF000010667F94	drivers/gpu/drm/scheduler/sched_main.c:335
>>
>>
>> The crashing line is :
>>                                  if (bad->s_fence->scheduled.context ==
>>                                      entity->fence_context) {
>>
>> Doesn't seem related to guilty job.
>>
>> Neil
> 
> 
> Thanks Neil, by guilty i meant the 'bad' job. I reviewed the code and 
> can't see anything suspicious for now. To help clarify could you please 
> provide ftrace log for this ? All the dma_fence and gpu_scheduler traces 
> can help. I usually just set them all up in one line using trace-cmd 
> utility like this before starting the run. If you have any relevant 
> traces in panfrost it aslo can be useful.
> 
> sudo trace-cmd start -e dma_fence -e gpu_scheduler

Sure, but I'll need much more time to do this. In the meantime I did 10 runs with your
patch and it fixed the issue.

I'll try to generate the traces.

Neil

> 
> Andrey
> 
> 
>>
>>> Andrey
>>>
>>> On 9/27/19 4:12 AM, Neil Armstrong wrote:
>>>> Hi Christian,
>>>>
>>>> In v5.3, running dEQP triggers the following kernel crash :
>>>>
>>>> [   20.224982] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000038
>>>> [...]
>>>> [   20.291064] Hardware name: Khadas VIM2 (DT)
>>>> [   20.295217] Workqueue: events drm_sched_job_timedout
>>>> [...]
>>>> [   20.304867] pc : drm_sched_increase_karma+0x5c/0xf0
>>>> [   20.309696] lr : drm_sched_increase_karma+0x44/0xf0
>>>> [...]
>>>> [   20.396720] Call trace:
>>>> [   20.399138]  drm_sched_increase_karma+0x5c/0xf0
>>>> [   20.403623]  panfrost_job_timedout+0x12c/0x1e0
>>>> [   20.408021]  drm_sched_job_timedout+0x48/0xa0
>>>> [   20.412336]  process_one_work+0x1e0/0x320
>>>> [   20.416300]  worker_thread+0x40/0x450
>>>> [   20.419924]  kthread+0x124/0x128
>>>> [   20.423116]  ret_from_fork+0x10/0x18
>>>> [   20.426653] Code: f9400001 540001c0 f9400a83 f9402402 (f9401c64)
>>>> [   20.432690] ---[ end trace bd02f890139096a7 ]---
>>>>
>>>> Which never happens, at all, on v5.2.
>>>>
>>>> I did a (very) long (7 days, ~100runs) bisect run using our LAVA lab (thanks tomeu !), but
>>>> bisecting was not easy since the bad commit landed on drm-misc-next after v5.1-rc6, and
>>>> then v5.2-rc1 was backmerged into drm-misc-next at:
>>>> [1] 374ed5429346 Merge drm/drm-next into drm-misc-next
>>>>
>>>> Thus bisecting between [1] ang v5.2-rc1 leads to commit based on v5.2-rc1... where panfrost was
>>>> not enabled in the Khadas VIM2 DT.
>>>>
>>>> Anyway, I managed to identify 3 possibly breaking commits :
>>>> [2] 290764af7e36 drm/sched: Keep s_fence->parent pointer
>>>> [3] 5918045c4ed4 drm/scheduler: rework job destruction
>>>> [4] a5343b8a2ca5 drm/scheduler: Add flag to hint the release of guilty job.
>>>>
>>>> But [1] and [2] doesn't crash the same way :
>>>> [   16.257912] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000060
>>>> [...]
>>>> [   16.308307] CPU: 4 PID: 80 Comm: kworker/4:1 Not tainted 5.1.0-rc2-01185-g290764af7e36-dirty #378
>>>> [   16.317099] Hardware name: Khadas VIM2 (DT)
>>>> [...])
>>>> [   16.330907] pc : refcount_sub_and_test_checked+0x4/0xb0
>>>> [   16.336078] lr : refcount_dec_and_test_checked+0x14/0x20
>>>> [...]
>>>> [   16.423533] Process kworker/4:1 (pid: 80, stack limit = 0x(____ptrval____))
>>>> [   16.430431] Call trace:
>>>> [   16.432851]  refcount_sub_and_test_checked+0x4/0xb0
>>>> [   16.437681]  drm_sched_job_cleanup+0x24/0x58
>>>> [   16.441908]  panfrost_job_free+0x14/0x28
>>>> [   16.445787]  drm_sched_job_timedout+0x6c/0xa0
>>>> [   16.450102]  process_one_work+0x1e0/0x320
>>>> [   16.454067]  worker_thread+0x40/0x450
>>>> [   16.457690]  kthread+0x124/0x128
>>>> [   16.460882]  ret_from_fork+0x10/0x18
>>>> [   16.464421] Code: 52800000 d65f03c0 d503201f aa0103e3 (b9400021)
>>>> [   16.470456] ---[ end trace 39a67412ee1b64b5 ]---
>>>>
>>>> and [3] fails like on v5.3 (in drm_sched_increase_karma):
>>>> [   33.830080] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000038
>>>> [...]
>>>> [   33.871946] Internal error: Oops: 96000004 [#1] PREEMPT SMP
>>>> [   33.877450] Modules linked in:
>>>> [   33.880474] CPU: 6 PID: 81 Comm: kworker/6:1 Not tainted 5.1.0-rc2-01186-ga5343b8a2ca5-dirty #380
>>>> [   33.889265] Hardware name: Khadas VIM2 (DT)
>>>> [   33.893419] Workqueue: events drm_sched_job_timedout
>>>> [...]
>>>> [   33.903069] pc : drm_sched_increase_karma+0x5c/0xf0
>>>> [   33.907898] lr : drm_sched_increase_karma+0x44/0xf0
>>>> [...]
>>>> [   33.994924] Process kworker/6:1 (pid: 81, stack limit = 0x(____ptrval____))
>>>> [   34.001822] Call trace:
>>>> [   34.004242]  drm_sched_increase_karma+0x5c/0xf0
>>>> [   34.008726]  panfrost_job_timedout+0x12c/0x1e0
>>>> [   34.013122]  drm_sched_job_timedout+0x48/0xa0
>>>> [   34.017438]  process_one_work+0x1e0/0x320
>>>> [   34.021402]  worker_thread+0x40/0x450
>>>> [   34.025026]  kthread+0x124/0x128
>>>> [   34.028218]  ret_from_fork+0x10/0x18
>>>> [   34.031755] Code: f9400001 540001c0 f9400a83 f9402402 (f9401c64)
>>>> [   34.037792] ---[ end trace be3fd6f77f4df267 ]---
>>>>
>>>>
>>>> When I revert [3] on [1], i get the same crash as [2], meaning
>>>> the commit [3] masks the failure [2] introduced.
>>>>
>>>> Do you know how to solve this ?
>>>>
>>>> Thanks,
>>>> Neil


* Re: drm_sched with panfrost crash on T820
  2019-10-03  8:34       ` Neil Armstrong
@ 2019-10-04 14:53         ` Grodzovsky, Andrey
  2019-10-04 15:03           ` Neil Armstrong
  0 siblings, 1 reply; 21+ messages in thread
From: Grodzovsky, Andrey @ 2019-10-04 14:53 UTC (permalink / raw)
  To: Neil Armstrong, Hillf Danton
  Cc: daniel, airlied, Koenig, Christian, Erico Nunes, linux-kernel,
	steven.price, dri-devel, Rob Herring, Tomeu Vizoso,
	open list:ARM/Amlogic Meson...


On 10/3/19 4:34 AM, Neil Armstrong wrote:
> Hi Andrey,
>
> Le 02/10/2019 à 16:40, Grodzovsky, Andrey a écrit :
>> On 9/30/19 10:52 AM, Hillf Danton wrote:
>>> On Mon, 30 Sep 2019 11:17:45 +0200 Neil Armstrong wrote:
>>>> Did a new run from 5.3:
>>>>
>>>> [   35.971972] Call trace:
>>>> [   35.974391]  drm_sched_increase_karma+0x5c/0xf0
>>>> 			ffff000010667f38	FFFF000010667F94
>>>> 			drivers/gpu/drm/scheduler/sched_main.c:335
>>>>
>>>> The crashing line is :
>>>>                                   if (bad->s_fence->scheduled.context ==
>>>>                                       entity->fence_context) {
>>>>
>>>> Doesn't seem related to guilty job.
>>> Bail out if s_fence is no longer fresh.
>>>
>>> --- a/drivers/gpu/drm/scheduler/sched_main.c
>>> +++ b/drivers/gpu/drm/scheduler/sched_main.c
>>> @@ -333,6 +333,10 @@ void drm_sched_increase_karma(struct drm
>>>    
>>>    			spin_lock(&rq->lock);
>>>    			list_for_each_entry_safe(entity, tmp, &rq->entities, list) {
>>> +				if (!smp_load_acquire(&bad->s_fence)) {
>>> +					spin_unlock(&rq->lock);
>>> +					return;
>>> +				}
>>>    				if (bad->s_fence->scheduled.context ==
>>>    				    entity->fence_context) {
>>>    					if (atomic_read(&bad->karma) >
>>> @@ -543,7 +547,7 @@ EXPORT_SYMBOL(drm_sched_job_init);
>>>    void drm_sched_job_cleanup(struct drm_sched_job *job)
>>>    {
>>>    	dma_fence_put(&job->s_fence->finished);
>>> -	job->s_fence = NULL;
>>> +	smp_store_release(&job->s_fence, 0);
>>>    }
>>>    EXPORT_SYMBOL(drm_sched_job_cleanup);
> This fixed the problem on the 10 CI runs.
>
> Neil


This is good news, but I still fail to see how it fixes the problem - 
Hillf, do you mind explaining how you came up with this particular fix - 
what was the bug you saw?

Andrey

>
>> Does this change help the problem ? Note that drm_sched_job_cleanup is
>> called from scheduler thread which is stopped at all times when work_tdr
>> thread is running and anyway the 'bad' job is still in the
>> ring_mirror_list while it's being accessed from
>> drm_sched_increase_karma so I don't think drm_sched_job_cleanup can be
>> called for it BEFORE or while drm_sched_increase_karma is executed.
>>
>> Andrey
>>
>>
>>>    
>>> --
>>>


* Re: drm_sched with panfrost crash on T820
  2019-10-04 14:53         ` Grodzovsky, Andrey
@ 2019-10-04 15:03           ` Neil Armstrong
  2019-10-04 15:27             ` Steven Price
  0 siblings, 1 reply; 21+ messages in thread
From: Neil Armstrong @ 2019-10-04 15:03 UTC (permalink / raw)
  To: Grodzovsky, Andrey, Hillf Danton
  Cc: daniel, airlied, Koenig, Christian, Erico Nunes, linux-kernel,
	steven.price, dri-devel, Rob Herring, Tomeu Vizoso,
	open list:ARM/Amlogic Meson...

On 04/10/2019 16:53, Grodzovsky, Andrey wrote:
> 
> On 10/3/19 4:34 AM, Neil Armstrong wrote:
>> Hi Andrey,
>>
>> Le 02/10/2019 à 16:40, Grodzovsky, Andrey a écrit :
>>> On 9/30/19 10:52 AM, Hillf Danton wrote:
>>>> On Mon, 30 Sep 2019 11:17:45 +0200 Neil Armstrong wrote:
>>>>> Did a new run from 5.3:
>>>>>
>>>>> [   35.971972] Call trace:
>>>>> [   35.974391]  drm_sched_increase_karma+0x5c/0xf0
>>>>> 			ffff000010667f38	FFFF000010667F94
>>>>> 			drivers/gpu/drm/scheduler/sched_main.c:335
>>>>>
>>>>> The crashing line is :
>>>>>                                   if (bad->s_fence->scheduled.context ==
>>>>>                                       entity->fence_context) {
>>>>>
>>>>> Doesn't seem related to guilty job.
>>>> Bail out if s_fence is no longer fresh.
>>>>
>>>> --- a/drivers/gpu/drm/scheduler/sched_main.c
>>>> +++ b/drivers/gpu/drm/scheduler/sched_main.c
>>>> @@ -333,6 +333,10 @@ void drm_sched_increase_karma(struct drm
>>>>    
>>>>    			spin_lock(&rq->lock);
>>>>    			list_for_each_entry_safe(entity, tmp, &rq->entities, list) {
>>>> +				if (!smp_load_acquire(&bad->s_fence)) {
>>>> +					spin_unlock(&rq->lock);
>>>> +					return;
>>>> +				}
>>>>    				if (bad->s_fence->scheduled.context ==
>>>>    				    entity->fence_context) {
>>>>    					if (atomic_read(&bad->karma) >
>>>> @@ -543,7 +547,7 @@ EXPORT_SYMBOL(drm_sched_job_init);
>>>>    void drm_sched_job_cleanup(struct drm_sched_job *job)
>>>>    {
>>>>    	dma_fence_put(&job->s_fence->finished);
>>>> -	job->s_fence = NULL;
>>>> +	smp_store_release(&job->s_fence, 0);
>>>>    }
>>>>    EXPORT_SYMBOL(drm_sched_job_cleanup);
>> This fixed the problem on the 10 CI runs.
>>
>> Neil
> 
> 
> These are good news but I still fail to see how this fixes the problem - 
> Hillf, do you mind explaining how you came up with this particular fix - 
> what was the bug you saw ?

As Steven explained, it seems the same job was submitted on both HW slots,
and when a timeout occurs each thread calls panfrost_job_timedout(), which
leads to drm_sched_stop() on the first call, while by the second call the
job has already been freed.

Steven proposed a working fix, and this one does the same but on
the drm_sched side. This one looks cleaner, but panfrost should
not call drm_sched_stop() twice for the same job.

Neil

> 
> Andrey
> 
>>
>>> Does this change help the problem ? Note that drm_sched_job_cleanup is
>>> called from scheduler thread which is stopped at all times when work_tdr
>>> thread is running and anyway the 'bad' job is still in the
>>> ring_mirror_list while it's being accessed from
>>> drm_sched_increase_karma so I don't think drm_sched_job_cleanup can be
>>> called for it BEFORE or while drm_sched_increase_karma is executed.
>>>
>>> Andrey
>>>
>>>
>>>>    
>>>> --
>>>>



* Re: drm_sched with panfrost crash on T820
  2019-10-04 15:03           ` Neil Armstrong
@ 2019-10-04 15:27             ` Steven Price
  2019-10-04 15:34               ` Koenig, Christian
  0 siblings, 1 reply; 21+ messages in thread
From: Steven Price @ 2019-10-04 15:27 UTC (permalink / raw)
  To: Neil Armstrong, Grodzovsky, Andrey, Hillf Danton
  Cc: Tomeu Vizoso, airlied, linux-kernel, dri-devel,
	open list:ARM/Amlogic Meson...,
	Koenig, Christian, Erico Nunes

On 04/10/2019 16:03, Neil Armstrong wrote:
> On 04/10/2019 16:53, Grodzovsky, Andrey wrote:
>>
>> On 10/3/19 4:34 AM, Neil Armstrong wrote:
>>> Hi Andrey,
>>>
>>> Le 02/10/2019 à 16:40, Grodzovsky, Andrey a écrit :
>>>> On 9/30/19 10:52 AM, Hillf Danton wrote:
>>>>> On Mon, 30 Sep 2019 11:17:45 +0200 Neil Armstrong wrote:
>>>>>> Did a new run from 5.3:
>>>>>>
>>>>>> [   35.971972] Call trace:
>>>>>> [   35.974391]  drm_sched_increase_karma+0x5c/0xf0
>>>>>> 			ffff000010667f38	FFFF000010667F94
>>>>>> 			drivers/gpu/drm/scheduler/sched_main.c:335
>>>>>>
>>>>>> The crashing line is :
>>>>>>                                   if (bad->s_fence->scheduled.context ==
>>>>>>                                       entity->fence_context) {
>>>>>>
>>>>>> Doesn't seem related to guilty job.
>>>>> Bail out if s_fence is no longer fresh.
>>>>>
>>>>> --- a/drivers/gpu/drm/scheduler/sched_main.c
>>>>> +++ b/drivers/gpu/drm/scheduler/sched_main.c
>>>>> @@ -333,6 +333,10 @@ void drm_sched_increase_karma(struct drm
>>>>>    
>>>>>    			spin_lock(&rq->lock);
>>>>>    			list_for_each_entry_safe(entity, tmp, &rq->entities, list) {
>>>>> +				if (!smp_load_acquire(&bad->s_fence)) {
>>>>> +					spin_unlock(&rq->lock);
>>>>> +					return;
>>>>> +				}
>>>>>    				if (bad->s_fence->scheduled.context ==
>>>>>    				    entity->fence_context) {
>>>>>    					if (atomic_read(&bad->karma) >
>>>>> @@ -543,7 +547,7 @@ EXPORT_SYMBOL(drm_sched_job_init);
>>>>>    void drm_sched_job_cleanup(struct drm_sched_job *job)
>>>>>    {
>>>>>    	dma_fence_put(&job->s_fence->finished);
>>>>> -	job->s_fence = NULL;
>>>>> +	smp_store_release(&job->s_fence, 0);
>>>>>    }
>>>>>    EXPORT_SYMBOL(drm_sched_job_cleanup);
>>> This fixed the problem on the 10 CI runs.
>>>
>>> Neil
>>
>>
>> These are good news but I still fail to see how this fixes the problem - 
>> Hillf, do you mind explaining how you came up with this particular fix - 
>> what was the bug you saw ?
> 
> As Steven explained, seems the same job was submitted on both HW slots,
> and then when timeout occurs each thread calls panfrost_job_timedout
> which leads to drm_sched_stop() on the first call and on the
> second call the job was already freed.
> 
> Steven proposed a working fix, and this one does the same but on
> the drm_sched side. This one looks cleaner, but panfrost should
> not call drm_sched_stop() twice for the same job.

I'm not sure that Hillf's fix is sufficient. In particular in
drm_sched_increase_karma() I don't understand how the smp_load_acquire()
call prevents bad->s_fence becoming NULL immediately afterwards (but
admittedly the window is much smaller). But really this is just a
Panfrost bug (calling drm_sched_stop() twice on the same job).

The part of my change that I'd welcome feedback on is changing
cancel_delayed_work() to cancel_delayed_work_sync() in drm_sched_stop()
when called on a different scheduler to the bad job's. It's not clear to me
exactly what the semantics of the function should be, and I haven't
tested the effect of the change on drivers other than Panfrost.

Steve

> Neil
> 
>>
>> Andrey
>>
>>>
>>>> Does this change help the problem ? Note that drm_sched_job_cleanup is
>>>> called from scheduler thread which is stopped at all times when work_tdr
>>>> thread is running and anyway the 'bad' job is still in the
>>>> ring_mirror_list while it's being accessed from
>>>> drm_sched_increase_karma so I don't think drm_sched_job_cleanup can be
>>>> called for it BEFORE or while drm_sched_increase_karma is executed.
>>>>
>>>> Andrey
>>>>
>>>>
>>>>>    
>>>>> --
>>>>>
> 
> _______________________________________________
> dri-devel mailing list
> dri-devel@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/dri-devel
> 



* Re: drm_sched with panfrost crash on T820
  2019-10-04 15:27             ` Steven Price
@ 2019-10-04 15:34               ` Koenig, Christian
  2019-10-04 16:02                 ` Steven Price
  0 siblings, 1 reply; 21+ messages in thread
From: Koenig, Christian @ 2019-10-04 15:34 UTC (permalink / raw)
  To: Steven Price, Neil Armstrong, Grodzovsky, Andrey, Hillf Danton
  Cc: Tomeu Vizoso, airlied, linux-kernel, dri-devel,
	open list:ARM/Amlogic Meson...,
	Erico Nunes

Am 04.10.19 um 17:27 schrieb Steven Price:
> On 04/10/2019 16:03, Neil Armstrong wrote:
>> On 04/10/2019 16:53, Grodzovsky, Andrey wrote:
>>> On 10/3/19 4:34 AM, Neil Armstrong wrote:
>>>> Hi Andrey,
>>>>
>>>> Le 02/10/2019 à 16:40, Grodzovsky, Andrey a écrit :
>>>>> On 9/30/19 10:52 AM, Hillf Danton wrote:
>>>>>> On Mon, 30 Sep 2019 11:17:45 +0200 Neil Armstrong wrote:
>>>>>>> Did a new run from 5.3:
>>>>>>>
>>>>>>> [   35.971972] Call trace:
>>>>>>> [   35.974391]  drm_sched_increase_karma+0x5c/0xf0
>>>>>>> 			ffff000010667f38	FFFF000010667F94
>>>>>>> 			drivers/gpu/drm/scheduler/sched_main.c:335
>>>>>>>
>>>>>>> The crashing line is :
>>>>>>>                                    if (bad->s_fence->scheduled.context ==
>>>>>>>                                        entity->fence_context) {
>>>>>>>
>>>>>>> Doesn't seem related to guilty job.
>>>>>> Bail out if s_fence is no longer fresh.
>>>>>>
>>>>>> --- a/drivers/gpu/drm/scheduler/sched_main.c
>>>>>> +++ b/drivers/gpu/drm/scheduler/sched_main.c
>>>>>> @@ -333,6 +333,10 @@ void drm_sched_increase_karma(struct drm
>>>>>>     
>>>>>>     			spin_lock(&rq->lock);
>>>>>>     			list_for_each_entry_safe(entity, tmp, &rq->entities, list) {
>>>>>> +				if (!smp_load_acquire(&bad->s_fence)) {
>>>>>> +					spin_unlock(&rq->lock);
>>>>>> +					return;
>>>>>> +				}
>>>>>>     				if (bad->s_fence->scheduled.context ==
>>>>>>     				    entity->fence_context) {
>>>>>>     					if (atomic_read(&bad->karma) >
>>>>>> @@ -543,7 +547,7 @@ EXPORT_SYMBOL(drm_sched_job_init);
>>>>>>     void drm_sched_job_cleanup(struct drm_sched_job *job)
>>>>>>     {
>>>>>>     	dma_fence_put(&job->s_fence->finished);
>>>>>> -	job->s_fence = NULL;
>>>>>> +	smp_store_release(&job->s_fence, 0);
>>>>>>     }
>>>>>>     EXPORT_SYMBOL(drm_sched_job_cleanup);
>>>> This fixed the problem on the 10 CI runs.
>>>>
>>>> Neil
>>>
>>> These are good news but I still fail to see how this fixes the problem -
>>> Hillf, do you mind explaining how you came up with this particular fix -
>>> what was the bug you saw ?
>> As Steven explained, seems the same job was submitted on both HW slots,
>> and then when timeout occurs each thread calls panfrost_job_timedout
>> which leads to drm_sched_stop() on the first call and on the
>> second call the job was already freed.
>>
>> Steven proposed a working fix, and this one does the same but on
>> the drm_sched side. This one looks cleaner, but panfrost should
>> not call drm_sched_stop() twice for the same job.
> I'm not sure that Hillf's fix is sufficient. In particular in
> drm_sched_increase_karma() I don't understand how the smp_load_acquire()
> call prevents bad->s_fence becoming NULL immediately afterwards (but
> admittedly the window is much smaller). But really this is just a
> Panfrost bug (calling drm_sched_stop() twice on the same job).
>
> The part of my change that I'd welcome feedback on is changing
> cancel_delayed_work() to cancel_delayed_work_sync() in drm_sched_stop()
> when called on different scheduler to the bad job. It's not clear to me
> exactly what the semantics of the function should be, and I haven't
> tested the effect of the change on drivers other than Panfrost.

Yeah, at least offhand that change doesn't seem to make sense to me.

Multiple timeout workers can perfectly well run in parallel; Panfrost needs 
to make sure that they don't affect each other.

The simplest way of doing this is to have a mutex you trylock when 
starting the reset.

The first one grabbing it wins, all the others just silently return.
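
A minimal sketch of that trylock approach in panfrost_job_timedout(), reusing
the existing pfdev->reset_lock (illustrative only, not a tested or posted
patch; the to_panfrost_job()/job->pfdev accessors are assumed to be the
driver's usual helpers, and the devfreq bookkeeping and job-state dumping of
the real handler are omitted):

static void panfrost_job_timedout(struct drm_sched_job *sched_job)
{
	struct panfrost_job *job = to_panfrost_job(sched_job);
	struct panfrost_device *pfdev = job->pfdev;
	int i;

	/*
	 * Only one timeout worker performs the reset; any worker that races
	 * in after it simply backs off. Its scheduler is still stopped and
	 * restarted by the winning worker below.
	 */
	if (!mutex_trylock(&pfdev->reset_lock))
		return;

	for (i = 0; i < NUM_JOB_SLOTS; i++)
		drm_sched_stop(&pfdev->js->queue[i].sched, sched_job);

	if (sched_job)
		drm_sched_increase_karma(sched_job);

	panfrost_device_reset(pfdev);

	for (i = 0; i < NUM_JOB_SLOTS; i++)
		drm_sched_resubmit_jobs(&pfdev->js->queue[i].sched);

	/* Restart all schedulers, including the one the losing worker timed out on. */
	for (i = 0; i < NUM_JOB_SLOTS; i++)
		drm_sched_start(&pfdev->js->queue[i].sched, true);

	mutex_unlock(&pfdev->reset_lock);
}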

Regards,
Christian.

>
> Steve
>
>> Neil
>>
>>> Andrey
>>>
>>>>> Does this change help the problem ? Note that drm_sched_job_cleanup is
>>>>> called from scheduler thread which is stopped at all times when work_tdr
>>>>> thread is running and anyway the 'bad' job is still in the
>>>>> ring_mirror_list while it's being accessed from
>>>>> drm_sched_increase_karma so I don't think drm_sched_job_cleanup can be
>>>>> called for it BEFORE or while drm_sched_increase_karma is executed.
>>>>>
>>>>> Andrey
>>>>>
>>>>>
>>>>>>     
>>>>>> --
>>>>>>
>> _______________________________________________
>> dri-devel mailing list
>> dri-devel@lists.freedesktop.org
>> https://lists.freedesktop.org/mailman/listinfo/dri-devel
>>



* Re: drm_sched with panfrost crash on T820
  2019-10-04 15:34               ` Koenig, Christian
@ 2019-10-04 16:02                 ` Steven Price
  0 siblings, 0 replies; 21+ messages in thread
From: Steven Price @ 2019-10-04 16:02 UTC (permalink / raw)
  To: Koenig, Christian, Neil Armstrong, Grodzovsky, Andrey, Hillf Danton
  Cc: Tomeu Vizoso, airlied, linux-kernel, dri-devel,
	open list:ARM/Amlogic Meson...,
	Erico Nunes

On 04/10/2019 16:34, Koenig, Christian wrote:
> Am 04.10.19 um 17:27 schrieb Steven Price:
>> On 04/10/2019 16:03, Neil Armstrong wrote:
>>> On 04/10/2019 16:53, Grodzovsky, Andrey wrote:
>>>> On 10/3/19 4:34 AM, Neil Armstrong wrote:
>>>>> Hi Andrey,
>>>>>
>>>>> Le 02/10/2019 à 16:40, Grodzovsky, Andrey a écrit :
>>>>>> On 9/30/19 10:52 AM, Hillf Danton wrote:
>>>>>>> On Mon, 30 Sep 2019 11:17:45 +0200 Neil Armstrong wrote:
>>>>>>>> Did a new run from 5.3:
>>>>>>>>
>>>>>>>> [   35.971972] Call trace:
>>>>>>>> [   35.974391]  drm_sched_increase_karma+0x5c/0xf0
>>>>>>>> 			ffff000010667f38	FFFF000010667F94
>>>>>>>> 			drivers/gpu/drm/scheduler/sched_main.c:335
>>>>>>>>
>>>>>>>> The crashing line is :
>>>>>>>>                                    if (bad->s_fence->scheduled.context ==
>>>>>>>>                                        entity->fence_context) {
>>>>>>>>
>>>>>>>> Doesn't seem related to guilty job.
>>>>>>> Bail out if s_fence is no longer fresh.
>>>>>>>
>>>>>>> --- a/drivers/gpu/drm/scheduler/sched_main.c
>>>>>>> +++ b/drivers/gpu/drm/scheduler/sched_main.c
>>>>>>> @@ -333,6 +333,10 @@ void drm_sched_increase_karma(struct drm
>>>>>>>     
>>>>>>>     			spin_lock(&rq->lock);
>>>>>>>     			list_for_each_entry_safe(entity, tmp, &rq->entities, list) {
>>>>>>> +				if (!smp_load_acquire(&bad->s_fence)) {
>>>>>>> +					spin_unlock(&rq->lock);
>>>>>>> +					return;
>>>>>>> +				}
>>>>>>>     				if (bad->s_fence->scheduled.context ==
>>>>>>>     				    entity->fence_context) {
>>>>>>>     					if (atomic_read(&bad->karma) >
>>>>>>> @@ -543,7 +547,7 @@ EXPORT_SYMBOL(drm_sched_job_init);
>>>>>>>     void drm_sched_job_cleanup(struct drm_sched_job *job)
>>>>>>>     {
>>>>>>>     	dma_fence_put(&job->s_fence->finished);
>>>>>>> -	job->s_fence = NULL;
>>>>>>> +	smp_store_release(&job->s_fence, 0);
>>>>>>>     }
>>>>>>>     EXPORT_SYMBOL(drm_sched_job_cleanup);
>>>>> This fixed the problem on the 10 CI runs.
>>>>>
>>>>> Neil
>>>>
>>>> These are good news but I still fail to see how this fixes the problem -
>>>> Hillf, do you mind explaining how you came up with this particular fix -
>>>> what was the bug you saw ?
>>> As Steven explained, seems the same job was submitted on both HW slots,
>>> and then when timeout occurs each thread calls panfrost_job_timedout
>>> which leads to drm_sched_stop() on the first call and on the
>>> second call the job was already freed.
>>>
>>> Steven proposed a working fix, and this one does the same but on
>>> the drm_sched side. This one looks cleaner, but panfrost should
>>> not call drm_sched_stop() twice for the same job.
>> I'm not sure that Hillf's fix is sufficient. In particular in
>> drm_sched_increase_karma() I don't understand how the smp_load_acquire()
>> call prevents bad->s_fence becoming NULL immediately afterwards (but
>> admittedly the window is much smaller). But really this is just a
>> Panfrost bug (calling drm_sched_stop() twice on the same job).
>>
>> The part of my change that I'd welcome feedback on is changing
>> cancel_delayed_work() to cancel_delayed_work_sync() in drm_sched_stop()
>> when called on different scheduler to the bad job. It's not clear to me
>> exactly what the semantics of the function should be, and I haven't
>> tested the effect of the change on drivers other than Panfrost.
> 
> Yeah, at least of hand that change doesn't seem to make sense to me.

We need to ensure that any other timeouts that might have started
processing are complete before actually resetting the hardware.
Otherwise after the reset another thread could come along and attempt to
reset the hardware again (and cause a double free of a job). My change
to use the _sync() variant prevents this happening.

> Multiple timeout workers can perfectly run in parallel, Panfrost needs 
> to make sure that they don't affect each other.
> 
> The simplest way of doing this is to have a mutex you trylock when 
> starting the reset.
> 
> The first one grabbing it wins, all other just silently return.

Ah, that would simplify my change, removing the need for the new variable.
I hadn't thought to use trylock. I'll give that a spin and post a new,
simpler version.

Thanks,

Steve

> Regards,
> Christian.
> 
>>
>> Steve
>>
>>> Neil
>>>
>>>> Andrey
>>>>
>>>>>> Does this change help the problem ? Note that drm_sched_job_cleanup is
>>>>>> called from scheduler thread which is stopped at all times when work_tdr
>>>>>> thread is running and anyway the 'bad' job is still in the
>>>>>> ring_mirror_list while it's being accessed from
>>>>>> drm_sched_increase_karma so I don't think drm_sched_job_cleanup can be
>>>>>> called for it BEFORE or while drm_sched_increase_karma is executed.
>>>>>>
>>>>>> Andrey
>>>>>>
>>>>>>
>>>>>>>     
>>>>>>> --
>>>>>>>
>>> _______________________________________________
>>> dri-devel mailing list
>>> dri-devel@lists.freedesktop.org
>>> https://lists.freedesktop.org/mailman/listinfo/dri-devel
>>>
> 
> _______________________________________________
> dri-devel mailing list
> dri-devel@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/dri-devel
> 



* Re: drm_sched with panfrost crash on T820
       [not found] <8e003dfd-2761-4941-8b5b-ecc186222229@email.android.com>
@ 2019-10-07 12:47 ` Steven Price
  0 siblings, 0 replies; 21+ messages in thread
From: Steven Price @ 2019-10-07 12:47 UTC (permalink / raw)
  To: Koenig, Christian
  Cc: Hillf Danton, Tomeu Vizoso, Neil Armstrong, airlied,
	linux-kernel, dri-devel, open list:ARM/Amlogic Meson...,
	Erico Nunes

On 04/10/2019 17:33, Koenig, Christian wrote:
> 
> 
> Am 04.10.2019 18:02 schrieb Steven Price <steven.price@arm.com>:
> On 04/10/2019 16:34, Koenig, Christian wrote:
>> Am 04.10.19 um 17:27 schrieb Steven Price:
>>> On 04/10/2019 16:03, Neil Armstrong wrote:
>>>> On 04/10/2019 16:53, Grodzovsky, Andrey wrote:
>>>>> On 10/3/19 4:34 AM, Neil Armstrong wrote:
>>>>>> Hi Andrey,
>>>>>>
>>>>>> Le 02/10/2019 à 16:40, Grodzovsky, Andrey a écrit :
>>>>>>> On 9/30/19 10:52 AM, Hillf Danton wrote:
>>>>>>>> On Mon, 30 Sep 2019 11:17:45 +0200 Neil Armstrong wrote:
>>>>>>>>> Did a new run from 5.3:
>>>>>>>>>
>>>>>>>>> [   35.971972] Call trace:
>>>>>>>>> [   35.974391]  drm_sched_increase_karma+0x5c/0xf0
>>>>>>>>>                         ffff000010667f38        FFFF000010667F94
>>>>>>>>>                         drivers/gpu/drm/scheduler/sched_main.c:335
>>>>>>>>>
>>>>>>>>> The crashing line is :
>>>>>>>>>                                    if (bad->s_fence->scheduled.context ==
>>>>>>>>>                                        entity->fence_context) {
>>>>>>>>>
>>>>>>>>> Doesn't seem related to guilty job.
>>>>>>>> Bail out if s_fence is no longer fresh.
>>>>>>>>
>>>>>>>> --- a/drivers/gpu/drm/scheduler/sched_main.c
>>>>>>>> +++ b/drivers/gpu/drm/scheduler/sched_main.c
>>>>>>>> @@ -333,6 +333,10 @@ void drm_sched_increase_karma(struct drm
>>>>>>>>
>>>>>>>>                          spin_lock(&rq->lock);
>>>>>>>>                          list_for_each_entry_safe(entity, tmp, &rq->entities, list) {
>>>>>>>> +                               if (!smp_load_acquire(&bad->s_fence)) {
>>>>>>>> +                                       spin_unlock(&rq->lock);
>>>>>>>> +                                       return;
>>>>>>>> +                               }
>>>>>>>>                                  if (bad->s_fence->scheduled.context ==
>>>>>>>>                                      entity->fence_context) {
>>>>>>>>                                          if (atomic_read(&bad->karma) >
>>>>>>>> @@ -543,7 +547,7 @@ EXPORT_SYMBOL(drm_sched_job_init);
>>>>>>>>     void drm_sched_job_cleanup(struct drm_sched_job *job)
>>>>>>>>     {
>>>>>>>>          dma_fence_put(&job->s_fence->finished);
>>>>>>>> -       job->s_fence = NULL;
>>>>>>>> +       smp_store_release(&job->s_fence, 0);
>>>>>>>>     }
>>>>>>>>     EXPORT_SYMBOL(drm_sched_job_cleanup);
>>>>>> This fixed the problem on the 10 CI runs.
>>>>>>
>>>>>> Neil
>>>>>
>>>>> These are good news but I still fail to see how this fixes the problem -
>>>>> Hillf, do you mind explaining how you came up with this particular fix -
>>>>> what was the bug you saw ?
>>>> As Steven explained, seems the same job was submitted on both HW slots,
>>>> and then when timeout occurs each thread calls panfrost_job_timedout
>>>> which leads to drm_sched_stop() on the first call and on the
>>>> second call the job was already freed.
>>>>
>>>> Steven proposed a working fix, and this one does the same but on
>>>> the drm_sched side. This one looks cleaner, but panfrost should
>>>> not call drm_sched_stop() twice for the same job.
>>> I'm not sure that Hillf's fix is sufficient. In particular in
>>> drm_sched_increase_karma() I don't understand how the smp_load_acquire()
>>> call prevents bad->s_fence becoming NULL immediately afterwards (but
>>> admittedly the window is much smaller). But really this is just a
>>> Panfrost bug (calling drm_sched_stop() twice on the same job).
>>>
>>> The part of my change that I'd welcome feedback on is changing
>>> cancel_delayed_work() to cancel_delayed_work_sync() in drm_sched_stop()
>>> when called on different scheduler to the bad job. It's not clear to me
>>> exactly what the semantics of the function should be, and I haven't
>>> tested the effect of the change on drivers other than Panfrost.
>>
>> Yeah, at least of hand that change doesn't seem to make sense to me.
> 
> We need to ensure that any other timeouts that might have started
> processing are complete before actually resetting the hardware.
> Otherwise after the reset another thread could come along and attempt to
> reset the hardware again (and cause a double free of a job).
> 
> This is intentional behaviour. If you don't want the double reset in Panfrost you should probably call the cancel_sync yourself.

It's less the double reset that is the problem, more that the job gets
cleaned up twice: drm_sched_stop() will either free the job or mark it
to be freed later. By having two threads both drm_sched_stop()ing all
slots you end up with the guilty job(s) potentially being double freed.

I've moved the call to cancel_delayed_work_sync() into Panfrost since I'm
not sure whether this is generically useful to other drivers.
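
For illustration, a minimal sketch of what that Panfrost-side wait could look
like (untested; the helper name is made up here, and it assumes the reset path
is already serialized, e.g. with the trylock discussed earlier in the thread,
so a concurrent handler on the other slot backs off rather than blocking and
this wait cannot deadlock):

static void panfrost_sync_other_slot_timeouts(struct panfrost_device *pfdev,
					      struct drm_sched_job *bad)
{
	int i;

	for (i = 0; i < NUM_JOB_SLOTS; i++) {
		struct drm_gpu_scheduler *sched = &pfdev->js->queue[i].sched;

		/*
		 * Wait for any timeout handler already running on another
		 * slot to finish before jobs start being freed or requeued.
		 */
		if (sched != bad->sched)
			cancel_delayed_work_sync(&sched->work_tdr);
	}
}

It would be called from panfrost_job_timedout() right after winning the reset
lock and before any drm_sched_stop() call.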

Steve


end of thread

Thread overview: 21+ messages
2019-09-27  8:12 drm_sched with panfrost crash on T820 Neil Armstrong
2019-09-27  9:55 ` Steven Price
2019-09-27 10:48   ` Steven Price
2019-09-27 11:27     ` Neil Armstrong
2019-09-27 11:48       ` Neil Armstrong
2019-09-27 15:00         ` Steven Price
2019-09-27 15:20           ` Neil Armstrong
2019-09-30 13:18           ` Neil Armstrong
2019-09-27 20:55 ` Grodzovsky, Andrey
2019-09-30  9:17   ` Neil Armstrong
2019-10-02 16:53     ` Grodzovsky, Andrey
2019-10-03  8:36       ` Neil Armstrong
     [not found]   ` <20190930145228.14000-1-hdanton@sina.com>
2019-10-02 14:40     ` Grodzovsky, Andrey
2019-10-02 14:44       ` Neil Armstrong
2019-10-03  8:34       ` Neil Armstrong
2019-10-04 14:53         ` Grodzovsky, Andrey
2019-10-04 15:03           ` Neil Armstrong
2019-10-04 15:27             ` Steven Price
2019-10-04 15:34               ` Koenig, Christian
2019-10-04 16:02                 ` Steven Price
     [not found] <8e003dfd-2761-4941-8b5b-ecc186222229@email.android.com>
2019-10-07 12:47 ` Steven Price
