* etnaviv: Possible circular locking on i.MX6QP
From: Fabio Estevam @ 2019-06-12 15:48 UTC (permalink / raw)
To: Lucas Stach, Christian Gmeiner, Russell King - ARM Linux
Cc: The etnaviv authors, DRI mailing list
Hi,
On an imx6qp-wandboard I get the warning below about a possible
circular locking dependency when running 5.1.9 built from
imx_v6_v7_defconfig.
This warning does not occur on the imx6q or imx6solo variants of
the wandboard, though.
Any ideas?
Thanks,
Fabio Estevam
** (matchbox-panel:708): WARNING **: Failed to load applet "battery"
(/usr/lib/matchbox-panel/libbattery.so: cannot open shared object
file: No such file or directory).
matchbox-wm: X error warning (0xe00003): BadWindow (invalid Window
parameter) (opcode: 12)
etnaviv-gpu 134000.gpu: MMU fault status 0x00000001
etnaviv-gpu 134000.gpu: MMU 0 fault addr 0x0805ffc0
======================================================
WARNING: possible circular locking dependency detected
5.1.9 #58 Not tainted
------------------------------------------------------
kworker/0:1/29 is trying to acquire lock:
(ptrval) (&(&gpu->fence_spinlock)->rlock){-...}, at:
dma_fence_remove_callback+0x14/0x50
but task is already holding lock:
(ptrval) (&(&sched->job_list_lock)->rlock){-...}, at: drm_sched_stop+0x1c/0x124
which lock already depends on the new lock.
the existing dependency chain (in reverse order) is:
-> #1 (&(&sched->job_list_lock)->rlock){-...}:
drm_sched_process_job+0x5c/0x1c8
dma_fence_signal+0xdc/0x1d4
irq_handler+0xd0/0x1e0
__handle_irq_event_percpu+0x48/0x360
handle_irq_event_percpu+0x28/0x7c
handle_irq_event+0x38/0x5c
handle_fasteoi_irq+0xc0/0x17c
generic_handle_irq+0x20/0x34
__handle_domain_irq+0x64/0xe0
gic_handle_irq+0x4c/0xa8
__irq_svc+0x70/0x98
cpuidle_enter_state+0x168/0x5a4
cpuidle_enter_state+0x168/0x5a4
do_idle+0x220/0x2c0
cpu_startup_entry+0x18/0x20
start_kernel+0x3e4/0x498
-> #0 (&(&gpu->fence_spinlock)->rlock){-...}:
_raw_spin_lock_irqsave+0x38/0x4c
dma_fence_remove_callback+0x14/0x50
drm_sched_stop+0x98/0x124
etnaviv_sched_timedout_job+0x7c/0xb4
drm_sched_job_timedout+0x34/0x5c
process_one_work+0x2ac/0x704
worker_thread+0x2c/0x574
kthread+0x134/0x148
ret_from_fork+0x14/0x20
(null)
other info that might help us debug this:
Possible unsafe locking scenario:
       CPU0                    CPU1
       ----                    ----
  lock(&(&sched->job_list_lock)->rlock);
                               lock(&(&gpu->fence_spinlock)->rlock);
                               lock(&(&sched->job_list_lock)->rlock);
  lock(&(&gpu->fence_spinlock)->rlock);
*** DEADLOCK ***
3 locks held by kworker/0:1/29:
#0: (ptrval) ((wq_completion)events){+.+.}, at: process_one_work+0x1f4/0x704
#1: (ptrval) ((work_completion)(&(&sched->work_tdr)->work)){+.+.},
at: process_one_work+0x1f4/0x704
#2: (ptrval) (&(&sched->job_list_lock)->rlock){-...}, at:
drm_sched_stop+0x1c/0x124
stack backtrace:
CPU: 0 PID: 29 Comm: kworker/0:1 Not tainted 5.1.9 #58
Hardware name: Freescale i.MX6 Quad/DualLite (Device Tree)
Workqueue: events drm_sched_job_timedout
[<c0112748>] (unwind_backtrace) from [<c010cfbc>] (show_stack+0x10/0x14)
[<c010cfbc>] (show_stack) from [<c0bd31ec>] (dump_stack+0xd8/0x110)
[<c0bd31ec>] (dump_stack) from [<c017a22c>]
(print_circular_bug.constprop.19+0x1bc/0x2f0)
[<c017a22c>] (print_circular_bug.constprop.19) from [<c017d408>]
(__lock_acquire+0x1778/0x1f38)
[<c017d408>] (__lock_acquire) from [<c017e3a4>] (lock_acquire+0xcc/0x1e8)
[<c017e3a4>] (lock_acquire) from [<c0bf4134>] (_raw_spin_lock_irqsave+0x38/0x4c)
[<c0bf4134>] (_raw_spin_lock_irqsave) from [<c0692710>]
(dma_fence_remove_callback+0x14/0x50)
[<c0692710>] (dma_fence_remove_callback) from [<c05d25b4>]
(drm_sched_stop+0x98/0x124)
[<c05d25b4>] (drm_sched_stop) from [<c064a3e8>]
(etnaviv_sched_timedout_job+0x7c/0xb4)
[<c064a3e8>] (etnaviv_sched_timedout_job) from [<c05d2964>]
(drm_sched_job_timedout+0x34/0x5c)
[<c05d2964>] (drm_sched_job_timedout) from [<c01468ec>]
(process_one_work+0x2ac/0x704)
[<c01468ec>] (process_one_work) from [<c0146d70>] (worker_thread+0x2c/0x574)
[<c0146d70>] (worker_thread) from [<c014cd88>] (kthread+0x134/0x148)
[<c014cd88>] (kthread) from [<c01010b4>] (ret_from_fork+0x14/0x20)
Exception stack(0xe81f7fb0 to 0xe81f7ff8)
7fa0: 00000000 00000000 00000000 00000000
7fc0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
7fe0: 00000000 00000000 00000000 00000000 00000013 00000000
etnaviv-gpu 134000.gpu: recover hung GPU!
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel
* Re: etnaviv: Possible circular locking on i.MX6QP
From: Lucas Stach @ 2019-06-27 9:43 UTC (permalink / raw)
To: Fabio Estevam, Christian Gmeiner, Russell King - ARM Linux
Cc: The etnaviv authors, DRI mailing list
Hi Fabio,
On Wednesday, 12.06.2019, at 12:48 -0300, Fabio Estevam wrote:
> Hi,
>
> On a imx6qp-wandboard I get the warning below about a possible
> circular locking dependency running 5.1.9 built from
> imx_v6_v7_defconfig.
>
> Such warning does not happen on the imx6q or imx6solo variants of
> wandboard though.
>
> Any ideas?
The issue reported by lockdep is real. You probably only see it on QP
because it is only uncovered by a GPU hang triggered by an MMU exception.
MMUv1 cores, like the ones on the older i.MX6, are unable to signal MMU
exceptions and just read the dummy page instead.
Some git history digging shows that the bug was introduced with
3741540e0413 (drm/sched: Rework HW fence processing.), which is part of
kernel 5.1. The fix is 5918045c4ed4 (drm/scheduler: rework job
destruction), which is not in any released kernel yet and seems to be
too big for stable, so I'm not really sure what to do at this point.
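To make the inversion in the report easier to follow, here is a rough
user-space sketch of the kind of check lockdep performs. It is only a toy
model, not kernel code: the class LockOrderGraph and its acquire() method
are hypothetical names made up for illustration, and the two lock names are
simply the ones from the trace. The assumption, read off the two dependency
chains in the report, is that the IRQ path takes job_list_lock while
fence_spinlock is held, and that the timeout path does the opposite.

```python
from collections import defaultdict


class LockOrderGraph:
    """Toy lock-order validator (hypothetical, for illustration only).

    Records "held -> newly acquired" edges and reports when a new
    acquisition would close a cycle in that graph.
    """

    def __init__(self):
        self.edges = defaultdict(set)

    def _reachable(self, src, dst):
        # Iterative depth-first search over the recorded dependencies.
        stack, seen = [src], set()
        while stack:
            node = stack.pop()
            if node == dst:
                return True
            if node in seen:
                continue
            seen.add(node)
            stack.extend(self.edges[node])
        return False

    def acquire(self, held, new):
        """Record that `new` is taken while `held` is already held.

        Returns True if `held` was already reachable from `new`, i.e.
        the opposite ordering has been seen before and a deadlock is
        possible.
        """
        cycle = self._reachable(new, held)
        self.edges[held].add(new)
        return cycle


graph = LockOrderGraph()

# Chain #1 in the report (IRQ path): dma_fence_signal() runs the callback
# drm_sched_process_job(), which takes job_list_lock under the fence lock.
irq_path_cycle = graph.acquire("fence_spinlock", "job_list_lock")

# Chain #0 in the report (timeout path): drm_sched_stop() holds
# job_list_lock and calls dma_fence_remove_callback(), which takes the
# fence lock.
timeout_path_cycle = graph.acquire("job_list_lock", "fence_spinlock")

print(irq_path_cycle, timeout_path_cycle)
```

In this model the first ordering is recorded silently and the second one
is flagged, which mirrors why the warning only shows up once the timeout
handler actually runs after a GPU hang.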
Regards,
Lucas