* hung task detected in ubifs
@ 2018-12-20  9:49 Martin Townsend
  2018-12-20 10:54 ` Richard Weinberger
  0 siblings, 1 reply; 10+ messages in thread
From: Martin Townsend @ 2018-12-20  9:49 UTC (permalink / raw)
  To: linux-mtd

Hi,

A client has just reported this lockup. I thought I would pass it on
in case it has already been fixed and a patch is available, or in case
the trace can help identify a potential deadlock. They are running a
vendor-specific 4.9.88 kernel for an i.MX6 UL, but as far as I know
there are no modifications to the UBI/MTD layers:

INFO: task kworker/u2:1:92 blocked for more than 120 seconds.
      Tainted: G           O    4.9.88-1.0.0+g6507266 #1
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[<80912e44>] (__schedule) from [<80913484>] (schedule+0x48/0xb0)
[<80913484>] (schedule) from [<80913830>] (schedule_preempt_disabled+0x14/0x20)
[<80913830>] (schedule_preempt_disabled) from [<80914644>]
(mutex_lock_nested+0x1b8/0x3f8)
[<80914644>] (mutex_lock_nested) from [<8038c788>] (make_reservation+0xa0/0x43c)
[<8038c788>] (make_reservation) from [<8038d614>]
(ubifs_jnl_write_inode+0x78/0x224)
[<8038d614>] (ubifs_jnl_write_inode) from [<80395170>]
(ubifs_write_inode+0x90/0x114)
[<80395170>] (ubifs_write_inode) from [<80391348>] (ubifs_writepage+0x168/0x1c4)
[<80391348>] (ubifs_writepage) from [<8020094c>] (__writepage+0x14/0x5c)
[<8020094c>] (__writepage) from [<80200104>] (write_cache_pages+0x1f4/0x558)
[<80200104>] (write_cache_pages) from [<80202968>] (do_writepages+0x90/0xa4)
[<80202968>] (do_writepages) from [<8027c46c>]
(__writeback_single_inode+0x84/0x61c)
[<8027c46c>] (__writeback_single_inode) from [<8027d00c>]
(writeback_sb_inodes+0x1dc/0x4f0)
[<8027d00c>] (writeback_sb_inodes) from [<8027d378>]
(__writeback_inodes_wb+0x58/0x9c)
[<8027d378>] (__writeback_inodes_wb) from [<8027d714>]
(wb_writeback+0x358/0x438)
[<8027d714>] (wb_writeback) from [<8027e210>] (wb_workfn+0x1fc/0x5bc)
[<8027e210>] (wb_workfn) from [<8013f298>] (process_one_work+0x2b4/0x6a8)
[<8013f298>] (process_one_work) from [<8013f6dc>] (worker_thread+0x50/0x5a8)
[<8013f6dc>] (worker_thread) from [<80145270>] (kthread+0x124/0x12c)
[<80145270>] (kthread) from [<80107d30>] (ret_from_fork+0x14/0x24)

Showing all locks held in the system:
6 locks held by kworker/u2:0/6:
 #0:  ("writeback"){......}, at: [<8013f1d0>] process_one_work+0x1ec/0x6a8
 #1:  ((&(&wb->dwork)->work)){......}, at: [<8013f1d0>]
process_one_work+0x1ec/0x6a8
 #2:  (&type->s_umount_key#25){......}, at: [<8024dcac>] trylock_super+0x14/0x58
 #3:  (&ui->ui_mutex){......}, at: [<80395124>] ubifs_write_inode+0x44/0x114
 #4:  (&c->commit_sem){......}, at: [<8038c758>] make_reservation+0x70/0x43c
 #5:  (&wbuf->io_mutex/1){......}, at: [<8038c788>] make_reservation+0xa0/0x43c
2 locks held by khungtaskd/14:
 #0:  (rcu_read_lock){......}, at: [<801b7920>] watchdog+0xdc/0x4b0
 #1:  (tasklist_lock){......}, at: [<80162208>] debug_show_all_locks+0x38/0x1ac
6 locks held by kworker/u2:1/92:
 #0:  ("writeback"){......}, at: [<8013f1d0>] process_one_work+0x1ec/0x6a8
 #1:  ((&(&wb->dwork)->work)){......}, at: [<8013f1d0>]
process_one_work+0x1ec/0x6a8
 #2:  (&type->s_umount_key#25){......}, at: [<8024dcac>] trylock_super+0x14/0x58
 #3:  (&ui->ui_mutex){......}, at: [<80395124>] ubifs_write_inode+0x44/0x114
 #4:  (&c->commit_sem){......}, at: [<8038c758>] make_reservation+0x70/0x43c
 #5:  (&wbuf->io_mutex/1){......}, at: [<8038c788>] make_reservation+0xa0/0x43c
2 locks held by ubifs_bgt0_2/241:
 #0:  (&wbuf->io_mutex/1){......}, at: [<8039a1a8>]
ubifs_bg_wbufs_sync+0x110/0x1b0
 #1:  (&le->mutex){......}, at: [<805686b4>] ubi_eba_write_leb+0x40/0x7a4
2 locks held by ubifs_bgt0_0/720:
 #0:  (&wbuf->io_mutex/1){......}, at: [<8039a1a8>]
ubifs_bg_wbufs_sync+0x110/0x1b0
 #1:  (&le->mutex){......}, at: [<805686b4>] ubi_eba_write_leb+0x40/0x7a4
5 locks held by programmingapp/930:
 #0:  (&mm->mmap_sem){......}, at: [<80113ef0>] do_page_fault+0xb8/0x350
 #1:  (&le->mutex){......}, at: [<80568150>] ubi_eba_read_leb+0x34/0x438
 #2:  (of_dma_lock){......}, at: [<804ab890>]
of_dma_request_slave_channel+0x140/0x228
 #3:  (dma_list_mutex){......}, at: [<804a9d3c>] __dma_request_channel+0x24/0x8c
 #4:  (cma_mutex){......}, at: [<80246814>] cma_alloc+0xc8/0x29c
3 locks held by ls/938:
 #0:  (&type->i_mutex_dir_key){......}, at: [<8025e924>] iterate_dir+0x40/0x188
 #1:  (&c->tnc_mutex){......}, at: [<8039f3ac>] ubifs_tnc_next_ent+0x4c/0x1f8
 #2:  (&le->mutex){......}, at: [<80568150>] ubi_eba_read_leb+0x34/0x438
2 locks held by ls/941:
 #0:  (&type->i_mutex_dir_key){......}, at: [<80254d28>] lookup_slow+0x84/0x198
 #1:  (&le->mutex){......}, at: [<80568150>] ubi_eba_read_leb+0x34/0x438

=============================================

Best Regards,
Martin.

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: hung task detected in ubifs
  2018-12-20  9:49 hung task detected in ubifs Martin Townsend
@ 2018-12-20 10:54 ` Richard Weinberger
  2018-12-20 13:07   ` Martin Townsend
  0 siblings, 1 reply; 10+ messages in thread
From: Richard Weinberger @ 2018-12-20 10:54 UTC (permalink / raw)
  To: Martin Townsend; +Cc: linux-mtd

On Thu, Dec 20, 2018 at 10:49 AM Martin Townsend
<mtownsend1973@gmail.com> wrote:
> 5 locks held by programmingapp/930:
>  #0:  (&mm->mmap_sem){......}, at: [<80113ef0>] do_page_fault+0xb8/0x350
>  #1:  (&le->mutex){......}, at: [<80568150>] ubi_eba_read_leb+0x34/0x438
>  #2:  (of_dma_lock){......}, at: [<804ab890>]
> of_dma_request_slave_channel+0x140/0x228
>  #3:  (dma_list_mutex){......}, at: [<804a9d3c>] __dma_request_channel+0x24/0x8c
>  #4:  (cma_mutex){......}, at: [<80246814>] cma_alloc+0xc8/0x29c

The system seems to block in CMA; therefore make_reservation() does not
make any progress and all writeback blocks too.
Do you have CONFIG_MIGRATION enabled?

If possible, enable LOCKDEP, it can give us more details what is going on.
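For reference, a quick way to check those options on the running target is something like this (a sketch; which config source exists depends on how the kernel was built):

```shell
# Look up the relevant kernel options on the target. /proc/config.gz
# exists when the kernel was built with CONFIG_IKCONFIG_PROC; otherwise
# a shipped /boot config may be available.
opts='CONFIG_(MIGRATION|LOCKDEP_SUPPORT|PROVE_LOCKING)='
if [ -r /proc/config.gz ]; then
    zcat /proc/config.gz | grep -E "$opts"
elif [ -r "/boot/config-$(uname -r)" ]; then
    grep -E "$opts" "/boot/config-$(uname -r)"
else
    echo "kernel config not exposed on this system"
fi
```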

-- 
Thanks,
//richard


* Re: hung task detected in ubifs
  2018-12-20 10:54 ` Richard Weinberger
@ 2018-12-20 13:07   ` Martin Townsend
  2018-12-20 13:15     ` Richard Weinberger
  0 siblings, 1 reply; 10+ messages in thread
From: Martin Townsend @ 2018-12-20 13:07 UTC (permalink / raw)
  To: Richard Weinberger; +Cc: linux-mtd

Hi Richard,

On Thu, Dec 20, 2018 at 10:54 AM Richard Weinberger
<richard.weinberger@gmail.com> wrote:
>
> On Thu, Dec 20, 2018 at 10:49 AM Martin Townsend
> <mtownsend1973@gmail.com> wrote:
> > 5 locks held by programmingapp/930:
> >  #0:  (&mm->mmap_sem){......}, at: [<80113ef0>] do_page_fault+0xb8/0x350
> >  #1:  (&le->mutex){......}, at: [<80568150>] ubi_eba_read_leb+0x34/0x438
> >  #2:  (of_dma_lock){......}, at: [<804ab890>]
> > of_dma_request_slave_channel+0x140/0x228
> >  #3:  (dma_list_mutex){......}, at: [<804a9d3c>] __dma_request_channel+0x24/0x8c
> >  #4:  (cma_mutex){......}, at: [<80246814>] cma_alloc+0xc8/0x29c
>
> The system seems to block in CMA, therefore make_reservation() does not make any
> progress and all write back blocks too.
> Do you have CONFIG_MIGRATION enabled?
>
> If possible, enable LOCKDEP, it can give us more details what is going on.
>
> --
> Thanks,
> //richard

I just checked and CONFIG_MIGRATION=y.
Interesting that it seems to block in CMA. Could it be that the
programmingapp process has used up all the CMA memory? I'll ask for a
complete log file.
I'll also ask them to enable LOCKDEP and hope that the i.MX6 UL can handle it :)

Cheers, Martin.


* Re: hung task detected in ubifs
  2018-12-20 13:07   ` Martin Townsend
@ 2018-12-20 13:15     ` Richard Weinberger
  2018-12-20 14:19       ` Martin Townsend
  0 siblings, 1 reply; 10+ messages in thread
From: Richard Weinberger @ 2018-12-20 13:15 UTC (permalink / raw)
  To: Martin Townsend, linux-mtd

Am Donnerstag, 20. Dezember 2018, 14:07:45 CET schrieb Martin Townsend:
> Hi Richard,
> 
> On Thu, Dec 20, 2018 at 10:54 AM Richard Weinberger
> <richard.weinberger@gmail.com> wrote:
> >
> > On Thu, Dec 20, 2018 at 10:49 AM Martin Townsend
> > <mtownsend1973@gmail.com> wrote:
> > > 5 locks held by programmingapp/930:
> > >  #0:  (&mm->mmap_sem){......}, at: [<80113ef0>] do_page_fault+0xb8/0x350
> > >  #1:  (&le->mutex){......}, at: [<80568150>] ubi_eba_read_leb+0x34/0x438
> > >  #2:  (of_dma_lock){......}, at: [<804ab890>]
> > > of_dma_request_slave_channel+0x140/0x228
> > >  #3:  (dma_list_mutex){......}, at: [<804a9d3c>] __dma_request_channel+0x24/0x8c
> > >  #4:  (cma_mutex){......}, at: [<80246814>] cma_alloc+0xc8/0x29c
> >
> > The system seems to block in CMA, therefore make_reservation() does not make any
> > progress and all write back blocks too.
> > Do you have CONFIG_MIGRATION enabled?
> >
> > If possible, enable LOCKDEP, it can give us more details what is going on.
> >
> > --
> > Thanks,
> > //richard
> 
> I just checked and CONFIG_MIGRATION=y.

Hmm, that should be ok.
I expected CONFIG_MIGRATION=n.

In the meanwhile you can also try that patch:
http://lists.infradead.org/pipermail/linux-mtd/2018-December/086389.html

It fixes allocation failures of CMA.
But I'm not sure whether you're facing the same problem.

> Interesting that it seems to block in the CMA, could it be that the
> programmingapp process has used up all the CMA memory?  I'll ask for a
> complete log file.

I don't know what the system is doing.
Does it really(!) deadlock or does it just take very long to continue?
(Lockdep should be able to tell...)

The MTD driver seems to do DMA which uses CMA to allocate memory.
So we need to understand why the CMA allocation blocks (forever?).
From the current picture I cannot tell whether UBIFS is the root cause
or just a victim.
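As a first data point, watching the CMA counters on the target while it is stuck may show whether the pool is exhausted (a sketch; the counters only exist on kernels built with CONFIG_CMA=y):

```shell
# CmaTotal/CmaFree are exported in /proc/meminfo when CONFIG_CMA=y.
# A hang with CmaFree near zero points at pool exhaustion rather than
# a lock-ordering problem.
grep -i '^Cma' /proc/meminfo || echo "no CMA counters on this kernel"
```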

Thanks,
//richard


* Re: hung task detected in ubifs
  2018-12-20 13:15     ` Richard Weinberger
@ 2018-12-20 14:19       ` Martin Townsend
  2018-12-20 14:28         ` Richard Weinberger
  0 siblings, 1 reply; 10+ messages in thread
From: Martin Townsend @ 2018-12-20 14:19 UTC (permalink / raw)
  To: Richard Weinberger; +Cc: linux-mtd

On Thu, Dec 20, 2018 at 1:15 PM Richard Weinberger <richard@nod.at> wrote:
>
> Am Donnerstag, 20. Dezember 2018, 14:07:45 CET schrieb Martin Townsend:
> > Hi Richard,
> >
> > On Thu, Dec 20, 2018 at 10:54 AM Richard Weinberger
> > <richard.weinberger@gmail.com> wrote:
> > >
> > > On Thu, Dec 20, 2018 at 10:49 AM Martin Townsend
> > > <mtownsend1973@gmail.com> wrote:
> > > > 5 locks held by programmingapp/930:
> > > >  #0:  (&mm->mmap_sem){......}, at: [<80113ef0>] do_page_fault+0xb8/0x350
> > > >  #1:  (&le->mutex){......}, at: [<80568150>] ubi_eba_read_leb+0x34/0x438
> > > >  #2:  (of_dma_lock){......}, at: [<804ab890>]
> > > > of_dma_request_slave_channel+0x140/0x228
> > > >  #3:  (dma_list_mutex){......}, at: [<804a9d3c>] __dma_request_channel+0x24/0x8c
> > > >  #4:  (cma_mutex){......}, at: [<80246814>] cma_alloc+0xc8/0x29c
> > >
> > > The system seems to block in CMA, therefore make_reservation() does not make any
> > > progress and all write back blocks too.
> > > Do you have CONFIG_MIGRATION enabled?
> > >
> > > If possible, enable LOCKDEP, it can give us more details what is going on.
> > >
> > > --
> > > Thanks,
> > > //richard
> >
> > I just checked and CONFIG_MIGRATION=y.
>
> Hmm, that should be ok.
> I expected CONFIG_MIGRATION=n.
>
> In the meanwhile you can also try that patch:
> http://lists.infradead.org/pipermail/linux-mtd/2018-December/086389.html
>
> It fixes allocation failures of CMA.
> But I'm not sure whether you're facing the same problem.
>
> > Interesting that it seems to block in the CMA, could it be that the
> > programmingapp process has used up all the CMA memory?  I'll ask for a
> > complete log file.
>
> I don't know what the system is doing.
> Does it really(!) deadlock or does it just take very long to continue?
> (Lockdep should be able to tell...)
>
> The MTD driver seems to do DMA which uses CMA to allocate memory.
> So we need to understand why the CMA allocation blocks (forever?).
> From the current picture I cannot tell whether UBIFS is the root cause
> or just a victim.
>
> Thanks,
> //richard
>
>
>
The hung task timeout is the default 2 minutes and it's set up to panic,
so I don't know whether it recovers. I'll ask them to try a longer value
or disable the panic and see if it recovers.
I checked and LOCKDEP=y, and the held locks are in the original trace.
Are there particular locking checks you would like? Currently it's
set up like this:
  │ │    [*] RT Mutex debugging, deadlock detection
  │ │    -*- Spinlock and rw-lock debugging: basic checks
  │ │    -*- Mutex debugging: basic checks
  │ │    [ ] Wait/wound mutex debugging: Slowpath testing
  │ │    [*] Lock debugging: detect incorrect freeing of live locks
  │ │    [ ] Lock debugging: prove locking correctness
  │ │    [ ] Lock usage statistics
  │ │    [ ] Lock dependency engine debugging
  │ │    [*] Sleep inside atomic section checking
  │ │    [ ] Locking API boot-time self-tests
  │ │    < > torture tests for locking
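For the timeout/panic experiment above, something like this on the target should do it (the 600-second value is purely illustrative; needs root):

```shell
# Raise the hung-task timeout from the default 120s and stop the panic,
# so we can observe whether the stuck task eventually makes progress.
# Run as root on the target; 600 is an arbitrary illustrative value.
if [ -w /proc/sys/kernel/hung_task_timeout_secs ]; then
    echo 600 > /proc/sys/kernel/hung_task_timeout_secs
    echo 0 > /proc/sys/kernel/hung_task_panic
else
    echo "need root (or CONFIG_DETECT_HUNG_TASK is not set)"
fi
```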


* Re: hung task detected in ubifs
  2018-12-20 14:19       ` Martin Townsend
@ 2018-12-20 14:28         ` Richard Weinberger
  2018-12-20 15:04           ` Martin Townsend
  0 siblings, 1 reply; 10+ messages in thread
From: Richard Weinberger @ 2018-12-20 14:28 UTC (permalink / raw)
  To: Martin Townsend; +Cc: linux-mtd

Am Donnerstag, 20. Dezember 2018, 15:19:24 CET schrieb Martin Townsend:
> The hung task timeout is the default 2 minutes and it's setup to panic
> so I don't know if it recovers. I'll ask them to try a longer value or
> disable the panic and see if it recovers.

Yes, please.

> I checked and LOCKDEP=y and the held locks are in the original trace.
> Are there particular locking checks you would like? Currently its
> setup like this:
>   │ │    [*] RT Mutex debugging, deadlock detection
>   │ │    -*- Spinlock and rw-lock debugging: basic checks
>   │ │    -*- Mutex debugging: basic checks
>   │ │    [ ] Wait/wound mutex debugging: Slowpath testing
>   │ │    [*] Lock debugging: detect incorrect freeing of live locks
>   │ │    [ ] Lock debugging: prove locking correctness

Hmm, this should be okay.
So it does not look like a traditional deadlock.

Basically we need to figure out why and where exactly cma_alloc() hangs.
And of course we also need to know whether it really is cma_alloc().

Can you please dig into that?
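One low-effort way to see exactly where cma_alloc() sits, assuming SysRq is enabled on the target, is to dump all blocked tasks while the hang is in progress:

```shell
# SysRq-W dumps the stack of every task in uninterruptible (D) state to
# the kernel log. Run as root on the target while the system is stuck,
# then read the stacks back out of the log buffer.
if [ -w /proc/sysrq-trigger ]; then
    echo w > /proc/sysrq-trigger
    dmesg | tail -n 50
else
    echo "need root, or SysRq is not available"
fi
```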

Thanks,
//richard


* Re: hung task detected in ubifs
  2018-12-20 14:28         ` Richard Weinberger
@ 2018-12-20 15:04           ` Martin Townsend
  2018-12-20 15:42             ` Richard Weinberger
  0 siblings, 1 reply; 10+ messages in thread
From: Martin Townsend @ 2018-12-20 15:04 UTC (permalink / raw)
  To: Richard Weinberger; +Cc: linux-mtd

On Thu, Dec 20, 2018 at 2:28 PM Richard Weinberger <richard@nod.at> wrote:
>
> Am Donnerstag, 20. Dezember 2018, 15:19:24 CET schrieb Martin Townsend:
> > The hung task timeout is the default 2 minutes and it's setup to panic
> > so I don't know if it recovers. I'll ask them to try a longer value or
> > disable the panic and see if it recovers.
>
> Yes, please.
Done.
>
> > I checked and LOCKDEP=y and the held locks are in the original trace.
> > Are there particular locking checks you would like? Currently its
> > setup like this:
> >   │ │    [*] RT Mutex debugging, deadlock detection
> >   │ │    -*- Spinlock and rw-lock debugging: basic checks
> >   │ │    -*- Mutex debugging: basic checks
> >   │ │    [ ] Wait/wound mutex debugging: Slowpath testing
> >   │ │    [*] Lock debugging: detect incorrect freeing of live locks
> >   │ │    [ ] Lock debugging: prove locking correctness
>
> Hmm, this should be okay.
> So it does not look like a traditional deadlock.
>
> Basically we need to figure why and where exactly cma_alloc() hangs.
> And of course also we need to know if it is really cma_alloc().
>
> Can you please dig into that?
Will do. Would CMA_DEBUG help, or would it produce too much log information?
>
> Thanks,
> //richard
>
>


* Re: hung task detected in ubifs
  2018-12-20 15:04           ` Martin Townsend
@ 2018-12-20 15:42             ` Richard Weinberger
  2018-12-20 17:03               ` Martin Townsend
  0 siblings, 1 reply; 10+ messages in thread
From: Richard Weinberger @ 2018-12-20 15:42 UTC (permalink / raw)
  To: Martin Townsend; +Cc: linux-mtd

Am Donnerstag, 20. Dezember 2018, 16:04:10 CET schrieb Martin Townsend:
> > Basically we need to figure why and where exactly cma_alloc() hangs.
> > And of course also we need to know if it is really cma_alloc().
> >
> > Can you please dig into that?
> Will do, would CMA_DEBUG help or would it produce too much log information?

I don't know. I'd first try to figure where exactly it hangs and why.

Thanks,
//richard


* Re: hung task detected in ubifs
  2018-12-20 15:42             ` Richard Weinberger
@ 2018-12-20 17:03               ` Martin Townsend
  2018-12-20 19:29                 ` Richard Weinberger
  0 siblings, 1 reply; 10+ messages in thread
From: Martin Townsend @ 2018-12-20 17:03 UTC (permalink / raw)
  To: Richard Weinberger; +Cc: linux-mtd

On Thu, Dec 20, 2018 at 3:42 PM Richard Weinberger <richard@nod.at> wrote:
>
> Am Donnerstag, 20. Dezember 2018, 16:04:10 CET schrieb Martin Townsend:
> > > Basically we need to figure why and where exactly cma_alloc() hangs.
> > > And of course also we need to know if it is really cma_alloc().
> > >
> > > Can you please dig into that?
> > Will do, would CMA_DEBUG help or would it produce too much log information?
>
> I don't know. I'd first try to figure where exactly it hangs and why.
>
> Thanks,
> //richard
>
>
>
I'm starting to think that MTD/UBI is a victim here.  I tried to
reproduce what the client was seeing with no luck, but then on one boot
I triggered a lockup really early in the boot:

[  OK  ] Started Dispatch Password Requests to Console Directory Watch.
[  OK  ] Reached target Swap.
[  OK  ] Created slice System Slice.
[  OK  ] Listening on Journal Audit Socket.
[  OK  ] Reached target Remote File Systems.
[  OK  ] Listening on Syslog Socket.
[  OK  ] Started Forward Password Requests to Wall Directory Watch.
[  OK  ] Created slice User and Session Slice.
[  OK  ] Listening on udev Kernel Socket.
[  OK  ] Reached target Paths.
[  OK  ] Listening on /dev/initctl Compatibility Named Pipe.
[  OK  ] Created slice system-serial\x2dgetty.slice.
brcmfmac: brcmf_sdio_htclk: HT Avail timeout (1000000): clkctl 0x50
brcmfmac: brcmf_sdio_htclk: HT Avail timeout (1000000): clkctl 0x50
INFO: task systemd:1 blocked for more than 120 seconds.
      Not tainted 4.9.88-1.0.0+g6507266 #3
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[<80918cac>] (__schedule) from [<809192ec>] (schedule+0x48/0xb0)
[<809192ec>] (schedule) from [<8091dc50>] (schedule_timeout+0x24c/0x448)
[<8091dc50>] (schedule_timeout) from [<809189d4>]
(io_schedule_timeout+0x74/0xa8)
[<809189d4>] (io_schedule_timeout) from [<80919c04>] (bit_wait_io+0x10/0x5c)
[<80919c04>] (bit_wait_io) from [<80919a8c>] (__wait_on_bit_lock+0x60/0xd4)
[<80919a8c>] (__wait_on_bit_lock) from [<801f2960>] (__lock_page+0x7c/0x98)
[<801f2960>] (__lock_page) from [<8023d1a4>] (migrate_pages+0x838/0x95c)
[<8023d1a4>] (migrate_pages) from [<801fe2ec>] (alloc_contig_range+0x164/0x354)
[<801fe2ec>] (alloc_contig_range) from [<80246824>] (cma_alloc+0xd8/0x29c)
[<80246824>] (cma_alloc) from [<80112ecc>] (__alloc_from_contiguous+0x38/0xd8)
[<80112ecc>] (__alloc_from_contiguous) from [<80112fa0>]
(cma_allocator_alloc+0x34/0x3c)
[<80112fa0>] (cma_allocator_alloc) from [<80113170>] (__dma_alloc+0x1c8/0x3ac)
[<80113170>] (__dma_alloc) from [<801133d0>] (arm_dma_alloc+0x40/0x48)
[<801133d0>] (arm_dma_alloc) from [<804af8d0>]
(mxs_dma_alloc_chan_resources+0x164/0x25c)
[<804af8d0>] (mxs_dma_alloc_chan_resources) from [<804a99e4>]
(dma_chan_get+0x68/0xdc)
[<804a99e4>] (dma_chan_get) from [<804a9bb0>] (find_candidate+0xb8/0x188)
[<804a9bb0>] (find_candidate) from [<804a9d64>]
(__dma_request_channel+0x4c/0x8c)
[<804a9d64>] (__dma_request_channel) from [<804aeef0>] (mxs_dma_xlate+0x60/0x84)
[<804aeef0>] (mxs_dma_xlate) from [<804ab8d8>]
(of_dma_request_slave_channel+0x188/0x228)
[<804ab8d8>] (of_dma_request_slave_channel) from [<804a9dd4>]
(dma_request_chan+0x30/0x194)
[<804a9dd4>] (dma_request_chan) from [<804a9f40>]
(dma_request_slave_channel+0x8/0x14)
[<804a9f40>] (dma_request_slave_channel) from [<8055c140>]
(gpmi_runtime_resume+0x4c/0x94)
[<8055c140>] (gpmi_runtime_resume) from [<804fddc4>] (__rpm_callback+0x2c/0x60)
[<804fddc4>] (__rpm_callback) from [<804fde4c>] (rpm_callback+0x54/0x80)
[<804fde4c>] (rpm_callback) from [<804ff1b0>] (rpm_resume+0x4c4/0x794)
[<804ff1b0>] (rpm_resume) from [<804ff4e0>] (__pm_runtime_resume+0x60/0x98)
[<804ff4e0>] (__pm_runtime_resume) from [<8055f6a8>] (gpmi_begin+0x1c/0x52c)
[<8055f6a8>] (gpmi_begin) from [<8055c51c>] (gpmi_select_chip+0x38/0x50)
[<8055c51c>] (gpmi_select_chip) from [<80556fd0>] (nand_do_read_ops+0x64/0x56c)
[<80556fd0>] (nand_do_read_ops) from [<80557850>] (nand_read+0x6c/0xa0)
[<80557850>] (nand_read) from [<80539c84>] (part_read+0x48/0x80)
[<80539c84>] (part_read) from [<805367d4>] (mtd_read+0x68/0xa4)
[<805367d4>] (mtd_read) from [<8056a58c>] (ubi_io_read+0xe0/0x358)
[<8056a58c>] (ubi_io_read) from [<805681b8>] (ubi_eba_read_leb+0x9c/0x438)
[<805681b8>] (ubi_eba_read_leb) from [<80566f34>] (ubi_leb_read+0x74/0xb4)
[<80566f34>] (ubi_leb_read) from [<803991e4>] (ubifs_leb_read+0x2c/0x78)
[<803991e4>] (ubifs_leb_read) from [<8039b848>] (fallible_read_node+0x48/0x120)
[<8039b848>] (fallible_read_node) from [<8039df08>]
(ubifs_tnc_locate+0x104/0x1e0)
[<8039df08>] (ubifs_tnc_locate) from [<80390660>] (do_readpage+0x184/0x438)
[<80390660>] (do_readpage) from [<80391b38>] (ubifs_readpage+0x4c/0x540)
[<80391b38>] (ubifs_readpage) from [<801f63b0>] (filemap_fault+0x51c/0x6a4)
[<801f63b0>] (filemap_fault) from [<80223b64>] (__do_fault+0x80/0x128)
[<80223b64>] (__do_fault) from [<80226f78>] (handle_mm_fault+0x738/0x1278)
[<80226f78>] (handle_mm_fault) from [<80113f64>] (do_page_fault+0x12c/0x350)
[<80113f64>] (do_page_fault) from [<8010134c>] (do_DataAbort+0x4c/0xdc)
[<8010134c>] (do_DataAbort) from [<8010d25c>] (__dabt_usr+0x3c/0x40)
Exception stack(0x960b5fb0 to 0x960b5ff8)
5fa0:                                     00000001 00000000 1e1b0500 76ee28c4
5fc0: 01bf15b0 76f68a58 00000001 00000001 fffffffe 0050119c 01c4f644 01c4f600
5fe0: 76f34318 7e909888 76e10f6c 76e10f88 80070010 ffffffff

Showing all locks held in the system:
5 locks held by systemd/1:
 #0:  (&mm->mmap_sem){......}, at: [<80113ef0>] do_page_fault+0xb8/0x350
 #1:  (&le->mutex){......}, at: [<80568150>] ubi_eba_read_leb+0x34/0x438
 #2:  (of_dma_lock){......}, at: [<804ab890>]
of_dma_request_slave_channel+0x140/0x228
 #3:  (dma_list_mutex){......}, at: [<804a9d3c>] __dma_request_channel+0x24/0x8c
 #4:  (cma_mutex){......}, at: [<80246814>] cma_alloc+0xc8/0x29c
2 locks held by khungtaskd/14:
 #0:  (rcu_read_lock){......}, at: [<801b7920>] watchdog+0xdc/0x4b0
 #1:  (tasklist_lock){......}, at: [<80162208>] debug_show_all_locks+0x38/0x1ac

This does point to some lockup in the CMA allocator when migrating
pages for a contiguous allocation.  Out of interest, do you know why
do_DataAbort ends up calling filemap_fault and hence landing in the
ubifs layer?


* Re: hung task detected in ubifs
  2018-12-20 17:03               ` Martin Townsend
@ 2018-12-20 19:29                 ` Richard Weinberger
  0 siblings, 0 replies; 10+ messages in thread
From: Richard Weinberger @ 2018-12-20 19:29 UTC (permalink / raw)
  To: Martin Townsend; +Cc: linux-mtd

Am Donnerstag, 20. Dezember 2018, 18:03:46 CET schrieb Martin Townsend:
> I'm starting to think that MTD/UBI is a victim here.  I tried to
> reproduce what the client was seeing with no luck then on one boot I
> triggered a lockup really early in the boot:

Yes, this can be the case.
Maybe your CMA setup is odd, or some other bug is being triggered.
Are other parts of the system heavily using CMA? (Hint: GPU stuff.)
Maybe a different cma= setting helps.
See:
Documentation/admin-guide/kernel-parameters.txt
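For experimenting, the CMA pool size can be overridden from the kernel command line, e.g. via the U-Boot bootargs on an i.MX6 board (the 64M size here is purely an illustrative value, not a recommendation):

```shell
# At the U-Boot prompt: append a cma= size to the kernel command line.
# The exact mechanism depends on the bootloader; this is a U-Boot
# example with an illustrative pool size.
setenv bootargs "${bootargs} cma=64M"
boot
```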

> Showing all locks held in the system:
> 5 locks held by systemd/1:
>  #0:  (&mm->mmap_sem){......}, at: [<80113ef0>] do_page_fault+0xb8/0x350
>  #1:  (&le->mutex){......}, at: [<80568150>] ubi_eba_read_leb+0x34/0x438
>  #2:  (of_dma_lock){......}, at: [<804ab890>]
> of_dma_request_slave_channel+0x140/0x228
>  #3:  (dma_list_mutex){......}, at: [<804a9d3c>] __dma_request_channel+0x24/0x8c
>  #4:  (cma_mutex){......}, at: [<80246814>] cma_alloc+0xc8/0x29c
> 2 locks held by khungtaskd/14:
>  #0:  (rcu_read_lock){......}, at: [<801b7920>] watchdog+0xdc/0x4b0
>  #1:  (tasklist_lock){......}, at: [<80162208>] debug_show_all_locks+0x38/0x1ac
> 
> This does point to some lockup in the CMA allocator when migrating
> pages for a contiguous allocation.  Out of interest do you know why
> do_DataAbort ends up calling filemap_fault and hence ending up in the
> ubifs layer?

Well, if you open and mmap a file, access to the memory will trigger a page
fault. This page fault is resolved by reading the pages from disk into memory.

Thanks,
//richard


end of thread, other threads:[~2018-12-20 19:30 UTC | newest]

Thread overview: 10+ messages
2018-12-20  9:49 hung task detected in ubifs Martin Townsend
2018-12-20 10:54 ` Richard Weinberger
2018-12-20 13:07   ` Martin Townsend
2018-12-20 13:15     ` Richard Weinberger
2018-12-20 14:19       ` Martin Townsend
2018-12-20 14:28         ` Richard Weinberger
2018-12-20 15:04           ` Martin Townsend
2018-12-20 15:42             ` Richard Weinberger
2018-12-20 17:03               ` Martin Townsend
2018-12-20 19:29                 ` Richard Weinberger
