* livelock in __writeback_inodes_wb ?
From: Dave Jones @ 2012-11-28 14:55 UTC
  To: linux-mm; +Cc: Linux Kernel

We had a user report that the soft lockup detector kicked in after
22 seconds of no progress, with this trace:

:BUG: soft lockup - CPU#1 stuck for 22s! [flush-8:16:3137]
:Pid: 3137, comm: flush-8:16 Not tainted 3.6.7-4.fc17.x86_64 #1
:RIP: 0010:[<ffffffff812eeb8c>]  [<ffffffff812eeb8c>] __list_del_entry+0x2c/0xd0
:Call Trace:
: [<ffffffff811b783e>] redirty_tail+0x5e/0x80
: [<ffffffff811b8212>] __writeback_inodes_wb+0x72/0xd0
: [<ffffffff811b980b>] wb_writeback+0x23b/0x2d0
: [<ffffffff811b9b5c>] wb_do_writeback+0xac/0x1f0
: [<ffffffff8106c0e0>] ? __internal_add_timer+0x130/0x130
: [<ffffffff811b9d2b>] bdi_writeback_thread+0x8b/0x230
: [<ffffffff811b9ca0>] ? wb_do_writeback+0x1f0/0x1f0
: [<ffffffff8107fde3>] kthread+0x93/0xa0
: [<ffffffff81627e04>] kernel_thread_helper+0x4/0x10
: [<ffffffff8107fd50>] ? kthread_freezable_should_stop+0x70/0x70
: [<ffffffff81627e00>] ? gs_change+0x13/0x13

Looking over the code, is it possible that something could be
dirtying pages faster than writeback can get them written out,
keeping us in this loop indefinitely?

Should there be something in this loop periodically poking
the watchdog, perhaps?
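
Something like the below, perhaps (untested sketch only;
touch_softlockup_watchdog() is the existing helper for this, though
whether the writeback loop is the right place to call it is exactly
what I'm asking):

	/* __writeback_inodes_wb(), hypothetical change */
	while (!list_empty(&wb->b_io)) {
		/*
		 * Hypothetical: reset the soft lockup timer on each
		 * iteration, so staying in this loop without sleeping
		 * no longer trips the detector.
		 */
		touch_softlockup_watchdog();
		...
	}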

	Dave



* Re: livelock in __writeback_inodes_wb ?
From: Fengguang Wu @ 2012-12-11  8:23 UTC
  To: Dave Jones, linux-mm, Linux Kernel; +Cc: Andrew Morton, Jan Kara, linux-fsdevel

On Wed, Nov 28, 2012 at 09:55:15AM -0500, Dave Jones wrote:
> We had a user report that the soft lockup detector kicked in after
> 22 seconds of no progress, with this trace:

Where is the original report? The reporter may help provide some clues
on the workload that triggered the bug.

> :BUG: soft lockup - CPU#1 stuck for 22s! [flush-8:16:3137]
> :Pid: 3137, comm: flush-8:16 Not tainted 3.6.7-4.fc17.x86_64 #1
> :RIP: 0010:[<ffffffff812eeb8c>]  [<ffffffff812eeb8c>] __list_del_entry+0x2c/0xd0
> :Call Trace:
> : [<ffffffff811b783e>] redirty_tail+0x5e/0x80
> : [<ffffffff811b8212>] __writeback_inodes_wb+0x72/0xd0
> : [<ffffffff811b980b>] wb_writeback+0x23b/0x2d0
> : [<ffffffff811b9b5c>] wb_do_writeback+0xac/0x1f0
> : [<ffffffff8106c0e0>] ? __internal_add_timer+0x130/0x130
> : [<ffffffff811b9d2b>] bdi_writeback_thread+0x8b/0x230
> : [<ffffffff811b9ca0>] ? wb_do_writeback+0x1f0/0x1f0
> : [<ffffffff8107fde3>] kthread+0x93/0xa0
> : [<ffffffff81627e04>] kernel_thread_helper+0x4/0x10
> : [<ffffffff8107fd50>] ? kthread_freezable_should_stop+0x70/0x70
> : [<ffffffff81627e00>] ? gs_change+0x13/0x13
> 
> Looking over the code, is it possible that something could be
> dirtying pages faster than writeback can get them written out,
> keeping us in this loop indefinitely?

The bug reporter should know best whether there is heavy IO.

However, I suspect it's not directly caused by heavy IO: we will
release &wb->list_lock before each __writeback_single_inode() call,
which starts writeback IO for each inode.
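
Roughly this pattern in writeback_sb_inodes(), trimmed from the 3.6
source for illustration (error paths and the dirty-list handling
omitted):

	while (!list_empty(&wb->b_io)) {
		struct inode *inode = wb_inode(wb->b_io.prev);
		...
		spin_unlock(&wb->list_lock);	/* dropped before the IO */
		__writeback_single_inode(inode, wb, &wbc);
		...
		spin_lock(&wb->list_lock);	/* re-taken afterwards */
	}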

> Should there be something in this loop periodically poking
> the watchdog, perhaps?

It seems we failed to release &wb->list_lock in wb_writeback() for a
long time (dozens of seconds). That is, inode_sleep_on_writeback() is
somehow not called. However, it's not obvious to me how this can
happen...
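
For reference, the tail of the wb_writeback() loop that should give us
that sleep looks roughly like this (trimmed from 3.6):

	if (progress)
		continue;
	/* no more inodes for IO, bail */
	if (list_empty(&wb->b_more_io))
		break;
	/*
	 * Nothing written. Wait for some inode to become available
	 * for writeback, otherwise we would just busyloop.
	 */
	inode = wb_inode(wb->b_more_io.prev);
	spin_lock(&inode->i_lock);
	spin_unlock(&wb->list_lock);
	inode_sleep_on_writeback(inode);	/* releases i_lock, sleeps */
	spin_lock(&wb->list_lock);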

Thanks,
Fengguang


* Re: livelock in __writeback_inodes_wb ?
From: Jan Kara @ 2012-12-11 13:41 UTC
  To: Fengguang Wu
  Cc: Dave Jones, linux-mm, Linux Kernel, Andrew Morton, Jan Kara,
	linux-fsdevel

On Tue 11-12-12 16:23:27, Wu Fengguang wrote:
> On Wed, Nov 28, 2012 at 09:55:15AM -0500, Dave Jones wrote:
> > We had a user report that the soft lockup detector kicked in after
> > 22 seconds of no progress, with this trace:
> 
> Where is the original report? The reporter may help provide some clues
> on the workload that triggered the bug.
> 
> > :BUG: soft lockup - CPU#1 stuck for 22s! [flush-8:16:3137]
> > :Pid: 3137, comm: flush-8:16 Not tainted 3.6.7-4.fc17.x86_64 #1
> > :RIP: 0010:[<ffffffff812eeb8c>]  [<ffffffff812eeb8c>] __list_del_entry+0x2c/0xd0
> > :Call Trace:
> > : [<ffffffff811b783e>] redirty_tail+0x5e/0x80
> > : [<ffffffff811b8212>] __writeback_inodes_wb+0x72/0xd0
> > : [<ffffffff811b980b>] wb_writeback+0x23b/0x2d0
> > : [<ffffffff811b9b5c>] wb_do_writeback+0xac/0x1f0
> > : [<ffffffff8106c0e0>] ? __internal_add_timer+0x130/0x130
> > : [<ffffffff811b9d2b>] bdi_writeback_thread+0x8b/0x230
> > : [<ffffffff811b9ca0>] ? wb_do_writeback+0x1f0/0x1f0
> > : [<ffffffff8107fde3>] kthread+0x93/0xa0
> > : [<ffffffff81627e04>] kernel_thread_helper+0x4/0x10
> > : [<ffffffff8107fd50>] ? kthread_freezable_should_stop+0x70/0x70
> > : [<ffffffff81627e00>] ? gs_change+0x13/0x13
> > 
> > Looking over the code, is it possible that something could be
> > dirtying pages faster than writeback can get them written out,
> > keeping us in this loop indefinitely?
> 
> The bug reporter should know best whether there is heavy IO.
> 
> However, I suspect it's not directly caused by heavy IO: we will
> release &wb->list_lock before each __writeback_single_inode() call,
> which starts writeback IO for each inode.
  Umm, it's not about releasing wb->list_lock, I think. The softlockup
detector will trigger whenever we are looping in the kernel for more
than the given timeout (e.g. those 22 s) without sleeping.

> > Should there be something in this loop periodically poking
> > the watchdog, perhaps?
> 
> It seems we failed to release &wb->list_lock in wb_writeback() for a
> long time (dozens of seconds). That is, inode_sleep_on_writeback() is
> somehow not called. However, it's not obvious to me how this can
> happen...
  Maybe progress is always non-zero but small, and nr_pages is high (e.g.
when writeback is triggered by wakeup_flusher_threads()). What filesystem
is the reporter using? I remember e.g. btrfs used to have always-dirty
inodes which could confuse us.
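
I mean this part of wb_writeback() (sketch, trimmed):

	for (;;) {
		/* nr_pages starts out huge for wakeup_flusher_threads() */
		if (work->nr_pages <= 0)
			break;
		...
		progress = __writeback_inodes_wb(wb, work);
		...
		/* any non-zero progress, however small, skips the
		 * inode_sleep_on_writeback() below */
		if (progress)
			continue;
		...
	}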

From the backtrace it is clear there's some superblock which has s_umount
locked, so we cannot write back inodes there. If this superblock contains
most of the dirty pages we need to write, and there's another superblock
with an always-dirty inode, we would livelock as observed... So my
questions would be: what filesystems are in the system (/proc/mounts),
what load triggers this, and can the reporter trigger sysrq-w when the
lockup happens?
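
For completeness, the path in 3.6's __writeback_inodes_wb() that
matches the redirty_tail() in the backtrace (trimmed, as I read the
source):

	while (!list_empty(&wb->b_io)) {
		struct inode *inode = wb_inode(wb->b_io.prev);
		struct super_block *sb = inode->i_sb;

		if (!grab_super_passive(sb)) {
			/*
			 * s_umount is held elsewhere, so we cannot
			 * write this sb: move the inode back to the
			 * dirty list and try the next one, without
			 * dropping wb->list_lock and without sleeping.
			 */
			redirty_tail(inode, wb);
			continue;
		}
		...
	}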

								Honza
-- 
Jan Kara <jack@suse.cz>
SUSE Labs, CR


* Re: livelock in __writeback_inodes_wb ?
From: Dave Jones @ 2012-12-11 14:29 UTC
  To: Fengguang Wu
  Cc: linux-mm, Linux Kernel, Andrew Morton, Jan Kara, linux-fsdevel

On Tue, Dec 11, 2012 at 04:23:27PM +0800, Fengguang Wu wrote:
 > On Wed, Nov 28, 2012 at 09:55:15AM -0500, Dave Jones wrote:
 > > We had a user report that the soft lockup detector kicked in after
 > > 22 seconds of no progress, with this trace:
 > 
 > Where is the original report? The reporter may help provide some clues
 > on the workload that triggered the bug.

https://bugzilla.redhat.com/show_bug.cgi?id=880949 

 > The bug reporter should know best whether there is heavy IO.
 > 
 > However, I suspect it's not directly caused by heavy IO: we will
 > release &wb->list_lock before each __writeback_single_inode() call,
 > which starts writeback IO for each inode.
 > 
 > > Should there be something in this loop periodically poking
 > > the watchdog, perhaps?
 > 
 > It seems we failed to release &wb->list_lock in wb_writeback() for a
 > long time (dozens of seconds). That is, inode_sleep_on_writeback() is
 > somehow not called. However, it's not obvious to me how this can
 > happen...

Right, it seems that we only drop the lock when there is more work to do.
And if there is no work to do, then we would have bailed from the loop.

mysterious.

	Dave



* Re: livelock in __writeback_inodes_wb ?
From: Simon Jeons @ 2012-12-12  2:26 UTC
  To: Dave Jones
  Cc: Fengguang Wu, linux-mm, Linux Kernel, Andrew Morton, Jan Kara,
	linux-fsdevel

On Tue, 2012-12-11 at 09:29 -0500, Dave Jones wrote:
> On Tue, Dec 11, 2012 at 04:23:27PM +0800, Fengguang Wu wrote:
>  > On Wed, Nov 28, 2012 at 09:55:15AM -0500, Dave Jones wrote:
>  > > We had a user report that the soft lockup detector kicked in after
>  > > 22 seconds of no progress, with this trace:
>  > 
>  > Where is the original report? The reporter may help provide some clues
>  > on the workload that triggered the bug.
> 
> https://bugzilla.redhat.com/show_bug.cgi?id=880949 
> 
>  > The bug reporter should know best whether there is heavy IO.
>  > 
>  > However, I suspect it's not directly caused by heavy IO: we will
>  > release &wb->list_lock before each __writeback_single_inode() call,
>  > which starts writeback IO for each inode.
>  > 
>  > > Should there be something in this loop periodically poking
>  > > the watchdog, perhaps?
>  > 
>  > It seems we failed to release &wb->list_lock in wb_writeback() for a
>  > long time (dozens of seconds). That is, inode_sleep_on_writeback() is
>  > somehow not called. However, it's not obvious to me how this can
>  > happen...
> 
> Right, it seems that we only drop the lock when there is more work to do.
> And if there is no work to do, then we would have bailed from the loop.

If there is no work to do, the lock will be dropped right after the for loop.
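
I.e. the lock scope in wb_writeback() is (sketch):

	spin_lock(&wb->list_lock);
	for (;;) {
		...	/* all the looping happens with list_lock held,
			 * unless writeback_sb_inodes() transiently
			 * drops it around each inode's IO */
	}
	spin_unlock(&wb->list_lock);	/* only reached once the loop exits */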

> 
> mysterious.
> 
> 	Dave




* Re: livelock in __writeback_inodes_wb ?
From: Fengguang Wu @ 2012-12-14  9:13 UTC
  To: Jan Kara; +Cc: Dave Jones, linux-mm, Linux Kernel, Andrew Morton, linux-fsdevel

On Tue, Dec 11, 2012 at 02:41:13PM +0100, Jan Kara wrote:
> On Tue 11-12-12 16:23:27, Wu Fengguang wrote:
> > On Wed, Nov 28, 2012 at 09:55:15AM -0500, Dave Jones wrote:
> > > We had a user report that the soft lockup detector kicked in after
> > > 22 seconds of no progress, with this trace:
> > 
> > Where is the original report? The reporter may help provide some clues
> > on the workload that triggered the bug.
> > 
> > > :BUG: soft lockup - CPU#1 stuck for 22s! [flush-8:16:3137]
> > > :Pid: 3137, comm: flush-8:16 Not tainted 3.6.7-4.fc17.x86_64 #1
> > > :RIP: 0010:[<ffffffff812eeb8c>]  [<ffffffff812eeb8c>] __list_del_entry+0x2c/0xd0
> > > :Call Trace:
> > > : [<ffffffff811b783e>] redirty_tail+0x5e/0x80
> > > : [<ffffffff811b8212>] __writeback_inodes_wb+0x72/0xd0
> > > : [<ffffffff811b980b>] wb_writeback+0x23b/0x2d0
> > > : [<ffffffff811b9b5c>] wb_do_writeback+0xac/0x1f0
> > > : [<ffffffff8106c0e0>] ? __internal_add_timer+0x130/0x130
> > > : [<ffffffff811b9d2b>] bdi_writeback_thread+0x8b/0x230
> > > : [<ffffffff811b9ca0>] ? wb_do_writeback+0x1f0/0x1f0
> > > : [<ffffffff8107fde3>] kthread+0x93/0xa0
> > > : [<ffffffff81627e04>] kernel_thread_helper+0x4/0x10
> > > : [<ffffffff8107fd50>] ? kthread_freezable_should_stop+0x70/0x70
> > > : [<ffffffff81627e00>] ? gs_change+0x13/0x13
> > > 
> > > Looking over the code, is it possible that something could be
> > > dirtying pages faster than writeback can get them written out,
> > > keeping us in this loop indefinitely?
> > 
> > The bug reporter should know best whether there is heavy IO.
> > 
> > However, I suspect it's not directly caused by heavy IO: we will
> > release &wb->list_lock before each __writeback_single_inode() call,
> > which starts writeback IO for each inode.
>   Umm, it's not about releasing wb->list_lock, I think. The softlockup
> detector will trigger whenever we are looping in the kernel for more
> than the given timeout (e.g. those 22 s) without sleeping.

Yes (and a held spinlock is the typical reason we cannot sleep).

> > > Should there be something in this loop periodically poking
> > > the watchdog, perhaps?
> > 
> > It seems we failed to release &wb->list_lock in wb_writeback() for a
> > long time (dozens of seconds). That is, inode_sleep_on_writeback() is
> > somehow not called. However, it's not obvious to me how this can
> > happen...
>   Maybe progress is always non-zero but small, and nr_pages is high (e.g.
> when writeback is triggered by wakeup_flusher_threads()). What filesystem
> is the reporter using? I remember e.g. btrfs used to have always-dirty
> inodes which could confuse us.

Judging from the comm "flush-8:16", it's not btrfs (device 8:16 is
/dev/sdb; a btrfs flusher would be named after its per-fs bdi instead).
I cannot find the reporter's email address in the original bug report,
so at best we may guess that it's likely running Fedora's default
filesystem (ext4). Device mapper is also in use, judging from the boot
param "rd.lvm.lv=vg_san-pc/lv_root".

> From the backtrace it is clear there's some superblock which has s_umount
> locked, so we cannot write back inodes there. If this superblock contains
> most of the dirty pages we need to write, and there's another superblock
> with an always-dirty inode, we would livelock as observed... So my
> questions would be: what filesystems are in the system (/proc/mounts),
> what load triggers this, and can the reporter trigger sysrq-w when the
> lockup happens?

Yeah, it's likely related to some busy superblock.

Thanks,
Fengguang


Thread overview: 6 messages

2012-11-28 14:55 livelock in __writeback_inodes_wb ? Dave Jones
2012-12-11  8:23 ` Fengguang Wu
2012-12-11 13:41   ` Jan Kara
2012-12-14  9:13     ` Fengguang Wu
2012-12-11 14:29   ` Dave Jones
2012-12-12  2:26     ` Simon Jeons
