From: Michal Hocko <mhocko@kernel.org>
To: Brian Foster <bfoster@redhat.com>
Cc: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>,
linux-xfs@vger.kernel.org, linux-mm@kvack.org
Subject: Re: How to favor memory allocations for WQ_MEM_RECLAIM threads?
Date: Fri, 3 Mar 2017 16:52:58 +0100
Message-ID: <20170303155258.GJ31499@dhcp22.suse.cz>
In-Reply-To: <20170303153720.GC21245@bfoster.bfoster>

On Fri 03-03-17 10:37:21, Brian Foster wrote:
[...]
> That aside, looking through some of the traces in this case...
>
> - kswapd0 is waiting on an inode flush lock. This means somebody else
> flushed the inode and it won't be unlocked until the underlying buffer
> I/O is completed. This context is also holding pag_ici_reclaim_lock
> which is what probably blocks other contexts from getting into inode
> reclaim.
> - xfsaild is in xfs_iflush(), which means it has the inode flush lock.
> It's waiting on reading the underlying inode buffer. The buffer read
> sets b_ioend_wq to the xfs-buf wq, which is ultimately going to be
> queued in xfs_buf_bio_end_io()->xfs_buf_ioend_async(). The associated
> work item is what eventually triggers the I/O completion in
> xfs_buf_ioend().
>
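
(For reference, the handoff described above is essentially just an
INIT_WORK/queue_work pair onto the buffer's b_ioend_wq. A minimal sketch
from memory, so not necessarily line-for-line what the tree under test
has:

static void
xfs_buf_ioend_work(
	struct work_struct	*work)
{
	struct xfs_buf	*bp =
		container_of(work, struct xfs_buf, b_ioend_work);

	/* runs from the xfs-buf workqueue and finishes the I/O */
	xfs_buf_ioend(bp);
}

static void
xfs_buf_ioend_async(
	struct xfs_buf	*bp)
{
	/* defer the completion to the per-mount xfs-buf workqueue */
	INIT_WORK(&bp->b_ioend_work, xfs_buf_ioend_work);
	queue_work(bp->b_ioend_wq, &bp->b_ioend_work);
}

So once the bio completes, nothing further happens until a worker
actually gets to run that item.)
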
> So at this point reclaim is waiting on a read I/O completion. It's not
> clear to me whether the read had completed and the work item was queued
> or not. I do see the following in the workqueue lockup BUG output:
>
> [ 273.412600] workqueue xfs-buf/sda1: flags=0xc
> [ 273.414486] pwq 4: cpus=2 node=0 flags=0x0 nice=0 active=1/1
> [ 273.416415] pending: xfs_buf_ioend_work [xfs]
>
> ... which suggests that it was queued..? I suppose this could be one of
> the workqueues waiting on a kthread, but xfs-buf also has a rescuer that
> appears to be idle:
>
> [ 1041.555227] xfs-buf/sda1 S14904 450 2 0x00000000
> [ 1041.556813] Call Trace:
> [ 1041.557796] __schedule+0x336/0xe00
> [ 1041.558983] schedule+0x3d/0x90
> [ 1041.560085] rescuer_thread+0x322/0x3d0
> [ 1041.561333] kthread+0x10f/0x150
> [ 1041.562464] ? worker_thread+0x4b0/0x4b0
> [ 1041.563732] ? kthread_create_on_node+0x70/0x70
> [ 1041.565123] ret_from_fork+0x31/0x40
>
> So shouldn't that thread pick up the work item if that is the case?
Is it possible that progress is being made, just tediously slowly? Keep in
mind that the test case is doing writes from 1k processes while one
process basically consumes all the memory. So I wouldn't be surprised
if this just made the system crawl on any attempt to do I/O.
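
(FWIW, the rescuer exists because xfs-buf is a WQ_MEM_RECLAIM workqueue.
A sketch of the allocation, with the flags from memory rather than from
the tree under test:

	/* per-mount buffer I/O completion workqueue (xfs_super.c) */
	mp->m_buf_workqueue = alloc_workqueue("xfs-buf/%s",
			WQ_MEM_RECLAIM | WQ_FREEZABLE, 1, mp->m_fsname);

AFAIU the rescuer is only woken through the mayday path, i.e. when the
worker pool fails to create a new worker for the pending item in time,
so the rescuer sitting idle doesn't by itself mean the item cannot be
processed - it may simply still be waiting for a regular worker to get
to it.)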
--
Michal Hocko
SUSE Labs