From: Brian Foster <bfoster@redhat.com>
To: Michal Hocko <mhocko@kernel.org>
Cc: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>,
	linux-xfs@vger.kernel.org, linux-mm@kvack.org
Subject: Re: How to favor memory allocations for WQ_MEM_RECLAIM threads?
Date: Fri, 3 Mar 2017 12:29:04 -0500
Message-ID: <20170303172904.GE21245@bfoster.bfoster>
In-Reply-To: <20170303155258.GJ31499@dhcp22.suse.cz>

On Fri, Mar 03, 2017 at 04:52:58PM +0100, Michal Hocko wrote:
> On Fri 03-03-17 10:37:21, Brian Foster wrote:
> [...]
> > That aside, looking through some of the traces in this case...
> > 
> > - kswapd0 is waiting on an inode flush lock. This means somebody else
> >   flushed the inode and it won't be unlocked until the underlying buffer
> >   I/O is completed. This context is also holding pag_ici_reclaim_lock
> >   which is what probably blocks other contexts from getting into inode
> >   reclaim.
> > - xfsaild is in xfs_iflush(), which means it has the inode flush lock.
> >   It's waiting on reading the underlying inode buffer. The buffer read
> >   sets b_ioend_wq to the xfs-buf wq, which is ultimately going to be
> >   queued in xfs_buf_bio_end_io()->xfs_buf_ioend_async(). The associated
> >   work item is what eventually triggers the I/O completion in
> >   xfs_buf_ioend().
> > 
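(For reference, a minimal sketch of the completion-deferral pattern described
in the list above. The names here, my_buf, my_buf_bio_end_io and
my_buf_ioend_work, are made up for illustration; this is not the actual XFS
code, but the shape is the same: the bio end_io handler runs in interrupt
context and only queues a work item, while the heavyweight completion
processing runs later in process context from the buffer's workqueue, i.e.
xfs-buf in the traces above.)

#include <linux/bio.h>
#include <linux/completion.h>
#include <linux/workqueue.h>

struct my_buf {
	struct work_struct	b_ioend_work;
	struct workqueue_struct	*b_ioend_wq;	/* e.g. the xfs-buf wq */
	struct completion	b_iowait;	/* what waiters sleep on;
						 * init_completion() at buffer
						 * setup, not shown */
};

/* Runs later, in process context, from a workqueue worker or the rescuer. */
static void my_buf_ioend_work(struct work_struct *work)
{
	struct my_buf *bp = container_of(work, struct my_buf, b_ioend_work);

	/* ... run completion callbacks, clear in-flight flags, etc. ... */
	complete(&bp->b_iowait);	/* wake anyone blocked on this buffer */
}

/* bio completion handler, called from (soft)irq context when the I/O finishes */
static void my_buf_bio_end_io(struct bio *bio)
{
	struct my_buf *bp = bio->bi_private;

	/* too heavy for irq context, so hand off to the buffer's workqueue */
	INIT_WORK(&bp->b_ioend_work, my_buf_ioend_work);
	queue_work(bp->b_ioend_wq, &bp->b_ioend_work);
	bio_put(bio);
}
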
> > So at this point reclaim is waiting on a read I/O completion. It's not
> > clear to me whether the read had completed and the work item was queued
> > or not. I do see the following in the workqueue lockup BUG output:
> > 
> > [  273.412600] workqueue xfs-buf/sda1: flags=0xc
> > [  273.414486]   pwq 4: cpus=2 node=0 flags=0x0 nice=0 active=1/1
> > [  273.416415]     pending: xfs_buf_ioend_work [xfs]
> > 
> > ... which suggests that it was queued..? I suppose this could be one of
> > the workqueues waiting on a kthread, but xfs-buf also has a rescuer that
> > appears to be idle:
> > 
> > [ 1041.555227] xfs-buf/sda1    S14904   450      2 0x00000000
> > [ 1041.556813] Call Trace:
> > [ 1041.557796]  __schedule+0x336/0xe00
> > [ 1041.558983]  schedule+0x3d/0x90
> > [ 1041.560085]  rescuer_thread+0x322/0x3d0
> > [ 1041.561333]  kthread+0x10f/0x150
> > [ 1041.562464]  ? worker_thread+0x4b0/0x4b0
> > [ 1041.563732]  ? kthread_create_on_node+0x70/0x70
> > [ 1041.565123]  ret_from_fork+0x31/0x40
> > 
> > So shouldn't that thread pick up the work item if that is the case?
> 
> Is it possible that progress is being made, just tediously slowly? Keep
> in mind that the test case is doing writes from 1k processes while one
> process basically consumes all the memory. So I wouldn't be surprised
> if this just made the system crawl on any attempt to do I/O.

That seems like a possibility to me: we could be waiting either on an
actual I/O completion (no guarantee that the pending xfs-buf item is the
one we care about, I suppose) or on whatever needs to happen for the wq
infrastructure to kick off the rescuer. Though I think that's probably
something Tetsuo would ultimately have to confirm on his setup.
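
For what it's worth, the rescuer only exists in the first place because the
workqueue was created with WQ_MEM_RECLAIM. A rough sketch of that setup, with
made-up names rather than the exact XFS mount-time code:

#include <linux/errno.h>
#include <linux/module.h>
#include <linux/workqueue.h>

static struct workqueue_struct *buf_ioend_wq;

static int __init wq_sketch_init(void)
{
	/*
	 * WQ_MEM_RECLAIM makes the workqueue core create a dedicated rescuer
	 * kthread (the idle xfs-buf/sda1 thread in the trace above), so
	 * queued items can still make forward progress when new worker
	 * threads cannot be forked under memory pressure.  max_active of 1
	 * is just an example value.
	 */
	buf_ioend_wq = alloc_workqueue("buf-ioend", WQ_MEM_RECLAIM, 1);
	if (!buf_ioend_wq)
		return -ENOMEM;
	return 0;
}

static void __exit wq_sketch_exit(void)
{
	destroy_workqueue(buf_ioend_wq);
}

module_init(wq_sketch_init);
module_exit(wq_sketch_exit);
MODULE_LICENSE("GPL");

As I understand it, that rescuer is only woken via the mayday path, once a
pool has pending work and repeatedly fails to create a worker for it, so an
idle rescuer alongside a pending xfs_buf_ioend_work item doesn't by itself
tell us whether the hand-off is broken or just slow.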

Brian

> -- 
> Michal Hocko
> SUSE Labs


Thread overview:
2017-03-03 10:48 How to favor memory allocations for WQ_MEM_RECLAIM threads? Tetsuo Handa
2017-03-03 13:39 ` Michal Hocko
2017-03-03 15:37   ` Brian Foster
2017-03-03 15:52     ` Michal Hocko
2017-03-03 17:29       ` Brian Foster [this message]
2017-03-04 14:54         ` Tetsuo Handa
2017-03-06 13:25           ` Brian Foster
2017-03-06 16:08             ` Tetsuo Handa
2017-03-06 16:17               ` Brian Foster
2017-03-03 23:25   ` Dave Chinner
2017-03-07 12:15     ` Michal Hocko
2017-03-07 19:36       ` Tejun Heo
2017-03-07 21:21         ` Dave Chinner
2017-03-07 21:48           ` Tejun Heo
2017-03-08 23:03             ` Tejun Heo
