linux-xfs.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Dave Chinner <david@fromorbit.com>
To: Matthew Wilcox <willy@infradead.org>
Cc: "Michal Hocko" <mhocko@kernel.org>,
	"Holger Hoffstätte" <holger@applied-asynchrony.com>,
	"Yafang Shao" <laoar.shao@gmail.com>,
	linux-xfs@vger.kernel.org, linux-fsdevel@vger.kernel.org,
	linux-mm@kvack.org
Subject: Re: [PATCH v3] xfs: avoid deadlock when trigger memory reclaim in ->writepages
Date: Tue, 16 Jun 2020 09:23:58 +1000	[thread overview]
Message-ID: <20200615232358.GW2040@dread.disaster.area> (raw)
In-Reply-To: <20200615150736.GU8681@bombadil.infradead.org>

On Mon, Jun 15, 2020 at 08:07:36AM -0700, Matthew Wilcox wrote:
> On Mon, Jun 15, 2020 at 04:53:31PM +0200, Michal Hocko wrote:
> > On Mon 15-06-20 16:25:52, Holger Hoffstätte wrote:
> > > On 2020-06-15 13:56, Yafang Shao wrote:
> > [...]
> > > > diff --git a/fs/xfs/xfs_aops.c b/fs/xfs/xfs_aops.c
> > > > index b356118..1ccfbf2 100644
> > > > --- a/fs/xfs/xfs_aops.c
> > > > +++ b/fs/xfs/xfs_aops.c
> > > > @@ -573,9 +573,21 @@ static inline bool xfs_ioend_needs_workqueue(struct iomap_ioend *ioend)
> > > >   	struct writeback_control *wbc)
> > > >   {
> > > >   	struct xfs_writepage_ctx wpc = { };
> > > > +	unsigned int nofs_flag;
> > > > +	int ret;
> > > >   	xfs_iflags_clear(XFS_I(mapping->host), XFS_ITRUNCATED);
> > > > -	return iomap_writepages(mapping, wbc, &wpc.ctx, &xfs_writeback_ops);
> > > > +
> > > > +	/*
> > > > +	 * We can allocate memory here while doing writeback on behalf of
> > > > +	 * memory reclaim.  To avoid memory allocation deadlocks set the
> > > > +	 * task-wide nofs context for the following operations.
> > > > +	 */
> > > > +	nofs_flag = memalloc_nofs_save();
> > > > +	ret = iomap_writepages(mapping, wbc, &wpc.ctx, &xfs_writeback_ops);
> > > > +	memalloc_nofs_restore(nofs_flag);
> > > > +
> > > > +	return ret;
> > > >   }
> > > >   STATIC int
> > > > 
> > > 
> > > Not sure if I did something wrong, but while the previous version of this patch
> > > worked fine, this one gave me (with v2 removed obviously):
> > > 
> > > [  +0.000004] WARNING: CPU: 0 PID: 2811 at fs/iomap/buffered-io.c:1544 iomap_do_writepage+0x6b4/0x780
> > 
> > This corresponds to
> >         /*
> >          * Given that we do not allow direct reclaim to call us, we should
> >          * never be called in a recursive filesystem reclaim context.
> >          */
> >         if (WARN_ON_ONCE(current->flags & PF_MEMALLOC_NOFS))
> >                 goto redirty;
> > 
> > which effectivelly says that memalloc_nofs_save/restore cannot be used
> > for that code path. Your stack trace doesn't point to a reclaim path
> > which shows that this path is shared and also underlines that this is
> > not really an intended use of the api. Please refer to
> > Documentation/core-api/gfp_mask-from-fs-io.rst for more details but
> > shortly the API should be used at the layer which defines a context
> > which shouldn't allow to recurse. E.g. a lock which would be problematic
> > in the reclaim recursion path.
> 
> I'm concerned that this might not be the correct approach.  Generally we
> have the rule that freeing memory should not require allocating memory
> to succeed.  Since XFS obviously does need to allocate memory to start
> a transaction, this allocation should normally be protected by a mempool
> or similar.

<sigh>

We've been down this path before about exactly how much memory XFS
uses to do extent allocation and why mempools nor any existing mm
infrastructure can actually guarantee filesystem transactions
forwards progress. e.g:

https://lwn.net/Articles/636797/

It's even more problematic now that extent allocation is not jsut
one transaction now, but is a series of overlapping linked
transactions that update things like refcount and reverse mapping
btrees, and not just the freespace and inode block map btrees...

> XFS is subtle and has its own rules around memory allocation and
> writeback, so I don't want to say this is definitely wrong.  I'm just
> saying that I have concerns it might not be right.

Almost all XFS memory allocations within the writeback path are
within transaction contexts - the transaction allocation itself is
an exception, but then xfs_trans_alloc() sets PF_MEMALLOC_NOFS
itself (used to be PF_FSTRANS, as per my previous comments in this
thread). Hence putting the entire ->writepages path under NOFS
context won't change anything materially for XFS.

-Dave.
-- 
Dave Chinner
david@fromorbit.com

  reply	other threads:[~2020-06-15 23:24 UTC|newest]

Thread overview: 20+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-06-15 11:56 [PATCH v3] xfs: avoid deadlock when trigger memory reclaim in ->writepages Yafang Shao
2020-06-15 14:25 ` Holger Hoffstätte
2020-06-15 14:51   ` Yafang Shao
2020-06-15 14:53   ` Michal Hocko
2020-06-15 15:07     ` Matthew Wilcox
2020-06-15 23:23       ` Dave Chinner [this message]
2020-06-15 15:08     ` Yafang Shao
2020-06-15 23:06     ` Dave Chinner
2020-06-16  7:56       ` Michal Hocko
2020-06-16 10:17       ` Yafang Shao
2020-06-16  8:16 ` Michal Hocko
2020-06-16  9:05   ` Yafang Shao
2020-06-16  9:27     ` Michal Hocko
2020-06-16  9:39       ` Yafang Shao
2020-06-16 10:48         ` Michal Hocko
2020-06-16 11:42           ` Yafang Shao
2020-06-18  0:34           ` Dave Chinner
2020-06-18 11:04             ` Yafang Shao
2020-06-22  1:23 ` [xfs] 59d77e81c5: WARNING:at_fs/iomap/buffered-io.c:#iomap_do_writepage kernel test robot
2020-06-22 12:20   ` Yafang Shao

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20200615232358.GW2040@dread.disaster.area \
    --to=david@fromorbit.com \
    --cc=holger@applied-asynchrony.com \
    --cc=laoar.shao@gmail.com \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=linux-xfs@vger.kernel.org \
    --cc=mhocko@kernel.org \
    --cc=willy@infradead.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).