Linux-XFS Archive on lore.kernel.org
From: "Darrick J. Wong" <djwong@kernel.org>
To: Brian Foster <bfoster@redhat.com>
Cc: Christoph Hellwig <hch@lst.de>,
	linux-xfs@vger.kernel.org, hch@infradead.org,
	david@fromorbit.com
Subject: Re: [PATCH 02/11] xfs: don't stall cowblocks scan if we can't take locks
Date: Tue, 26 Jan 2021 19:09:09 -0800
Message-ID: <20210127030909.GD7698@magnolia> (raw)
In-Reply-To: <20210126200309.GA2515451@bfoster>

On Tue, Jan 26, 2021 at 03:03:09PM -0500, Brian Foster wrote:
> On Tue, Jan 26, 2021 at 10:34:52AM -0800, Darrick J. Wong wrote:
> > On Tue, Jan 26, 2021 at 08:14:51AM -0500, Brian Foster wrote:
> > > On Mon, Jan 25, 2021 at 11:54:46AM -0800, Darrick J. Wong wrote:
> > > > On Mon, Jan 25, 2021 at 01:14:06PM -0500, Brian Foster wrote:
> > > > > On Sat, Jan 23, 2021 at 10:52:10AM -0800, Darrick J. Wong wrote:
> > > > > > From: Darrick J. Wong <djwong@kernel.org>
> > > > > > 
> > > > > > Don't stall the cowblocks scan on a locked inode if we possibly can.
> > > > > > We'd much rather the background scanner keep moving.
> > > > > > 
> > > > > > Signed-off-by: Darrick J. Wong <djwong@kernel.org>
> > > > > > Reviewed-by: Christoph Hellwig <hch@lst.de>
> > > > > > ---
> > > > > >  fs/xfs/xfs_icache.c |   21 ++++++++++++++++++---
> > > > > >  1 file changed, 18 insertions(+), 3 deletions(-)
> > > > > > 
> > > > > > 
> > > > > > diff --git a/fs/xfs/xfs_icache.c b/fs/xfs/xfs_icache.c
> > > > > > index c71eb15e3835..89f9e692fde7 100644
> > > > > > --- a/fs/xfs/xfs_icache.c
> > > > > > +++ b/fs/xfs/xfs_icache.c
> > > > > > @@ -1605,17 +1605,31 @@ xfs_inode_free_cowblocks(
> > > > > >  	void			*args)
> > > > > >  {
> > > > > >  	struct xfs_eofblocks	*eofb = args;
> > > > > > +	bool			wait;
> > > > > >  	int			ret = 0;
> > > > > >  
> > > > > > +	wait = eofb && (eofb->eof_flags & XFS_EOF_FLAGS_SYNC);
> > > > > > +
> > > > > >  	if (!xfs_prep_free_cowblocks(ip))
> > > > > >  		return 0;
> > > > > >  
> > > > > >  	if (!xfs_inode_matches_eofb(ip, eofb))
> > > > > >  		return 0;
> > > > > >  
> > > > > > -	/* Free the CoW blocks */
> > > > > > -	xfs_ilock(ip, XFS_IOLOCK_EXCL);
> > > > > > -	xfs_ilock(ip, XFS_MMAPLOCK_EXCL);
> > > > > > +	/*
> > > > > > +	 * If the caller is waiting, return -EAGAIN to keep the background
> > > > > > +	 * scanner moving and revisit the inode in a subsequent pass.
> > > > > > +	 */
> > > > > > +	if (!xfs_ilock_nowait(ip, XFS_IOLOCK_EXCL)) {
> > > > > > +		if (wait)
> > > > > > +			return -EAGAIN;
> > > > > > +		return 0;
> > > > > > +	}
> > > > > > +	if (!xfs_ilock_nowait(ip, XFS_MMAPLOCK_EXCL)) {
> > > > > > +		if (wait)
> > > > > > +			ret = -EAGAIN;
> > > > > > +		goto out_iolock;
> > > > > > +	}
> > > > > 
> > > > > Hmm.. I'd be a little concerned over this allowing a scan to repeat
> > > > > indefinitely with a competing workload because a restart doesn't carry
> > > > > over any state from the previous scan. I suppose the
> > > > > xfs_prep_free_cowblocks() checks make that slightly less likely on a
> > > > > given file, but I more wonder about a scenario with a large set of
> > > > > inodes in a particular AG with a sufficient amount of concurrent
> > > > > activity. All it takes is one trylock failure per scan to have to start
> > > > > the whole thing over again... hm?
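
(Purely as an illustration of the restart behavior being discussed -- this is
a simplified sketch, not the actual xfs_icache.c walker, and the helper names
are made up:)

	/*
	 * Simplified sketch of a per-AG inode walk in the style of the
	 * blockgc scanner.  A callback that fails a trylock and returns
	 * -EAGAIN only marks the pass as "skipped"; a sync scan then
	 * restarts the whole AG from the beginning rather than resuming
	 * where it left off, so one busy inode per pass is enough to go
	 * around again.
	 */
	static int
	example_ag_walk(
		struct xfs_perag	*pag,
		int			(*execute)(struct xfs_inode *ip,
						   void *args),
		void			*args,
		bool			sync)
	{
		struct xfs_inode	*ip;
		int			skipped;
		int			error;

	restart:
		skipped = 0;
		for_each_incore_inode_in_ag(pag, ip) {	/* made-up helper */
			error = execute(ip, args);
			if (error == -EAGAIN) {
				skipped++;	/* couldn't lock; try later */
				continue;
			}
			if (error)
				return error;
		}
		if (skipped && sync) {
			delay(1);		/* back off briefly... */
			goto restart;		/* ...and redo the whole AG */
		}
		return 0;
	}
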
> > > > 
> > > > I'm not quite sure what to do here -- xfs_inode_free_eofblocks already
> > > > has the ability to return EAGAIN, which (I think) means that it's
> > > > already possible for the low-quota scan to stall indefinitely if the
> > > > scan can't lock the inode.
> > > > 
> > > 
> > > Indeed, that is true.
> > > 
> > > > I think we already had a stall limiting factor here in that all the
> > > > other threads in the system that hit EDQUOT will drop their IOLOCKs to
> > > > scan the fs, which means that while they loop around the scanner they
> > > > can only be releasing quota and driving us towards having fewer inodes
> > > > with the same dquots and either blockgc tag set.
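
(For reference, the pattern meant here looks roughly like this -- a
deliberately simplified sketch of the buffered-write fallback, not the actual
fs/xfs/xfs_file.c code; the example_* helpers are stand-ins:)

	/*
	 * Rough sketch only: a writer that hits EDQUOT/ENOSPC drops its
	 * IOLOCK, runs a blockgc scan to free speculative preallocations,
	 * and retries the write once.  While it loops here it holds no
	 * inode locks, so all it can be doing is releasing quota/space.
	 */
	ssize_t		ret;
	bool		retried = false;

retry:
	ret = example_do_buffered_write(ip, from);	/* made-up helper */
	if ((ret == -EDQUOT || ret == -ENOSPC) && !retried) {
		retried = true;
		xfs_iunlock(ip, XFS_IOLOCK_EXCL);
		example_blockgc_scan(ip->i_mount);	/* made-up stand-in */
		xfs_ilock(ip, XFS_IOLOCK_EXCL);
		goto retry;
	}
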
> > > > 
> > > 
> > > Yeah, that makes sense for the current use case. There's a broader
> > > sequence involved there that provides some throttling and serialization,
> > > along with the fact that the workload is imminently driving into
> > > -ENOSPC.
> > > 
> > > I think what had me a little concerned upon seeing this is whether the
> > > scanning mechanism is currently suitable for the broader usage
> > > introduced in this series. We've had related issues in the past with
> > > concurrent sync eofblocks scans and iolock (see [1], for example).
> > > Having made it through the rest of the series however, it looks like all
> > > of the new scan invocations are async, so perhaps this is not really an
> > > immediate problem.
> > > 
> > > I think it would be nice if we could somehow assert that the task that
> > > invokes a sync scan doesn't hold an iolock, but I'm not sure there's a
> > > clean way to do that. We'd probably have to define the interface to
> > > require an inode just for that purpose. It may not be worth that
> > > weirdness, and I suppose if the code is tested it should be pretty obvious
> > > that such a scan will never complete..
> > 
> > Well... in theory it would be possible to deal with stalls (A->A
> > livelock or otherwise) if we had that IWALK_NORETRY flag I was talking
> > about that would cause xfs_iwalk to exit with EAGAIN instead of
> > restarting the scan at inode 0.  The caller could detect that a
> > synchronous scan didn't complete, and then decide if it wants to call
> > back to try again.
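
(The caller side would be something like this -- entirely hypothetical, since
neither the flag nor the wrapper exists:)

	/*
	 * Hypothetical: with an IWALK_NORETRY-style flag the walk would
	 * return -EAGAIN instead of looping back to inode 0, and the sync
	 * caller could bound how long it's willing to keep trying.
	 */
	int	retries = 10;	/* arbitrary bound */
	int	error;

	do {
		error = example_blockgc_scan_once(mp, eofb);	/* made up */
	} while (error == -EAGAIN && retries-- > 0);
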
> > 
> > But, that might be a lot of extra code to deal with a requirement that
> > xfs_blockgc_free_* callers cannot hold an iolock or an mmaplock.  Maybe
> > that's the simpler course of action?
> > 
> 
> Yeah, I think we should require that callers drop all such locks before
> invoking a sync scan, since that may livelock against the lock held by
> the current task (or cause similar weirdness against concurrent sync
> scans, as the code prior to the commit below[1] had demonstrated).  The
> async scans used throughout this series seem reasonable to me..

Ok, will update the code comment for xfs_blockgc_free_quota to say that
callers cannot hold any inode IO/MMAP/ILOCKs for sync scans.
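
Roughly along these lines (wording not final):

	/*
	 * Callers must not hold any inode's IOLOCK, MMAPLOCK, or ILOCK when
	 * requesting a synchronous (XFS_EOF_FLAGS_SYNC) scan, since the scan
	 * can stall or livelock on the very locks the caller is holding.
	 */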

--D

> Brian
> 
> > --D
> > 
> > > Brian
> > > 
> > > [1] c3155097ad89 ("xfs: sync eofblocks scans under iolock are livelock prone")
> > > 
> > > > --D
> > > > 
> > > > > Brian
> > > > > 
> > > > > >  
> > > > > >  	/*
> > > > > >  	 * Check again, nobody else should be able to dirty blocks or change
> > > > > > @@ -1625,6 +1639,7 @@ xfs_inode_free_cowblocks(
> > > > > >  		ret = xfs_reflink_cancel_cow_range(ip, 0, NULLFILEOFF, false);
> > > > > >  
> > > > > >  	xfs_iunlock(ip, XFS_MMAPLOCK_EXCL);
> > > > > > +out_iolock:
> > > > > >  	xfs_iunlock(ip, XFS_IOLOCK_EXCL);
> > > > > >  
> > > > > >  	return ret;
> > > > > > 
> > > > > 
> > > > 
> > > 
> > 
> 

Thread overview: 53+ messages
2021-01-23 18:51 [PATCHSET v4 00/11] xfs: try harder to reclaim space when we run out Darrick J. Wong
2021-01-23 18:52 ` [PATCH 01/11] xfs: refactor messy xfs_inode_free_quota_* functions Darrick J. Wong
2021-01-25 18:13   ` Brian Foster
2021-01-25 19:33     ` Darrick J. Wong
2021-01-23 18:52 ` [PATCH 02/11] xfs: don't stall cowblocks scan if we can't take locks Darrick J. Wong
2021-01-25 18:14   ` Brian Foster
2021-01-25 19:54     ` Darrick J. Wong
2021-01-26 13:14       ` Brian Foster
2021-01-26 18:34         ` Darrick J. Wong
2021-01-26 20:03           ` Brian Foster
2021-01-27  3:09             ` Darrick J. Wong [this message]
2021-01-23 18:52 ` [PATCH 03/11] xfs: xfs_inode_free_quota_blocks should scan project quota Darrick J. Wong
2021-01-25 18:14   ` Brian Foster
2021-01-23 18:52 ` [PATCH 04/11] xfs: move and rename xfs_inode_free_quota_blocks to avoid conflicts Darrick J. Wong
2021-01-25 18:14   ` Brian Foster
2021-01-23 18:52 ` [PATCH 05/11] xfs: pass flags and return gc errors from xfs_blockgc_free_quota Darrick J. Wong
2021-01-24  9:34   ` Christoph Hellwig
2021-01-25 18:15   ` Brian Foster
2021-01-26  4:52   ` [PATCH v4.1 " Darrick J. Wong
2021-01-27 16:59     ` Christoph Hellwig
2021-01-27 17:11       ` Darrick J. Wong
2021-01-23 18:52 ` [PATCH 06/11] xfs: flush eof/cowblocks if we can't reserve quota for file blocks Darrick J. Wong
2021-01-24  9:39   ` Christoph Hellwig
2021-01-25 18:16     ` Brian Foster
2021-01-25 18:57       ` Darrick J. Wong
2021-01-26 13:26         ` Brian Foster
2021-01-26 21:12           ` Darrick J. Wong
2021-01-27 14:19             ` Brian Foster
2021-01-27 17:19               ` Darrick J. Wong
2021-01-26  4:53   ` [PATCH v4.1 " Darrick J. Wong
2021-01-23 18:52 ` [PATCH 07/11] xfs: flush eof/cowblocks if we can't reserve quota for inode creation Darrick J. Wong
2021-01-26  4:55   ` [PATCH v4.1 " Darrick J. Wong
2021-01-23 18:52 ` [PATCH 08/11] xfs: flush eof/cowblocks if we can't reserve quota for chown Darrick J. Wong
2021-01-26  4:55   ` [PATCH v4.1 " Darrick J. Wong
2021-01-23 18:52 ` [PATCH 09/11] xfs: add a tracepoint for blockgc scans Darrick J. Wong
2021-01-25 18:45   ` Brian Foster
2021-01-26  4:56   ` [PATCH v4.1 " Darrick J. Wong
2021-01-23 18:52 ` [PATCH 10/11] xfs: refactor xfs_icache_free_{eof,cow}blocks call sites Darrick J. Wong
2021-01-24  9:41   ` Christoph Hellwig
2021-01-25 18:46   ` Brian Foster
2021-01-26  2:33     ` Darrick J. Wong
2021-01-23 18:53 ` [PATCH 11/11] xfs: flush speculative space allocations when we run out of space Darrick J. Wong
2021-01-24  9:48   ` Christoph Hellwig
2021-01-25 18:46     ` Brian Foster
2021-01-25 20:02     ` Darrick J. Wong
2021-01-25 21:06       ` Brian Foster
2021-01-26  0:29         ` Darrick J. Wong
2021-01-27 16:57           ` Christoph Hellwig
2021-01-27 21:00             ` Darrick J. Wong
2021-01-26  4:59   ` [PATCH v4.1 " Darrick J. Wong
  -- strict thread matches above, loose matches on Subject: below --
2021-01-28  6:02 [PATCHSET v5 00/11] xfs: try harder to reclaim space when we run out Darrick J. Wong
2021-01-28  6:02 ` [PATCH 02/11] xfs: don't stall cowblocks scan if we can't take locks Darrick J. Wong
2021-01-18 22:11 [PATCHSET v3 00/11] xfs: try harder to reclaim space when we run out Darrick J. Wong
2021-01-18 22:12 ` [PATCH 02/11] xfs: don't stall cowblocks scan if we can't take locks Darrick J. Wong
2021-01-19  6:49   ` Christoph Hellwig
