Linux-XFS Archive on lore.kernel.org
From: Dave Chinner <david@fromorbit.com>
To: "Darrick J. Wong" <darrick.wong@oracle.com>
Cc: linux-xfs@vger.kernel.org
Subject: Re: [PATCH 19/24] xfs: attach inodes to the cluster buffer when dirtied
Date: Sun, 24 May 2020 08:59:44 +1000
Message-ID: <20200523225943.GL2040@dread.disaster.area> (raw)
In-Reply-To: <20200522234859.GW8230@magnolia>

On Fri, May 22, 2020 at 04:48:59PM -0700, Darrick J. Wong wrote:
> On Fri, May 22, 2020 at 01:50:24PM +1000, Dave Chinner wrote:
> > @@ -2649,53 +2650,12 @@ xfs_ifree_cluster(
> >  		bp->b_ops = &xfs_inode_buf_ops;
> >  
> >  		/*
> > -		 * Walk the inodes already attached to the buffer and mark them
> > -		 * stale. These will all have the flush locks held, so an
> > -		 * in-memory inode walk can't lock them. By marking them all
> > -		 * stale first, we will not attempt to lock them in the loop
> > -		 * below as the XFS_ISTALE flag will be set.
> > -		 */
> > -		list_for_each_entry(lip, &bp->b_li_list, li_bio_list) {
> > -			if (lip->li_type == XFS_LI_INODE) {
> > -				iip = (struct xfs_inode_log_item *)lip;
> > -				xfs_trans_ail_copy_lsn(mp->m_ail,
> > -							&iip->ili_flush_lsn,
> > -							&iip->ili_item.li_lsn);
> > -				xfs_iflags_set(iip->ili_inode, XFS_ISTALE);
> > -			}
> > -		}
> 
> Hm.  I think I'm a little confused here.  I think the consequence of
> attaching inode items to the buffer whenever we dirty the inode is that
> we no longer need to travel the inode list to set ISTALE because we know
> that the lookup loop below is sufficient to catch all of the inodes that
> are still hanging around in memory?

Yes. The issue here is that we now have inodes on this list that are
not flush locked, so we can't just walk it and change the flush
state without first holding the ILOCK to ensure the inode is not
racing with reclaim, etc.

> We don't call xfs_ifree_cluster if any of the inodes in it are
> allocated, which means that all the on-disk inodes are either (a) not
> represented in memory, in which case we don't have to stale them; or (b)
> they're in memory and dirty (because they've recently been freed).  But
> if that's true, then surely you could find them all via b_li_list?

No, we can have inodes that are free but clean in memory when we
free the cluster. Do an unlink of an inode, commit it, push the AIL,
and it gets written back and is now clean in memory with mode == 0
and state XFS_IRECLAIMABLE. That inode is not reclaimed until memory
reclaim or the background reclaimer finds it and reclaims it. Those
are the inodes we want to mark XFS_ISTALE so they get treated by
lookup correctly if the cluster is reallocated and the inode
reinstantiated before it is reclaimed from memory.

> On the other hand, it seems redundant to iterate the list /and/ do the
> lookups and we no longer need to "set it up for being staled", so the
> second loop survives?

Yeah, the second loop is used because we have to look up the cache
anyway, and there's no point in iterating the buffer list: it would
now have to do all the same "is it clean" checks and flush locking,
etc. that the cache lookup already does. A future patchset can make
the cache lookup use a gang lookup on the radix tree to optimise it
if necessary...

> > -
> > -
> > -		/*
> > -		 * For each inode in memory attempt to add it to the inode
> > -		 * buffer and set it up for being staled on buffer IO
> > -		 * completion.  This is safe as we've locked out tail pushing
> > -		 * and flushing by locking the buffer.
> > -		 *
> > -		 * We have already marked every inode that was part of a
> > -		 * transaction stale above, which means there is no point in
> > -		 * even trying to lock them.
> > +		 * Now we need to set all the cached clean inodes as XFS_ISTALE,
> > +		 * too. This requires lookups, and will skip inodes that we've
> > +		 * already marked XFS_ISTALE.
> >  		 */
> > -		for (i = 0; i < igeo->inodes_per_cluster; i++) {
> > -			ip = xfs_ifree_get_one_inode(pag, free_ip, inum + i);
> > -			if (!ip)
> > -				continue;
> > -
> > -			iip = ip->i_itemp;
> > -			spin_lock(&iip->ili_lock);
> > -			iip->ili_last_fields = iip->ili_fields;
> > -			iip->ili_fields = 0;
> > -			iip->ili_fsync_fields = 0;
> > -			spin_unlock(&iip->ili_lock);
> > -			xfs_trans_ail_copy_lsn(mp->m_ail, &iip->ili_flush_lsn,
> > -						&iip->ili_item.li_lsn);
> > -
> > -			list_add_tail(&iip->ili_item.li_bio_list,
> > -						&bp->b_li_list);
> > -
> > -			if (ip != free_ip)
> > -				xfs_iunlock(ip, XFS_ILOCK_EXCL);
> > -		}
> > +		for (i = 0; i < igeo->inodes_per_cluster; i++)
> > +			xfs_ifree_mark_inode_stale(pag, free_ip, inum + i);
> 
> I get that we're mostly just hoisting everything in the loop body to the
> end of xfs_ifree_get_one_inode, but can that be part of a separate hoist
> patch?

Ok. Not really happy about breaking it up into even more fine-grained
patches, as this patchset is already a nightmare to keep up to date
against all the random cleanups going into for-next.....

-Dave.
-- 
Dave Chinner
david@fromorbit.com

