All of lore.kernel.org
 help / color / mirror / Atom feed
From: "Darrick J. Wong" <djwong@kernel.org>
To: Dave Chinner <david@fromorbit.com>
Cc: linux-xfs@vger.kernel.org
Subject: Re: [PATCH 1/7] xfs: take the ILOCK when accessing the inode core
Date: Tue, 4 Jan 2022 17:38:01 -0800	[thread overview]
Message-ID: <20220105013801.GG656707@magnolia> (raw)
In-Reply-To: <20220105000947.GK945095@dread.disaster.area>

On Wed, Jan 05, 2022 at 11:09:47AM +1100, Dave Chinner wrote:
> On Thu, Dec 16, 2021 at 03:56:09PM +1100, Dave Chinner wrote:
> > On Wed, Dec 15, 2021 at 05:09:21PM -0800, Darrick J. Wong wrote:
> > > From: Darrick J. Wong <djwong@kernel.org>
> > > 
> > > I was poking around in the directory code while diagnosing online fsck
> > > bugs, and noticed that xfs_readdir doesn't actually take the directory
> > > ILOCK when it calls xfs_dir2_isblock.  xfs_dir_open most probably loaded
> > > the data fork mappings
> > 
> > Yup, that is pretty much guaranteed. If the inode is extent or btree form as the
> > extent count will be non-zero, hence we can only get to the
> > xfs_dir2_isblock() check if the inode has moved from local to block
> > form between the open and xfs_dir2_isblock() get in the getdents
> > code.
> > 
> > > and the VFS took i_rwsem (aka IOLOCK_SHARED) so
> > > we're protected against writer threads, but we really need to follow the
> > > locking model like we do in other places.  The same applies to the
> > > shortform getdents function.
> > 
> > Locking rules should be the same as xfs_dir_lookup().....
> 
> Ok, I assumed the locking in xfs_dir_lookup() is optimal....
> 
> .... which it turns out it isn't. All calls to xfs_dir_lookup()
> occur with the directory locked at the VFS level, so the internal
> contents of the directory can never change during a lookup. Hence
> holding the ILOCK across the entire lookup is both unnecessary and
> excessive.
> 
> What xfs_dir_lookup() should be doing is what xfs_readdir() is
> largely already doing - just locking the ILOCK around buffer read
> operations when we are mapping directory offsets to physical disk
> locations and reading them from disk.  Changing this is a
> significant set of changes, so it's not something that needs to be
> done right now.
> 
> However, we still need to protect the xfs_dir2_isblock() lookup call
> in xfs_readdir().
> 
> > > While we're at it, clean up the somewhat strange structure of this
> > > function.
> > > 
> > > Signed-off-by: Darrick J. Wong <djwong@kernel.org>
> > > ---
> > >  fs/xfs/xfs_dir2_readdir.c |   28 +++++++++++++++++-----------
> > >  1 file changed, 17 insertions(+), 11 deletions(-)
> > > 
> > > 
> > > diff --git a/fs/xfs/xfs_dir2_readdir.c b/fs/xfs/xfs_dir2_readdir.c
> > > index 8310005af00f..25560151c273 100644
> > > --- a/fs/xfs/xfs_dir2_readdir.c
> > > +++ b/fs/xfs/xfs_dir2_readdir.c
> > > @@ -507,8 +507,9 @@ xfs_readdir(
> > >  	size_t			bufsize)
> > >  {
> > >  	struct xfs_da_args	args = { NULL };
> > > -	int			rval;
> > > -	int			v;
> > > +	unsigned int		lock_mode;
> > > +	int			error;
> > > +	int			isblock;
> > >  
> > >  	trace_xfs_readdir(dp);
> > >  
> > > @@ -522,14 +523,19 @@ xfs_readdir(
> > >  	args.geo = dp->i_mount->m_dir_geo;
> > >  	args.trans = tp;
> > >  
> > > -	if (dp->i_df.if_format == XFS_DINODE_FMT_LOCAL)
> > > -		rval = xfs_dir2_sf_getdents(&args, ctx);
> > > -	else if ((rval = xfs_dir2_isblock(&args, &v)))
> > > -		;
> > > -	else if (v)
> > > -		rval = xfs_dir2_block_getdents(&args, ctx);
> > > -	else
> > > -		rval = xfs_dir2_leaf_getdents(&args, ctx, bufsize);
> > > +	lock_mode = xfs_ilock_data_map_shared(dp);
> > > +	if (dp->i_df.if_format == XFS_DINODE_FMT_LOCAL) {
> > > +		xfs_iunlock(dp, lock_mode);
> > > +		return xfs_dir2_sf_getdents(&args, ctx);
> > > +	}
> 
> Directory inode format cannot change here, so we don't need to
> hold the ILOCK at all to do shortform checks.

Ok.

> > >  
> > > -	return rval;
> > > +	error = xfs_dir2_isblock(&args, &isblock);
> > > +	xfs_iunlock(dp, lock_mode);
> > > +	if (error)
> > > +		return error;
> > > +
> > > +	if (isblock)
> > > +		return xfs_dir2_block_getdents(&args, ctx);
> 
> Can the xfs_dir2_isblock() call be moved into
> xfs_dir2_block_getdents() where it already takes the ILOCK to read
> the block?

Yeah.

> > > +
> > > +	return xfs_dir2_leaf_getdents(&args, ctx, bufsize);
> 
> Otherwise this patch is correct and this is where we should start
> fixing the directory locking mess...

<nod>

--D

> Cheers,
> 
> Dave.
> -- 
> Dave Chinner
> david@fromorbit.com

  reply	other threads:[~2022-01-05  1:38 UTC|newest]

Thread overview: 32+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-12-16  1:09 [PATCHSET 0/7] xfs: random fixes for 5.17 Darrick J. Wong
2021-12-16  1:09 ` [PATCH 1/7] xfs: take the ILOCK when accessing the inode core Darrick J. Wong
2021-12-16  4:56   ` Dave Chinner
2021-12-17 18:59     ` Darrick J. Wong
2021-12-21  1:08       ` Darrick J. Wong
2021-12-21  5:10         ` Dave Chinner
2022-01-04  1:19           ` Darrick J. Wong
2022-01-05  0:09     ` Dave Chinner
2022-01-05  1:38       ` Darrick J. Wong [this message]
2021-12-16  1:09 ` [PATCH 2/7] xfs: shut down filesystem if we xfs_trans_cancel with deferred work items Darrick J. Wong
2021-12-16  4:57   ` Dave Chinner
2021-12-24  7:16   ` Christoph Hellwig
2021-12-16  1:09 ` [PATCH 3/7] xfs: fix a bug in the online fsck directory leaf1 bestcount check Darrick J. Wong
2021-12-16  5:05   ` Dave Chinner
2021-12-16 19:25     ` Darrick J. Wong
2021-12-16 21:17       ` Dave Chinner
2021-12-16 21:40         ` Darrick J. Wong
2021-12-16 22:04   ` Dave Chinner
2021-12-24  7:17   ` Christoph Hellwig
2021-12-16  1:09 ` [PATCH 4/7] xfs: prevent UAF in xfs_log_item_in_current_chkpt Darrick J. Wong
2021-12-16  4:36   ` Dave Chinner
2021-12-16 16:35     ` Darrick J. Wong
2021-12-16  1:09 ` [PATCH 5/7] xfs: fix quotaoff mutex usage now that we don't support disabling it Darrick J. Wong
2021-12-16  5:07   ` Dave Chinner
2021-12-24  7:17   ` Christoph Hellwig
2021-12-16  1:09 ` [PATCH 6/7] xfs: don't expose internal symlink metadata buffers to the vfs Darrick J. Wong
2021-12-16  5:11   ` Dave Chinner
2021-12-17  2:58     ` Ian Kent
2021-12-24  7:22   ` Christoph Hellwig
2021-12-16  1:09 ` [PATCH 7/7] xfs: only run COW extent recovery when there are no live extents Darrick J. Wong
2021-12-16  4:41   ` Dave Chinner
2021-12-24  7:18   ` Christoph Hellwig

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20220105013801.GG656707@magnolia \
    --to=djwong@kernel.org \
    --cc=david@fromorbit.com \
    --cc=linux-xfs@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.