All of lore.kernel.org
 help / color / mirror / Atom feed
From: "Darrick J. Wong" <djwong@kernel.org>
To: Dave Chinner <david@fromorbit.com>
Cc: Jan Kara <jack@suse.cz>, Al Viro <viro@zeniv.linux.org.uk>,
	linux-fsdevel@vger.kernel.org, linux-ext4@vger.kernel.org,
	Ted Tso <tytso@mit.edu>,
	linux-xfs@vger.kernel.org
Subject: Re: Locking issue with directory renames
Date: Fri, 24 Feb 2023 19:46:57 -0800	[thread overview]
Message-ID: <Y/mEsfyhNCs8orCY@magnolia> (raw)
In-Reply-To: <20230117214457.GG360264@dread.disaster.area>

On Wed, Jan 18, 2023 at 08:44:57AM +1100, Dave Chinner wrote:
> On Tue, Jan 17, 2023 at 01:37:35PM +0100, Jan Kara wrote:
> > Hello!
> > 
> > I've some across an interesting issue that was spotted by syzbot [1]. The
> > report is against UDF but AFAICS the problem exists for ext4 as well and
> > possibly other filesystems. The problem is the following: When we are
> > renaming directory 'dir' say rename("foo/dir", "bar/") we lock 'foo' and
> > 'bar' but 'dir' is unlocked because the locking done by vfs_rename() is
> > 
> >         if (!is_dir || (flags & RENAME_EXCHANGE))
> >                 lock_two_nondirectories(source, target);
> >         else if (target)
> >                 inode_lock(target);
> > 
> > However some filesystems (e.g. UDF but ext4 as well, I suspect XFS may be
> > hurt by this as well because it converts among multiple dir formats) need
> > to update parent pointer in 'dir' and nothing protects this update against
> > a race with someone else modifying 'dir'. Now this is mostly harmless
> > because the parent pointer (".." directory entry) is at the beginning of
> > the directory and stable however if for example the directory is converted
> > from packed "in-inode" format to "expanded" format as a result of
> > concurrent operation on 'dir', the filesystem gets corrupted (or crashes as
> > in case of UDF).
> 
> No, xfs_rename() does not have this problem - we pass four inodes to
> the function - the source directory, source inode, destination
> directory and destination inode.

Um, I think it does have this problem.  xfs_readdir thinks it can parse
a shortform inode without taking the ILOCK:

	if (dp->i_df.if_format == XFS_DINODE_FMT_LOCAL)
		return xfs_dir2_sf_getdents(&args, ctx);

	lock_mode = xfs_ilock_data_map_shared(dp);
	error = xfs_dir2_isblock(&args, &isblock);

So xfs_dir2_sf_replace can rewrite the shortform structure (or even
convert it to block format!) while readdir is accessing it.  Or am I
mising something?

--D

> In the above case, "dir/" is passed to XFs as the source inode - the
> src_dir is "foo/", the target dir is "bar/" and the target inode is
> null. src_dir != target_dir, so we set the "new_parent" flag. the
> srouce inode is a directory, so we set the src_is_directory flag,
> too.
> 
> We lock all three inodes that are passed. We do various things, then
> run:
> 
>         if (new_parent && src_is_directory) {
>                 /*
>                  * Rewrite the ".." entry to point to the new
>                  * directory.
>                  */
>                 error = xfs_dir_replace(tp, src_ip, &xfs_name_dotdot,
>                                         target_dp->i_ino, spaceres);
>                 ASSERT(error != -EEXIST);
>                 if (error)
>                         goto out_trans_cancel;
>         }
> 
> which replaces the ".." entry in source inode atomically whilst it
> is locked.  Any directory format changes that occur during the
> rename are done while the ILOCK is held, so they appear atomic to
> outside observers that are trying to parse the directory structure
> (e.g. readdir).
> 
> > So we'd need to lock 'source' if it is a directory.
> 
> Yup, and XFS goes further by always locking the source inode in a
> rename, even if it is not a directory. This ensures the inode being
> moved cannot have it's metadata otherwise modified whilst the rename
> is in progress, even if that modification would have no impact on
> the rename. It's a pretty strict interpretation of "rename is an
> atomic operation", but it avoids accidentally missing nasty corner
> cases like the one described above...
> 
> > Ideally this would
> > happen in VFS as otherwise I bet a lot of filesystems will get this wrong
> > so could vfs_rename() lock 'source' if it is a dir as well? Essentially
> > this would amount to calling lock_two_nondirectories(source, target)
> > unconditionally but that would become a serious misnomer ;). Al, any
> > thought?
> 
> XFS just has a function that allows for an arbitrary number of
> inodes to be locked in the given order: xfs_lock_inodes(). For
> rename, the lock order is determined by xfs_sort_for_rename().
> 
> Cheers,
> 
> Dave.
> -- 
> Dave Chinner
> david@fromorbit.com

  parent reply	other threads:[~2023-02-25  3:47 UTC|newest]

Thread overview: 13+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-01-17 12:37 Locking issue with directory renames Jan Kara
2023-01-17 16:57 ` Al Viro
2023-01-18  9:10   ` Jan Kara
2023-01-18 16:30     ` Al Viro
2023-01-18 18:41       ` Jan Kara
2023-01-17 21:44 ` Dave Chinner
2023-01-18  8:56   ` Jan Kara
2023-02-24  0:19   ` Kent Overstreet
2023-02-25  3:46   ` Darrick J. Wong [this message]
2023-02-28  1:58     ` Dave Chinner
2023-03-01 12:36       ` Jan Kara
2023-03-02  0:30         ` Dave Chinner
2023-03-02  9:21           ` Jan Kara

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=Y/mEsfyhNCs8orCY@magnolia \
    --to=djwong@kernel.org \
    --cc=david@fromorbit.com \
    --cc=jack@suse.cz \
    --cc=linux-ext4@vger.kernel.org \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-xfs@vger.kernel.org \
    --cc=tytso@mit.edu \
    --cc=viro@zeniv.linux.org.uk \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.