linux-fsdevel.vger.kernel.org archive mirror
* d_instantiate() and unlock_new_inode() order in btrfs_mkdir()
@ 2018-04-19  0:00 Eric Biggers
  2018-04-19  0:06 ` Al Viro
  0 siblings, 1 reply; 4+ messages in thread
From: Eric Biggers @ 2018-04-19  0:00 UTC
  To: linux-btrfs, Chris Mason; +Cc: linux-fsdevel

Hi Chris and other btrfs folks,

btrfs_mkdir() calls d_instantiate() before unlock_new_inode(), which is wrong
because it exposes the inode to lookups before it's been fully initialized.
Most filesystems get it right, but f2fs and btrfs don't.  I sent an f2fs patch
(https://marc.info/?l=linux-fsdevel&m=152409178431350) and was going to send a
btrfs patch too, but in btrfs_mkdir() there is actually a comment claiming that
the existing order is intentional:

        d_instantiate(dentry, inode);
        /*
         * mkdir is special.  We're unlocking after we call d_instantiate
         * to avoid a race with nfsd calling d_instantiate.
         */
        unlock_new_inode(inode);

Unfortunately, I cannot find what it is referring to.  The comment was added by
commit b0d5d10f41a0 ("Btrfs: use insert_inode_locked4 for inode creation").
Chris, do you remember exactly what you had in mind when you wrote this?

And in case anyone wants it, here's a reproducer for the deadlock caused by the
current code that calls d_instantiate() before unlock_new_inode().  Note: it
needs CONFIG_DEBUG_LOCK_ALLOC=y.

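	#include <sys/stat.h>
	#include <unistd.h>

	int main(void)
	{
		struct stat stbuf;

		if (fork() == 0) {
			/* Child: keep looking up a name inside "dir". */
			for (;;)
				stat("dir/file", &stbuf);
		} else {
			/* Parent: create, look up inside, and remove "dir". */
			for (;;) {
				mkdir("dir", 0777);
				stat("dir/file", &stbuf);
				rmdir("dir");
			}
		}
	}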

* Re: d_instantiate() and unlock_new_inode() order in btrfs_mkdir()
  2018-04-19  0:00 d_instantiate() and unlock_new_inode() order in btrfs_mkdir() Eric Biggers
@ 2018-04-19  0:06 ` Al Viro
  2018-04-19  0:15   ` Al Viro
  0 siblings, 1 reply; 4+ messages in thread
From: Al Viro @ 2018-04-19  0:06 UTC
  To: Eric Biggers; +Cc: linux-btrfs, Chris Mason, linux-fsdevel

On Wed, Apr 18, 2018 at 05:00:29PM -0700, Eric Biggers wrote:
> Hi Chris and other btrfs folks,
> 
> btrfs_mkdir() calls d_instantiate() before unlock_new_inode(), which is wrong
> because it exposes the inode to lookups before it's been fully initialized.

Huh?  It *is* fully initialized by that point; what else is left to do?

* Re: d_instantiate() and unlock_new_inode() order in btrfs_mkdir()
  2018-04-19  0:06 ` Al Viro
@ 2018-04-19  0:15   ` Al Viro
  2018-04-19  0:54     ` Eric Biggers
  0 siblings, 1 reply; 4+ messages in thread
From: Al Viro @ 2018-04-19  0:15 UTC
  To: Eric Biggers; +Cc: linux-btrfs, Chris Mason, linux-fsdevel

On Thu, Apr 19, 2018 at 01:06:13AM +0100, Al Viro wrote:
> On Wed, Apr 18, 2018 at 05:00:29PM -0700, Eric Biggers wrote:
> > Hi Chris and other btrfs folks,
> > 
> > btrfs_mkdir() calls d_instantiate() before unlock_new_inode(), which is wrong
> > because it exposes the inode to lookups before it's been fully initialized.
> 
> Huh?  It *is* fully initialized by that point; what else is left to do?

	ISTR something about false positives from lockdep (with
lockdep_annotate_inode_mutex_key() called too late, perhaps?); said that, it
was a long time ago and I don't remember details at the moment...  Are you
actually seeing a deadlock there or is that just lockdep complaining?

* Re: d_instantiate() and unlock_new_inode() order in btrfs_mkdir()
  2018-04-19  0:15   ` Al Viro
@ 2018-04-19  0:54     ` Eric Biggers
  0 siblings, 0 replies; 4+ messages in thread
From: Eric Biggers @ 2018-04-19  0:54 UTC
  To: Al Viro; +Cc: linux-btrfs, Chris Mason, linux-fsdevel

On Thu, Apr 19, 2018 at 01:15:59AM +0100, Al Viro wrote:
> On Thu, Apr 19, 2018 at 01:06:13AM +0100, Al Viro wrote:
> > On Wed, Apr 18, 2018 at 05:00:29PM -0700, Eric Biggers wrote:
> > > Hi Chris and other btrfs folks,
> > > 
> > > btrfs_mkdir() calls d_instantiate() before unlock_new_inode(), which is wrong
> > > because it exposes the inode to lookups before it's been fully initialized.
> > 
> > Huh?  It *is* fully initialized by that point; what else is left to do?
> 
> 	ISTR something about false positives from lockdep (with
> lockdep_annotate_inode_mutex_key() called too late, perhaps?); said that, it
> was a long time ago and I don't remember details at the moment...  Are you
> actually seeing a deadlock there or is that just lockdep complaining?

It's an actual deadlock.  unlock_new_inode() calls
lockdep_annotate_inode_mutex_key() which calls init_rwsem(), which resets
i_rwsem->count while it's read-locked by lookup_slow().  Then the unlock in
lookup_slow() makes i_rwsem->count negative, which makes it appear to be
write-locked.

So no, the inode isn't fully initialized until unlock_new_inode() has run.

Eric
