* d_instantiate() and unlock_new_inode() order in btrfs_mkdir()
From: Eric Biggers @ 2018-04-19 0:00 UTC
To: linux-btrfs, Chris Mason; +Cc: linux-fsdevel
Hi Chris and other btrfs folks,

btrfs_mkdir() calls d_instantiate() before unlock_new_inode(), which is wrong
because it exposes the inode to lookups before it's been fully initialized.
Most filesystems get it right, but f2fs and btrfs don't. I sent an f2fs patch
(https://marc.info/?l=linux-fsdevel&m=152409178431350) and was going to send a
btrfs patch too, but in btrfs_mkdir() there is actually a comment claiming that
the existing order is intentional:

	d_instantiate(dentry, inode);
	/*
	 * mkdir is special.  We're unlocking after we call d_instantiate
	 * to avoid a race with nfsd calling d_instantiate.
	 */
	unlock_new_inode(inode);

Unfortunately, I cannot find what it is referring to. The comment was added by
commit b0d5d10f41a0 ("Btrfs: use insert_inode_locked4 for inode creation").
Chris, do you remember exactly what you had in mind when you wrote this?

And in case anyone wants it, here's a reproducer for the deadlock caused by the
current code that calls d_instantiate() before unlock_new_inode(). Note: it
needs CONFIG_DEBUG_LOCK_ALLOC=y.
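
	#include <sys/stat.h>
	#include <sys/types.h>
	#include <unistd.h>

	int main(void)
	{
		struct stat stbuf;

		if (fork() == 0) {
			/* Child: keep looking up a name inside "dir". */
			for (;;)
				stat("dir/file", &stbuf);
		} else {
			/* Parent: repeatedly create, look into, and remove "dir". */
			for (;;) {
				mkdir("dir", 0777);
				stat("dir/file", &stbuf);
				rmdir("dir");
			}
		}
	}
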
* Re: d_instantiate() and unlock_new_inode() order in btrfs_mkdir()
From: Al Viro @ 2018-04-19 0:06 UTC
To: Eric Biggers; +Cc: linux-btrfs, Chris Mason, linux-fsdevel
On Wed, Apr 18, 2018 at 05:00:29PM -0700, Eric Biggers wrote:
> Hi Chris and other btrfs folks,
>
> btrfs_mkdir() calls d_instantiate() before unlock_new_inode(), which is wrong
> because it exposes the inode to lookups before it's been fully initialized.
Huh? It *is* fully initialized by that point; what else is left to do?
* Re: d_instantiate() and unlock_new_inode() order in btrfs_mkdir()
From: Al Viro @ 2018-04-19 0:15 UTC
To: Eric Biggers; +Cc: linux-btrfs, Chris Mason, linux-fsdevel
On Thu, Apr 19, 2018 at 01:06:13AM +0100, Al Viro wrote:
> On Wed, Apr 18, 2018 at 05:00:29PM -0700, Eric Biggers wrote:
> > Hi Chris and other btrfs folks,
> >
> > btrfs_mkdir() calls d_instantiate() before unlock_new_inode(), which is wrong
> > because it exposes the inode to lookups before it's been fully initialized.
>
> Huh? It *is* fully initialized by that point; what else is left to do?
ISTR something about false positives from lockdep (with
lockdep_annotate_inode_mutex_key() called too late, perhaps?); said that, it
was a long time ago and I don't remember details at the moment... Are you
actually seeing a deadlock there or is that just lockdep complaining?
* Re: d_instantiate() and unlock_new_inode() order in btrfs_mkdir()
From: Eric Biggers @ 2018-04-19 0:54 UTC
To: Al Viro; +Cc: linux-btrfs, Chris Mason, linux-fsdevel
On Thu, Apr 19, 2018 at 01:15:59AM +0100, Al Viro wrote:
> On Thu, Apr 19, 2018 at 01:06:13AM +0100, Al Viro wrote:
> > On Wed, Apr 18, 2018 at 05:00:29PM -0700, Eric Biggers wrote:
> > > Hi Chris and other btrfs folks,
> > >
> > > btrfs_mkdir() calls d_instantiate() before unlock_new_inode(), which is wrong
> > > because it exposes the inode to lookups before it's been fully initialized.
> >
> > Huh? It *is* fully initialized by that point; what else is left to do?
>
> ISTR something about false positives from lockdep (with
> lockdep_annotate_inode_mutex_key() called too late, perhaps?); said that, it
> was a long time ago and I don't remember details at the moment... Are you
> actually seeing a deadlock there or is that just lockdep complaining?
It's an actual deadlock. unlock_new_inode() calls
lockdep_annotate_inode_mutex_key() which calls init_rwsem(), which resets
i_rwsem->count while it's read-locked by lookup_slow(). Then the unlock in
lookup_slow() makes i_rwsem->count negative, which makes it appear to be
write-locked.

So no, the inode isn't fully initialized until unlock_new_inode() has run.
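
The sequence described above can be sketched in user space. This is only a toy model, assuming a lock whose state is a single signed counter (0 when unlocked, positive while readers hold it, negative when it looks write-locked). The `toy_*` names are hypothetical and are not kernel APIs; the real rw_semaphore is more involved.

```c
#include <stdbool.h>

/* Toy stand-in for i_rwsem: only a signed reader count is modeled. */
struct toy_rwsem {
	long count;	/* 0 = unlocked, > 0 = readers, < 0 = looks write-locked */
};

/* Models init_rwsem(): unconditionally resets the count. */
void toy_init_rwsem(struct toy_rwsem *sem) { sem->count = 0; }

/* Models the uncontended read-lock and read-unlock paths. */
void toy_down_read(struct toy_rwsem *sem) { sem->count++; }
void toy_up_read(struct toy_rwsem *sem) { sem->count--; }

bool toy_looks_write_locked(const struct toy_rwsem *sem)
{
	return sem->count < 0;
}

/* d_instantiate() first: a lookup can hold the lock across the re-init. */
long toy_buggy_order(void)
{
	struct toy_rwsem i_rwsem;

	toy_init_rwsem(&i_rwsem);	/* inode allocated */
	/* d_instantiate(): the inode is now reachable by lookups */
	toy_down_read(&i_rwsem);	/* lookup_slow() read-locks: count = 1 */
	toy_init_rwsem(&i_rwsem);	/* unlock_new_inode() re-inits: count = 0 */
	toy_up_read(&i_rwsem);		/* lookup_slow() unlocks: count = -1 */
	return i_rwsem.count;
}

/* unlock_new_inode() first: the re-init happens before anyone can lock. */
long toy_fixed_order(void)
{
	struct toy_rwsem i_rwsem;

	toy_init_rwsem(&i_rwsem);	/* inode allocated */
	toy_init_rwsem(&i_rwsem);	/* unlock_new_inode() re-inits: count = 0 */
	/* d_instantiate(): the inode is now reachable by lookups */
	toy_down_read(&i_rwsem);	/* lookup_slow() read-locks: count = 1 */
	toy_up_read(&i_rwsem);		/* lookup_slow() unlocks: count = 0 */
	return i_rwsem.count;
}
```

In this model toy_buggy_order() leaves the count at -1, which toy_looks_write_locked() reports as held even though nobody holds it, while toy_fixed_order() leaves it at 0.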
Eric