linux-fsdevel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: "Eric W. Biederman" <ebiederm@xmission.com>
To: Al Viro <viro@zeniv.linux.org.uk>
Cc: Gabriel Krisman Bertazi <gabriel@krisman.be>,
	 Linus Torvalds <torvalds@linux-foundation.org>,
	 Christian Brauner <brauner@kernel.org>,
	tytso@mit.edu,  linux-f2fs-devel@lists.sourceforge.net,
	ebiggers@kernel.org,  linux-fsdevel@vger.kernel.org,
	 jaegeuk@kernel.org, linux-ext4@vger.kernel.org,
	 Miklos Szeredi <miklos@szeredi.hu>
Subject: Re: fun with d_invalidate() vs. d_splice_alias() was Re: [f2fs-dev] [PATCH v6 0/9] Support negative dentries on case-insensitive ext4 and f2fs
Date: Mon, 27 Nov 2023 12:19:09 -0600	[thread overview]
Message-ID: <87v89nkqjm.fsf@email.froward.int.ebiederm.org> (raw)
In-Reply-To: <20231127161426.GA964333@ZenIV> (Al Viro's message of "Mon, 27 Nov 2023 16:14:26 +0000")

Al Viro <viro@zeniv.linux.org.uk> writes:

> On Mon, Nov 27, 2023 at 04:03:18PM +0000, Al Viro wrote:
>> On Mon, Nov 27, 2023 at 09:47:47AM -0600, Eric W. Biederman wrote:
>> 
>> > There is a lot going on there.  I remember one of the relevant
>> > restrictions was marking dentries dont_mount, and inodes S_DEAD
>> > in unlink and rmdir.
>> > 
>> > But even without out that marking if d_invalidate is called
>> > from d_revalidate the inode and all of it's dentries must be
>> > dead because the inode is stale and most go.  There should
>> > be no resurrecting it at that point.
>> > 
>> > I suspect the most fruitful way to think of the d_invalidate vs
>> > d_splice_alias races is an unlink vs rename race.
>> > 
>> > I don't think the mechanism matters, but deeply and fundamentally
>> > if we detect a directory inode is dead we need to stick with
>> > that decision and not attempt to resurrect it with d_splice_alias.
>> 
>> Wrong.  Deeply and fundamentally we detect a dentry that does not
>> match the directory contents according to the server.
>> 
>> For example, due to rename done on server.  With object in question
>> perfectly alive there - fhandle still works, etc.
>> 
>> However, it's no longer where it used to be.  And we would bloody better
>> not have lookups for the old name result in access to that object.
>> We also should never allow the access to *new* name lead to two live
>> dentries for the same directory inode.
>> 
>> Again, this is not about rmdir() or unlink() - invalidation can happen
>> for object that is still open, still accessed and still very much alive.
>> Does that all the time for any filesystem with ->d_revalidate().
>
> Put another way, there used to be very odd song and dance in ->d_revalidate()
> instances along the lines of "we can't possibly tell the caller to invalidate
> a mountpoint"; it was racy in the best case and during the rewrite of
> d_invalidate() to teach it how to evict submounts those attempts had been
> dropped - ->d_revalidate() returning 0 does end up with mounts dissolved
> by d_invalidate() from caller.
>
> It always had been racy, starting with the checks that used to be in
> ->d_revalidate() instances way before all those changes.  So the switch
> of d_invalidate() to dissolving submounts had been a step in the right
> direction, but it's not being careful enough.
>
> Again, it's about d_invalidate() caused by pathwalk running into a dentry that
> doesn't match the reality vs. d_splice_alias() finding that it matches the
> inode we had looked up elsewhere.

My point is we should have a atomic way to decide the disposition of
such a dentry, and it's children.

Either we should decide it is useless and remove it and all of it's
children.

Or we should decide it was renamed and just handle it that way.

If we can record such a decision on the dentry or possibly on the inode
then we can resolve the race by having it be a proper race of which
comes first.

It isn't a proper delete of the inode so anything messing with the inode
and marking it S_DEAD is probably wrong.

The code could do something like mark the dentry dont_mount which should
be enough to for d_splice_alias to say oops, something is not proper
here.  Let the d_invalidate do it's thing.

Or the code could remove the dentry from inode->i_dentry and keep
d_splice alias from finding it, and it's children completely.
That is different from unhashing it.

Anyway that is my memory and my general sense of what is going on.
I help it helps.

Eric

  reply	other threads:[~2023-11-27 18:19 UTC|newest]

Thread overview: 65+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-08-16  5:07 [PATCH v6 0/9] Support negative dentries on case-insensitive ext4 and f2fs Gabriel Krisman Bertazi
2023-08-16  5:07 ` [PATCH v6 1/9] ecryptfs: Reject casefold directory inodes Gabriel Krisman Bertazi
2023-08-16  5:07 ` [PATCH v6 2/9] 9p: Split ->weak_revalidate from ->revalidate Gabriel Krisman Bertazi
2023-08-16  5:07 ` [PATCH v6 3/9] fs: Expose name under lookup to d_revalidate hooks Gabriel Krisman Bertazi
2023-11-22 20:59   ` Al Viro
2023-08-16  5:07 ` [PATCH v6 4/9] fs: Add DCACHE_CASEFOLDED_NAME flag Gabriel Krisman Bertazi
2023-11-22 20:32   ` Al Viro
2023-08-16  5:07 ` [PATCH v6 5/9] libfs: Validate negative dentries in case-insensitive directories Gabriel Krisman Bertazi
2023-11-22 20:20   ` Al Viro
2023-08-16  5:08 ` [PATCH v6 6/9] libfs: Chain encryption checks after case-insensitive revalidation Gabriel Krisman Bertazi
2023-08-16  5:08 ` [PATCH v6 7/9] libfs: Merge encrypted_ci_dentry_ops and ci_dentry_ops Gabriel Krisman Bertazi
2023-08-16  5:08 ` [PATCH v6 8/9] ext4: Enable negative dentries on case-insensitive lookup Gabriel Krisman Bertazi
2023-08-16  5:08 ` [PATCH v6 9/9] f2fs: " Gabriel Krisman Bertazi
2023-08-17 17:06 ` [PATCH v6 0/9] Support negative dentries on case-insensitive ext4 and f2fs Eric Biggers
2023-08-21 15:52   ` Christian Brauner
2023-08-21 18:53     ` Gabriel Krisman Bertazi
2023-08-22  9:03       ` Christian Brauner
2023-10-24 22:20         ` Gabriel Krisman Bertazi
2023-10-25 13:32 ` Christian Brauner
2023-10-25 15:19   ` Gabriel Krisman Bertazi
2023-11-19 23:11   ` [f2fs-dev] " Gabriel Krisman Bertazi
     [not found]   ` <655a9634.630a0220.d50d7.5063SMTPIN_ADDED_BROKEN@mx.google.com>
2023-11-20 15:06     ` Christian Brauner
2023-11-20 16:59       ` Gabriel Krisman Bertazi
2023-11-20 18:07       ` Linus Torvalds
2023-11-21  2:02         ` Theodore Ts'o
2023-11-21  2:29           ` Linus Torvalds
2023-11-21  3:03             ` Linus Torvalds
2023-11-21  5:12               ` Theodore Ts'o
2023-11-22 21:04                 ` Al Viro
2023-11-21  2:27         ` Al Viro
2023-11-22 21:19           ` Al Viro
2023-11-23  0:18             ` Linus Torvalds
2023-11-23  5:09               ` Al Viro
2023-11-23 15:57               ` Gabriel Krisman Bertazi
2023-11-23 17:12                 ` Al Viro
2023-11-23 17:37                   ` Gabriel Krisman Bertazi
2023-11-23 18:24                     ` Al Viro
2023-11-23 19:06                       ` Gabriel Krisman Bertazi
2023-11-23 19:53                         ` Al Viro
2023-11-23 20:15                           ` Al Viro
2023-11-24 15:20                           ` Gabriel Krisman Bertazi
2023-11-28  0:02                             ` Al Viro
2023-11-23 21:52                         ` Al Viro
2023-11-24 15:22                           ` Gabriel Krisman Bertazi
2023-11-25 22:01                             ` Al Viro
2023-11-26  4:52                               ` Al Viro
2023-11-26 18:41                                 ` fun with d_invalidate() vs. d_splice_alias() was " Al Viro
2023-11-27  6:38                                   ` Al Viro
2023-11-27 15:47                                     ` Eric W. Biederman
2023-11-27 16:01                                       ` Eric W. Biederman
2023-11-27 17:25                                         ` Al Viro
2023-11-27 18:26                                           ` Al Viro
2023-11-27 16:03                                       ` Al Viro
2023-11-27 16:14                                         ` Al Viro
2023-11-27 18:19                                           ` Eric W. Biederman [this message]
2023-11-27 18:43                                             ` Al Viro
2023-11-27 16:33                                     ` Christian Brauner
2023-11-29  4:53                                     ` Al Viro
2023-11-29 10:21                                       ` Christian Brauner
2023-11-29 15:19                                       ` Eric W. Biederman
     [not found]               ` <655f7665.df0a0220.58a21.e84fSMTPIN_ADDED_BROKEN@mx.google.com>
2023-11-23 16:41                 ` Linus Torvalds
2023-11-23  1:12             ` Al Viro
2023-11-23  1:22               ` Al Viro
2023-11-22  3:30         ` Gabriel Krisman Bertazi
2024-01-16 19:02 ` patchwork-bot+f2fs

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=87v89nkqjm.fsf@email.froward.int.ebiederm.org \
    --to=ebiederm@xmission.com \
    --cc=brauner@kernel.org \
    --cc=ebiggers@kernel.org \
    --cc=gabriel@krisman.be \
    --cc=jaegeuk@kernel.org \
    --cc=linux-ext4@vger.kernel.org \
    --cc=linux-f2fs-devel@lists.sourceforge.net \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=miklos@szeredi.hu \
    --cc=torvalds@linux-foundation.org \
    --cc=tytso@mit.edu \
    --cc=viro@zeniv.linux.org.uk \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).