All of lore.kernel.org
 help / color / mirror / Atom feed
From: Al Viro <viro@zeniv.linux.org.uk>
To: Brian Foster <bfoster@redhat.com>
Cc: Ian Kent <raven@themaw.net>,
	"Darrick J. Wong" <djwong@kernel.org>,
	Christoph Hellwig <hch@lst.de>,
	Miklos Szeredi <miklos@szeredi.hu>,
	David Howells <dhowells@redhat.com>,
	Kernel Mailing List <linux-kernel@vger.kernel.org>,
	linux-fsdevel <linux-fsdevel@vger.kernel.org>,
	xfs <linux-xfs@vger.kernel.org>,
	Linus Torvalds <torvalds@linux-foundation.org>
Subject: Re: [PATCH] vfs: check dentry is still valid in get_link()
Date: Tue, 18 Jan 2022 19:20:35 +0000	[thread overview]
Message-ID: <YecTA9nclOowdDEu@zeniv-ca.linux.org.uk> (raw)
In-Reply-To: <YecGC06UrGrfonS0@bfoster>

On Tue, Jan 18, 2022 at 01:25:15PM -0500, Brian Foster wrote:

> If I go back to the inactive -> reclaimable grace period variant and
> also insert a start_poll_synchronize_rcu() and
> poll_state_synchronize_rcu() pair across the inactive processing
> sequence, I start seeing numbers closer to ~36k cycles. IOW, the
> xfs_inodegc_inactivate() helper looks something like this:
> 
>         if (poll_state_synchronize_rcu(ip->i_destroy_gp))
>                 xfs_inodegc_set_reclaimable(ip);
>         else
>                 call_rcu(&VFS_I(ip)->i_rcu, xfs_inodegc_set_reclaimable_callback);
> 
> ... to skip the rcu grace period if one had already passed while the
> inode sat on the inactivation queue and was processed.
> 
> However my box went haywire shortly after with rcu stall reports or
> something if I let that continue to run longer, so I'll probably have to
> look into that lockdep splat (complaining about &pag->pag_ici_lock in
> rcu context, perhaps needs to become irq safe?) or see if something else
> is busted..

Umm...  Could you (or Dave) describe where does the mainline do the
RCU delay mentioned several times in these threads, in case of
	* unlink()
	* overwriting rename()
	* open() + unlink() + close() (that one, obviously, for regular files)

The thing is, if we already do have an RCU delay in there, there might be
a better solution - making sure it happens downstream of d_drop() (in case
when dentry has other references) or dentry going negative (in case we are
holding the sole reference to it).

If we can do that, RCU'd dcache lookups won't run into inode reuse at all.
Sure, right now d_delete() comes last *and* we want the target inode to stay
around past the call of ->unlink().  But if you look at do_unlinkat() you'll
see an interesting-looking wart with ihold/iput around vfs_unlink().  Not
sure if you remember the story on that one; basically, it's "we'd rather
have possible IO on inode freeing to happen outside of the lock on parent".

nfsd and mqueue do the same thing; ksmbd does not.  OTOH, ksmbd appears to
force the "make victim go unhashed, sole reference or not".  ecryptfs
definitely does that forcing (deliberately so).

That area could use some rethinking, and if we can deal with the inode reuse
delay while we are at it...

Overwriting rename() is also interesting in that respect, of course.

I can go and try to RTFS my way through xfs iget-related code, but I'd
obviously prefer to do so with at least some overview of that thing
from the folks familiar with it.  Seeing that it's a lockless search
structure, missing something subtle there is all too easy, especially
with the lookup-by-fhandle stuff in the mix...

  reply	other threads:[~2022-01-18 19:20 UTC|newest]

Thread overview: 27+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-01-10  9:11 [PATCH] vfs: check dentry is still valid in get_link() Ian Kent
2022-01-15  6:38 ` Al Viro
2022-01-17  2:55   ` Ian Kent
2022-01-17 14:35     ` Brian Foster
2022-01-17 16:28       ` Al Viro
2022-01-17 18:10         ` Al Viro
2022-01-17 19:48           ` Al Viro
2022-01-18  1:32             ` Al Viro
2022-01-18  2:31               ` Ian Kent
2022-01-18  3:03                 ` Al Viro
2022-01-18 13:47               ` Brian Foster
2022-01-18 18:25                 ` Brian Foster
2022-01-18 19:20                   ` Al Viro [this message]
2022-01-18 20:58                     ` Brian Foster
2022-01-18  8:29           ` Christian Brauner
2022-01-18 16:04             ` Al Viro
2022-01-19  9:05               ` Christian Brauner
2022-01-17 18:42         ` Brian Foster
2022-01-18  3:00         ` Dave Chinner
2022-01-18  3:17           ` Al Viro
2022-01-18  4:12             ` Dave Chinner
2022-01-18  5:58               ` Al Viro
2022-01-18 23:25                 ` Dave Chinner
2022-01-19 14:08                   ` Brian Foster
2022-01-19 22:07                     ` Dave Chinner
2022-01-20 16:03                       ` Brian Foster
2022-01-20 16:34                         ` Brian Foster

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=YecTA9nclOowdDEu@zeniv-ca.linux.org.uk \
    --to=viro@zeniv.linux.org.uk \
    --cc=bfoster@redhat.com \
    --cc=dhowells@redhat.com \
    --cc=djwong@kernel.org \
    --cc=hch@lst.de \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-xfs@vger.kernel.org \
    --cc=miklos@szeredi.hu \
    --cc=raven@themaw.net \
    --cc=torvalds@linux-foundation.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.