From: Jens Axboe <axboe@kernel.dk>
To: Al Viro <viro@zeniv.linux.org.uk>
Cc: linux-fsdevel@vger.kernel.org, torvalds@linux-foundation.org
Subject: Re: [PATCH 1/2] fs: add support for LOOKUP_NONBLOCK
Date: Fri, 11 Dec 2020 08:57:26 -0700
Message-ID: <bef3f905-f6b7-1134-7ca9-ff9385d6bf86@kernel.dk>
In-Reply-To: <20201211023555.GV3579531@ZenIV.linux.org.uk>

On 12/10/20 7:35 PM, Al Viro wrote:
> On Thu, Dec 10, 2020 at 01:01:13PM -0700, Jens Axboe wrote:
>> io_uring always punts opens to async context, since there's no control
>> over whether the lookup blocks or not. Add LOOKUP_NONBLOCK to support
>> just doing the fast RCU based lookups, which we know will not block. If
>> we can do a cached path resolution of the filename, then we don't have
>> to always punt lookups for a worker.
>>
>> During path resolution, we always do LOOKUP_RCU first. If that fails and
>> we terminate LOOKUP_RCU, then fail a LOOKUP_NONBLOCK attempt as well.
> 
> In effect you are adding a mode where
> 	* unlazy would fail, except when done from complete_walk()
> 	* ->d_revalidate() wouldn't be attempted at all (not even with LOOKUP_RCU)
> 	* ... but ->get_link() in RCU mode would
> 	* ... and so would everything done after complete_walk() in
> do_open(), very much including the joys like mnt_want_write() (i.e. waiting for
> frozen fs to thaw), handling O_TRUNC, calling ->open() itself...
> 
> So this "not punting lookups for a worker" looks fishy as hell - if you care
> about blocking operations, you haven't really won anything.
> 
> And why exactly is the RCU case of ->d_revalidate() worth buggering off (it
> really can't block - it's called under rcu_read_lock() and it does *not*
> drop it)?
> 
> _IF_ for some theoretical exercise you want to do "lookup without dropping
> out of RCU", just add a flag that has unlazy_walk() fail.  With -ECHILD.
> Strip it away in complete_walk() and have path_init() with that flag
> and without LOOKUP_RCU fail with -EAGAIN.  All there is to it.

Thanks Al, that makes for an easier implementation. I like that suggestion;
it boils the patch down to just three hunks (see below).

For io_uring, the concept is just to perform the fast path inline. The
RCU lookup serves that purpose nicely - if that fails, then we expect
to take the latency hit of going async.
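
As an illustration only (not part of the posted patch), the "try the fast
path inline, otherwise go async" idea could be sketched in plain userspace C
roughly as follows. Every name here (toy_dcache, toy_lookup_cached,
toy_lookup_blocking, toy_open) is hypothetical and stands in for the real
io_uring and namei code, which this sketch does not attempt to reproduce.

```c
#include <errno.h>
#include <string.h>

/* Hypothetical toy "dcache": names that a cached lookup can resolve. */
static const char *toy_dcache[] = { "/etc/passwd", "/tmp" };

/*
 * Hypothetical stand-in for an RCU-only lookup: it never blocks. It either
 * resolves the name from the cache or returns -EAGAIN so the caller can
 * retry from a context that is allowed to block.
 */
static int toy_lookup_cached(const char *name)
{
	for (size_t i = 0; i < sizeof(toy_dcache) / sizeof(toy_dcache[0]); i++)
		if (strcmp(toy_dcache[i], name) == 0)
			return (int)i;	/* cache index used as a fake handle */
	return -EAGAIN;
}

/* Hypothetical slow path; in io_uring this would run in a worker. */
static int toy_lookup_blocking(const char *name)
{
	(void)name;
	return 100;	/* arbitrary fake handle for the slow path */
}

enum toy_path { TOY_INLINE, TOY_ASYNC };

/* Try the non-blocking lookup first; punt only if it reports -EAGAIN. */
static int toy_open(const char *name, enum toy_path *taken)
{
	int ret = toy_lookup_cached(name);

	if (ret != -EAGAIN) {
		*taken = TOY_INLINE;
		return ret;
	}
	*taken = TOY_ASYNC;
	return toy_lookup_blocking(name);
}
```

In this model, a name found in the toy cache completes inline, while any
other name pays the cost of the (simulated) async path.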

> It still leaves you with fuckloads of blocking operations (and that's
> "blocking" with "until admin thaws the damn filesystem several hours
> down the road") after complete_walk(), though.

But that's true (and expected) for any open that isn't non-blocking.


diff --git a/fs/namei.c b/fs/namei.c
index d7952f863e79..d49c72e34c6e 100644
--- a/fs/namei.c
+++ b/fs/namei.c
@@ -686,6 +686,8 @@ static bool unlazy_walk(struct nameidata *nd)
 	BUG_ON(!(nd->flags & LOOKUP_RCU));
 
 	nd->flags &= ~LOOKUP_RCU;
+	if (nd->flags & LOOKUP_NONBLOCK)
+		goto out1;
 	if (unlikely(!legitimize_links(nd)))
 		goto out1;
 	if (unlikely(!legitimize_path(nd, &nd->path, nd->seq)))
@@ -792,6 +794,7 @@ static int complete_walk(struct nameidata *nd)
 		 */
 		if (!(nd->flags & (LOOKUP_ROOT | LOOKUP_IS_SCOPED)))
 			nd->root.mnt = NULL;
+		nd->flags &= ~LOOKUP_NONBLOCK;
 		if (unlikely(unlazy_walk(nd)))
 			return -ECHILD;
 	}
@@ -2209,6 +2212,9 @@ static const char *path_init(struct nameidata *nd, unsigned flags)
 
 	if (!*s)
 		flags &= ~LOOKUP_RCU;
+	/* LOOKUP_NONBLOCK requires RCU, ask caller to retry */
+	if ((flags & (LOOKUP_RCU | LOOKUP_NONBLOCK)) == LOOKUP_NONBLOCK)
+		return ERR_PTR(-EAGAIN);
 	if (flags & LOOKUP_RCU)
 		rcu_read_lock();
 

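For readers who want to trace the flag handling outside the kernel, here is
a minimal userspace model of the three hunks above. It is a sketch under
stated assumptions: the TOY_* flag values, struct toy_nameidata, and the
toy_* functions are all made up for illustration, and the return convention
(0 on success, negative errno on failure) is a modeling choice rather than
the kernel's actual signatures.

```c
#include <errno.h>

/* Illustrative flag values only; these are not the kernel's definitions. */
#define TOY_LOOKUP_RCU      0x1u
#define TOY_LOOKUP_NONBLOCK 0x2u

/* Hypothetical, heavily reduced stand-in for struct nameidata. */
struct toy_nameidata {
	unsigned flags;
};

/* Models the path_init() hunk: NONBLOCK without RCU asks for a retry. */
static int toy_path_init(struct toy_nameidata *nd, unsigned flags)
{
	if ((flags & (TOY_LOOKUP_RCU | TOY_LOOKUP_NONBLOCK)) == TOY_LOOKUP_NONBLOCK)
		return -EAGAIN;
	nd->flags = flags;
	return 0;
}

/* Models the unlazy_walk() hunk: leaving RCU mode fails under NONBLOCK. */
static int toy_unlazy_walk(struct toy_nameidata *nd)
{
	nd->flags &= ~TOY_LOOKUP_RCU;
	if (nd->flags & TOY_LOOKUP_NONBLOCK)
		return -ECHILD;
	/* The real code legitimizes references here and can still fail. */
	return 0;
}

/* Models the complete_walk() hunk: strip NONBLOCK before the final unlazy. */
static int toy_complete_walk(struct toy_nameidata *nd)
{
	if (nd->flags & TOY_LOOKUP_RCU) {
		nd->flags &= ~TOY_LOOKUP_NONBLOCK;
		if (toy_unlazy_walk(nd))
			return -ECHILD;
	}
	return 0;
}
```

In this model, a mid-walk unlazy with TOY_LOOKUP_NONBLOCK set fails, while
the unlazy done from toy_complete_walk() is allowed to succeed because the
flag has been cleared first.
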
-- 
Jens Axboe

