All of lore.kernel.org
 help / color / mirror / Atom feed
From: Al Viro <viro@zeniv.linux.org.uk>
To: Jens Axboe <axboe@kernel.dk>
Cc: Linus Torvalds <torvalds@linux-foundation.org>,
	linux-fsdevel <linux-fsdevel@vger.kernel.org>
Subject: Re: [PATCH 1/2] fs: add support for LOOKUP_NONBLOCK
Date: Fri, 11 Dec 2020 17:20:54 +0000	[thread overview]
Message-ID: <20201211172054.GX3579531@ZenIV.linux.org.uk> (raw)
In-Reply-To: <89f96b42-9d58-cd46-e157-758e91269d89@kernel.dk>

On Fri, Dec 11, 2020 at 09:05:26AM -0700, Jens Axboe wrote:

> > Finally, I really wonder what is that for; if you are in conditions when
> > you really don't want to risk going to sleep, you do *NOT* want to
> > do mnt_want_write().  Or ->open().  Or truncate().  Or, for Cthulhu
> > sake, IMA hash calculation.
> 
> I just want to do the RCU side lookup, that is all. That's my fast path.
> If that doesn't work, then we'll go through the motions of pushing this
> to a context that allows blocking open.

Explain, please.  What's the difference between blocking in a lookup and
blocking in truncate?  Either your call site is fine with a potentially
long sleep, or it is not; I don't understand what makes one source of
that behaviour different from another.

"Fast path" in context like "we can't sleep here, but often enough we
won't need to; here's a function that will bail out rather than blocking,
let's call that and go through offload to helper thread in rare case
when it does bail out" does make sense; what you are proposing to do
here is rather different and AFAICS saying "that's my fast path" is
meaningless here.

I really do not understand what it is that you are trying to achieve;
fastpath lookup part would be usable on its own, but mixed with
the rest of do_open() (as well as the open_last_lookups(), BTW)
it does not give the caller any useful warranties.

Theoretically it could be amended into something usable, but you
would need to make do_open(), open_last_lookups() (as well as
do_tmpfile()) honour your flag, with similar warranties provided
to caller.

AFAICS, without that part it is pretty much worthless.  And details
of what you are going to do in the missing bits *do* matter - unlike the
pathwalk side (which is trivial) it has potential for being very
messy.  I want to see _that_ before we commit to going there, and
a user-visible flag to openat2() makes a very strong commitment.

PS: just to make sure we are on the same page - O_NDELAY will *NOT*
suffice here.  I apologize if that's obvious to you, but I think
it's worth spelling out explicitly.

PPS: regarding unlazy_walk() change...  If we go that way, I would
probably changed the name to "try_to_unlazy" and inverted the meaning
of return value.  0 for success, -E... for failure is fine, but
false for success, true for failure is asking for recurring confusion.
IOW, I would rather do something like (completely untested)
diff --git a/fs/namei.c b/fs/namei.c
index d4a6dd772303..5abd1de11306 100644
--- a/fs/namei.c
+++ b/fs/namei.c
@@ -669,17 +669,17 @@ static bool legitimize_root(struct nameidata *nd)
  */
 
 /**
- * unlazy_walk - try to switch to ref-walk mode.
+ * try_to_unlazy - try to switch to ref-walk mode.
  * @nd: nameidata pathwalk data
- * Returns: 0 on success, -ECHILD on failure
+ * Returns: true on success, false on failure
  *
- * unlazy_walk attempts to legitimize the current nd->path and nd->root
+ * try_to_unlazy attempts to legitimize the current nd->path and nd->root
  * for ref-walk mode.
  * Must be called from rcu-walk context.
- * Nothing should touch nameidata between unlazy_walk() failure and
+ * Nothing should touch nameidata between try_to_unlazy() failure and
  * terminate_walk().
  */
-static int unlazy_walk(struct nameidata *nd)
+static bool try_to_unlazy(struct nameidata *nd)
 {
 	struct dentry *parent = nd->path.dentry;
 
@@ -694,14 +694,14 @@ static int unlazy_walk(struct nameidata *nd)
 		goto out;
 	rcu_read_unlock();
 	BUG_ON(nd->inode != parent->d_inode);
-	return 0;
+	return true;
 
 out1:
 	nd->path.mnt = NULL;
 	nd->path.dentry = NULL;
 out:
 	rcu_read_unlock();
-	return -ECHILD;
+	return false;
 }
 
 /**
@@ -792,7 +792,7 @@ static int complete_walk(struct nameidata *nd)
 		 */
 		if (!(nd->flags & (LOOKUP_ROOT | LOOKUP_IS_SCOPED)))
 			nd->root.mnt = NULL;
-		if (unlikely(unlazy_walk(nd)))
+		if (unlikely(!try_to_unlazy(nd)))
 			return -ECHILD;
 	}
 
@@ -1466,7 +1466,7 @@ static struct dentry *lookup_fast(struct nameidata *nd,
 		unsigned seq;
 		dentry = __d_lookup_rcu(parent, &nd->last, &seq);
 		if (unlikely(!dentry)) {
-			if (unlazy_walk(nd))
+			if (!try_to_unlazy(nd))
 				return ERR_PTR(-ECHILD);
 			return NULL;
 		}
@@ -1567,10 +1567,8 @@ static inline int may_lookup(struct nameidata *nd)
 {
 	if (nd->flags & LOOKUP_RCU) {
 		int err = inode_permission(nd->inode, MAY_EXEC|MAY_NOT_BLOCK);
-		if (err != -ECHILD)
+		if (err != -ECHILD || !try_to_unlazy(nd))
 			return err;
-		if (unlazy_walk(nd))
-			return -ECHILD;
 	}
 	return inode_permission(nd->inode, MAY_EXEC);
 }
@@ -1592,7 +1590,7 @@ static int reserve_stack(struct nameidata *nd, struct path *link, unsigned seq)
 		// unlazy even if we fail to grab the link - cleanup needs it
 		bool grabbed_link = legitimize_path(nd, link, seq);
 
-		if (unlazy_walk(nd) != 0 || !grabbed_link)
+		if (!try_to_unlazy(nd) || !grabbed_link)
 			return -ECHILD;
 
 		if (nd_alloc_stack(nd))
@@ -1634,7 +1632,7 @@ static const char *pick_link(struct nameidata *nd, struct path *link,
 		touch_atime(&last->link);
 		cond_resched();
 	} else if (atime_needs_update(&last->link, inode)) {
-		if (unlikely(unlazy_walk(nd)))
+		if (unlikely(!try_to_unlazy(nd)))
 			return ERR_PTR(-ECHILD);
 		touch_atime(&last->link);
 	}
@@ -1651,11 +1649,8 @@ static const char *pick_link(struct nameidata *nd, struct path *link,
 		get = inode->i_op->get_link;
 		if (nd->flags & LOOKUP_RCU) {
 			res = get(NULL, inode, &last->done);
-			if (res == ERR_PTR(-ECHILD)) {
-				if (unlikely(unlazy_walk(nd)))
-					return ERR_PTR(-ECHILD);
+			if (res == ERR_PTR(-ECHILD) && try_to_unlazy(nd))
 				res = get(link->dentry, inode, &last->done);
-			}
 		} else {
 			res = get(link->dentry, inode, &last->done);
 		}
@@ -2193,7 +2188,7 @@ static int link_path_walk(const char *name, struct nameidata *nd)
 		}
 		if (unlikely(!d_can_lookup(nd->path.dentry))) {
 			if (nd->flags & LOOKUP_RCU) {
-				if (unlazy_walk(nd))
+				if (!try_to_unlazy(nd))
 					return -ECHILD;
 			}
 			return -ENOTDIR;
@@ -3127,7 +3122,6 @@ static const char *open_last_lookups(struct nameidata *nd,
 	struct inode *inode;
 	struct dentry *dentry;
 	const char *res;
-	int error;
 
 	nd->flags |= op->intent;
 
@@ -3151,9 +3145,8 @@ static const char *open_last_lookups(struct nameidata *nd,
 	} else {
 		/* create side of things */
 		if (nd->flags & LOOKUP_RCU) {
-			error = unlazy_walk(nd);
-			if (unlikely(error))
-				return ERR_PTR(error);
+			if (unlikely(!try_to_unlazy(nd)))
+				return ERR_PTR(-ECHILD);
 		}
 		audit_inode(nd->name, dir, AUDIT_INODE_PARENT);
 		/* trailing slashes? */
@@ -3162,8 +3155,7 @@ static const char *open_last_lookups(struct nameidata *nd,
 	}
 
 	if (open_flag & (O_CREAT | O_TRUNC | O_WRONLY | O_RDWR)) {
-		error = mnt_want_write(nd->path.mnt);
-		if (!error)
+		if (!mnt_want_write(nd->path.mnt))
 			got_write = true;
 		/*
 		 * do _not_ fail yet - we might not need that or fail with

  reply	other threads:[~2020-12-11 19:06 UTC|newest]

Thread overview: 28+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-12-10 20:01 [PATCHSET 0/2] fs: Support for LOOKUP_NONBLOCK / RESOLVE_NONBLOCK Jens Axboe
2020-12-10 20:01 ` [PATCH 1/2] fs: add support for LOOKUP_NONBLOCK Jens Axboe
2020-12-10 20:53   ` Linus Torvalds
2020-12-10 21:06     ` Jens Axboe
2020-12-11  2:45       ` Al Viro
2020-12-11 16:05         ` Jens Axboe
2020-12-11 17:20           ` Al Viro [this message]
2020-12-11 17:35             ` Linus Torvalds
2020-12-11 18:50             ` Jens Axboe
2020-12-11 21:51               ` Al Viro
2020-12-11 23:47                 ` Jens Axboe
2020-12-11 17:33           ` Matthew Wilcox
2020-12-11 18:55             ` Jens Axboe
2020-12-11  2:35   ` Al Viro
2020-12-11 15:57     ` Jens Axboe
2020-12-11 17:21       ` Linus Torvalds
2020-12-11 17:29         ` Al Viro
2020-12-11 17:38           ` Al Viro
2020-12-11 17:44           ` Linus Torvalds
2020-12-11 21:46           ` Jens Axboe
2020-12-10 20:01 ` [PATCH 2/2] fs: expose LOOKUP_NONBLOCK through openat2() RESOLVE_NONBLOCK Jens Axboe
2020-12-10 22:29   ` Dave Chinner
2020-12-10 23:12     ` Jens Axboe
2020-12-10 23:29     ` Linus Torvalds
2020-12-11  0:58       ` Dave Chinner
2020-12-11  1:01         ` Linus Torvalds
2020-12-11  3:45           ` Dave Chinner
2020-12-11 18:07             ` Linus Torvalds

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20201211172054.GX3579531@ZenIV.linux.org.uk \
    --to=viro@zeniv.linux.org.uk \
    --cc=axboe@kernel.dk \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=torvalds@linux-foundation.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.