All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH v10 0/8] fuse: full atomic open and atomic-open-revalidate
@ 2023-10-23 18:30 Bernd Schubert
  2023-10-23 18:30 ` [PATCH v10 1/8] fuse: rename fuse_create_open Bernd Schubert
                   ` (7 more replies)
  0 siblings, 8 replies; 32+ messages in thread
From: Bernd Schubert @ 2023-10-23 18:30 UTC (permalink / raw)
  To: linux-fsdevel
  Cc: bernd.schubert, miklos, dsingh, Bernd Schubert, Vivek Goyal,
	Christian Brauner, Al Viro, Horst Birthelmer, Yuan Yao,
	Amir Goldstein

In FUSE, as of now, uncached lookups are expensive over the wire.
E.g additional latencies and stressing (meta data) servers from
thousands of clients. With atomic-open lookup before open
can be avoided.

Here is the link to performance numbers
https://lore.kernel.org/linux-fsdevel/20220322121212.5087-1-dharamhans87@gmail.com/

Here is the libfuse pull request
https://github.com/libfuse/libfuse/pull/813

The patches are passing passthrough_hp xfstests (libfuse part applied),
although we had to introduce umount retries into xfstests, as recent
kernels/xfstests fail umount in some tests with
EBUSY - independent of atomic open. (Although outstanding for v7)

I'm especially interested in Al's and Christians opinion about the
atomic open dentry revalidation in v7. If the vfs changes are
acceptable, would it be possible to also look at the other patches
and their vfs/dcache interaction? I __hope__ I got it right and I hope
the vfs changes are acceptable.

v10:
    - Updates to Amirs suggestions (WARN_ONCE and graceful error code
      path instead of BUG_ON) in lookup_open
    - After discussion with Amir what patch 3 and 4 are about, change them
      into atomic-open-revalidate with (3)  and without (4) O_CREAT,
      instead of atomic-open-revalidate + optimization.
    - Fix opening of symlinks, thanks a lot Yuan for testing an reporting!
      To make reviews easier this is two fold:
          - In patch 2/8 it just falls back to create open (and expects
            server side not to hold a reference), in 8/8 the fallback and
            additional lookup is avoided (server needs to have an inode
            reference).
    - Fix smatch errors
        - lookup_open() error: uninitialized symbol 'error'.
        - atomic_revalidate_open() warn: variable dereferenced before
          check 'got_write'
        - error: 'switched_entry' dereferencing possible ERR_PTR()
    - rebase to 6.6, use ATTR_TIMEOUT()
v9:
    - Followed Miklos suggestion and added another patch to further
      optimize atomic revalidate/open, which avoids dentry
      acquire/release and also avoids double call into ->d_revalidate
    - Updates following Miklos' review
    - Dropped a temporary comment in patch 2/7 (accidental leftover)

v8: - Another slight indentation fix in _fuse_atomic_open
    - Fix compilation error in patch 4 (fuse atomic revalidate)
    - Remove LOOKUP_ATOMIC_REVALIDATE
    - Switch from DCACHE_ATOMIC_OPEN flag to return value and
      and introduce an enum for d_revalidate return values.
    - checkpatch fixes

v7: - Indentation and style fixes for _fuse_atomic_open.
    - Remodel atomic open to avoid races with parallel lookup, similar
      to NFS commit c94c09535c4debcc439f55b5b6d9ebe57bd4665a and what
      is done in _nfs4_open_and_get_state()
      A WARN_ONCE() and fallback is added to ensure operation is on
      negative dentries only.
    - Error handling is done via the fallback fuse_create_open()
      to reduce complexity and code duplication.
    - Remove entry cache invalidation on ENOENT in the atomic-open
      patch, as atomic-open so far operates on negative dentries only.
    - Remove fuse_advise_use_readdirplus() in _fuse_atomic_open
      (Thanks Miklos)
    - Add forgotten free_ext_value() (Thanks Miklos).
    - Declare struct fuse_inode per condition as the value needs to
      be retrieved anyway per condition.
    - Added atomic open-revalidation and required vfs changes
    - Added myself (Bernd) as Co-developed-by to Dharmendras patches, as
      I did substantial modifications.
    - More as reminder for myself, so far these tests below are
      done manually or with custom scripts, I think we need xfstests
      for these.

        With updated libfuse /scratch/dest is mounted by:
        passthrough_hp -o allow_other --foreground --debug-fuse /scratch/source /scratch/dest

        1) Test atomic open (file create) and negative dentry open

            rm -f /scratch/source/file # ensure file does not exist
            mount /scratch/dest  # overlay of /scratch source
            echo a > /scratch/dest/file # non-existing file
            umount and mount /scratch/test (clear cache)
            cat /scratch/dest/file
            rm -f /scratch/dest/file

        2) Test dir open

            mkdir /scratch/source/dir
            mount /scratch/dest  # overlay of /scratch source
            cat /scratch/source/dir
            rmdir /scratch/source/dir

        3)  Test revalidate without file change

            mount /scratch/dest
            echo "a" > /scratch/dest/file
            echo "b" >> /scratch/dest/file
            echo "c" >> /scratch/dest/file
            cat /scratch/dest/file
            rm -f /scratch/dest/file

        4)  Test revalidate by underlying file change

            mount /scratch/dest
            echo "a" > /scratch/dest/file
            cat /scratch/dest/file
            rm -f /scratch/source/file # unknown to dest mount
            str="b"
            echo "${str}" > /scratch/source/file
            reval=$(cat /scratch/dest/file)
            if [ "$str" != "reval" ]; then
                echo "String mismatch after revalidation"
                exit 1
            fi
            rm -f /scratch/dest/file

        5) Test revalidate by underlying file change, but with
           O_CREATE included. Tests dentry creation by the atomic
           revalidate

            mount /scratch/dest
            echo "a" >> /scratch/dest/file
            rm -f /scratch/source/file
            echo "b" > /scratch/source/file

            # revalidate includes O_CREATE
            echo "c" >> /scratch/dest/file

        6) Repeat above tests, but with additional "--nocache"
           passthrough_hp option

v6: Addressed Miklos comments and rewrote atomic open into its own
    function. Also dropped for now is the revalidate optimization, we
    have the code/patch, but it needs more testing. Also easier to
    first agree on atomic open and then to land the next optimization.

v5: Addressed comments

v3: Addressed comments

v4: Addressed all comments and refactored the code into 3 separate patches
    respectively for Atomic create, Atomic open, optimizing lookup in
    d_revalidate().

v3: handle review comments

v2: fixed a memory leak in <fuse_atomic_open_common>

Cc: Miklos Szeredi <miklos@szeredi.hu>
Cc: Vivek Goyal <vgoyal@redhat.com>
Cc: Christian Brauner <brauner@kernel.org>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Dharmendra Singh <dsingh@ddn.com>
Cc: Horst Birthelmer <hbirthelmer@ddn.com>
Cc: Yuan Yao <yuanyaogoog@chromium.org>
Cc: Amir Goldstein <amir73il@gmail.com>
Cc: linux-fsdevel@vger.kernel.org

Bernd Schubert (7):
  fuse: rename fuse_create_open
  fuse: introduce atomic open
  [RFC] Allow atomic_open() on positive dentry (w/o O_CREAT)
  fuse: Revalidate positive entries in fuse_atomic_open
  fuse: Return D_REVALIDATE_ATOMIC for cached dentries
  fuse: Avoid code duplication in atomic open
  fuse atomic open: No fallback for symlinks, just call finish_no_open

Miklos Szeredi (1):
  [RFC] Allow atomic_open() on positive dentry (O_CREAT)

 fs/fuse/dir.c             | 395 +++++++++++++++++++++++++++++++++++++-
 fs/fuse/fuse_i.h          |   6 +
 fs/namei.c                |  77 +++++++-
 include/linux/namei.h     |   7 +
 include/uapi/linux/fuse.h |   3 +
 5 files changed, 474 insertions(+), 14 deletions(-)

-- 
2.39.2


^ permalink raw reply	[flat|nested] 32+ messages in thread

* [PATCH v10 1/8] fuse: rename fuse_create_open
  2023-10-23 18:30 [PATCH v10 0/8] fuse: full atomic open and atomic-open-revalidate Bernd Schubert
@ 2023-10-23 18:30 ` Bernd Schubert
  2023-10-23 18:30 ` [PATCH v10 2/8] fuse: introduce atomic open Bernd Schubert
                   ` (6 subsequent siblings)
  7 siblings, 0 replies; 32+ messages in thread
From: Bernd Schubert @ 2023-10-23 18:30 UTC (permalink / raw)
  To: linux-fsdevel; +Cc: bernd.schubert, miklos, dsingh, Bernd Schubert

Just preparation work for atomic open.

Signed-off-by: Bernd Schubert <bschubert@ddn.com>
Cc: Miklos Szeredi <miklos@szeredi.hu>
Cc: Dharmendra Singh <dsingh@ddn.com>
Cc: linux-fsdevel@vger.kernel.org
---
 fs/fuse/dir.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/fs/fuse/dir.c b/fs/fuse/dir.c
index d707e6987da9..e1095852601c 100644
--- a/fs/fuse/dir.c
+++ b/fs/fuse/dir.c
@@ -605,7 +605,7 @@ static void free_ext_value(struct fuse_args *args)
  * If the filesystem doesn't support this, then fall back to separate
  * 'mknod' + 'open' requests.
  */
-static int fuse_create_open(struct inode *dir, struct dentry *entry,
+static int _fuse_create_open(struct inode *dir, struct dentry *entry,
 			    struct file *file, unsigned int flags,
 			    umode_t mode, u32 opcode)
 {
@@ -745,7 +745,7 @@ static int fuse_atomic_open(struct inode *dir, struct dentry *entry,
 	if (fc->no_create)
 		goto mknod;
 
-	err = fuse_create_open(dir, entry, file, flags, mode, FUSE_CREATE);
+	err = _fuse_create_open(dir, entry, file, flags, mode, FUSE_CREATE);
 	if (err == -ENOSYS) {
 		fc->no_create = 1;
 		goto mknod;
@@ -874,7 +874,7 @@ static int fuse_tmpfile(struct mnt_idmap *idmap, struct inode *dir,
 	if (fc->no_tmpfile)
 		return -EOPNOTSUPP;
 
-	err = fuse_create_open(dir, file->f_path.dentry, file, file->f_flags, mode, FUSE_TMPFILE);
+	err = _fuse_create_open(dir, file->f_path.dentry, file, file->f_flags, mode, FUSE_TMPFILE);
 	if (err == -ENOSYS) {
 		fc->no_tmpfile = 1;
 		err = -EOPNOTSUPP;
-- 
2.39.2


^ permalink raw reply related	[flat|nested] 32+ messages in thread

* [PATCH v10 2/8] fuse: introduce atomic open
  2023-10-23 18:30 [PATCH v10 0/8] fuse: full atomic open and atomic-open-revalidate Bernd Schubert
  2023-10-23 18:30 ` [PATCH v10 1/8] fuse: rename fuse_create_open Bernd Schubert
@ 2023-10-23 18:30 ` Bernd Schubert
  2023-10-24 10:12   ` Yuan Yao
                     ` (3 more replies)
  2023-10-23 18:30 ` [PATCH v10 3/8] [RFC] Allow atomic_open() on positive dentry (O_CREAT) Bernd Schubert
                   ` (5 subsequent siblings)
  7 siblings, 4 replies; 32+ messages in thread
From: Bernd Schubert @ 2023-10-23 18:30 UTC (permalink / raw)
  To: linux-fsdevel
  Cc: bernd.schubert, miklos, dsingh, Bernd Schubert, Horst Birthelmer,
	Christian Brauner, Al Viro

From: Dharmendra Singh <dsingh@ddn.com>

This adds full atomic open support, to avoid lookup before open/create.
If the implementation (fuse server/daemon) does not support atomic open
it falls back to non-atomic open.

Co-developed-by: Bernd Schubert <bschubert@ddn.com>
Signed-off-by: Bernd Schubert <bschubert@ddn.com>
Signed-off-by: Dharmendra Singh <dsingh@ddn.com>
Signed-off-by: Horst Birthelmer <hbirthelmer@ddn.com>
Cc: Miklos Szeredi <miklos@szeredi.hu>
Cc: Christian Brauner <brauner@kernel.org>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Dharmendra Singh <dsingh@ddn.com>
Cc: linux-fsdevel@vger.kernel.org
---
 fs/fuse/dir.c             | 214 +++++++++++++++++++++++++++++++++++++-
 fs/fuse/fuse_i.h          |   3 +
 include/uapi/linux/fuse.h |   3 +
 3 files changed, 219 insertions(+), 1 deletion(-)

diff --git a/fs/fuse/dir.c b/fs/fuse/dir.c
index e1095852601c..61cdb8e5f68e 100644
--- a/fs/fuse/dir.c
+++ b/fs/fuse/dir.c
@@ -716,7 +716,7 @@ static int _fuse_create_open(struct inode *dir, struct dentry *entry,
 
 static int fuse_mknod(struct mnt_idmap *, struct inode *, struct dentry *,
 		      umode_t, dev_t);
-static int fuse_atomic_open(struct inode *dir, struct dentry *entry,
+static int fuse_create_open(struct inode *dir, struct dentry *entry,
 			    struct file *file, unsigned flags,
 			    umode_t mode)
 {
@@ -763,6 +763,218 @@ static int fuse_atomic_open(struct inode *dir, struct dentry *entry,
 	return finish_no_open(file, res);
 }
 
+static int _fuse_atomic_open(struct inode *dir, struct dentry *entry,
+			     struct file *file, unsigned int flags,
+			     umode_t mode)
+{
+	int err;
+	struct inode *inode;
+	FUSE_ARGS(args);
+	struct fuse_mount *fm = get_fuse_mount(dir);
+	struct fuse_conn *fc = fm->fc;
+	struct fuse_forget_link *forget;
+	struct fuse_create_in inarg;
+	struct fuse_open_out outopen;
+	struct fuse_entry_out outentry;
+	struct fuse_inode *fi;
+	struct fuse_file *ff;
+	struct dentry *switched_entry = NULL, *alias = NULL;
+	DECLARE_WAIT_QUEUE_HEAD_ONSTACK(wq);
+
+	/* Expect a negative dentry */
+	if (unlikely(d_inode(entry)))
+		goto fallback;
+
+	/* Userspace expects S_IFREG in create mode */
+	if ((flags & O_CREAT) && (mode & S_IFMT) != S_IFREG)
+		goto fallback;
+
+	forget = fuse_alloc_forget();
+	err = -ENOMEM;
+	if (!forget)
+		goto out_err;
+
+	err = -ENOMEM;
+	ff = fuse_file_alloc(fm);
+	if (!ff)
+		goto out_put_forget_req;
+
+	if (!fc->dont_mask)
+		mode &= ~current_umask();
+
+	flags &= ~O_NOCTTY;
+	memset(&inarg, 0, sizeof(inarg));
+	memset(&outentry, 0, sizeof(outentry));
+	inarg.flags = flags;
+	inarg.mode = mode;
+	inarg.umask = current_umask();
+
+	if (fc->handle_killpriv_v2 && (flags & O_TRUNC) &&
+	    !(flags & O_EXCL) && !capable(CAP_FSETID)) {
+		inarg.open_flags |= FUSE_OPEN_KILL_SUIDGID;
+	}
+
+	args.opcode = FUSE_OPEN_ATOMIC;
+	args.nodeid = get_node_id(dir);
+	args.in_numargs = 2;
+	args.in_args[0].size = sizeof(inarg);
+	args.in_args[0].value = &inarg;
+	args.in_args[1].size = entry->d_name.len + 1;
+	args.in_args[1].value = entry->d_name.name;
+	args.out_numargs = 2;
+	args.out_args[0].size = sizeof(outentry);
+	args.out_args[0].value = &outentry;
+	args.out_args[1].size = sizeof(outopen);
+	args.out_args[1].value = &outopen;
+
+	if (flags & O_CREAT) {
+		err = get_create_ext(&args, dir, entry, mode);
+		if (err)
+			goto out_free_ff;
+	}
+
+	err = fuse_simple_request(fm, &args);
+	free_ext_value(&args);
+	if (err == -ENOSYS || err == -ELOOP) {
+		if (unlikely(err == -ENOSYS))
+			fc->no_open_atomic = 1;
+		goto free_and_fallback;
+	}
+
+	if (!err && !outentry.nodeid)
+		err = -ENOENT;
+
+	if (err)
+		goto out_free_ff;
+
+	err = -EIO;
+	if (invalid_nodeid(outentry.nodeid) || fuse_invalid_attr(&outentry.attr))
+		goto out_free_ff;
+
+	ff->fh = outopen.fh;
+	ff->nodeid = outentry.nodeid;
+	ff->open_flags = outopen.open_flags;
+	inode = fuse_iget(dir->i_sb, outentry.nodeid, outentry.generation,
+			  &outentry.attr, ATTR_TIMEOUT(&outentry), 0);
+	if (!inode) {
+		flags &= ~(O_CREAT | O_EXCL | O_TRUNC);
+		fuse_sync_release(NULL, ff, flags);
+		fuse_queue_forget(fm->fc, forget, outentry.nodeid, 1);
+		err = -ENOMEM;
+		goto out_err;
+	}
+
+	/* prevent racing/parallel lookup on a negative hashed */
+	if (!(flags & O_CREAT) && !d_in_lookup(entry)) {
+		d_drop(entry);
+		switched_entry = d_alloc_parallel(entry->d_parent,
+						   &entry->d_name, &wq);
+		if (IS_ERR(switched_entry)) {
+			err = PTR_ERR(switched_entry);
+			switched_entry = NULL;
+			goto out_free_ff;
+		}
+
+		if (unlikely(!d_in_lookup(switched_entry))) {
+			/* fall back */
+			dput(switched_entry);
+			switched_entry = NULL;
+			goto free_and_fallback;
+		}
+
+		entry = switched_entry;
+	}
+
+	if (d_really_is_negative(entry)) {
+		d_drop(entry);
+		alias = d_exact_alias(entry, inode);
+		if (!alias) {
+			alias = d_splice_alias(inode, entry);
+			if (IS_ERR(alias)) {
+				/*
+				 * Close the file in user space, but do not unlink it,
+				 * if it was created - with network file systems other
+				 * clients might have already accessed it.
+				 */
+				fi = get_fuse_inode(inode);
+				fuse_sync_release(fi, ff, flags);
+				fuse_queue_forget(fm->fc, forget, outentry.nodeid, 1);
+				err = PTR_ERR(alias);
+				goto out_err;
+			}
+		}
+
+		if (alias)
+			entry = alias;
+	}
+
+	fuse_change_entry_timeout(entry, &outentry);
+
+	/*  File was indeed created */
+	if (outopen.open_flags & FOPEN_FILE_CREATED) {
+		if (!(flags & O_CREAT)) {
+			pr_debug("Server side bug, FOPEN_FILE_CREATED set "
+				 "without O_CREAT, ignoring.");
+		} else {
+			/* This should be always set when the file is created */
+			fuse_dir_changed(dir);
+			file->f_mode |= FMODE_CREATED;
+		}
+	}
+
+	if (S_ISDIR(mode))
+		ff->open_flags &= ~FOPEN_DIRECT_IO;
+	err = finish_open(file, entry, generic_file_open);
+	if (err) {
+		fi = get_fuse_inode(inode);
+		fuse_sync_release(fi, ff, flags);
+	} else {
+		file->private_data = ff;
+		fuse_finish_open(inode, file);
+	}
+
+	kfree(forget);
+
+	if (switched_entry) {
+		d_lookup_done(switched_entry);
+		dput(switched_entry);
+	}
+
+	dput(alias);
+
+	return err;
+
+out_free_ff:
+	fuse_file_free(ff);
+out_put_forget_req:
+	kfree(forget);
+out_err:
+	if (switched_entry) {
+		d_lookup_done(switched_entry);
+		dput(switched_entry);
+	}
+
+	return err;
+
+free_and_fallback:
+	fuse_file_free(ff);
+	kfree(forget);
+fallback:
+	return fuse_create_open(dir, entry, file, flags, mode);
+}
+
+static int fuse_atomic_open(struct inode *dir, struct dentry *entry,
+			    struct file *file, unsigned int flags,
+			    umode_t mode)
+{
+	struct fuse_conn *fc = get_fuse_conn(dir);
+
+	if (fc->no_open_atomic)
+		return fuse_create_open(dir, entry, file, flags, mode);
+	else
+		return _fuse_atomic_open(dir, entry, file, flags, mode);
+}
+
 /*
  * Code shared between mknod, mkdir, symlink and link
  */
diff --git a/fs/fuse/fuse_i.h b/fs/fuse/fuse_i.h
index bf0b85d0b95c..af69578763ef 100644
--- a/fs/fuse/fuse_i.h
+++ b/fs/fuse/fuse_i.h
@@ -677,6 +677,9 @@ struct fuse_conn {
 	/** Is open/release not implemented by fs? */
 	unsigned no_open:1;
 
+	/** Is open atomic not implemented by fs? */
+	unsigned no_open_atomic:1;
+
 	/** Is opendir/releasedir not implemented by fs? */
 	unsigned no_opendir:1;
 
diff --git a/include/uapi/linux/fuse.h b/include/uapi/linux/fuse.h
index db92a7202b34..1508afbd9446 100644
--- a/include/uapi/linux/fuse.h
+++ b/include/uapi/linux/fuse.h
@@ -353,6 +353,7 @@ struct fuse_file_lock {
  * FOPEN_STREAM: the file is stream-like (no file position at all)
  * FOPEN_NOFLUSH: don't flush data cache on close (unless FUSE_WRITEBACK_CACHE)
  * FOPEN_PARALLEL_DIRECT_WRITES: Allow concurrent direct writes on the same inode
+ * FOPEN_FILE_CREATED: the file was indeed created
  */
 #define FOPEN_DIRECT_IO		(1 << 0)
 #define FOPEN_KEEP_CACHE	(1 << 1)
@@ -361,6 +362,7 @@ struct fuse_file_lock {
 #define FOPEN_STREAM		(1 << 4)
 #define FOPEN_NOFLUSH		(1 << 5)
 #define FOPEN_PARALLEL_DIRECT_WRITES	(1 << 6)
+#define FOPEN_FILE_CREATED	(1 << 7)
 
 /**
  * INIT request/reply flags
@@ -617,6 +619,7 @@ enum fuse_opcode {
 	FUSE_SYNCFS		= 50,
 	FUSE_TMPFILE		= 51,
 	FUSE_STATX		= 52,
+	FUSE_OPEN_ATOMIC	= 53,
 
 	/* CUSE specific operations */
 	CUSE_INIT		= 4096,
-- 
2.39.2


^ permalink raw reply related	[flat|nested] 32+ messages in thread

* [PATCH v10 3/8] [RFC] Allow atomic_open() on positive dentry (O_CREAT)
  2023-10-23 18:30 [PATCH v10 0/8] fuse: full atomic open and atomic-open-revalidate Bernd Schubert
  2023-10-23 18:30 ` [PATCH v10 1/8] fuse: rename fuse_create_open Bernd Schubert
  2023-10-23 18:30 ` [PATCH v10 2/8] fuse: introduce atomic open Bernd Schubert
@ 2023-10-23 18:30 ` Bernd Schubert
  2023-10-23 18:30 ` [PATCH v10 4/8] [RFC] Allow atomic_open() on positive dentry (w/o O_CREAT) Bernd Schubert
                   ` (4 subsequent siblings)
  7 siblings, 0 replies; 32+ messages in thread
From: Bernd Schubert @ 2023-10-23 18:30 UTC (permalink / raw)
  To: linux-fsdevel
  Cc: bernd.schubert, miklos, dsingh, Bernd Schubert,
	Christian Brauner, Al Viro, Amir Goldstein

From: Miklos Szeredi <miklos@szeredi.hu>

atomic_open() will do an open-by-name or create-and-open
depending on the flags.

If file was created, then the old positive dentry is obviously
stale, so it will be invalidated and a new one will be allocated.

If not created, then check whether it's the same inode (same as in
->d_revalidate()) and if not, invalidate & allocate new dentry.

This only works with O_CREAT, without O_CREAT open_last_lookups
will call into lookup_fast and then return the dentry via
finish_lookup - lookup_open is never called.
This is going to be addressed in the next commit.

Another included change is the introduction of an enum as
d_revalidate return code.

Co-developed-by: Bernd Schubert <bschubert@ddn.com>
Signed-off-by: Bernd Schubert <bschubert@ddn.com>
Signed-off-by: Miklos Szeredi <miklos@szeredi.hu>
Cc: Christian Brauner <brauner@kernel.org>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Dharmendra Singh <dsingh@ddn.com>
Cc: Amir Goldstein <amir73il@gmail.com>
Cc: linux-fsdevel@vger.kernel.org
---
 fs/namei.c            | 11 ++++++-----
 include/linux/namei.h |  7 +++++++
 2 files changed, 13 insertions(+), 5 deletions(-)

diff --git a/fs/namei.c b/fs/namei.c
index 567ee547492b..ff913e6b12b4 100644
--- a/fs/namei.c
+++ b/fs/namei.c
@@ -860,7 +860,7 @@ static inline int d_revalidate(struct dentry *dentry, unsigned int flags)
 	if (unlikely(dentry->d_flags & DCACHE_OP_REVALIDATE))
 		return dentry->d_op->d_revalidate(dentry, flags);
 	else
-		return 1;
+		return D_REVALIDATE_VALID;
 }
 
 /**
@@ -3330,8 +3330,9 @@ static int may_o_create(struct mnt_idmap *idmap,
 }
 
 /*
- * Attempt to atomically look up, create and open a file from a negative
- * dentry.
+ * Attempt to atomically look up, create and open a file from a
+ * dentry. Unless the file system returns D_REVALIDATE_ATOMIC in ->d_revalidate,
+ * the dentry is always negative.
  *
  * Returns 0 if successful.  The file will have been created and attached to
  * @file by the filesystem calling finish_open().
@@ -3406,7 +3407,7 @@ static struct dentry *lookup_open(struct nameidata *nd, struct file *file,
 	struct inode *dir_inode = dir->d_inode;
 	int open_flag = op->open_flag;
 	struct dentry *dentry;
-	int error, create_error = 0;
+	int error = 0, create_error = 0;
 	umode_t mode = op->mode;
 	DECLARE_WAIT_QUEUE_HEAD_ONSTACK(wq);
 
@@ -3433,7 +3434,7 @@ static struct dentry *lookup_open(struct nameidata *nd, struct file *file,
 		dput(dentry);
 		dentry = NULL;
 	}
-	if (dentry->d_inode) {
+	if (dentry->d_inode && error != D_REVALIDATE_ATOMIC) {
 		/* Cached positive dentry: will open in f_op->open */
 		return dentry;
 	}
diff --git a/include/linux/namei.h b/include/linux/namei.h
index 1463cbda4888..a70e87d2b2a9 100644
--- a/include/linux/namei.h
+++ b/include/linux/namei.h
@@ -47,6 +47,13 @@ enum {LAST_NORM, LAST_ROOT, LAST_DOT, LAST_DOTDOT};
 /* LOOKUP_* flags which do scope-related checks based on the dirfd. */
 #define LOOKUP_IS_SCOPED (LOOKUP_BENEATH | LOOKUP_IN_ROOT)
 
+/* ->d_revalidate return codes */
+enum {
+	D_REVALIDATE_INVALID = 0, /* invalid dentry */
+	D_REVALIDATE_VALID   = 1, /* valid dentry */
+	D_REVALIDATE_ATOMIC =  2, /* atomic_open will revalidate */
+};
+
 extern int path_pts(struct path *path);
 
 extern int user_path_at_empty(int, const char __user *, unsigned, struct path *, int *empty);
-- 
2.39.2


^ permalink raw reply related	[flat|nested] 32+ messages in thread

* [PATCH v10 4/8] [RFC] Allow atomic_open() on positive dentry (w/o O_CREAT)
  2023-10-23 18:30 [PATCH v10 0/8] fuse: full atomic open and atomic-open-revalidate Bernd Schubert
                   ` (2 preceding siblings ...)
  2023-10-23 18:30 ` [PATCH v10 3/8] [RFC] Allow atomic_open() on positive dentry (O_CREAT) Bernd Schubert
@ 2023-10-23 18:30 ` Bernd Schubert
  2023-10-23 23:21   ` kernel test robot
                     ` (2 more replies)
  2023-10-23 18:30 ` [PATCH v10 5/8] fuse: Revalidate positive entries in fuse_atomic_open Bernd Schubert
                   ` (3 subsequent siblings)
  7 siblings, 3 replies; 32+ messages in thread
From: Bernd Schubert @ 2023-10-23 18:30 UTC (permalink / raw)
  To: linux-fsdevel
  Cc: bernd.schubert, miklos, dsingh, Bernd Schubert,
	Christian Brauner, Amir Goldstein, Al Viro

Previous patch allowed atomic-open on a positive dentry when
O_CREAT was set (in lookup_open). This adds in atomic-open
when O_CREAT is not set.

Code wise it would be possible to just drop the dentry in
open_last_lookups and then fall through to lookup_open.
But then this would add some overhead for dentry drop,
re-lookup and actually also call into d_revalidate.
So as suggested by Miklos, this adds a helper function
(atomic_revalidate_open) to immediately open the dentry
with atomic_open.

Signed-off-by: Bernd Schubert <bschubert@ddn.com>
Cc: Miklos Szeredi <miklos@szeredi.hu>
Cc: Dharmendra Singh <dsingh@ddn.com>
Cc: Christian Brauner <brauner@kernel.org>
Cc: Amir Goldstein <amir73il@gmail.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: linux-fsdevel@vger.kernel.org
---
 fs/namei.c | 66 +++++++++++++++++++++++++++++++++++++++++++++++++++---
 1 file changed, 63 insertions(+), 3 deletions(-)

diff --git a/fs/namei.c b/fs/namei.c
index ff913e6b12b4..5e2d569ffe38 100644
--- a/fs/namei.c
+++ b/fs/namei.c
@@ -1614,10 +1614,11 @@ struct dentry *lookup_one_qstr_excl(const struct qstr *name,
 }
 EXPORT_SYMBOL(lookup_one_qstr_excl);
 
-static struct dentry *lookup_fast(struct nameidata *nd)
+static struct dentry *lookup_fast(struct nameidata *nd, bool *atomic_revalidate)
 {
 	struct dentry *dentry, *parent = nd->path.dentry;
 	int status = 1;
+	*atomic_revalidate = false;
 
 	/*
 	 * Rename seqlock is not required here because in the off chance
@@ -1659,6 +1660,10 @@ static struct dentry *lookup_fast(struct nameidata *nd)
 		dput(dentry);
 		return ERR_PTR(status);
 	}
+
+	if (status == D_REVALIDATE_ATOMIC)
+		*atomic_revalidate = true;
+
 	return dentry;
 }
 
@@ -1984,6 +1989,7 @@ static const char *handle_dots(struct nameidata *nd, int type)
 static const char *walk_component(struct nameidata *nd, int flags)
 {
 	struct dentry *dentry;
+	bool atomic_revalidate;
 	/*
 	 * "." and ".." are special - ".." especially so because it has
 	 * to be able to know about the current root directory and
@@ -1994,7 +2000,7 @@ static const char *walk_component(struct nameidata *nd, int flags)
 			put_link(nd);
 		return handle_dots(nd, nd->last_type);
 	}
-	dentry = lookup_fast(nd);
+	dentry = lookup_fast(nd, &atomic_revalidate);
 	if (IS_ERR(dentry))
 		return ERR_CAST(dentry);
 	if (unlikely(!dentry)) {
@@ -2002,6 +2008,9 @@ static const char *walk_component(struct nameidata *nd, int flags)
 		if (IS_ERR(dentry))
 			return ERR_CAST(dentry);
 	}
+
+	WARN_ON_ONCE(atomic_revalidate);
+
 	if (!(flags & WALK_MORE) && nd->depth)
 		put_link(nd);
 	return step_into(nd, flags, dentry);
@@ -3383,6 +3392,42 @@ static struct dentry *atomic_open(struct nameidata *nd, struct dentry *dentry,
 	return dentry;
 }
 
+static struct dentry *atomic_revalidate_open(struct dentry *dentry,
+					     struct nameidata *nd,
+					     struct file *file,
+					     const struct open_flags *op,
+					     bool *got_write)
+{
+	struct mnt_idmap *idmap;
+	struct dentry *dir = nd->path.dentry;
+	struct inode *dir_inode = dir->d_inode;
+	int open_flag = op->open_flag;
+	umode_t mode = op->mode;
+
+	if (unlikely(IS_DEADDIR(dir_inode)))
+		return ERR_PTR(-ENOENT);
+
+	file->f_mode &= ~FMODE_CREATED;
+
+	if (WARN_ON_ONCE(open_flag & O_CREAT))
+		return ERR_PTR(-EINVAL);
+
+	if (open_flag & (O_TRUNC | O_WRONLY | O_RDWR))
+		*got_write = !mnt_want_write(nd->path.mnt);
+	else
+		*got_write = false;
+
+	if (!*got_write)
+		open_flag &= ~O_TRUNC;
+
+	inode_lock_shared(dir->d_inode);
+	dentry = atomic_open(nd, dentry, file, open_flag, mode);
+	inode_unlock_shared(dir->d_inode);
+
+	return dentry;
+
+}
+
 /*
  * Look up and maybe create and open the last component.
  *
@@ -3527,12 +3572,26 @@ static const char *open_last_lookups(struct nameidata *nd,
 	}
 
 	if (!(open_flag & O_CREAT)) {
+		bool atomic_revalidate;
+
 		if (nd->last.name[nd->last.len])
 			nd->flags |= LOOKUP_FOLLOW | LOOKUP_DIRECTORY;
 		/* we _can_ be in RCU mode here */
-		dentry = lookup_fast(nd);
+		dentry = lookup_fast(nd, &atomic_revalidate);
 		if (IS_ERR(dentry))
 			return ERR_CAST(dentry);
+		if (dentry && unlikely(atomic_revalidate)) {
+			/* The file system shall not claim to support atomic
+			 * revalidate in RCU mode
+			 */
+			if (WARN_ON_ONCE(nd->flags & LOOKUP_RCU)) {
+				dput(dentry);
+				return ERR_PTR(-ECHILD);
+			}
+			dentry = atomic_revalidate_open(dentry, nd, file, op,
+							&got_write);
+			goto drop_write;
+		}
 		if (likely(dentry))
 			goto finish_lookup;
 
@@ -3569,6 +3628,7 @@ static const char *open_last_lookups(struct nameidata *nd,
 	else
 		inode_unlock_shared(dir->d_inode);
 
+drop_write:
 	if (got_write)
 		mnt_drop_write(nd->path.mnt);
 
-- 
2.39.2


^ permalink raw reply related	[flat|nested] 32+ messages in thread

* [PATCH v10 5/8] fuse: Revalidate positive entries in fuse_atomic_open
  2023-10-23 18:30 [PATCH v10 0/8] fuse: full atomic open and atomic-open-revalidate Bernd Schubert
                   ` (3 preceding siblings ...)
  2023-10-23 18:30 ` [PATCH v10 4/8] [RFC] Allow atomic_open() on positive dentry (w/o O_CREAT) Bernd Schubert
@ 2023-10-23 18:30 ` Bernd Schubert
  2023-10-28  5:18   ` Al Viro
  2023-10-28  5:25   ` Al Viro
  2023-10-23 18:30 ` [PATCH v10 6/8] fuse: Return D_REVALIDATE_ATOMIC for cached dentries Bernd Schubert
                   ` (2 subsequent siblings)
  7 siblings, 2 replies; 32+ messages in thread
From: Bernd Schubert @ 2023-10-23 18:30 UTC (permalink / raw)
  To: linux-fsdevel
  Cc: bernd.schubert, miklos, dsingh, Bernd Schubert, Horst Birthelmer,
	Christian Brauner, Al Viro

From: Dharmendra Singh <dsingh@ddn.com>

This makes use of the vfs changes and fuse_dentry_revalidate()
can now skip revalidate, if the fuse implementation has
atomic_open support, which will has to do the dentry
revalidation.

Skipping revalidate is only possible when we absolutely
know that the implementation supports atomic_open, so
another bit had to be added to struct fuse_conn, which
is set when atomic_open was successful.

Once struct fuse_conn has the positive 'has_open_atomic'
fuse_dentry_revalidate() might set DCACHE_ATOMIC_OPEN.
vfs use that flag to use atomic_open.

If the file was newly created, the previous positive dentry
is invalidated and a new dentry and inode are allocated
and linked (d_splice_alias).

If file was not created, we revalidate the inode. If inode is
stale, current inode is marked as bad. And new inode is allocated
and linked to new dentry(old dentry invalidated). In case of
inode attributes differing with fresh attr, we allocate new
dentry and hook current inode to it and open the file.

For negative dentry, FS just allocate new inode and hook it onto
passed entry from VFS and open the file.

Co-developed-by: Bernd Schubert <bschubert@ddn.com>
Signed-off-by: Bernd Schubert <bschubert@ddn.com>
Signed-off-by: Dharmendra Singh <dsingh@ddn.com>
Signed-off-by: Horst Birthelmer <hbirthelmer@ddn.com>
Cc: Miklos Szeredi <miklos@szeredi.hu>
Cc: Christian Brauner <brauner@kernel.org>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Dharmendra Singh <dsingh@ddn.com>
Cc: linux-fsdevel@vger.kernel.org
---
 fs/fuse/dir.c    | 202 ++++++++++++++++++++++++++++++++++++++++-------
 fs/fuse/fuse_i.h |   3 +
 2 files changed, 176 insertions(+), 29 deletions(-)

diff --git a/fs/fuse/dir.c b/fs/fuse/dir.c
index 61cdb8e5f68e..17ae788776db 100644
--- a/fs/fuse/dir.c
+++ b/fs/fuse/dir.c
@@ -220,6 +220,19 @@ static int fuse_dentry_revalidate(struct dentry *entry, unsigned int flags)
 
 		fm = get_fuse_mount(inode);
 
+		/* If open atomic is supported by FUSE then use this opportunity
+		 * to avoid this lookup and combine lookup + open into a single call.
+		 *
+		 * Note: Fuse detects open atomic implementation automatically.
+		 * Therefore first few call would go into open atomic code path
+		 * , detects that open atomic is implemented or not by setting
+		 * fc->no_open_atomic. In case open atomic is not implemented,
+		 * calls fall back to non-atomic open.
+		 */
+		if (fm->fc->has_open_atomic && flags & LOOKUP_OPEN) {
+			ret = D_REVALIDATE_ATOMIC;
+			goto out;
+		}
 		forget = fuse_alloc_forget();
 		ret = -ENOMEM;
 		if (!forget)
@@ -270,12 +283,12 @@ static int fuse_dentry_revalidate(struct dentry *entry, unsigned int flags)
 			dput(parent);
 		}
 	}
-	ret = 1;
+	ret = D_REVALIDATE_VALID;
 out:
 	return ret;
 
 invalid:
-	ret = 0;
+	ret = D_REVALIDATE_INVALID;
 	goto out;
 }
 
@@ -763,12 +776,84 @@ static int fuse_create_open(struct inode *dir, struct dentry *entry,
 	return finish_no_open(file, res);
 }
 
+/**
+ * Revalidate inode hooked into dentry against freshly acquired
+ * attributes. If inode is stale then allocate new dentry and
+ * hook it onto fresh inode.
+ */
+static struct dentry *
+fuse_atomic_open_revalidate(struct fuse_conn *fc, struct dentry *entry,
+			    struct inode *inode, int switched,
+			    struct fuse_entry_out *outentry,
+			    wait_queue_head_t *wq, int *alloc_inode)
+{
+	u64 attr_version;
+	struct dentry *prev = entry;
+
+	if (outentry->nodeid != get_node_id(inode) ||
+	    (bool) IS_AUTOMOUNT(inode) !=
+	    (bool) (outentry->attr.flags & FUSE_ATTR_SUBMOUNT)) {
+		*alloc_inode = 1;
+	} else if (fuse_stale_inode(inode, outentry->generation,
+				  &outentry->attr)) {
+		fuse_make_bad(inode);
+		*alloc_inode = 1;
+	}
+
+	if (*alloc_inode) {
+		struct dentry *new = NULL;
+
+		if (!switched && !d_in_lookup(entry)) {
+			d_drop(entry);
+			new = d_alloc_parallel(entry->d_parent, &entry->d_name,
+					       wq);
+			if (IS_ERR(new))
+				return new;
+
+			if (unlikely(!d_in_lookup(new))) {
+				dput(new);
+				new = ERR_PTR(-EIO);
+				return new;
+			}
+		}
+
+		fuse_invalidate_entry(entry);
+
+		entry = new;
+	} else if (!*alloc_inode) {
+		attr_version = fuse_get_attr_version(fc);
+		forget_all_cached_acls(inode);
+		fuse_change_attributes(inode, &outentry->attr, NULL,
+				       ATTR_TIMEOUT(outentry),
+				       attr_version);
+	}
+
+	if (prev == entry) {
+		/* nothing changed, atomic-open on the server side
+		 * had increased the lookup count - do the same here
+		 */
+		struct fuse_inode *fi = get_fuse_inode(inode);
+
+		spin_lock(&fi->lock);
+		fi->nlookup++;
+		spin_unlock(&fi->lock);
+	}
+
+	return entry;
+}
+
+/**
+ * Does 'lookup + create + open' or 'lookup + open' atomically.
+ * @entry might be positive as well, therefore inode is re-validated.
+ * Positive dentry is invalidated in case inode attributes differ or
+ * we encountered error.
+ */
 static int _fuse_atomic_open(struct inode *dir, struct dentry *entry,
 			     struct file *file, unsigned int flags,
 			     umode_t mode)
 {
 	int err;
-	struct inode *inode;
+	struct inode *inode = d_inode(entry);
 	FUSE_ARGS(args);
 	struct fuse_mount *fm = get_fuse_mount(dir);
 	struct fuse_conn *fc = fm->fc;
@@ -780,10 +865,7 @@ static int _fuse_atomic_open(struct inode *dir, struct dentry *entry,
 	struct fuse_file *ff;
 	struct dentry *switched_entry = NULL, *alias = NULL;
 	DECLARE_WAIT_QUEUE_HEAD_ONSTACK(wq);
-
-	/* Expect a negative dentry */
-	if (unlikely(d_inode(entry)))
-		goto fallback;
+	int alloc_inode = 0;
 
 	/* Userspace expects S_IFREG in create mode */
 	if ((flags & O_CREAT) && (mode & S_IFMT) != S_IFREG)
@@ -835,36 +917,56 @@ static int _fuse_atomic_open(struct inode *dir, struct dentry *entry,
 
 	err = fuse_simple_request(fm, &args);
 	free_ext_value(&args);
-	if (err == -ENOSYS || err == -ELOOP) {
-		if (unlikely(err == -ENOSYS))
-			fc->no_open_atomic = 1;
-		goto free_and_fallback;
-	}
 
 	if (!err && !outentry.nodeid)
 		err = -ENOENT;
 
-	if (err)
-		goto out_free_ff;
+	if (err) {
+		if (unlikely(err == -ENOSYS)) {
+			fc->no_open_atomic = 1;
+
+			/* Might come up if userspace tricks us and would
+			 * return -ENOSYS for OPEN_ATOMIC after it was
+			 * aready working
+			 */
+			if (unlikely(fc->has_open_atomic == 1))
+				pr_info("fuse server/daemon bug, atomic open "
+					"got -ENOSYS although it was already "
+					"succeeding before.");
+
+			/* This should better never happen, revalidate
+			 * is missing for this entry
+			 */
+			if (WARN_ON_ONCE(d_really_is_positive(entry))) {
+				err = -EIO;
+				goto out_free_ff;
+			}
+			goto free_and_fallback;
+		} else if (err == -ELOOP) {
+			/* likely a symlink */
+			goto free_and_fallback;
+		} else {
+			if (d_really_is_positive(entry)) {
+				if (err != -EINTR && err != -ENOMEM)
+					fuse_invalidate_entry(entry);
+			}
+
+			goto out_free_ff;
+		}
+	}
+
+	if (!err && !fc->has_open_atomic) {
+		/* Only set this flag when atomic open did not return an error,
+		 * so that we are absolutely sure it is implemented.
+		 */
+		fc->has_open_atomic = 1;
+	}
 
 	err = -EIO;
 	if (invalid_nodeid(outentry.nodeid) || fuse_invalid_attr(&outentry.attr))
 		goto out_free_ff;
 
-	ff->fh = outopen.fh;
-	ff->nodeid = outentry.nodeid;
-	ff->open_flags = outopen.open_flags;
-	inode = fuse_iget(dir->i_sb, outentry.nodeid, outentry.generation,
-			  &outentry.attr, ATTR_TIMEOUT(&outentry), 0);
-	if (!inode) {
-		flags &= ~(O_CREAT | O_EXCL | O_TRUNC);
-		fuse_sync_release(NULL, ff, flags);
-		fuse_queue_forget(fm->fc, forget, outentry.nodeid, 1);
-		err = -ENOMEM;
-		goto out_err;
-	}
-
-	/* prevent racing/parallel lookup on a negative hashed */
+	/* prevent racing/parallel lookup */
 	if (!(flags & O_CREAT) && !d_in_lookup(entry)) {
 		d_drop(entry);
 		switched_entry = d_alloc_parallel(entry->d_parent,
@@ -879,10 +981,52 @@ static int _fuse_atomic_open(struct inode *dir, struct dentry *entry,
 			/* fall back */
 			dput(switched_entry);
 			switched_entry = NULL;
-			goto free_and_fallback;
+
+			if (!inode) {
+				goto free_and_fallback;
+			} else {
+				/* XXX can this happen at all and is there a
+				 * better way to handle it?
+				 */
+				err = -EIO;
+				goto out_free_ff;
+			}
+		}
+	}
+
+	if (inode) {
+		struct dentry *new;
+
+		err = -ESTALE;
+		new = fuse_atomic_open_revalidate(fm->fc, entry, inode,
+						  !!switched_entry,
+						  &outentry, &wq, &alloc_inode);
+		if (IS_ERR(new)) {
+			err = PTR_ERR(new);
+			goto out_free_ff;
 		}
 
+		if (new != entry && new != NULL)
+			switched_entry = new;
+	}
+
+	if (switched_entry)
 		entry = switched_entry;
+
+	ff->fh = outopen.fh;
+	ff->nodeid = outentry.nodeid;
+	ff->open_flags = outopen.open_flags;
+
+	if (!inode || alloc_inode) {
+		inode = fuse_iget(dir->i_sb, outentry.nodeid, outentry.generation,
+				  &outentry.attr, ATTR_TIMEOUT(&outentry), 0);
+		if (!inode) {
+			flags &= ~(O_CREAT | O_EXCL | O_TRUNC);
+			fuse_sync_release(NULL, ff, flags);
+			fuse_queue_forget(fm->fc, forget, outentry.nodeid, 1);
+			err = -ENOMEM;
+			goto out_err;
+		}
 	}
 
 	if (d_really_is_negative(entry)) {
diff --git a/fs/fuse/fuse_i.h b/fs/fuse/fuse_i.h
index af69578763ef..80a1fc6aa103 100644
--- a/fs/fuse/fuse_i.h
+++ b/fs/fuse/fuse_i.h
@@ -680,6 +680,9 @@ struct fuse_conn {
 	/** Is open atomic not implemented by fs? */
 	unsigned no_open_atomic:1;
 
+	/** Is open atomic is proven to be implemented by fs? */
+	unsigned has_open_atomic:1;
+
 	/** Is opendir/releasedir not implemented by fs? */
 	unsigned no_opendir:1;
 
-- 
2.39.2


^ permalink raw reply related	[flat|nested] 32+ messages in thread

* [PATCH v10 6/8] fuse: Return D_REVALIDATE_ATOMIC for cached dentries
  2023-10-23 18:30 [PATCH v10 0/8] fuse: full atomic open and atomic-open-revalidate Bernd Schubert
                   ` (4 preceding siblings ...)
  2023-10-23 18:30 ` [PATCH v10 5/8] fuse: Revalidate positive entries in fuse_atomic_open Bernd Schubert
@ 2023-10-23 18:30 ` Bernd Schubert
  2023-10-23 18:30 ` [PATCH v10 7/8] fuse: Avoid code duplication in atomic open Bernd Schubert
  2023-10-23 18:30 ` [PATCH v10 8/8] fuse atomic open: No fallback for symlinks, just call finish_no_open Bernd Schubert
  7 siblings, 0 replies; 32+ messages in thread
From: Bernd Schubert @ 2023-10-23 18:30 UTC (permalink / raw)
  To: linux-fsdevel
  Cc: bernd.schubert, miklos, dsingh, Bernd Schubert, Horst Birthelmer

Cached dentries do not get revalidate, but open would still result in
open + getattr, instead of one atomic_open call only.

libfuse logs (passthrough_hp):

Unpatched:
----------
unique: 22, opcode: OPEN (14), nodeid: 140698229673544, insize: 48, pid: 3434
   unique: 22, success, outsize: 32
unique: 24, opcode: GETATTR (3), nodeid: 140698229673544, insize: 56, pid: 3434
   unique: 24, success, outsize: 120
unique: 26, opcode: FLUSH (25), nodeid: 140698229673544, insize: 64, pid: 3434
   unique: 26, success, outsize: 16
unique: 28, opcode: RELEASE (18), nodeid: 140698229673544, insize: 64, pid: 0
   unique: 28, success, outsize: 16

Patched:
----------
unique: 20, opcode: OPEN_ATOMIC (52), nodeid: 1, insize: 63, pid: 3397
   unique: 20, success, outsize: 160
unique: 22, opcode: FLUSH (25), nodeid: 140024188243528, insize: 64, pid: 3397
   unique: 22, success, outsize: 16
unique: 24, opcode: RELEASE (18), nodeid: 140024188243528, insize: 64, pid: 0
   unique: 24, success, outsize: 16

Signed-off-by: Bernd Schubert <bschubert@ddn.com>
Cc: Miklos Szeredi <miklos@szeredi.hu>
Cc: Dharmendra Singh <dsingh@ddn.com>
Cc: Horst Birthelmer <hbirthelmer@ddn.com>
Cc: linux-fsdevel@vger.kernel.org
---
 fs/fuse/dir.c | 49 +++++++++++++++++++++++++++++++++++++------------
 1 file changed, 37 insertions(+), 12 deletions(-)

diff --git a/fs/fuse/dir.c b/fs/fuse/dir.c
index 17ae788776db..a770c0a6e022 100644
--- a/fs/fuse/dir.c
+++ b/fs/fuse/dir.c
@@ -183,6 +183,22 @@ static void fuse_lookup_init(struct fuse_conn *fc, struct fuse_args *args,
 	args->out_args[0].value = outarg;
 }
 
+/*
+ * If open atomic is supported by FUSE then use this opportunity
+ * to avoid this lookup and combine lookup + open into a single call.
+ */
+static int fuse_dentry_do_atomic_revalidate(struct dentry *entry,
+					     unsigned int flags,
+					     struct fuse_conn *fc)
+{
+	int ret = 0;
+
+	if (flags & LOOKUP_OPEN && fc->has_open_atomic)
+		ret = D_REVALIDATE_ATOMIC;
+
+	return ret;
+}
+
 /*
  * Check whether the dentry is still valid
  *
@@ -220,19 +236,10 @@ static int fuse_dentry_revalidate(struct dentry *entry, unsigned int flags)
 
 		fm = get_fuse_mount(inode);
 
-		/* If open atomic is supported by FUSE then use this opportunity
-		 * to avoid this lookup and combine lookup + open into a single call.
-		 *
-		 * Note: Fuse detects open atomic implementation automatically.
-		 * Therefore first few call would go into open atomic code path
-		 * , detects that open atomic is implemented or not by setting
-		 * fc->no_open_atomic. In case open atomic is not implemented,
-		 * calls fall back to non-atomic open.
-		 */
-		if (fm->fc->has_open_atomic && flags & LOOKUP_OPEN) {
-			ret = D_REVALIDATE_ATOMIC;
+		ret = fuse_dentry_do_atomic_revalidate(entry, flags, fm->fc);
+		if (ret)
 			goto out;
-		}
+
 		forget = fuse_alloc_forget();
 		ret = -ENOMEM;
 		if (!forget)
@@ -275,6 +282,16 @@ static int fuse_dentry_revalidate(struct dentry *entry, unsigned int flags)
 	} else if (inode) {
 		fi = get_fuse_inode(inode);
 		if (flags & LOOKUP_RCU) {
+			fm = get_fuse_mount(inode);
+			if (fm->fc->has_open_atomic) {
+				/* Atomic open is preferred, as it does entry
+				 * revalidate and attribute refresh, but
+				 * DCACHE_ATOMIC_OPEN cannot be set in RCU mode
+				 */
+				if (flags & LOOKUP_OPEN)
+					return -ECHILD;
+			}
+
 			if (test_bit(FUSE_I_INIT_RDPLUS, &fi->state))
 				return -ECHILD;
 		} else if (test_and_clear_bit(FUSE_I_INIT_RDPLUS, &fi->state)) {
@@ -282,6 +299,14 @@ static int fuse_dentry_revalidate(struct dentry *entry, unsigned int flags)
 			fuse_advise_use_readdirplus(d_inode(parent));
 			dput(parent);
 		}
+
+		/* revalidate is skipped, but we still want atomic open to
+		 * update attributes during open
+		 */
+		fm = get_fuse_mount(inode);
+		ret = fuse_dentry_do_atomic_revalidate(entry, flags, fm->fc);
+		if (ret)
+			goto out;
 	}
 	ret = D_REVALIDATE_VALID;
 out:
-- 
2.39.2


^ permalink raw reply related	[flat|nested] 32+ messages in thread

* [PATCH v10 7/8] fuse: Avoid code duplication in atomic open
  2023-10-23 18:30 [PATCH v10 0/8] fuse: full atomic open and atomic-open-revalidate Bernd Schubert
                   ` (5 preceding siblings ...)
  2023-10-23 18:30 ` [PATCH v10 6/8] fuse: Return D_REVALIDATE_ATOMIC for cached dentries Bernd Schubert
@ 2023-10-23 18:30 ` Bernd Schubert
  2023-10-23 18:30 ` [PATCH v10 8/8] fuse atomic open: No fallback for symlinks, just call finish_no_open Bernd Schubert
  7 siblings, 0 replies; 32+ messages in thread
From: Bernd Schubert @ 2023-10-23 18:30 UTC (permalink / raw)
  To: linux-fsdevel
  Cc: bernd.schubert, miklos, dsingh, Bernd Schubert, Horst Birthelmer

The same code was used in fuse_atomic_open_revalidate()
_fuse_atomic_open().

Signed-off-by: Bernd Schubert <bschubert@ddn.com>
Cc: Miklos Szeredi <miklos@szeredi.hu>
Cc: Dharmendra Singh <dsingh@ddn.com>
Cc: Horst Birthelmer <hbirthelmer@ddn.com>
Cc: linux-fsdevel@vger.kernel.org

---
(If preferred, this could be merged into the main fuse atomic
revalidate patch). Or adding the function could be moved up
in the series.
---
 fs/fuse/dir.c | 51 +++++++++++++++++++++++++--------------------------
 1 file changed, 25 insertions(+), 26 deletions(-)

diff --git a/fs/fuse/dir.c b/fs/fuse/dir.c
index a770c0a6e022..c4564831af3c 100644
--- a/fs/fuse/dir.c
+++ b/fs/fuse/dir.c
@@ -801,6 +801,25 @@ static int fuse_create_open(struct inode *dir, struct dentry *entry,
 	return finish_no_open(file, res);
 }
 
+static struct dentry *fuse_atomic_open_alloc_dentry(struct dentry *entry,
+						    wait_queue_head_t *wq)
+{
+	struct dentry *new;
+
+	d_drop(entry);
+	new = d_alloc_parallel(entry->d_parent, &entry->d_name,
+			       wq);
+	if (IS_ERR(new))
+		return new;
+
+	/* XXX Can this happen at all and there a way to handle it? */
+	if (unlikely(!d_in_lookup(new))) {
+		dput(new);
+		new = ERR_PTR(-EIO);
+	}
+	return new;
+}
+
 /**
  * Revalidate inode hooked into dentry against freshly acquired
  * attributes. If inode is stale then allocate new dentry and
@@ -829,17 +848,9 @@ fuse_atomic_open_revalidate(struct fuse_conn *fc, struct dentry *entry,
 		struct dentry *new = NULL;
 
 		if (!switched && !d_in_lookup(entry)) {
-			d_drop(entry);
-			new = d_alloc_parallel(entry->d_parent, &entry->d_name,
-					       wq);
+			new = fuse_atomic_open_alloc_dentry(entry, wq);
 			if (IS_ERR(new))
 				return new;
-
-			if (unlikely(!d_in_lookup(new))) {
-				dput(new);
-				new = ERR_PTR(-EIO);
-				return new;
-			}
 		}
 
 		fuse_invalidate_entry(entry);
@@ -993,27 +1004,15 @@ static int _fuse_atomic_open(struct inode *dir, struct dentry *entry,
 
 	/* prevent racing/parallel lookup */
 	if (!(flags & O_CREAT) && !d_in_lookup(entry)) {
-		d_drop(entry);
-		switched_entry = d_alloc_parallel(entry->d_parent,
-						   &entry->d_name, &wq);
+		switched_entry = fuse_atomic_open_alloc_dentry(entry, &wq);
 		if (IS_ERR(switched_entry)) {
-			err = PTR_ERR(switched_entry);
-			switched_entry = NULL;
-			goto out_free_ff;
-		}
-
-		if (unlikely(!d_in_lookup(switched_entry))) {
-			/* fall back */
-			dput(switched_entry);
-			switched_entry = NULL;
-
 			if (!inode) {
+				switched_entry = NULL;
 				goto free_and_fallback;
 			} else {
-				/* XXX can this happen at all and is there a
-				 * better way to handle it?
-				 */
-				err = -EIO;
+				/* XXX Is there a better way to handle it? */
+				err = PTR_ERR(switched_entry);
+				switched_entry = NULL;
 				goto out_free_ff;
 			}
 		}
-- 
2.39.2


^ permalink raw reply related	[flat|nested] 32+ messages in thread

* [PATCH v10 8/8] fuse atomic open: No fallback for symlinks, just call finish_no_open
  2023-10-23 18:30 [PATCH v10 0/8] fuse: full atomic open and atomic-open-revalidate Bernd Schubert
                   ` (6 preceding siblings ...)
  2023-10-23 18:30 ` [PATCH v10 7/8] fuse: Avoid code duplication in atomic open Bernd Schubert
@ 2023-10-23 18:30 ` Bernd Schubert
  7 siblings, 0 replies; 32+ messages in thread
From: Bernd Schubert @ 2023-10-23 18:30 UTC (permalink / raw)
  To: linux-fsdevel
  Cc: bernd.schubert, miklos, dsingh, Bernd Schubert, Horst Birthelmer

Symlinks must not be opened as it would result in ELOOP,
but fallback to fuse_create_open is also
not ideal, as it would result in atomic-open + lookup for symlinks.
Atomic-open already carries all information lookup provides, so
just use that and then call finish_no_open instead of finish_open.

Codewise, as finish_no_open consumes a reference, compared to
finish_open, dput(alias) must not be called for symlinks.
Obviously, if we don't have an additional alias reference yet,
we need to get one for symlinks.

Signed-off-by: Bernd Schubert <bschubert@ddn.com>
Cc: Miklos Szeredi <miklos@szeredi.hu>
Cc: Dharmendra Singh <dsingh@ddn.com>
Cc: Horst Birthelmer <hbirthelmer@ddn.com>
Cc: linux-fsdevel@vger.kernel.org

---
(If preferred, this could be merged into the main fuse atomic
revalidate patch). Or adding the function could be moved up
in the series.
---
 fs/fuse/dir.c | 29 ++++++++++++++++-------------
 1 file changed, 16 insertions(+), 13 deletions(-)

diff --git a/fs/fuse/dir.c b/fs/fuse/dir.c
index c4564831af3c..e8cc33a8b3a2 100644
--- a/fs/fuse/dir.c
+++ b/fs/fuse/dir.c
@@ -978,9 +978,6 @@ static int _fuse_atomic_open(struct inode *dir, struct dentry *entry,
 				goto out_free_ff;
 			}
 			goto free_and_fallback;
-		} else if (err == -ELOOP) {
-			/* likely a symlink */
-			goto free_and_fallback;
 		} else {
 			if (d_really_is_positive(entry)) {
 				if (err != -EINTR && err != -ENOMEM)
@@ -1090,15 +1087,23 @@ static int _fuse_atomic_open(struct inode *dir, struct dentry *entry,
 		}
 	}
 
-	if (S_ISDIR(mode))
+	if (S_ISDIR(mode) || S_ISDIR(outentry.attr.mode))
 		ff->open_flags &= ~FOPEN_DIRECT_IO;
-	err = finish_open(file, entry, generic_file_open);
-	if (err) {
-		fi = get_fuse_inode(inode);
-		fuse_sync_release(fi, ff, flags);
-	} else {
-		file->private_data = ff;
-		fuse_finish_open(inode, file);
+
+	if (S_ISLNK(outentry.attr.mode)) {
+		err = finish_no_open(file, entry);
+		if (!alias)
+			dget(entry);
+	} else  {
+		err = finish_open(file, entry, generic_file_open);
+		if (err) {
+			fi = get_fuse_inode(inode);
+			fuse_sync_release(fi, ff, flags);
+		} else {
+			file->private_data = ff;
+			fuse_finish_open(inode, file);
+		}
+		dput(alias);
 	}
 
 	kfree(forget);
@@ -1108,8 +1113,6 @@ static int _fuse_atomic_open(struct inode *dir, struct dentry *entry,
 		dput(switched_entry);
 	}
 
-	dput(alias);
-
 	return err;
 
 out_free_ff:
-- 
2.39.2


^ permalink raw reply related	[flat|nested] 32+ messages in thread

* Re: [PATCH v10 4/8] [RFC] Allow atomic_open() on positive dentry (w/o O_CREAT)
  2023-10-23 18:30 ` [PATCH v10 4/8] [RFC] Allow atomic_open() on positive dentry (w/o O_CREAT) Bernd Schubert
@ 2023-10-23 23:21   ` kernel test robot
  2023-10-24 13:46   ` Bernd Schubert
  2023-10-28  4:46   ` Al Viro
  2 siblings, 0 replies; 32+ messages in thread
From: kernel test robot @ 2023-10-23 23:21 UTC (permalink / raw)
  To: Bernd Schubert; +Cc: oe-kbuild-all

Hi Bernd,

[This is a private test report for your RFC patch.]
kernel test robot noticed the following build warnings:

[auto build test WARNING on mszeredi-fuse/for-next]
[also build test WARNING on linus/master v6.6-rc7 next-20231023]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]

url:    https://github.com/intel-lab-lkp/linux/commits/Bernd-Schubert/fuse-rename-fuse_create_open/20231024-034655
base:   https://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse.git for-next
patch link:    https://lore.kernel.org/r/20231023183035.11035-5-bschubert%40ddn.com
patch subject: [PATCH v10 4/8] [RFC] Allow atomic_open() on positive dentry (w/o O_CREAT)
config: m68k-allyesconfig (https://download.01.org/0day-ci/archive/20231024/202310240755.yrNEPXQD-lkp@intel.com/config)
compiler: m68k-linux-gcc (GCC) 13.2.0
reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20231024/202310240755.yrNEPXQD-lkp@intel.com/reproduce)

If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202310240755.yrNEPXQD-lkp@intel.com/

All warnings (new ones prefixed by >>):

   fs/namei.c: In function 'atomic_revalidate_open':
>> fs/namei.c:3401:27: warning: unused variable 'idmap' [-Wunused-variable]
    3401 |         struct mnt_idmap *idmap;
         |                           ^~~~~


vim +/idmap +3401 fs/namei.c

  3394	
  3395	static struct dentry *atomic_revalidate_open(struct dentry *dentry,
  3396						     struct nameidata *nd,
  3397						     struct file *file,
  3398						     const struct open_flags *op,
  3399						     bool *got_write)
  3400	{
> 3401		struct mnt_idmap *idmap;
  3402		struct dentry *dir = nd->path.dentry;
  3403		struct inode *dir_inode = dir->d_inode;
  3404		int open_flag = op->open_flag;
  3405		umode_t mode = op->mode;
  3406	
  3407		if (unlikely(IS_DEADDIR(dir_inode)))
  3408			return ERR_PTR(-ENOENT);
  3409	
  3410		file->f_mode &= ~FMODE_CREATED;
  3411	
  3412		if (WARN_ON_ONCE(open_flag & O_CREAT))
  3413			return ERR_PTR(-EINVAL);
  3414	
  3415		if (open_flag & (O_TRUNC | O_WRONLY | O_RDWR))
  3416			*got_write = !mnt_want_write(nd->path.mnt);
  3417		else
  3418			*got_write = false;
  3419	
  3420		if (!*got_write)
  3421			open_flag &= ~O_TRUNC;
  3422	
  3423		inode_lock_shared(dir->d_inode);
  3424		dentry = atomic_open(nd, dentry, file, open_flag, mode);
  3425		inode_unlock_shared(dir->d_inode);
  3426	
  3427		return dentry;
  3428	

-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH v10 2/8] fuse: introduce atomic open
  2023-10-23 18:30 ` [PATCH v10 2/8] fuse: introduce atomic open Bernd Schubert
@ 2023-10-24 10:12   ` Yuan Yao
  2023-10-24 12:36     ` Bernd Schubert
  2023-10-28  3:03   ` [PATCH v10 2/8] fuse: introduce atomic open Al Viro
                     ` (2 subsequent siblings)
  3 siblings, 1 reply; 32+ messages in thread
From: Yuan Yao @ 2023-10-24 10:12 UTC (permalink / raw)
  To: Bernd Schubert
  Cc: linux-fsdevel, bernd.schubert, miklos, dsingh, Horst Birthelmer,
	Christian Brauner, Al Viro, Keiichi Watanabe, Takaya Saeki

Thank you for addressing the symbolic link problems!
Additionally, I found an incompatibility with the no_open feature.
When the FUSE server has the no_open feature enabled, the atomic_open
FUSE request returns a d_entry with an empty file handler and open
option FOPEN_KEEP_CACHE (for files) or FOPEN_CACHE_DIR (for
directories). With this situation, if we can set fc->no_open = 1 or
fc->no_opendir = 1 after receiving the such FUSE reply, as follows,
will make the atomic_open compatible with no_open feature:
+       if (!inode) {
+               flags &= ~(O_CREAT | O_EXCL | O_TRUNC);
+               fuse_sync_release(NULL, ff, flags);
+               fuse_queue_forget(fm->fc, forget, outentry.nodeid, 1);
+               err = -ENOMEM;
+               goto out_err;
+       }
+
+ if(ff->fh == 0) {
+        if(ff->open_flags & FOPEN_KEEP_CACHE)
+            fc->no_open = 1;
+        if(ff->open_flags & FOPEN_CACHE_DIR)
+          fc->no_opendir = 1;
+       }
+
+       /* prevent racing/parallel lookup on a negative hashed */


On Tue, Oct 24, 2023 at 3:41 AM Bernd Schubert <bschubert@ddn.com> wrote:
>
> From: Dharmendra Singh <dsingh@ddn.com>
>
> This adds full atomic open support, to avoid lookup before open/create.
> If the implementation (fuse server/daemon) does not support atomic open
> it falls back to non-atomic open.
>
> Co-developed-by: Bernd Schubert <bschubert@ddn.com>
> Signed-off-by: Bernd Schubert <bschubert@ddn.com>
> Signed-off-by: Dharmendra Singh <dsingh@ddn.com>
> Signed-off-by: Horst Birthelmer <hbirthelmer@ddn.com>
> Cc: Miklos Szeredi <miklos@szeredi.hu>
> Cc: Christian Brauner <brauner@kernel.org>
> Cc: Al Viro <viro@zeniv.linux.org.uk>
> Cc: Dharmendra Singh <dsingh@ddn.com>
> Cc: linux-fsdevel@vger.kernel.org
> ---
>  fs/fuse/dir.c             | 214 +++++++++++++++++++++++++++++++++++++-
>  fs/fuse/fuse_i.h          |   3 +
>  include/uapi/linux/fuse.h |   3 +
>  3 files changed, 219 insertions(+), 1 deletion(-)
>
> diff --git a/fs/fuse/dir.c b/fs/fuse/dir.c
> index e1095852601c..61cdb8e5f68e 100644
> --- a/fs/fuse/dir.c
> +++ b/fs/fuse/dir.c
> @@ -716,7 +716,7 @@ static int _fuse_create_open(struct inode *dir, struct dentry *entry,
>
>  static int fuse_mknod(struct mnt_idmap *, struct inode *, struct dentry *,
>                       umode_t, dev_t);
> -static int fuse_atomic_open(struct inode *dir, struct dentry *entry,
> +static int fuse_create_open(struct inode *dir, struct dentry *entry,
>                             struct file *file, unsigned flags,
>                             umode_t mode)
>  {
> @@ -763,6 +763,218 @@ static int fuse_atomic_open(struct inode *dir, struct dentry *entry,
>         return finish_no_open(file, res);
>  }
>
> +static int _fuse_atomic_open(struct inode *dir, struct dentry *entry,
> +                            struct file *file, unsigned int flags,
> +                            umode_t mode)
> +{
> +       int err;
> +       struct inode *inode;
> +       FUSE_ARGS(args);
> +       struct fuse_mount *fm = get_fuse_mount(dir);
> +       struct fuse_conn *fc = fm->fc;
> +       struct fuse_forget_link *forget;
> +       struct fuse_create_in inarg;
> +       struct fuse_open_out outopen;
> +       struct fuse_entry_out outentry;
> +       struct fuse_inode *fi;
> +       struct fuse_file *ff;
> +       struct dentry *switched_entry = NULL, *alias = NULL;
> +       DECLARE_WAIT_QUEUE_HEAD_ONSTACK(wq);
> +
> +       /* Expect a negative dentry */
> +       if (unlikely(d_inode(entry)))
> +               goto fallback;
> +
> +       /* Userspace expects S_IFREG in create mode */
> +       if ((flags & O_CREAT) && (mode & S_IFMT) != S_IFREG)
> +               goto fallback;
> +
> +       forget = fuse_alloc_forget();
> +       err = -ENOMEM;
> +       if (!forget)
> +               goto out_err;
> +
> +       err = -ENOMEM;
> +       ff = fuse_file_alloc(fm);
> +       if (!ff)
> +               goto out_put_forget_req;
> +
> +       if (!fc->dont_mask)
> +               mode &= ~current_umask();
> +
> +       flags &= ~O_NOCTTY;
> +       memset(&inarg, 0, sizeof(inarg));
> +       memset(&outentry, 0, sizeof(outentry));
> +       inarg.flags = flags;
> +       inarg.mode = mode;
> +       inarg.umask = current_umask();
> +
> +       if (fc->handle_killpriv_v2 && (flags & O_TRUNC) &&
> +           !(flags & O_EXCL) && !capable(CAP_FSETID)) {
> +               inarg.open_flags |= FUSE_OPEN_KILL_SUIDGID;
> +       }
> +
> +       args.opcode = FUSE_OPEN_ATOMIC;
> +       args.nodeid = get_node_id(dir);
> +       args.in_numargs = 2;
> +       args.in_args[0].size = sizeof(inarg);
> +       args.in_args[0].value = &inarg;
> +       args.in_args[1].size = entry->d_name.len + 1;
> +       args.in_args[1].value = entry->d_name.name;
> +       args.out_numargs = 2;
> +       args.out_args[0].size = sizeof(outentry);
> +       args.out_args[0].value = &outentry;
> +       args.out_args[1].size = sizeof(outopen);
> +       args.out_args[1].value = &outopen;
> +
> +       if (flags & O_CREAT) {
> +               err = get_create_ext(&args, dir, entry, mode);
> +               if (err)
> +                       goto out_free_ff;
> +       }
> +
> +       err = fuse_simple_request(fm, &args);
> +       free_ext_value(&args);
> +       if (err == -ENOSYS || err == -ELOOP) {
> +               if (unlikely(err == -ENOSYS))
> +                       fc->no_open_atomic = 1;
> +               goto free_and_fallback;
> +       }
> +
> +       if (!err && !outentry.nodeid)
> +               err = -ENOENT;
> +
> +       if (err)
> +               goto out_free_ff;
> +
> +       err = -EIO;
> +       if (invalid_nodeid(outentry.nodeid) || fuse_invalid_attr(&outentry.attr))
> +               goto out_free_ff;
> +
> +       ff->fh = outopen.fh;
> +       ff->nodeid = outentry.nodeid;
> +       ff->open_flags = outopen.open_flags;
> +       inode = fuse_iget(dir->i_sb, outentry.nodeid, outentry.generation,
> +                         &outentry.attr, ATTR_TIMEOUT(&outentry), 0);
> +       if (!inode) {
> +               flags &= ~(O_CREAT | O_EXCL | O_TRUNC);
> +               fuse_sync_release(NULL, ff, flags);
> +               fuse_queue_forget(fm->fc, forget, outentry.nodeid, 1);
> +               err = -ENOMEM;
> +               goto out_err;
> +       }
> +
> +       /* prevent racing/parallel lookup on a negative hashed */
> +       if (!(flags & O_CREAT) && !d_in_lookup(entry)) {
> +               d_drop(entry);
> +               switched_entry = d_alloc_parallel(entry->d_parent,
> +                                                  &entry->d_name, &wq);
> +               if (IS_ERR(switched_entry)) {
> +                       err = PTR_ERR(switched_entry);
> +                       switched_entry = NULL;
> +                       goto out_free_ff;
> +               }
> +
> +               if (unlikely(!d_in_lookup(switched_entry))) {
> +                       /* fall back */
> +                       dput(switched_entry);
> +                       switched_entry = NULL;
> +                       goto free_and_fallback;
> +               }
> +
> +               entry = switched_entry;
> +       }
> +
> +       if (d_really_is_negative(entry)) {
> +               d_drop(entry);
> +               alias = d_exact_alias(entry, inode);
> +               if (!alias) {
> +                       alias = d_splice_alias(inode, entry);
> +                       if (IS_ERR(alias)) {
> +                               /*
> +                                * Close the file in user space, but do not unlink it,
> +                                * if it was created - with network file systems other
> +                                * clients might have already accessed it.
> +                                */
> +                               fi = get_fuse_inode(inode);
> +                               fuse_sync_release(fi, ff, flags);
> +                               fuse_queue_forget(fm->fc, forget, outentry.nodeid, 1);
> +                               err = PTR_ERR(alias);
> +                               goto out_err;
> +                       }
> +               }
> +
> +               if (alias)
> +                       entry = alias;
> +       }
> +
> +       fuse_change_entry_timeout(entry, &outentry);
> +
> +       /*  File was indeed created */
> +       if (outopen.open_flags & FOPEN_FILE_CREATED) {
> +               if (!(flags & O_CREAT)) {
> +                       pr_debug("Server side bug, FOPEN_FILE_CREATED set "
> +                                "without O_CREAT, ignoring.");
> +               } else {
> +                       /* This should be always set when the file is created */
> +                       fuse_dir_changed(dir);
> +                       file->f_mode |= FMODE_CREATED;
> +               }
> +       }
> +
> +       if (S_ISDIR(mode))
> +               ff->open_flags &= ~FOPEN_DIRECT_IO;
> +       err = finish_open(file, entry, generic_file_open);
> +       if (err) {
> +               fi = get_fuse_inode(inode);
> +               fuse_sync_release(fi, ff, flags);
> +       } else {
> +               file->private_data = ff;
> +               fuse_finish_open(inode, file);
> +       }
> +
> +       kfree(forget);
> +
> +       if (switched_entry) {
> +               d_lookup_done(switched_entry);
> +               dput(switched_entry);
> +       }
> +
> +       dput(alias);
> +
> +       return err;
> +
> +out_free_ff:
> +       fuse_file_free(ff);
> +out_put_forget_req:
> +       kfree(forget);
> +out_err:
> +       if (switched_entry) {
> +               d_lookup_done(switched_entry);
> +               dput(switched_entry);
> +       }
> +
> +       return err;
> +
> +free_and_fallback:
> +       fuse_file_free(ff);
> +       kfree(forget);
> +fallback:
> +       return fuse_create_open(dir, entry, file, flags, mode);
> +}
> +
> +static int fuse_atomic_open(struct inode *dir, struct dentry *entry,
> +                           struct file *file, unsigned int flags,
> +                           umode_t mode)
> +{
> +       struct fuse_conn *fc = get_fuse_conn(dir);
> +
> +       if (fc->no_open_atomic)
> +               return fuse_create_open(dir, entry, file, flags, mode);
> +       else
> +               return _fuse_atomic_open(dir, entry, file, flags, mode);
> +}
> +
>  /*
>   * Code shared between mknod, mkdir, symlink and link
>   */
> diff --git a/fs/fuse/fuse_i.h b/fs/fuse/fuse_i.h
> index bf0b85d0b95c..af69578763ef 100644
> --- a/fs/fuse/fuse_i.h
> +++ b/fs/fuse/fuse_i.h
> @@ -677,6 +677,9 @@ struct fuse_conn {
>         /** Is open/release not implemented by fs? */
>         unsigned no_open:1;
>
> +       /** Is open atomic not implemented by fs? */
> +       unsigned no_open_atomic:1;
> +
>         /** Is opendir/releasedir not implemented by fs? */
>         unsigned no_opendir:1;
>
> diff --git a/include/uapi/linux/fuse.h b/include/uapi/linux/fuse.h
> index db92a7202b34..1508afbd9446 100644
> --- a/include/uapi/linux/fuse.h
> +++ b/include/uapi/linux/fuse.h
> @@ -353,6 +353,7 @@ struct fuse_file_lock {
>   * FOPEN_STREAM: the file is stream-like (no file position at all)
>   * FOPEN_NOFLUSH: don't flush data cache on close (unless FUSE_WRITEBACK_CACHE)
>   * FOPEN_PARALLEL_DIRECT_WRITES: Allow concurrent direct writes on the same inode
> + * FOPEN_FILE_CREATED: the file was indeed created
>   */
>  #define FOPEN_DIRECT_IO                (1 << 0)
>  #define FOPEN_KEEP_CACHE       (1 << 1)
> @@ -361,6 +362,7 @@ struct fuse_file_lock {
>  #define FOPEN_STREAM           (1 << 4)
>  #define FOPEN_NOFLUSH          (1 << 5)
>  #define FOPEN_PARALLEL_DIRECT_WRITES   (1 << 6)
> +#define FOPEN_FILE_CREATED     (1 << 7)
>
>  /**
>   * INIT request/reply flags
> @@ -617,6 +619,7 @@ enum fuse_opcode {
>         FUSE_SYNCFS             = 50,
>         FUSE_TMPFILE            = 51,
>         FUSE_STATX              = 52,
> +       FUSE_OPEN_ATOMIC        = 53,
>
>         /* CUSE specific operations */
>         CUSE_INIT               = 4096,
> --
> 2.39.2
>
>

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH v10 2/8] fuse: introduce atomic open
  2023-10-24 10:12   ` Yuan Yao
@ 2023-10-24 12:36     ` Bernd Schubert
  2023-10-26  2:42       ` Yuan Yao
  0 siblings, 1 reply; 32+ messages in thread
From: Bernd Schubert @ 2023-10-24 12:36 UTC (permalink / raw)
  To: Yuan Yao, Bernd Schubert
  Cc: linux-fsdevel, miklos, dsingh, Horst Birthelmer,
	Christian Brauner, Al Viro, Keiichi Watanabe, Takaya Saeki



On 10/24/23 12:12, Yuan Yao wrote:
> Thank you for addressing the symbolic link problems!
> Additionally, I found an incompatibility with the no_open feature.
> When the FUSE server has the no_open feature enabled, the atomic_open
> FUSE request returns a d_entry with an empty file handler and open
> option FOPEN_KEEP_CACHE (for files) or FOPEN_CACHE_DIR (for
> directories). With this situation, if we can set fc->no_open = 1 or
> fc->no_opendir = 1 after receiving the such FUSE reply, as follows,
> will make the atomic_open compatible with no_open feature:
> +       if (!inode) {
> +               flags &= ~(O_CREAT | O_EXCL | O_TRUNC);
> +               fuse_sync_release(NULL, ff, flags);
> +               fuse_queue_forget(fm->fc, forget, outentry.nodeid, 1);
> +               err = -ENOMEM;
> +               goto out_err;
> +       }
> +
> + if(ff->fh == 0) {
> +        if(ff->open_flags & FOPEN_KEEP_CACHE)
> +            fc->no_open = 1;
> +        if(ff->open_flags & FOPEN_CACHE_DIR)
> +          fc->no_opendir = 1;
> +       }
> +
> +       /* prevent racing/parallel lookup on a negative hashed */
> 

Thanks again for your review!

Hmm, are you sure atomic open needs to handle no-open? fuse_file_open 
sets no-open / no-opendir on -ENOSYS. _fuse_atomic_open has a handler 
for -ENOSYS and falls back to the existing create_open. So why does 
atomic open need a no-open handling?


Thanks,
Bernd





^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH v10 4/8] [RFC] Allow atomic_open() on positive dentry (w/o O_CREAT)
  2023-10-23 18:30 ` [PATCH v10 4/8] [RFC] Allow atomic_open() on positive dentry (w/o O_CREAT) Bernd Schubert
  2023-10-23 23:21   ` kernel test robot
@ 2023-10-24 13:46   ` Bernd Schubert
  2023-10-28  4:46   ` Al Viro
  2 siblings, 0 replies; 32+ messages in thread
From: Bernd Schubert @ 2023-10-24 13:46 UTC (permalink / raw)
  To: linux-fsdevel
  Cc: bernd.schubert, miklos, Dharmendra Singh, Christian Brauner,
	Amir Goldstein, Al Viro

On 10/23/23 20:30, Bernd Schubert wrote:
> Previous patch allowed atomic-open on a positive dentry when
> O_CREAT was set (in lookup_open). This adds in atomic-open
> when O_CREAT is not set.
> 
> Code wise it would be possible to just drop the dentry in
> open_last_lookups and then fall through to lookup_open.
> But then this would add some overhead for dentry drop,
> re-lookup and actually also call into d_revalidate.
> So as suggested by Miklos, this adds a helper function
> (atomic_revalidate_open) to immediately open the dentry
> with atomic_open.
> 
> Signed-off-by: Bernd Schubert <bschubert@ddn.com>
> Cc: Miklos Szeredi <miklos@szeredi.hu>
> Cc: Dharmendra Singh <dsingh@ddn.com>
> Cc: Christian Brauner <brauner@kernel.org>
> Cc: Amir Goldstein <amir73il@gmail.com>
> Cc: Al Viro <viro@zeniv.linux.org.uk>
> Cc: linux-fsdevel@vger.kernel.org
> ---
>   fs/namei.c | 66 +++++++++++++++++++++++++++++++++++++++++++++++++++---
>   1 file changed, 63 insertions(+), 3 deletions(-)
> 
> diff --git a/fs/namei.c b/fs/namei.c
> index ff913e6b12b4..5e2d569ffe38 100644
> --- a/fs/namei.c
> +++ b/fs/namei.c
> @@ -1614,10 +1614,11 @@ struct dentry *lookup_one_qstr_excl(const struct qstr *name,
>   }
>   EXPORT_SYMBOL(lookup_one_qstr_excl);
>   
> -static struct dentry *lookup_fast(struct nameidata *nd)
> +static struct dentry *lookup_fast(struct nameidata *nd, bool *atomic_revalidate)
>   {
>   	struct dentry *dentry, *parent = nd->path.dentry;
>   	int status = 1;
> +	*atomic_revalidate = false;
>   
>   	/*
>   	 * Rename seqlock is not required here because in the off chance
> @@ -1659,6 +1660,10 @@ static struct dentry *lookup_fast(struct nameidata *nd)
>   		dput(dentry);
>   		return ERR_PTR(status);
>   	}
> +
> +	if (status == D_REVALIDATE_ATOMIC)
> +		*atomic_revalidate = true;
> +
>   	return dentry;
>   }
>   
> @@ -1984,6 +1989,7 @@ static const char *handle_dots(struct nameidata *nd, int type)
>   static const char *walk_component(struct nameidata *nd, int flags)
>   {
>   	struct dentry *dentry;
> +	bool atomic_revalidate;
>   	/*
>   	 * "." and ".." are special - ".." especially so because it has
>   	 * to be able to know about the current root directory and
> @@ -1994,7 +2000,7 @@ static const char *walk_component(struct nameidata *nd, int flags)
>   			put_link(nd);
>   		return handle_dots(nd, nd->last_type);
>   	}
> -	dentry = lookup_fast(nd);
> +	dentry = lookup_fast(nd, &atomic_revalidate);
>   	if (IS_ERR(dentry))
>   		return ERR_CAST(dentry);
>   	if (unlikely(!dentry)) {
> @@ -2002,6 +2008,9 @@ static const char *walk_component(struct nameidata *nd, int flags)
>   		if (IS_ERR(dentry))
>   			return ERR_CAST(dentry);
>   	}
> +
> +	WARN_ON_ONCE(atomic_revalidate);
> +
>   	if (!(flags & WALK_MORE) && nd->depth)
>   		put_link(nd);
>   	return step_into(nd, flags, dentry);
> @@ -3383,6 +3392,42 @@ static struct dentry *atomic_open(struct nameidata *nd, struct dentry *dentry,
>   	return dentry;
>   }
>   
> +static struct dentry *atomic_revalidate_open(struct dentry *dentry,
> +					     struct nameidata *nd,
> +					     struct file *file,
> +					     const struct open_flags *op,
> +					     bool *got_write)
> +{
> +	struct mnt_idmap *idmap;

As kernel test robot noticed, idmap is unused in this function.

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH v10 2/8] fuse: introduce atomic open
  2023-10-24 12:36     ` Bernd Schubert
@ 2023-10-26  2:42       ` Yuan Yao
  2023-11-29  6:46         ` [PATCH 0/1] Adapt atomic open to fuse no_open/no_open_dir Yuan Yao
  0 siblings, 1 reply; 32+ messages in thread
From: Yuan Yao @ 2023-10-26  2:42 UTC (permalink / raw)
  To: Bernd Schubert
  Cc: Bernd Schubert, linux-fsdevel, miklos, dsingh, Horst Birthelmer,
	Christian Brauner, Al Viro, Keiichi Watanabe, Takaya Saeki

Currently, if _fuse_atomic_open receives an -ENOSYS error, the
no_open_atomic flag is set, preventing further use of atomic_open.
However, even with the no_open feature enabled, atomic_open can still
provide performance benefits when creating new files due to its
ability to combine FUSE lookup and FUSE create operations into a
single atomic request. Therefore, I think it would be advantageous to
allow these two features to coexist.


Thanks,
Yuan

On Tue, Oct 24, 2023 at 9:36 PM Bernd Schubert
<bernd.schubert@fastmail.fm> wrote:
>
>
>
> On 10/24/23 12:12, Yuan Yao wrote:
> > Thank you for addressing the symbolic link problems!
> > Additionally, I found an incompatibility with the no_open feature.
> > When the FUSE server has the no_open feature enabled, the atomic_open
> > FUSE request returns a d_entry with an empty file handler and open
> > option FOPEN_KEEP_CACHE (for files) or FOPEN_CACHE_DIR (for
> > directories). With this situation, if we can set fc->no_open = 1 or
> > fc->no_opendir = 1 after receiving the such FUSE reply, as follows,
> > will make the atomic_open compatible with no_open feature:
> > +       if (!inode) {
> > +               flags &= ~(O_CREAT | O_EXCL | O_TRUNC);
> > +               fuse_sync_release(NULL, ff, flags);
> > +               fuse_queue_forget(fm->fc, forget, outentry.nodeid, 1);
> > +               err = -ENOMEM;
> > +               goto out_err;
> > +       }
> > +
> > + if(ff->fh == 0) {
> > +        if(ff->open_flags & FOPEN_KEEP_CACHE)
> > +            fc->no_open = 1;
> > +        if(ff->open_flags & FOPEN_CACHE_DIR)
> > +          fc->no_opendir = 1;
> > +       }
> > +
> > +       /* prevent racing/parallel lookup on a negative hashed */
> >
>
> Thanks again for your review!
>
> Hmm, are you sure atomic open needs to handle no-open? fuse_file_open
> sets no-open / no-opendir on -ENOSYS. _fuse_atomic_open has a handler
> for -ENOSYS and falls back to the existing create_open. So why does
> atomic open need a no-open handling?
>
>
> Thanks,
> Bernd
>
>
>
>

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH v10 2/8] fuse: introduce atomic open
  2023-10-23 18:30 ` [PATCH v10 2/8] fuse: introduce atomic open Bernd Schubert
  2023-10-24 10:12   ` Yuan Yao
@ 2023-10-28  3:03   ` Al Viro
  2023-10-30 15:21     ` Bernd Schubert
  2024-01-23  8:40   ` [PATCH 0/1] Fix-atomic_open-not-using-negative-d_entry Yuan Yao
  2024-03-14 10:34   ` [PATCH] fuse: Do NULL check instead of IS_ERR in atomic_open Keiichi Watanabe
  3 siblings, 1 reply; 32+ messages in thread
From: Al Viro @ 2023-10-28  3:03 UTC (permalink / raw)
  To: Bernd Schubert
  Cc: linux-fsdevel, bernd.schubert, miklos, dsingh, Horst Birthelmer,
	Christian Brauner

On Mon, Oct 23, 2023 at 08:30:29PM +0200, Bernd Schubert wrote:
> +{
> +	int err;
> +	struct inode *inode;
> +	FUSE_ARGS(args);
> +	struct fuse_mount *fm = get_fuse_mount(dir);
> +	struct fuse_conn *fc = fm->fc;
> +	struct fuse_forget_link *forget;
> +	struct fuse_create_in inarg;
> +	struct fuse_open_out outopen;
> +	struct fuse_entry_out outentry;
> +	struct fuse_inode *fi;
> +	struct fuse_file *ff;
> +	struct dentry *switched_entry = NULL, *alias = NULL;
> +	DECLARE_WAIT_QUEUE_HEAD_ONSTACK(wq);
> +
> +	/* Expect a negative dentry */
> +	if (unlikely(d_inode(entry)))
> +		goto fallback;
> +
> +	/* Userspace expects S_IFREG in create mode */
> +	if ((flags & O_CREAT) && (mode & S_IFMT) != S_IFREG)
> +		goto fallback;

How could it get there with such mode?  We could check that
in fs/namei.c:atomic_open() (with
	if (WARN_ON_ONCE((open_flags & O_CREAT) && !S_ISREG(mode)))
		error = -EINVAL; // for the lack of EWTFAREYOUSMOKING
	else
		error = dir->i_op->atomic_open(....)
or something similar), but that doesn't belong in the method instances.
And it really should never happen - that thing comes from op->mode and
we have build_open_flags() doing this:
        if (WILL_CREATE(flags)) {
                if (how->mode & ~S_IALLUGO)
                        return -EINVAL;
                op->mode = how->mode | S_IFREG;
        } else {
                if (how->mode != 0)
                        return -EINVAL;
                op->mode = 0;
        }
so...  Are other instances doing the same kind of paranoia?  That BUG_ON()
in current fuse_atomic_open() is bogus (and seriously misplaced)...

> +	forget = fuse_alloc_forget();
> +	err = -ENOMEM;
> +	if (!forget)
> +		goto out_err;
> +
> +	err = -ENOMEM;
> +	ff = fuse_file_alloc(fm);
> +	if (!ff)
> +		goto out_put_forget_req;
> +
> +	if (!fc->dont_mask)
> +		mode &= ~current_umask();
> +
> +	flags &= ~O_NOCTTY;
> +	memset(&inarg, 0, sizeof(inarg));
> +	memset(&outentry, 0, sizeof(outentry));
> +	inarg.flags = flags;
> +	inarg.mode = mode;
> +	inarg.umask = current_umask();
> +
> +	if (fc->handle_killpriv_v2 && (flags & O_TRUNC) &&
> +	    !(flags & O_EXCL) && !capable(CAP_FSETID)) {
> +		inarg.open_flags |= FUSE_OPEN_KILL_SUIDGID;
> +	}
> +
> +	args.opcode = FUSE_OPEN_ATOMIC;
> +	args.nodeid = get_node_id(dir);
> +	args.in_numargs = 2;
> +	args.in_args[0].size = sizeof(inarg);
> +	args.in_args[0].value = &inarg;
> +	args.in_args[1].size = entry->d_name.len + 1;
> +	args.in_args[1].value = entry->d_name.name;
> +	args.out_numargs = 2;
> +	args.out_args[0].size = sizeof(outentry);
> +	args.out_args[0].value = &outentry;
> +	args.out_args[1].size = sizeof(outopen);
> +	args.out_args[1].value = &outopen;
> +
> +	if (flags & O_CREAT) {
> +		err = get_create_ext(&args, dir, entry, mode);
> +		if (err)
> +			goto out_free_ff;
> +	}
> +
> +	err = fuse_simple_request(fm, &args);
> +	free_ext_value(&args);
> +	if (err == -ENOSYS || err == -ELOOP) {
> +		if (unlikely(err == -ENOSYS))
> +			fc->no_open_atomic = 1;
> +		goto free_and_fallback;
> +	}
> +
> +	if (!err && !outentry.nodeid)
> +		err = -ENOENT;
> +
> +	if (err)
> +		goto out_free_ff;
> +
> +	err = -EIO;
> +	if (invalid_nodeid(outentry.nodeid) || fuse_invalid_attr(&outentry.attr))
> +		goto out_free_ff;
> +
> +	ff->fh = outopen.fh;
> +	ff->nodeid = outentry.nodeid;
> +	ff->open_flags = outopen.open_flags;
> +	inode = fuse_iget(dir->i_sb, outentry.nodeid, outentry.generation,
> +			  &outentry.attr, ATTR_TIMEOUT(&outentry), 0);
> +	if (!inode) {
> +		flags &= ~(O_CREAT | O_EXCL | O_TRUNC);
> +		fuse_sync_release(NULL, ff, flags);
> +		fuse_queue_forget(fm->fc, forget, outentry.nodeid, 1);
> +		err = -ENOMEM;
> +		goto out_err;
> +	}
> +
> +	/* prevent racing/parallel lookup on a negative hashed */
> +	if (!(flags & O_CREAT) && !d_in_lookup(entry)) {

... in which case it has just passed ->d_revalidate()...

> +		d_drop(entry);
> +		switched_entry = d_alloc_parallel(entry->d_parent,
> +						   &entry->d_name, &wq);
> +		if (IS_ERR(switched_entry)) {
> +			err = PTR_ERR(switched_entry);
> +			switched_entry = NULL;
> +			goto out_free_ff;

leaked inode?

> +		}
> +
> +		if (unlikely(!d_in_lookup(switched_entry))) {
> +			/* fall back */
> +			dput(switched_entry);
> +			switched_entry = NULL;
> +			goto free_and_fallback;

ditto, and I don't really understand what the hell is going on with
dentry references here.  What is the intended behaviour in that case?

> +		}
> +
> +		entry = switched_entry;
> +	}
> +
> +	if (d_really_is_negative(entry)) {
> +		d_drop(entry);
> +		alias = d_exact_alias(entry, inode);

What case is that about?  "We have an unhashed positive dentry with that
exact name, parent and inode"?  Where would it have come from?

Another thing: this does not consume an inode reference, no matter what
gets returned,

> +		if (!alias) {
> +			alias = d_splice_alias(inode, entry);

but that one *does* consume the inode reference; note the igrab() in
nfs4 code where you've nicked that from...

> +			if (IS_ERR(alias)) {
> +				/*
> +				 * Close the file in user space, but do not unlink it,
> +				 * if it was created - with network file systems other
> +				 * clients might have already accessed it.
> +				 */
> +				fi = get_fuse_inode(inode);

... so this is asking for UAF.

> +				fuse_sync_release(fi, ff, flags);
> +				fuse_queue_forget(fm->fc, forget, outentry.nodeid, 1);
> +				err = PTR_ERR(alias);
> +				goto out_err;
> +			}
> +		}
> +
> +		if (alias)
> +			entry = alias;
> +	}

... and here we have no way to tell if inode needs to be dropped.

> +
> +	fuse_change_entry_timeout(entry, &outentry);
> +
> +	/*  File was indeed created */
> +	if (outopen.open_flags & FOPEN_FILE_CREATED) {
> +		if (!(flags & O_CREAT)) {
> +			pr_debug("Server side bug, FOPEN_FILE_CREATED set "
> +				 "without O_CREAT, ignoring.");
> +		} else {
> +			/* This should be always set when the file is created */
> +			fuse_dir_changed(dir);
> +			file->f_mode |= FMODE_CREATED;
> +		}
> +	}
> +
> +	if (S_ISDIR(mode))
> +		ff->open_flags &= ~FOPEN_DIRECT_IO;
> +	err = finish_open(file, entry, generic_file_open);
> +	if (err) {
> +		fi = get_fuse_inode(inode);
> +		fuse_sync_release(fi, ff, flags);
> +	} else {
> +		file->private_data = ff;
> +		fuse_finish_open(inode, file);
> +	}
> +
> +	kfree(forget);
> +
> +	if (switched_entry) {
> +		d_lookup_done(switched_entry);
> +		dput(switched_entry);
> +	}
> +
> +	dput(alias);
> +
> +	return err;
> +
> +out_free_ff:
> +	fuse_file_free(ff);
> +out_put_forget_req:
> +	kfree(forget);
> +out_err:
> +	if (switched_entry) {
> +		d_lookup_done(switched_entry);
> +		dput(switched_entry);
> +	}
> +
> +	return err;
> +
> +free_and_fallback:
> +	fuse_file_free(ff);
> +	kfree(forget);
> +fallback:
> +	return fuse_create_open(dir, entry, file, flags, mode);
> +}

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH v10 4/8] [RFC] Allow atomic_open() on positive dentry (w/o O_CREAT)
  2023-10-23 18:30 ` [PATCH v10 4/8] [RFC] Allow atomic_open() on positive dentry (w/o O_CREAT) Bernd Schubert
  2023-10-23 23:21   ` kernel test robot
  2023-10-24 13:46   ` Bernd Schubert
@ 2023-10-28  4:46   ` Al Viro
  2 siblings, 0 replies; 32+ messages in thread
From: Al Viro @ 2023-10-28  4:46 UTC (permalink / raw)
  To: Bernd Schubert
  Cc: linux-fsdevel, bernd.schubert, miklos, dsingh, Christian Brauner,
	Amir Goldstein

On Mon, Oct 23, 2023 at 08:30:31PM +0200, Bernd Schubert wrote:

> Previous patch allowed atomic-open on a positive dentry when
> O_CREAT was set (in lookup_open). This adds in atomic-open
> when O_CREAT is not set.
> 
> Code wise it would be possible to just drop the dentry in
> open_last_lookups and then fall through to lookup_open.
> But then this would add some overhead for dentry drop,
> re-lookup and actually also call into d_revalidate.
> So as suggested by Miklos, this adds a helper function
> (atomic_revalidate_open) to immediately open the dentry
> with atomic_open.
> 
> Signed-off-by: Bernd Schubert <bschubert@ddn.com>
> Cc: Miklos Szeredi <miklos@szeredi.hu>
> Cc: Dharmendra Singh <dsingh@ddn.com>
> Cc: Christian Brauner <brauner@kernel.org>
> Cc: Amir Goldstein <amir73il@gmail.com>
> Cc: Al Viro <viro@zeniv.linux.org.uk>
> Cc: linux-fsdevel@vger.kernel.org
> ---
>  fs/namei.c | 66 +++++++++++++++++++++++++++++++++++++++++++++++++++---
>  1 file changed, 63 insertions(+), 3 deletions(-)

This is bloody awful.
 
> diff --git a/fs/namei.c b/fs/namei.c
> index ff913e6b12b4..5e2d569ffe38 100644
> --- a/fs/namei.c
> +++ b/fs/namei.c
> @@ -1614,10 +1614,11 @@ struct dentry *lookup_one_qstr_excl(const struct qstr *name,
>  }
>  EXPORT_SYMBOL(lookup_one_qstr_excl);
>  
> -static struct dentry *lookup_fast(struct nameidata *nd)
> +static struct dentry *lookup_fast(struct nameidata *nd, bool *atomic_revalidate)

Yechhh...  Note that absolute majority of calls will be nowhere near
the case when that atomic_revalidate thing might possibly be set.

>  {
>  	struct dentry *dentry, *parent = nd->path.dentry;
>  	int status = 1;
> +	*atomic_revalidate = false;
>  
>  	/*
>  	 * Rename seqlock is not required here because in the off chance
> @@ -1659,6 +1660,10 @@ static struct dentry *lookup_fast(struct nameidata *nd)
>  		dput(dentry);
>  		return ERR_PTR(status);
>  	}
> +
> +	if (status == D_REVALIDATE_ATOMIC)
> +		*atomic_revalidate = true;
> +
>  	return dentry;
>  }

 
> @@ -1984,6 +1989,7 @@ static const char *handle_dots(struct nameidata *nd, int type)
>  static const char *walk_component(struct nameidata *nd, int flags)
>  {
>  	struct dentry *dentry;
> +	bool atomic_revalidate;
>  	/*
>  	 * "." and ".." are special - ".." especially so because it has
>  	 * to be able to know about the current root directory and
> @@ -1994,7 +2000,7 @@ static const char *walk_component(struct nameidata *nd, int flags)
>  			put_link(nd);
>  		return handle_dots(nd, nd->last_type);
>  	}
> -	dentry = lookup_fast(nd);
> +	dentry = lookup_fast(nd, &atomic_revalidate);
>  	if (IS_ERR(dentry))
>  		return ERR_CAST(dentry);
>  	if (unlikely(!dentry)) {
> @@ -2002,6 +2008,9 @@ static const char *walk_component(struct nameidata *nd, int flags)
>  		if (IS_ERR(dentry))
>  			return ERR_CAST(dentry);
>  	}
> +
> +	WARN_ON_ONCE(atomic_revalidate);
> +
>  	if (!(flags & WALK_MORE) && nd->depth)
>  		put_link(nd);
>  	return step_into(nd, flags, dentry);
> @@ -3383,6 +3392,42 @@ static struct dentry *atomic_open(struct nameidata *nd, struct dentry *dentry,
>  	return dentry;
>  }
  
> +static struct dentry *atomic_revalidate_open(struct dentry *dentry,
> +					     struct nameidata *nd,
> +					     struct file *file,
> +					     const struct open_flags *op,
> +					     bool *got_write)
> +{
> +	struct mnt_idmap *idmap;
> +	struct dentry *dir = nd->path.dentry;
> +	struct inode *dir_inode = dir->d_inode;
> +	int open_flag = op->open_flag;
> +	umode_t mode = op->mode;
> +
> +	if (unlikely(IS_DEADDIR(dir_inode)))
> +		return ERR_PTR(-ENOENT);

What's the point of doing that check when there's nothing to stop
directory from being removed right under you?  Note that similar
check in lookup_open() is done after the caller has locked the
damn thing.

> +	file->f_mode &= ~FMODE_CREATED;
> +
> +	if (WARN_ON_ONCE(open_flag & O_CREAT))
> +		return ERR_PTR(-EINVAL);

Really.  With the only caller being under

        int open_flag = op->open_flag;
	...
	if (!(open_flag & O_CREAT)) {

> +
> +	if (open_flag & (O_TRUNC | O_WRONLY | O_RDWR))
> +		*got_write = !mnt_want_write(nd->path.mnt);
> +	else
> +		*got_write = false;
> +
> +	if (!*got_write)
> +		open_flag &= ~O_TRUNC;
> +
> +	inode_lock_shared(dir->d_inode);
> +	dentry = atomic_open(nd, dentry, file, open_flag, mode);
> +	inode_unlock_shared(dir->d_inode);

What will happen if you get that thing called with NULL ->i_op->atomic_open()?
> +
> +	return dentry;
> +
> +}
> +
>  /*
>   * Look up and maybe create and open the last component.
>   *
> @@ -3527,12 +3572,26 @@ static const char *open_last_lookups(struct nameidata *nd,
>  	}
>  
>  	if (!(open_flag & O_CREAT)) {
> +		bool atomic_revalidate;
> +
>  		if (nd->last.name[nd->last.len])
>  			nd->flags |= LOOKUP_FOLLOW | LOOKUP_DIRECTORY;
>  		/* we _can_ be in RCU mode here */
> -		dentry = lookup_fast(nd);
> +		dentry = lookup_fast(nd, &atomic_revalidate);
>  		if (IS_ERR(dentry))
>  			return ERR_CAST(dentry);
> +		if (dentry && unlikely(atomic_revalidate)) {
> +			/* The file system shall not claim to support atomic
> +			 * revalidate in RCU mode
> +			 */
> +			if (WARN_ON_ONCE(nd->flags & LOOKUP_RCU)) {
> +				dput(dentry);

dput() under rcu_read_lock()?  For one thing, it's completely wrong
as far as recovery strategy goes; we do *NOT* grab references under
LOOKUP_RCU, so whatever we got here is not a counting reference.
What's more, your comment is actively misleading - you set that
atomic_revalidate thing in the very end of lookup_fast() and
there is no way to get there with LOOKUP_RCU.  Look:

static struct dentry *lookup_fast(struct nameidata *nd)
{
        ...
        if (nd->flags & LOOKUP_RCU) {
		...
                status = d_revalidate(dentry, nd->flags);
                if (likely(status > 0))
                        return dentry;
That's where we leave if we'd found and successfully
revalidated a dentry in RCU mode.
                if (!try_to_unlazy_next(nd, dentry))
                        return ERR_PTR(-ECHILD);
... and this is where we'd already left the RCU mode.
                if (status == -ECHILD)
                        /* we'd been told to redo it in non-rcu mode */
                        status = d_revalidate(dentry, nd->flags);
	} else {
here we hadn't been in RCU mode to start with and we *never*
switch from non-RCU to RCU.
		...
	}
	// and here you set that flag of yours.

So no matter what your ->d_revalidate() returns, you are
not going to see atomic_... shite set in RCU mode.  It's not
a matter of filesystem behaviour, contrary to your comment.

> +				return ERR_PTR(-ECHILD);
> +			}
> +			dentry = atomic_revalidate_open(dentry, nd, file, op,
> +							&got_write);
> +			goto drop_write;
> +		}
>  		if (likely(dentry))
>  			goto finish_lookup;
>  
> @@ -3569,6 +3628,7 @@ static const char *open_last_lookups(struct nameidata *nd,
>  	else
>  		inode_unlock_shared(dir->d_inode);
>  
> +drop_write:
>  	if (got_write)
>  		mnt_drop_write(nd->path.mnt);

That helper of yours is a bad idea.  Control flow in that area is
messy and hard to follow as it is, and we had _MANY_ bugs stemming
from that.  You are making it harder to follow; this stuff really
should've gone into lookup_open().

And I really hate that 'atomic_revalidate' thing of yours.
Especially since the reader gets to do major head-scratching about
the WARN_ON_ONCE(atomic_revalidate) in walk_component().  Takes
guessing that it's probably a matter of LOOKUP_OPEN *not* being
there in walk_component() and always being there in the
open_last_lookups() (we never get there for O_PATH opens, so
op->intent will have it).  So at a guess you mean to have
->d_revalidate() only return that magical value if LOOKUP_OPEN
is present in flags.  Which seems to be the case, judging by
the subsequent patches in the series.

_IF_ we want to go in that direction, at least make it
	if (status == THAT_MAGIC_VALUE) {
		if (unlikely(!atomic_revalidate)) {
			if (WARN_ON_ONCE(nd->flags & LOOKUP_OPEN))
				// insane caller
				;
			else
				// insane ->d_revalidate() instance
				WARN_ON_ONCE(1);
		} else {
			*atomic_revalidate = true;
		}
	}
and pass it NULL when calling it from walk_component().

Again, I'm not at all sure it's a good idea to start with.  Hard to
tell without seeing how it'll look after massage that would move
that new call of atomic_open() down into lookup_open().

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH v10 5/8] fuse: Revalidate positive entries in fuse_atomic_open
  2023-10-23 18:30 ` [PATCH v10 5/8] fuse: Revalidate positive entries in fuse_atomic_open Bernd Schubert
@ 2023-10-28  5:18   ` Al Viro
  2023-10-28  5:25   ` Al Viro
  1 sibling, 0 replies; 32+ messages in thread
From: Al Viro @ 2023-10-28  5:18 UTC (permalink / raw)
  To: Bernd Schubert
  Cc: linux-fsdevel, bernd.schubert, miklos, dsingh, Horst Birthelmer,
	Christian Brauner

On Mon, Oct 23, 2023 at 08:30:32PM +0200, Bernd Schubert wrote:
> +static struct dentry *
> +fuse_atomic_open_revalidate(struct fuse_conn *fc, struct dentry *entry,
> +			    struct inode *inode, int switched,
> +			    struct fuse_entry_out *outentry,
> +			    wait_queue_head_t *wq, int *alloc_inode)
> +{
> +	u64 attr_version;
> +	struct dentry *prev = entry;
> +
> +	if (outentry->nodeid != get_node_id(inode) ||
> +	    (bool) IS_AUTOMOUNT(inode) !=
> +	    (bool) (outentry->attr.flags & FUSE_ATTR_SUBMOUNT)) {
> +		*alloc_inode = 1;
> +	} else if (fuse_stale_inode(inode, outentry->generation,
> +				  &outentry->attr)) {
> +		fuse_make_bad(inode);
> +		*alloc_inode = 1;
> +	}
> +
> +	if (*alloc_inode) {
> +		struct dentry *new = NULL;
> +
> +		if (!switched && !d_in_lookup(entry)) {
> +			d_drop(entry);
> +			new = d_alloc_parallel(entry->d_parent, &entry->d_name,
> +					       wq);
> +			if (IS_ERR(new))
> +				return new;
> +
> +			if (unlikely(!d_in_lookup(new))) {
> +				dput(new);
> +				new = ERR_PTR(-EIO);
> +				return new;
> +			}
> +		}
> +
> +		fuse_invalidate_entry(entry);
> +
> +		entry = new;
> +	} else if (!*alloc_inode) {
> +		attr_version = fuse_get_attr_version(fc);
> +		forget_all_cached_acls(inode);
> +		fuse_change_attributes(inode, &outentry->attr, NULL,
> +				       ATTR_TIMEOUT(outentry),
> +				       attr_version);
> +	}
> +
> +	if (prev == entry) {
> +		/* nothing changed, atomic-open on the server side
> +		 * had increased the lookup count - do the same here
> +		 */
> +		struct fuse_inode *fi = get_fuse_inode(inode);
> +
> +		spin_lock(&fi->lock);
> +		fi->nlookup++;
> +		spin_unlock(&fi->lock);
> +	}
> +
> +	return entry;
> +}
> +
> +/**
> + * Does 'lookup + create + open' or 'lookup + open' atomically.
> + * @entry might be positive as well, therefore inode is re-validated.
> + * Positive dentry is invalidated in case inode attributes differ or
> + * we encountered error.
> + */
>  static int _fuse_atomic_open(struct inode *dir, struct dentry *entry,
>  			     struct file *file, unsigned int flags,
>  			     umode_t mode)
>  {
>  	int err;
> -	struct inode *inode;
> +	struct inode *inode = d_inode(entry);
>  	FUSE_ARGS(args);
>  	struct fuse_mount *fm = get_fuse_mount(dir);
>  	struct fuse_conn *fc = fm->fc;
> @@ -780,10 +865,7 @@ static int _fuse_atomic_open(struct inode *dir, struct dentry *entry,
>  	struct fuse_file *ff;
>  	struct dentry *switched_entry = NULL, *alias = NULL;
>  	DECLARE_WAIT_QUEUE_HEAD_ONSTACK(wq);
> -
> -	/* Expect a negative dentry */
> -	if (unlikely(d_inode(entry)))
> -		goto fallback;
> +	int alloc_inode = 0;
>  
>  	/* Userspace expects S_IFREG in create mode */
>  	if ((flags & O_CREAT) && (mode & S_IFMT) != S_IFREG)
> @@ -835,36 +917,56 @@ static int _fuse_atomic_open(struct inode *dir, struct dentry *entry,
>  
>  	err = fuse_simple_request(fm, &args);
>  	free_ext_value(&args);
> -	if (err == -ENOSYS || err == -ELOOP) {
> -		if (unlikely(err == -ENOSYS))
> -			fc->no_open_atomic = 1;
> -		goto free_and_fallback;
> -	}
>  
>  	if (!err && !outentry.nodeid)
>  		err = -ENOENT;
>  
> -	if (err)
> -		goto out_free_ff;
> +	if (err) {
> +		if (unlikely(err == -ENOSYS)) {
> +			fc->no_open_atomic = 1;
> +
> +			/* Might come up if userspace tricks us and would
> +			 * return -ENOSYS for OPEN_ATOMIC after it was
> +			 * aready working
> +			 */
> +			if (unlikely(fc->has_open_atomic == 1))
> +				pr_info("fuse server/daemon bug, atomic open "
> +					"got -ENOSYS although it was already "
> +					"succeeding before.");
> +
> +			/* This should better never happen, revalidate
> +			 * is missing for this entry
> +			 */
> +			if (WARN_ON_ONCE(d_really_is_positive(entry))) {
> +				err = -EIO;
> +				goto out_free_ff;
> +			}
> +			goto free_and_fallback;
> +		} else if (err == -ELOOP) {
> +			/* likely a symlink */
> +			goto free_and_fallback;
> +		} else {
> +			if (d_really_is_positive(entry)) {
> +				if (err != -EINTR && err != -ENOMEM)
> +					fuse_invalidate_entry(entry);
> +			}
> +
> +			goto out_free_ff;
> +		}
> +	}
> +
> +	if (!err && !fc->has_open_atomic) {
> +		/* Only set this flag when atomic open did not return an error,
> +		 * so that we are absolutely sure it is implemented.
> +		 */
> +		fc->has_open_atomic = 1;
> +	}
>  
>  	err = -EIO;
>  	if (invalid_nodeid(outentry.nodeid) || fuse_invalid_attr(&outentry.attr))
>  		goto out_free_ff;
>  
> -	ff->fh = outopen.fh;
> -	ff->nodeid = outentry.nodeid;
> -	ff->open_flags = outopen.open_flags;
> -	inode = fuse_iget(dir->i_sb, outentry.nodeid, outentry.generation,
> -			  &outentry.attr, ATTR_TIMEOUT(&outentry), 0);
> -	if (!inode) {
> -		flags &= ~(O_CREAT | O_EXCL | O_TRUNC);
> -		fuse_sync_release(NULL, ff, flags);
> -		fuse_queue_forget(fm->fc, forget, outentry.nodeid, 1);
> -		err = -ENOMEM;
> -		goto out_err;
> -	}
> -
> -	/* prevent racing/parallel lookup on a negative hashed */
> +	/* prevent racing/parallel lookup */
>  	if (!(flags & O_CREAT) && !d_in_lookup(entry)) {
>  		d_drop(entry);
>  		switched_entry = d_alloc_parallel(entry->d_parent,
> @@ -879,10 +981,52 @@ static int _fuse_atomic_open(struct inode *dir, struct dentry *entry,
>  			/* fall back */
>  			dput(switched_entry);
>  			switched_entry = NULL;
> -			goto free_and_fallback;
> +
> +			if (!inode) {
> +				goto free_and_fallback;
> +			} else {
> +				/* XXX can this happen at all and is there a
> +				 * better way to handle it?
> +				 */

"this" being !d_in_lookup() on result of d_alloc_parallel()?  Sure,
that's what you get if there had been a lookup on the same thing
when you called d_alloc_parallel().  Or, for that matter, if that
lookup got completed just as you called the damn thing.

What are you trying to achieve here?

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH v10 5/8] fuse: Revalidate positive entries in fuse_atomic_open
  2023-10-23 18:30 ` [PATCH v10 5/8] fuse: Revalidate positive entries in fuse_atomic_open Bernd Schubert
  2023-10-28  5:18   ` Al Viro
@ 2023-10-28  5:25   ` Al Viro
  1 sibling, 0 replies; 32+ messages in thread
From: Al Viro @ 2023-10-28  5:25 UTC (permalink / raw)
  To: Bernd Schubert
  Cc: linux-fsdevel, bernd.schubert, miklos, dsingh, Horst Birthelmer,
	Christian Brauner

On Mon, Oct 23, 2023 at 08:30:32PM +0200, Bernd Schubert wrote:

> +		if (!switched && !d_in_lookup(entry)) {
> +			d_drop(entry);
> +			new = d_alloc_parallel(entry->d_parent, &entry->d_name,
> +					       wq);
> +			if (IS_ERR(new))
> +				return new;
> +
> +			if (unlikely(!d_in_lookup(new))) {
> +				dput(new);
> +				new = ERR_PTR(-EIO);
> +				return new;

Again, huh?  You call d_drop().  Then another thread tries to look
the same thing up and gets there while d_alloc_parallel() is
allocating a new dentry.  d_alloc_parallel() sees that dentry
is already in hash (if lookup has managed to complete) or
in in-lookup hash (if lookup is in progress).  In the former
case it returns the dentry it had found (and frees the one
it intended to put in); in the latter it waits for lookup to
complete and checks if dentry is hashed, has the same name and
the same parent.  If it is, same as if we'd found it hashed
in the first place - we return an additional reference to it.

It's perfectly valid and I really don't understand what are you
trying to achieve here.

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH v10 2/8] fuse: introduce atomic open
  2023-10-28  3:03   ` [PATCH v10 2/8] fuse: introduce atomic open Al Viro
@ 2023-10-30 15:21     ` Bernd Schubert
  0 siblings, 0 replies; 32+ messages in thread
From: Bernd Schubert @ 2023-10-30 15:21 UTC (permalink / raw)
  To: Al Viro
  Cc: linux-fsdevel, bernd.schubert, miklos, Dharmendra Singh,
	Horst Birthelmer, Christian Brauner

Thanks a lot for all your reviews, Al!

On 10/28/23 05:03, Al Viro wrote:
> On Mon, Oct 23, 2023 at 08:30:29PM +0200, Bernd Schubert wrote:
>> +{
>> +	int err;
>> +	struct inode *inode;
>> +	FUSE_ARGS(args);
>> +	struct fuse_mount *fm = get_fuse_mount(dir);
>> +	struct fuse_conn *fc = fm->fc;
>> +	struct fuse_forget_link *forget;
>> +	struct fuse_create_in inarg;
>> +	struct fuse_open_out outopen;
>> +	struct fuse_entry_out outentry;
>> +	struct fuse_inode *fi;
>> +	struct fuse_file *ff;
>> +	struct dentry *switched_entry = NULL, *alias = NULL;
>> +	DECLARE_WAIT_QUEUE_HEAD_ONSTACK(wq);
>> +
>> +	/* Expect a negative dentry */
>> +	if (unlikely(d_inode(entry)))
>> +		goto fallback;
>> +
>> +	/* Userspace expects S_IFREG in create mode */
>> +	if ((flags & O_CREAT) && (mode & S_IFMT) != S_IFREG)
>> +		goto fallback;
> 
> How could it get there with such mode?  We could check that
> in fs/namei.c:atomic_open() (with
> 	if (WARN_ON_ONCE((open_flags & O_CREAT) && !S_ISREG(mode)))
> 		error = -EINVAL; // for the lack of EWTFAREYOUSMOKING
> 	else
> 		error = dir->i_op->atomic_open(....)
> or something similar), but that doesn't belong in the method instances.
> And it really should never happen - that thing comes from op->mode and
> we have build_open_flags() doing this:
>          if (WILL_CREATE(flags)) {
>                  if (how->mode & ~S_IALLUGO)
>                          return -EINVAL;
>                  op->mode = how->mode | S_IFREG;
>          } else {
>                  if (how->mode != 0)
>                          return -EINVAL;
>                  op->mode = 0;
>          }
> so...  Are other instances doing the same kind of paranoia?  That BUG_ON()
> in current fuse_atomic_open() is bogus (and seriously misplaced)...

Ok, sorry, we took over the check blindly. I added another patch in the 
v11 branch to remove the BUG_ON in current fuse_atomic_open.

> 
>> +	forget = fuse_alloc_forget();
>> +	err = -ENOMEM;
>> +	if (!forget)
>> +		goto out_err;
>> +
>> +	err = -ENOMEM;
>> +	ff = fuse_file_alloc(fm);
>> +	if (!ff)
>> +		goto out_put_forget_req;
>> +
>> +	if (!fc->dont_mask)
>> +		mode &= ~current_umask();
>> +
>> +	flags &= ~O_NOCTTY;
>> +	memset(&inarg, 0, sizeof(inarg));
>> +	memset(&outentry, 0, sizeof(outentry));
>> +	inarg.flags = flags;
>> +	inarg.mode = mode;
>> +	inarg.umask = current_umask();
>> +
>> +	if (fc->handle_killpriv_v2 && (flags & O_TRUNC) &&
>> +	    !(flags & O_EXCL) && !capable(CAP_FSETID)) {
>> +		inarg.open_flags |= FUSE_OPEN_KILL_SUIDGID;
>> +	}
>> +
>> +	args.opcode = FUSE_OPEN_ATOMIC;
>> +	args.nodeid = get_node_id(dir);
>> +	args.in_numargs = 2;
>> +	args.in_args[0].size = sizeof(inarg);
>> +	args.in_args[0].value = &inarg;
>> +	args.in_args[1].size = entry->d_name.len + 1;
>> +	args.in_args[1].value = entry->d_name.name;
>> +	args.out_numargs = 2;
>> +	args.out_args[0].size = sizeof(outentry);
>> +	args.out_args[0].value = &outentry;
>> +	args.out_args[1].size = sizeof(outopen);
>> +	args.out_args[1].value = &outopen;
>> +
>> +	if (flags & O_CREAT) {
>> +		err = get_create_ext(&args, dir, entry, mode);
>> +		if (err)
>> +			goto out_free_ff;
>> +	}
>> +
>> +	err = fuse_simple_request(fm, &args);
>> +	free_ext_value(&args);
>> +	if (err == -ENOSYS || err == -ELOOP) {
>> +		if (unlikely(err == -ENOSYS))
>> +			fc->no_open_atomic = 1;
>> +		goto free_and_fallback;
>> +	}
>> +
>> +	if (!err && !outentry.nodeid)
>> +		err = -ENOENT;
>> +
>> +	if (err)
>> +		goto out_free_ff;
>> +
>> +	err = -EIO;
>> +	if (invalid_nodeid(outentry.nodeid) || fuse_invalid_attr(&outentry.attr))
>> +		goto out_free_ff;
>> +
>> +	ff->fh = outopen.fh;
>> +	ff->nodeid = outentry.nodeid;
>> +	ff->open_flags = outopen.open_flags;
>> +	inode = fuse_iget(dir->i_sb, outentry.nodeid, outentry.generation,
>> +			  &outentry.attr, ATTR_TIMEOUT(&outentry), 0);
>> +	if (!inode) {
>> +		flags &= ~(O_CREAT | O_EXCL | O_TRUNC);
>> +		fuse_sync_release(NULL, ff, flags);
>> +		fuse_queue_forget(fm->fc, forget, outentry.nodeid, 1);
>> +		err = -ENOMEM;
>> +		goto out_err;
>> +	}
>> +
>> +	/* prevent racing/parallel lookup on a negative hashed */
>> +	if (!(flags & O_CREAT) && !d_in_lookup(entry)) {
> 
> ... in which case it has just passed ->d_revalidate()...

With the follow up patches that revalidate is going to be moved to this 
function.
Is there anything here that needs to be improved? I had added that check 
after looking through the other atomic_open methods and then noticed 
your commit c94c09535c4d:
nfs_atomic_open(): prevent parallel nfs_lookup() on a negative hashed


> 
>> +		d_drop(entry);
>> +		switched_entry = d_alloc_parallel(entry->d_parent,
>> +						   &entry->d_name, &wq);
>> +		if (IS_ERR(switched_entry)) {
>> +			err = PTR_ERR(switched_entry);
>> +			switched_entry = NULL;
>> +			goto out_free_ff;
> 
> leaked inode?

Yikes, silly me and with that also leaked fuse userspace inode.

> 
>> +		}
>> +
>> +		if (unlikely(!d_in_lookup(switched_entry))) {
>> +			/* fall back */
>> +			dput(switched_entry);
>> +			switched_entry = NULL;
>> +			goto free_and_fallback;
> 
> ditto, and I don't really understand what the hell is going on with
> dentry references here.  What is the intended behaviour in that case?


The idea was to give up 'switched_entry' and let the existing 
fuse_create_open do the fallback work. Hmm, yeah, already called 
d_drop(dentry). And it also already did the userspace part - fallback 
without fuse-forget is totally wrong.
I guess I need severe patch update because of this - the other parts are 
not difficult, but here it gets complex.

> 
>> +		}
>> +
>> +		entry = switched_entry;
>> +	}
>> +
>> +	if (d_really_is_negative(entry)) {
>> +		d_drop(entry);
>> +		alias = d_exact_alias(entry, inode);
> 
> What case is that about?  "We have an unhashed positive dentry with that
> exact name, parent and inode"?  Where would it have come from?

Sorry, taken from _nfs4_open_and_get_state and I wasn't sure if it is 
needed. A bit lame excuse, but NFS is the only other file system I found 
that handles !O_CREAT. I definitely should have marked it with something 
like /* XXX is this really needed, from _nfs4_open_and_get_state */

> 
> Another thing: this does not consume an inode reference, no matter what
> gets returned,
> 
>> +		if (!alias) {
>> +			alias = d_splice_alias(inode, entry);
> 
> but that one *does* consume the inode reference; note the igrab() in
> nfs4 code where you've nicked that from...
> 
>> +			if (IS_ERR(alias)) {
>> +				/*
>> +				 * Close the file in user space, but do not unlink it,
>> +				 * if it was created - with network file systems other
>> +				 * clients might have already accessed it.
>> +				 */
>> +				fi = get_fuse_inode(inode);
> 
> ... so this is asking for UAF.

Yeah, and staring at it again, the below fuse_queue_forget is not right 
either, as that is already handled through the inode reference / 
->evict_inode.

> 
>> +				fuse_sync_release(fi, ff, flags);
>> +				fuse_queue_forget(fm->fc, forget, outentry.nodeid, 1);
>> +				err = PTR_ERR(alias);
>> +				goto out_err;
>> +			}
>> +		}
>> +
>> +		if (alias)
>> +			entry = alias;
>> +	}
> 
> ... and here we have no way to tell if inode needs to be dropped.

I guess I could solve this with another variable, but maybe there is a 
more beautiful way. I first need to think about the 
!d_in_lookup(switched_entry).


Thanks again so much for your reviews,
Bernd


^ permalink raw reply	[flat|nested] 32+ messages in thread

* [PATCH 0/1] Adapt atomic open to fuse no_open/no_open_dir
  2023-10-26  2:42       ` Yuan Yao
@ 2023-11-29  6:46         ` Yuan Yao
  2023-11-29  6:46           ` [PATCH 1/1] fuse: Handle no_open/no_opendir in atomic_open Yuan Yao
  0 siblings, 1 reply; 32+ messages in thread
From: Yuan Yao @ 2023-11-29  6:46 UTC (permalink / raw)
  To: bernd.schubert, yuanyaogoog
  Cc: linux-fsdevel, miklos, dsingh, hbirthelmer, brauner, viro,
	bschubert, keiichiw, takayas

This patch implements my purpose from previous mail, adapting the atomic
open to work with no_open/no_open_dir features. I tested this patch by
creating and subsequently closing 10,000 empty files using virtio-fs.
The total execution time demonstrated a reduction of around 15%
comparing to the unpatched version. Hope this patch could be integrated 
to the v11 version.

Yuan Yao (1):
  fuse: Handle no_open/no_opendir in atomic_open

 fs/fuse/dir.c | 8 ++++++++
 1 file changed, 8 insertions(+)

-- 
2.43.0.rc1.413.gea7ed67945-goog


^ permalink raw reply	[flat|nested] 32+ messages in thread

* [PATCH 1/1] fuse: Handle no_open/no_opendir in atomic_open
  2023-11-29  6:46         ` [PATCH 0/1] Adapt atomic open to fuse no_open/no_open_dir Yuan Yao
@ 2023-11-29  6:46           ` Yuan Yao
  2023-11-29 22:21             ` Bernd Schubert
  0 siblings, 1 reply; 32+ messages in thread
From: Yuan Yao @ 2023-11-29  6:46 UTC (permalink / raw)
  To: bernd.schubert, yuanyaogoog
  Cc: linux-fsdevel, miklos, dsingh, hbirthelmer, brauner, viro,
	bschubert, keiichiw, takayas

Currently, if the fuse server supporting the no_open/no_opendir feature
uses atomic_open to open a file, the corresponding no_open/no_opendir
flag is not set in kernel. This leads to the kernel unnecessarily
sending extra FUSE_RELEASE request, receiving an empty reply from
server when closes that file.

This patch addresses the issue by setting the no_open/no_opendir feature
bit to true if the kernel receives a valid dentry with an empty file
handler.

Signed-off-by: Yuan Yao <yuanyaogoog@chromium.org>
---
 fs/fuse/dir.c | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/fs/fuse/dir.c b/fs/fuse/dir.c
index 9956fae7f875..edee4f715f39 100644
--- a/fs/fuse/dir.c
+++ b/fs/fuse/dir.c
@@ -8,6 +8,7 @@
 
 #include "fuse_i.h"
 
+#include <linux/fuse.h>
 #include <linux/pagemap.h>
 #include <linux/file.h>
 #include <linux/fs_context.h>
@@ -869,6 +870,13 @@ static int _fuse_atomic_open(struct inode *dir, struct dentry *entry,
 		goto out_err;
 	}
 
+	if (ff->fh == 0) {
+		if (ff->open_flags & FOPEN_KEEP_CACHE)
+			fc->no_open = 1;
+		if (ff->open_flags & FOPEN_CACHE_DIR)
+			fc->no_opendir = 1;
+	}
+
 	/* prevent racing/parallel lookup on a negative hashed */
 	if (!(flags & O_CREAT) && !d_in_lookup(entry)) {
 		d_drop(entry);
-- 
2.43.0.rc1.413.gea7ed67945-goog


^ permalink raw reply related	[flat|nested] 32+ messages in thread

* Re: [PATCH 1/1] fuse: Handle no_open/no_opendir in atomic_open
  2023-11-29  6:46           ` [PATCH 1/1] fuse: Handle no_open/no_opendir in atomic_open Yuan Yao
@ 2023-11-29 22:21             ` Bernd Schubert
  0 siblings, 0 replies; 32+ messages in thread
From: Bernd Schubert @ 2023-11-29 22:21 UTC (permalink / raw)
  To: Yuan Yao
  Cc: linux-fsdevel, miklos, dsingh, hbirthelmer, brauner, viro,
	bschubert, keiichiw, takayas



On 11/29/23 07:46, Yuan Yao wrote:
> Currently, if the fuse server supporting the no_open/no_opendir feature
> uses atomic_open to open a file, the corresponding no_open/no_opendir
> flag is not set in kernel. This leads to the kernel unnecessarily
> sending extra FUSE_RELEASE request, receiving an empty reply from
> server when closes that file.
> 
> This patch addresses the issue by setting the no_open/no_opendir feature
> bit to true if the kernel receives a valid dentry with an empty file
> handler.
> 
> Signed-off-by: Yuan Yao <yuanyaogoog@chromium.org>
> ---
>   fs/fuse/dir.c | 8 ++++++++
>   1 file changed, 8 insertions(+)
> 
> diff --git a/fs/fuse/dir.c b/fs/fuse/dir.c
> index 9956fae7f875..edee4f715f39 100644
> --- a/fs/fuse/dir.c
> +++ b/fs/fuse/dir.c
> @@ -8,6 +8,7 @@
>   
>   #include "fuse_i.h"
>   
> +#include <linux/fuse.h>
>   #include <linux/pagemap.h>
>   #include <linux/file.h>
>   #include <linux/fs_context.h>
> @@ -869,6 +870,13 @@ static int _fuse_atomic_open(struct inode *dir, struct dentry *entry,
>   		goto out_err;
>   	}
>   
> +	if (ff->fh == 0) {
> +		if (ff->open_flags & FOPEN_KEEP_CACHE)
> +			fc->no_open = 1;
> +		if (ff->open_flags & FOPEN_CACHE_DIR)
> +			fc->no_opendir = 1;
> +	}
> +
>   	/* prevent racing/parallel lookup on a negative hashed */
>   	if (!(flags & O_CREAT) && !d_in_lookup(entry)) {
>   		d_drop(entry);


Thanks, I first need to fix all the issues Al found (and need to find 
the time for that, hopefully during the next days) and will then add 
this to my series.

(We also need to document for userspace that the atomic_open method 
shall not fill in fi->fh in atomic-open, if it wants the no-open feature).

Thanks,
Bernd

^ permalink raw reply	[flat|nested] 32+ messages in thread

* [PATCH 0/1] Fix-atomic_open-not-using-negative-d_entry
  2023-10-23 18:30 ` [PATCH v10 2/8] fuse: introduce atomic open Bernd Schubert
  2023-10-24 10:12   ` Yuan Yao
  2023-10-28  3:03   ` [PATCH v10 2/8] fuse: introduce atomic open Al Viro
@ 2024-01-23  8:40   ` Yuan Yao
  2024-01-23  8:40     ` [PATCH 1/1] fuse: Make atomic_open use negative d_entry Yuan Yao
  2024-03-14 10:34   ` [PATCH] fuse: Do NULL check instead of IS_ERR in atomic_open Keiichi Watanabe
  3 siblings, 1 reply; 32+ messages in thread
From: Yuan Yao @ 2024-01-23  8:40 UTC (permalink / raw)
  To: bschubert
  Cc: bernd.schubert, brauner, dsingh, hbirthelmer, linux-fsdevel,
	miklos, viro, Yuan Yao

I found the non-existing file looked up by _fuse_atomic_open()
function will not set negative dentry properly. This issue came to
light when I was benchmarking the performance of the second open
operation on a non-existing file. Using _fuse_atomic_open() resulted 
in a 10x performance regression compared to the original 
fuse_create_open() function.

With the previous fuse_create_open() function, when a negative dentry
is returned by fuse server. kernel will set the dentry timeout and
return with finish_no_open(). The _fuse_atomic_open() function will
return ENOENT error instead.

Yuan Yao (1):
  fuse: Fix atomic_open not using negative d_entry

 fs/fuse/dir.c | 13 ++++++++++++-
 1 file changed, 12 insertions(+), 1 deletion(-)

-- 
2.43.0.429.g432eaa2c6b-goog


^ permalink raw reply	[flat|nested] 32+ messages in thread

* [PATCH 1/1] fuse: Make atomic_open use negative d_entry
  2024-01-23  8:40   ` [PATCH 0/1] Fix-atomic_open-not-using-negative-d_entry Yuan Yao
@ 2024-01-23  8:40     ` Yuan Yao
  2024-01-27 13:38       ` Bernd Schubert
  0 siblings, 1 reply; 32+ messages in thread
From: Yuan Yao @ 2024-01-23  8:40 UTC (permalink / raw)
  To: bschubert
  Cc: bernd.schubert, brauner, dsingh, hbirthelmer, linux-fsdevel,
	miklos, viro, Yuan Yao

With current implementation, when fuse server replies a negative
d_entry, _fuse_atomic_open() function will return ENOENT error. This
behaviour will prevent using kernel's negative d_entry. The original
fuse_create_open() function will get negative d_entry by fuse_lookup().
And the finish_no_open() will be called with that negative d_entry.

This patch fixes the problem by adding a check for the case that
negative d_entry is returned by fuse server. Atomic open will update the
d_entry's timeout and call finish_no_open(). This change makes negative
d_entry be used in kernel.

Signed-off-by: Yuan Yao <yuanyaogoog@chromium.org>
---
 fs/fuse/dir.c | 13 ++++++++++++-
 1 file changed, 12 insertions(+), 1 deletion(-)

diff --git a/fs/fuse/dir.c b/fs/fuse/dir.c
index 4ae89f428243..11b3193c3902 100644
--- a/fs/fuse/dir.c
+++ b/fs/fuse/dir.c
@@ -843,8 +843,15 @@ static int _fuse_atomic_open(struct inode *dir, struct dentry *entry,
 		goto free_and_fallback;
 	}
 
-	if (!err && !outentry.nodeid)
+	if (!err && !outentry.nodeid) {
+		if (outentry.entry_valid) {
+			inode = NULL;
+			d_splice_alias(inode, entry);
+			fuse_change_entry_timeout(entry, &outentry);
+			goto free_and_no_open;
+		}
 		err = -ENOENT;
+	}
 
 	if (err)
 		goto out_free_ff;
@@ -991,6 +998,10 @@ static int _fuse_atomic_open(struct inode *dir, struct dentry *entry,
 	kfree(forget);
 fallback:
 	return fuse_create_open(dir, entry, file, flags, mode);
+free_and_no_open:
+	fuse_file_free(ff);
+	kfree(forget);
+	return finish_no_open(file, entry);
 }
 
 static int fuse_atomic_open(struct inode *dir, struct dentry *entry,
-- 
2.43.0.429.g432eaa2c6b-goog


^ permalink raw reply related	[flat|nested] 32+ messages in thread

* Re: [PATCH 1/1] fuse: Make atomic_open use negative d_entry
  2024-01-23  8:40     ` [PATCH 1/1] fuse: Make atomic_open use negative d_entry Yuan Yao
@ 2024-01-27 13:38       ` Bernd Schubert
  2024-02-09  7:46         ` Yuan Yao
  0 siblings, 1 reply; 32+ messages in thread
From: Bernd Schubert @ 2024-01-27 13:38 UTC (permalink / raw)
  To: Yuan Yao, bschubert
  Cc: brauner, dsingh, hbirthelmer, linux-fsdevel, miklos, viro



On 1/23/24 09:40, Yuan Yao wrote:
> With current implementation, when fuse server replies a negative
> d_entry, _fuse_atomic_open() function will return ENOENT error. This
> behaviour will prevent using kernel's negative d_entry. The original
> fuse_create_open() function will get negative d_entry by fuse_lookup().
> And the finish_no_open() will be called with that negative d_entry.
> 
> This patch fixes the problem by adding a check for the case that
> negative d_entry is returned by fuse server. Atomic open will update the
> d_entry's timeout and call finish_no_open(). This change makes negative
> d_entry be used in kernel.
> 
> Signed-off-by: Yuan Yao <yuanyaogoog@chromium.org>
> ---
>   fs/fuse/dir.c | 13 ++++++++++++-
>   1 file changed, 12 insertions(+), 1 deletion(-)
> 
> diff --git a/fs/fuse/dir.c b/fs/fuse/dir.c
> index 4ae89f428243..11b3193c3902 100644
> --- a/fs/fuse/dir.c
> +++ b/fs/fuse/dir.c
> @@ -843,8 +843,15 @@ static int _fuse_atomic_open(struct inode *dir, struct dentry *entry,
>   		goto free_and_fallback;
>   	}
>   
> -	if (!err && !outentry.nodeid)
> +	if (!err && !outentry.nodeid) {
> +		if (outentry.entry_valid) {
> +			inode = NULL;
> +			d_splice_alias(inode, entry);
> +			fuse_change_entry_timeout(entry, &outentry);
> +			goto free_and_no_open;
> +		}
>   		err = -ENOENT;
> +	}
>   
>   	if (err)
>   		goto out_free_ff;
> @@ -991,6 +998,10 @@ static int _fuse_atomic_open(struct inode *dir, struct dentry *entry,
>   	kfree(forget);
>   fallback:
>   	return fuse_create_open(dir, entry, file, flags, mode);
> +free_and_no_open:
> +	fuse_file_free(ff);
> +	kfree(forget);
> +	return finish_no_open(file, entry);
>   }
>   
>   static int fuse_atomic_open(struct inode *dir, struct dentry *entry,


Thank you and sorry for my late reply! Yeah definitely makes sense, I'm 
just going add it in and will also add a small comment. And as I said, I 
first need to address all the issues Al had found - still didn't have 
time for it.


Thanks,
Bernd

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH 1/1] fuse: Make atomic_open use negative d_entry
  2024-01-27 13:38       ` Bernd Schubert
@ 2024-02-09  7:46         ` Yuan Yao
  2024-02-19 11:37           ` Bernd Schubert
  0 siblings, 1 reply; 32+ messages in thread
From: Yuan Yao @ 2024-02-09  7:46 UTC (permalink / raw)
  To: Bernd Schubert
  Cc: bschubert, brauner, dsingh, hbirthelmer, linux-fsdevel, miklos, viro

Hi Bernd,

Thank you for taking a look at this patch! I appreciate it a lot for
adding this patch to the next version. However, I just found that
there’s a bug with my patch. The *BUG_ON(!d_unhashed(dentry));* in
d_splice_alias() function will be triggered when doing the parallel
lookup on a non-exist file.

> struct dentry *d_splice_alias(struct inode *inode, struct dentry *dentry)
> {
>    if (IS_ERR(inode))
>        return ERR_CAST(inode);
>
>    BUG_ON(!d_unhashed(dentry));

This bug can be easily reproduced by adding 3 seconds sleep in fuse
server’s atomic_open handler, and execute the following commands in
fuse filesystem:
cat non-exist-file &
cat non-exist-file &

I think this bug can be addressed by following two approaches:

1. adding check for d_in_lookup(entry)
-----------------------------------------------------------------------
       if (outentry.entry_valid) {
+            if (d_in_lookup(entry)) {
                inode = NULL;
                d_splice_alias(inode, entry);
               fuse_change_entry_timeout(entry, &outentry);
+          }
            goto free_and_no_open;
        }

2. adding d_drop(entry)
-----------------------------------------------------------------------
        if (outentry.entry_valid) {
             inode = NULL;
+           d_drop(entry)
             d_splice_alias(inode, entry);
             fuse_change_entry_timeout(entry, &outentry);
            goto free_and_no_open;
        }

But I'm not so sure which one is preferable, or maybe the handling
should be elsewhere?

Best,
Yuan

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH 1/1] fuse: Make atomic_open use negative d_entry
  2024-02-09  7:46         ` Yuan Yao
@ 2024-02-19 11:37           ` Bernd Schubert
  2024-03-13 10:25             ` Yuan Yao
  0 siblings, 1 reply; 32+ messages in thread
From: Bernd Schubert @ 2024-02-19 11:37 UTC (permalink / raw)
  To: Yuan Yao
  Cc: bschubert, brauner, dsingh, hbirthelmer, linux-fsdevel, miklos, viro



On 2/9/24 08:46, Yuan Yao wrote:
> Hi Bernd,
> 
> Thank you for taking a look at this patch! I appreciate it a lot for
> adding this patch to the next version. However, I just found that
> there’s a bug with my patch. The *BUG_ON(!d_unhashed(dentry));* in
> d_splice_alias() function will be triggered when doing the parallel
> lookup on a non-exist file.
> 
>> struct dentry *d_splice_alias(struct inode *inode, struct dentry *dentry)
>> {
>>    if (IS_ERR(inode))
>>        return ERR_CAST(inode);
>>
>>    BUG_ON(!d_unhashed(dentry));
> 
> This bug can be easily reproduced by adding 3 seconds sleep in fuse
> server’s atomic_open handler, and execute the following commands in
> fuse filesystem:
> cat non-exist-file &
> cat non-exist-file &
> 
> I think this bug can be addressed by following two approaches:
> 
> 1. adding check for d_in_lookup(entry)
> -----------------------------------------------------------------------
>        if (outentry.entry_valid) {
> +            if (d_in_lookup(entry)) {
>                 inode = NULL;
>                 d_splice_alias(inode, entry);
>                fuse_change_entry_timeout(entry, &outentry);
> +          }
>             goto free_and_no_open;
>         }
> 
> 2. adding d_drop(entry)
> -----------------------------------------------------------------------
>         if (outentry.entry_valid) {
>              inode = NULL;
> +           d_drop(entry)
>              d_splice_alias(inode, entry);
>              fuse_change_entry_timeout(entry, &outentry);
>             goto free_and_no_open;
>         }
> 
> But I'm not so sure which one is preferable, or maybe the handling
> should be elsewhere?
> 

Sorry for my terribly late reply, will look into it in the evening.


Thanks,
Bernd

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH 1/1] fuse: Make atomic_open use negative d_entry
  2024-02-19 11:37           ` Bernd Schubert
@ 2024-03-13 10:25             ` Yuan Yao
  2024-03-13 23:00               ` Bernd Schubert
  0 siblings, 1 reply; 32+ messages in thread
From: Yuan Yao @ 2024-03-13 10:25 UTC (permalink / raw)
  To: Bernd Schubert
  Cc: bschubert, brauner, dsingh, hbirthelmer, linux-fsdevel, miklos, viro

Hi Bernd,

Thank you for taking time to review this issue.

I'm writing to inquire about any recent updates or plans for atomic
open threads?
It will help a lot if the negative d_entry issue could be solved.

Best,
Yuan

On Mon, Feb 19, 2024 at 8:37 PM Bernd Schubert
<bernd.schubert@fastmail.fm> wrote:
>
>
>
> On 2/9/24 08:46, Yuan Yao wrote:
> > Hi Bernd,
> >
> > Thank you for taking a look at this patch! I appreciate it a lot for
> > adding this patch to the next version. However, I just found that
> > there’s a bug with my patch. The *BUG_ON(!d_unhashed(dentry));* in
> > d_splice_alias() function will be triggered when doing the parallel
> > lookup on a non-exist file.
> >
> >> struct dentry *d_splice_alias(struct inode *inode, struct dentry *dentry)
> >> {
> >>    if (IS_ERR(inode))
> >>        return ERR_CAST(inode);
> >>
> >>    BUG_ON(!d_unhashed(dentry));
> >
> > This bug can be easily reproduced by adding 3 seconds sleep in fuse
> > server’s atomic_open handler, and execute the following commands in
> > fuse filesystem:
> > cat non-exist-file &
> > cat non-exist-file &
> >
> > I think this bug can be addressed by following two approaches:
> >
> > 1. adding check for d_in_lookup(entry)
> > -----------------------------------------------------------------------
> >        if (outentry.entry_valid) {
> > +            if (d_in_lookup(entry)) {
> >                 inode = NULL;
> >                 d_splice_alias(inode, entry);
> >                fuse_change_entry_timeout(entry, &outentry);
> > +          }
> >             goto free_and_no_open;
> >         }
> >
> > 2. adding d_drop(entry)
> > -----------------------------------------------------------------------
> >         if (outentry.entry_valid) {
> >              inode = NULL;
> > +           d_drop(entry)
> >              d_splice_alias(inode, entry);
> >              fuse_change_entry_timeout(entry, &outentry);
> >             goto free_and_no_open;
> >         }
> >
> > But I'm not so sure which one is preferable, or maybe the handling
> > should be elsewhere?
> >
>
> Sorry for my terribly late reply, will look into it in the evening.
>
>
> Thanks,
> Bernd

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH 1/1] fuse: Make atomic_open use negative d_entry
  2024-03-13 10:25             ` Yuan Yao
@ 2024-03-13 23:00               ` Bernd Schubert
  0 siblings, 0 replies; 32+ messages in thread
From: Bernd Schubert @ 2024-03-13 23:00 UTC (permalink / raw)
  To: Yuan Yao, Bernd Schubert
  Cc: brauner, Dharmendra Singh, Horst Birthelmer, linux-fsdevel, miklos, viro

Hi Yuan,

On 3/13/24 11:25, Yuan Yao wrote:
> Hi Bernd,
> 
> Thank you for taking time to review this issue.
> 
> I'm writing to inquire about any recent updates or plans for atomic
> open threads?
> It will help a lot if the negative d_entry issue could be solved.

thanks for pinging me I'm really sorry, I'm currently totally swamped. I
will really try to get to it during the next days, latest during the
weekend.

Best,
Bernd

^ permalink raw reply	[flat|nested] 32+ messages in thread

* [PATCH] fuse: Do NULL check instead of IS_ERR in atomic_open
  2023-10-23 18:30 ` [PATCH v10 2/8] fuse: introduce atomic open Bernd Schubert
                     ` (2 preceding siblings ...)
  2024-01-23  8:40   ` [PATCH 0/1] Fix-atomic_open-not-using-negative-d_entry Yuan Yao
@ 2024-03-14 10:34   ` Keiichi Watanabe
  2024-03-15 13:09     ` Keiichi Watanabe
  2024-03-24  4:32     ` Al Viro
  3 siblings, 2 replies; 32+ messages in thread
From: Keiichi Watanabe @ 2024-03-14 10:34 UTC (permalink / raw)
  To: bschubert, linux-fsdevel
  Cc: bernd.schubert, miklos, dsingh, hbirthelmer, brauner, viro,
	yuanyaogoog, Keiichi Watanabe

Since d_splice_alias returns NULL on error, we need to do NUL check
instead of IS_ERR.

Signed-off-by: Keiichi Watanabe <keiichiw@chromium.org>
---
 fs/fuse/dir.c | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/fs/fuse/dir.c b/fs/fuse/dir.c
index 4ae89f428243..4843a749dd91 100644
--- a/fs/fuse/dir.c
+++ b/fs/fuse/dir.c
@@ -914,7 +914,7 @@ static int _fuse_atomic_open(struct inode *dir, struct dentry *entry,
 		alias = d_exact_alias(entry, inode);
 		if (!alias) {
 			alias = d_splice_alias(inode, entry);
-			if (IS_ERR(alias)) {
+			if (!alias) {
 				/*
 				 * Close the file in user space, but do not unlink it,
 				 * if it was created - with network file systems other
@@ -928,8 +928,7 @@ static int _fuse_atomic_open(struct inode *dir, struct dentry *entry,
 			}
 		}
 
-		if (alias)
-			entry = alias;
+		entry = alias; // alias must not be NULL here.
 	}
 
 	fuse_change_entry_timeout(entry, &outentry);
-- 
2.44.0.291.gc1ea87d7ee-goog


^ permalink raw reply related	[flat|nested] 32+ messages in thread

* Re: [PATCH] fuse: Do NULL check instead of IS_ERR in atomic_open
  2024-03-14 10:34   ` [PATCH] fuse: Do NULL check instead of IS_ERR in atomic_open Keiichi Watanabe
@ 2024-03-15 13:09     ` Keiichi Watanabe
  2024-03-24  4:32     ` Al Viro
  1 sibling, 0 replies; 32+ messages in thread
From: Keiichi Watanabe @ 2024-03-15 13:09 UTC (permalink / raw)
  To: bschubert, linux-fsdevel
  Cc: bernd.schubert, miklos, dsingh, hbirthelmer, brauner, viro, yuanyaogoog

I realized this patch was wrong, as I misunderstood the spec of d_splice_alias.
d_splice_alias can return NULL in a success case, so the original code
was totally correct.
Please ignore this patch. Sorry for the noise!

Keiichi


On Thu, Mar 14, 2024 at 7:34 PM Keiichi Watanabe <keiichiw@chromium.org> wrote:
>
> Since d_splice_alias returns NULL on error, we need to do NUL check
> instead of IS_ERR.
>
> Signed-off-by: Keiichi Watanabe <keiichiw@chromium.org>
> ---
>  fs/fuse/dir.c | 5 ++---
>  1 file changed, 2 insertions(+), 3 deletions(-)
>
> diff --git a/fs/fuse/dir.c b/fs/fuse/dir.c
> index 4ae89f428243..4843a749dd91 100644
> --- a/fs/fuse/dir.c
> +++ b/fs/fuse/dir.c
> @@ -914,7 +914,7 @@ static int _fuse_atomic_open(struct inode *dir, struct dentry *entry,
>                 alias = d_exact_alias(entry, inode);
>                 if (!alias) {
>                         alias = d_splice_alias(inode, entry);
> -                       if (IS_ERR(alias)) {
> +                       if (!alias) {
>                                 /*
>                                  * Close the file in user space, but do not unlink it,
>                                  * if it was created - with network file systems other
> @@ -928,8 +928,7 @@ static int _fuse_atomic_open(struct inode *dir, struct dentry *entry,
>                         }
>                 }
>
> -               if (alias)
> -                       entry = alias;
> +               entry = alias; // alias must not be NULL here.
>         }
>
>         fuse_change_entry_timeout(entry, &outentry);
> --
> 2.44.0.291.gc1ea87d7ee-goog
>

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH] fuse: Do NULL check instead of IS_ERR in atomic_open
  2024-03-14 10:34   ` [PATCH] fuse: Do NULL check instead of IS_ERR in atomic_open Keiichi Watanabe
  2024-03-15 13:09     ` Keiichi Watanabe
@ 2024-03-24  4:32     ` Al Viro
  1 sibling, 0 replies; 32+ messages in thread
From: Al Viro @ 2024-03-24  4:32 UTC (permalink / raw)
  To: Keiichi Watanabe
  Cc: bschubert, linux-fsdevel, bernd.schubert, miklos, dsingh,
	hbirthelmer, brauner, yuanyaogoog

On Thu, Mar 14, 2024 at 07:34:04PM +0900, Keiichi Watanabe wrote:
> Since d_splice_alias returns NULL on error, we need to do NUL check
> instead of IS_ERR.

d_splice_alias() does *NOT* return NULL on error.  Never did.  Moreover,
passing it a pointer to non-directory inode will definitely return NULL.
So will passing it a pointer to directory inode that currently has
no aliases.

NAK.

^ permalink raw reply	[flat|nested] 32+ messages in thread

end of thread, other threads:[~2024-03-24  4:32 UTC | newest]

Thread overview: 32+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-10-23 18:30 [PATCH v10 0/8] fuse: full atomic open and atomic-open-revalidate Bernd Schubert
2023-10-23 18:30 ` [PATCH v10 1/8] fuse: rename fuse_create_open Bernd Schubert
2023-10-23 18:30 ` [PATCH v10 2/8] fuse: introduce atomic open Bernd Schubert
2023-10-24 10:12   ` Yuan Yao
2023-10-24 12:36     ` Bernd Schubert
2023-10-26  2:42       ` Yuan Yao
2023-11-29  6:46         ` [PATCH 0/1] Adapt atomic open to fuse no_open/no_open_dir Yuan Yao
2023-11-29  6:46           ` [PATCH 1/1] fuse: Handle no_open/no_opendir in atomic_open Yuan Yao
2023-11-29 22:21             ` Bernd Schubert
2023-10-28  3:03   ` [PATCH v10 2/8] fuse: introduce atomic open Al Viro
2023-10-30 15:21     ` Bernd Schubert
2024-01-23  8:40   ` [PATCH 0/1] Fix-atomic_open-not-using-negative-d_entry Yuan Yao
2024-01-23  8:40     ` [PATCH 1/1] fuse: Make atomic_open use negative d_entry Yuan Yao
2024-01-27 13:38       ` Bernd Schubert
2024-02-09  7:46         ` Yuan Yao
2024-02-19 11:37           ` Bernd Schubert
2024-03-13 10:25             ` Yuan Yao
2024-03-13 23:00               ` Bernd Schubert
2024-03-14 10:34   ` [PATCH] fuse: Do NULL check instead of IS_ERR in atomic_open Keiichi Watanabe
2024-03-15 13:09     ` Keiichi Watanabe
2024-03-24  4:32     ` Al Viro
2023-10-23 18:30 ` [PATCH v10 3/8] [RFC] Allow atomic_open() on positive dentry (O_CREAT) Bernd Schubert
2023-10-23 18:30 ` [PATCH v10 4/8] [RFC] Allow atomic_open() on positive dentry (w/o O_CREAT) Bernd Schubert
2023-10-23 23:21   ` kernel test robot
2023-10-24 13:46   ` Bernd Schubert
2023-10-28  4:46   ` Al Viro
2023-10-23 18:30 ` [PATCH v10 5/8] fuse: Revalidate positive entries in fuse_atomic_open Bernd Schubert
2023-10-28  5:18   ` Al Viro
2023-10-28  5:25   ` Al Viro
2023-10-23 18:30 ` [PATCH v10 6/8] fuse: Return D_REVALIDATE_ATOMIC for cached dentries Bernd Schubert
2023-10-23 18:30 ` [PATCH v10 7/8] fuse: Avoid code duplication in atomic open Bernd Schubert
2023-10-23 18:30 ` [PATCH v10 8/8] fuse atomic open: No fallback for symlinks, just call finish_no_open Bernd Schubert

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.