All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 0/2] virtiofs submounts forgotten after client / guest memory pressure
@ 2023-09-11 18:38 Krister Johansen
  2023-09-11 18:38 ` [PATCH 1/2] fuse: revalidate: move lookup into a separate function Krister Johansen
  2023-09-11 18:38 ` [PATCH 2/2] fuse: ensure that submounts lookup their root Krister Johansen
  0 siblings, 2 replies; 4+ messages in thread
From: Krister Johansen @ 2023-09-11 18:38 UTC (permalink / raw)
  To: Miklos Szeredi, linux-fsdevel
  Cc: linux-kernel, German Maglione, Greg Kurz, Max Reitz

Hi,
I recently ran into a situation where a virtiofs client began
encountering EBADF after the client / guest system had an OOM.  After
reproducing the issue and debugging, it appears that the problem is
caused by a virtiofsd submount being forgotten once the dentry
referencing that submount is killed by the shrinker.  In this particular
case, the submount had been bind mounted into a container's mount
namespace.  The reference count on the original dentry was 0, making it
eligible for eviction.  However, because this dentry was also the last
reference the client knew it had, it sent a forget message to the
server.  This caused all future references to the FUSE node-id from
virtiofsd perspective to become invalid.  Subsequent attempts to used
the node-id received an EBADF from the server.

This pair of patches modifies the virtiofs submount code to perform a
lookup on the nodeid that forms the root of the submount.  The patch
before this pulls the revalidate lookup code into a helper function that
can be used both in revalidate and submount superblock fill.

Tested via:

- fstests for virtiofs
- fstests for fuse (against passthrough_ll)
- manual testing to watch how refcounts change between client and server
  in response to filesytem access, umount, and eviction by the shrinker.

Changes since RFC:

- Modified fuse_fill_super_submount to always fail if dentry cannot be
  revalidated.  (Feedback from Bernd Schubert)
- Fixed up an edge case where looked up but subsequently declared
  invalid dentries were not correctly tracking nlookup.  (Error was
  introduced in my RFC).

Thanks,

-K

Krister Johansen (2):
  fuse: revalidate: move lookup into a separate function
  fuse: ensure that submounts lookup their root

 fs/fuse/dir.c    | 87 +++++++++++++++++++++++++++++++++---------------
 fs/fuse/fuse_i.h |  6 ++++
 fs/fuse/inode.c  | 43 ++++++++++++++++++++----
 3 files changed, 103 insertions(+), 33 deletions(-)

-- 
2.25.1


^ permalink raw reply	[flat|nested] 4+ messages in thread

* [PATCH 1/2] fuse: revalidate: move lookup into a separate function
  2023-09-11 18:38 [PATCH 0/2] virtiofs submounts forgotten after client / guest memory pressure Krister Johansen
@ 2023-09-11 18:38 ` Krister Johansen
  2023-09-12  0:46   ` kernel test robot
  2023-09-11 18:38 ` [PATCH 2/2] fuse: ensure that submounts lookup their root Krister Johansen
  1 sibling, 1 reply; 4+ messages in thread
From: Krister Johansen @ 2023-09-11 18:38 UTC (permalink / raw)
  To: Miklos Szeredi, linux-fsdevel
  Cc: linux-kernel, German Maglione, Greg Kurz, Max Reitz

If this refactoring seems cumbersome, it's because the goal is to move
the lookup parts of fuse_dentry_revalidate into a common function.  This
function will be used in a subsequent commit.  In the meantime, the new
function fuse_dentry_revalidate_lookup is responsible for just the
lookup and validation portions of the revalidate dance.  The
fuse_dentry_revalidate function retains the responsibility for
invalidating and mutating any state associated with the origial
fuse_inode and dentry.

Cc: stable@vger.kernel.org
Fixes: 1866d779d5d2 ("fuse: Allow fuse_fill_super_common() for submounts")
Signed-off-by: Krister Johansen <kjlx@templeofstupid.com>
---
 fs/fuse/dir.c | 87 +++++++++++++++++++++++++++++++++++----------------
 1 file changed, 60 insertions(+), 27 deletions(-)

diff --git a/fs/fuse/dir.c b/fs/fuse/dir.c
index e190d09f220d..afbdd223b0f3 100644
--- a/fs/fuse/dir.c
+++ b/fs/fuse/dir.c
@@ -183,6 +183,59 @@ static void fuse_lookup_init(struct fuse_conn *fc, struct fuse_args *args,
 	args->out_args[0].value = outarg;
 }
 
+static int fuse_dentry_revalidate_lookup(struct fuse_mount *fm,
+					 struct dentry *entry,
+					 struct inode *inode,
+					 struct fuse_entry_out *outarg,
+					 bool *lookedup)
+{
+	struct dentry *parent;
+	struct fuse_forget_link *forget;
+	struct fuse_inode *fi;
+	FUSE_ARGS(args);
+	int ret;
+
+	forget = fuse_alloc_forget();
+	ret = -ENOMEM;
+	if (!forget)
+		goto out;
+
+	parent = dget_parent(entry);
+	fuse_lookup_init(fm->fc, &args, get_node_id(d_inode(parent)),
+			 &entry->d_name, outarg);
+	ret = fuse_simple_request(fm, &args);
+	dput(parent);
+
+	/* Zero nodeid is same as -ENOENT */
+	if (!ret && !outarg->nodeid)
+		ret = -ENOENT;
+	if (!ret) {
+		fi = get_fuse_inode(inode);
+		if (outarg->nodeid != get_node_id(inode) ||
+		    (bool) IS_AUTOMOUNT(inode) != (bool) (outarg->attr.flags & FUSE_ATTR_SUBMOUNT)) {
+			fuse_queue_forget(fm->fc, forget,
+					  outarg->nodeid, 1);
+			goto invalid;
+		}
+		*lookedup = true;
+	}
+	kfree(forget);
+	if (ret == -ENOMEM || ret == -EINTR)
+		goto out;
+	if (ret || fuse_invalid_attr(&outarg->attr) ||
+	    fuse_stale_inode(inode, outarg->generation, &outarg->attr)) {
+		goto invalid;
+	}
+
+	ret = 1;
+out:
+	return ret;
+
+invalid:
+	ret = 0;
+	goto out;
+}
+
 /*
  * Check whether the dentry is still valid
  *
@@ -206,9 +259,8 @@ static int fuse_dentry_revalidate(struct dentry *entry, unsigned int flags)
 	else if (time_before64(fuse_dentry_time(entry), get_jiffies_64()) ||
 		 (flags & (LOOKUP_EXCL | LOOKUP_REVAL | LOOKUP_RENAME_TARGET))) {
 		struct fuse_entry_out outarg;
-		FUSE_ARGS(args);
-		struct fuse_forget_link *forget;
 		u64 attr_version;
+		bool lookedup = false;
 
 		/* For negative dentries, always do a fresh lookup */
 		if (!inode)
@@ -220,38 +272,19 @@ static int fuse_dentry_revalidate(struct dentry *entry, unsigned int flags)
 
 		fm = get_fuse_mount(inode);
 
-		forget = fuse_alloc_forget();
-		ret = -ENOMEM;
-		if (!forget)
-			goto out;
-
 		attr_version = fuse_get_attr_version(fm->fc);
 
-		parent = dget_parent(entry);
-		fuse_lookup_init(fm->fc, &args, get_node_id(d_inode(parent)),
-				 &entry->d_name, &outarg);
-		ret = fuse_simple_request(fm, &args);
-		dput(parent);
-		/* Zero nodeid is same as -ENOENT */
-		if (!ret && !outarg.nodeid)
-			ret = -ENOENT;
-		if (!ret) {
+		ret = fuse_dentry_revalidate_lookup(fm, entry, inode, &outarg,
+						    &lookedup);
+		if (ret == -ENOMEM || ret == -EINTR)
+			goto out;
+		if (lookedup) {
 			fi = get_fuse_inode(inode);
-			if (outarg.nodeid != get_node_id(inode) ||
-			    (bool) IS_AUTOMOUNT(inode) != (bool) (outarg.attr.flags & FUSE_ATTR_SUBMOUNT)) {
-				fuse_queue_forget(fm->fc, forget,
-						  outarg.nodeid, 1);
-				goto invalid;
-			}
 			spin_lock(&fi->lock);
 			fi->nlookup++;
 			spin_unlock(&fi->lock);
 		}
-		kfree(forget);
-		if (ret == -ENOMEM || ret == -EINTR)
-			goto out;
-		if (ret || fuse_invalid_attr(&outarg.attr) ||
-		    fuse_stale_inode(inode, outarg.generation, &outarg.attr))
+		if (ret <= 0)
 			goto invalid;
 
 		forget_all_cached_acls(inode);
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 4+ messages in thread

* [PATCH 2/2] fuse: ensure that submounts lookup their root
  2023-09-11 18:38 [PATCH 0/2] virtiofs submounts forgotten after client / guest memory pressure Krister Johansen
  2023-09-11 18:38 ` [PATCH 1/2] fuse: revalidate: move lookup into a separate function Krister Johansen
@ 2023-09-11 18:38 ` Krister Johansen
  1 sibling, 0 replies; 4+ messages in thread
From: Krister Johansen @ 2023-09-11 18:38 UTC (permalink / raw)
  To: Miklos Szeredi, linux-fsdevel
  Cc: linux-kernel, German Maglione, Greg Kurz, Max Reitz

Prior to this commit, the submount code assumed that the inode for the
root filesystem could not be evicted.  When eviction occurs the server
may forget the inode.  This author has observed a submount get an EBADF
from a virtiofsd server that resulted from the sole dentry / inode
pair getting evicted from a mount namespace and superblock where they
were originally referenced.  The dentry shrinker triggered a forget
after killing the dentry with the last reference.

As a result, a container that was also using this submount failed to
access its filesystem because it had borrowed the reference instead of
taking its own when setting up its superblock for the submount.

Fix by ensuring that submount superblock configuration looks up the
nodeid for the submount as well.

Cc: stable@vger.kernel.org
Fixes: 1866d779d5d2 ("fuse: Allow fuse_fill_super_common() for submounts")
Signed-off-by: Krister Johansen <kjlx@templeofstupid.com>
---
 fs/fuse/dir.c    | 10 +++++-----
 fs/fuse/fuse_i.h |  6 ++++++
 fs/fuse/inode.c  | 43 +++++++++++++++++++++++++++++++++++++------
 3 files changed, 48 insertions(+), 11 deletions(-)

diff --git a/fs/fuse/dir.c b/fs/fuse/dir.c
index afbdd223b0f3..37dd4800aa21 100644
--- a/fs/fuse/dir.c
+++ b/fs/fuse/dir.c
@@ -183,11 +183,11 @@ static void fuse_lookup_init(struct fuse_conn *fc, struct fuse_args *args,
 	args->out_args[0].value = outarg;
 }
 
-static int fuse_dentry_revalidate_lookup(struct fuse_mount *fm,
-					 struct dentry *entry,
-					 struct inode *inode,
-					 struct fuse_entry_out *outarg,
-					 bool *lookedup)
+int fuse_dentry_revalidate_lookup(struct fuse_mount *fm,
+				  struct dentry *entry,
+				  struct inode *inode,
+				  struct fuse_entry_out *outarg,
+				  bool *lookedup)
 {
 	struct dentry *parent;
 	struct fuse_forget_link *forget;
diff --git a/fs/fuse/fuse_i.h b/fs/fuse/fuse_i.h
index bf0b85d0b95c..f8ba293d5d30 100644
--- a/fs/fuse/fuse_i.h
+++ b/fs/fuse/fuse_i.h
@@ -1325,6 +1325,12 @@ void fuse_dax_dontcache(struct inode *inode, unsigned int flags);
 bool fuse_dax_check_alignment(struct fuse_conn *fc, unsigned int map_alignment);
 void fuse_dax_cancel_work(struct fuse_conn *fc);
 
+/* dir.c */
+int fuse_dentry_revalidate_lookup(struct fuse_mount *fm, struct dentry *entry,
+				  struct inode *inode,
+				  struct fuse_entry_out *outarg,
+				  bool *lookedup);
+
 /* ioctl.c */
 long fuse_file_ioctl(struct file *file, unsigned int cmd, unsigned long arg);
 long fuse_file_compat_ioctl(struct file *file, unsigned int cmd,
diff --git a/fs/fuse/inode.c b/fs/fuse/inode.c
index d47606206ec3..b3e7b0a397ae 100644
--- a/fs/fuse/inode.c
+++ b/fs/fuse/inode.c
@@ -1464,7 +1464,13 @@ static int fuse_fill_super_submount(struct super_block *sb,
 	struct fuse_mount *fm = get_fuse_mount_super(sb);
 	struct super_block *parent_sb = parent_fi->inode.i_sb;
 	struct fuse_attr root_attr;
+	struct fuse_inode *fi;
 	struct inode *root;
+	struct inode *parent;
+	struct dentry *pdent;
+	struct fuse_entry_out outarg;
+	bool lookedup = false;
+	int ret;
 
 	fuse_sb_defaults(sb);
 	fm->sb = sb;
@@ -1480,14 +1486,39 @@ static int fuse_fill_super_submount(struct super_block *sb,
 	if (parent_sb->s_subtype && !sb->s_subtype)
 		return -ENOMEM;
 
-	fuse_fill_attr_from_inode(&root_attr, parent_fi);
-	root = fuse_iget(sb, parent_fi->nodeid, 0, &root_attr, 0, 0);
 	/*
-	 * This inode is just a duplicate, so it is not looked up and
-	 * its nlookup should not be incremented.  fuse_iget() does
-	 * that, though, so undo it here.
+	 * It is necessary to lookup the parent_if->nodeid in case the dentry
+	 * that triggered the automount of the submount is later evicted.
+	 * If this dentry is evicted without the lookup count getting increased
+	 * on the submount root, then the server can subsequently forget this
+	 * nodeid which leads to errors when trying to access the root of the
+	 * submount.
 	 */
-	get_fuse_inode(root)->nlookup--;
+	parent = &parent_fi->inode;
+	pdent = d_find_alias(parent);
+	if (!pdent)
+		return -EINVAL;
+
+	ret = fuse_dentry_revalidate_lookup(fm, pdent, parent, &outarg,
+	    &lookedup);
+	dput(pdent);
+	/*
+	 * The new root owns this nlookup on success, and it is incremented by
+	 * fuse_iget().  In the case the lookup succeeded but revalidate fails,
+	 * ensure that the lookup count is tracked by the parent.
+	 */
+	if (ret <= 0) {
+		if (lookedup) {
+			fi = get_fuse_inode(parent);
+			spin_lock(&fi->lock);
+			fi->nlookup++;
+			spin_unlock(&fi->lock);
+		}
+		return ret ? ret : -EINVAL;
+	}
+
+	fuse_fill_attr_from_inode(&root_attr, parent_fi);
+	root = fuse_iget(sb, parent_fi->nodeid, 0, &root_attr, 0, 0);
 	sb->s_d_op = &fuse_dentry_operations;
 	sb->s_root = d_make_root(root);
 	if (!sb->s_root)
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 4+ messages in thread

* Re: [PATCH 1/2] fuse: revalidate: move lookup into a separate function
  2023-09-11 18:38 ` [PATCH 1/2] fuse: revalidate: move lookup into a separate function Krister Johansen
@ 2023-09-12  0:46   ` kernel test robot
  0 siblings, 0 replies; 4+ messages in thread
From: kernel test robot @ 2023-09-12  0:46 UTC (permalink / raw)
  To: Krister Johansen, Miklos Szeredi, linux-fsdevel
  Cc: oe-kbuild-all, linux-kernel, German Maglione, Greg Kurz, Max Reitz

Hi Krister,

kernel test robot noticed the following build warnings:

[auto build test WARNING on linus/master]
[also build test WARNING on v6.6-rc1 next-20230911]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]

url:    https://github.com/intel-lab-lkp/linux/commits/Krister-Johansen/fuse-revalidate-move-lookup-into-a-separate-function/20230912-051352
base:   linus/master
patch link:    https://lore.kernel.org/r/9a2b0c5b625cd88c561289bf7d4d7dfe305c10ed.1693440240.git.kjlx%40templeofstupid.com
patch subject: [PATCH 1/2] fuse: revalidate: move lookup into a separate function
config: m68k-allyesconfig (https://download.01.org/0day-ci/archive/20230912/202309120853.QbAMM1to-lkp@intel.com/config)
compiler: m68k-linux-gcc (GCC) 13.2.0
reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20230912/202309120853.QbAMM1to-lkp@intel.com/reproduce)

If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202309120853.QbAMM1to-lkp@intel.com/

All warnings (new ones prefixed by >>):

   fs/fuse/dir.c: In function 'fuse_dentry_revalidate_lookup':
>> fs/fuse/dir.c:194:28: warning: variable 'fi' set but not used [-Wunused-but-set-variable]
     194 |         struct fuse_inode *fi;
         |                            ^~


vim +/fi +194 fs/fuse/dir.c

   185	
   186	static int fuse_dentry_revalidate_lookup(struct fuse_mount *fm,
   187						 struct dentry *entry,
   188						 struct inode *inode,
   189						 struct fuse_entry_out *outarg,
   190						 bool *lookedup)
   191	{
   192		struct dentry *parent;
   193		struct fuse_forget_link *forget;
 > 194		struct fuse_inode *fi;
   195		FUSE_ARGS(args);
   196		int ret;
   197	
   198		forget = fuse_alloc_forget();
   199		ret = -ENOMEM;
   200		if (!forget)
   201			goto out;
   202	
   203		parent = dget_parent(entry);
   204		fuse_lookup_init(fm->fc, &args, get_node_id(d_inode(parent)),
   205				 &entry->d_name, outarg);
   206		ret = fuse_simple_request(fm, &args);
   207		dput(parent);
   208	
   209		/* Zero nodeid is same as -ENOENT */
   210		if (!ret && !outarg->nodeid)
   211			ret = -ENOENT;
   212		if (!ret) {
   213			fi = get_fuse_inode(inode);
   214			if (outarg->nodeid != get_node_id(inode) ||
   215			    (bool) IS_AUTOMOUNT(inode) != (bool) (outarg->attr.flags & FUSE_ATTR_SUBMOUNT)) {
   216				fuse_queue_forget(fm->fc, forget,
   217						  outarg->nodeid, 1);
   218				goto invalid;
   219			}
   220			*lookedup = true;
   221		}
   222		kfree(forget);
   223		if (ret == -ENOMEM || ret == -EINTR)
   224			goto out;
   225		if (ret || fuse_invalid_attr(&outarg->attr) ||
   226		    fuse_stale_inode(inode, outarg->generation, &outarg->attr)) {
   227			goto invalid;
   228		}
   229	
   230		ret = 1;
   231	out:
   232		return ret;
   233	
   234	invalid:
   235		ret = 0;
   236		goto out;
   237	}
   238	

-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki

^ permalink raw reply	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2023-09-12  0:47 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-09-11 18:38 [PATCH 0/2] virtiofs submounts forgotten after client / guest memory pressure Krister Johansen
2023-09-11 18:38 ` [PATCH 1/2] fuse: revalidate: move lookup into a separate function Krister Johansen
2023-09-12  0:46   ` kernel test robot
2023-09-11 18:38 ` [PATCH 2/2] fuse: ensure that submounts lookup their root Krister Johansen

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.