[PATCH 0/4] kernfs: proposed locking and concurrency improvement

linux-fsdevel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed

* [PATCH 0/4] kernfs: proposed locking and concurrency improvement
@ 2020-05-25  5:46 Ian Kent
  2020-05-25  5:47 ` [PATCH 1/4] kernfs: switch kernfs to use an rwsem Ian Kent
                   ` (4 more replies)
  0 siblings, 5 replies; 16+ messages in thread
From: Ian Kent @ 2020-05-25  5:46 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Al Viro, Greg Kroah-Hartman, Tejun Heo, Rick Lindsley,
	Stephen Rothwell, David Howells, Miklos Szeredi, linux-fsdevel,
	Kernel Mailing List

For very large systems with hundreds of CPUs and TBs of RAM booting can
take a very long time.

Initial reports showed that booting a configuration of several hundred
CPUs and 64TB of RAM would take more than 30 minutes and require kernel
parameters of udev.children-max=1024 systemd.default_timeout_start_sec=3600
to prevent dropping into emergency mode.

Gathering information about what's happening during the boot is a bit
challenging. But two main issues appeared to be, a large number of path
lookups for non-existent files, and high lock contention in the VFS during
path walks particularly in the dentry allocation code path.

The underlying cause of this was believed to be the sheer number of sysfs
memory objects, 100,000+ for a 64TB memory configuration.

This patch series tries to reduce the locking needed during path walks
based on the assumption that there are many path walks with a fairly
large portion of those for non-existent paths.

This was done by adding kernfs negative dentry caching (non-existent
paths) to avoid continual alloc/free cycle of dentries and a read/write
semaphore introduced to increase kernfs concurrency during path walks.

With these changes the kernel parameters of udev.children-max=2048 and
systemd.default_timeout_start_sec=300 for are still needed to get the
fastest boot times and result in boot time of under 5 minutes.

There may be opportunities for further improvements but the series here
has seen a fair amount of testing. And thinking about what else could be
done, and discussing it with Rick Lindsay, I suspect improvements will
get more difficult to implement for somewhat less improvement so I think
what we have here is a good start for now.

I think what's needed now is patch review, and if we can get through
that, send them via linux-next for broader exposure and hopefully have
them merged into mainline.
---

Ian Kent (4):
      kernfs: switch kernfs to use an rwsem
      kernfs: move revalidate to be near lookup
      kernfs: improve kernfs path resolution
      kernfs: use revision to identify directory node changes

 fs/kernfs/dir.c             |  283 ++++++++++++++++++++++++++++---------------
 fs/kernfs/file.c            |    4 -
 fs/kernfs/inode.c           |   16 +-
 fs/kernfs/kernfs-internal.h |   29 ++++
 fs/kernfs/mount.c           |   12 +-
 fs/kernfs/symlink.c         |    4 -
 include/linux/kernfs.h      |    5 +
 7 files changed, 232 insertions(+), 121 deletions(-)

--
Ian

^ permalink raw reply	[flat|nested] 16+ messages in thread

* [PATCH 1/4] kernfs: switch kernfs to use an rwsem
  2020-05-25  5:46 [PATCH 0/4] kernfs: proposed locking and concurrency improvement Ian Kent
@ 2020-05-25  5:47 ` Ian Kent
       [not found]   ` <20200606155216.GU12456@shao2-debian>
  2020-06-07  8:40   ` [PATCH 1/4] kernfs: switch kernfs to use an rwsem Ian Kent
  2020-05-25  5:47 ` [PATCH 2/4] kernfs: move revalidate to be near lookup Ian Kent
                   ` (3 subsequent siblings)
  4 siblings, 2 replies; 16+ messages in thread
From: Ian Kent @ 2020-05-25  5:47 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Al Viro, Greg Kroah-Hartman, Tejun Heo, Rick Lindsley,
	Stephen Rothwell, David Howells, Miklos Szeredi, linux-fsdevel,
	Kernel Mailing List

The kernfs global lock restricts the ability to perform kernfs node
lookup operations in parallel.

Change the kernfs mutex to an rwsem so that, when oppertunity arises,
node searches can be done in parallel.

Signed-off-by: Ian Kent <raven@themaw.net>
---
 fs/kernfs/dir.c             |  119 +++++++++++++++++++++++--------------------
 fs/kernfs/file.c            |    4 +
 fs/kernfs/inode.c           |   16 +++---
 fs/kernfs/kernfs-internal.h |    5 +-
 fs/kernfs/mount.c           |   12 ++--
 fs/kernfs/symlink.c         |    4 +
 6 files changed, 86 insertions(+), 74 deletions(-)

diff --git a/fs/kernfs/dir.c b/fs/kernfs/dir.c
index 9aec80b9d7c6..d8213fc65eba 100644
--- a/fs/kernfs/dir.c
+++ b/fs/kernfs/dir.c
@@ -17,7 +17,7 @@
 
 #include "kernfs-internal.h"
 
-DEFINE_MUTEX(kernfs_mutex);
+DECLARE_RWSEM(kernfs_rwsem);
 static DEFINE_SPINLOCK(kernfs_rename_lock);	/* kn->parent and ->name */
 static char kernfs_pr_cont_buf[PATH_MAX];	/* protected by rename_lock */
 static DEFINE_SPINLOCK(kernfs_idr_lock);	/* root->ino_idr */
@@ -26,10 +26,21 @@ static DEFINE_SPINLOCK(kernfs_idr_lock);	/* root->ino_idr */
 
 static bool kernfs_active(struct kernfs_node *kn)
 {
-	lockdep_assert_held(&kernfs_mutex);
 	return atomic_read(&kn->active) >= 0;
 }
 
+static bool kernfs_active_write(struct kernfs_node *kn)
+{
+	lockdep_assert_held_write(&kernfs_rwsem);
+	return kernfs_active(kn);
+}
+
+static bool kernfs_active_read(struct kernfs_node *kn)
+{
+	lockdep_assert_held_read(&kernfs_rwsem);
+	return kernfs_active(kn);
+}
+
 static bool kernfs_lockdep(struct kernfs_node *kn)
 {
 #ifdef CONFIG_DEBUG_LOCK_ALLOC
@@ -340,7 +351,7 @@ static int kernfs_sd_compare(const struct kernfs_node *left,
  *	@kn->parent->dir.children.
  *
  *	Locking:
- *	mutex_lock(kernfs_mutex)
+ *	kernfs_rwsem write lock
  *
  *	RETURNS:
  *	0 on susccess -EEXIST on failure.
@@ -385,7 +396,7 @@ static int kernfs_link_sibling(struct kernfs_node *kn)
  *	removed, %false if @kn wasn't on the rbtree.
  *
  *	Locking:
- *	mutex_lock(kernfs_mutex)
+ *	kernfs_rwsem write lock
  */
 static bool kernfs_unlink_sibling(struct kernfs_node *kn)
 {
@@ -455,14 +466,14 @@ void kernfs_put_active(struct kernfs_node *kn)
  * return after draining is complete.
  */
 static void kernfs_drain(struct kernfs_node *kn)
-	__releases(&kernfs_mutex) __acquires(&kernfs_mutex)
+	__releases(&kernfs_rwsem) __acquires(&kernfs_rwsem)
 {
 	struct kernfs_root *root = kernfs_root(kn);
 
-	lockdep_assert_held(&kernfs_mutex);
+	lockdep_assert_held_write(&kernfs_rwsem);
 	WARN_ON_ONCE(kernfs_active(kn));
 
-	mutex_unlock(&kernfs_mutex);
+	up_write(&kernfs_rwsem);
 
 	if (kernfs_lockdep(kn)) {
 		rwsem_acquire(&kn->dep_map, 0, 0, _RET_IP_);
@@ -481,7 +492,7 @@ static void kernfs_drain(struct kernfs_node *kn)
 
 	kernfs_drain_open_files(kn);
 
-	mutex_lock(&kernfs_mutex);
+	down_write(&kernfs_rwsem);
 }
 
 /**
@@ -560,10 +571,10 @@ static int kernfs_dop_revalidate(struct dentry *dentry, unsigned int flags)
 		goto out_bad_unlocked;
 
 	kn = kernfs_dentry_node(dentry);
-	mutex_lock(&kernfs_mutex);
+	down_read(&kernfs_rwsem);
 
 	/* The kernfs node has been deactivated */
-	if (!kernfs_active(kn))
+	if (!kernfs_active_read(kn))
 		goto out_bad;
 
 	/* The kernfs node has been moved? */
@@ -579,10 +590,10 @@ static int kernfs_dop_revalidate(struct dentry *dentry, unsigned int flags)
 	    kernfs_info(dentry->d_sb)->ns != kn->ns)
 		goto out_bad;
 
-	mutex_unlock(&kernfs_mutex);
+	up_read(&kernfs_rwsem);
 	return 1;
 out_bad:
-	mutex_unlock(&kernfs_mutex);
+	up_read(&kernfs_rwsem);
 out_bad_unlocked:
 	return 0;
 }
@@ -764,7 +775,7 @@ int kernfs_add_one(struct kernfs_node *kn)
 	bool has_ns;
 	int ret;
 
-	mutex_lock(&kernfs_mutex);
+	down_write(&kernfs_rwsem);
 
 	ret = -EINVAL;
 	has_ns = kernfs_ns_enabled(parent);
@@ -779,7 +790,7 @@ int kernfs_add_one(struct kernfs_node *kn)
 	if (parent->flags & KERNFS_EMPTY_DIR)
 		goto out_unlock;
 
-	if ((parent->flags & KERNFS_ACTIVATED) && !kernfs_active(parent))
+	if ((parent->flags & KERNFS_ACTIVATED) && !kernfs_active_write(parent))
 		goto out_unlock;
 
 	kn->hash = kernfs_name_hash(kn->name, kn->ns);
@@ -795,7 +806,7 @@ int kernfs_add_one(struct kernfs_node *kn)
 		ps_iattr->ia_mtime = ps_iattr->ia_ctime;
 	}
 
-	mutex_unlock(&kernfs_mutex);
+	up_write(&kernfs_rwsem);
 
 	/*
 	 * Activate the new node unless CREATE_DEACTIVATED is requested.
@@ -809,7 +820,7 @@ int kernfs_add_one(struct kernfs_node *kn)
 	return 0;
 
 out_unlock:
-	mutex_unlock(&kernfs_mutex);
+	up_write(&kernfs_rwsem);
 	return ret;
 }
 
@@ -830,7 +841,7 @@ static struct kernfs_node *kernfs_find_ns(struct kernfs_node *parent,
 	bool has_ns = kernfs_ns_enabled(parent);
 	unsigned int hash;
 
-	lockdep_assert_held(&kernfs_mutex);
+	lockdep_assert_held(&kernfs_rwsem);
 
 	if (has_ns != (bool)ns) {
 		WARN(1, KERN_WARNING "kernfs: ns %s in '%s' for '%s'\n",
@@ -862,7 +873,7 @@ static struct kernfs_node *kernfs_walk_ns(struct kernfs_node *parent,
 	size_t len;
 	char *p, *name;
 
-	lockdep_assert_held(&kernfs_mutex);
+	lockdep_assert_held_read(&kernfs_rwsem);
 
 	/* grab kernfs_rename_lock to piggy back on kernfs_pr_cont_buf */
 	spin_lock_irq(&kernfs_rename_lock);
@@ -902,10 +913,10 @@ struct kernfs_node *kernfs_find_and_get_ns(struct kernfs_node *parent,
 {
 	struct kernfs_node *kn;
 
-	mutex_lock(&kernfs_mutex);
+	down_read(&kernfs_rwsem);
 	kn = kernfs_find_ns(parent, name, ns);
 	kernfs_get(kn);
-	mutex_unlock(&kernfs_mutex);
+	up_read(&kernfs_rwsem);
 
 	return kn;
 }
@@ -926,10 +937,10 @@ struct kernfs_node *kernfs_walk_and_get_ns(struct kernfs_node *parent,
 {
 	struct kernfs_node *kn;
 
-	mutex_lock(&kernfs_mutex);
+	down_read(&kernfs_rwsem);
 	kn = kernfs_walk_ns(parent, path, ns);
 	kernfs_get(kn);
-	mutex_unlock(&kernfs_mutex);
+	up_read(&kernfs_rwsem);
 
 	return kn;
 }
@@ -1084,7 +1095,7 @@ static struct dentry *kernfs_iop_lookup(struct inode *dir,
 	struct inode *inode;
 	const void *ns = NULL;
 
-	mutex_lock(&kernfs_mutex);
+	down_read(&kernfs_rwsem);
 
 	if (kernfs_ns_enabled(parent))
 		ns = kernfs_info(dir->i_sb)->ns;
@@ -1107,7 +1118,7 @@ static struct dentry *kernfs_iop_lookup(struct inode *dir,
 	/* instantiate and hash dentry */
 	ret = d_splice_alias(inode, dentry);
  out_unlock:
-	mutex_unlock(&kernfs_mutex);
+	up_read(&kernfs_rwsem);
 	return ret;
 }
 
@@ -1226,7 +1237,7 @@ static struct kernfs_node *kernfs_next_descendant_post(struct kernfs_node *pos,
 {
 	struct rb_node *rbn;
 
-	lockdep_assert_held(&kernfs_mutex);
+	lockdep_assert_held_write(&kernfs_rwsem);
 
 	/* if first iteration, visit leftmost descendant which may be root */
 	if (!pos)
@@ -1262,7 +1273,7 @@ void kernfs_activate(struct kernfs_node *kn)
 {
 	struct kernfs_node *pos;
 
-	mutex_lock(&kernfs_mutex);
+	down_write(&kernfs_rwsem);
 
 	pos = NULL;
 	while ((pos = kernfs_next_descendant_post(pos, kn))) {
@@ -1276,14 +1287,14 @@ void kernfs_activate(struct kernfs_node *kn)
 		pos->flags |= KERNFS_ACTIVATED;
 	}
 
-	mutex_unlock(&kernfs_mutex);
+	up_write(&kernfs_rwsem);
 }
 
 static void __kernfs_remove(struct kernfs_node *kn)
 {
 	struct kernfs_node *pos;
 
-	lockdep_assert_held(&kernfs_mutex);
+	lockdep_assert_held_write(&kernfs_rwsem);
 
 	/*
 	 * Short-circuit if non-root @kn has already finished removal.
@@ -1298,7 +1309,7 @@ static void __kernfs_remove(struct kernfs_node *kn)
 	/* prevent any new usage under @kn by deactivating all nodes */
 	pos = NULL;
 	while ((pos = kernfs_next_descendant_post(pos, kn)))
-		if (kernfs_active(pos))
+		if (kernfs_active_write(pos))
 			atomic_add(KN_DEACTIVATED_BIAS, &pos->active);
 
 	/* deactivate and unlink the subtree node-by-node */
@@ -1306,7 +1317,7 @@ static void __kernfs_remove(struct kernfs_node *kn)
 		pos = kernfs_leftmost_descendant(kn);
 
 		/*
-		 * kernfs_drain() drops kernfs_mutex temporarily and @pos's
+		 * kernfs_drain() drops kernfs_rwsem temporarily and @pos's
 		 * base ref could have been put by someone else by the time
 		 * the function returns.  Make sure it doesn't go away
 		 * underneath us.
@@ -1353,9 +1364,9 @@ static void __kernfs_remove(struct kernfs_node *kn)
  */
 void kernfs_remove(struct kernfs_node *kn)
 {
-	mutex_lock(&kernfs_mutex);
+	down_write(&kernfs_rwsem);
 	__kernfs_remove(kn);
-	mutex_unlock(&kernfs_mutex);
+	up_write(&kernfs_rwsem);
 }
 
 /**
@@ -1442,17 +1453,17 @@ bool kernfs_remove_self(struct kernfs_node *kn)
 {
 	bool ret;
 
-	mutex_lock(&kernfs_mutex);
+	down_write(&kernfs_rwsem);
 	kernfs_break_active_protection(kn);
 
 	/*
 	 * SUICIDAL is used to arbitrate among competing invocations.  Only
 	 * the first one will actually perform removal.  When the removal
 	 * is complete, SUICIDED is set and the active ref is restored
-	 * while holding kernfs_mutex.  The ones which lost arbitration
-	 * waits for SUICDED && drained which can happen only after the
-	 * enclosing kernfs operation which executed the winning instance
-	 * of kernfs_remove_self() finished.
+	 * while holding kernfs_rwsem for write.  The ones which lost
+	 * arbitration waits for SUICIDED && drained which can happen only
+	 * after the enclosing kernfs operation which executed the winning
+	 * instance of kernfs_remove_self() finished.
 	 */
 	if (!(kn->flags & KERNFS_SUICIDAL)) {
 		kn->flags |= KERNFS_SUICIDAL;
@@ -1470,9 +1481,9 @@ bool kernfs_remove_self(struct kernfs_node *kn)
 			    atomic_read(&kn->active) == KN_DEACTIVATED_BIAS)
 				break;
 
-			mutex_unlock(&kernfs_mutex);
+			up_write(&kernfs_rwsem);
 			schedule();
-			mutex_lock(&kernfs_mutex);
+			down_write(&kernfs_rwsem);
 		}
 		finish_wait(waitq, &wait);
 		WARN_ON_ONCE(!RB_EMPTY_NODE(&kn->rb));
@@ -1480,12 +1491,12 @@ bool kernfs_remove_self(struct kernfs_node *kn)
 	}
 
 	/*
-	 * This must be done while holding kernfs_mutex; otherwise, waiting
-	 * for SUICIDED && deactivated could finish prematurely.
+	 * This must be done while holding kernfs_rwsem for write; otherwise,
+	 * waiting for SUICIDED && deactivated could finish prematurely.
 	 */
 	kernfs_unbreak_active_protection(kn);
 
-	mutex_unlock(&kernfs_mutex);
+	up_write(&kernfs_rwsem);
 	return ret;
 }
 
@@ -1509,13 +1520,13 @@ int kernfs_remove_by_name_ns(struct kernfs_node *parent, const char *name,
 		return -ENOENT;
 	}
 
-	mutex_lock(&kernfs_mutex);
+	down_write(&kernfs_rwsem);
 
 	kn = kernfs_find_ns(parent, name, ns);
 	if (kn)
 		__kernfs_remove(kn);
 
-	mutex_unlock(&kernfs_mutex);
+	up_write(&kernfs_rwsem);
 
 	if (kn)
 		return 0;
@@ -1541,10 +1552,10 @@ int kernfs_rename_ns(struct kernfs_node *kn, struct kernfs_node *new_parent,
 	if (!kn->parent)
 		return -EINVAL;
 
-	mutex_lock(&kernfs_mutex);
+	down_write(&kernfs_rwsem);
 
 	error = -ENOENT;
-	if (!kernfs_active(kn) || !kernfs_active(new_parent) ||
+	if (!kernfs_active_write(kn) || !kernfs_active_write(new_parent) ||
 	    (new_parent->flags & KERNFS_EMPTY_DIR))
 		goto out;
 
@@ -1595,7 +1606,7 @@ int kernfs_rename_ns(struct kernfs_node *kn, struct kernfs_node *new_parent,
 
 	error = 0;
  out:
-	mutex_unlock(&kernfs_mutex);
+	up_write(&kernfs_rwsem);
 	return error;
 }
 
@@ -1615,7 +1626,7 @@ static struct kernfs_node *kernfs_dir_pos(const void *ns,
 	struct kernfs_node *parent, loff_t hash, struct kernfs_node *pos)
 {
 	if (pos) {
-		int valid = kernfs_active(pos) &&
+		int valid = kernfs_active_read(pos) &&
 			pos->parent == parent && hash == pos->hash;
 		kernfs_put(pos);
 		if (!valid)
@@ -1635,7 +1646,7 @@ static struct kernfs_node *kernfs_dir_pos(const void *ns,
 		}
 	}
 	/* Skip over entries which are dying/dead or in the wrong namespace */
-	while (pos && (!kernfs_active(pos) || pos->ns != ns)) {
+	while (pos && (!kernfs_active_read(pos) || pos->ns != ns)) {
 		struct rb_node *node = rb_next(&pos->rb);
 		if (!node)
 			pos = NULL;
@@ -1656,7 +1667,7 @@ static struct kernfs_node *kernfs_dir_next_pos(const void *ns,
 				pos = NULL;
 			else
 				pos = rb_to_kn(node);
-		} while (pos && (!kernfs_active(pos) || pos->ns != ns));
+		} while (pos && (!kernfs_active_read(pos) || pos->ns != ns));
 	}
 	return pos;
 }
@@ -1670,7 +1681,7 @@ static int kernfs_fop_readdir(struct file *file, struct dir_context *ctx)
 
 	if (!dir_emit_dots(file, ctx))
 		return 0;
-	mutex_lock(&kernfs_mutex);
+	down_read(&kernfs_rwsem);
 
 	if (kernfs_ns_enabled(parent))
 		ns = kernfs_info(dentry->d_sb)->ns;
@@ -1687,12 +1698,12 @@ static int kernfs_fop_readdir(struct file *file, struct dir_context *ctx)
 		file->private_data = pos;
 		kernfs_get(pos);
 
-		mutex_unlock(&kernfs_mutex);
+		up_read(&kernfs_rwsem);
 		if (!dir_emit(ctx, name, len, ino, type))
 			return 0;
-		mutex_lock(&kernfs_mutex);
+		down_read(&kernfs_rwsem);
 	}
-	mutex_unlock(&kernfs_mutex);
+	up_read(&kernfs_rwsem);
 	file->private_data = NULL;
 	ctx->pos = INT_MAX;
 	return 0;
diff --git a/fs/kernfs/file.c b/fs/kernfs/file.c
index 34366db3620d..455caea6ab0b 100644
--- a/fs/kernfs/file.c
+++ b/fs/kernfs/file.c
@@ -879,7 +879,7 @@ static void kernfs_notify_workfn(struct work_struct *work)
 	spin_unlock_irq(&kernfs_notify_lock);
 
 	/* kick fsnotify */
-	mutex_lock(&kernfs_mutex);
+	down_write(&kernfs_rwsem);
 
 	list_for_each_entry(info, &kernfs_root(kn)->supers, node) {
 		struct kernfs_node *parent;
@@ -916,7 +916,7 @@ static void kernfs_notify_workfn(struct work_struct *work)
 		iput(inode);
 	}
 
-	mutex_unlock(&kernfs_mutex);
+	up_write(&kernfs_rwsem);
 	kernfs_put(kn);
 	goto repeat;
 }
diff --git a/fs/kernfs/inode.c b/fs/kernfs/inode.c
index fc2469a20fed..32e334e0d687 100644
--- a/fs/kernfs/inode.c
+++ b/fs/kernfs/inode.c
@@ -106,9 +106,9 @@ int kernfs_setattr(struct kernfs_node *kn, const struct iattr *iattr)
 {
 	int ret;
 
-	mutex_lock(&kernfs_mutex);
+	down_write(&kernfs_rwsem);
 	ret = __kernfs_setattr(kn, iattr);
-	mutex_unlock(&kernfs_mutex);
+	up_write(&kernfs_rwsem);
 	return ret;
 }
 
@@ -121,7 +121,7 @@ int kernfs_iop_setattr(struct dentry *dentry, struct iattr *iattr)
 	if (!kn)
 		return -EINVAL;
 
-	mutex_lock(&kernfs_mutex);
+	down_write(&kernfs_rwsem);
 	error = setattr_prepare(dentry, iattr);
 	if (error)
 		goto out;
@@ -134,7 +134,7 @@ int kernfs_iop_setattr(struct dentry *dentry, struct iattr *iattr)
 	setattr_copy(inode, iattr);
 
 out:
-	mutex_unlock(&kernfs_mutex);
+	up_write(&kernfs_rwsem);
 	return error;
 }
 
@@ -189,9 +189,9 @@ int kernfs_iop_getattr(const struct path *path, struct kstat *stat,
 	struct inode *inode = d_inode(path->dentry);
 	struct kernfs_node *kn = inode->i_private;
 
-	mutex_lock(&kernfs_mutex);
+	down_read(&kernfs_rwsem);
 	kernfs_refresh_inode(kn, inode);
-	mutex_unlock(&kernfs_mutex);
+	up_read(&kernfs_rwsem);
 
 	generic_fillattr(inode, stat);
 	return 0;
@@ -281,9 +281,9 @@ int kernfs_iop_permission(struct inode *inode, int mask)
 
 	kn = inode->i_private;
 
-	mutex_lock(&kernfs_mutex);
+	down_read(&kernfs_rwsem);
 	kernfs_refresh_inode(kn, inode);
-	mutex_unlock(&kernfs_mutex);
+	up_read(&kernfs_rwsem);
 
 	return generic_permission(inode, mask);
 }
diff --git a/fs/kernfs/kernfs-internal.h b/fs/kernfs/kernfs-internal.h
index 7ee97ef59184..097c1a989aa4 100644
--- a/fs/kernfs/kernfs-internal.h
+++ b/fs/kernfs/kernfs-internal.h
@@ -13,6 +13,7 @@
 #include <linux/lockdep.h>
 #include <linux/fs.h>
 #include <linux/mutex.h>
+#include <linux/rwsem.h>
 #include <linux/xattr.h>
 
 #include <linux/kernfs.h>
@@ -69,7 +70,7 @@ struct kernfs_super_info {
 	 */
 	const void		*ns;
 
-	/* anchored at kernfs_root->supers, protected by kernfs_mutex */
+	/* anchored at kernfs_root->supers, protected by kernfs_rwsem */
 	struct list_head	node;
 };
 #define kernfs_info(SB) ((struct kernfs_super_info *)(SB->s_fs_info))
@@ -99,7 +100,7 @@ int __kernfs_setattr(struct kernfs_node *kn, const struct iattr *iattr);
 /*
  * dir.c
  */
-extern struct mutex kernfs_mutex;
+extern struct rw_semaphore kernfs_rwsem;
 extern const struct dentry_operations kernfs_dops;
 extern const struct file_operations kernfs_dir_fops;
 extern const struct inode_operations kernfs_dir_iops;
diff --git a/fs/kernfs/mount.c b/fs/kernfs/mount.c
index 9dc7e7a64e10..baa4155ba2ed 100644
--- a/fs/kernfs/mount.c
+++ b/fs/kernfs/mount.c
@@ -255,9 +255,9 @@ static int kernfs_fill_super(struct super_block *sb, struct kernfs_fs_context *k
 	sb->s_shrink.seeks = 0;
 
 	/* get root inode, initialize and unlock it */
-	mutex_lock(&kernfs_mutex);
+	down_write(&kernfs_rwsem);
 	inode = kernfs_get_inode(sb, info->root->kn);
-	mutex_unlock(&kernfs_mutex);
+	up_write(&kernfs_rwsem);
 	if (!inode) {
 		pr_debug("kernfs: could not get root inode\n");
 		return -ENOMEM;
@@ -344,9 +344,9 @@ int kernfs_get_tree(struct fs_context *fc)
 		}
 		sb->s_flags |= SB_ACTIVE;
 
-		mutex_lock(&kernfs_mutex);
+		down_write(&kernfs_rwsem);
 		list_add(&info->node, &info->root->supers);
-		mutex_unlock(&kernfs_mutex);
+		up_write(&kernfs_rwsem);
 	}
 
 	fc->root = dget(sb->s_root);
@@ -372,9 +372,9 @@ void kernfs_kill_sb(struct super_block *sb)
 {
 	struct kernfs_super_info *info = kernfs_info(sb);
 
-	mutex_lock(&kernfs_mutex);
+	down_write(&kernfs_rwsem);
 	list_del(&info->node);
-	mutex_unlock(&kernfs_mutex);
+	up_write(&kernfs_rwsem);
 
 	/*
 	 * Remove the superblock from fs_supers/s_instances
diff --git a/fs/kernfs/symlink.c b/fs/kernfs/symlink.c
index 5432883d819f..7246b470de3c 100644
--- a/fs/kernfs/symlink.c
+++ b/fs/kernfs/symlink.c
@@ -116,9 +116,9 @@ static int kernfs_getlink(struct inode *inode, char *path)
 	struct kernfs_node *target = kn->symlink.target_kn;
 	int error;
 
-	mutex_lock(&kernfs_mutex);
+	down_write(&kernfs_rwsem);
 	error = kernfs_get_target_path(parent, target, path);
-	mutex_unlock(&kernfs_mutex);
+	up_write(&kernfs_rwsem);
 
 	return error;
 }



^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH 2/4] kernfs: move revalidate to be near lookup
  2020-05-25  5:46 [PATCH 0/4] kernfs: proposed locking and concurrency improvement Ian Kent
  2020-05-25  5:47 ` [PATCH 1/4] kernfs: switch kernfs to use an rwsem Ian Kent
@ 2020-05-25  5:47 ` Ian Kent
  2020-05-25  5:47 ` [PATCH 3/4] kernfs: improve kernfs path resolution Ian Kent
                   ` (2 subsequent siblings)
  4 siblings, 0 replies; 16+ messages in thread
From: Ian Kent @ 2020-05-25  5:47 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Al Viro, Greg Kroah-Hartman, Tejun Heo, Rick Lindsley,
	Stephen Rothwell, David Howells, Miklos Szeredi, linux-fsdevel,
	Kernel Mailing List

While the dentry operation kernfs_dop_revalidate() is grouped with
dentry'ish functions it also has a strong afinity to the inode
operation ->lookup(). And when path walk improvements are applied
it will need to call kernfs_find_ns() so move it to be near
kernfs_iop_lookup() to avoid the need for a forward declaration.

There's no functional change from this patch.

Signed-off-by: Ian Kent <raven@themaw.net>
---
 fs/kernfs/dir.c |   86 ++++++++++++++++++++++++++++---------------------------
 1 file changed, 43 insertions(+), 43 deletions(-)

diff --git a/fs/kernfs/dir.c b/fs/kernfs/dir.c
index d8213fc65eba..9b315f3b20ee 100644
--- a/fs/kernfs/dir.c
+++ b/fs/kernfs/dir.c
@@ -559,49 +559,6 @@ void kernfs_put(struct kernfs_node *kn)
 }
 EXPORT_SYMBOL_GPL(kernfs_put);
 
-static int kernfs_dop_revalidate(struct dentry *dentry, unsigned int flags)
-{
-	struct kernfs_node *kn;
-
-	if (flags & LOOKUP_RCU)
-		return -ECHILD;
-
-	/* Always perform fresh lookup for negatives */
-	if (d_really_is_negative(dentry))
-		goto out_bad_unlocked;
-
-	kn = kernfs_dentry_node(dentry);
-	down_read(&kernfs_rwsem);
-
-	/* The kernfs node has been deactivated */
-	if (!kernfs_active_read(kn))
-		goto out_bad;
-
-	/* The kernfs node has been moved? */
-	if (kernfs_dentry_node(dentry->d_parent) != kn->parent)
-		goto out_bad;
-
-	/* The kernfs node has been renamed */
-	if (strcmp(dentry->d_name.name, kn->name) != 0)
-		goto out_bad;
-
-	/* The kernfs node has been moved to a different namespace */
-	if (kn->parent && kernfs_ns_enabled(kn->parent) &&
-	    kernfs_info(dentry->d_sb)->ns != kn->ns)
-		goto out_bad;
-
-	up_read(&kernfs_rwsem);
-	return 1;
-out_bad:
-	up_read(&kernfs_rwsem);
-out_bad_unlocked:
-	return 0;
-}
-
-const struct dentry_operations kernfs_dops = {
-	.d_revalidate	= kernfs_dop_revalidate,
-};
-
 /**
  * kernfs_node_from_dentry - determine kernfs_node associated with a dentry
  * @dentry: the dentry in question
@@ -1085,6 +1042,49 @@ struct kernfs_node *kernfs_create_empty_dir(struct kernfs_node *parent,
 	return ERR_PTR(rc);
 }
 
+static int kernfs_dop_revalidate(struct dentry *dentry, unsigned int flags)
+{
+	struct kernfs_node *kn;
+
+	if (flags & LOOKUP_RCU)
+		return -ECHILD;
+
+	/* Always perform fresh lookup for negatives */
+	if (d_really_is_negative(dentry))
+		goto out_bad_unlocked;
+
+	kn = kernfs_dentry_node(dentry);
+	down_read(&kernfs_rwsem);
+
+	/* The kernfs node has been deactivated */
+	if (!kernfs_active_read(kn))
+		goto out_bad;
+
+	/* The kernfs node has been moved? */
+	if (kernfs_dentry_node(dentry->d_parent) != kn->parent)
+		goto out_bad;
+
+	/* The kernfs node has been renamed */
+	if (strcmp(dentry->d_name.name, kn->name) != 0)
+		goto out_bad;
+
+	/* The kernfs node has been moved to a different namespace */
+	if (kn->parent && kernfs_ns_enabled(kn->parent) &&
+	    kernfs_info(dentry->d_sb)->ns != kn->ns)
+		goto out_bad;
+
+	up_read(&kernfs_rwsem);
+	return 1;
+out_bad:
+	up_read(&kernfs_rwsem);
+out_bad_unlocked:
+	return 0;
+}
+
+const struct dentry_operations kernfs_dops = {
+	.d_revalidate	= kernfs_dop_revalidate,
+};
+
 static struct dentry *kernfs_iop_lookup(struct inode *dir,
 					struct dentry *dentry,
 					unsigned int flags)



^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH 3/4] kernfs: improve kernfs path resolution
  2020-05-25  5:46 [PATCH 0/4] kernfs: proposed locking and concurrency improvement Ian Kent
  2020-05-25  5:47 ` [PATCH 1/4] kernfs: switch kernfs to use an rwsem Ian Kent
  2020-05-25  5:47 ` [PATCH 2/4] kernfs: move revalidate to be near lookup Ian Kent
@ 2020-05-25  5:47 ` Ian Kent
  2020-05-25  5:47 ` [PATCH 4/4] kernfs: use revision to identify directory node changes Ian Kent
  2020-05-25  6:16 ` [PATCH 0/4] kernfs: proposed locking and concurrency improvement Greg Kroah-Hartman
  4 siblings, 0 replies; 16+ messages in thread
From: Ian Kent @ 2020-05-25  5:47 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Al Viro, Greg Kroah-Hartman, Tejun Heo, Rick Lindsley,
	Stephen Rothwell, David Howells, Miklos Szeredi, linux-fsdevel,
	Kernel Mailing List

Now that an rwsem is used by kernfs, take advantage of it to reduce
lookup overhead.

If there are many lookups (possibly many negative ones) there can
be a lot of overhead during path walks.

To reduce lookup overhead avoid allocating a new dentry where possible.

To do this stay in rcu-walk mode where possible and use the dentry cache
handling of negative hashed dentries to avoid allocating (and freeing
shortly after) new dentries on every negative lookup.

Signed-off-by: Ian Kent <raven@themaw.net>
---
 fs/kernfs/dir.c |   87 ++++++++++++++++++++++++++++++++++++++++++++++---------
 1 file changed, 72 insertions(+), 15 deletions(-)

diff --git a/fs/kernfs/dir.c b/fs/kernfs/dir.c
index 9b315f3b20ee..f4943329e578 100644
--- a/fs/kernfs/dir.c
+++ b/fs/kernfs/dir.c
@@ -1046,15 +1046,75 @@ static int kernfs_dop_revalidate(struct dentry *dentry, unsigned int flags)
 {
 	struct kernfs_node *kn;
 
-	if (flags & LOOKUP_RCU)
+	if (flags & LOOKUP_RCU) {
+		kn = kernfs_dentry_node(dentry);
+		if (!kn) {
+			/* Negative hashed dentry, tell the VFS to switch to
+			 * ref-walk mode and call us again so that node
+			 * existence can be checked.
+			 */
+			if (!d_unhashed(dentry))
+				return -ECHILD;
+
+			/* Negative unhashed dentry, this shouldn't happen
+			 * because this case occurs in rcu-walk mode after
+			 * dentry allocation which is followed by a call
+			 * to ->loopup(). But if it does happen the dentry
+			 * is surely invalid.
+			 */
+			return 0;
+		}
+
+		/* Since the dentry is positive (we got the kernfs node) a
+		 * kernfs node reference was held at the time. Now if the
+		 * dentry reference count is still greater than 0 it's still
+		 * positive so take a reference to the node to perform an
+		 * active check.
+		 */
+		if (d_count(dentry) <= 0 || !atomic_inc_not_zero(&kn->count))
+			return -ECHILD;
+
+		/* The kernfs node reference count was greater than 0, if
+		 * it's active continue in rcu-walk mode.
+		 */
+		if (kernfs_active_read(kn)) {
+			kernfs_put(kn);
+			return 1;
+		}
+
+		/* Otherwise, just tell the VFS to switch to ref-walk mode
+		 * and call us again so the kernfs node can be validated.
+		 */
+		kernfs_put(kn);
 		return -ECHILD;
+	}
 
-	/* Always perform fresh lookup for negatives */
-	if (d_really_is_negative(dentry))
-		goto out_bad_unlocked;
+	down_read(&kernfs_rwsem);
 
 	kn = kernfs_dentry_node(dentry);
-	down_read(&kernfs_rwsem);
+	if (!kn) {
+		struct kernfs_node *parent;
+
+		/* If the kernfs node can be found this is a stale negative
+		 * hashed dentry so it must be discarded and the lookup redone.
+		 */
+		parent = kernfs_dentry_node(dentry->d_parent);
+		if (parent) {
+			const void *ns = NULL;
+
+			if (kernfs_ns_enabled(parent))
+				ns = kernfs_info(dentry->d_parent->d_sb)->ns;
+			kn = kernfs_find_ns(parent, dentry->d_name.name, ns);
+			if (kn)
+				goto out_bad;
+		}
+
+		/* The kernfs node doesn't exist, leave the dentry negative
+		 * and return success.
+		 */
+		goto out;
+	}
+
 
 	/* The kernfs node has been deactivated */
 	if (!kernfs_active_read(kn))
@@ -1072,12 +1132,11 @@ static int kernfs_dop_revalidate(struct dentry *dentry, unsigned int flags)
 	if (kn->parent && kernfs_ns_enabled(kn->parent) &&
 	    kernfs_info(dentry->d_sb)->ns != kn->ns)
 		goto out_bad;
-
+out:
 	up_read(&kernfs_rwsem);
 	return 1;
 out_bad:
 	up_read(&kernfs_rwsem);
-out_bad_unlocked:
 	return 0;
 }
 
@@ -1092,7 +1151,7 @@ static struct dentry *kernfs_iop_lookup(struct inode *dir,
 	struct dentry *ret;
 	struct kernfs_node *parent = dir->i_private;
 	struct kernfs_node *kn;
-	struct inode *inode;
+	struct inode *inode = NULL;
 	const void *ns = NULL;
 
 	down_read(&kernfs_rwsem);
@@ -1102,11 +1161,9 @@ static struct dentry *kernfs_iop_lookup(struct inode *dir,
 
 	kn = kernfs_find_ns(parent, dentry->d_name.name, ns);
 
-	/* no such entry */
-	if (!kn || !kernfs_active(kn)) {
-		ret = NULL;
-		goto out_unlock;
-	}
+	/* no such entry, retain as negative hashed dentry */
+	if (!kn || !kernfs_active(kn))
+		goto out_negative;
 
 	/* attach dentry and inode */
 	inode = kernfs_get_inode(dir->i_sb, kn);
@@ -1114,10 +1171,10 @@ static struct dentry *kernfs_iop_lookup(struct inode *dir,
 		ret = ERR_PTR(-ENOMEM);
 		goto out_unlock;
 	}
-
+out_negative:
 	/* instantiate and hash dentry */
 	ret = d_splice_alias(inode, dentry);
- out_unlock:
+out_unlock:
 	up_read(&kernfs_rwsem);
 	return ret;
 }



^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH 4/4] kernfs: use revision to identify directory node changes
  2020-05-25  5:46 [PATCH 0/4] kernfs: proposed locking and concurrency improvement Ian Kent
                   ` (2 preceding siblings ...)
  2020-05-25  5:47 ` [PATCH 3/4] kernfs: improve kernfs path resolution Ian Kent
@ 2020-05-25  5:47 ` Ian Kent
  2020-05-25  6:16 ` [PATCH 0/4] kernfs: proposed locking and concurrency improvement Greg Kroah-Hartman
  4 siblings, 0 replies; 16+ messages in thread
From: Ian Kent @ 2020-05-25  5:47 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Al Viro, Greg Kroah-Hartman, Tejun Heo, Rick Lindsley,
	Stephen Rothwell, David Howells, Miklos Szeredi, linux-fsdevel,
	Kernel Mailing List

If a kernfs directory node hasn't changed there's no need to search for
an added (or removed) child dentry.

Add a revision counter to kernfs directory nodes so it can be used
to detect if a directory node has changed.

Signed-off-by: Ian Kent <raven@themaw.net>
---
 fs/kernfs/dir.c             |   17 +++++++++++++++--
 fs/kernfs/kernfs-internal.h |   24 ++++++++++++++++++++++++
 include/linux/kernfs.h      |    5 +++++
 3 files changed, 44 insertions(+), 2 deletions(-)

diff --git a/fs/kernfs/dir.c b/fs/kernfs/dir.c
index f4943329e578..03f4f179bbc4 100644
--- a/fs/kernfs/dir.c
+++ b/fs/kernfs/dir.c
@@ -383,6 +383,7 @@ static int kernfs_link_sibling(struct kernfs_node *kn)
 	/* successfully added, account subdir number */
 	if (kernfs_type(kn) == KERNFS_DIR)
 		kn->parent->dir.subdirs++;
+	kernfs_inc_rev(kn->parent);
 
 	return 0;
 }
@@ -405,6 +406,7 @@ static bool kernfs_unlink_sibling(struct kernfs_node *kn)
 
 	if (kernfs_type(kn) == KERNFS_DIR)
 		kn->parent->dir.subdirs--;
+	kernfs_inc_rev(kn->parent);
 
 	rb_erase(&kn->rb, &kn->parent->dir.children);
 	RB_CLEAR_NODE(&kn->rb);
@@ -1044,9 +1046,16 @@ struct kernfs_node *kernfs_create_empty_dir(struct kernfs_node *parent,
 
 static int kernfs_dop_revalidate(struct dentry *dentry, unsigned int flags)
 {
+	struct kernfs_node *parent;
 	struct kernfs_node *kn;
 
 	if (flags & LOOKUP_RCU) {
+		/* Directory node changed? */
+		parent = kernfs_dentry_node(dentry->d_parent);
+
+		if (!kernfs_dir_changed(parent, dentry))
+			return 1;
+
 		kn = kernfs_dentry_node(dentry);
 		if (!kn) {
 			/* Negative hashed dentry, tell the VFS to switch to
@@ -1093,8 +1102,6 @@ static int kernfs_dop_revalidate(struct dentry *dentry, unsigned int flags)
 
 	kn = kernfs_dentry_node(dentry);
 	if (!kn) {
-		struct kernfs_node *parent;
-
 		/* If the kernfs node can be found this is a stale negative
 		 * hashed dentry so it must be discarded and the lookup redone.
 		 */
@@ -1102,6 +1109,10 @@ static int kernfs_dop_revalidate(struct dentry *dentry, unsigned int flags)
 		if (parent) {
 			const void *ns = NULL;
 
+			/* Directory node changed? */
+			if (kernfs_dir_changed(parent, dentry))
+				goto out_bad;
+
 			if (kernfs_ns_enabled(parent))
 				ns = kernfs_info(dentry->d_parent->d_sb)->ns;
 			kn = kernfs_find_ns(parent, dentry->d_name.name, ns);
@@ -1156,6 +1167,8 @@ static struct dentry *kernfs_iop_lookup(struct inode *dir,
 
 	down_read(&kernfs_rwsem);
 
+	kernfs_set_rev(dentry, parent);
+
 	if (kernfs_ns_enabled(parent))
 		ns = kernfs_info(dir->i_sb)->ns;
 
diff --git a/fs/kernfs/kernfs-internal.h b/fs/kernfs/kernfs-internal.h
index 097c1a989aa4..a7b0e2074260 100644
--- a/fs/kernfs/kernfs-internal.h
+++ b/fs/kernfs/kernfs-internal.h
@@ -82,6 +82,30 @@ static inline struct kernfs_node *kernfs_dentry_node(struct dentry *dentry)
 	return d_inode(dentry)->i_private;
 }
 
+static inline void kernfs_set_rev(struct dentry *dentry,
+				  struct kernfs_node *kn)
+{
+	dentry->d_time = kn->dir.rev;
+}
+
+static inline void kernfs_inc_rev(struct kernfs_node *kn)
+{
+	if (kernfs_type(kn) == KERNFS_DIR) {
+		if (!++kn->dir.rev)
+			kn->dir.rev++;
+	}
+}
+
+static inline bool kernfs_dir_changed(struct kernfs_node *kn,
+				      struct dentry *dentry)
+{
+	if (kernfs_type(kn) == KERNFS_DIR) {
+		if (kn->dir.rev != dentry->d_time)
+			return true;
+	}
+	return false;
+}
+
 extern const struct super_operations kernfs_sops;
 extern struct kmem_cache *kernfs_node_cache, *kernfs_iattrs_cache;
 
diff --git a/include/linux/kernfs.h b/include/linux/kernfs.h
index 89f6a4214a70..74727d98e380 100644
--- a/include/linux/kernfs.h
+++ b/include/linux/kernfs.h
@@ -98,6 +98,11 @@ struct kernfs_elem_dir {
 	 * better directly in kernfs_node but is here to save space.
 	 */
 	struct kernfs_root	*root;
+	/*
+	 * Monotonic revision counter, used to identify if a directory
+	 * node has changed during revalidation.
+	 */
+	unsigned long rev;
 };
 
 struct kernfs_elem_symlink {



^ permalink raw reply related	[flat|nested] 16+ messages in thread

* Re: [PATCH 0/4] kernfs: proposed locking and concurrency improvement
  2020-05-25  5:46 [PATCH 0/4] kernfs: proposed locking and concurrency improvement Ian Kent
                   ` (3 preceding siblings ...)
  2020-05-25  5:47 ` [PATCH 4/4] kernfs: use revision to identify directory node changes Ian Kent
@ 2020-05-25  6:16 ` Greg Kroah-Hartman
  2020-05-25  7:23   ` Ian Kent
  2020-05-27 12:44   ` Rick Lindsley
  4 siblings, 2 replies; 16+ messages in thread
From: Greg Kroah-Hartman @ 2020-05-25  6:16 UTC (permalink / raw)
  To: Ian Kent
  Cc: Andrew Morton, Al Viro, Tejun Heo, Rick Lindsley,
	Stephen Rothwell, David Howells, Miklos Szeredi, linux-fsdevel,
	Kernel Mailing List

On Mon, May 25, 2020 at 01:46:59PM +0800, Ian Kent wrote:
> For very large systems with hundreds of CPUs and TBs of RAM booting can
> take a very long time.
> 
> Initial reports showed that booting a configuration of several hundred
> CPUs and 64TB of RAM would take more than 30 minutes and require kernel
> parameters of udev.children-max=1024 systemd.default_timeout_start_sec=3600
> to prevent dropping into emergency mode.
> 
> Gathering information about what's happening during the boot is a bit
> challenging. But two main issues appeared to be, a large number of path
> lookups for non-existent files, and high lock contention in the VFS during
> path walks particularly in the dentry allocation code path.
> 
> The underlying cause of this was believed to be the sheer number of sysfs
> memory objects, 100,000+ for a 64TB memory configuration.

Independant of your kernfs changes, why do we really need to represent
all of this memory with that many different "memory objects"?  What is
that providing to userspace?

I remember Ben Herrenschmidt did a lot of work on some of the kernfs and
other functions to make large-memory systems boot faster to remove some
of the complexity in our functions, but that too did not look into why
we needed to create so many objects in the first place.

Perhaps you might want to look there instead?

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 0/4] kernfs: proposed locking and concurrency improvement
  2020-05-25  6:16 ` [PATCH 0/4] kernfs: proposed locking and concurrency improvement Greg Kroah-Hartman
@ 2020-05-25  7:23   ` Ian Kent
  2020-05-25  7:31     ` Greg Kroah-Hartman
  2020-05-27 12:44   ` Rick Lindsley
  1 sibling, 1 reply; 16+ messages in thread
From: Ian Kent @ 2020-05-25  7:23 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: Andrew Morton, Al Viro, Tejun Heo, Rick Lindsley,
	Stephen Rothwell, David Howells, Miklos Szeredi, linux-fsdevel,
	Kernel Mailing List

On Mon, 2020-05-25 at 08:16 +0200, Greg Kroah-Hartman wrote:
> On Mon, May 25, 2020 at 01:46:59PM +0800, Ian Kent wrote:
> > For very large systems with hundreds of CPUs and TBs of RAM booting
> > can
> > take a very long time.
> > 
> > Initial reports showed that booting a configuration of several
> > hundred
> > CPUs and 64TB of RAM would take more than 30 minutes and require
> > kernel
> > parameters of udev.children-max=1024
> > systemd.default_timeout_start_sec=3600
> > to prevent dropping into emergency mode.
> > 
> > Gathering information about what's happening during the boot is a
> > bit
> > challenging. But two main issues appeared to be, a large number of
> > path
> > lookups for non-existent files, and high lock contention in the VFS
> > during
> > path walks particularly in the dentry allocation code path.
> > 
> > The underlying cause of this was believed to be the sheer number of
> > sysfs
> > memory objects, 100,000+ for a 64TB memory configuration.
> 
> Independant of your kernfs changes, why do we really need to
> represent
> all of this memory with that many different "memory objects"?  What
> is
> that providing to userspace?
> 
> I remember Ben Herrenschmidt did a lot of work on some of the kernfs
> and
> other functions to make large-memory systems boot faster to remove
> some
> of the complexity in our functions, but that too did not look into
> why
> we needed to create so many objects in the first place.
> 
> Perhaps you might want to look there instead?

I presumed it was a hardware design requirement or IBM VM design
requirement.

Perhaps Rick can find out more on that question.

Ian


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 0/4] kernfs: proposed locking and concurrency improvement
  2020-05-25  7:23   ` Ian Kent
@ 2020-05-25  7:31     ` Greg Kroah-Hartman
  0 siblings, 0 replies; 16+ messages in thread
From: Greg Kroah-Hartman @ 2020-05-25  7:31 UTC (permalink / raw)
  To: Ian Kent
  Cc: Andrew Morton, Al Viro, Tejun Heo, Rick Lindsley,
	Stephen Rothwell, David Howells, Miklos Szeredi, linux-fsdevel,
	Kernel Mailing List

On Mon, May 25, 2020 at 03:23:35PM +0800, Ian Kent wrote:
> On Mon, 2020-05-25 at 08:16 +0200, Greg Kroah-Hartman wrote:
> > On Mon, May 25, 2020 at 01:46:59PM +0800, Ian Kent wrote:
> > > For very large systems with hundreds of CPUs and TBs of RAM booting
> > > can
> > > take a very long time.
> > > 
> > > Initial reports showed that booting a configuration of several
> > > hundred
> > > CPUs and 64TB of RAM would take more than 30 minutes and require
> > > kernel
> > > parameters of udev.children-max=1024
> > > systemd.default_timeout_start_sec=3600
> > > to prevent dropping into emergency mode.
> > > 
> > > Gathering information about what's happening during the boot is a
> > > bit
> > > challenging. But two main issues appeared to be, a large number of
> > > path
> > > lookups for non-existent files, and high lock contention in the VFS
> > > during
> > > path walks particularly in the dentry allocation code path.
> > > 
> > > The underlying cause of this was believed to be the sheer number of
> > > sysfs
> > > memory objects, 100,000+ for a 64TB memory configuration.
> > 
> > Independant of your kernfs changes, why do we really need to
> > represent
> > all of this memory with that many different "memory objects"?  What
> > is
> > that providing to userspace?
> > 
> > I remember Ben Herrenschmidt did a lot of work on some of the kernfs
> > and
> > other functions to make large-memory systems boot faster to remove
> > some
> > of the complexity in our functions, but that too did not look into
> > why
> > we needed to create so many objects in the first place.
> > 
> > Perhaps you might want to look there instead?
> 
> I presumed it was a hardware design requirement or IBM VM design
> requirement.
> 
> Perhaps Rick can find out more on that question.

Also, why do you need to create the devices _when_ you create them?  Can
you wait until after init is up and running to start populating the
device tree with them?  That way boot can be moving on and disks can be
spinning up earlier?

Also, what about just hot-adding all of that memory after init happens?

Those two options only delay the long delay, but it could allow other
things to be moving and speed up the overall boot process.

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 0/4] kernfs: proposed locking and concurrency improvement
  2020-05-25  6:16 ` [PATCH 0/4] kernfs: proposed locking and concurrency improvement Greg Kroah-Hartman
  2020-05-25  7:23   ` Ian Kent
@ 2020-05-27 12:44   ` Rick Lindsley
  1 sibling, 0 replies; 16+ messages in thread
From: Rick Lindsley @ 2020-05-27 12:44 UTC (permalink / raw)
  To: Greg Kroah-Hartman, Ian Kent
  Cc: Andrew Morton, Al Viro, Tejun Heo, Stephen Rothwell,
	David Howells, Miklos Szeredi, linux-fsdevel,
	Kernel Mailing List

On 5/24/20 11:16 PM, Greg Kroah-Hartman wrote:

> Independant of your kernfs changes, why do we really need to represent
> all of this memory with that many different "memory objects"?  What is
> that providing to userspace?
> 
> I remember Ben Herrenschmidt did a lot of work on some of the kernfs and
> other functions to make large-memory systems boot faster to remove some
> of the complexity in our functions, but that too did not look into why
> we needed to create so many objects in the first place.

That was my first choice too.  Unfortunately, I was not consulted on this design decision, however, and now it's out there.  It is, as you guessed, a hardware "feature".  The hw believes there is value in identifying memory in 256MB chunks.  There are, unfortunately, 2^18 or over 250,000 of those on a 64TB system, compared with dozens or maybe even hundreds of other devices.

We considered a revamping of the boot process - delaying some devices, reordering operations and such - but deemed that more dangerous to other architectures.  Although this change is driven by a particular architecture, the changes we've identified are architecture independent.  The risk of breaking something else is much lower than if we start reordering boot steps.

> Also, why do you need to create the devices _when_ you create them?  Can
> you wait until after init is up and running to start populating the
> device tree with them?  That way boot can be moving on and disks can be
> spinning up earlier?

I'm not a systemd expert, unfortunately, so I don't know if it needs to happen *right* then or not.  I do know that upon successful boot, a ps reveals many systemd children still reporting in.  It's not that we're waiting on everybody; the contention is causing a delay in the discovery of key devices like disks, and *that* leads to timeouts firing in systemd rules.  Any workaround bent on dodging the problem tends to get exponentially worse when the numbers change.  We noticed this problem at 32TB, designed some timeout changes and udev options to improve it, only to have both fail at 64TB.  Worse, at 64TB, larger timeouts and udev options failed to work consistently anymore.

There are two times we do coldplugs - once in the initramfs, and then again after we switch over to the actual root.  I did try omitting memory devices after the switchover.  Much faster!  So, why is the second one necessary?  Are there some architectures that need that?  I've not found anyone who can answer that, so going that route presents us with a different big risk.

Rick

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [kernfs] ea7c5fc39a: stress-ng.stream.ops_per_sec 11827.2% improvement
       [not found]   ` <20200606155216.GU12456@shao2-debian>
@ 2020-06-06 18:18     ` Greg Kroah-Hartman
  2020-06-07  1:13       ` Ian Kent
  0 siblings, 1 reply; 16+ messages in thread
From: Greg Kroah-Hartman @ 2020-06-06 18:18 UTC (permalink / raw)
  To: kernel test robot
  Cc: Ian Kent, Andrew Morton, Al Viro, Tejun Heo, Rick Lindsley,
	Stephen Rothwell, David Howells, Miklos Szeredi, linux-fsdevel,
	Kernel Mailing List, lkp

On Sat, Jun 06, 2020 at 11:52:16PM +0800, kernel test robot wrote:
> Greeting,
> 
> FYI, we noticed a 11827.2% improvement of stress-ng.stream.ops_per_sec due to commit:
> 
> 
> commit: ea7c5fc39ab005b501e0c7666c29db36321e4f74 ("[PATCH 1/4] kernfs: switch kernfs to use an rwsem")
> url: https://github.com/0day-ci/linux/commits/Ian-Kent/kernfs-proposed-locking-and-concurrency-improvement/20200525-134849
> 

Seriously?  That's a huge performance increase, and one that feels
really odd.  Why would a stress-ng test be touching sysfs?

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [kernfs] ea7c5fc39a: stress-ng.stream.ops_per_sec 11827.2% improvement
  2020-06-06 18:18     ` [kernfs] ea7c5fc39a: stress-ng.stream.ops_per_sec 11827.2% improvement Greg Kroah-Hartman
@ 2020-06-07  1:13       ` Ian Kent
  2020-06-11  2:06         ` kernel test robot
  0 siblings, 1 reply; 16+ messages in thread
From: Ian Kent @ 2020-06-07  1:13 UTC (permalink / raw)
  To: Greg Kroah-Hartman, kernel test robot
  Cc: Andrew Morton, Al Viro, Tejun Heo, Rick Lindsley,
	Stephen Rothwell, David Howells, Miklos Szeredi, linux-fsdevel,
	Kernel Mailing List, lkp

On Sat, 2020-06-06 at 20:18 +0200, Greg Kroah-Hartman wrote:
> On Sat, Jun 06, 2020 at 11:52:16PM +0800, kernel test robot wrote:
> > Greeting,
> > 
> > FYI, we noticed a 11827.2% improvement of stress-
> > ng.stream.ops_per_sec due to commit:
> > 
> > 
> > commit: ea7c5fc39ab005b501e0c7666c29db36321e4f74 ("[PATCH 1/4]
> > kernfs: switch kernfs to use an rwsem")
> > url: 
> > https://github.com/0day-ci/linux/commits/Ian-Kent/kernfs-proposed-locking-and-concurrency-improvement/20200525-134849
> > 
> 
> Seriously?  That's a huge performance increase, and one that feels
> really odd.  Why would a stress-ng test be touching sysfs?

That is unusually high even if there's a lot of sysfs or kernfs
activity and that patch shouldn't improve VFS path walk contention
very much even if it is present.

Maybe I've missed something, and the information provided doesn't
seem to be quite enough to even make a start on it.

That's going to need some analysis which, for my part, will need to
wait probably until around rc1 time frame to allow me to get through
the push down stack (reactive, postponed due to other priorities) of
jobs I have in order to get back to the fifo queue (longer term tasks,
of which this is one) list of jobs I need to do as well, ;)

Please, kernel test robot, more information about this test and what
it's doing.

Ian


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 1/4] kernfs: switch kernfs to use an rwsem
  2020-05-25  5:47 ` [PATCH 1/4] kernfs: switch kernfs to use an rwsem Ian Kent
       [not found]   ` <20200606155216.GU12456@shao2-debian>
@ 2020-06-07  8:40   ` Ian Kent
  2020-06-08  9:58     ` Ian Kent
  1 sibling, 1 reply; 16+ messages in thread
From: Ian Kent @ 2020-06-07  8:40 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: Andrew Morton, l Viro, Greg Kroah-Hartman, Tejun Heo,
	Rick Lindsley, Stephen Rothwell, David Howells, Miklos Szeredi,
	linux-fsdevel, Kernel Mailing List

Hi Greg,

On Mon, 2020-05-25 at 13:47 +0800, Ian Kent wrote:
> @@ -189,9 +189,9 @@ int kernfs_iop_getattr(const struct path *path,
> struct kstat *stat,
>  	struct inode *inode = d_inode(path->dentry);
>  	struct kernfs_node *kn = inode->i_private;
>  
> -	mutex_lock(&kernfs_mutex);
> +	down_read(&kernfs_rwsem);
>  	kernfs_refresh_inode(kn, inode);
> -	mutex_unlock(&kernfs_mutex);
> +	up_read(&kernfs_rwsem);
>  
>  	generic_fillattr(inode, stat);
>  	return 0;
> @@ -281,9 +281,9 @@ int kernfs_iop_permission(struct inode *inode,
> int mask)
>  
>  	kn = inode->i_private;
>  
> -	mutex_lock(&kernfs_mutex);
> +	down_read(&kernfs_rwsem);
>  	kernfs_refresh_inode(kn, inode);
> -	mutex_unlock(&kernfs_mutex);
> +	up_read(&kernfs_rwsem);
>  
>  	return generic_permission(inode, mask);
>  }

I changed these from a write lock to a read lock late in the
development.

But kernfs_refresh_inode() modifies the inode so I think I should
have taken the inode lock as well as taking the read lock.

I'll look again but a second opinion (anyone) would be welcome.

Ian


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 1/4] kernfs: switch kernfs to use an rwsem
  2020-06-07  8:40   ` [PATCH 1/4] kernfs: switch kernfs to use an rwsem Ian Kent
@ 2020-06-08  9:58     ` Ian Kent
  0 siblings, 0 replies; 16+ messages in thread
From: Ian Kent @ 2020-06-08  9:58 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: Andrew Morton, l Viro, Tejun Heo, Rick Lindsley,
	Stephen Rothwell, David Howells, Miklos Szeredi, linux-fsdevel,
	Kernel Mailing List

On Sun, 2020-06-07 at 16:40 +0800, Ian Kent wrote:
> Hi Greg,
> 
> On Mon, 2020-05-25 at 13:47 +0800, Ian Kent wrote:
> > @@ -189,9 +189,9 @@ int kernfs_iop_getattr(const struct path *path,
> > struct kstat *stat,
> >  	struct inode *inode = d_inode(path->dentry);
> >  	struct kernfs_node *kn = inode->i_private;
> >  
> > -	mutex_lock(&kernfs_mutex);
> > +	down_read(&kernfs_rwsem);
> >  	kernfs_refresh_inode(kn, inode);
> > -	mutex_unlock(&kernfs_mutex);
> > +	up_read(&kernfs_rwsem);
> >  
> >  	generic_fillattr(inode, stat);
> >  	return 0;
> > @@ -281,9 +281,9 @@ int kernfs_iop_permission(struct inode *inode,
> > int mask)
> >  
> >  	kn = inode->i_private;
> >  
> > -	mutex_lock(&kernfs_mutex);
> > +	down_read(&kernfs_rwsem);
> >  	kernfs_refresh_inode(kn, inode);
> > -	mutex_unlock(&kernfs_mutex);
> > +	up_read(&kernfs_rwsem);
> >  
> >  	return generic_permission(inode, mask);
> >  }
> 
> I changed these from a write lock to a read lock late in the
> development.
> 
> But kernfs_refresh_inode() modifies the inode so I think I should
> have taken the inode lock as well as taking the read lock.
> 
> I'll look again but a second opinion (anyone) would be welcome.

I had a look at this today and came up with a couple of patches
to fix it, I don't particularly like to have to do what I did
but I don't think there's any other choice. That's because the
rb tree locking is under significant contention and changing
this back to use the write lock will adversely affect that.

But unless I can find out more about the anomalous kernel test
robot result I can't do anything!

Providing a job.yaml to reproduce it with the hardware specification
of the lkp machine it was run on and no guidelines on what that test
does and what the test needs so it can actually be reproduced isn't
that useful.

Ian


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [kernfs] ea7c5fc39a: stress-ng.stream.ops_per_sec 11827.2% improvement
  2020-06-07  1:13       ` Ian Kent
@ 2020-06-11  2:06         ` kernel test robot
  2020-06-11  2:20           ` Rick Lindsley
  2020-06-11  3:02           ` Ian Kent
  0 siblings, 2 replies; 16+ messages in thread
From: kernel test robot @ 2020-06-11  2:06 UTC (permalink / raw)
  To: Ian Kent
  Cc: Greg Kroah-Hartman, Andrew Morton, Al Viro, Tejun Heo,
	Rick Lindsley, Stephen Rothwell, David Howells, Miklos Szeredi,
	linux-fsdevel, Kernel Mailing List, lkp

On Sun, Jun 07, 2020 at 09:13:08AM +0800, Ian Kent wrote:
> On Sat, 2020-06-06 at 20:18 +0200, Greg Kroah-Hartman wrote:
> > On Sat, Jun 06, 2020 at 11:52:16PM +0800, kernel test robot wrote:
> > > Greeting,
> > > 
> > > FYI, we noticed a 11827.2% improvement of stress-
> > > ng.stream.ops_per_sec due to commit:
> > > 
> > > 
> > > commit: ea7c5fc39ab005b501e0c7666c29db36321e4f74 ("[PATCH 1/4]
> > > kernfs: switch kernfs to use an rwsem")
> > > url: 
> > > https://github.com/0day-ci/linux/commits/Ian-Kent/kernfs-proposed-locking-and-concurrency-improvement/20200525-134849
> > > 
> > 
> > Seriously?  That's a huge performance increase, and one that feels
> > really odd.  Why would a stress-ng test be touching sysfs?
> 
> That is unusually high even if there's a lot of sysfs or kernfs
> activity and that patch shouldn't improve VFS path walk contention
> very much even if it is present.
> 
> Maybe I've missed something, and the information provided doesn't
> seem to be quite enough to even make a start on it.
> 
> That's going to need some analysis which, for my part, will need to
> wait probably until around rc1 time frame to allow me to get through
> the push down stack (reactive, postponed due to other priorities) of
> jobs I have in order to get back to the fifo queue (longer term tasks,
> of which this is one) list of jobs I need to do as well, ;)
> 
> Please, kernel test robot, more information about this test and what
> it's doing.
> 

Hi Ian,

We increased the timeout of stress-ng from 1s to 32s, and there's only
3% improvement of stress-ng.stream.ops_per_sec:

fefcfc968723caf9  ea7c5fc39ab005b501e0c7666c  testcase/testparams/testbox
----------------  --------------------------  ---------------------------
         %stddev      change         %stddev
             \          |                \  
     10686               3%      11037        stress-ng/cpu-cache-performance-1HDD-100%-32s-ucode=0x500002c/lkp-csl-2sp5
     10686               3%      11037        GEO-MEAN stress-ng.stream.ops_per_sec

It seems the result of stress-ng is inaccurate if test time too
short, we'll increase the test time to avoid unreasonable results,
sorry for the inconvenience.

Best Regards,
Rong Chen

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [kernfs] ea7c5fc39a: stress-ng.stream.ops_per_sec 11827.2% improvement
  2020-06-11  2:06         ` kernel test robot
@ 2020-06-11  2:20           ` Rick Lindsley
  2020-06-11  3:02           ` Ian Kent
  1 sibling, 0 replies; 16+ messages in thread
From: Rick Lindsley @ 2020-06-11  2:20 UTC (permalink / raw)
  To: kernel test robot, Ian Kent
  Cc: Greg Kroah-Hartman, Andrew Morton, Al Viro, Tejun Heo,
	Stephen Rothwell, David Howells, Miklos Szeredi, linux-fsdevel,
	Kernel Mailing List, lkp

On 6/10/20 7:06 PM, kernel test robot wrote:
> On Sun, Jun 07, 2020 at 09:13:08AM +0800, Ian Kent wrote:
>
> It seems the result of stress-ng is inaccurate if test time too
> short, we'll increase the test time to avoid unreasonable results,
> sorry for the inconvenience.

Thank you for your response!  I was examining the 25 tests in the 'cpu-cache' class and had nothing but head scratching so far on what could be having that effect.

Rick

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [kernfs] ea7c5fc39a: stress-ng.stream.ops_per_sec 11827.2% improvement
  2020-06-11  2:06         ` kernel test robot
  2020-06-11  2:20           ` Rick Lindsley
@ 2020-06-11  3:02           ` Ian Kent
  1 sibling, 0 replies; 16+ messages in thread
From: Ian Kent @ 2020-06-11  3:02 UTC (permalink / raw)
  To: kernel test robot
  Cc: Greg Kroah-Hartman, Andrew Morton, Al Viro, Tejun Heo,
	Rick Lindsley, Stephen Rothwell, David Howells, Miklos Szeredi,
	linux-fsdevel, Kernel Mailing List, lkp

On Thu, 2020-06-11 at 10:06 +0800, kernel test robot wrote:
> On Sun, Jun 07, 2020 at 09:13:08AM +0800, Ian Kent wrote:
> > On Sat, 2020-06-06 at 20:18 +0200, Greg Kroah-Hartman wrote:
> > > On Sat, Jun 06, 2020 at 11:52:16PM +0800, kernel test robot
> > > wrote:
> > > > Greeting,
> > > > 
> > > > FYI, we noticed a 11827.2% improvement of stress-
> > > > ng.stream.ops_per_sec due to commit:
> > > > 
> > > > 
> > > > commit: ea7c5fc39ab005b501e0c7666c29db36321e4f74 ("[PATCH 1/4]
> > > > kernfs: switch kernfs to use an rwsem")
> > > > url: 
> > > > https://github.com/0day-ci/linux/commits/Ian-Kent/kernfs-proposed-locking-and-concurrency-improvement/20200525-134849
> > > > 
> > > 
> > > Seriously?  That's a huge performance increase, and one that
> > > feels
> > > really odd.  Why would a stress-ng test be touching sysfs?
> > 
> > That is unusually high even if there's a lot of sysfs or kernfs
> > activity and that patch shouldn't improve VFS path walk contention
> > very much even if it is present.
> > 
> > Maybe I've missed something, and the information provided doesn't
> > seem to be quite enough to even make a start on it.
> > 
> > That's going to need some analysis which, for my part, will need to
> > wait probably until around rc1 time frame to allow me to get
> > through
> > the push down stack (reactive, postponed due to other priorities)
> > of
> > jobs I have in order to get back to the fifo queue (longer term
> > tasks,
> > of which this is one) list of jobs I need to do as well, ;)
> > 
> > Please, kernel test robot, more information about this test and
> > what
> > it's doing.
> > 
> 
> Hi Ian,
> 
> We increased the timeout of stress-ng from 1s to 32s, and there's
> only
> 3% improvement of stress-ng.stream.ops_per_sec:
> 
> fefcfc968723caf9  ea7c5fc39ab005b501e0c7666c  testcase/testparams/tes
> tbox
> ----------------  --------------------------  -----------------------
> ----
>          %stddev      change         %stddev
>              \          |                \  
>      10686               3%      11037        stress-ng/cpu-cache-
> performance-1HDD-100%-32s-ucode=0x500002c/lkp-csl-2sp5
>      10686               3%      11037        GEO-MEAN stress-
> ng.stream.ops_per_sec
> 
> It seems the result of stress-ng is inaccurate if test time too
> short, we'll increase the test time to avoid unreasonable results,
> sorry for the inconvenience.

Haha, I was worried there wasn't anything that could be done to
work out what was wrong.

I had tried to reproduce it, and failed since the job file specifies
a host config that I simply don't have, and I don't get how to alter
the job to suit, or how to specify a host definition file.

I also couldn't work out what parameters where used in running the
test so I was about to ask on the lkp list after working through
this in a VM.

So your timing on looking into this is fortunate, for sure.
Thank you very much for that.

Now, Greg, there's that locking I changed around kernfs_refresh_inode()
that I need to fix which I re-considered as a result of this, so that's
a plus for the testing because it's certainly wrong.

I'll have another look at that and boot test it on a couple of systems
then post a v2 for you to consider. What I've done might offend your
sensibilities as it does mine, or perhaps not so much.

Ian


^ permalink raw reply	[flat|nested] 16+ messages in thread

end of thread, other threads:[~2020-06-11  3:02 UTC | newest]

Thread overview: 16+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-05-25  5:46 [PATCH 0/4] kernfs: proposed locking and concurrency improvement Ian Kent
2020-05-25  5:47 ` [PATCH 1/4] kernfs: switch kernfs to use an rwsem Ian Kent
     [not found]   ` <20200606155216.GU12456@shao2-debian>
2020-06-06 18:18     ` [kernfs] ea7c5fc39a: stress-ng.stream.ops_per_sec 11827.2% improvement Greg Kroah-Hartman
2020-06-07  1:13       ` Ian Kent
2020-06-11  2:06         ` kernel test robot
2020-06-11  2:20           ` Rick Lindsley
2020-06-11  3:02           ` Ian Kent
2020-06-07  8:40   ` [PATCH 1/4] kernfs: switch kernfs to use an rwsem Ian Kent
2020-06-08  9:58     ` Ian Kent
2020-05-25  5:47 ` [PATCH 2/4] kernfs: move revalidate to be near lookup Ian Kent
2020-05-25  5:47 ` [PATCH 3/4] kernfs: improve kernfs path resolution Ian Kent
2020-05-25  5:47 ` [PATCH 4/4] kernfs: use revision to identify directory node changes Ian Kent
2020-05-25  6:16 ` [PATCH 0/4] kernfs: proposed locking and concurrency improvement Greg Kroah-Hartman
2020-05-25  7:23   ` Ian Kent
2020-05-25  7:31     ` Greg Kroah-Hartman
2020-05-27 12:44   ` Rick Lindsley

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).