linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCHSET v2 driver-core-next] kernfs: prepare for cgroup's kernfs conversion
@ 2014-02-03 19:09 Tejun Heo
  2014-02-03 19:09 ` [PATCH 01/10] kernfs: invoke dir_ops while holding active ref of the target node Tejun Heo
                   ` (11 more replies)
  0 siblings, 12 replies; 19+ messages in thread
From: Tejun Heo @ 2014-02-03 19:09 UTC (permalink / raw)
  To: gregkh; +Cc: linux-kernel

Hello,

This patchset makes various updates in preparation of cgroup's kernfs
conversion.  Changes from the last take[L] are

* Refreshed on top of v3.14-rc1.

* 0010-kernfs-add-CONFIG_KERNFS.patch added.

This patchset contains the following 10 patches.

 0001-kernfs-invoke-dir_ops-while-holding-active-ref-of-th.patch
 0002-kernfs-rename-kernfs_dir_ops-to-kernfs_syscall_ops.patch
 0003-kernfs-implement-kernfs_syscall_ops-remount_fs-and-s.patch
 0004-kernfs-add-missing-kernfs_active-checks-in-directory.patch
 0005-kernfs-allow-nodes-to-be-created-in-the-deactivated-.patch
 0006-kernfs-implement-kernfs_ops-atomic_write_len.patch
 0007-kernfs-add-kernfs_open_file-priv.patch
 0008-kernfs-implement-kernfs_node_from_dentry-kernfs_root.patch
 0009-kernfs-implement-kernfs_get_parent-kernfs_name-path-.patch
 0010-kernfs-add-CONFIG_KERNFS.patch

 0001-0003 update kernfs_dir_ops for active ref protection, add
 additional syscall operations and rename the struct to
 kernfs_syscall_ops.

 0004-0005 implement an option to create new kernfs_nodes in
 deactivated state so that multiple nodes can be made visible
 atomically or removed without ever being visible to userland.

 0006 implements kernfs_ops->atomic_write_len which will be used to
 support cgroup cftype->max_write_len.

 0007 adds kernfs_open_file->priv.

 0008-0009 add various helpers and accessors.

 0010 adds CONFIG_KERNFS so that kernfs can be selected independently
 from sysfs.

This patchset is on top of

  v3.14-rc1 38dbfb59d117
+ [1] [PATCHSET v5 driver-core-next] kernfs, sysfs, driver-core: implement synchronous self-removal

diffstat follows.

 fs/Kconfig             |    1 
 fs/Makefile            |    3 
 fs/kernfs/Kconfig      |    7 +
 fs/kernfs/dir.c        |  325 ++++++++++++++++++++++++++++++++++++++++++++-----
 fs/kernfs/file.c       |   49 ++++---
 fs/kernfs/mount.c      |   37 +++++
 fs/sysfs/Kconfig       |    1 
 fs/sysfs/dir.c         |   44 +-----
 fs/sysfs/mount.c       |    2 
 include/linux/kernfs.h |   90 +++++++++++--
 10 files changed, 460 insertions(+), 99 deletions(-)

Thanks!

--
tejun

[L] http://lkml.kernel.org/g/1390951971-15671-1-git-send-email-tj@kernel.org
[1] http://lkml.kernel.org/g/1391454185-32143-1-git-send-email-tj@kernel.org

^ permalink raw reply	[flat|nested] 19+ messages in thread

* [PATCH 01/10] kernfs: invoke dir_ops while holding active ref of the target node
  2014-02-03 19:09 [PATCHSET v2 driver-core-next] kernfs: prepare for cgroup's kernfs conversion Tejun Heo
@ 2014-02-03 19:09 ` Tejun Heo
  2014-02-03 19:09 ` [PATCH 02/10] kernfs: rename kernfs_dir_ops to kernfs_syscall_ops Tejun Heo
                   ` (10 subsequent siblings)
  11 siblings, 0 replies; 19+ messages in thread
From: Tejun Heo @ 2014-02-03 19:09 UTC (permalink / raw)
  To: gregkh; +Cc: linux-kernel, Tejun Heo

kernfs_dir_ops are currently being invoked without any active
reference, which makes it tricky for the invoked operations to
determine whether the objects associated those nodes are safe to
access and will remain that way for the duration of such operations.

kernfs already has active_ref mechanism to deal with this which makes
the removal of a given node the synchronization point for gating the
file operations.  There's no reason for dir_ops to be any different.
Update the dir_ops handling so that active_ref is held while the
dir_ops are executing.  This guarantees that while a dir_ops is
executing the target nodes stay alive.

As kernfs_dir_ops doesn't have any in-kernel user at this point, this
doesn't affect anybody.

Signed-off-by: Tejun Heo <tj@kernel.org>
---
 fs/kernfs/dir.c        | 33 ++++++++++++++++++++++++++++++---
 include/linux/kernfs.h |  3 ++-
 2 files changed, 32 insertions(+), 4 deletions(-)

diff --git a/fs/kernfs/dir.c b/fs/kernfs/dir.c
index 8c63ae1..bfbfb48 100644
--- a/fs/kernfs/dir.c
+++ b/fs/kernfs/dir.c
@@ -654,22 +654,36 @@ static int kernfs_iop_mkdir(struct inode *dir, struct dentry *dentry,
 {
 	struct kernfs_node *parent = dir->i_private;
 	struct kernfs_dir_ops *kdops = kernfs_root(parent)->dir_ops;
+	int ret;
 
 	if (!kdops || !kdops->mkdir)
 		return -EPERM;
 
-	return kdops->mkdir(parent, dentry->d_name.name, mode);
+	if (!kernfs_get_active(parent))
+		return -ENODEV;
+
+	ret = kdops->mkdir(parent, dentry->d_name.name, mode);
+
+	kernfs_put_active(parent);
+	return ret;
 }
 
 static int kernfs_iop_rmdir(struct inode *dir, struct dentry *dentry)
 {
 	struct kernfs_node *kn  = dentry->d_fsdata;
 	struct kernfs_dir_ops *kdops = kernfs_root(kn)->dir_ops;
+	int ret;
 
 	if (!kdops || !kdops->rmdir)
 		return -EPERM;
 
-	return kdops->rmdir(kn);
+	if (!kernfs_get_active(kn))
+		return -ENODEV;
+
+	ret = kdops->rmdir(kn);
+
+	kernfs_put_active(kn);
+	return ret;
 }
 
 static int kernfs_iop_rename(struct inode *old_dir, struct dentry *old_dentry,
@@ -678,11 +692,24 @@ static int kernfs_iop_rename(struct inode *old_dir, struct dentry *old_dentry,
 	struct kernfs_node *kn  = old_dentry->d_fsdata;
 	struct kernfs_node *new_parent = new_dir->i_private;
 	struct kernfs_dir_ops *kdops = kernfs_root(kn)->dir_ops;
+	int ret;
 
 	if (!kdops || !kdops->rename)
 		return -EPERM;
 
-	return kdops->rename(kn, new_parent, new_dentry->d_name.name);
+	if (!kernfs_get_active(kn))
+		return -ENODEV;
+
+	if (!kernfs_get_active(new_parent)) {
+		kernfs_put_active(kn);
+		return -ENODEV;
+	}
+
+	ret = kdops->rename(kn, new_parent, new_dentry->d_name.name);
+
+	kernfs_put_active(new_parent);
+	kernfs_put_active(kn);
+	return ret;
 }
 
 const struct inode_operations kernfs_dir_iops = {
diff --git a/include/linux/kernfs.h b/include/linux/kernfs.h
index 02ac334..58a131d 100644
--- a/include/linux/kernfs.h
+++ b/include/linux/kernfs.h
@@ -111,7 +111,8 @@ struct kernfs_node {
  * kernfs_dir_ops may be specified on kernfs_create_root() to support
  * directory manipulation syscalls.  These optional callbacks are invoked
  * on the matching syscalls and can perform any kernfs operations which
- * don't necessarily have to be the exact operation requested.
+ * don't necessarily have to be the exact operation requested.  An active
+ * reference is held for each kernfs_node parameter.
  */
 struct kernfs_dir_ops {
 	int (*mkdir)(struct kernfs_node *parent, const char *name,
-- 
1.8.5.3


^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [PATCH 02/10] kernfs: rename kernfs_dir_ops to kernfs_syscall_ops
  2014-02-03 19:09 [PATCHSET v2 driver-core-next] kernfs: prepare for cgroup's kernfs conversion Tejun Heo
  2014-02-03 19:09 ` [PATCH 01/10] kernfs: invoke dir_ops while holding active ref of the target node Tejun Heo
@ 2014-02-03 19:09 ` Tejun Heo
  2014-02-03 19:09 ` [PATCH 03/10] kernfs: implement kernfs_syscall_ops->remount_fs() and ->show_options() Tejun Heo
                   ` (9 subsequent siblings)
  11 siblings, 0 replies; 19+ messages in thread
From: Tejun Heo @ 2014-02-03 19:09 UTC (permalink / raw)
  To: gregkh; +Cc: linux-kernel, Tejun Heo

We're gonna need non-dir syscall callbacks, which will make dir_ops a
misnomer.  Let's rename kernfs_dir_ops to kernfs_syscall_ops.

This is pure rename.

Signed-off-by: Tejun Heo <tj@kernel.org>
---
 fs/kernfs/dir.c        | 25 +++++++++++++------------
 include/linux/kernfs.h | 18 +++++++++---------
 2 files changed, 22 insertions(+), 21 deletions(-)

diff --git a/fs/kernfs/dir.c b/fs/kernfs/dir.c
index bfbfb48..f58d2f1 100644
--- a/fs/kernfs/dir.c
+++ b/fs/kernfs/dir.c
@@ -527,13 +527,14 @@ EXPORT_SYMBOL_GPL(kernfs_find_and_get_ns);
 
 /**
  * kernfs_create_root - create a new kernfs hierarchy
- * @kdops: optional directory syscall operations for the hierarchy
+ * @scops: optional syscall operations for the hierarchy
  * @priv: opaque data associated with the new directory
  *
  * Returns the root of the new hierarchy on success, ERR_PTR() value on
  * failure.
  */
-struct kernfs_root *kernfs_create_root(struct kernfs_dir_ops *kdops, void *priv)
+struct kernfs_root *kernfs_create_root(struct kernfs_syscall_ops *scops,
+				       void *priv)
 {
 	struct kernfs_root *root;
 	struct kernfs_node *kn;
@@ -556,7 +557,7 @@ struct kernfs_root *kernfs_create_root(struct kernfs_dir_ops *kdops, void *priv)
 	kn->priv = priv;
 	kn->dir.root = root;
 
-	root->dir_ops = kdops;
+	root->syscall_ops = scops;
 	root->kn = kn;
 	init_waitqueue_head(&root->deactivate_waitq);
 
@@ -653,16 +654,16 @@ static int kernfs_iop_mkdir(struct inode *dir, struct dentry *dentry,
 			    umode_t mode)
 {
 	struct kernfs_node *parent = dir->i_private;
-	struct kernfs_dir_ops *kdops = kernfs_root(parent)->dir_ops;
+	struct kernfs_syscall_ops *scops = kernfs_root(parent)->syscall_ops;
 	int ret;
 
-	if (!kdops || !kdops->mkdir)
+	if (!scops || !scops->mkdir)
 		return -EPERM;
 
 	if (!kernfs_get_active(parent))
 		return -ENODEV;
 
-	ret = kdops->mkdir(parent, dentry->d_name.name, mode);
+	ret = scops->mkdir(parent, dentry->d_name.name, mode);
 
 	kernfs_put_active(parent);
 	return ret;
@@ -671,16 +672,16 @@ static int kernfs_iop_mkdir(struct inode *dir, struct dentry *dentry,
 static int kernfs_iop_rmdir(struct inode *dir, struct dentry *dentry)
 {
 	struct kernfs_node *kn  = dentry->d_fsdata;
-	struct kernfs_dir_ops *kdops = kernfs_root(kn)->dir_ops;
+	struct kernfs_syscall_ops *scops = kernfs_root(kn)->syscall_ops;
 	int ret;
 
-	if (!kdops || !kdops->rmdir)
+	if (!scops || !scops->rmdir)
 		return -EPERM;
 
 	if (!kernfs_get_active(kn))
 		return -ENODEV;
 
-	ret = kdops->rmdir(kn);
+	ret = scops->rmdir(kn);
 
 	kernfs_put_active(kn);
 	return ret;
@@ -691,10 +692,10 @@ static int kernfs_iop_rename(struct inode *old_dir, struct dentry *old_dentry,
 {
 	struct kernfs_node *kn  = old_dentry->d_fsdata;
 	struct kernfs_node *new_parent = new_dir->i_private;
-	struct kernfs_dir_ops *kdops = kernfs_root(kn)->dir_ops;
+	struct kernfs_syscall_ops *scops = kernfs_root(kn)->syscall_ops;
 	int ret;
 
-	if (!kdops || !kdops->rename)
+	if (!scops || !scops->rename)
 		return -EPERM;
 
 	if (!kernfs_get_active(kn))
@@ -705,7 +706,7 @@ static int kernfs_iop_rename(struct inode *old_dir, struct dentry *old_dentry,
 		return -ENODEV;
 	}
 
-	ret = kdops->rename(kn, new_parent, new_dentry->d_name.name);
+	ret = scops->rename(kn, new_parent, new_dentry->d_name.name);
 
 	kernfs_put_active(new_parent);
 	kernfs_put_active(kn);
diff --git a/include/linux/kernfs.h b/include/linux/kernfs.h
index 58a131d..5ddc474 100644
--- a/include/linux/kernfs.h
+++ b/include/linux/kernfs.h
@@ -108,13 +108,13 @@ struct kernfs_node {
 };
 
 /*
- * kernfs_dir_ops may be specified on kernfs_create_root() to support
- * directory manipulation syscalls.  These optional callbacks are invoked
- * on the matching syscalls and can perform any kernfs operations which
- * don't necessarily have to be the exact operation requested.  An active
- * reference is held for each kernfs_node parameter.
+ * kernfs_syscall_ops may be specified on kernfs_create_root() to support
+ * syscalls.  These optional callbacks are invoked on the matching syscalls
+ * and can perform any kernfs operations which don't necessarily have to be
+ * the exact operation requested.  An active reference is held for each
+ * kernfs_node parameter.
  */
-struct kernfs_dir_ops {
+struct kernfs_syscall_ops {
 	int (*mkdir)(struct kernfs_node *parent, const char *name,
 		     umode_t mode);
 	int (*rmdir)(struct kernfs_node *kn);
@@ -128,7 +128,7 @@ struct kernfs_root {
 
 	/* private fields, do not use outside kernfs proper */
 	struct ida		ino_ida;
-	struct kernfs_dir_ops	*dir_ops;
+	struct kernfs_syscall_ops *syscall_ops;
 	wait_queue_head_t	deactivate_waitq;
 };
 
@@ -219,7 +219,7 @@ struct kernfs_node *kernfs_find_and_get_ns(struct kernfs_node *parent,
 void kernfs_get(struct kernfs_node *kn);
 void kernfs_put(struct kernfs_node *kn);
 
-struct kernfs_root *kernfs_create_root(struct kernfs_dir_ops *kdops,
+struct kernfs_root *kernfs_create_root(struct kernfs_syscall_ops *scops,
 				       void *priv);
 void kernfs_destroy_root(struct kernfs_root *root);
 
@@ -273,7 +273,7 @@ static inline void kernfs_get(struct kernfs_node *kn) { }
 static inline void kernfs_put(struct kernfs_node *kn) { }
 
 static inline struct kernfs_root *
-kernfs_create_root(struct kernfs_dir_ops *kdops, void *priv)
+kernfs_create_root(struct kernfs_syscall_ops *scops, void *priv)
 { return ERR_PTR(-ENOSYS); }
 
 static inline void kernfs_destroy_root(struct kernfs_root *root) { }
-- 
1.8.5.3


^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [PATCH 03/10] kernfs: implement kernfs_syscall_ops->remount_fs() and ->show_options()
  2014-02-03 19:09 [PATCHSET v2 driver-core-next] kernfs: prepare for cgroup's kernfs conversion Tejun Heo
  2014-02-03 19:09 ` [PATCH 01/10] kernfs: invoke dir_ops while holding active ref of the target node Tejun Heo
  2014-02-03 19:09 ` [PATCH 02/10] kernfs: rename kernfs_dir_ops to kernfs_syscall_ops Tejun Heo
@ 2014-02-03 19:09 ` Tejun Heo
  2014-02-03 19:09 ` [PATCH 04/10] kernfs: add missing kernfs_active() checks in directory operations Tejun Heo
                   ` (8 subsequent siblings)
  11 siblings, 0 replies; 19+ messages in thread
From: Tejun Heo @ 2014-02-03 19:09 UTC (permalink / raw)
  To: gregkh; +Cc: linux-kernel, Tejun Heo

Add two super_block related syscall callbacks ->remount_fs() and
->show_options() to kernfs_syscall_ops.  These simply forward the
matching super_operations.

Signed-off-by: Tejun Heo <tj@kernel.org>
---
 fs/kernfs/mount.c      | 23 +++++++++++++++++++++++
 include/linux/kernfs.h |  3 +++
 2 files changed, 26 insertions(+)

diff --git a/fs/kernfs/mount.c b/fs/kernfs/mount.c
index 0d6ce89..70cc698 100644
--- a/fs/kernfs/mount.c
+++ b/fs/kernfs/mount.c
@@ -19,10 +19,33 @@
 
 struct kmem_cache *kernfs_node_cache;
 
+static int kernfs_sop_remount_fs(struct super_block *sb, int *flags, char *data)
+{
+	struct kernfs_root *root = kernfs_info(sb)->root;
+	struct kernfs_syscall_ops *scops = root->syscall_ops;
+
+	if (scops && scops->remount_fs)
+		return scops->remount_fs(root, flags, data);
+	return 0;
+}
+
+static int kernfs_sop_show_options(struct seq_file *sf, struct dentry *dentry)
+{
+	struct kernfs_root *root = kernfs_root(dentry->d_fsdata);
+	struct kernfs_syscall_ops *scops = root->syscall_ops;
+
+	if (scops && scops->show_options)
+		return scops->show_options(sf, root);
+	return 0;
+}
+
 static const struct super_operations kernfs_sops = {
 	.statfs		= simple_statfs,
 	.drop_inode	= generic_delete_inode,
 	.evict_inode	= kernfs_evict_inode,
+
+	.remount_fs	= kernfs_sop_remount_fs,
+	.show_options	= kernfs_sop_show_options,
 };
 
 static int kernfs_fill_super(struct super_block *sb)
diff --git a/include/linux/kernfs.h b/include/linux/kernfs.h
index 5ddc474..5d5b7e9 100644
--- a/include/linux/kernfs.h
+++ b/include/linux/kernfs.h
@@ -115,6 +115,9 @@ struct kernfs_node {
  * kernfs_node parameter.
  */
 struct kernfs_syscall_ops {
+	int (*remount_fs)(struct kernfs_root *root, int *flags, char *data);
+	int (*show_options)(struct seq_file *sf, struct kernfs_root *root);
+
 	int (*mkdir)(struct kernfs_node *parent, const char *name,
 		     umode_t mode);
 	int (*rmdir)(struct kernfs_node *kn);
-- 
1.8.5.3


^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [PATCH 04/10] kernfs: add missing kernfs_active() checks in directory operations
  2014-02-03 19:09 [PATCHSET v2 driver-core-next] kernfs: prepare for cgroup's kernfs conversion Tejun Heo
                   ` (2 preceding siblings ...)
  2014-02-03 19:09 ` [PATCH 03/10] kernfs: implement kernfs_syscall_ops->remount_fs() and ->show_options() Tejun Heo
@ 2014-02-03 19:09 ` Tejun Heo
  2014-02-03 19:09 ` [PATCH 05/10] kernfs: allow nodes to be created in the deactivated state Tejun Heo
                   ` (7 subsequent siblings)
  11 siblings, 0 replies; 19+ messages in thread
From: Tejun Heo @ 2014-02-03 19:09 UTC (permalink / raw)
  To: gregkh; +Cc: linux-kernel, Tejun Heo

kernfs_iop_lookup(), kernfs_dir_pos() and kernfs_dir_next_pos() were
missing kernfs_active() tests before using the found kernfs_node.  As
deactivated state is currently visible only while a node is being
removed, this doesn't pose an actual problem.  e.g. lookup succeeding
on a deactivated node doesn't harm anything as the eventual file
operations are gonna fail and those failures are indistinguishible
from the cases in which the lookups had happened before the node was
deactivated.

However, we're gonna allow new nodes to be created deactivated and
then activated explicitly by the kernfs user when it sees fit.  This
is to support atomically making multiple nodes visible to userland and
thus those nodes must not be visible to userland before activated.

Let's plug the lookup and readdir holes so that deactivated nodes are
invisible to userland.

Signed-off-by: Tejun Heo <tj@kernel.org>
---
 fs/kernfs/dir.c | 11 ++++++-----
 1 file changed, 6 insertions(+), 5 deletions(-)

diff --git a/fs/kernfs/dir.c b/fs/kernfs/dir.c
index f58d2f1..89f8462 100644
--- a/fs/kernfs/dir.c
+++ b/fs/kernfs/dir.c
@@ -629,7 +629,7 @@ static struct dentry *kernfs_iop_lookup(struct inode *dir,
 	kn = kernfs_find_ns(parent, dentry->d_name.name, ns);
 
 	/* no such entry */
-	if (!kn) {
+	if (!kn || !kernfs_active(kn)) {
 		ret = NULL;
 		goto out_unlock;
 	}
@@ -1112,8 +1112,8 @@ static struct kernfs_node *kernfs_dir_pos(const void *ns,
 				break;
 		}
 	}
-	/* Skip over entries in the wrong namespace */
-	while (pos && pos->ns != ns) {
+	/* Skip over entries which are dying/dead or in the wrong namespace */
+	while (pos && (!kernfs_active(pos) || pos->ns != ns)) {
 		struct rb_node *node = rb_next(&pos->rb);
 		if (!node)
 			pos = NULL;
@@ -1127,14 +1127,15 @@ static struct kernfs_node *kernfs_dir_next_pos(const void *ns,
 	struct kernfs_node *parent, ino_t ino, struct kernfs_node *pos)
 {
 	pos = kernfs_dir_pos(ns, parent, ino, pos);
-	if (pos)
+	if (pos) {
 		do {
 			struct rb_node *node = rb_next(&pos->rb);
 			if (!node)
 				pos = NULL;
 			else
 				pos = rb_to_kn(node);
-		} while (pos && pos->ns != ns);
+		} while (pos && (!kernfs_active(pos) || pos->ns != ns));
+	}
 	return pos;
 }
 
-- 
1.8.5.3


^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [PATCH 05/10] kernfs: allow nodes to be created in the deactivated state
  2014-02-03 19:09 [PATCHSET v2 driver-core-next] kernfs: prepare for cgroup's kernfs conversion Tejun Heo
                   ` (3 preceding siblings ...)
  2014-02-03 19:09 ` [PATCH 04/10] kernfs: add missing kernfs_active() checks in directory operations Tejun Heo
@ 2014-02-03 19:09 ` Tejun Heo
  2014-02-03 19:09 ` [PATCH 06/10] kernfs: implement kernfs_ops->atomic_write_len Tejun Heo
                   ` (6 subsequent siblings)
  11 siblings, 0 replies; 19+ messages in thread
From: Tejun Heo @ 2014-02-03 19:09 UTC (permalink / raw)
  To: gregkh; +Cc: linux-kernel, Tejun Heo

Currently, kernfs_nodes are made visible to userland on creation,
which makes it difficult for kernfs users to atomically succeed or
fail creation of multiple nodes.  In addition, if something fails
after creating some nodes, the created nodes might already be in use
and their active refs need to be drained for removal, which has the
potential to introduce tricky reverse locking dependency on active_ref
depending on how the error path is synchronized.

This patch introduces per-root flag KERNFS_ROOT_CREATE_DEACTIVATED.
If set, all nodes under the root are created in the deactivated state
and stay invisible to userland until explicitly enabled by the new
kernfs_activate() API.  Also, nodes which have never been activated
are guaranteed to bypass draining on removal thus allowing error paths
to not worry about lockding dependency on active_ref draining.

Signed-off-by: Tejun Heo <tj@kernel.org>
---
 fs/kernfs/dir.c        | 71 +++++++++++++++++++++++++++++++++++++++++++++-----
 fs/sysfs/mount.c       |  2 +-
 include/linux/kernfs.h | 15 +++++++++--
 3 files changed, 78 insertions(+), 10 deletions(-)

diff --git a/fs/kernfs/dir.c b/fs/kernfs/dir.c
index 89f8462..3cff0a2 100644
--- a/fs/kernfs/dir.c
+++ b/fs/kernfs/dir.c
@@ -435,7 +435,7 @@ int kernfs_add_one(struct kernfs_node *kn)
 		goto out_unlock;
 
 	ret = -ENOENT;
-	if (!kernfs_active(parent))
+	if ((parent->flags & KERNFS_ACTIVATED) && !kernfs_active(parent))
 		goto out_unlock;
 
 	kn->hash = kernfs_name_hash(kn->name, kn->ns);
@@ -451,9 +451,19 @@ int kernfs_add_one(struct kernfs_node *kn)
 		ps_iattrs->ia_ctime = ps_iattrs->ia_mtime = CURRENT_TIME;
 	}
 
-	/* Mark the entry added into directory tree */
-	atomic_sub(KN_DEACTIVATED_BIAS, &kn->active);
-	ret = 0;
+	mutex_unlock(&kernfs_mutex);
+
+	/*
+	 * Activate the new node unless CREATE_DEACTIVATED is requested.
+	 * If not activated here, the kernfs user is responsible for
+	 * activating the node with kernfs_activate().  A node which hasn't
+	 * been activated is not visible to userland and its removal won't
+	 * trigger deactivation.
+	 */
+	if (!(kernfs_root(kn)->flags & KERNFS_ROOT_CREATE_DEACTIVATED))
+		kernfs_activate(kn);
+	return 0;
+
 out_unlock:
 	mutex_unlock(&kernfs_mutex);
 	return ret;
@@ -528,13 +538,14 @@ EXPORT_SYMBOL_GPL(kernfs_find_and_get_ns);
 /**
  * kernfs_create_root - create a new kernfs hierarchy
  * @scops: optional syscall operations for the hierarchy
+ * @flags: KERNFS_ROOT_* flags
  * @priv: opaque data associated with the new directory
  *
  * Returns the root of the new hierarchy on success, ERR_PTR() value on
  * failure.
  */
 struct kernfs_root *kernfs_create_root(struct kernfs_syscall_ops *scops,
-				       void *priv)
+				       unsigned int flags, void *priv)
 {
 	struct kernfs_root *root;
 	struct kernfs_node *kn;
@@ -553,14 +564,17 @@ struct kernfs_root *kernfs_create_root(struct kernfs_syscall_ops *scops,
 		return ERR_PTR(-ENOMEM);
 	}
 
-	atomic_sub(KN_DEACTIVATED_BIAS, &kn->active);
 	kn->priv = priv;
 	kn->dir.root = root;
 
 	root->syscall_ops = scops;
+	root->flags = flags;
 	root->kn = kn;
 	init_waitqueue_head(&root->deactivate_waitq);
 
+	if (!(root->flags & KERNFS_ROOT_CREATE_DEACTIVATED))
+		kernfs_activate(kn);
+
 	return root;
 }
 
@@ -783,6 +797,40 @@ static struct kernfs_node *kernfs_next_descendant_post(struct kernfs_node *pos,
 	return pos->parent;
 }
 
+/**
+ * kernfs_activate - activate a node which started deactivated
+ * @kn: kernfs_node whose subtree is to be activated
+ *
+ * If the root has KERNFS_ROOT_CREATE_DEACTIVATED set, a newly created node
+ * needs to be explicitly activated.  A node which hasn't been activated
+ * isn't visible to userland and deactivation is skipped during its
+ * removal.  This is useful to construct atomic init sequences where
+ * creation of multiple nodes should either succeed or fail atomically.
+ *
+ * The caller is responsible for ensuring that this function is not called
+ * after kernfs_remove*() is invoked on @kn.
+ */
+void kernfs_activate(struct kernfs_node *kn)
+{
+	struct kernfs_node *pos;
+
+	mutex_lock(&kernfs_mutex);
+
+	pos = NULL;
+	while ((pos = kernfs_next_descendant_post(pos, kn))) {
+		if (!pos || (pos->flags & KERNFS_ACTIVATED))
+			continue;
+
+		WARN_ON_ONCE(pos->parent && RB_EMPTY_NODE(&pos->rb));
+		WARN_ON_ONCE(atomic_read(&pos->active) != KN_DEACTIVATED_BIAS);
+
+		atomic_sub(KN_DEACTIVATED_BIAS, &pos->active);
+		pos->flags |= KERNFS_ACTIVATED;
+	}
+
+	mutex_unlock(&kernfs_mutex);
+}
+
 static void __kernfs_remove(struct kernfs_node *kn)
 {
 	struct kernfs_node *pos;
@@ -817,7 +865,16 @@ static void __kernfs_remove(struct kernfs_node *kn)
 		 */
 		kernfs_get(pos);
 
-		kernfs_drain(pos);
+		/*
+		 * Drain iff @kn was activated.  This avoids draining and
+		 * its lockdep annotations for nodes which have never been
+		 * activated and allows embedding kernfs_remove() in create
+		 * error paths without worrying about draining.
+		 */
+		if (kn->flags & KERNFS_ACTIVATED)
+			kernfs_drain(pos);
+		else
+			WARN_ON_ONCE(atomic_read(&kn->active) != KN_DEACTIVATED_BIAS);
 
 		/*
 		 * kernfs_unlink_sibling() succeeds once per node.  Use it
diff --git a/fs/sysfs/mount.c b/fs/sysfs/mount.c
index 6211230..5c7fdd9 100644
--- a/fs/sysfs/mount.c
+++ b/fs/sysfs/mount.c
@@ -62,7 +62,7 @@ int __init sysfs_init(void)
 {
 	int err;
 
-	sysfs_root = kernfs_create_root(NULL, NULL);
+	sysfs_root = kernfs_create_root(NULL, 0, NULL);
 	if (IS_ERR(sysfs_root))
 		return PTR_ERR(sysfs_root);
 
diff --git a/include/linux/kernfs.h b/include/linux/kernfs.h
index 5d5b7e9..4520c86 100644
--- a/include/linux/kernfs.h
+++ b/include/linux/kernfs.h
@@ -38,6 +38,7 @@ enum kernfs_node_type {
 #define KERNFS_FLAG_MASK	~KERNFS_TYPE_MASK
 
 enum kernfs_node_flag {
+	KERNFS_ACTIVATED	= 0x0010,
 	KERNFS_NS		= 0x0020,
 	KERNFS_HAS_SEQ_SHOW	= 0x0040,
 	KERNFS_HAS_MMAP		= 0x0080,
@@ -47,6 +48,11 @@ enum kernfs_node_flag {
 	KERNFS_SUICIDED		= 0x0800,
 };
 
+/* @flags for kernfs_create_root() */
+enum kernfs_root_flag {
+	KERNFS_ROOT_CREATE_DEACTIVATED = 0x0001,
+};
+
 /* type-specific structures for kernfs_node union members */
 struct kernfs_elem_dir {
 	unsigned long		subdirs;
@@ -128,6 +134,7 @@ struct kernfs_syscall_ops {
 struct kernfs_root {
 	/* published fields */
 	struct kernfs_node	*kn;
+	unsigned int		flags;	/* KERNFS_ROOT_* flags */
 
 	/* private fields, do not use outside kernfs proper */
 	struct ida		ino_ida;
@@ -223,7 +230,7 @@ void kernfs_get(struct kernfs_node *kn);
 void kernfs_put(struct kernfs_node *kn);
 
 struct kernfs_root *kernfs_create_root(struct kernfs_syscall_ops *scops,
-				       void *priv);
+				       unsigned int flags, void *priv);
 void kernfs_destroy_root(struct kernfs_root *root);
 
 struct kernfs_node *kernfs_create_dir_ns(struct kernfs_node *parent,
@@ -239,6 +246,7 @@ struct kernfs_node *__kernfs_create_file(struct kernfs_node *parent,
 struct kernfs_node *kernfs_create_link(struct kernfs_node *parent,
 				       const char *name,
 				       struct kernfs_node *target);
+void kernfs_activate(struct kernfs_node *kn);
 void kernfs_remove(struct kernfs_node *kn);
 void kernfs_break_active_protection(struct kernfs_node *kn);
 void kernfs_unbreak_active_protection(struct kernfs_node *kn);
@@ -276,7 +284,8 @@ static inline void kernfs_get(struct kernfs_node *kn) { }
 static inline void kernfs_put(struct kernfs_node *kn) { }
 
 static inline struct kernfs_root *
-kernfs_create_root(struct kernfs_syscall_ops *scops, void *priv)
+kernfs_create_root(struct kernfs_syscall_ops *scops, unsigned int flags,
+		   void *priv)
 { return ERR_PTR(-ENOSYS); }
 
 static inline void kernfs_destroy_root(struct kernfs_root *root) { }
@@ -298,6 +307,8 @@ kernfs_create_link(struct kernfs_node *parent, const char *name,
 		   struct kernfs_node *target)
 { return ERR_PTR(-ENOSYS); }
 
+static inline void kernfs_activate(struct kernfs_node *kn) { }
+
 static inline void kernfs_remove(struct kernfs_node *kn) { }
 
 static inline bool kernfs_remove_self(struct kernfs_node *kn)
-- 
1.8.5.3


^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [PATCH 06/10] kernfs: implement kernfs_ops->atomic_write_len
  2014-02-03 19:09 [PATCHSET v2 driver-core-next] kernfs: prepare for cgroup's kernfs conversion Tejun Heo
                   ` (4 preceding siblings ...)
  2014-02-03 19:09 ` [PATCH 05/10] kernfs: allow nodes to be created in the deactivated state Tejun Heo
@ 2014-02-03 19:09 ` Tejun Heo
  2014-02-03 19:09 ` [PATCH 07/10] kernfs: add kernfs_open_file->priv Tejun Heo
                   ` (5 subsequent siblings)
  11 siblings, 0 replies; 19+ messages in thread
From: Tejun Heo @ 2014-02-03 19:09 UTC (permalink / raw)
  To: gregkh; +Cc: linux-kernel, Tejun Heo

A write to a kernfs_node is buffered through a kernel buffer.  Writes
<= PAGE_SIZE are performed atomically, while larger ones are executed
in PAGE_SIZE chunks.  While this is enough for sysfs, cgroup which is
scheduled to be converted to use kernfs needs a bit more control over
it.

This patch adds kernfs_ops->atomic_write_len.  If not set (zero), the
behavior stays the same.  If set, writes upto the size are executed
atomically and larger writes are rejected with -E2BIG.

A different implementation strategy would be allowing configuring
chunking size while making the original write size available to the
write method; however, such strategy, while being more complicated,
doesn't really buy anything.  If the write implementation has to
handle chunking, the specific chunk size shouldn't matter all that
much.

Signed-off-by: Tejun Heo <tj@kernel.org>
---
 fs/kernfs/file.c       | 49 +++++++++++++++++++++++++++++++------------------
 include/linux/kernfs.h |  8 ++++++--
 2 files changed, 37 insertions(+), 20 deletions(-)

diff --git a/fs/kernfs/file.c b/fs/kernfs/file.c
index 10a8c91..ddcb471 100644
--- a/fs/kernfs/file.c
+++ b/fs/kernfs/file.c
@@ -252,19 +252,9 @@ static ssize_t kernfs_fop_write(struct file *file, const char __user *user_buf,
 				size_t count, loff_t *ppos)
 {
 	struct kernfs_open_file *of = kernfs_of(file);
-	ssize_t len = min_t(size_t, count, PAGE_SIZE);
 	const struct kernfs_ops *ops;
-	char *buf;
-
-	buf = kmalloc(len + 1, GFP_KERNEL);
-	if (!buf)
-		return -ENOMEM;
-
-	if (copy_from_user(buf, user_buf, len)) {
-		len = -EFAULT;
-		goto out_free;
-	}
-	buf[len] = '\0';	/* guarantee string termination */
+	char *buf = NULL;
+	ssize_t len;
 
 	/*
 	 * @of->mutex nests outside active ref and is just to ensure that
@@ -273,22 +263,45 @@ static ssize_t kernfs_fop_write(struct file *file, const char __user *user_buf,
 	mutex_lock(&of->mutex);
 	if (!kernfs_get_active(of->kn)) {
 		mutex_unlock(&of->mutex);
-		len = -ENODEV;
-		goto out_free;
+		return -ENODEV;
 	}
 
 	ops = kernfs_ops(of->kn);
-	if (ops->write)
-		len = ops->write(of, buf, len, *ppos);
-	else
+	if (!ops->write) {
 		len = -EINVAL;
+		goto out_unlock;
+	}
+
+	if (ops->atomic_write_len) {
+		len = count;
+		if (len > ops->atomic_write_len) {
+			len = -E2BIG;
+			goto out_unlock;
+		}
+	} else {
+		len = min_t(size_t, count, PAGE_SIZE);
+	}
+
+	buf = kmalloc(len + 1, GFP_KERNEL);
+	if (!buf) {
+		len = -ENOMEM;
+		goto out_unlock;
+	}
 
+	if (copy_from_user(buf, user_buf, len)) {
+		len = -EFAULT;
+		goto out_unlock;
+	}
+	buf[len] = '\0';	/* guarantee string termination */
+
+	len = ops->write(of, buf, len, *ppos);
+out_unlock:
 	kernfs_put_active(of->kn);
 	mutex_unlock(&of->mutex);
 
 	if (len > 0)
 		*ppos += len;
-out_free:
+
 	kfree(buf);
 	return len;
 }
diff --git a/include/linux/kernfs.h b/include/linux/kernfs.h
index 4520c86..47f5235 100644
--- a/include/linux/kernfs.h
+++ b/include/linux/kernfs.h
@@ -178,9 +178,13 @@ struct kernfs_ops {
 			loff_t off);
 
 	/*
-	 * write() is bounced through kernel buffer and a write larger than
-	 * PAGE_SIZE results in partial operation of PAGE_SIZE.
+	 * write() is bounced through kernel buffer.  If atomic_write_len
+	 * is not set, a write larger than PAGE_SIZE results in partial
+	 * operations of PAGE_SIZE chunks.  If atomic_write_len is set,
+	 * writes upto the specified size are executed atomically but
+	 * larger ones are rejected with -E2BIG.
 	 */
+	size_t atomic_write_len;
 	ssize_t (*write)(struct kernfs_open_file *of, char *buf, size_t bytes,
 			 loff_t off);
 
-- 
1.8.5.3


^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [PATCH 07/10] kernfs: add kernfs_open_file->priv
  2014-02-03 19:09 [PATCHSET v2 driver-core-next] kernfs: prepare for cgroup's kernfs conversion Tejun Heo
                   ` (5 preceding siblings ...)
  2014-02-03 19:09 ` [PATCH 06/10] kernfs: implement kernfs_ops->atomic_write_len Tejun Heo
@ 2014-02-03 19:09 ` Tejun Heo
  2014-02-03 19:09 ` [PATCH 08/10] kernfs: implement kernfs_node_from_dentry(), kernfs_root_from_sb() and kernfs_rename() Tejun Heo
                   ` (4 subsequent siblings)
  11 siblings, 0 replies; 19+ messages in thread
From: Tejun Heo @ 2014-02-03 19:09 UTC (permalink / raw)
  To: gregkh; +Cc: linux-kernel, Tejun Heo

Add a private data field to be used by kernfs file operations.  This
generally makes sense and will be used by cgroup.

Signed-off-by: Tejun Heo <tj@kernel.org>
---
 include/linux/kernfs.h | 1 +
 1 file changed, 1 insertion(+)

diff --git a/include/linux/kernfs.h b/include/linux/kernfs.h
index 47f5235..9ca0f09 100644
--- a/include/linux/kernfs.h
+++ b/include/linux/kernfs.h
@@ -146,6 +146,7 @@ struct kernfs_open_file {
 	/* published fields */
 	struct kernfs_node	*kn;
 	struct file		*file;
+	void			*priv;
 
 	/* private fields, do not use outside kernfs proper */
 	struct mutex		mutex;
-- 
1.8.5.3


^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [PATCH 08/10] kernfs: implement kernfs_node_from_dentry(), kernfs_root_from_sb() and kernfs_rename()
  2014-02-03 19:09 [PATCHSET v2 driver-core-next] kernfs: prepare for cgroup's kernfs conversion Tejun Heo
                   ` (6 preceding siblings ...)
  2014-02-03 19:09 ` [PATCH 07/10] kernfs: add kernfs_open_file->priv Tejun Heo
@ 2014-02-03 19:09 ` Tejun Heo
  2014-02-03 19:09 ` [PATCH 09/10] kernfs: implement kernfs_get_parent(), kernfs_name/path() and friends Tejun Heo
                   ` (3 subsequent siblings)
  11 siblings, 0 replies; 19+ messages in thread
From: Tejun Heo @ 2014-02-03 19:09 UTC (permalink / raw)
  To: gregkh; +Cc: linux-kernel, Tejun Heo, kbuild test robot

Implement helpers to determine node from dentry and root from
super_block.  Also add a kernfs_rename_ns() wrapper which assumes NULL
namespace.  These generally make sense and will be used by cgroup.

v2: Some dummy implementations for !CONFIG_SYSFS was missing.  Fixed.
    Reported by kbuild test robot.

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: kbuild test robot <fengguang.wu@intel.com>
---
 fs/kernfs/dir.c        | 18 ++++++++++++++++++
 fs/kernfs/mount.c      | 14 ++++++++++++++
 include/linux/kernfs.h | 16 ++++++++++++++++
 3 files changed, 48 insertions(+)

diff --git a/fs/kernfs/dir.c b/fs/kernfs/dir.c
index 3cff0a2..42a250f 100644
--- a/fs/kernfs/dir.c
+++ b/fs/kernfs/dir.c
@@ -350,6 +350,24 @@ const struct dentry_operations kernfs_dops = {
 	.d_release	= kernfs_dop_release,
 };
 
+/**
+ * kernfs_node_from_dentry - determine kernfs_node associated with a dentry
+ * @dentry: the dentry in question
+ *
+ * Return the kernfs_node associated with @dentry.  If @dentry is not a
+ * kernfs one, %NULL is returned.
+ *
+ * While the returned kernfs_node will stay accessible as long as @dentry
+ * is accessible, the returned node can be in any state and the caller is
+ * fully responsible for determining what's accessible.
+ */
+struct kernfs_node *kernfs_node_from_dentry(struct dentry *dentry)
+{
+	if (dentry->d_op == &kernfs_dops)
+		return dentry->d_fsdata;
+	return NULL;
+}
+
 static struct kernfs_node *__kernfs_new_node(struct kernfs_root *root,
 					     const char *name, umode_t mode,
 					     unsigned flags)
diff --git a/fs/kernfs/mount.c b/fs/kernfs/mount.c
index 70cc698..e5b28b0 100644
--- a/fs/kernfs/mount.c
+++ b/fs/kernfs/mount.c
@@ -48,6 +48,20 @@ static const struct super_operations kernfs_sops = {
 	.show_options	= kernfs_sop_show_options,
 };
 
+/**
+ * kernfs_root_from_sb - determine kernfs_root associated with a super_block
+ * @sb: the super_block in question
+ *
+ * Return the kernfs_root associated with @sb.  If @sb is not a kernfs one,
+ * %NULL is returned.
+ */
+struct kernfs_root *kernfs_root_from_sb(struct super_block *sb)
+{
+	if (sb->s_op == &kernfs_sops)
+		return kernfs_info(sb)->root;
+	return NULL;
+}
+
 static int kernfs_fill_super(struct super_block *sb)
 {
 	struct kernfs_super_info *info = kernfs_info(sb);
diff --git a/include/linux/kernfs.h b/include/linux/kernfs.h
index 9ca0f09..9c89904 100644
--- a/include/linux/kernfs.h
+++ b/include/linux/kernfs.h
@@ -234,6 +234,9 @@ struct kernfs_node *kernfs_find_and_get_ns(struct kernfs_node *parent,
 void kernfs_get(struct kernfs_node *kn);
 void kernfs_put(struct kernfs_node *kn);
 
+struct kernfs_node *kernfs_node_from_dentry(struct dentry *dentry);
+struct kernfs_root *kernfs_root_from_sb(struct super_block *sb);
+
 struct kernfs_root *kernfs_create_root(struct kernfs_syscall_ops *scops,
 				       unsigned int flags, void *priv);
 void kernfs_destroy_root(struct kernfs_root *root);
@@ -288,6 +291,12 @@ kernfs_find_and_get_ns(struct kernfs_node *parent, const char *name,
 static inline void kernfs_get(struct kernfs_node *kn) { }
 static inline void kernfs_put(struct kernfs_node *kn) { }
 
+static inline struct kernfs_node *kernfs_node_from_dentry(struct dentry *dentry)
+{ return NULL; }
+
+static inline struct kernfs_root *kernfs_root_from_sb(struct super_block *sb)
+{ return NULL; }
+
 static inline struct kernfs_root *
 kernfs_create_root(struct kernfs_syscall_ops *scops, unsigned int flags,
 		   void *priv)
@@ -388,6 +397,13 @@ static inline int kernfs_remove_by_name(struct kernfs_node *parent,
 	return kernfs_remove_by_name_ns(parent, name, NULL);
 }
 
+static inline int kernfs_rename(struct kernfs_node *kn,
+				struct kernfs_node *new_parent,
+				const char *new_name)
+{
+	return kernfs_rename_ns(kn, new_parent, new_name, NULL);
+}
+
 static inline struct dentry *
 kernfs_mount(struct file_system_type *fs_type, int flags,
 	     struct kernfs_root *root)
-- 
1.8.5.3


^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [PATCH 09/10] kernfs: implement kernfs_get_parent(), kernfs_name/path() and friends
  2014-02-03 19:09 [PATCHSET v2 driver-core-next] kernfs: prepare for cgroup's kernfs conversion Tejun Heo
                   ` (7 preceding siblings ...)
  2014-02-03 19:09 ` [PATCH 08/10] kernfs: implement kernfs_node_from_dentry(), kernfs_root_from_sb() and kernfs_rename() Tejun Heo
@ 2014-02-03 19:09 ` Tejun Heo
  2014-02-07 18:40   ` [PATCH v2 " Tejun Heo
  2014-02-03 19:09 ` [PATCH 10/10] kernfs: add CONFIG_KERNFS Tejun Heo
                   ` (2 subsequent siblings)
  11 siblings, 1 reply; 19+ messages in thread
From: Tejun Heo @ 2014-02-03 19:09 UTC (permalink / raw)
  To: gregkh; +Cc: linux-kernel, Tejun Heo

kernfs_node->parent and ->name are currently marked as "published"
indicating that kernfs users may access them directly; however, those
fields may get updated by kernfs_rename[_ns]() and unrestricted access
may lead to erroneous values or oops.

Protect ->parent and ->name updates with a irq-safe spinlock
kernfs_rename_lock and implement the following accessors for these
fields.

* kernfs_name()		- format the node's name into the specified buffer
* kernfs_path()		- format the node's path into the specified buffer
* pr_cont_kernfs_name()	- pr_cont a node's name (doesn't need buffer)
* pr_cont_kernfs_path()	- pr_cont a node's path (doesn't need buffer)
* kernfs_get_parent()	- pin and return a node's parent

All can be called under any context.  The recursive sysfs_pathname()
in fs/sysfs/dir.c is replaced with kernfs_path() and
sysfs_rename_dir_ns() is updated to use kernfs_get_parent() instead of
dereferencing parent directly.

Signed-off-by: Tejun Heo <tj@kernel.org>
---
 fs/kernfs/dir.c        | 175 ++++++++++++++++++++++++++++++++++++++++++++++---
 fs/sysfs/dir.c         |  44 ++++---------
 include/linux/kernfs.h |  26 +++++++-
 3 files changed, 203 insertions(+), 42 deletions(-)

diff --git a/fs/kernfs/dir.c b/fs/kernfs/dir.c
index 42a250f..a347792 100644
--- a/fs/kernfs/dir.c
+++ b/fs/kernfs/dir.c
@@ -19,6 +19,8 @@
 #include "kernfs-internal.h"
 
 DEFINE_MUTEX(kernfs_mutex);
+static DEFINE_SPINLOCK(kernfs_rename_lock);	/* kn->parent and ->name */
+static char kernfs_pr_cont_buf[PATH_MAX];	/* protected by rename_lock */
 
 #define rb_to_kn(X) rb_entry((X), struct kernfs_node, rb)
 
@@ -37,6 +39,141 @@ static bool kernfs_lockdep(struct kernfs_node *kn)
 #endif
 }
 
+static int kernfs_name_locked(struct kernfs_node *kn, char *buf, size_t buflen)
+{
+	return strlcpy(buf, kn->parent ? kn->name : "/", buflen);
+}
+
+static char * __must_check kernfs_path_locked(struct kernfs_node *kn, char *buf,
+					      size_t buflen)
+{
+	char *p = buf + buflen;
+	int len;
+
+	*--p = '\0';
+
+	do {
+		len = strlen(kn->name);
+		if (p - buf < len + 1) {
+			buf[0] = '\0';
+			p = NULL;
+			break;
+		}
+		p -= len;
+		memcpy(p, kn->name, len);
+		*--p = '/';
+		kn = kn->parent;
+	} while (kn && kn->parent);
+
+	return p;
+}
+
+/**
+ * kernfs_name - obtain the name of a given node
+ * @kn: kernfs_node of interest
+ * @buf: buffer to copy @kn's name into
+ * @buflen: size of @buf
+ *
+ * Copies the name of @kn into @buf of @buflen bytes.  The behavior is
+ * similar to strlcpy().  It returns the length of @kn's name and if @buf
+ * isn't long enough, it's filled upto @buflen-1 and nul terminated.
+ *
+ * This function can be called from any context.
+ */
+int kernfs_name(struct kernfs_node *kn, char *buf, size_t buflen)
+{
+	unsigned long flags;
+	int ret;
+
+	spin_lock_irqsave(&kernfs_rename_lock, flags);
+	ret = kernfs_name_locked(kn, buf, buflen);
+	spin_unlock_irqrestore(&kernfs_rename_lock, flags);
+	return ret;
+}
+
+/**
+ * kernfs_path - build full path of a given node
+ * @kn: kernfs_node of interest
+ * @buf: buffer to copy @kn's name into
+ * @buflen: size of @buf
+ *
+ * Builds and returns the full path of @kn in @buf of @buflen bytes.  The
+ * path is built from the end of @buf so the returned pointer usually
+ * doesn't match @buf.  If @buf isn't long enough, @buf is nul terminated
+ * and %NULL is returned.
+ */
+char *kernfs_path(struct kernfs_node *kn, char *buf, size_t buflen)
+{
+	unsigned long flags;
+	char *p;
+
+	spin_lock_irqsave(&kernfs_rename_lock, flags);
+	p = kernfs_path_locked(kn, buf, buflen);
+	spin_unlock_irqrestore(&kernfs_rename_lock, flags);
+	return p;
+}
+
+/**
+ * pr_cont_kernfs_name - pr_cont name of a kernfs_node
+ * @kn: kernfs_node of interest
+ *
+ * This function can be called from any context.
+ */
+void pr_cont_kernfs_name(struct kernfs_node *kn)
+{
+	unsigned long flags;
+
+	spin_lock_irqsave(&kernfs_rename_lock, flags);
+
+	kernfs_name_locked(kn, kernfs_pr_cont_buf, sizeof(kernfs_pr_cont_buf));
+	pr_cont("%s", kernfs_pr_cont_buf);
+
+	spin_unlock_irqrestore(&kernfs_rename_lock, flags);
+}
+
+/**
+ * pr_cont_kernfs_path - pr_cont path of a kernfs_node
+ * @kn: kernfs_node of interest
+ *
+ * This function can be called from any context.
+ */
+void pr_cont_kernfs_path(struct kernfs_node *kn)
+{
+	unsigned long flags;
+	char *p;
+
+	spin_lock_irqsave(&kernfs_rename_lock, flags);
+
+	p = kernfs_path_locked(kn, kernfs_pr_cont_buf,
+			       sizeof(kernfs_pr_cont_buf));
+	if (p)
+		pr_cont("%s", p);
+	else
+		pr_cont("<name too long>");
+
+	spin_unlock_irqrestore(&kernfs_rename_lock, flags);
+}
+
+/**
+ * kernfs_get_parent - determine the parent node and pin it
+ * @kn: kernfs_node of interest
+ *
+ * Determines @kn's parent, pins and returns it.  This function can be
+ * called from any context.
+ */
+struct kernfs_node *kernfs_get_parent(struct kernfs_node *kn)
+{
+	struct kernfs_node *parent;
+	unsigned long flags;
+
+	spin_lock_irqsave(&kernfs_rename_lock, flags);
+	parent = kn->parent;
+	kernfs_get(parent);
+	spin_unlock_irqrestore(&kernfs_rename_lock, flags);
+
+	return parent;
+}
+
 /**
  *	kernfs_name_hash
  *	@name: Null terminated string to hash
@@ -1103,8 +1240,14 @@ int kernfs_remove_by_name_ns(struct kernfs_node *parent, const char *name,
 int kernfs_rename_ns(struct kernfs_node *kn, struct kernfs_node *new_parent,
 		     const char *new_name, const void *new_ns)
 {
+	struct kernfs_node *old_parent;
+	const char *old_name = NULL;
 	int error;
 
+	/* can't move or rename root */
+	if (!kn->parent)
+		return -EINVAL;
+
 	mutex_lock(&kernfs_mutex);
 
 	error = -ENOENT;
@@ -1126,13 +1269,8 @@ int kernfs_rename_ns(struct kernfs_node *kn, struct kernfs_node *new_parent,
 		new_name = kstrdup(new_name, GFP_KERNEL);
 		if (!new_name)
 			goto out;
-
-		if (kn->flags & KERNFS_STATIC_NAME)
-			kn->flags &= ~KERNFS_STATIC_NAME;
-		else
-			kfree(kn->name);
-
-		kn->name = new_name;
+	} else {
+		new_name = NULL;
 	}
 
 	/*
@@ -1140,12 +1278,29 @@ int kernfs_rename_ns(struct kernfs_node *kn, struct kernfs_node *new_parent,
 	 */
 	kernfs_unlink_sibling(kn);
 	kernfs_get(new_parent);
-	kernfs_put(kn->parent);
-	kn->ns = new_ns;
-	kn->hash = kernfs_name_hash(kn->name, kn->ns);
+
+	/* rename_lock protects ->parent and ->name accessors */
+	spin_lock_irq(&kernfs_rename_lock);
+
+	old_parent = kn->parent;
 	kn->parent = new_parent;
+
+	kn->ns = new_ns;
+	if (new_name) {
+		if (!(kn->flags & KERNFS_STATIC_NAME))
+			old_name = kn->name;
+		kn->flags &= ~KERNFS_STATIC_NAME;
+		kn->name = new_name;
+	}
+
+	spin_unlock_irq(&kernfs_rename_lock);
+
+	kn->hash = kernfs_name_hash(new_name, new_ns);
 	kernfs_link_sibling(kn);
 
+	kernfs_put(old_parent);
+	kfree(old_name);
+
 	error = 0;
  out:
 	mutex_unlock(&kernfs_mutex);
diff --git a/fs/sysfs/dir.c b/fs/sysfs/dir.c
index ee0d761..0b45ff4 100644
--- a/fs/sysfs/dir.c
+++ b/fs/sysfs/dir.c
@@ -19,39 +19,18 @@
 
 DEFINE_SPINLOCK(sysfs_symlink_target_lock);
 
-/**
- *	sysfs_pathname - return full path to sysfs dirent
- *	@kn: kernfs_node whose path we want
- *	@path: caller allocated buffer of size PATH_MAX
- *
- *	Gives the name "/" to the sysfs_root entry; any path returned
- *	is relative to wherever sysfs is mounted.
- */
-static char *sysfs_pathname(struct kernfs_node *kn, char *path)
-{
-	if (kn->parent) {
-		sysfs_pathname(kn->parent, path);
-		strlcat(path, "/", PATH_MAX);
-	}
-	strlcat(path, kn->name, PATH_MAX);
-	return path;
-}
-
 void sysfs_warn_dup(struct kernfs_node *parent, const char *name)
 {
-	char *path;
+	char *buf, *path = NULL;
 
-	path = kzalloc(PATH_MAX, GFP_KERNEL);
-	if (path) {
-		sysfs_pathname(parent, path);
-		strlcat(path, "/", PATH_MAX);
-		strlcat(path, name, PATH_MAX);
-	}
+	buf = kzalloc(PATH_MAX, GFP_KERNEL);
+	if (buf)
+		path = kernfs_path(parent, buf, PATH_MAX);
 
-	WARN(1, KERN_WARNING "sysfs: cannot create duplicate filename '%s'\n",
-	     path ? path : name);
+	WARN(1, KERN_WARNING "sysfs: cannot create duplicate filename '%s/%s'\n",
+	     path, name);
 
-	kfree(path);
+	kfree(buf);
 }
 
 /**
@@ -122,9 +101,13 @@ void sysfs_remove_dir(struct kobject *kobj)
 int sysfs_rename_dir_ns(struct kobject *kobj, const char *new_name,
 			const void *new_ns)
 {
-	struct kernfs_node *parent = kobj->sd->parent;
+	struct kernfs_node *parent;
+	int ret;
 
-	return kernfs_rename_ns(kobj->sd, parent, new_name, new_ns);
+	parent = kernfs_get_parent(kobj->sd);
+	ret = kernfs_rename_ns(kobj->sd, parent, new_name, new_ns);
+	kernfs_put(parent);
+	return ret;
 }
 
 int sysfs_move_dir_ns(struct kobject *kobj, struct kobject *new_parent_kobj,
@@ -133,7 +116,6 @@ int sysfs_move_dir_ns(struct kobject *kobj, struct kobject *new_parent_kobj,
 	struct kernfs_node *kn = kobj->sd;
 	struct kernfs_node *new_parent;
 
-	BUG_ON(!kn->parent);
 	new_parent = new_parent_kobj && new_parent_kobj->sd ?
 		new_parent_kobj->sd : sysfs_root_kn;
 
diff --git a/include/linux/kernfs.h b/include/linux/kernfs.h
index 9c89904..b7cb267 100644
--- a/include/linux/kernfs.h
+++ b/include/linux/kernfs.h
@@ -91,7 +91,12 @@ struct kernfs_node {
 #ifdef CONFIG_DEBUG_LOCK_ALLOC
 	struct lockdep_map	dep_map;
 #endif
-	/* the following two fields are published */
+	/*
+	 * Use kernfs_get_parent() and kernfs_name/path() instead of
+	 * accessing the following two fields directly.  If the node is
+	 * never moved to a different parent, it is safe to access the
+	 * parent directly.
+	 */
 	struct kernfs_node	*parent;
 	const char		*name;
 
@@ -229,6 +234,12 @@ static inline bool kernfs_ns_enabled(struct kernfs_node *kn)
 	return kn->flags & KERNFS_NS;
 }
 
+int kernfs_name(struct kernfs_node *kn, char *buf, size_t buflen);
+char * __must_check kernfs_path(struct kernfs_node *kn, char *buf,
+				size_t buflen);
+void pr_cont_kernfs_name(struct kernfs_node *kn);
+void pr_cont_kernfs_path(struct kernfs_node *kn);
+struct kernfs_node *kernfs_get_parent(struct kernfs_node *kn);
 struct kernfs_node *kernfs_find_and_get_ns(struct kernfs_node *parent,
 					   const char *name, const void *ns);
 void kernfs_get(struct kernfs_node *kn);
@@ -283,6 +294,19 @@ static inline void kernfs_enable_ns(struct kernfs_node *kn) { }
 static inline bool kernfs_ns_enabled(struct kernfs_node *kn)
 { return false; }
 
+static inline int kernfs_name(struct kernfs_node *kn, char *buf, size_t buflen)
+{ return -ENOSYS; }
+
+char * __must_check kernfs_path(struct kernfs_node *kn, char *buf,
+				size_t buflen)
+{ return NULL; }
+
+static inline void pr_cont_kernfs_name(struct kernfs_node *kn) { }
+static inline void pr_cont_kernfs_path(struct kernfs_node *kn) { }
+
+static inline struct kernfs_node *kernfs_get_parent(struct kernfs_node *kn)
+{ return NULL; }
+
 static inline struct kernfs_node *
 kernfs_find_and_get_ns(struct kernfs_node *parent, const char *name,
 		       const void *ns)
-- 
1.8.5.3


^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [PATCH 10/10] kernfs: add CONFIG_KERNFS
  2014-02-03 19:09 [PATCHSET v2 driver-core-next] kernfs: prepare for cgroup's kernfs conversion Tejun Heo
                   ` (8 preceding siblings ...)
  2014-02-03 19:09 ` [PATCH 09/10] kernfs: implement kernfs_get_parent(), kernfs_name/path() and friends Tejun Heo
@ 2014-02-03 19:09 ` Tejun Heo
  2014-02-07 18:41 ` [PATCH 9.5/10] sysfs, kobject: add sysfs wrapper for kernfs_enable_ns() Tejun Heo
  2014-02-07 18:43 ` [PATCHSET v2 driver-core-next] kernfs: prepare for cgroup's kernfs conversion Tejun Heo
  11 siblings, 0 replies; 19+ messages in thread
From: Tejun Heo @ 2014-02-03 19:09 UTC (permalink / raw)
  To: gregkh; +Cc: linux-kernel, Tejun Heo, linux-fsdevel

As sysfs was kernfs's only user, kernfs has been piggybacking on
CONFIG_SYSFS; however, kernfs is scheduled to grow a new user very
soon.  Introduce a separate config option CONFIG_KERNFS which is to be
selected by kernfs users.

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: linux-fsdevel@vger.kernel.org
---
 fs/Kconfig             | 1 +
 fs/Makefile            | 3 ++-
 fs/kernfs/Kconfig      | 7 +++++++
 fs/sysfs/Kconfig       | 1 +
 include/linux/kernfs.h | 6 +++---
 5 files changed, 14 insertions(+), 4 deletions(-)
 create mode 100644 fs/kernfs/Kconfig

diff --git a/fs/Kconfig b/fs/Kconfig
index 7385e54..312393f 100644
--- a/fs/Kconfig
+++ b/fs/Kconfig
@@ -96,6 +96,7 @@ endif # BLOCK
 menu "Pseudo filesystems"
 
 source "fs/proc/Kconfig"
+source "fs/kernfs/Kconfig"
 source "fs/sysfs/Kconfig"
 
 config TMPFS
diff --git a/fs/Makefile b/fs/Makefile
index 47ac07b..f9cb987 100644
--- a/fs/Makefile
+++ b/fs/Makefile
@@ -52,7 +52,8 @@ obj-$(CONFIG_FHANDLE)		+= fhandle.o
 obj-y				+= quota/
 
 obj-$(CONFIG_PROC_FS)		+= proc/
-obj-$(CONFIG_SYSFS)		+= sysfs/ kernfs/
+obj-$(CONFIG_KERNFS)		+= kernfs/
+obj-$(CONFIG_SYSFS)		+= sysfs/
 obj-$(CONFIG_CONFIGFS_FS)	+= configfs/
 obj-y				+= devpts/
 
diff --git a/fs/kernfs/Kconfig b/fs/kernfs/Kconfig
new file mode 100644
index 0000000..397b5f7
--- /dev/null
+++ b/fs/kernfs/Kconfig
@@ -0,0 +1,7 @@
+#
+# KERNFS should be selected by its users
+#
+
+config KERNFS
+	bool
+	default n
diff --git a/fs/sysfs/Kconfig b/fs/sysfs/Kconfig
index 8c41fea..b275601 100644
--- a/fs/sysfs/Kconfig
+++ b/fs/sysfs/Kconfig
@@ -1,6 +1,7 @@
 config SYSFS
 	bool "sysfs file system support" if EXPERT
 	default y
+	select KERNFS
 	help
 	The sysfs filesystem is a virtual filesystem that the kernel uses to
 	export internal kernel objects, their attributes, and their
diff --git a/include/linux/kernfs.h b/include/linux/kernfs.h
index b7cb267..32b696f 100644
--- a/include/linux/kernfs.h
+++ b/include/linux/kernfs.h
@@ -201,7 +201,7 @@ struct kernfs_ops {
 #endif
 };
 
-#ifdef CONFIG_SYSFS
+#ifdef CONFIG_KERNFS
 
 static inline enum kernfs_node_type kernfs_type(struct kernfs_node *kn)
 {
@@ -284,7 +284,7 @@ void kernfs_kill_sb(struct super_block *sb);
 
 void kernfs_init(void);
 
-#else	/* CONFIG_SYSFS */
+#else	/* CONFIG_KERNFS */
 
 static inline enum kernfs_node_type kernfs_type(struct kernfs_node *kn)
 { return 0; }	/* whatever */
@@ -379,7 +379,7 @@ static inline void kernfs_kill_sb(struct super_block *sb) { }
 
 static inline void kernfs_init(void) { }
 
-#endif	/* CONFIG_SYSFS */
+#endif	/* CONFIG_KERNFS */
 
 static inline struct kernfs_node *
 kernfs_find_and_get(struct kernfs_node *kn, const char *name)
-- 
1.8.5.3


^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [PATCH v2 09/10] kernfs: implement kernfs_get_parent(), kernfs_name/path() and friends
  2014-02-03 19:09 ` [PATCH 09/10] kernfs: implement kernfs_get_parent(), kernfs_name/path() and friends Tejun Heo
@ 2014-02-07 18:40   ` Tejun Heo
  0 siblings, 0 replies; 19+ messages in thread
From: Tejun Heo @ 2014-02-07 18:40 UTC (permalink / raw)
  To: gregkh; +Cc: linux-kernel

>From 8a753c7a38b909f95e98ecfdffa8343e79ed4256 Mon Sep 17 00:00:00 2001
From: Tejun Heo <tj@kernel.org>
Date: Fri, 7 Feb 2014 13:32:07 -0500

kernfs_node->parent and ->name are currently marked as "published"
indicating that kernfs users may access them directly; however, those
fields may get updated by kernfs_rename[_ns]() and unrestricted access
may lead to erroneous values or oops.

Protect ->parent and ->name updates with a irq-safe spinlock
kernfs_rename_lock and implement the following accessors for these
fields.

* kernfs_name()		- format the node's name into the specified buffer
* kernfs_path()		- format the node's path into the specified buffer
* pr_cont_kernfs_name()	- pr_cont a node's name (doesn't need buffer)
* pr_cont_kernfs_path()	- pr_cont a node's path (doesn't need buffer)
* kernfs_get_parent()	- pin and return a node's parent

All can be called under any context.  The recursive sysfs_pathname()
in fs/sysfs/dir.c is replaced with kernfs_path() and
sysfs_rename_dir_ns() is updated to use kernfs_get_parent() instead of
dereferencing parent directly.

v2: Dummy definition of kernfs_path() for !CONFIG_KERNFS was missing
    static inline making it cause a lot of build warnings.  Add it.

Signed-off-by: Tejun Heo <tj@kernel.org>
---
 fs/kernfs/dir.c        | 175 ++++++++++++++++++++++++++++++++++++++++++++++---
 fs/sysfs/dir.c         |  44 ++++---------
 include/linux/kernfs.h |  26 +++++++-
 3 files changed, 203 insertions(+), 42 deletions(-)

diff --git a/fs/kernfs/dir.c b/fs/kernfs/dir.c
index 42a250f..a347792 100644
--- a/fs/kernfs/dir.c
+++ b/fs/kernfs/dir.c
@@ -19,6 +19,8 @@
 #include "kernfs-internal.h"
 
 DEFINE_MUTEX(kernfs_mutex);
+static DEFINE_SPINLOCK(kernfs_rename_lock);	/* kn->parent and ->name */
+static char kernfs_pr_cont_buf[PATH_MAX];	/* protected by rename_lock */
 
 #define rb_to_kn(X) rb_entry((X), struct kernfs_node, rb)
 
@@ -37,6 +39,141 @@ static bool kernfs_lockdep(struct kernfs_node *kn)
 #endif
 }
 
+static int kernfs_name_locked(struct kernfs_node *kn, char *buf, size_t buflen)
+{
+	return strlcpy(buf, kn->parent ? kn->name : "/", buflen);
+}
+
+static char * __must_check kernfs_path_locked(struct kernfs_node *kn, char *buf,
+					      size_t buflen)
+{
+	char *p = buf + buflen;
+	int len;
+
+	*--p = '\0';
+
+	do {
+		len = strlen(kn->name);
+		if (p - buf < len + 1) {
+			buf[0] = '\0';
+			p = NULL;
+			break;
+		}
+		p -= len;
+		memcpy(p, kn->name, len);
+		*--p = '/';
+		kn = kn->parent;
+	} while (kn && kn->parent);
+
+	return p;
+}
+
+/**
+ * kernfs_name - obtain the name of a given node
+ * @kn: kernfs_node of interest
+ * @buf: buffer to copy @kn's name into
+ * @buflen: size of @buf
+ *
+ * Copies the name of @kn into @buf of @buflen bytes.  The behavior is
+ * similar to strlcpy().  It returns the length of @kn's name and if @buf
+ * isn't long enough, it's filled upto @buflen-1 and nul terminated.
+ *
+ * This function can be called from any context.
+ */
+int kernfs_name(struct kernfs_node *kn, char *buf, size_t buflen)
+{
+	unsigned long flags;
+	int ret;
+
+	spin_lock_irqsave(&kernfs_rename_lock, flags);
+	ret = kernfs_name_locked(kn, buf, buflen);
+	spin_unlock_irqrestore(&kernfs_rename_lock, flags);
+	return ret;
+}
+
+/**
+ * kernfs_path - build full path of a given node
+ * @kn: kernfs_node of interest
+ * @buf: buffer to copy @kn's name into
+ * @buflen: size of @buf
+ *
+ * Builds and returns the full path of @kn in @buf of @buflen bytes.  The
+ * path is built from the end of @buf so the returned pointer usually
+ * doesn't match @buf.  If @buf isn't long enough, @buf is nul terminated
+ * and %NULL is returned.
+ */
+char *kernfs_path(struct kernfs_node *kn, char *buf, size_t buflen)
+{
+	unsigned long flags;
+	char *p;
+
+	spin_lock_irqsave(&kernfs_rename_lock, flags);
+	p = kernfs_path_locked(kn, buf, buflen);
+	spin_unlock_irqrestore(&kernfs_rename_lock, flags);
+	return p;
+}
+
+/**
+ * pr_cont_kernfs_name - pr_cont name of a kernfs_node
+ * @kn: kernfs_node of interest
+ *
+ * This function can be called from any context.
+ */
+void pr_cont_kernfs_name(struct kernfs_node *kn)
+{
+	unsigned long flags;
+
+	spin_lock_irqsave(&kernfs_rename_lock, flags);
+
+	kernfs_name_locked(kn, kernfs_pr_cont_buf, sizeof(kernfs_pr_cont_buf));
+	pr_cont("%s", kernfs_pr_cont_buf);
+
+	spin_unlock_irqrestore(&kernfs_rename_lock, flags);
+}
+
+/**
+ * pr_cont_kernfs_path - pr_cont path of a kernfs_node
+ * @kn: kernfs_node of interest
+ *
+ * This function can be called from any context.
+ */
+void pr_cont_kernfs_path(struct kernfs_node *kn)
+{
+	unsigned long flags;
+	char *p;
+
+	spin_lock_irqsave(&kernfs_rename_lock, flags);
+
+	p = kernfs_path_locked(kn, kernfs_pr_cont_buf,
+			       sizeof(kernfs_pr_cont_buf));
+	if (p)
+		pr_cont("%s", p);
+	else
+		pr_cont("<name too long>");
+
+	spin_unlock_irqrestore(&kernfs_rename_lock, flags);
+}
+
+/**
+ * kernfs_get_parent - determine the parent node and pin it
+ * @kn: kernfs_node of interest
+ *
+ * Determines @kn's parent, pins and returns it.  This function can be
+ * called from any context.
+ */
+struct kernfs_node *kernfs_get_parent(struct kernfs_node *kn)
+{
+	struct kernfs_node *parent;
+	unsigned long flags;
+
+	spin_lock_irqsave(&kernfs_rename_lock, flags);
+	parent = kn->parent;
+	kernfs_get(parent);
+	spin_unlock_irqrestore(&kernfs_rename_lock, flags);
+
+	return parent;
+}
+
 /**
  *	kernfs_name_hash
  *	@name: Null terminated string to hash
@@ -1103,8 +1240,14 @@ int kernfs_remove_by_name_ns(struct kernfs_node *parent, const char *name,
 int kernfs_rename_ns(struct kernfs_node *kn, struct kernfs_node *new_parent,
 		     const char *new_name, const void *new_ns)
 {
+	struct kernfs_node *old_parent;
+	const char *old_name = NULL;
 	int error;
 
+	/* can't move or rename root */
+	if (!kn->parent)
+		return -EINVAL;
+
 	mutex_lock(&kernfs_mutex);
 
 	error = -ENOENT;
@@ -1126,13 +1269,8 @@ int kernfs_rename_ns(struct kernfs_node *kn, struct kernfs_node *new_parent,
 		new_name = kstrdup(new_name, GFP_KERNEL);
 		if (!new_name)
 			goto out;
-
-		if (kn->flags & KERNFS_STATIC_NAME)
-			kn->flags &= ~KERNFS_STATIC_NAME;
-		else
-			kfree(kn->name);
-
-		kn->name = new_name;
+	} else {
+		new_name = NULL;
 	}
 
 	/*
@@ -1140,12 +1278,29 @@ int kernfs_rename_ns(struct kernfs_node *kn, struct kernfs_node *new_parent,
 	 */
 	kernfs_unlink_sibling(kn);
 	kernfs_get(new_parent);
-	kernfs_put(kn->parent);
-	kn->ns = new_ns;
-	kn->hash = kernfs_name_hash(kn->name, kn->ns);
+
+	/* rename_lock protects ->parent and ->name accessors */
+	spin_lock_irq(&kernfs_rename_lock);
+
+	old_parent = kn->parent;
 	kn->parent = new_parent;
+
+	kn->ns = new_ns;
+	if (new_name) {
+		if (!(kn->flags & KERNFS_STATIC_NAME))
+			old_name = kn->name;
+		kn->flags &= ~KERNFS_STATIC_NAME;
+		kn->name = new_name;
+	}
+
+	spin_unlock_irq(&kernfs_rename_lock);
+
+	kn->hash = kernfs_name_hash(new_name, new_ns);
 	kernfs_link_sibling(kn);
 
+	kernfs_put(old_parent);
+	kfree(old_name);
+
 	error = 0;
  out:
 	mutex_unlock(&kernfs_mutex);
diff --git a/fs/sysfs/dir.c b/fs/sysfs/dir.c
index ee0d761..0b45ff4 100644
--- a/fs/sysfs/dir.c
+++ b/fs/sysfs/dir.c
@@ -19,39 +19,18 @@
 
 DEFINE_SPINLOCK(sysfs_symlink_target_lock);
 
-/**
- *	sysfs_pathname - return full path to sysfs dirent
- *	@kn: kernfs_node whose path we want
- *	@path: caller allocated buffer of size PATH_MAX
- *
- *	Gives the name "/" to the sysfs_root entry; any path returned
- *	is relative to wherever sysfs is mounted.
- */
-static char *sysfs_pathname(struct kernfs_node *kn, char *path)
-{
-	if (kn->parent) {
-		sysfs_pathname(kn->parent, path);
-		strlcat(path, "/", PATH_MAX);
-	}
-	strlcat(path, kn->name, PATH_MAX);
-	return path;
-}
-
 void sysfs_warn_dup(struct kernfs_node *parent, const char *name)
 {
-	char *path;
+	char *buf, *path = NULL;
 
-	path = kzalloc(PATH_MAX, GFP_KERNEL);
-	if (path) {
-		sysfs_pathname(parent, path);
-		strlcat(path, "/", PATH_MAX);
-		strlcat(path, name, PATH_MAX);
-	}
+	buf = kzalloc(PATH_MAX, GFP_KERNEL);
+	if (buf)
+		path = kernfs_path(parent, buf, PATH_MAX);
 
-	WARN(1, KERN_WARNING "sysfs: cannot create duplicate filename '%s'\n",
-	     path ? path : name);
+	WARN(1, KERN_WARNING "sysfs: cannot create duplicate filename '%s/%s'\n",
+	     path, name);
 
-	kfree(path);
+	kfree(buf);
 }
 
 /**
@@ -122,9 +101,13 @@ void sysfs_remove_dir(struct kobject *kobj)
 int sysfs_rename_dir_ns(struct kobject *kobj, const char *new_name,
 			const void *new_ns)
 {
-	struct kernfs_node *parent = kobj->sd->parent;
+	struct kernfs_node *parent;
+	int ret;
 
-	return kernfs_rename_ns(kobj->sd, parent, new_name, new_ns);
+	parent = kernfs_get_parent(kobj->sd);
+	ret = kernfs_rename_ns(kobj->sd, parent, new_name, new_ns);
+	kernfs_put(parent);
+	return ret;
 }
 
 int sysfs_move_dir_ns(struct kobject *kobj, struct kobject *new_parent_kobj,
@@ -133,7 +116,6 @@ int sysfs_move_dir_ns(struct kobject *kobj, struct kobject *new_parent_kobj,
 	struct kernfs_node *kn = kobj->sd;
 	struct kernfs_node *new_parent;
 
-	BUG_ON(!kn->parent);
 	new_parent = new_parent_kobj && new_parent_kobj->sd ?
 		new_parent_kobj->sd : sysfs_root_kn;
 
diff --git a/include/linux/kernfs.h b/include/linux/kernfs.h
index 9c89904..8736ee8 100644
--- a/include/linux/kernfs.h
+++ b/include/linux/kernfs.h
@@ -91,7 +91,12 @@ struct kernfs_node {
 #ifdef CONFIG_DEBUG_LOCK_ALLOC
 	struct lockdep_map	dep_map;
 #endif
-	/* the following two fields are published */
+	/*
+	 * Use kernfs_get_parent() and kernfs_name/path() instead of
+	 * accessing the following two fields directly.  If the node is
+	 * never moved to a different parent, it is safe to access the
+	 * parent directly.
+	 */
 	struct kernfs_node	*parent;
 	const char		*name;
 
@@ -229,6 +234,12 @@ static inline bool kernfs_ns_enabled(struct kernfs_node *kn)
 	return kn->flags & KERNFS_NS;
 }
 
+int kernfs_name(struct kernfs_node *kn, char *buf, size_t buflen);
+char * __must_check kernfs_path(struct kernfs_node *kn, char *buf,
+				size_t buflen);
+void pr_cont_kernfs_name(struct kernfs_node *kn);
+void pr_cont_kernfs_path(struct kernfs_node *kn);
+struct kernfs_node *kernfs_get_parent(struct kernfs_node *kn);
 struct kernfs_node *kernfs_find_and_get_ns(struct kernfs_node *parent,
 					   const char *name, const void *ns);
 void kernfs_get(struct kernfs_node *kn);
@@ -283,6 +294,19 @@ static inline void kernfs_enable_ns(struct kernfs_node *kn) { }
 static inline bool kernfs_ns_enabled(struct kernfs_node *kn)
 { return false; }
 
+static inline int kernfs_name(struct kernfs_node *kn, char *buf, size_t buflen)
+{ return -ENOSYS; }
+
+static inline char * __must_check kernfs_path(struct kernfs_node *kn, char *buf,
+					      size_t buflen)
+{ return NULL; }
+
+static inline void pr_cont_kernfs_name(struct kernfs_node *kn) { }
+static inline void pr_cont_kernfs_path(struct kernfs_node *kn) { }
+
+static inline struct kernfs_node *kernfs_get_parent(struct kernfs_node *kn)
+{ return NULL; }
+
 static inline struct kernfs_node *
 kernfs_find_and_get_ns(struct kernfs_node *parent, const char *name,
 		       const void *ns)
-- 
1.8.5.3


^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [PATCH 9.5/10] sysfs, kobject: add sysfs wrapper for kernfs_enable_ns()
  2014-02-03 19:09 [PATCHSET v2 driver-core-next] kernfs: prepare for cgroup's kernfs conversion Tejun Heo
                   ` (9 preceding siblings ...)
  2014-02-03 19:09 ` [PATCH 10/10] kernfs: add CONFIG_KERNFS Tejun Heo
@ 2014-02-07 18:41 ` Tejun Heo
  2014-02-07 18:43 ` [PATCHSET v2 driver-core-next] kernfs: prepare for cgroup's kernfs conversion Tejun Heo
  11 siblings, 0 replies; 19+ messages in thread
From: Tejun Heo @ 2014-02-07 18:41 UTC (permalink / raw)
  To: gregkh; +Cc: linux-kernel, Fengguang Wu

>From 5de4dc5a029516985b6ee9add23b4bacbbea490d Mon Sep 17 00:00:00 2001
From: Tejun Heo <tj@kernel.org>
Date: Fri, 7 Feb 2014 13:32:07 -0500

Currently, kobject is invoking kernfs_enable_ns() directly.  This is
fine now as sysfs and kernfs are enabled and disabled together.  If
sysfs is disabled, kernfs_enable_ns() is switched to dummy
implementation too and everything is fine; however, kernfs will soon
have its own config option CONFIG_KERNFS and !SYSFS && KERNFS will be
possible, which can make kobject call into non-dummy
kernfs_enable_ns() with NULL kernfs_node pointers leading to an oops.

Introduce sysfs_enable_ns() which is a wrapper around
kernfs_enable_ns() so that it can be made a noop depending only on
CONFIG_SYSFS regardless of the planned CONFIG_KERNFS.

Signed-off-by: Tejun Heo <tj@kernel.org>
Reported-by: Fengguang Wu <fengguang.wu@intel.com>
---
 include/linux/sysfs.h | 9 +++++++++
 lib/kobject.c         | 2 +-
 2 files changed, 10 insertions(+), 1 deletion(-)

diff --git a/include/linux/sysfs.h b/include/linux/sysfs.h
index 14df054..fdaa0c6 100644
--- a/include/linux/sysfs.h
+++ b/include/linux/sysfs.h
@@ -244,6 +244,11 @@ void sysfs_notify(struct kobject *kobj, const char *dir, const char *attr);
 
 int __must_check sysfs_init(void);
 
+static inline void sysfs_enable_ns(struct kernfs_node *kn)
+{
+	return kernfs_enable_ns(kn);
+}
+
 #else /* CONFIG_SYSFS */
 
 static inline int sysfs_create_dir_ns(struct kobject *kobj, const void *ns)
@@ -416,6 +421,10 @@ static inline int __must_check sysfs_init(void)
 	return 0;
 }
 
+static inline void sysfs_enable_ns(struct kernfs_node *kn)
+{
+}
+
 #endif /* CONFIG_SYSFS */
 
 static inline int __must_check sysfs_create_file(struct kobject *kobj,
diff --git a/lib/kobject.c b/lib/kobject.c
index cb14aea..58751bb 100644
--- a/lib/kobject.c
+++ b/lib/kobject.c
@@ -94,7 +94,7 @@ static int create_dir(struct kobject *kobj)
 		BUG_ON(ops->type >= KOBJ_NS_TYPES);
 		BUG_ON(!kobj_ns_type_registered(ops->type));
 
-		kernfs_enable_ns(kobj->sd);
+		sysfs_enable_ns(kobj->sd);
 	}
 
 	return 0;
-- 
1.8.5.3


^ permalink raw reply related	[flat|nested] 19+ messages in thread

* Re: [PATCHSET v2 driver-core-next] kernfs: prepare for cgroup's kernfs conversion
  2014-02-03 19:09 [PATCHSET v2 driver-core-next] kernfs: prepare for cgroup's kernfs conversion Tejun Heo
                   ` (10 preceding siblings ...)
  2014-02-07 18:41 ` [PATCH 9.5/10] sysfs, kobject: add sysfs wrapper for kernfs_enable_ns() Tejun Heo
@ 2014-02-07 18:43 ` Tejun Heo
  2014-02-08  0:12   ` Greg KH
  11 siblings, 1 reply; 19+ messages in thread
From: Tejun Heo @ 2014-02-07 18:43 UTC (permalink / raw)
  To: gregkh; +Cc: linux-kernel

Hello,

On Mon, Feb 03, 2014 at 02:09:07PM -0500, Tejun Heo wrote:
>  0001-kernfs-invoke-dir_ops-while-holding-active-ref-of-th.patch
>  0002-kernfs-rename-kernfs_dir_ops-to-kernfs_syscall_ops.patch
>  0003-kernfs-implement-kernfs_syscall_ops-remount_fs-and-s.patch
>  0004-kernfs-add-missing-kernfs_active-checks-in-directory.patch
>  0005-kernfs-allow-nodes-to-be-created-in-the-deactivated-.patch
>  0006-kernfs-implement-kernfs_ops-atomic_write_len.patch
>  0007-kernfs-add-kernfs_open_file-priv.patch
>  0008-kernfs-implement-kernfs_node_from_dentry-kernfs_root.patch
>  0009-kernfs-implement-kernfs_get_parent-kernfs_name-path-.patch
>  0010-kernfs-add-CONFIG_KERNFS.patch

0009 updated to v2 and a new patch added between 0009 and 0010.  Both
are for dummy definition related issues for !CONFIG_SYSFS.  The git
tree has been updated accordingly.

Thanks!

-- 
tejun

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCHSET v2 driver-core-next] kernfs: prepare for cgroup's kernfs conversion
  2014-02-07 18:43 ` [PATCHSET v2 driver-core-next] kernfs: prepare for cgroup's kernfs conversion Tejun Heo
@ 2014-02-08  0:12   ` Greg KH
  2014-02-08 15:07     ` Tejun Heo
  0 siblings, 1 reply; 19+ messages in thread
From: Greg KH @ 2014-02-08  0:12 UTC (permalink / raw)
  To: Tejun Heo; +Cc: linux-kernel

On Fri, Feb 07, 2014 at 01:43:20PM -0500, Tejun Heo wrote:
> Hello,
> 
> On Mon, Feb 03, 2014 at 02:09:07PM -0500, Tejun Heo wrote:
> >  0001-kernfs-invoke-dir_ops-while-holding-active-ref-of-th.patch
> >  0002-kernfs-rename-kernfs_dir_ops-to-kernfs_syscall_ops.patch
> >  0003-kernfs-implement-kernfs_syscall_ops-remount_fs-and-s.patch
> >  0004-kernfs-add-missing-kernfs_active-checks-in-directory.patch
> >  0005-kernfs-allow-nodes-to-be-created-in-the-deactivated-.patch
> >  0006-kernfs-implement-kernfs_ops-atomic_write_len.patch
> >  0007-kernfs-add-kernfs_open_file-priv.patch
> >  0008-kernfs-implement-kernfs_node_from_dentry-kernfs_root.patch
> >  0009-kernfs-implement-kernfs_get_parent-kernfs_name-path-.patch
> >  0010-kernfs-add-CONFIG_KERNFS.patch
> 
> 0009 updated to v2 and a new patch added between 0009 and 0010.  Both
> are for dummy definition related issues for !CONFIG_SYSFS.  The git
> tree has been updated accordingly.

Ok, I should have now applied all of these, and hopefully in the correct
order.  If I got anything wrong, please let me know.

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCHSET v2 driver-core-next] kernfs: prepare for cgroup's kernfs conversion
  2014-02-08  0:12   ` Greg KH
@ 2014-02-08 15:07     ` Tejun Heo
  2014-02-08 15:08       ` Tejun Heo
  0 siblings, 1 reply; 19+ messages in thread
From: Tejun Heo @ 2014-02-08 15:07 UTC (permalink / raw)
  To: Greg KH; +Cc: linux-kernel

On Fri, Feb 07, 2014 at 04:12:54PM -0800, Greg KH wrote:
> Ok, I should have now applied all of these, and hopefully in the correct
> order.  If I got anything wrong, please let me know.

Everything looks good to me.

Thanks!

-- 
tejun

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCHSET v2 driver-core-next] kernfs: prepare for cgroup's kernfs conversion
  2014-02-08 15:07     ` Tejun Heo
@ 2014-02-08 15:08       ` Tejun Heo
  2014-02-08 18:17         ` Greg KH
  0 siblings, 1 reply; 19+ messages in thread
From: Tejun Heo @ 2014-02-08 15:08 UTC (permalink / raw)
  To: Greg KH; +Cc: linux-kernel

On Sat, Feb 08, 2014 at 10:07:51AM -0500, Tejun Heo wrote:
> On Fri, Feb 07, 2014 at 04:12:54PM -0800, Greg KH wrote:
> > Ok, I should have now applied all of these, and hopefully in the correct
> > order.  If I got anything wrong, please let me know.
> 
> Everything looks good to me.

Ooh, JFYI, I'm pulling driver-core-next into cgroup for-3.15 so that
cgroup kernfs conversion can be put on top of it.

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCHSET v2 driver-core-next] kernfs: prepare for cgroup's kernfs conversion
  2014-02-08 18:17         ` Greg KH
@ 2014-02-08 18:17           ` Tejun Heo
  0 siblings, 0 replies; 19+ messages in thread
From: Tejun Heo @ 2014-02-08 18:17 UTC (permalink / raw)
  To: Greg KH; +Cc: linux-kernel

Hello,

On Sat, Feb 08, 2014 at 10:17:46AM -0800, Greg KH wrote:
> On Sat, Feb 08, 2014 at 10:08:56AM -0500, Tejun Heo wrote:
> > On Sat, Feb 08, 2014 at 10:07:51AM -0500, Tejun Heo wrote:
> > > On Fri, Feb 07, 2014 at 04:12:54PM -0800, Greg KH wrote:
> > > > Ok, I should have now applied all of these, and hopefully in the correct
> > > > order.  If I got anything wrong, please let me know.
> > > 
> > > Everything looks good to me.
> > 
> > Ooh, JFYI, I'm pulling driver-core-next into cgroup for-3.15 so that
> > cgroup kernfs conversion can be put on top of it.
> 
> That's fine with me.  Do you feel kernfs is good enough now for other
> subsystems to use?  I want to port debugfs to it when ready.

API-wise, I think, or at least hope, so.  I've been running quite a
few tests with the converted cgroup and haven't found anything
critical yet, so the outlook isn't too bad.

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCHSET v2 driver-core-next] kernfs: prepare for cgroup's kernfs conversion
  2014-02-08 15:08       ` Tejun Heo
@ 2014-02-08 18:17         ` Greg KH
  2014-02-08 18:17           ` Tejun Heo
  0 siblings, 1 reply; 19+ messages in thread
From: Greg KH @ 2014-02-08 18:17 UTC (permalink / raw)
  To: Tejun Heo; +Cc: linux-kernel

On Sat, Feb 08, 2014 at 10:08:56AM -0500, Tejun Heo wrote:
> On Sat, Feb 08, 2014 at 10:07:51AM -0500, Tejun Heo wrote:
> > On Fri, Feb 07, 2014 at 04:12:54PM -0800, Greg KH wrote:
> > > Ok, I should have now applied all of these, and hopefully in the correct
> > > order.  If I got anything wrong, please let me know.
> > 
> > Everything looks good to me.
> 
> Ooh, JFYI, I'm pulling driver-core-next into cgroup for-3.15 so that
> cgroup kernfs conversion can be put on top of it.

That's fine with me.  Do you feel kernfs is good enough now for other
subsystems to use?  I want to port debugfs to it when ready.

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 19+ messages in thread

end of thread, other threads:[~2014-02-08 18:17 UTC | newest]

Thread overview: 19+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2014-02-03 19:09 [PATCHSET v2 driver-core-next] kernfs: prepare for cgroup's kernfs conversion Tejun Heo
2014-02-03 19:09 ` [PATCH 01/10] kernfs: invoke dir_ops while holding active ref of the target node Tejun Heo
2014-02-03 19:09 ` [PATCH 02/10] kernfs: rename kernfs_dir_ops to kernfs_syscall_ops Tejun Heo
2014-02-03 19:09 ` [PATCH 03/10] kernfs: implement kernfs_syscall_ops->remount_fs() and ->show_options() Tejun Heo
2014-02-03 19:09 ` [PATCH 04/10] kernfs: add missing kernfs_active() checks in directory operations Tejun Heo
2014-02-03 19:09 ` [PATCH 05/10] kernfs: allow nodes to be created in the deactivated state Tejun Heo
2014-02-03 19:09 ` [PATCH 06/10] kernfs: implement kernfs_ops->atomic_write_len Tejun Heo
2014-02-03 19:09 ` [PATCH 07/10] kernfs: add kernfs_open_file->priv Tejun Heo
2014-02-03 19:09 ` [PATCH 08/10] kernfs: implement kernfs_node_from_dentry(), kernfs_root_from_sb() and kernfs_rename() Tejun Heo
2014-02-03 19:09 ` [PATCH 09/10] kernfs: implement kernfs_get_parent(), kernfs_name/path() and friends Tejun Heo
2014-02-07 18:40   ` [PATCH v2 " Tejun Heo
2014-02-03 19:09 ` [PATCH 10/10] kernfs: add CONFIG_KERNFS Tejun Heo
2014-02-07 18:41 ` [PATCH 9.5/10] sysfs, kobject: add sysfs wrapper for kernfs_enable_ns() Tejun Heo
2014-02-07 18:43 ` [PATCHSET v2 driver-core-next] kernfs: prepare for cgroup's kernfs conversion Tejun Heo
2014-02-08  0:12   ` Greg KH
2014-02-08 15:07     ` Tejun Heo
2014-02-08 15:08       ` Tejun Heo
2014-02-08 18:17         ` Greg KH
2014-02-08 18:17           ` Tejun Heo

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).