linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* Subject: [PATCHSET driver-core-next] sysfs: separate out kernfs, part #1
@ 2013-10-24 15:49 Tejun Heo
  2013-10-24 15:49 ` [PATCH 01/34] sysfs: merge sysfs_elem_bin_attr into sysfs_elem_attr Tejun Heo
                   ` (35 more replies)
  0 siblings, 36 replies; 46+ messages in thread
From: Tejun Heo @ 2013-10-24 15:49 UTC (permalink / raw)
  To: gregkh; +Cc: kay, linux-kernel, ebiederm, bhelgaas

Hello,

This patchset contains 34 patches to separate out core sysfs features
into kernfs.  kernfs exclusively deals with sysfs_dirent, which will
be later renamed to kernfs_node, and kernfs_ops.  sysfs becomes a
wrapping layer over sysfs which interfaces kobject and
[bin_]attribute.

The goal of these changes is to allow other users to make use of the
core features of sysfs instead of rolling their own pseudo filesystem
implementation which usually fails to deal with issues with file
shutdowns, locking separation from vfs layer and so on.  This patchset
refactors sysfs and separates out most core functionalities to kernfs;
however, the mount code hasn't been updated yet and it can't be used
by other users yet.  The patchset is pretty big already and the two
steps can be separated relatively well, so I think this is a good
split point.

This patchset shouldn't introduce any behavior differences and
contains the following 34 patches.

 0001-sysfs-merge-sysfs_elem_bin_attr-into-sysfs_elem_attr.patch
 0002-sysfs-honor-bin_attr.attr.ignore_lockdep.patch
 0003-sysfs-remove-unused-sysfs_get_dentry-prototype.patch
 0004-sysfs-move-sysfs_hash_and_remove-to-fs-sysfs-dir.c.patch
 0005-sysfs-separate-out-dup-filename-warning-into-a-separ.patch
 0006-sysfs-make-__sysfs_add_one-fail-if-the-parent-isn-t-.patch
 0007-sysfs-kernfs-add-skeletons-for-kernfs.patch
 0008-sysfs-kernfs-introduce-kernfs_remove-_by_name-_ns.patch
 0009-sysfs-kernfs-introduce-kernfs_create_link.patch
 0010-sysfs-kernfs-introduce-kernfs_rename-_ns.patch
 0011-sysfs-kernfs-introduce-kernfs_setattr.patch
 0012-sysfs-kernfs-replace-sysfs_dirent-s_dir.kobj-and-s_a.patch
 0013-sysfs-kernfs-introduce-kernfs_create_dir-_ns.patch
 0014-sysfs-kernfs-prepare-read-path-for-kernfs.patch
 0015-sysfs-kernfs-prepare-write-path-for-kernfs.patch
 0016-sysfs-kernfs-prepare-llseek-path-for-kernfs.patch
 0017-sysfs-kernfs-prepare-mmap-path-for-kernfs.patch
 0018-sysfs-kernfs-prepare-open-release-poll-paths-for-ker.patch
 0019-sysfs-kernfs-move-sysfs_open_file-to-include-linux-k.patch
 0020-sysfs-kernfs-introduce-kernfs_ops.patch
 0021-sysfs-kernfs-add-sysfs_dirent-s_attr.size.patch
 0022-sysfs-kernfs-remove-SYSFS_KOBJ_BIN_ATTR.patch
 0023-sysfs-kernfs-introduce-kernfs_create_file-_ns.patch
 0024-sysfs-kernfs-remove-sysfs_add_one.patch
 0025-sysfs-kernfs-add-kernfs_ops-seq_-start-next-stop.patch
 0026-sysfs-kernfs-introduce-kernfs_notify.patch
 0027-sysfs-kernfs-reorganize-SYSFS_-constants.patch
 0028-sysfs-kernfs-revamp-sysfs_dirent-active_ref-lockdep-.patch
 0029-sysfs-kernfs-introduce-kernfs-_find_and-_get-and-ke.patch
 0030-sysfs-kernfs-move-internal-decls-to-fs-kernfs-kernfs.patch
 0031-sysfs-kernfs-move-inode-code-to-fs-kernfs-inode.c.patch
 0032-sysfs-kernfs-move-dir-core-code-to-fs-kernfs-dir.c.patch
 0033-sysfs-kernfs-move-file-core-code-to-fs-kernfs-file.c.patch
 0034-sysfs-kernfs-move-symlink-core-code-to-fs-kernfs-sym.patch

0001-0006 are prep patches.

0007 preps fs/kernfs/ directory with skeleton files.

0008-0029 refactor various code paths so that all externally visible
interfaces are split to core kernfs interface which deals with
sysfs_dirent and kernfs_ops and sysfs wrapping it to provide the
existing interface.

0030-0034 move kernfs part of the implementation under fs/kernfs.

This patchset is on top of

  driver-core-next 0cba7de7f6cd ("input: gameport: convert bus code to use dev_groups")
+ [1] sysfs: fix sysfs_write_file for bin file

While this patchset doesn't depend on it, it probably would be a good
idea to apply the fix for devm.kmalloc bug before applying these as
that bug makes bisecting quite painful.

diffstat follows.  Thanks.

 fs/Makefile                 |   2 +-
 fs/kernfs/Makefile          |   5 +
 fs/kernfs/dir.c             | 978 ++++++++++++++++++++++++++++++++++++++++++++
 fs/kernfs/file.c            | 800 ++++++++++++++++++++++++++++++++++++
 fs/kernfs/inode.c           | 336 +++++++++++++++
 fs/kernfs/kernfs-internal.h | 157 +++++++
 fs/kernfs/symlink.c         | 147 +++++++
 fs/sysfs/Makefile           |   2 +-
 fs/sysfs/dir.c              | 965 +------------------------------------------
 fs/sysfs/file.c             | 914 ++++++++---------------------------------
 fs/sysfs/group.c            |  59 +--
 fs/sysfs/inode.c            | 357 ----------------
 fs/sysfs/mount.c            |  14 -
 fs/sysfs/symlink.c          | 156 +------
 fs/sysfs/sysfs.h            | 208 +---------
 include/linux/kernfs.h      | 185 +++++++++
 include/linux/sysfs.h       |  43 +-
 17 files changed, 2871 insertions(+), 2457 deletions(-)
 create mode 100644 fs/kernfs/Makefile
 create mode 100644 fs/kernfs/dir.c
 create mode 100644 fs/kernfs/file.c
 create mode 100644 fs/kernfs/inode.c
 create mode 100644 fs/kernfs/kernfs-internal.h
 create mode 100644 fs/kernfs/symlink.c
 delete mode 100644 fs/sysfs/inode.c
 create mode 100644 include/linux/kernfs.h

--
tejun

[1] http://thread.gmane.org/gmane.linux.kernel/1584113

^ permalink raw reply	[flat|nested] 46+ messages in thread

* [PATCH 01/34] sysfs: merge sysfs_elem_bin_attr into sysfs_elem_attr
  2013-10-24 15:49 Subject: [PATCHSET driver-core-next] sysfs: separate out kernfs, part #1 Tejun Heo
@ 2013-10-24 15:49 ` Tejun Heo
  2013-10-24 15:49 ` [PATCH 02/34] sysfs: honor bin_attr.attr.ignore_lockdep Tejun Heo
                   ` (34 subsequent siblings)
  35 siblings, 0 replies; 46+ messages in thread
From: Tejun Heo @ 2013-10-24 15:49 UTC (permalink / raw)
  To: gregkh; +Cc: kay, linux-kernel, ebiederm, bhelgaas, Tejun Heo

3124eb1679 ("sysfs: merge regular and bin file handling") folded bin
file handling into regular file handling.  Among other things, bin
file now shares the same open path including sysfs_open_dirent
association using sysfs_dirent->s_attr.open.  This is buggy because
->s_bin_attr lives in the same union and doesn't have the field.  This
bug doesn't trigger because sysfs_elem_bin_attr doesn't have an active
field at the conflicting position.  It does have a field "buffers" but
it isn't used anymore.

This patch collapses sysfs_elem_bin_attr into sysfs_elem_attr so that
the bin_attr is accessed through ->s_attr.bin_attr which lives with
->s_attr.attr in an anonymous union.  The code paths already assume
bin_attr contains attr as the first element, so this doesn't add any
more assumptions while making it explicit that the two types are
handled together.

Signed-off-by: Tejun Heo <tj@kernel.org>
---
 fs/sysfs/file.c  |  8 ++++----
 fs/sysfs/inode.c |  2 +-
 fs/sysfs/sysfs.h | 11 ++++-------
 3 files changed, 9 insertions(+), 12 deletions(-)

diff --git a/fs/sysfs/file.c b/fs/sysfs/file.c
index c379597..0d7368d4 100644
--- a/fs/sysfs/file.c
+++ b/fs/sysfs/file.c
@@ -154,7 +154,7 @@ static ssize_t sysfs_bin_read(struct file *file, char __user *userbuf,
 			      size_t bytes, loff_t *off)
 {
 	struct sysfs_open_file *of = sysfs_of(file);
-	struct bin_attribute *battr = of->sd->s_bin_attr.bin_attr;
+	struct bin_attribute *battr = of->sd->s_attr.bin_attr;
 	struct kobject *kobj = of->sd->s_parent->s_dir.kobj;
 	loff_t size = file_inode(file)->i_size;
 	int count = min_t(size_t, bytes, PAGE_SIZE);
@@ -236,7 +236,7 @@ static int flush_write_buffer(struct sysfs_open_file *of, char *buf, loff_t off,
 	}
 
 	if (sysfs_is_bin(of->sd)) {
-		struct bin_attribute *battr = of->sd->s_bin_attr.bin_attr;
+		struct bin_attribute *battr = of->sd->s_attr.bin_attr;
 
 		rc = -EIO;
 		if (battr->write)
@@ -466,7 +466,7 @@ static const struct vm_operations_struct sysfs_bin_vm_ops = {
 static int sysfs_bin_mmap(struct file *file, struct vm_area_struct *vma)
 {
 	struct sysfs_open_file *of = sysfs_of(file);
-	struct bin_attribute *battr = of->sd->s_bin_attr.bin_attr;
+	struct bin_attribute *battr = of->sd->s_attr.bin_attr;
 	struct kobject *kobj = of->sd->s_parent->s_dir.kobj;
 	int rc;
 
@@ -618,7 +618,7 @@ static int sysfs_open_file(struct inode *inode, struct file *file)
 		return -ENODEV;
 
 	if (sysfs_is_bin(attr_sd)) {
-		struct bin_attribute *battr = attr_sd->s_bin_attr.bin_attr;
+		struct bin_attribute *battr = attr_sd->s_attr.bin_attr;
 
 		has_read = battr->read || battr->mmap;
 		has_write = battr->write || battr->mmap;
diff --git a/fs/sysfs/inode.c b/fs/sysfs/inode.c
index 2cb1b6b..825c556 100644
--- a/fs/sysfs/inode.c
+++ b/fs/sysfs/inode.c
@@ -258,7 +258,7 @@ static void sysfs_init_inode(struct sysfs_dirent *sd, struct inode *inode)
 		inode->i_fop = &sysfs_file_operations;
 		break;
 	case SYSFS_KOBJ_BIN_ATTR:
-		bin_attr = sd->s_bin_attr.bin_attr;
+		bin_attr = sd->s_attr.bin_attr;
 		inode->i_size = bin_attr->size;
 		inode->i_fop = &sysfs_bin_operations;
 		break;
diff --git a/fs/sysfs/sysfs.h b/fs/sysfs/sysfs.h
index 94ac8fa..c095d952 100644
--- a/fs/sysfs/sysfs.h
+++ b/fs/sysfs/sysfs.h
@@ -29,15 +29,13 @@ struct sysfs_elem_symlink {
 };
 
 struct sysfs_elem_attr {
-	struct attribute	*attr;
+	union {
+		struct attribute	*attr;
+		struct bin_attribute	*bin_attr;
+	};
 	struct sysfs_open_dirent *open;
 };
 
-struct sysfs_elem_bin_attr {
-	struct bin_attribute	*bin_attr;
-	struct hlist_head	buffers;
-};
-
 struct sysfs_inode_attrs {
 	struct iattr	ia_iattr;
 	void		*ia_secdata;
@@ -74,7 +72,6 @@ struct sysfs_dirent {
 		struct sysfs_elem_dir		s_dir;
 		struct sysfs_elem_symlink	s_symlink;
 		struct sysfs_elem_attr		s_attr;
-		struct sysfs_elem_bin_attr	s_bin_attr;
 	};
 
 	unsigned short		s_flags;
-- 
1.8.3.1


^ permalink raw reply related	[flat|nested] 46+ messages in thread

* [PATCH 02/34] sysfs: honor bin_attr.attr.ignore_lockdep
  2013-10-24 15:49 Subject: [PATCHSET driver-core-next] sysfs: separate out kernfs, part #1 Tejun Heo
  2013-10-24 15:49 ` [PATCH 01/34] sysfs: merge sysfs_elem_bin_attr into sysfs_elem_attr Tejun Heo
@ 2013-10-24 15:49 ` Tejun Heo
  2013-10-24 15:49 ` [PATCH 03/34] sysfs: remove unused sysfs_get_dentry() prototype Tejun Heo
                   ` (33 subsequent siblings)
  35 siblings, 0 replies; 46+ messages in thread
From: Tejun Heo @ 2013-10-24 15:49 UTC (permalink / raw)
  To: gregkh; +Cc: kay, linux-kernel, ebiederm, bhelgaas, Tejun Heo

ignore_lockdep is currently honored only for regular files.  There's
no reason to ignore it for bin files.  Update sysfs_ignore_lockdep()
so that bin_attr.attr.ignore_lockdep works too.

While this doesn't have any in-kernel user, this unifies the behaviors
between regular and bin files and will help later changes.

Signed-off-by: Tejun Heo <tj@kernel.org>
---
 fs/sysfs/sysfs.h | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/fs/sysfs/sysfs.h b/fs/sysfs/sysfs.h
index c095d952..bdce845 100644
--- a/fs/sysfs/sysfs.h
+++ b/fs/sysfs/sysfs.h
@@ -114,7 +114,9 @@ do {								\
 /* Test for attributes that want to ignore lockdep for read-locking */
 static inline bool sysfs_ignore_lockdep(struct sysfs_dirent *sd)
 {
-	return sysfs_type(sd) == SYSFS_KOBJ_ATTR &&
+	int type = sysfs_type(sd);
+
+	return (type == SYSFS_KOBJ_ATTR || type == SYSFS_KOBJ_BIN_ATTR) &&
 		sd->s_attr.attr->ignore_lockdep;
 }
 
-- 
1.8.3.1


^ permalink raw reply related	[flat|nested] 46+ messages in thread

* [PATCH 03/34] sysfs: remove unused sysfs_get_dentry() prototype
  2013-10-24 15:49 Subject: [PATCHSET driver-core-next] sysfs: separate out kernfs, part #1 Tejun Heo
  2013-10-24 15:49 ` [PATCH 01/34] sysfs: merge sysfs_elem_bin_attr into sysfs_elem_attr Tejun Heo
  2013-10-24 15:49 ` [PATCH 02/34] sysfs: honor bin_attr.attr.ignore_lockdep Tejun Heo
@ 2013-10-24 15:49 ` Tejun Heo
  2013-10-24 15:49 ` [PATCH 04/34] sysfs: move sysfs_hash_and_remove() to fs/sysfs/dir.c Tejun Heo
                   ` (32 subsequent siblings)
  35 siblings, 0 replies; 46+ messages in thread
From: Tejun Heo @ 2013-10-24 15:49 UTC (permalink / raw)
  To: gregkh; +Cc: kay, linux-kernel, ebiederm, bhelgaas, Tejun Heo

sysfs_get_dentry() has been gone for years now.  Remove the left-over
prototype.

Signed-off-by: Tejun Heo <tj@kernel.org>
---
 fs/sysfs/sysfs.h | 1 -
 1 file changed, 1 deletion(-)

diff --git a/fs/sysfs/sysfs.h b/fs/sysfs/sysfs.h
index bdce845..e0753e5 100644
--- a/fs/sysfs/sysfs.h
+++ b/fs/sysfs/sysfs.h
@@ -165,7 +165,6 @@ extern const struct dentry_operations sysfs_dentry_ops;
 extern const struct file_operations sysfs_dir_operations;
 extern const struct inode_operations sysfs_dir_inode_operations;
 
-struct dentry *sysfs_get_dentry(struct sysfs_dirent *sd);
 struct sysfs_dirent *sysfs_get_active(struct sysfs_dirent *sd);
 void sysfs_put_active(struct sysfs_dirent *sd);
 void sysfs_addrm_start(struct sysfs_addrm_cxt *acxt);
-- 
1.8.3.1


^ permalink raw reply related	[flat|nested] 46+ messages in thread

* [PATCH 04/34] sysfs: move sysfs_hash_and_remove() to fs/sysfs/dir.c
  2013-10-24 15:49 Subject: [PATCHSET driver-core-next] sysfs: separate out kernfs, part #1 Tejun Heo
                   ` (2 preceding siblings ...)
  2013-10-24 15:49 ` [PATCH 03/34] sysfs: remove unused sysfs_get_dentry() prototype Tejun Heo
@ 2013-10-24 15:49 ` Tejun Heo
  2013-10-24 15:49 ` [PATCH 05/34] sysfs: separate out dup filename warning into a separate function Tejun Heo
                   ` (31 subsequent siblings)
  35 siblings, 0 replies; 46+ messages in thread
From: Tejun Heo @ 2013-10-24 15:49 UTC (permalink / raw)
  To: gregkh; +Cc: kay, linux-kernel, ebiederm, bhelgaas, Tejun Heo

Most removal related logic is implemented in fs/sysfs/dir.c.  Move
sysfs_hash_and_remove() to fs/sysfs/dir.c so that __sysfs_remove()
doesn't have to be public.

This is pure relocation.

Signed-off-by: Tejun Heo <tj@kernel.org>
---
 fs/sysfs/dir.c   | 38 +++++++++++++++++++++++++++++++++++++-
 fs/sysfs/inode.c | 26 --------------------------
 fs/sysfs/sysfs.h |  5 ++---
 3 files changed, 39 insertions(+), 30 deletions(-)

diff --git a/fs/sysfs/dir.c b/fs/sysfs/dir.c
index eab59de..486238d 100644
--- a/fs/sysfs/dir.c
+++ b/fs/sysfs/dir.c
@@ -813,7 +813,8 @@ static struct sysfs_dirent *sysfs_next_descendant_post(struct sysfs_dirent *pos,
 	return pos->s_parent;
 }
 
-void __sysfs_remove(struct sysfs_addrm_cxt *acxt, struct sysfs_dirent *sd)
+static void __sysfs_remove(struct sysfs_addrm_cxt *acxt,
+			   struct sysfs_dirent *sd)
 {
 	struct sysfs_dirent *pos, *next;
 
@@ -847,6 +848,41 @@ void sysfs_remove(struct sysfs_dirent *sd)
 }
 
 /**
+ * sysfs_hash_and_remove - find a sysfs_dirent by name and remove it
+ * @dir_sd: parent of the target
+ * @name: name of the sysfs_dirent to remove
+ * @ns: namespace tag of the sysfs_dirent to remove
+ *
+ * Look for the sysfs_dirent with @name and @ns under @dir_sd and remove
+ * it.  Returns 0 on success, -ENOENT if such entry doesn't exist.
+ */
+int sysfs_hash_and_remove(struct sysfs_dirent *dir_sd, const char *name,
+			  const void *ns)
+{
+	struct sysfs_addrm_cxt acxt;
+	struct sysfs_dirent *sd;
+
+	if (!dir_sd) {
+		WARN(1, KERN_WARNING "sysfs: can not remove '%s', no directory\n",
+			name);
+		return -ENOENT;
+	}
+
+	sysfs_addrm_start(&acxt);
+
+	sd = sysfs_find_dirent(dir_sd, name, ns);
+	if (sd)
+		__sysfs_remove(&acxt, sd);
+
+	sysfs_addrm_finish(&acxt);
+
+	if (sd)
+		return 0;
+	else
+		return -ENOENT;
+}
+
+/**
  *	sysfs_remove_dir - remove an object's directory.
  *	@kobj:	object.
  *
diff --git a/fs/sysfs/inode.c b/fs/sysfs/inode.c
index 825c556..1750f79 100644
--- a/fs/sysfs/inode.c
+++ b/fs/sysfs/inode.c
@@ -314,32 +314,6 @@ void sysfs_evict_inode(struct inode *inode)
 	sysfs_put(sd);
 }
 
-int sysfs_hash_and_remove(struct sysfs_dirent *dir_sd, const char *name,
-			  const void *ns)
-{
-	struct sysfs_addrm_cxt acxt;
-	struct sysfs_dirent *sd;
-
-	if (!dir_sd) {
-		WARN(1, KERN_WARNING "sysfs: can not remove '%s', no directory\n",
-			name);
-		return -ENOENT;
-	}
-
-	sysfs_addrm_start(&acxt);
-
-	sd = sysfs_find_dirent(dir_sd, name, ns);
-	if (sd)
-		__sysfs_remove(&acxt, sd);
-
-	sysfs_addrm_finish(&acxt);
-
-	if (sd)
-		return 0;
-	else
-		return -ENOENT;
-}
-
 int sysfs_permission(struct inode *inode, int mask)
 {
 	struct sysfs_dirent *sd;
diff --git a/fs/sysfs/sysfs.h b/fs/sysfs/sysfs.h
index e0753e5..8d3dc1d 100644
--- a/fs/sysfs/sysfs.h
+++ b/fs/sysfs/sysfs.h
@@ -172,8 +172,9 @@ int __sysfs_add_one(struct sysfs_addrm_cxt *acxt, struct sysfs_dirent *sd,
 		    struct sysfs_dirent *parent_sd);
 int sysfs_add_one(struct sysfs_addrm_cxt *acxt, struct sysfs_dirent *sd,
 		  struct sysfs_dirent *parent_sd);
-void __sysfs_remove(struct sysfs_addrm_cxt *acxt, struct sysfs_dirent *sd);
 void sysfs_remove(struct sysfs_dirent *sd);
+int sysfs_hash_and_remove(struct sysfs_dirent *dir_sd, const char *name,
+			  const void *ns);
 void sysfs_addrm_finish(struct sysfs_addrm_cxt *acxt);
 
 struct sysfs_dirent *sysfs_find_dirent(struct sysfs_dirent *parent_sd,
@@ -218,8 +219,6 @@ int sysfs_getattr(struct vfsmount *mnt, struct dentry *dentry,
 		  struct kstat *stat);
 int sysfs_setxattr(struct dentry *dentry, const char *name, const void *value,
 		   size_t size, int flags);
-int sysfs_hash_and_remove(struct sysfs_dirent *dir_sd, const char *name,
-			  const void *ns);
 int sysfs_inode_init(void);
 
 /*
-- 
1.8.3.1


^ permalink raw reply related	[flat|nested] 46+ messages in thread

* [PATCH 05/34] sysfs: separate out dup filename warning into a separate function
  2013-10-24 15:49 Subject: [PATCHSET driver-core-next] sysfs: separate out kernfs, part #1 Tejun Heo
                   ` (3 preceding siblings ...)
  2013-10-24 15:49 ` [PATCH 04/34] sysfs: move sysfs_hash_and_remove() to fs/sysfs/dir.c Tejun Heo
@ 2013-10-24 15:49 ` Tejun Heo
  2013-10-24 15:49 ` [PATCH 06/34] sysfs: make __sysfs_add_one() fail if the parent isn't a directory Tejun Heo
                   ` (30 subsequent siblings)
  35 siblings, 0 replies; 46+ messages in thread
From: Tejun Heo @ 2013-10-24 15:49 UTC (permalink / raw)
  To: gregkh; +Cc: kay, linux-kernel, ebiederm, bhelgaas, Tejun Heo

Separate out sysfs_warn_dup() out of sysfs_add_one().  This will help
separating out the core sysfs functionalities into kernfs so that it
can be used by non-sysfs users too.

This doesn't make any functional changes.

Signed-off-by: Tejun Heo <tj@kernel.org>
---
 fs/sysfs/dir.c   | 30 +++++++++++++++++++-----------
 fs/sysfs/sysfs.h |  1 +
 2 files changed, 20 insertions(+), 11 deletions(-)

diff --git a/fs/sysfs/dir.c b/fs/sysfs/dir.c
index 486238d..de47ed3 100644
--- a/fs/sysfs/dir.c
+++ b/fs/sysfs/dir.c
@@ -470,6 +470,23 @@ static char *sysfs_pathname(struct sysfs_dirent *sd, char *path)
 	return path;
 }
 
+void sysfs_warn_dup(struct sysfs_dirent *parent, const char *name)
+{
+	char *path;
+
+	path = kzalloc(PATH_MAX, GFP_KERNEL);
+	if (path) {
+		sysfs_pathname(parent, path);
+		strlcat(path, "/", PATH_MAX);
+		strlcat(path, name, PATH_MAX);
+	}
+
+	WARN(1, KERN_WARNING "sysfs: cannot create duplicate filename '%s'\n",
+	     path ? path : name);
+
+	kfree(path);
+}
+
 /**
  *	sysfs_add_one - add sysfs_dirent to parent
  *	@acxt: addrm context to use
@@ -497,18 +514,9 @@ int sysfs_add_one(struct sysfs_addrm_cxt *acxt, struct sysfs_dirent *sd,
 	int ret;
 
 	ret = __sysfs_add_one(acxt, sd, parent_sd);
-	if (ret == -EEXIST) {
-		char *path = kzalloc(PATH_MAX, GFP_KERNEL);
-		WARN(1, KERN_WARNING
-		     "sysfs: cannot create duplicate filename '%s'\n",
-		     (path == NULL) ? sd->s_name
-				    : (sysfs_pathname(parent_sd, path),
-				       strlcat(path, "/", PATH_MAX),
-				       strlcat(path, sd->s_name, PATH_MAX),
-				       path));
-		kfree(path);
-	}
 
+	if (ret == -EEXIST)
+		sysfs_warn_dup(parent_sd, sd->s_name);
 	return ret;
 }
 
diff --git a/fs/sysfs/sysfs.h b/fs/sysfs/sysfs.h
index 8d3dc1d..05d063f 100644
--- a/fs/sysfs/sysfs.h
+++ b/fs/sysfs/sysfs.h
@@ -168,6 +168,7 @@ extern const struct inode_operations sysfs_dir_inode_operations;
 struct sysfs_dirent *sysfs_get_active(struct sysfs_dirent *sd);
 void sysfs_put_active(struct sysfs_dirent *sd);
 void sysfs_addrm_start(struct sysfs_addrm_cxt *acxt);
+void sysfs_warn_dup(struct sysfs_dirent *parent, const char *name);
 int __sysfs_add_one(struct sysfs_addrm_cxt *acxt, struct sysfs_dirent *sd,
 		    struct sysfs_dirent *parent_sd);
 int sysfs_add_one(struct sysfs_addrm_cxt *acxt, struct sysfs_dirent *sd,
-- 
1.8.3.1


^ permalink raw reply related	[flat|nested] 46+ messages in thread

* [PATCH 06/34] sysfs: make __sysfs_add_one() fail if the parent isn't a directory
  2013-10-24 15:49 Subject: [PATCHSET driver-core-next] sysfs: separate out kernfs, part #1 Tejun Heo
                   ` (4 preceding siblings ...)
  2013-10-24 15:49 ` [PATCH 05/34] sysfs: separate out dup filename warning into a separate function Tejun Heo
@ 2013-10-24 15:49 ` Tejun Heo
  2013-10-24 15:49 ` [PATCH 07/34] sysfs, kernfs: add skeletons for kernfs Tejun Heo
                   ` (29 subsequent siblings)
  35 siblings, 0 replies; 46+ messages in thread
From: Tejun Heo @ 2013-10-24 15:49 UTC (permalink / raw)
  To: gregkh; +Cc: kay, linux-kernel, ebiederm, bhelgaas, Tejun Heo

Currently the kobject based interface guarantees that a parent
sysfs_dirent is always a directory; however, the planned kernfs
interface will be directly based on sysfs_dirents and the caller may
specify non-directory node as the parent.  Add an explicit check in
__sysfs_add_one() so that such attempts fail with -EINVAL.

Signed-off-by: Tejun Heo <tj@kernel.org>
---
 fs/sysfs/dir.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/fs/sysfs/dir.c b/fs/sysfs/dir.c
index de47ed3..bec4466 100644
--- a/fs/sysfs/dir.c
+++ b/fs/sysfs/dir.c
@@ -432,6 +432,9 @@ int __sysfs_add_one(struct sysfs_addrm_cxt *acxt, struct sysfs_dirent *sd,
 	struct sysfs_inode_attrs *ps_iattr;
 	int ret;
 
+	if (sysfs_type(parent_sd) != SYSFS_DIR)
+		return -EINVAL;
+
 	sd->s_hash = sysfs_name_hash(sd->s_name, sd->s_ns);
 	sd->s_parent = sysfs_get(parent_sd);
 
-- 
1.8.3.1


^ permalink raw reply related	[flat|nested] 46+ messages in thread

* [PATCH 07/34] sysfs, kernfs: add skeletons for kernfs
  2013-10-24 15:49 Subject: [PATCHSET driver-core-next] sysfs: separate out kernfs, part #1 Tejun Heo
                   ` (5 preceding siblings ...)
  2013-10-24 15:49 ` [PATCH 06/34] sysfs: make __sysfs_add_one() fail if the parent isn't a directory Tejun Heo
@ 2013-10-24 15:49 ` Tejun Heo
  2013-10-24 15:49 ` [PATCH 08/34] sysfs, kernfs: introduce kernfs_remove[_by_name[_ns]]() Tejun Heo
                   ` (28 subsequent siblings)
  35 siblings, 0 replies; 46+ messages in thread
From: Tejun Heo @ 2013-10-24 15:49 UTC (permalink / raw)
  To: gregkh; +Cc: kay, linux-kernel, ebiederm, bhelgaas, Tejun Heo

Core sysfs implementation will be separated into kernfs so that it can
be used by other non-kobject users.

This patch creates fs/kernfs/ directory and makes boilerplate changes.
kernfs interface will be directly based on sysfs_dirent and its
forward declaration is moved to include/linux/kernfs.h which is
included from include/linux/sysfs.h.  sysfs core implementation will
be gradually separated out and moved to kernfs.

This patch doesn't introduce any functional changes.

Signed-off-by: Tejun Heo <tj@kernel.org>
---
 fs/Makefile            |  2 +-
 fs/kernfs/Makefile     |  5 +++++
 fs/kernfs/dir.c        |  9 +++++++++
 fs/kernfs/file.c       |  9 +++++++++
 fs/kernfs/inode.c      |  9 +++++++++
 fs/kernfs/symlink.c    |  9 +++++++++
 include/linux/kernfs.h | 12 ++++++++++++
 include/linux/sysfs.h  |  3 +--
 8 files changed, 55 insertions(+), 3 deletions(-)
 create mode 100644 fs/kernfs/Makefile
 create mode 100644 fs/kernfs/dir.c
 create mode 100644 fs/kernfs/file.c
 create mode 100644 fs/kernfs/inode.c
 create mode 100644 fs/kernfs/symlink.c
 create mode 100644 include/linux/kernfs.h

diff --git a/fs/Makefile b/fs/Makefile
index 4fe6df3..39a824f 100644
--- a/fs/Makefile
+++ b/fs/Makefile
@@ -53,7 +53,7 @@ obj-$(CONFIG_FHANDLE)		+= fhandle.o
 obj-y				+= quota/
 
 obj-$(CONFIG_PROC_FS)		+= proc/
-obj-$(CONFIG_SYSFS)		+= sysfs/
+obj-$(CONFIG_SYSFS)		+= sysfs/ kernfs/
 obj-$(CONFIG_CONFIGFS_FS)	+= configfs/
 obj-y				+= devpts/
 
diff --git a/fs/kernfs/Makefile b/fs/kernfs/Makefile
new file mode 100644
index 0000000..768c24c
--- /dev/null
+++ b/fs/kernfs/Makefile
@@ -0,0 +1,5 @@
+#
+# Makefile for the kernfs pseudo filesystem
+#
+
+obj-y		:= inode.o dir.o file.o symlink.o
diff --git a/fs/kernfs/dir.c b/fs/kernfs/dir.c
new file mode 100644
index 0000000..1061602
--- /dev/null
+++ b/fs/kernfs/dir.c
@@ -0,0 +1,9 @@
+/*
+ * fs/kernfs/dir.c - kernfs directory implementation
+ *
+ * Copyright (c) 2001-3 Patrick Mochel
+ * Copyright (c) 2007 SUSE Linux Products GmbH
+ * Copyright (c) 2007, 2013 Tejun Heo <tj@kernel.org>
+ *
+ * This file is released under the GPLv2.
+ */
diff --git a/fs/kernfs/file.c b/fs/kernfs/file.c
new file mode 100644
index 0000000..90b1e88
--- /dev/null
+++ b/fs/kernfs/file.c
@@ -0,0 +1,9 @@
+/*
+ * fs/kernfs/file.c - kernfs file implementation
+ *
+ * Copyright (c) 2001-3 Patrick Mochel
+ * Copyright (c) 2007 SUSE Linux Products GmbH
+ * Copyright (c) 2007, 2013 Tejun Heo <tj@kernel.org>
+ *
+ * This file is released under the GPLv2.
+ */
diff --git a/fs/kernfs/inode.c b/fs/kernfs/inode.c
new file mode 100644
index 0000000..86bfeea
--- /dev/null
+++ b/fs/kernfs/inode.c
@@ -0,0 +1,9 @@
+/*
+ * fs/kernfs/inode.c - kernfs inode implementation
+ *
+ * Copyright (c) 2001-3 Patrick Mochel
+ * Copyright (c) 2007 SUSE Linux Products GmbH
+ * Copyright (c) 2007, 2013 Tejun Heo <tj@kernel.org>
+ *
+ * This file is released under the GPLv2.
+ */
diff --git a/fs/kernfs/symlink.c b/fs/kernfs/symlink.c
new file mode 100644
index 0000000..2578715
--- /dev/null
+++ b/fs/kernfs/symlink.c
@@ -0,0 +1,9 @@
+/*
+ * fs/kernfs/symlink.c - kernfs symlink implementation
+ *
+ * Copyright (c) 2001-3 Patrick Mochel
+ * Copyright (c) 2007 SUSE Linux Products GmbH
+ * Copyright (c) 2007, 2013 Tejun Heo <tj@kernel.org>
+ *
+ * This file is released under the GPLv2.
+ */
diff --git a/include/linux/kernfs.h b/include/linux/kernfs.h
new file mode 100644
index 0000000..254b9e8
--- /dev/null
+++ b/include/linux/kernfs.h
@@ -0,0 +1,12 @@
+/*
+ * kernfs.h - pseudo filesystem decoupled from vfs locking
+ *
+ * This file is released under the GPLv2.
+ */
+
+#ifndef __LINUX_KERNFS_H
+#define __LINUX_KERNFS_H
+
+struct sysfs_dirent;
+
+#endif	/* __LINUX_KERNFS_H */
diff --git a/include/linux/sysfs.h b/include/linux/sysfs.h
index 6695040..2bc735d 100644
--- a/include/linux/sysfs.h
+++ b/include/linux/sysfs.h
@@ -12,6 +12,7 @@
 #ifndef _SYSFS_H_
 #define _SYSFS_H_
 
+#include <linux/kernfs.h>
 #include <linux/compiler.h>
 #include <linux/errno.h>
 #include <linux/list.h>
@@ -175,8 +176,6 @@ struct sysfs_ops {
 	ssize_t	(*store)(struct kobject *, struct attribute *, const char *, size_t);
 };
 
-struct sysfs_dirent;
-
 #ifdef CONFIG_SYSFS
 
 int sysfs_schedule_callback(struct kobject *kobj, void (*func)(void *),
-- 
1.8.3.1


^ permalink raw reply related	[flat|nested] 46+ messages in thread

* [PATCH 08/34] sysfs, kernfs: introduce kernfs_remove[_by_name[_ns]]()
  2013-10-24 15:49 Subject: [PATCHSET driver-core-next] sysfs: separate out kernfs, part #1 Tejun Heo
                   ` (6 preceding siblings ...)
  2013-10-24 15:49 ` [PATCH 07/34] sysfs, kernfs: add skeletons for kernfs Tejun Heo
@ 2013-10-24 15:49 ` Tejun Heo
  2013-10-24 15:49 ` [PATCH 09/34] sysfs, kernfs: introduce kernfs_create_link() Tejun Heo
                   ` (27 subsequent siblings)
  35 siblings, 0 replies; 46+ messages in thread
From: Tejun Heo @ 2013-10-24 15:49 UTC (permalink / raw)
  To: gregkh; +Cc: kay, linux-kernel, ebiederm, bhelgaas, Tejun Heo

Introduce kernfs removal interfaces - kernfs_remove() and
kernfs_remove_by_name[_ns]().

These are just renames of sysfs_remove() and sysfs_hash_and_remove().
No functional changes.

Signed-off-by: Tejun Heo <tj@kernel.org>
---
 fs/sysfs/dir.c         | 20 ++++++++++----------
 fs/sysfs/file.c        |  6 +++---
 fs/sysfs/group.c       | 15 +++++++--------
 fs/sysfs/symlink.c     |  4 ++--
 fs/sysfs/sysfs.h       |  3 ---
 include/linux/kernfs.h | 24 ++++++++++++++++++++++++
 6 files changed, 46 insertions(+), 26 deletions(-)

diff --git a/fs/sysfs/dir.c b/fs/sysfs/dir.c
index bec4466..58a7779 100644
--- a/fs/sysfs/dir.c
+++ b/fs/sysfs/dir.c
@@ -824,8 +824,8 @@ static struct sysfs_dirent *sysfs_next_descendant_post(struct sysfs_dirent *pos,
 	return pos->s_parent;
 }
 
-static void __sysfs_remove(struct sysfs_addrm_cxt *acxt,
-			   struct sysfs_dirent *sd)
+static void __kernfs_remove(struct sysfs_addrm_cxt *acxt,
+			    struct sysfs_dirent *sd)
 {
 	struct sysfs_dirent *pos, *next;
 
@@ -844,22 +844,22 @@ static void __sysfs_remove(struct sysfs_addrm_cxt *acxt,
 }
 
 /**
- * sysfs_remove - remove a sysfs_dirent recursively
+ * kernfs_remove - remove a sysfs_dirent recursively
  * @sd: the sysfs_dirent to remove
  *
  * Remove @sd along with all its subdirectories and files.
  */
-void sysfs_remove(struct sysfs_dirent *sd)
+void kernfs_remove(struct sysfs_dirent *sd)
 {
 	struct sysfs_addrm_cxt acxt;
 
 	sysfs_addrm_start(&acxt);
-	__sysfs_remove(&acxt, sd);
+	__kernfs_remove(&acxt, sd);
 	sysfs_addrm_finish(&acxt);
 }
 
 /**
- * sysfs_hash_and_remove - find a sysfs_dirent by name and remove it
+ * kernfs_remove_by_name_ns - find a sysfs_dirent by name and remove it
  * @dir_sd: parent of the target
  * @name: name of the sysfs_dirent to remove
  * @ns: namespace tag of the sysfs_dirent to remove
@@ -867,8 +867,8 @@ void sysfs_remove(struct sysfs_dirent *sd)
  * Look for the sysfs_dirent with @name and @ns under @dir_sd and remove
  * it.  Returns 0 on success, -ENOENT if such entry doesn't exist.
  */
-int sysfs_hash_and_remove(struct sysfs_dirent *dir_sd, const char *name,
-			  const void *ns)
+int kernfs_remove_by_name_ns(struct sysfs_dirent *dir_sd, const char *name,
+			     const void *ns)
 {
 	struct sysfs_addrm_cxt acxt;
 	struct sysfs_dirent *sd;
@@ -883,7 +883,7 @@ int sysfs_hash_and_remove(struct sysfs_dirent *dir_sd, const char *name,
 
 	sd = sysfs_find_dirent(dir_sd, name, ns);
 	if (sd)
-		__sysfs_remove(&acxt, sd);
+		__kernfs_remove(&acxt, sd);
 
 	sysfs_addrm_finish(&acxt);
 
@@ -911,7 +911,7 @@ void sysfs_remove_dir(struct kobject *kobj)
 
 	if (sd) {
 		WARN_ON_ONCE(sysfs_type(sd) != SYSFS_DIR);
-		sysfs_remove(sd);
+		kernfs_remove(sd);
 	}
 }
 
diff --git a/fs/sysfs/file.c b/fs/sysfs/file.c
index 0d7368d4..6e9c92e 100644
--- a/fs/sysfs/file.c
+++ b/fs/sysfs/file.c
@@ -953,7 +953,7 @@ void sysfs_remove_file_ns(struct kobject *kobj, const struct attribute *attr,
 {
 	struct sysfs_dirent *dir_sd = kobj->sd;
 
-	sysfs_hash_and_remove(dir_sd, attr->name, ns);
+	kernfs_remove_by_name_ns(dir_sd, attr->name, ns);
 }
 EXPORT_SYMBOL_GPL(sysfs_remove_file_ns);
 
@@ -981,7 +981,7 @@ void sysfs_remove_file_from_group(struct kobject *kobj,
 	else
 		dir_sd = sysfs_get(kobj->sd);
 	if (dir_sd) {
-		sysfs_hash_and_remove(dir_sd, attr->name, NULL);
+		kernfs_remove_by_name(dir_sd, attr->name);
 		sysfs_put(dir_sd);
 	}
 }
@@ -1009,7 +1009,7 @@ EXPORT_SYMBOL_GPL(sysfs_create_bin_file);
 void sysfs_remove_bin_file(struct kobject *kobj,
 			   const struct bin_attribute *attr)
 {
-	sysfs_hash_and_remove(kobj->sd, attr->attr.name, NULL);
+	kernfs_remove_by_name(kobj->sd, attr->attr.name);
 }
 EXPORT_SYMBOL_GPL(sysfs_remove_bin_file);
 
diff --git a/fs/sysfs/group.c b/fs/sysfs/group.c
index 1898a10..4bd9973 100644
--- a/fs/sysfs/group.c
+++ b/fs/sysfs/group.c
@@ -26,7 +26,7 @@ static void remove_files(struct sysfs_dirent *dir_sd, struct kobject *kobj,
 
 	if (grp->attrs)
 		for (attr = grp->attrs; *attr; attr++)
-			sysfs_hash_and_remove(dir_sd, (*attr)->name, NULL);
+			kernfs_remove_by_name(dir_sd, (*attr)->name);
 	if (grp->bin_attrs)
 		for (bin_attr = grp->bin_attrs; *bin_attr; bin_attr++)
 			sysfs_remove_bin_file(kobj, *bin_attr);
@@ -49,8 +49,7 @@ static int create_files(struct sysfs_dirent *dir_sd, struct kobject *kobj,
 			 * re-adding (if required) the file.
 			 */
 			if (update)
-				sysfs_hash_and_remove(dir_sd, (*attr)->name,
-						      NULL);
+				kernfs_remove_by_name(dir_sd, (*attr)->name);
 			if (grp->is_visible) {
 				mode = grp->is_visible(kobj, *attr, i);
 				if (!mode)
@@ -111,7 +110,7 @@ static int internal_create_group(struct kobject *kobj, int update,
 	error = create_files(sd, kobj, grp, update);
 	if (error) {
 		if (grp->name)
-			sysfs_remove(sd);
+			kernfs_remove(sd);
 	}
 	sysfs_put(sd);
 	return error;
@@ -219,7 +218,7 @@ void sysfs_remove_group(struct kobject *kobj,
 
 	remove_files(sd, kobj, grp);
 	if (grp->name)
-		sysfs_remove(sd);
+		kernfs_remove(sd);
 
 	sysfs_put(sd);
 }
@@ -270,7 +269,7 @@ int sysfs_merge_group(struct kobject *kobj,
 		error = sysfs_add_file(dir_sd, *attr, SYSFS_KOBJ_ATTR);
 	if (error) {
 		while (--i >= 0)
-			sysfs_hash_and_remove(dir_sd, (*--attr)->name, NULL);
+			kernfs_remove_by_name(dir_sd, (*--attr)->name);
 	}
 	sysfs_put(dir_sd);
 
@@ -292,7 +291,7 @@ void sysfs_unmerge_group(struct kobject *kobj,
 	dir_sd = sysfs_get_dirent(kobj->sd, grp->name);
 	if (dir_sd) {
 		for (attr = grp->attrs; *attr; ++attr)
-			sysfs_hash_and_remove(dir_sd, (*attr)->name, NULL);
+			kernfs_remove_by_name(dir_sd, (*attr)->name);
 		sysfs_put(dir_sd);
 	}
 }
@@ -335,7 +334,7 @@ void sysfs_remove_link_from_group(struct kobject *kobj, const char *group_name,
 
 	dir_sd = sysfs_get_dirent(kobj->sd, group_name);
 	if (dir_sd) {
-		sysfs_hash_and_remove(dir_sd, link_name, NULL);
+		kernfs_remove_by_name(dir_sd, link_name);
 		sysfs_put(dir_sd);
 	}
 }
diff --git a/fs/sysfs/symlink.c b/fs/sysfs/symlink.c
index 22ea2f5..996d861 100644
--- a/fs/sysfs/symlink.c
+++ b/fs/sysfs/symlink.c
@@ -144,7 +144,7 @@ void sysfs_delete_link(struct kobject *kobj, struct kobject *targ,
 	if (targ->sd)
 		ns = targ->sd->s_ns;
 	spin_unlock(&sysfs_assoc_lock);
-	sysfs_hash_and_remove(kobj->sd, name, ns);
+	kernfs_remove_by_name_ns(kobj->sd, name, ns);
 }
 
 /**
@@ -161,7 +161,7 @@ void sysfs_remove_link(struct kobject *kobj, const char *name)
 	else
 		parent_sd = kobj->sd;
 
-	sysfs_hash_and_remove(parent_sd, name, NULL);
+	kernfs_remove_by_name(parent_sd, name);
 }
 EXPORT_SYMBOL_GPL(sysfs_remove_link);
 
diff --git a/fs/sysfs/sysfs.h b/fs/sysfs/sysfs.h
index 05d063f..506f5d0 100644
--- a/fs/sysfs/sysfs.h
+++ b/fs/sysfs/sysfs.h
@@ -173,9 +173,6 @@ int __sysfs_add_one(struct sysfs_addrm_cxt *acxt, struct sysfs_dirent *sd,
 		    struct sysfs_dirent *parent_sd);
 int sysfs_add_one(struct sysfs_addrm_cxt *acxt, struct sysfs_dirent *sd,
 		  struct sysfs_dirent *parent_sd);
-void sysfs_remove(struct sysfs_dirent *sd);
-int sysfs_hash_and_remove(struct sysfs_dirent *dir_sd, const char *name,
-			  const void *ns);
 void sysfs_addrm_finish(struct sysfs_addrm_cxt *acxt);
 
 struct sysfs_dirent *sysfs_find_dirent(struct sysfs_dirent *parent_sd,
diff --git a/include/linux/kernfs.h b/include/linux/kernfs.h
index 254b9e8..ad0483a 100644
--- a/include/linux/kernfs.h
+++ b/include/linux/kernfs.h
@@ -7,6 +7,30 @@
 #ifndef __LINUX_KERNFS_H
 #define __LINUX_KERNFS_H
 
+#include <linux/kernel.h>
+
 struct sysfs_dirent;
 
+#ifdef CONFIG_SYSFS
+
+void kernfs_remove(struct sysfs_dirent *sd);
+int kernfs_remove_by_name_ns(struct sysfs_dirent *parent, const char *name,
+			     const void *ns);
+
+#else	/* CONFIG_SYSFS */
+
+static inline void kernfs_remove(struct sysfs_dirent *sd) { }
+
+static inline int kernfs_remove_by_name_ns(struct sysfs_dirent *parent,
+					   const char *name, const void *ns)
+{ return 0; }
+
+#endif	/* CONFIG_SYSFS */
+
+static inline int kernfs_remove_by_name(struct sysfs_dirent *parent,
+					const char *name)
+{
+	return kernfs_remove_by_name_ns(parent, name, NULL);
+}
+
 #endif	/* __LINUX_KERNFS_H */
-- 
1.8.3.1


^ permalink raw reply related	[flat|nested] 46+ messages in thread

* [PATCH 09/34] sysfs, kernfs: introduce kernfs_create_link()
  2013-10-24 15:49 Subject: [PATCHSET driver-core-next] sysfs: separate out kernfs, part #1 Tejun Heo
                   ` (7 preceding siblings ...)
  2013-10-24 15:49 ` [PATCH 08/34] sysfs, kernfs: introduce kernfs_remove[_by_name[_ns]]() Tejun Heo
@ 2013-10-24 15:49 ` Tejun Heo
  2013-10-24 15:49 ` [PATCH 10/34] sysfs, kernfs: introduce kernfs_rename[_ns]() Tejun Heo
                   ` (26 subsequent siblings)
  35 siblings, 0 replies; 46+ messages in thread
From: Tejun Heo @ 2013-10-24 15:49 UTC (permalink / raw)
  To: gregkh; +Cc: kay, linux-kernel, ebiederm, bhelgaas, Tejun Heo

Separate out kernfs symlink interface - kernfs_create_link() - which
takes and returns sysfs_dirents, from sysfs_do_create_link_sd().
sysfs_do_create_link_sd() now just determines the parent and target
sysfs_dirents and invokes the new interface and handles dup warning.

This patch doesn't introduce behavior changes.

Signed-off-by: Tejun Heo <tj@kernel.org>
---
 fs/sysfs/symlink.c     | 74 ++++++++++++++++++++++++++++++--------------------
 include/linux/kernfs.h |  8 ++++++
 2 files changed, 53 insertions(+), 29 deletions(-)

diff --git a/fs/sysfs/symlink.c b/fs/sysfs/symlink.c
index 996d861..8b8efbe 100644
--- a/fs/sysfs/symlink.c
+++ b/fs/sysfs/symlink.c
@@ -21,14 +21,47 @@
 
 #include "sysfs.h"
 
+/**
+ * kernfs_create_link - create a symlink
+ * @parent: directory to create the symlink in
+ * @name: name of the symlink
+ * @target: target node for the symlink to point to
+ *
+ * Returns the created node on success, ERR_PTR() value on error.
+ */
+struct sysfs_dirent *kernfs_create_link(struct sysfs_dirent *parent,
+					const char *name,
+					struct sysfs_dirent *target)
+{
+	struct sysfs_dirent *sd;
+	struct sysfs_addrm_cxt acxt;
+	int error;
+
+	sd = sysfs_new_dirent(name, S_IFLNK|S_IRWXUGO, SYSFS_KOBJ_LINK);
+	if (!sd)
+		return ERR_PTR(-ENOMEM);
+
+	sd->s_ns = target->s_ns;
+	sd->s_symlink.target_sd = target;
+	sysfs_get(target);	/* ref owned by symlink */
+
+	sysfs_addrm_start(&acxt);
+	error = __sysfs_add_one(&acxt, sd, parent);
+	sysfs_addrm_finish(&acxt);
+
+	if (!error)
+		return sd;
+
+	sysfs_put(sd);
+	return ERR_PTR(error);
+}
+
+
 static int sysfs_do_create_link_sd(struct sysfs_dirent *parent_sd,
 				   struct kobject *target,
 				   const char *name, int warn)
 {
-	struct sysfs_dirent *target_sd = NULL;
-	struct sysfs_dirent *sd = NULL;
-	struct sysfs_addrm_cxt acxt;
-	int error;
+	struct sysfs_dirent *sd, *target_sd = NULL;
 
 	BUG_ON(!name || !parent_sd);
 
@@ -40,35 +73,18 @@ static int sysfs_do_create_link_sd(struct sysfs_dirent *parent_sd,
 		target_sd = sysfs_get(target->sd);
 	spin_unlock(&sysfs_assoc_lock);
 
-	error = -ENOENT;
 	if (!target_sd)
-		goto out_put;
+		return -ENOENT;
 
-	error = -ENOMEM;
-	sd = sysfs_new_dirent(name, S_IFLNK|S_IRWXUGO, SYSFS_KOBJ_LINK);
-	if (!sd)
-		goto out_put;
-
-	sd->s_ns = target_sd->s_ns;
-	sd->s_symlink.target_sd = target_sd;
-	target_sd = NULL;	/* reference is now owned by the symlink */
-
-	sysfs_addrm_start(&acxt);
-	if (warn)
-		error = sysfs_add_one(&acxt, sd, parent_sd);
-	else
-		error = __sysfs_add_one(&acxt, sd, parent_sd);
-	sysfs_addrm_finish(&acxt);
-
-	if (error)
-		goto out_put;
+	sd = kernfs_create_link(parent_sd, name, target_sd);
+	sysfs_put(target_sd);
 
-	return 0;
+	if (!IS_ERR(sd))
+		return 0;
 
- out_put:
-	sysfs_put(target_sd);
-	sysfs_put(sd);
-	return error;
+	if (warn && PTR_ERR(sd) == -EEXIST)
+		sysfs_warn_dup(parent_sd, name);
+	return PTR_ERR(sd);
 }
 
 /**
diff --git a/include/linux/kernfs.h b/include/linux/kernfs.h
index ad0483a..03dcee3 100644
--- a/include/linux/kernfs.h
+++ b/include/linux/kernfs.h
@@ -13,12 +13,20 @@ struct sysfs_dirent;
 
 #ifdef CONFIG_SYSFS
 
+struct sysfs_dirent *kernfs_create_link(struct sysfs_dirent *parent,
+					const char *name,
+					struct sysfs_dirent *target);
 void kernfs_remove(struct sysfs_dirent *sd);
 int kernfs_remove_by_name_ns(struct sysfs_dirent *parent, const char *name,
 			     const void *ns);
 
 #else	/* CONFIG_SYSFS */
 
+static inline struct sysfs_dirent *
+kernfs_create_link(struct sysfs_dirent *parent, const char *name,
+		   struct sysfs_dirent *target)
+{ return NULL; }
+
 static inline void kernfs_remove(struct sysfs_dirent *sd) { }
 
 static inline int kernfs_remove_by_name_ns(struct sysfs_dirent *parent,
-- 
1.8.3.1


^ permalink raw reply related	[flat|nested] 46+ messages in thread

* [PATCH 10/34] sysfs, kernfs: introduce kernfs_rename[_ns]()
  2013-10-24 15:49 Subject: [PATCHSET driver-core-next] sysfs: separate out kernfs, part #1 Tejun Heo
                   ` (8 preceding siblings ...)
  2013-10-24 15:49 ` [PATCH 09/34] sysfs, kernfs: introduce kernfs_create_link() Tejun Heo
@ 2013-10-24 15:49 ` Tejun Heo
  2013-10-24 15:49 ` [PATCH 11/34] sysfs, kernfs: introduce kernfs_setattr() Tejun Heo
                   ` (25 subsequent siblings)
  35 siblings, 0 replies; 46+ messages in thread
From: Tejun Heo @ 2013-10-24 15:49 UTC (permalink / raw)
  To: gregkh; +Cc: kay, linux-kernel, ebiederm, bhelgaas, Tejun Heo

Introduce kernfs rename interface, krenfs_rename[_ns]().

This is just rename of sysfs_rename().  No functional changes.
Function comment is added to kernfs_rename_ns() and @new_parent_sd is
renamed to @new_parent for consistency with other kernfs interfaces.

Signed-off-by: Tejun Heo <tj@kernel.org>
---
 fs/sysfs/dir.c         | 23 +++++++++++++++--------
 fs/sysfs/symlink.c     |  2 +-
 fs/sysfs/sysfs.h       |  3 ---
 include/linux/kernfs.h |  7 +++++++
 4 files changed, 23 insertions(+), 12 deletions(-)

diff --git a/fs/sysfs/dir.c b/fs/sysfs/dir.c
index 58a7779..e2ce1e3 100644
--- a/fs/sysfs/dir.c
+++ b/fs/sysfs/dir.c
@@ -915,20 +915,27 @@ void sysfs_remove_dir(struct kobject *kobj)
 	}
 }
 
-int sysfs_rename(struct sysfs_dirent *sd, struct sysfs_dirent *new_parent_sd,
-		 const char *new_name, const void *new_ns)
+/**
+ * kernfs_rename_ns - move and rename a kernfs_node
+ * @sd: target node
+ * @new_parent: new parent to put @sd under
+ * @new_name: new name
+ * @new_ns: new namespace tag
+ */
+int kernfs_rename_ns(struct sysfs_dirent *sd, struct sysfs_dirent *new_parent,
+		     const char *new_name, const void *new_ns)
 {
 	int error;
 
 	mutex_lock(&sysfs_mutex);
 
 	error = 0;
-	if ((sd->s_parent == new_parent_sd) && (sd->s_ns == new_ns) &&
+	if ((sd->s_parent == new_parent) && (sd->s_ns == new_ns) &&
 	    (strcmp(sd->s_name, new_name) == 0))
 		goto out;	/* nothing to rename */
 
 	error = -EEXIST;
-	if (sysfs_find_dirent(new_parent_sd, new_name, new_ns))
+	if (sysfs_find_dirent(new_parent, new_name, new_ns))
 		goto out;
 
 	/* rename sysfs_dirent */
@@ -946,11 +953,11 @@ int sysfs_rename(struct sysfs_dirent *sd, struct sysfs_dirent *new_parent_sd,
 	 * Move to the appropriate place in the appropriate directories rbtree.
 	 */
 	sysfs_unlink_sibling(sd);
-	sysfs_get(new_parent_sd);
+	sysfs_get(new_parent);
 	sysfs_put(sd->s_parent);
 	sd->s_ns = new_ns;
 	sd->s_hash = sysfs_name_hash(sd->s_name, sd->s_ns);
-	sd->s_parent = new_parent_sd;
+	sd->s_parent = new_parent;
 	sysfs_link_sibling(sd);
 
 	error = 0;
@@ -964,7 +971,7 @@ int sysfs_rename_dir_ns(struct kobject *kobj, const char *new_name,
 {
 	struct sysfs_dirent *parent_sd = kobj->sd->s_parent;
 
-	return sysfs_rename(kobj->sd, parent_sd, new_name, new_ns);
+	return kernfs_rename_ns(kobj->sd, parent_sd, new_name, new_ns);
 }
 
 int sysfs_move_dir_ns(struct kobject *kobj, struct kobject *new_parent_kobj,
@@ -977,7 +984,7 @@ int sysfs_move_dir_ns(struct kobject *kobj, struct kobject *new_parent_kobj,
 	new_parent_sd = new_parent_kobj && new_parent_kobj->sd ?
 		new_parent_kobj->sd : &sysfs_root;
 
-	return sysfs_rename(sd, new_parent_sd, sd->s_name, new_ns);
+	return kernfs_rename_ns(sd, new_parent_sd, sd->s_name, new_ns);
 }
 
 /* Relationship between s_mode and the DT_xxx types */
diff --git a/fs/sysfs/symlink.c b/fs/sysfs/symlink.c
index 8b8efbe..2341350 100644
--- a/fs/sysfs/symlink.c
+++ b/fs/sysfs/symlink.c
@@ -217,7 +217,7 @@ int sysfs_rename_link_ns(struct kobject *kobj, struct kobject *targ,
 	if (sd->s_symlink.target_sd->s_dir.kobj != targ)
 		goto out;
 
-	result = sysfs_rename(sd, parent_sd, new, new_ns);
+	result = kernfs_rename_ns(sd, parent_sd, new, new_ns);
 
 out:
 	sysfs_put(sd);
diff --git a/fs/sysfs/sysfs.h b/fs/sysfs/sysfs.h
index 506f5d0..3d48445 100644
--- a/fs/sysfs/sysfs.h
+++ b/fs/sysfs/sysfs.h
@@ -185,9 +185,6 @@ void release_sysfs_dirent(struct sysfs_dirent *sd);
 int sysfs_create_subdir(struct kobject *kobj, const char *name,
 			struct sysfs_dirent **p_sd);
 
-int sysfs_rename(struct sysfs_dirent *sd, struct sysfs_dirent *new_parent_sd,
-		 const char *new_name, const void *new_ns);
-
 static inline struct sysfs_dirent *__sysfs_get(struct sysfs_dirent *sd)
 {
 	if (sd) {
diff --git a/include/linux/kernfs.h b/include/linux/kernfs.h
index 03dcee3..9ad3fe5 100644
--- a/include/linux/kernfs.h
+++ b/include/linux/kernfs.h
@@ -19,6 +19,8 @@ struct sysfs_dirent *kernfs_create_link(struct sysfs_dirent *parent,
 void kernfs_remove(struct sysfs_dirent *sd);
 int kernfs_remove_by_name_ns(struct sysfs_dirent *parent, const char *name,
 			     const void *ns);
+int kernfs_rename_ns(struct sysfs_dirent *sd, struct sysfs_dirent *new_parent,
+		     const char *new_name, const void *new_ns);
 
 #else	/* CONFIG_SYSFS */
 
@@ -33,6 +35,11 @@ static inline int kernfs_remove_by_name_ns(struct sysfs_dirent *parent,
 					   const char *name, const void *ns)
 { return 0; }
 
+static inline int kernfs_rename_ns(struct sysfs_dirent *sd,
+				   struct sysfs_dirent *new_parent,
+				   const char *new_name, const void *new_ns)
+{ return 0; }
+
 #endif	/* CONFIG_SYSFS */
 
 static inline int kernfs_remove_by_name(struct sysfs_dirent *parent,
-- 
1.8.3.1


^ permalink raw reply related	[flat|nested] 46+ messages in thread

* [PATCH 11/34] sysfs, kernfs: introduce kernfs_setattr()
  2013-10-24 15:49 Subject: [PATCHSET driver-core-next] sysfs: separate out kernfs, part #1 Tejun Heo
                   ` (9 preceding siblings ...)
  2013-10-24 15:49 ` [PATCH 10/34] sysfs, kernfs: introduce kernfs_rename[_ns]() Tejun Heo
@ 2013-10-24 15:49 ` Tejun Heo
  2013-10-24 15:49 ` [PATCH 12/34] sysfs, kernfs: replace sysfs_dirent->s_dir.kobj and ->s_attr.[bin_]attr with ->priv Tejun Heo
                   ` (24 subsequent siblings)
  35 siblings, 0 replies; 46+ messages in thread
From: Tejun Heo @ 2013-10-24 15:49 UTC (permalink / raw)
  To: gregkh; +Cc: kay, linux-kernel, ebiederm, bhelgaas, Tejun Heo

Introduce kernfs setattr interface - kernfs_setattr().

sysfs_sd_setattr() is renamed to __kernfs_setattr() and
kernfs_setattr() is a simple wrapper around it with sysfs_mutex
locking.  sysfs_chmod_file() is updated to get an explicit ref on
kobj->sd and then invoke kernfs_setattr() so that it doesn't have to
use internal interface.

This patch doesn't introduce any behavior differences.

Signed-off-by: Tejun Heo <tj@kernel.org>
---
 fs/sysfs/file.c        | 13 +++++--------
 fs/sysfs/inode.c       | 21 +++++++++++++++++++--
 fs/sysfs/sysfs.h       |  1 -
 include/linux/kernfs.h |  8 ++++++++
 4 files changed, 32 insertions(+), 11 deletions(-)

diff --git a/fs/sysfs/file.c b/fs/sysfs/file.c
index 6e9c92e..d4eeb50 100644
--- a/fs/sysfs/file.c
+++ b/fs/sysfs/file.c
@@ -923,19 +923,16 @@ int sysfs_chmod_file(struct kobject *kobj, const struct attribute *attr,
 	struct iattr newattrs;
 	int rc;
 
-	mutex_lock(&sysfs_mutex);
-
-	rc = -ENOENT;
-	sd = sysfs_find_dirent(kobj->sd, attr->name, NULL);
+	sd = sysfs_get_dirent(kobj->sd, attr->name);
 	if (!sd)
-		goto out;
+		return -ENOENT;
 
 	newattrs.ia_mode = (mode & S_IALLUGO) | (sd->s_mode & ~S_IALLUGO);
 	newattrs.ia_valid = ATTR_MODE;
-	rc = sysfs_sd_setattr(sd, &newattrs);
 
- out:
-	mutex_unlock(&sysfs_mutex);
+	rc = kernfs_setattr(sd, &newattrs);
+
+	sysfs_put(sd);
 	return rc;
 }
 EXPORT_SYMBOL_GPL(sysfs_chmod_file);
diff --git a/fs/sysfs/inode.c b/fs/sysfs/inode.c
index 1750f79..5f7e2af 100644
--- a/fs/sysfs/inode.c
+++ b/fs/sysfs/inode.c
@@ -67,7 +67,7 @@ static struct sysfs_inode_attrs *sysfs_init_inode_attrs(struct sysfs_dirent *sd)
 	return attrs;
 }
 
-int sysfs_sd_setattr(struct sysfs_dirent *sd, struct iattr *iattr)
+static int __kernfs_setattr(struct sysfs_dirent *sd, const struct iattr *iattr)
 {
 	struct sysfs_inode_attrs *sd_attrs;
 	struct iattr *iattrs;
@@ -102,6 +102,23 @@ int sysfs_sd_setattr(struct sysfs_dirent *sd, struct iattr *iattr)
 	return 0;
 }
 
+/**
+ * kernfs_setattr - set iattr on a node
+ * @sd: target node
+ * @iattr: iattr to set
+ *
+ * Returns 0 on success, -errno on failure.
+ */
+int kernfs_setattr(struct sysfs_dirent *sd, const struct iattr *iattr)
+{
+	int ret;
+
+	mutex_lock(&sysfs_mutex);
+	ret = __kernfs_setattr(sd, iattr);
+	mutex_unlock(&sysfs_mutex);
+	return ret;
+}
+
 int sysfs_setattr(struct dentry *dentry, struct iattr *iattr)
 {
 	struct inode *inode = dentry->d_inode;
@@ -116,7 +133,7 @@ int sysfs_setattr(struct dentry *dentry, struct iattr *iattr)
 	if (error)
 		goto out;
 
-	error = sysfs_sd_setattr(sd, iattr);
+	error = __kernfs_setattr(sd, iattr);
 	if (error)
 		goto out;
 
diff --git a/fs/sysfs/sysfs.h b/fs/sysfs/sysfs.h
index 3d48445..d49399d 100644
--- a/fs/sysfs/sysfs.h
+++ b/fs/sysfs/sysfs.h
@@ -207,7 +207,6 @@ static inline void __sysfs_put(struct sysfs_dirent *sd)
  */
 struct inode *sysfs_get_inode(struct super_block *sb, struct sysfs_dirent *sd);
 void sysfs_evict_inode(struct inode *inode);
-int sysfs_sd_setattr(struct sysfs_dirent *sd, struct iattr *iattr);
 int sysfs_permission(struct inode *inode, int mask);
 int sysfs_setattr(struct dentry *dentry, struct iattr *iattr);
 int sysfs_getattr(struct vfsmount *mnt, struct dentry *dentry,
diff --git a/include/linux/kernfs.h b/include/linux/kernfs.h
index 9ad3fe5..7585d9d 100644
--- a/include/linux/kernfs.h
+++ b/include/linux/kernfs.h
@@ -9,6 +9,9 @@
 
 #include <linux/kernel.h>
 
+struct file;
+struct iattr;
+
 struct sysfs_dirent;
 
 #ifdef CONFIG_SYSFS
@@ -21,6 +24,7 @@ int kernfs_remove_by_name_ns(struct sysfs_dirent *parent, const char *name,
 			     const void *ns);
 int kernfs_rename_ns(struct sysfs_dirent *sd, struct sysfs_dirent *new_parent,
 		     const char *new_name, const void *new_ns);
+int kernfs_setattr(struct sysfs_dirent *sd, const struct iattr *iattr);
 
 #else	/* CONFIG_SYSFS */
 
@@ -40,6 +44,10 @@ static inline int kernfs_rename_ns(struct sysfs_dirent *sd,
 				   const char *new_name, const void *new_ns)
 { return 0; }
 
+static inline int kernfs_setattr(struct sysfs_dirent *sd,
+				 const struct iattr *iattr)
+{ return 0; }
+
 #endif	/* CONFIG_SYSFS */
 
 static inline int kernfs_remove_by_name(struct sysfs_dirent *parent,
-- 
1.8.3.1


^ permalink raw reply related	[flat|nested] 46+ messages in thread

* [PATCH 12/34] sysfs, kernfs: replace sysfs_dirent->s_dir.kobj and ->s_attr.[bin_]attr with ->priv
  2013-10-24 15:49 Subject: [PATCHSET driver-core-next] sysfs: separate out kernfs, part #1 Tejun Heo
                   ` (10 preceding siblings ...)
  2013-10-24 15:49 ` [PATCH 11/34] sysfs, kernfs: introduce kernfs_setattr() Tejun Heo
@ 2013-10-24 15:49 ` Tejun Heo
  2013-10-24 15:49 ` [PATCH 13/34] sysfs, kernfs: introduce kernfs_create_dir[_ns]() Tejun Heo
                   ` (23 subsequent siblings)
  35 siblings, 0 replies; 46+ messages in thread
From: Tejun Heo @ 2013-10-24 15:49 UTC (permalink / raw)
  To: gregkh; +Cc: kay, linux-kernel, ebiederm, bhelgaas, Tejun Heo

A directory sysfs_dirent points to the associated kobj.  A regular or
bin file points to the associated [bin_]attribute.  This patch
replaces sysfs_dirent->s_dir.kobj and ->s_attr.[bin_]attr with void *
->priv.

This is to prepare for kernfs interface so that sysfs can specify the
private data in the same way for directories and files.  This lower
debuggability but not by much - the whole thing was overlaid in a
union anyway.  If debuggability becomes an issue, we can later add
->priv accessors which explicitly check for the sysfs_dirent type and
performs casting.

This patch doesn't introduce any behavior difference.

Signed-off-by: Tejun Heo <tj@kernel.org>
---
 fs/sysfs/dir.c     |  2 +-
 fs/sysfs/file.c    | 26 +++++++++++++-------------
 fs/sysfs/inode.c   |  2 +-
 fs/sysfs/symlink.c |  2 +-
 fs/sysfs/sysfs.h   | 13 +++++--------
 5 files changed, 21 insertions(+), 24 deletions(-)

diff --git a/fs/sysfs/dir.c b/fs/sysfs/dir.c
index e2ce1e3..aa8a21b 100644
--- a/fs/sysfs/dir.c
+++ b/fs/sysfs/dir.c
@@ -676,7 +676,7 @@ static int create_dir(struct kobject *kobj, struct sysfs_dirent *parent_sd,
 		return -ENOMEM;
 
 	sd->s_ns = ns;
-	sd->s_dir.kobj = kobj;
+	sd->priv = kobj;
 
 	/* link in */
 	sysfs_addrm_start(&acxt);
diff --git a/fs/sysfs/file.c b/fs/sysfs/file.c
index d4eeb50..444615d 100644
--- a/fs/sysfs/file.c
+++ b/fs/sysfs/file.c
@@ -74,7 +74,7 @@ static struct sysfs_open_file *sysfs_of(struct file *file)
  */
 static const struct sysfs_ops *sysfs_file_ops(struct sysfs_dirent *sd)
 {
-	struct kobject *kobj = sd->s_parent->s_dir.kobj;
+	struct kobject *kobj = sd->s_parent->priv;
 
 	if (!sysfs_ignore_lockdep(sd))
 		lockdep_assert_held(sd);
@@ -89,7 +89,7 @@ static const struct sysfs_ops *sysfs_file_ops(struct sysfs_dirent *sd)
 static int sysfs_seq_show(struct seq_file *sf, void *v)
 {
 	struct sysfs_open_file *of = sf->private;
-	struct kobject *kobj = of->sd->s_parent->s_dir.kobj;
+	struct kobject *kobj = of->sd->s_parent->priv;
 	const struct sysfs_ops *ops;
 	char *buf;
 	ssize_t count;
@@ -120,7 +120,7 @@ static int sysfs_seq_show(struct seq_file *sf, void *v)
 	 */
 	ops = sysfs_file_ops(of->sd);
 	if (ops->show)
-		count = ops->show(kobj, of->sd->s_attr.attr, buf);
+		count = ops->show(kobj, of->sd->priv, buf);
 	else
 		count = 0;
 
@@ -154,8 +154,8 @@ static ssize_t sysfs_bin_read(struct file *file, char __user *userbuf,
 			      size_t bytes, loff_t *off)
 {
 	struct sysfs_open_file *of = sysfs_of(file);
-	struct bin_attribute *battr = of->sd->s_attr.bin_attr;
-	struct kobject *kobj = of->sd->s_parent->s_dir.kobj;
+	struct bin_attribute *battr = of->sd->priv;
+	struct kobject *kobj = of->sd->s_parent->priv;
 	loff_t size = file_inode(file)->i_size;
 	int count = min_t(size_t, bytes, PAGE_SIZE);
 	loff_t offs = *off;
@@ -221,7 +221,7 @@ static ssize_t sysfs_bin_read(struct file *file, char __user *userbuf,
 static int flush_write_buffer(struct sysfs_open_file *of, char *buf, loff_t off,
 			      size_t count)
 {
-	struct kobject *kobj = of->sd->s_parent->s_dir.kobj;
+	struct kobject *kobj = of->sd->s_parent->priv;
 	int rc = 0;
 
 	/*
@@ -236,7 +236,7 @@ static int flush_write_buffer(struct sysfs_open_file *of, char *buf, loff_t off,
 	}
 
 	if (sysfs_is_bin(of->sd)) {
-		struct bin_attribute *battr = of->sd->s_attr.bin_attr;
+		struct bin_attribute *battr = of->sd->priv;
 
 		rc = -EIO;
 		if (battr->write)
@@ -245,7 +245,7 @@ static int flush_write_buffer(struct sysfs_open_file *of, char *buf, loff_t off,
 	} else {
 		const struct sysfs_ops *ops = sysfs_file_ops(of->sd);
 
-		rc = ops->store(kobj, of->sd->s_attr.attr, buf, count);
+		rc = ops->store(kobj, of->sd->priv, buf, count);
 	}
 
 	sysfs_put_active(of->sd);
@@ -466,8 +466,8 @@ static const struct vm_operations_struct sysfs_bin_vm_ops = {
 static int sysfs_bin_mmap(struct file *file, struct vm_area_struct *vma)
 {
 	struct sysfs_open_file *of = sysfs_of(file);
-	struct bin_attribute *battr = of->sd->s_attr.bin_attr;
-	struct kobject *kobj = of->sd->s_parent->s_dir.kobj;
+	struct bin_attribute *battr = of->sd->priv;
+	struct kobject *kobj = of->sd->s_parent->priv;
 	int rc;
 
 	mutex_lock(&of->mutex);
@@ -608,7 +608,7 @@ static void sysfs_put_open_dirent(struct sysfs_dirent *sd,
 static int sysfs_open_file(struct inode *inode, struct file *file)
 {
 	struct sysfs_dirent *attr_sd = file->f_path.dentry->d_fsdata;
-	struct kobject *kobj = attr_sd->s_parent->s_dir.kobj;
+	struct kobject *kobj = attr_sd->s_parent->priv;
 	struct sysfs_open_file *of;
 	bool has_read, has_write;
 	int error = -EACCES;
@@ -618,7 +618,7 @@ static int sysfs_open_file(struct inode *inode, struct file *file)
 		return -ENODEV;
 
 	if (sysfs_is_bin(attr_sd)) {
-		struct bin_attribute *battr = attr_sd->s_attr.bin_attr;
+		struct bin_attribute *battr = attr_sd->priv;
 
 		has_read = battr->read || battr->mmap;
 		has_write = battr->write || battr->mmap;
@@ -831,7 +831,7 @@ int sysfs_add_file_mode_ns(struct sysfs_dirent *dir_sd,
 		return -ENOMEM;
 
 	sd->s_ns = ns;
-	sd->s_attr.attr = (void *)attr;
+	sd->priv = (void *)attr;
 	sysfs_dirent_init_lockdep(sd);
 
 	sysfs_addrm_start(&acxt);
diff --git a/fs/sysfs/inode.c b/fs/sysfs/inode.c
index 5f7e2af..81cc858 100644
--- a/fs/sysfs/inode.c
+++ b/fs/sysfs/inode.c
@@ -275,7 +275,7 @@ static void sysfs_init_inode(struct sysfs_dirent *sd, struct inode *inode)
 		inode->i_fop = &sysfs_file_operations;
 		break;
 	case SYSFS_KOBJ_BIN_ATTR:
-		bin_attr = sd->s_attr.bin_attr;
+		bin_attr = sd->priv;
 		inode->i_size = bin_attr->size;
 		inode->i_fop = &sysfs_bin_operations;
 		break;
diff --git a/fs/sysfs/symlink.c b/fs/sysfs/symlink.c
index 2341350..b44d2ce 100644
--- a/fs/sysfs/symlink.c
+++ b/fs/sysfs/symlink.c
@@ -214,7 +214,7 @@ int sysfs_rename_link_ns(struct kobject *kobj, struct kobject *targ,
 	result = -EINVAL;
 	if (sysfs_type(sd) != SYSFS_KOBJ_LINK)
 		goto out;
-	if (sd->s_symlink.target_sd->s_dir.kobj != targ)
+	if (sd->s_symlink.target_sd->priv != targ)
 		goto out;
 
 	result = kernfs_rename_ns(sd, parent_sd, new, new_ns);
diff --git a/fs/sysfs/sysfs.h b/fs/sysfs/sysfs.h
index d49399d..9952e27 100644
--- a/fs/sysfs/sysfs.h
+++ b/fs/sysfs/sysfs.h
@@ -17,8 +17,6 @@ struct sysfs_open_dirent;
 
 /* type-specific structures for sysfs_dirent->s_* union members */
 struct sysfs_elem_dir {
-	struct kobject		*kobj;
-
 	unsigned long		subdirs;
 	/* children rbtree starts here and goes through sd->s_rb */
 	struct rb_root		children;
@@ -29,10 +27,6 @@ struct sysfs_elem_symlink {
 };
 
 struct sysfs_elem_attr {
-	union {
-		struct attribute	*attr;
-		struct bin_attribute	*bin_attr;
-	};
 	struct sysfs_open_dirent *open;
 };
 
@@ -74,6 +68,8 @@ struct sysfs_dirent {
 		struct sysfs_elem_attr		s_attr;
 	};
 
+	void			*priv;
+
 	unsigned short		s_flags;
 	umode_t			s_mode;
 	unsigned int		s_ino;
@@ -103,7 +99,7 @@ static inline unsigned int sysfs_type(struct sysfs_dirent *sd)
 
 #define sysfs_dirent_init_lockdep(sd)				\
 do {								\
-	struct attribute *attr = sd->s_attr.attr;		\
+	struct attribute *attr = sd->priv;			\
 	struct lock_class_key *key = attr->key;			\
 	if (!key)						\
 		key = &attr->skey;				\
@@ -114,10 +110,11 @@ do {								\
 /* Test for attributes that want to ignore lockdep for read-locking */
 static inline bool sysfs_ignore_lockdep(struct sysfs_dirent *sd)
 {
+	struct attribute *attr = sd->priv;
 	int type = sysfs_type(sd);
 
 	return (type == SYSFS_KOBJ_ATTR || type == SYSFS_KOBJ_BIN_ATTR) &&
-		sd->s_attr.attr->ignore_lockdep;
+		attr->ignore_lockdep;
 }
 
 #else
-- 
1.8.3.1


^ permalink raw reply related	[flat|nested] 46+ messages in thread

* [PATCH 13/34] sysfs, kernfs: introduce kernfs_create_dir[_ns]()
  2013-10-24 15:49 Subject: [PATCHSET driver-core-next] sysfs: separate out kernfs, part #1 Tejun Heo
                   ` (11 preceding siblings ...)
  2013-10-24 15:49 ` [PATCH 12/34] sysfs, kernfs: replace sysfs_dirent->s_dir.kobj and ->s_attr.[bin_]attr with ->priv Tejun Heo
@ 2013-10-24 15:49 ` Tejun Heo
  2013-10-24 15:49 ` [PATCH 14/34] sysfs, kernfs: prepare read path for kernfs Tejun Heo
                   ` (22 subsequent siblings)
  35 siblings, 0 replies; 46+ messages in thread
From: Tejun Heo @ 2013-10-24 15:49 UTC (permalink / raw)
  To: gregkh; +Cc: kay, linux-kernel, ebiederm, bhelgaas, Tejun Heo

Introduce kernfs interface to create a directory which takes and
returns sysfs_dirents.

create_dir() is renamed to kernfs_create_dir_ns() and its argumantes
and return value are updated.  create_dir() usages are replaced with
kernfs_create_dir_ns() and sysfs_create_subdir() usages are replaced
with kernfs_create_dir().  Dup warnings are handled explicitly by
sysfs users of the kernfs interface.

This patch doesn't introduce any behavior changes.

Signed-off-by: Tejun Heo <tj@kernel.org>
---
 fs/sysfs/dir.c         | 50 ++++++++++++++++++++++++++++----------------------
 fs/sysfs/group.c       |  9 ++++++---
 fs/sysfs/sysfs.h       |  3 ---
 include/linux/kernfs.h | 14 ++++++++++++++
 4 files changed, 48 insertions(+), 28 deletions(-)

diff --git a/fs/sysfs/dir.c b/fs/sysfs/dir.c
index aa8a21b..0a1e1da 100644
--- a/fs/sysfs/dir.c
+++ b/fs/sysfs/dir.c
@@ -661,9 +661,18 @@ struct sysfs_dirent *sysfs_get_dirent_ns(struct sysfs_dirent *parent_sd,
 }
 EXPORT_SYMBOL_GPL(sysfs_get_dirent_ns);
 
-static int create_dir(struct kobject *kobj, struct sysfs_dirent *parent_sd,
-		      const char *name, const void *ns,
-		      struct sysfs_dirent **p_sd)
+/**
+ * kernfs_create_dir_ns - create a directory
+ * @parent: parent in which to create a new directory
+ * @name: name of the new directory
+ * @priv: opaque data associated with the new directory
+ * @ns: optional namespace tag of the directory
+ *
+ * Returns the created node on success, ERR_PTR() value on failure.
+ */
+struct sysfs_dirent *kernfs_create_dir_ns(struct sysfs_dirent *parent,
+					  const char *name, void *priv,
+					  const void *ns)
 {
 	umode_t mode = S_IFDIR | S_IRWXU | S_IRUGO | S_IXUGO;
 	struct sysfs_addrm_cxt acxt;
@@ -673,28 +682,21 @@ static int create_dir(struct kobject *kobj, struct sysfs_dirent *parent_sd,
 	/* allocate */
 	sd = sysfs_new_dirent(name, mode, SYSFS_DIR);
 	if (!sd)
-		return -ENOMEM;
+		return ERR_PTR(-ENOMEM);
 
 	sd->s_ns = ns;
-	sd->priv = kobj;
+	sd->priv = priv;
 
 	/* link in */
 	sysfs_addrm_start(&acxt);
-	rc = sysfs_add_one(&acxt, sd, parent_sd);
+	rc = __sysfs_add_one(&acxt, sd, parent);
 	sysfs_addrm_finish(&acxt);
 
-	if (rc == 0)
-		*p_sd = sd;
-	else
-		sysfs_put(sd);
+	if (!rc)
+		return sd;
 
-	return rc;
-}
-
-int sysfs_create_subdir(struct kobject *kobj, const char *name,
-			struct sysfs_dirent **p_sd)
-{
-	return create_dir(kobj, kobj->sd, name, NULL, p_sd);
+	sysfs_put(sd);
+	return ERR_PTR(rc);
 }
 
 /**
@@ -705,7 +707,6 @@ int sysfs_create_subdir(struct kobject *kobj, const char *name,
 int sysfs_create_dir_ns(struct kobject *kobj, const void *ns)
 {
 	struct sysfs_dirent *parent_sd, *sd;
-	int error = 0;
 
 	BUG_ON(!kobj);
 
@@ -717,10 +718,15 @@ int sysfs_create_dir_ns(struct kobject *kobj, const void *ns)
 	if (!parent_sd)
 		return -ENOENT;
 
-	error = create_dir(kobj, parent_sd, kobject_name(kobj), ns, &sd);
-	if (!error)
-		kobj->sd = sd;
-	return error;
+	sd = kernfs_create_dir_ns(parent_sd, kobject_name(kobj), kobj, ns);
+	if (IS_ERR(sd)) {
+		if (PTR_ERR(sd) == -EEXIST)
+			sysfs_warn_dup(parent_sd, kobject_name(kobj));
+		return PTR_ERR(sd);
+	}
+
+	kobj->sd = sd;
+	return 0;
 }
 
 static struct dentry *sysfs_lookup(struct inode *dir, struct dentry *dentry,
diff --git a/fs/sysfs/group.c b/fs/sysfs/group.c
index 4bd9973..065689d 100644
--- a/fs/sysfs/group.c
+++ b/fs/sysfs/group.c
@@ -101,9 +101,12 @@ static int internal_create_group(struct kobject *kobj, int update,
 		return -EINVAL;
 	}
 	if (grp->name) {
-		error = sysfs_create_subdir(kobj, grp->name, &sd);
-		if (error)
-			return error;
+		sd = kernfs_create_dir(kobj->sd, grp->name, kobj);
+		if (IS_ERR(sd)) {
+			if (PTR_ERR(sd) == -EEXIST)
+				sysfs_warn_dup(kobj->sd, grp->name);
+			return PTR_ERR(sd);
+		}
 	} else
 		sd = kobj->sd;
 	sysfs_get(sd);
diff --git a/fs/sysfs/sysfs.h b/fs/sysfs/sysfs.h
index 9952e27..69a1162 100644
--- a/fs/sysfs/sysfs.h
+++ b/fs/sysfs/sysfs.h
@@ -179,9 +179,6 @@ struct sysfs_dirent *sysfs_new_dirent(const char *name, umode_t mode, int type);
 
 void release_sysfs_dirent(struct sysfs_dirent *sd);
 
-int sysfs_create_subdir(struct kobject *kobj, const char *name,
-			struct sysfs_dirent **p_sd);
-
 static inline struct sysfs_dirent *__sysfs_get(struct sysfs_dirent *sd)
 {
 	if (sd) {
diff --git a/include/linux/kernfs.h b/include/linux/kernfs.h
index 7585d9d..fab2597 100644
--- a/include/linux/kernfs.h
+++ b/include/linux/kernfs.h
@@ -16,6 +16,9 @@ struct sysfs_dirent;
 
 #ifdef CONFIG_SYSFS
 
+struct sysfs_dirent *kernfs_create_dir_ns(struct sysfs_dirent *parent,
+					  const char *name, void *priv,
+					  const void *ns);
 struct sysfs_dirent *kernfs_create_link(struct sysfs_dirent *parent,
 					const char *name,
 					struct sysfs_dirent *target);
@@ -29,6 +32,11 @@ int kernfs_setattr(struct sysfs_dirent *sd, const struct iattr *iattr);
 #else	/* CONFIG_SYSFS */
 
 static inline struct sysfs_dirent *
+kernfs_create_dir_ns(struct sysfs_dirent *parent, const char *name, void *priv,
+		     const void *ns)
+{ return NULL; }
+
+static inline struct sysfs_dirent *
 kernfs_create_link(struct sysfs_dirent *parent, const char *name,
 		   struct sysfs_dirent *target)
 { return NULL; }
@@ -50,6 +58,12 @@ static inline int kernfs_setattr(struct sysfs_dirent *sd,
 
 #endif	/* CONFIG_SYSFS */
 
+static inline struct sysfs_dirent *
+kernfs_create_dir(struct sysfs_dirent *parent, const char *name, void *priv)
+{
+	return kernfs_create_dir_ns(parent, name, priv, NULL);
+}
+
 static inline int kernfs_remove_by_name(struct sysfs_dirent *parent,
 					const char *name)
 {
-- 
1.8.3.1


^ permalink raw reply related	[flat|nested] 46+ messages in thread

* [PATCH 14/34] sysfs, kernfs: prepare read path for kernfs
  2013-10-24 15:49 Subject: [PATCHSET driver-core-next] sysfs: separate out kernfs, part #1 Tejun Heo
                   ` (12 preceding siblings ...)
  2013-10-24 15:49 ` [PATCH 13/34] sysfs, kernfs: introduce kernfs_create_dir[_ns]() Tejun Heo
@ 2013-10-24 15:49 ` Tejun Heo
  2013-10-26 11:32   ` Pavel Machek
                     ` (2 more replies)
  2013-10-24 15:49 ` [PATCH 15/34] sysfs, kernfs: prepare write " Tejun Heo
                   ` (21 subsequent siblings)
  35 siblings, 3 replies; 46+ messages in thread
From: Tejun Heo @ 2013-10-24 15:49 UTC (permalink / raw)
  To: gregkh; +Cc: kay, linux-kernel, ebiederm, bhelgaas, Tejun Heo

We're in the process of separating out core sysfs functionality into
kernfs which will deal with sysfs_dirents directly.  This patch
rearranges read path so that the kernfs and sysfs parts are separate.

* Regular file read path is refactored such that
  kernfs_seq_start/next/stop/show() handle all the boilerplate work
  including locking and updating event count for poll, while
  sysfs_kf_seq_show() deals with interaction with kobj show method.

* Bin file read path is refactored such that kernfs_file_direct_read()
  handles all the boilerplate work including buffer management and
  locking, while sysfs_kf_bin_read() deals with interaction with
  bin_attribute read method.

kernfs_file_read() is added.  It invokes either the seq_file or direct
read path depending on the file type.  This will eventually allow
using the same file_operations for both file types, which is necessary
to separate out kernfs.

While this patch changes the order of some operations, it shouldn't
change any visible behavior.

Signed-off-by: Tejun Heo <tj@kernel.org>
---
 fs/sysfs/file.c | 186 ++++++++++++++++++++++++++++++++++++--------------------
 1 file changed, 121 insertions(+), 65 deletions(-)

diff --git a/fs/sysfs/file.c b/fs/sysfs/file.c
index 444615d..0f93330 100644
--- a/fs/sysfs/file.c
+++ b/fs/sysfs/file.c
@@ -86,13 +86,13 @@ static const struct sysfs_ops *sysfs_file_ops(struct sysfs_dirent *sd)
  * details like buffering and seeking.  The following function pipes
  * sysfs_ops->show() result through seq_file.
  */
-static int sysfs_seq_show(struct seq_file *sf, void *v)
+static int sysfs_kf_seq_show(struct seq_file *sf, void *v)
 {
 	struct sysfs_open_file *of = sf->private;
 	struct kobject *kobj = of->sd->s_parent->priv;
-	const struct sysfs_ops *ops;
+	const struct sysfs_ops *ops = sysfs_file_ops(of->sd);
+	ssize_t count = 0;
 	char *buf;
-	ssize_t count;
 
 	/* acquire buffer and ensure that it's >= PAGE_SIZE */
 	count = seq_get_buf(sf, &buf);
@@ -102,33 +102,14 @@ static int sysfs_seq_show(struct seq_file *sf, void *v)
 	}
 
 	/*
-	 * Need @of->sd for attr and ops, its parent for kobj.  @of->mutex
-	 * nests outside active ref and is just to ensure that the ops
-	 * aren't called concurrently for the same open file.
-	 */
-	mutex_lock(&of->mutex);
-	if (!sysfs_get_active(of->sd)) {
-		mutex_unlock(&of->mutex);
-		return -ENODEV;
-	}
-
-	of->event = atomic_read(&of->sd->s_attr.open->event);
-
-	/*
-	 * Lookup @ops and invoke show().  Control may reach here via seq
-	 * file lseek even if @ops->show() isn't implemented.
+	 * Invoke show().  Control may reach here via seq file lseek even
+	 * if @ops->show() isn't implemented.
 	 */
-	ops = sysfs_file_ops(of->sd);
-	if (ops->show)
+	if (ops->show) {
 		count = ops->show(kobj, of->sd->priv, buf);
-	else
-		count = 0;
-
-	sysfs_put_active(of->sd);
-	mutex_unlock(&of->mutex);
-
-	if (count < 0)
-		return count;
+		if (count < 0)
+			return count;
+	}
 
 	/*
 	 * The code works fine with PAGE_SIZE return but it's likely to
@@ -144,68 +125,141 @@ static int sysfs_seq_show(struct seq_file *sf, void *v)
 	return 0;
 }
 
-/*
- * Read method for bin files.  As reading a bin file can have side-effects,
- * the exact offset and bytes specified in read(2) call should be passed to
- * the read callback making it difficult to use seq_file.  Implement
- * simplistic custom buffering for bin files.
- */
-static ssize_t sysfs_bin_read(struct file *file, char __user *userbuf,
-			      size_t bytes, loff_t *off)
+static ssize_t sysfs_kf_bin_read(struct sysfs_open_file *of, char *buf,
+				 size_t count, loff_t pos)
 {
-	struct sysfs_open_file *of = sysfs_of(file);
 	struct bin_attribute *battr = of->sd->priv;
 	struct kobject *kobj = of->sd->s_parent->priv;
-	loff_t size = file_inode(file)->i_size;
-	int count = min_t(size_t, bytes, PAGE_SIZE);
-	loff_t offs = *off;
-	char *buf;
+	loff_t size = file_inode(of->file)->i_size;
 
-	if (!bytes)
+	if (!count)
 		return 0;
 
 	if (size) {
-		if (offs > size)
+		if (pos > size)
 			return 0;
-		if (offs + count > size)
-			count = size - offs;
+		if (pos + count > size)
+			count = size - pos;
 	}
 
-	buf = kmalloc(count, GFP_KERNEL);
+	if (!battr->read)
+		return -EIO;
+
+	return battr->read(of->file, kobj, battr, buf, pos, count);
+}
+
+static void *kernfs_seq_start(struct seq_file *sf, loff_t *ppos)
+{
+	struct sysfs_open_file *of = sf->private;
+
+	/*
+	 * @of->mutex nests outside active ref and is just to ensure that
+	 * the ops aren't called concurrently for the same open file.
+	 */
+	mutex_lock(&of->mutex);
+	if (!sysfs_get_active(of->sd)) {
+		mutex_unlock(&of->mutex);
+		return ERR_PTR(-ENODEV);
+	}
+
+	/* the same behavior as single_open() */
+	return NULL + !*ppos;
+}
+
+static void *kernfs_seq_next(struct seq_file *sf, void *v, loff_t *ppos)
+{
+	++*ppos;
+	return NULL;
+}
+
+static void kernfs_seq_stop(struct seq_file *sf, void *v)
+{
+	struct sysfs_open_file *of = sf->private;
+
+	sysfs_put_active(of->sd);
+	mutex_unlock(&of->mutex);
+}
+
+static int kernfs_seq_show(struct seq_file *sf, void *v)
+{
+	struct sysfs_open_file *of = sf->private;
+
+	of->event = atomic_read(&of->sd->s_attr.open->event);
+
+	return sysfs_kf_seq_show(sf, v);
+}
+
+static const struct seq_operations kernfs_seq_ops = {
+	.start = kernfs_seq_start,
+	.next = kernfs_seq_next,
+	.stop = kernfs_seq_stop,
+	.show = kernfs_seq_show,
+};
+
+/*
+ * As reading a bin file can have side-effects, the exact offset and bytes
+ * specified in read(2) call should be passed to the read callback making
+ * it difficult to use seq_file.  Implement simplistic custom buffering for
+ * bin files.
+ */
+static ssize_t kernfs_file_direct_read(struct sysfs_open_file *of,
+				       char __user *user_buf, size_t count,
+				       loff_t *ppos)
+{
+	ssize_t len = min_t(size_t, count, PAGE_SIZE);
+	char *buf;
+
+	buf = kmalloc(len, GFP_KERNEL);
 	if (!buf)
 		return -ENOMEM;
 
-	/* need of->sd for battr, its parent for kobj */
+	/*
+	 * @of->mutex nests outside active ref and is just to ensure that
+	 * the ops aren't called concurrently for the same open file.
+	 */
 	mutex_lock(&of->mutex);
 	if (!sysfs_get_active(of->sd)) {
-		count = -ENODEV;
+		len = -ENODEV;
 		mutex_unlock(&of->mutex);
 		goto out_free;
 	}
 
-	if (battr->read)
-		count = battr->read(file, kobj, battr, buf, offs, count);
-	else
-		count = -EIO;
+	len = sysfs_kf_bin_read(of, buf, len, *ppos);
 
 	sysfs_put_active(of->sd);
 	mutex_unlock(&of->mutex);
 
-	if (count < 0)
+	if (len < 0)
 		goto out_free;
 
-	if (copy_to_user(userbuf, buf, count)) {
-		count = -EFAULT;
+	if (copy_to_user(user_buf, buf, len)) {
+		len = -EFAULT;
 		goto out_free;
 	}
 
-	pr_debug("offs = %lld, *off = %lld, count = %d\n", offs, *off, count);
-
-	*off = offs + count;
+	*ppos += len;
 
  out_free:
 	kfree(buf);
-	return count;
+	return len;
+}
+
+/**
+ * kernfs_file_read - kernfs vfs read callback
+ * @file: file pointer
+ * @user_buf: data to write
+ * @count: number of bytes
+ * @ppos: starting offset
+ */
+static ssize_t kernfs_file_read(struct file *file, char __user *user_buf,
+				size_t count, loff_t *ppos)
+{
+	struct sysfs_open_file *of = sysfs_of(file);
+
+	if (sysfs_is_bin(of->sd))
+		return kernfs_file_direct_read(of, user_buf, count, ppos);
+	else
+		return seq_read(file, user_buf, count, ppos);
 }
 
 /**
@@ -660,12 +714,14 @@ static int sysfs_open_file(struct inode *inode, struct file *file)
 	 * and readable regular files are the vast majority anyway.
 	 */
 	if (sysfs_is_bin(attr_sd))
-		error = single_open(file, NULL, of);
+		error = seq_open(file, NULL);
 	else
-		error = single_open(file, sysfs_seq_show, of);
+		error = seq_open(file, &kernfs_seq_ops);
 	if (error)
 		goto err_free;
 
+	((struct seq_file *)file->private_data)->private = of;
+
 	/* seq_file clears PWRITE unconditionally, restore it if WRITE */
 	if (file->f_mode & FMODE_WRITE)
 		file->f_mode |= FMODE_PWRITE;
@@ -680,7 +736,7 @@ static int sysfs_open_file(struct inode *inode, struct file *file)
 	return 0;
 
 err_close:
-	single_release(inode, file);
+	seq_release(inode, file);
 err_free:
 	kfree(of);
 err_out:
@@ -694,7 +750,7 @@ static int sysfs_release(struct inode *inode, struct file *filp)
 	struct sysfs_open_file *of = sysfs_of(filp);
 
 	sysfs_put_open_dirent(sd, of);
-	single_release(inode, filp);
+	seq_release(inode, filp);
 	kfree(of);
 
 	return 0;
@@ -799,7 +855,7 @@ void sysfs_notify(struct kobject *k, const char *dir, const char *attr)
 EXPORT_SYMBOL_GPL(sysfs_notify);
 
 const struct file_operations sysfs_file_operations = {
-	.read		= seq_read,
+	.read		= kernfs_file_read,
 	.write		= sysfs_write_file,
 	.llseek		= seq_lseek,
 	.open		= sysfs_open_file,
@@ -808,7 +864,7 @@ const struct file_operations sysfs_file_operations = {
 };
 
 const struct file_operations sysfs_bin_operations = {
-	.read		= sysfs_bin_read,
+	.read		= kernfs_file_read,
 	.write		= sysfs_write_file,
 	.llseek		= generic_file_llseek,
 	.mmap		= sysfs_bin_mmap,
-- 
1.8.3.1


^ permalink raw reply related	[flat|nested] 46+ messages in thread

* [PATCH 15/34] sysfs, kernfs: prepare write path for kernfs
  2013-10-24 15:49 Subject: [PATCHSET driver-core-next] sysfs: separate out kernfs, part #1 Tejun Heo
                   ` (13 preceding siblings ...)
  2013-10-24 15:49 ` [PATCH 14/34] sysfs, kernfs: prepare read path for kernfs Tejun Heo
@ 2013-10-24 15:49 ` Tejun Heo
  2013-10-24 15:49 ` [PATCH 16/34] sysfs, kernfs: prepare llseek " Tejun Heo
                   ` (20 subsequent siblings)
  35 siblings, 0 replies; 46+ messages in thread
From: Tejun Heo @ 2013-10-24 15:49 UTC (permalink / raw)
  To: gregkh; +Cc: kay, linux-kernel, ebiederm, bhelgaas, Tejun Heo

We're in the process of separating out core sysfs functionality into
kernfs which will deal with sysfs_dirents directly.  This patch
rearranges write path so that the kernfs and sysfs parts are separate.

kernfs_file_write() handles all boilerplate work including buffer
management and locking and invokes sysfs_kf_write() or
sysfs_kf_bin_write() depending on the file type which deals with the
interaction with kobj store or bin_attribute write method.

While this patch changes the order of some operations, it shouldn't
change any visible behavior.

Signed-off-by: Tejun Heo <tj@kernel.org>
---
 fs/sysfs/file.c | 103 +++++++++++++++++++++++++++-----------------------------
 1 file changed, 50 insertions(+), 53 deletions(-)

diff --git a/fs/sysfs/file.c b/fs/sysfs/file.c
index 0f93330..2736ea9 100644
--- a/fs/sysfs/file.c
+++ b/fs/sysfs/file.c
@@ -262,61 +262,50 @@ static ssize_t kernfs_file_read(struct file *file, char __user *user_buf,
 		return seq_read(file, user_buf, count, ppos);
 }
 
-/**
- * flush_write_buffer - push buffer to kobject
- * @of: open file
- * @buf: data buffer for file
- * @off: file offset to write to
- * @count: number of bytes
- *
- * Get the correct pointers for the kobject and the attribute we're dealing
- * with, then call the store() method for it with @buf.
- */
-static int flush_write_buffer(struct sysfs_open_file *of, char *buf, loff_t off,
-			      size_t count)
+/* kernfs write callback for regular sysfs files */
+static ssize_t sysfs_kf_write(struct sysfs_open_file *of, char *buf,
+			      size_t count, loff_t pos)
 {
+	const struct sysfs_ops *ops = sysfs_file_ops(of->sd);
 	struct kobject *kobj = of->sd->s_parent->priv;
-	int rc = 0;
 
-	/*
-	 * Need @of->sd for attr and ops, its parent for kobj.  @of->mutex
-	 * nests outside active ref and is just to ensure that the ops
-	 * aren't called concurrently for the same open file.
-	 */
-	mutex_lock(&of->mutex);
-	if (!sysfs_get_active(of->sd)) {
-		mutex_unlock(&of->mutex);
-		return -ENODEV;
-	}
+	if (!count)
+		return 0;
 
-	if (sysfs_is_bin(of->sd)) {
-		struct bin_attribute *battr = of->sd->priv;
+	return ops->store(kobj, of->sd->priv, buf, count);
+}
 
-		rc = -EIO;
-		if (battr->write)
-			rc = battr->write(of->file, kobj, battr, buf, off,
-					  count);
-	} else {
-		const struct sysfs_ops *ops = sysfs_file_ops(of->sd);
+/* kernfs write callback for bin sysfs files */
+static ssize_t sysfs_kf_bin_write(struct sysfs_open_file *of, char *buf,
+				  size_t count, loff_t pos)
+{
+	struct bin_attribute *battr = of->sd->priv;
+	struct kobject *kobj = of->sd->s_parent->priv;
+	loff_t size = file_inode(of->file)->i_size;
 
-		rc = ops->store(kobj, of->sd->priv, buf, count);
+	if (size) {
+		if (size <= pos)
+			return 0;
+		count = min_t(ssize_t, count, size - pos);
 	}
+	if (!count)
+		return 0;
 
-	sysfs_put_active(of->sd);
-	mutex_unlock(&of->mutex);
+	if (!battr->write)
+		return -EIO;
 
-	return rc;
+	return battr->write(of->file, kobj, battr, buf, pos, count);
 }
 
 /**
- * sysfs_write_file - write an attribute
+ * kernfs_file_write - kernfs vfs write callback
  * @file: file pointer
  * @user_buf: data to write
  * @count: number of bytes
  * @ppos: starting offset
  *
- * Copy data in from userland and pass it to the matching
- * sysfs_ops->store() by invoking flush_write_buffer().
+ * Copy data in from userland and pass it to the matching kernfs write
+ * operation.
  *
  * There is no easy way for us to know if userspace is only doing a partial
  * write, so we don't support them. We expect the entire buffer to come on
@@ -324,23 +313,13 @@ static int flush_write_buffer(struct sysfs_open_file *of, char *buf, loff_t off,
  * modify only the the value you're changing, then write entire buffer
  * back.
  */
-static ssize_t sysfs_write_file(struct file *file, const char __user *user_buf,
-				size_t count, loff_t *ppos)
+static ssize_t kernfs_file_write(struct file *file, const char __user *user_buf,
+				 size_t count, loff_t *ppos)
 {
 	struct sysfs_open_file *of = sysfs_of(file);
 	ssize_t len = min_t(size_t, count, PAGE_SIZE);
-	loff_t size = file_inode(file)->i_size;
 	char *buf;
 
-	if (sysfs_is_bin(of->sd) && size) {
-		if (size <= *ppos)
-			return 0;
-		len = min_t(ssize_t, len, size - *ppos);
-	}
-
-	if (!len)
-		return 0;
-
 	buf = kmalloc(len + 1, GFP_KERNEL);
 	if (!buf)
 		return -ENOMEM;
@@ -351,7 +330,25 @@ static ssize_t sysfs_write_file(struct file *file, const char __user *user_buf,
 	}
 	buf[len] = '\0';	/* guarantee string termination */
 
-	len = flush_write_buffer(of, buf, *ppos, len);
+	/*
+	 * @of->mutex nests outside active ref and is just to ensure that
+	 * the ops aren't called concurrently for the same open file.
+	 */
+	mutex_lock(&of->mutex);
+	if (!sysfs_get_active(of->sd)) {
+		mutex_unlock(&of->mutex);
+		len = -ENODEV;
+		goto out_free;
+	}
+
+	if (sysfs_is_bin(of->sd))
+		len = sysfs_kf_bin_write(of, buf, len, *ppos);
+	else
+		len = sysfs_kf_write(of, buf, len, *ppos);
+
+	sysfs_put_active(of->sd);
+	mutex_unlock(&of->mutex);
+
 	if (len > 0)
 		*ppos += len;
 out_free:
@@ -856,7 +853,7 @@ EXPORT_SYMBOL_GPL(sysfs_notify);
 
 const struct file_operations sysfs_file_operations = {
 	.read		= kernfs_file_read,
-	.write		= sysfs_write_file,
+	.write		= kernfs_file_write,
 	.llseek		= seq_lseek,
 	.open		= sysfs_open_file,
 	.release	= sysfs_release,
@@ -865,7 +862,7 @@ const struct file_operations sysfs_file_operations = {
 
 const struct file_operations sysfs_bin_operations = {
 	.read		= kernfs_file_read,
-	.write		= sysfs_write_file,
+	.write		= kernfs_file_write,
 	.llseek		= generic_file_llseek,
 	.mmap		= sysfs_bin_mmap,
 	.open		= sysfs_open_file,
-- 
1.8.3.1


^ permalink raw reply related	[flat|nested] 46+ messages in thread

* [PATCH 16/34] sysfs, kernfs: prepare llseek path for kernfs
  2013-10-24 15:49 Subject: [PATCHSET driver-core-next] sysfs: separate out kernfs, part #1 Tejun Heo
                   ` (14 preceding siblings ...)
  2013-10-24 15:49 ` [PATCH 15/34] sysfs, kernfs: prepare write " Tejun Heo
@ 2013-10-24 15:49 ` Tejun Heo
  2013-10-24 15:49 ` [PATCH 17/34] sysfs, kernfs: prepare mmap " Tejun Heo
                   ` (19 subsequent siblings)
  35 siblings, 0 replies; 46+ messages in thread
From: Tejun Heo @ 2013-10-24 15:49 UTC (permalink / raw)
  To: gregkh; +Cc: kay, linux-kernel, ebiederm, bhelgaas, Tejun Heo

We're in the process of separating out core sysfs functionality into
kernfs which will deal with sysfs_dirents directly.  This patch
introduces kernfs_file_llseek() which invokes either
generic_file_llseek() or seq_lseek() depending on the file type so
that both regular and bin files eventually can use the same
file_operations, which is necessary to separate out kernfs.

This patch doesn't introduce any behavior changes.

Signed-off-by: Tejun Heo <tj@kernel.org>
---
 fs/sysfs/file.c | 14 ++++++++++++--
 1 file changed, 12 insertions(+), 2 deletions(-)

diff --git a/fs/sysfs/file.c b/fs/sysfs/file.c
index 2736ea9..4f99da5 100644
--- a/fs/sysfs/file.c
+++ b/fs/sysfs/file.c
@@ -356,6 +356,16 @@ out_free:
 	return len;
 }
 
+static loff_t kernfs_file_llseek(struct file *file, loff_t off, int whence)
+{
+	struct sysfs_open_file *of = sysfs_of(file);
+
+	if (sysfs_is_bin(of->sd))
+		return generic_file_llseek(file, off, whence);
+	else
+		return seq_lseek(file, off, whence);
+}
+
 static void sysfs_bin_vma_open(struct vm_area_struct *vma)
 {
 	struct file *file = vma->vm_file;
@@ -854,7 +864,7 @@ EXPORT_SYMBOL_GPL(sysfs_notify);
 const struct file_operations sysfs_file_operations = {
 	.read		= kernfs_file_read,
 	.write		= kernfs_file_write,
-	.llseek		= seq_lseek,
+	.llseek		= kernfs_file_llseek,
 	.open		= sysfs_open_file,
 	.release	= sysfs_release,
 	.poll		= sysfs_poll,
@@ -863,7 +873,7 @@ const struct file_operations sysfs_file_operations = {
 const struct file_operations sysfs_bin_operations = {
 	.read		= kernfs_file_read,
 	.write		= kernfs_file_write,
-	.llseek		= generic_file_llseek,
+	.llseek		= kernfs_file_llseek,
 	.mmap		= sysfs_bin_mmap,
 	.open		= sysfs_open_file,
 	.release	= sysfs_release,
-- 
1.8.3.1


^ permalink raw reply related	[flat|nested] 46+ messages in thread

* [PATCH 17/34] sysfs, kernfs: prepare mmap path for kernfs
  2013-10-24 15:49 Subject: [PATCHSET driver-core-next] sysfs: separate out kernfs, part #1 Tejun Heo
                   ` (15 preceding siblings ...)
  2013-10-24 15:49 ` [PATCH 16/34] sysfs, kernfs: prepare llseek " Tejun Heo
@ 2013-10-24 15:49 ` Tejun Heo
  2013-10-24 15:49 ` [PATCH 18/34] sysfs, kernfs: prepare open, release, poll paths " Tejun Heo
                   ` (18 subsequent siblings)
  35 siblings, 0 replies; 46+ messages in thread
From: Tejun Heo @ 2013-10-24 15:49 UTC (permalink / raw)
  To: gregkh; +Cc: kay, linux-kernel, ebiederm, bhelgaas, Tejun Heo

We're in the process of separating out core sysfs functionality into
kernfs which will deal with sysfs_dirents directly.  This patch
rearranges mmap path so that the kernfs and sysfs parts are separate.

sysfs_kf_bin_mmap() which handles the interaction with bin_attribute
mmap method is factored out of sysfs_bin_mmap(), which is renamed to
kernfs_file_mmap().  All vma ops are renamed accordingly.

sysfs_bin_mmap() is updated such that it can be used for both file
types.  This will eventually allow using the same file_operations for
both file types, which is necessary to separate out kernfs.

This patch doesn't introduce any behavior changes.

Signed-off-by: Tejun Heo <tj@kernel.org>
---
 fs/sysfs/file.c | 70 ++++++++++++++++++++++++++++++++-------------------------
 1 file changed, 39 insertions(+), 31 deletions(-)

diff --git a/fs/sysfs/file.c b/fs/sysfs/file.c
index 4f99da5..0fdc543 100644
--- a/fs/sysfs/file.c
+++ b/fs/sysfs/file.c
@@ -366,7 +366,19 @@ static loff_t kernfs_file_llseek(struct file *file, loff_t off, int whence)
 		return seq_lseek(file, off, whence);
 }
 
-static void sysfs_bin_vma_open(struct vm_area_struct *vma)
+static int sysfs_kf_bin_mmap(struct sysfs_open_file *of,
+			     struct vm_area_struct *vma)
+{
+	struct bin_attribute *battr = of->sd->priv;
+	struct kobject *kobj = of->sd->s_parent->priv;
+
+	if (!battr->mmap)
+		return -EINVAL;		/* should be -ENODEV */
+
+	return battr->mmap(of->file, kobj, battr, vma);
+}
+
+static void kernfs_vma_open(struct vm_area_struct *vma)
 {
 	struct file *file = vma->vm_file;
 	struct sysfs_open_file *of = sysfs_of(file);
@@ -383,7 +395,7 @@ static void sysfs_bin_vma_open(struct vm_area_struct *vma)
 	sysfs_put_active(of->sd);
 }
 
-static int sysfs_bin_fault(struct vm_area_struct *vma, struct vm_fault *vmf)
+static int kernfs_vma_fault(struct vm_area_struct *vma, struct vm_fault *vmf)
 {
 	struct file *file = vma->vm_file;
 	struct sysfs_open_file *of = sysfs_of(file);
@@ -403,8 +415,8 @@ static int sysfs_bin_fault(struct vm_area_struct *vma, struct vm_fault *vmf)
 	return ret;
 }
 
-static int sysfs_bin_page_mkwrite(struct vm_area_struct *vma,
-				  struct vm_fault *vmf)
+static int kernfs_vma_page_mkwrite(struct vm_area_struct *vma,
+				   struct vm_fault *vmf)
 {
 	struct file *file = vma->vm_file;
 	struct sysfs_open_file *of = sysfs_of(file);
@@ -426,8 +438,8 @@ static int sysfs_bin_page_mkwrite(struct vm_area_struct *vma,
 	return ret;
 }
 
-static int sysfs_bin_access(struct vm_area_struct *vma, unsigned long addr,
-			    void *buf, int len, int write)
+static int kernfs_vma_access(struct vm_area_struct *vma, unsigned long addr,
+			     void *buf, int len, int write)
 {
 	struct file *file = vma->vm_file;
 	struct sysfs_open_file *of = sysfs_of(file);
@@ -448,8 +460,8 @@ static int sysfs_bin_access(struct vm_area_struct *vma, unsigned long addr,
 }
 
 #ifdef CONFIG_NUMA
-static int sysfs_bin_set_policy(struct vm_area_struct *vma,
-				struct mempolicy *new)
+static int kernfs_vma_set_policy(struct vm_area_struct *vma,
+				 struct mempolicy *new)
 {
 	struct file *file = vma->vm_file;
 	struct sysfs_open_file *of = sysfs_of(file);
@@ -469,8 +481,8 @@ static int sysfs_bin_set_policy(struct vm_area_struct *vma,
 	return ret;
 }
 
-static struct mempolicy *sysfs_bin_get_policy(struct vm_area_struct *vma,
-					      unsigned long addr)
+static struct mempolicy *kernfs_vma_get_policy(struct vm_area_struct *vma,
+					       unsigned long addr)
 {
 	struct file *file = vma->vm_file;
 	struct sysfs_open_file *of = sysfs_of(file);
@@ -490,8 +502,9 @@ static struct mempolicy *sysfs_bin_get_policy(struct vm_area_struct *vma,
 	return pol;
 }
 
-static int sysfs_bin_migrate(struct vm_area_struct *vma, const nodemask_t *from,
-			     const nodemask_t *to, unsigned long flags)
+static int kernfs_vma_migrate(struct vm_area_struct *vma,
+			      const nodemask_t *from, const nodemask_t *to,
+			      unsigned long flags)
 {
 	struct file *file = vma->vm_file;
 	struct sysfs_open_file *of = sysfs_of(file);
@@ -512,37 +525,31 @@ static int sysfs_bin_migrate(struct vm_area_struct *vma, const nodemask_t *from,
 }
 #endif
 
-static const struct vm_operations_struct sysfs_bin_vm_ops = {
-	.open		= sysfs_bin_vma_open,
-	.fault		= sysfs_bin_fault,
-	.page_mkwrite	= sysfs_bin_page_mkwrite,
-	.access		= sysfs_bin_access,
+static const struct vm_operations_struct kernfs_vm_ops = {
+	.open		= kernfs_vma_open,
+	.fault		= kernfs_vma_fault,
+	.page_mkwrite	= kernfs_vma_page_mkwrite,
+	.access		= kernfs_vma_access,
 #ifdef CONFIG_NUMA
-	.set_policy	= sysfs_bin_set_policy,
-	.get_policy	= sysfs_bin_get_policy,
-	.migrate	= sysfs_bin_migrate,
+	.set_policy	= kernfs_vma_set_policy,
+	.get_policy	= kernfs_vma_get_policy,
+	.migrate	= kernfs_vma_migrate,
 #endif
 };
 
-static int sysfs_bin_mmap(struct file *file, struct vm_area_struct *vma)
+static int kernfs_file_mmap(struct file *file, struct vm_area_struct *vma)
 {
 	struct sysfs_open_file *of = sysfs_of(file);
-	struct bin_attribute *battr = of->sd->priv;
-	struct kobject *kobj = of->sd->s_parent->priv;
 	int rc;
 
 	mutex_lock(&of->mutex);
 
-	/* need of->sd for battr, its parent for kobj */
 	rc = -ENODEV;
 	if (!sysfs_get_active(of->sd))
 		goto out_unlock;
 
-	rc = -EINVAL;
-	if (!battr->mmap)
-		goto out_put;
-
-	rc = battr->mmap(file, kobj, battr, vma);
+	if (sysfs_is_bin(of->sd))
+		rc = sysfs_kf_bin_mmap(of, vma);
 	if (rc)
 		goto out_put;
 
@@ -569,7 +576,7 @@ static int sysfs_bin_mmap(struct file *file, struct vm_area_struct *vma)
 	rc = 0;
 	of->mmapped = 1;
 	of->vm_ops = vma->vm_ops;
-	vma->vm_ops = &sysfs_bin_vm_ops;
+	vma->vm_ops = &kernfs_vm_ops;
 out_put:
 	sysfs_put_active(of->sd);
 out_unlock:
@@ -865,6 +872,7 @@ const struct file_operations sysfs_file_operations = {
 	.read		= kernfs_file_read,
 	.write		= kernfs_file_write,
 	.llseek		= kernfs_file_llseek,
+	.mmap		= kernfs_file_mmap,
 	.open		= sysfs_open_file,
 	.release	= sysfs_release,
 	.poll		= sysfs_poll,
@@ -874,7 +882,7 @@ const struct file_operations sysfs_bin_operations = {
 	.read		= kernfs_file_read,
 	.write		= kernfs_file_write,
 	.llseek		= kernfs_file_llseek,
-	.mmap		= sysfs_bin_mmap,
+	.mmap		= kernfs_file_mmap,
 	.open		= sysfs_open_file,
 	.release	= sysfs_release,
 	.poll		= sysfs_poll,
-- 
1.8.3.1


^ permalink raw reply related	[flat|nested] 46+ messages in thread

* [PATCH 18/34] sysfs, kernfs: prepare open, release, poll paths for kernfs
  2013-10-24 15:49 Subject: [PATCHSET driver-core-next] sysfs: separate out kernfs, part #1 Tejun Heo
                   ` (16 preceding siblings ...)
  2013-10-24 15:49 ` [PATCH 17/34] sysfs, kernfs: prepare mmap " Tejun Heo
@ 2013-10-24 15:49 ` Tejun Heo
  2013-10-24 15:49 ` [PATCH 19/34] sysfs, kernfs: move sysfs_open_file to include/linux/kernfs.h Tejun Heo
                   ` (17 subsequent siblings)
  35 siblings, 0 replies; 46+ messages in thread
From: Tejun Heo @ 2013-10-24 15:49 UTC (permalink / raw)
  To: gregkh; +Cc: kay, linux-kernel, ebiederm, bhelgaas, Tejun Heo

We're in the process of separating out core sysfs functionality into
kernfs which will deal with sysfs_dirents directly.  This patch
prepares the rest - open, release and poll.  There isn't much to do.
Just renaming is enough.  As sysfs_file_operations and
sysfs_bin_operations are identical now, use the same file_operations
for both - kernfs_file_operations.

This patch doesn't introduce any behavior changes.

Signed-off-by: Tejun Heo <tj@kernel.org>
---
 fs/sysfs/file.c  | 24 +++++++-----------------
 fs/sysfs/inode.c |  4 ++--
 fs/sysfs/sysfs.h |  3 +--
 3 files changed, 10 insertions(+), 21 deletions(-)

diff --git a/fs/sysfs/file.c b/fs/sysfs/file.c
index 0fdc543..e9aea35 100644
--- a/fs/sysfs/file.c
+++ b/fs/sysfs/file.c
@@ -673,7 +673,7 @@ static void sysfs_put_open_dirent(struct sysfs_dirent *sd,
 	kfree(od);
 }
 
-static int sysfs_open_file(struct inode *inode, struct file *file)
+static int kernfs_file_open(struct inode *inode, struct file *file)
 {
 	struct sysfs_dirent *attr_sd = file->f_path.dentry->d_fsdata;
 	struct kobject *kobj = attr_sd->s_parent->priv;
@@ -758,7 +758,7 @@ err_out:
 	return error;
 }
 
-static int sysfs_release(struct inode *inode, struct file *filp)
+static int kernfs_file_release(struct inode *inode, struct file *filp)
 {
 	struct sysfs_dirent *sd = filp->f_path.dentry->d_fsdata;
 	struct sysfs_open_file *of = sysfs_of(filp);
@@ -809,7 +809,7 @@ void sysfs_unmap_bin_file(struct sysfs_dirent *sd)
  * to see if it supports poll (Neither 'poll' nor 'select' return
  * an appropriate error code).  When in doubt, set a suitable timeout value.
  */
-static unsigned int sysfs_poll(struct file *filp, poll_table *wait)
+static unsigned int kernfs_file_poll(struct file *filp, poll_table *wait)
 {
 	struct sysfs_open_file *of = sysfs_of(filp);
 	struct sysfs_dirent *attr_sd = filp->f_path.dentry->d_fsdata;
@@ -868,24 +868,14 @@ void sysfs_notify(struct kobject *k, const char *dir, const char *attr)
 }
 EXPORT_SYMBOL_GPL(sysfs_notify);
 
-const struct file_operations sysfs_file_operations = {
+const struct file_operations kernfs_file_operations = {
 	.read		= kernfs_file_read,
 	.write		= kernfs_file_write,
 	.llseek		= kernfs_file_llseek,
 	.mmap		= kernfs_file_mmap,
-	.open		= sysfs_open_file,
-	.release	= sysfs_release,
-	.poll		= sysfs_poll,
-};
-
-const struct file_operations sysfs_bin_operations = {
-	.read		= kernfs_file_read,
-	.write		= kernfs_file_write,
-	.llseek		= kernfs_file_llseek,
-	.mmap		= kernfs_file_mmap,
-	.open		= sysfs_open_file,
-	.release	= sysfs_release,
-	.poll		= sysfs_poll,
+	.open		= kernfs_file_open,
+	.release	= kernfs_file_release,
+	.poll		= kernfs_file_poll,
 };
 
 int sysfs_add_file_mode_ns(struct sysfs_dirent *dir_sd,
diff --git a/fs/sysfs/inode.c b/fs/sysfs/inode.c
index 81cc858..4c463da 100644
--- a/fs/sysfs/inode.c
+++ b/fs/sysfs/inode.c
@@ -272,12 +272,12 @@ static void sysfs_init_inode(struct sysfs_dirent *sd, struct inode *inode)
 		break;
 	case SYSFS_KOBJ_ATTR:
 		inode->i_size = PAGE_SIZE;
-		inode->i_fop = &sysfs_file_operations;
+		inode->i_fop = &kernfs_file_operations;
 		break;
 	case SYSFS_KOBJ_BIN_ATTR:
 		bin_attr = sd->priv;
 		inode->i_size = bin_attr->size;
-		inode->i_fop = &sysfs_bin_operations;
+		inode->i_fop = &kernfs_file_operations;
 		break;
 	case SYSFS_KOBJ_LINK:
 		inode->i_op = &sysfs_symlink_inode_operations;
diff --git a/fs/sysfs/sysfs.h b/fs/sysfs/sysfs.h
index 69a1162..e69ebbb 100644
--- a/fs/sysfs/sysfs.h
+++ b/fs/sysfs/sysfs.h
@@ -212,8 +212,7 @@ int sysfs_inode_init(void);
 /*
  * file.c
  */
-extern const struct file_operations sysfs_file_operations;
-extern const struct file_operations sysfs_bin_operations;
+extern const struct file_operations kernfs_file_operations;
 
 int sysfs_add_file(struct sysfs_dirent *dir_sd,
 		   const struct attribute *attr, int type);
-- 
1.8.3.1


^ permalink raw reply related	[flat|nested] 46+ messages in thread

* [PATCH 19/34] sysfs, kernfs: move sysfs_open_file to include/linux/kernfs.h
  2013-10-24 15:49 Subject: [PATCHSET driver-core-next] sysfs: separate out kernfs, part #1 Tejun Heo
                   ` (17 preceding siblings ...)
  2013-10-24 15:49 ` [PATCH 18/34] sysfs, kernfs: prepare open, release, poll paths " Tejun Heo
@ 2013-10-24 15:49 ` Tejun Heo
  2013-10-24 15:49 ` [PATCH 20/34] sysfs, kernfs: introduce kernfs_ops Tejun Heo
                   ` (16 subsequent siblings)
  35 siblings, 0 replies; 46+ messages in thread
From: Tejun Heo @ 2013-10-24 15:49 UTC (permalink / raw)
  To: gregkh; +Cc: kay, linux-kernel, ebiederm, bhelgaas, Tejun Heo

sysfs_open_file will be used as the primary handle for kernfs methods.
Move its definition from fs/sysfs/file.c to include/linux/kernfs.h and
mark the public and private fields.

This is pure relocation.

Signed-off-by: Tejun Heo <tj@kernel.org>
---
 fs/sysfs/file.c        | 11 -----------
 include/linux/kernfs.h | 18 ++++++++++++++++++
 2 files changed, 18 insertions(+), 11 deletions(-)

diff --git a/fs/sysfs/file.c b/fs/sysfs/file.c
index e9aea35..7a33634 100644
--- a/fs/sysfs/file.c
+++ b/fs/sysfs/file.c
@@ -47,17 +47,6 @@ struct sysfs_open_dirent {
 	struct list_head	files; /* goes through sysfs_open_file.list */
 };
 
-struct sysfs_open_file {
-	struct sysfs_dirent	*sd;
-	struct file		*file;
-	struct mutex		mutex;
-	int			event;
-	struct list_head	list;
-
-	bool			mmapped;
-	const struct vm_operations_struct *vm_ops;
-};
-
 static bool sysfs_is_bin(struct sysfs_dirent *sd)
 {
 	return sysfs_type(sd) == SYSFS_KOBJ_BIN_ATTR;
diff --git a/include/linux/kernfs.h b/include/linux/kernfs.h
index fab2597..6fbd226 100644
--- a/include/linux/kernfs.h
+++ b/include/linux/kernfs.h
@@ -8,12 +8,30 @@
 #define __LINUX_KERNFS_H
 
 #include <linux/kernel.h>
+#include <linux/list.h>
+#include <linux/mutex.h>
 
 struct file;
 struct iattr;
+struct seq_file;
+struct vm_area_struct;
 
 struct sysfs_dirent;
 
+struct sysfs_open_file {
+	/* published fields */
+	struct sysfs_dirent	*sd;
+	struct file		*file;
+
+	/* private fields, do not use outside kernfs proper */
+	struct mutex		mutex;
+	int			event;
+	struct list_head	list;
+
+	bool			mmapped;
+	const struct vm_operations_struct *vm_ops;
+};
+
 #ifdef CONFIG_SYSFS
 
 struct sysfs_dirent *kernfs_create_dir_ns(struct sysfs_dirent *parent,
-- 
1.8.3.1


^ permalink raw reply related	[flat|nested] 46+ messages in thread

* [PATCH 20/34] sysfs, kernfs: introduce kernfs_ops
  2013-10-24 15:49 Subject: [PATCHSET driver-core-next] sysfs: separate out kernfs, part #1 Tejun Heo
                   ` (18 preceding siblings ...)
  2013-10-24 15:49 ` [PATCH 19/34] sysfs, kernfs: move sysfs_open_file to include/linux/kernfs.h Tejun Heo
@ 2013-10-24 15:49 ` Tejun Heo
  2013-10-24 15:49 ` [PATCH 21/34] sysfs, kernfs: add sysfs_dirent->s_attr.size Tejun Heo
                   ` (15 subsequent siblings)
  35 siblings, 0 replies; 46+ messages in thread
From: Tejun Heo @ 2013-10-24 15:49 UTC (permalink / raw)
  To: gregkh; +Cc: kay, linux-kernel, ebiederm, bhelgaas, Tejun Heo

We're in the process of separating out core sysfs functionality into
kernfs which will deal with sysfs_dirents directly.  This patch
introduces kernfs_ops which hosts methods kernfs users implement and
updates fs/sysfs/file.c such that sysfs_kf_*() functions populate
kernfs_ops and kernfs_file_*() functions call the matching entries
from kernfs_ops.

kernfs_ops contains the following groups of methods.

* seq_show() - for kernfs files which use seq_file for reads.

* read() - for direct read implementations.  Used iff seq_show() is
  not implemented.

* write() - for writes.

* mmap() - for mmaps.

Notes:

* sysfs_elem_attr->ops is added so that kernfs_ops can be accessed
  from sysfs_dirent.  kernfs_ops() helper is added to verify locking
  and access the field.

* SYSFS_FLAG_HAS_(SEQ_SHOW|MMAP) added.  sd->s_attr->ops is accessible
  only while holding active_ref and there are cases where we want to
  take different actions depending on which ops are implemented.
  These two flags cache whether the two ops are implemented for those.

* kernfs_file_*() no longer test sysfs type but chooses different
  behaviors depending on which methods in kernfs_ops are implemented.
  The conversions are trivial except for the open path.  As
  kernfs_file_open() now decides whether to allow read/write accesses
  depending on the kernfs_ops implemented, the presence of methods in
  kobjs and attribute_bin should be propagated to kernfs_ops.
  sysfs_add_file_mode_ns() is updated so that it propagates presence /
  absence of the callbacks through _empty, _ro, _wo, _rw kernfs_ops.

Signed-off-by: Tejun Heo <tj@kernel.org>
---
 fs/sysfs/file.c        | 149 +++++++++++++++++++++++++++++++++++++------------
 fs/sysfs/sysfs.h       |   3 +
 include/linux/kernfs.h |  26 +++++++++
 3 files changed, 143 insertions(+), 35 deletions(-)

diff --git a/fs/sysfs/file.c b/fs/sysfs/file.c
index 7a33634..0a533b4 100644
--- a/fs/sysfs/file.c
+++ b/fs/sysfs/file.c
@@ -58,6 +58,17 @@ static struct sysfs_open_file *sysfs_of(struct file *file)
 }
 
 /*
+ * Determine the kernfs_ops for the given sysfs_dirent.  This function must
+ * be called while holding an active reference.
+ */
+static const struct kernfs_ops *kernfs_ops(struct sysfs_dirent *sd)
+{
+	if (!sysfs_ignore_lockdep(sd))
+		lockdep_assert_held(sd);
+	return sd->s_attr.ops;
+}
+
+/*
  * Determine ktype->sysfs_ops for the given sysfs_dirent.  This function
  * must be called while holding an active reference.
  */
@@ -175,7 +186,7 @@ static int kernfs_seq_show(struct seq_file *sf, void *v)
 
 	of->event = atomic_read(&of->sd->s_attr.open->event);
 
-	return sysfs_kf_seq_show(sf, v);
+	return of->sd->s_attr.ops->seq_show(sf, v);
 }
 
 static const struct seq_operations kernfs_seq_ops = {
@@ -196,6 +207,7 @@ static ssize_t kernfs_file_direct_read(struct sysfs_open_file *of,
 				       loff_t *ppos)
 {
 	ssize_t len = min_t(size_t, count, PAGE_SIZE);
+	const struct kernfs_ops *ops;
 	char *buf;
 
 	buf = kmalloc(len, GFP_KERNEL);
@@ -213,7 +225,11 @@ static ssize_t kernfs_file_direct_read(struct sysfs_open_file *of,
 		goto out_free;
 	}
 
-	len = sysfs_kf_bin_read(of, buf, len, *ppos);
+	ops = kernfs_ops(of->sd);
+	if (ops->read)
+		len = ops->read(of, buf, len, *ppos);
+	else
+		len = -EINVAL;
 
 	sysfs_put_active(of->sd);
 	mutex_unlock(&of->mutex);
@@ -245,10 +261,10 @@ static ssize_t kernfs_file_read(struct file *file, char __user *user_buf,
 {
 	struct sysfs_open_file *of = sysfs_of(file);
 
-	if (sysfs_is_bin(of->sd))
-		return kernfs_file_direct_read(of, user_buf, count, ppos);
-	else
+	if (of->sd->s_flags & SYSFS_FLAG_HAS_SEQ_SHOW)
 		return seq_read(file, user_buf, count, ppos);
+	else
+		return kernfs_file_direct_read(of, user_buf, count, ppos);
 }
 
 /* kernfs write callback for regular sysfs files */
@@ -307,6 +323,7 @@ static ssize_t kernfs_file_write(struct file *file, const char __user *user_buf,
 {
 	struct sysfs_open_file *of = sysfs_of(file);
 	ssize_t len = min_t(size_t, count, PAGE_SIZE);
+	const struct kernfs_ops *ops;
 	char *buf;
 
 	buf = kmalloc(len + 1, GFP_KERNEL);
@@ -330,10 +347,11 @@ static ssize_t kernfs_file_write(struct file *file, const char __user *user_buf,
 		goto out_free;
 	}
 
-	if (sysfs_is_bin(of->sd))
-		len = sysfs_kf_bin_write(of, buf, len, *ppos);
+	ops = kernfs_ops(of->sd);
+	if (ops->write)
+		len = ops->write(of, buf, len, *ppos);
 	else
-		len = sysfs_kf_write(of, buf, len, *ppos);
+		len = -EINVAL;
 
 	sysfs_put_active(of->sd);
 	mutex_unlock(&of->mutex);
@@ -349,10 +367,10 @@ static loff_t kernfs_file_llseek(struct file *file, loff_t off, int whence)
 {
 	struct sysfs_open_file *of = sysfs_of(file);
 
-	if (sysfs_is_bin(of->sd))
-		return generic_file_llseek(file, off, whence);
-	else
+	if (of->sd->s_flags & SYSFS_FLAG_HAS_SEQ_SHOW)
 		return seq_lseek(file, off, whence);
+	else
+		return generic_file_llseek(file, off, whence);
 }
 
 static int sysfs_kf_bin_mmap(struct sysfs_open_file *of,
@@ -529,6 +547,7 @@ static const struct vm_operations_struct kernfs_vm_ops = {
 static int kernfs_file_mmap(struct file *file, struct vm_area_struct *vma)
 {
 	struct sysfs_open_file *of = sysfs_of(file);
+	const struct kernfs_ops *ops;
 	int rc;
 
 	mutex_lock(&of->mutex);
@@ -537,8 +556,9 @@ static int kernfs_file_mmap(struct file *file, struct vm_area_struct *vma)
 	if (!sysfs_get_active(of->sd))
 		goto out_unlock;
 
-	if (sysfs_is_bin(of->sd))
-		rc = sysfs_kf_bin_mmap(of, vma);
+	ops = kernfs_ops(of->sd);
+	if (ops->mmap)
+		rc = ops->mmap(of, vma);
 	if (rc)
 		goto out_put;
 
@@ -665,32 +685,18 @@ static void sysfs_put_open_dirent(struct sysfs_dirent *sd,
 static int kernfs_file_open(struct inode *inode, struct file *file)
 {
 	struct sysfs_dirent *attr_sd = file->f_path.dentry->d_fsdata;
-	struct kobject *kobj = attr_sd->s_parent->priv;
+	const struct kernfs_ops *ops;
 	struct sysfs_open_file *of;
 	bool has_read, has_write;
 	int error = -EACCES;
 
-	/* need attr_sd for attr and ops, its parent for kobj */
 	if (!sysfs_get_active(attr_sd))
 		return -ENODEV;
 
-	if (sysfs_is_bin(attr_sd)) {
-		struct bin_attribute *battr = attr_sd->priv;
+	ops = kernfs_ops(attr_sd);
 
-		has_read = battr->read || battr->mmap;
-		has_write = battr->write || battr->mmap;
-	} else {
-		const struct sysfs_ops *ops = sysfs_file_ops(attr_sd);
-
-		/* every kobject with an attribute needs a ktype assigned */
-		if (WARN(!ops, KERN_ERR
-			 "missing sysfs attribute operations for kobject: %s\n",
-			 kobject_name(kobj)))
-			goto err_out;
-
-		has_read = ops->show;
-		has_write = ops->store;
-	}
+	has_read = ops->seq_show || ops->read || ops->mmap;
+	has_write = ops->write || ops->mmap;
 
 	/* check perms and supported operations */
 	if ((file->f_mode & FMODE_WRITE) &&
@@ -716,10 +722,10 @@ static int kernfs_file_open(struct inode *inode, struct file *file)
 	 * seq_file or is not requested.  This unifies private data access
 	 * and readable regular files are the vast majority anyway.
 	 */
-	if (sysfs_is_bin(attr_sd))
-		error = seq_open(file, NULL);
-	else
+	if (ops->seq_show)
 		error = seq_open(file, &kernfs_seq_ops);
+	else
+		error = seq_open(file, NULL);
 	if (error)
 		goto err_free;
 
@@ -764,7 +770,7 @@ void sysfs_unmap_bin_file(struct sysfs_dirent *sd)
 	struct sysfs_open_dirent *od;
 	struct sysfs_open_file *of;
 
-	if (!sysfs_is_bin(sd))
+	if (!(sd->s_flags & SYSFS_FLAG_HAS_MMAP))
 		return;
 
 	spin_lock_irq(&sysfs_open_dirent_lock);
@@ -867,23 +873,96 @@ const struct file_operations kernfs_file_operations = {
 	.poll		= kernfs_file_poll,
 };
 
+static const struct kernfs_ops sysfs_file_kfops_empty = {
+};
+
+static const struct kernfs_ops sysfs_file_kfops_ro = {
+	.seq_show	= sysfs_kf_seq_show,
+};
+
+static const struct kernfs_ops sysfs_file_kfops_wo = {
+	.write		= sysfs_kf_write,
+};
+
+static const struct kernfs_ops sysfs_file_kfops_rw = {
+	.seq_show	= sysfs_kf_seq_show,
+	.write		= sysfs_kf_write,
+};
+
+static const struct kernfs_ops sysfs_bin_kfops_ro = {
+	.read		= sysfs_kf_bin_read,
+};
+
+static const struct kernfs_ops sysfs_bin_kfops_wo = {
+	.write		= sysfs_kf_bin_write,
+};
+
+static const struct kernfs_ops sysfs_bin_kfops_rw = {
+	.read		= sysfs_kf_bin_read,
+	.write		= sysfs_kf_bin_write,
+	.mmap		= sysfs_kf_bin_mmap,
+};
+
 int sysfs_add_file_mode_ns(struct sysfs_dirent *dir_sd,
 			   const struct attribute *attr, int type,
 			   umode_t amode, const void *ns)
 {
 	umode_t mode = (amode & S_IALLUGO) | S_IFREG;
+	const struct kernfs_ops *ops;
 	struct sysfs_addrm_cxt acxt;
 	struct sysfs_dirent *sd;
 	int rc;
 
+	if (type == SYSFS_KOBJ_ATTR) {
+		struct kobject *kobj = dir_sd->priv;
+		const struct sysfs_ops *sysfs_ops = kobj->ktype->sysfs_ops;
+
+		/* every kobject with an attribute needs a ktype assigned */
+		if (WARN(!sysfs_ops, KERN_ERR
+			 "missing sysfs attribute operations for kobject: %s\n",
+			 kobject_name(kobj)))
+			return -EINVAL;
+
+		if (sysfs_ops->show && sysfs_ops->store)
+			ops = &sysfs_file_kfops_rw;
+		else if (sysfs_ops->show)
+			ops = &sysfs_file_kfops_ro;
+		else if (sysfs_ops->store)
+			ops = &sysfs_file_kfops_wo;
+		else
+			ops = &sysfs_file_kfops_empty;
+	} else {
+		struct bin_attribute *battr = (void *)attr;
+
+		if ((battr->read && battr->write) || battr->mmap)
+			ops = &sysfs_bin_kfops_rw;
+		else if (battr->read)
+			ops = &sysfs_bin_kfops_ro;
+		else if (battr->write)
+			ops = &sysfs_bin_kfops_wo;
+		else
+			ops = &sysfs_file_kfops_empty;
+	}
+
 	sd = sysfs_new_dirent(attr->name, mode, type);
 	if (!sd)
 		return -ENOMEM;
 
+	sd->s_attr.ops = ops;
 	sd->s_ns = ns;
 	sd->priv = (void *)attr;
 	sysfs_dirent_init_lockdep(sd);
 
+	/*
+	 * sd->s_attr.ops is accesible only while holding active ref.  We
+	 * need to know whether some ops are implemented outside active
+	 * ref.  Cache their existence in flags.
+	 */
+	if (ops->seq_show)
+		sd->s_flags |= SYSFS_FLAG_HAS_SEQ_SHOW;
+	if (ops->mmap)
+		sd->s_flags |= SYSFS_FLAG_HAS_MMAP;
+
 	sysfs_addrm_start(&acxt);
 	rc = sysfs_add_one(&acxt, sd, dir_sd);
 	sysfs_addrm_finish(&acxt);
diff --git a/fs/sysfs/sysfs.h b/fs/sysfs/sysfs.h
index e69ebbb..57e45ca 100644
--- a/fs/sysfs/sysfs.h
+++ b/fs/sysfs/sysfs.h
@@ -27,6 +27,7 @@ struct sysfs_elem_symlink {
 };
 
 struct sysfs_elem_attr {
+	const struct kernfs_ops	*ops;
 	struct sysfs_open_dirent *open;
 };
 
@@ -89,6 +90,8 @@ struct sysfs_dirent {
 #define SYSFS_FLAG_MASK			~SYSFS_TYPE_MASK
 #define SYSFS_FLAG_HAS_NS		0x01000
 #define SYSFS_FLAG_REMOVED		0x02000
+#define SYSFS_FLAG_HAS_SEQ_SHOW		0x04000
+#define SYSFS_FLAG_HAS_MMAP		0x08000
 
 static inline unsigned int sysfs_type(struct sysfs_dirent *sd)
 {
diff --git a/include/linux/kernfs.h b/include/linux/kernfs.h
index 6fbd226..ecf2d57 100644
--- a/include/linux/kernfs.h
+++ b/include/linux/kernfs.h
@@ -32,6 +32,32 @@ struct sysfs_open_file {
 	const struct vm_operations_struct *vm_ops;
 };
 
+struct kernfs_ops {
+	/*
+	 * Read is handled by either seq_file or raw_read().
+	 *
+	 * If seq_show() is present, seq_file path is active.  The behavior
+	 * is equivalent to single_open().  @sf->private points to the
+	 * associated sysfs_open_file.
+	 *
+	 * read() is bounced through kernel buffer and a read larger than
+	 * PAGE_SIZE results in partial operation of PAGE_SIZE.
+	 */
+	int (*seq_show)(struct seq_file *sf, void *v);
+
+	ssize_t (*read)(struct sysfs_open_file *of, char *buf, size_t bytes,
+			loff_t off);
+
+	/*
+	 * write() is bounced through kernel buffer and a write larger than
+	 * PAGE_SIZE results in partial operation of PAGE_SIZE.
+	 */
+	ssize_t (*write)(struct sysfs_open_file *of, char *buf, size_t bytes,
+			 loff_t off);
+
+	int (*mmap)(struct sysfs_open_file *of, struct vm_area_struct *vma);
+};
+
 #ifdef CONFIG_SYSFS
 
 struct sysfs_dirent *kernfs_create_dir_ns(struct sysfs_dirent *parent,
-- 
1.8.3.1


^ permalink raw reply related	[flat|nested] 46+ messages in thread

* [PATCH 21/34] sysfs, kernfs: add sysfs_dirent->s_attr.size
  2013-10-24 15:49 Subject: [PATCHSET driver-core-next] sysfs: separate out kernfs, part #1 Tejun Heo
                   ` (19 preceding siblings ...)
  2013-10-24 15:49 ` [PATCH 20/34] sysfs, kernfs: introduce kernfs_ops Tejun Heo
@ 2013-10-24 15:49 ` Tejun Heo
  2013-10-24 15:49 ` [PATCH 22/34] sysfs, kernfs: remove SYSFS_KOBJ_BIN_ATTR Tejun Heo
                   ` (14 subsequent siblings)
  35 siblings, 0 replies; 46+ messages in thread
From: Tejun Heo @ 2013-10-24 15:49 UTC (permalink / raw)
  To: gregkh; +Cc: kay, linux-kernel, ebiederm, bhelgaas, Tejun Heo

sysfs sets the size of regular files unconditionally at PAGE_SIZE and
takes the size of bin files from bin_attribute.  The latter is a
pretty bad interface which forces bin_attribute users to create a
separate copy of bin_attribute for each instance of the file -
e.g. pci resource files.

Add sysfs_dirent->s_attr.size so that the size can be specified
separately.  This unifies inode init paths of ATTR and BIN_ATTR
identical and allows for generic size handling for kernfs.

Unfortunately, this grows the size of sysfs_dirent by sizeof(loff_t).

Signed-off-by: Tejun Heo <tj@kernel.org>
---
 fs/sysfs/file.c  | 6 ++++++
 fs/sysfs/inode.c | 8 +-------
 fs/sysfs/sysfs.h | 1 +
 3 files changed, 8 insertions(+), 7 deletions(-)

diff --git a/fs/sysfs/file.c b/fs/sysfs/file.c
index 0a533b4..8dd2d96 100644
--- a/fs/sysfs/file.c
+++ b/fs/sysfs/file.c
@@ -911,6 +911,7 @@ int sysfs_add_file_mode_ns(struct sysfs_dirent *dir_sd,
 	const struct kernfs_ops *ops;
 	struct sysfs_addrm_cxt acxt;
 	struct sysfs_dirent *sd;
+	loff_t size;
 	int rc;
 
 	if (type == SYSFS_KOBJ_ATTR) {
@@ -931,6 +932,8 @@ int sysfs_add_file_mode_ns(struct sysfs_dirent *dir_sd,
 			ops = &sysfs_file_kfops_wo;
 		else
 			ops = &sysfs_file_kfops_empty;
+
+		size = PAGE_SIZE;
 	} else {
 		struct bin_attribute *battr = (void *)attr;
 
@@ -942,6 +945,8 @@ int sysfs_add_file_mode_ns(struct sysfs_dirent *dir_sd,
 			ops = &sysfs_bin_kfops_wo;
 		else
 			ops = &sysfs_file_kfops_empty;
+
+		size = battr->size;
 	}
 
 	sd = sysfs_new_dirent(attr->name, mode, type);
@@ -949,6 +954,7 @@ int sysfs_add_file_mode_ns(struct sysfs_dirent *dir_sd,
 		return -ENOMEM;
 
 	sd->s_attr.ops = ops;
+	sd->s_attr.size = size;
 	sd->s_ns = ns;
 	sd->priv = (void *)attr;
 	sysfs_dirent_init_lockdep(sd);
diff --git a/fs/sysfs/inode.c b/fs/sysfs/inode.c
index 4c463da..037a892 100644
--- a/fs/sysfs/inode.c
+++ b/fs/sysfs/inode.c
@@ -254,8 +254,6 @@ int sysfs_getattr(struct vfsmount *mnt, struct dentry *dentry,
 
 static void sysfs_init_inode(struct sysfs_dirent *sd, struct inode *inode)
 {
-	struct bin_attribute *bin_attr;
-
 	inode->i_private = sysfs_get(sd);
 	inode->i_mapping->a_ops = &sysfs_aops;
 	inode->i_mapping->backing_dev_info = &sysfs_backing_dev_info;
@@ -271,12 +269,8 @@ static void sysfs_init_inode(struct sysfs_dirent *sd, struct inode *inode)
 		inode->i_fop = &sysfs_dir_operations;
 		break;
 	case SYSFS_KOBJ_ATTR:
-		inode->i_size = PAGE_SIZE;
-		inode->i_fop = &kernfs_file_operations;
-		break;
 	case SYSFS_KOBJ_BIN_ATTR:
-		bin_attr = sd->priv;
-		inode->i_size = bin_attr->size;
+		inode->i_size = sd->s_attr.size;
 		inode->i_fop = &kernfs_file_operations;
 		break;
 	case SYSFS_KOBJ_LINK:
diff --git a/fs/sysfs/sysfs.h b/fs/sysfs/sysfs.h
index 57e45ca..bf5685e 100644
--- a/fs/sysfs/sysfs.h
+++ b/fs/sysfs/sysfs.h
@@ -29,6 +29,7 @@ struct sysfs_elem_symlink {
 struct sysfs_elem_attr {
 	const struct kernfs_ops	*ops;
 	struct sysfs_open_dirent *open;
+	loff_t			size;
 };
 
 struct sysfs_inode_attrs {
-- 
1.8.3.1


^ permalink raw reply related	[flat|nested] 46+ messages in thread

* [PATCH 22/34] sysfs, kernfs: remove SYSFS_KOBJ_BIN_ATTR
  2013-10-24 15:49 Subject: [PATCHSET driver-core-next] sysfs: separate out kernfs, part #1 Tejun Heo
                   ` (20 preceding siblings ...)
  2013-10-24 15:49 ` [PATCH 21/34] sysfs, kernfs: add sysfs_dirent->s_attr.size Tejun Heo
@ 2013-10-24 15:49 ` Tejun Heo
  2013-10-24 15:49 ` [PATCH 23/34] sysfs, kernfs: introduce kernfs_create_file[_ns]() Tejun Heo
                   ` (13 subsequent siblings)
  35 siblings, 0 replies; 46+ messages in thread
From: Tejun Heo @ 2013-10-24 15:49 UTC (permalink / raw)
  To: gregkh; +Cc: kay, linux-kernel, ebiederm, bhelgaas, Tejun Heo

After kernfs_ops and sysfs_dirent->s_attr.size addition, the
distinction between SYSFS_KOBJ_BIN_ATTR and SYSFS_KOBJ_ATTR is only
necessary while creating files to decide which kernfs_ops to use.
Afterwards, they behave exactly the same.

This patch removes SYSFS_KOBJ_BIN_ATTR along with sysfs_is_bin().
sysfs_add_file[_mode_ns]() are updated to take bool @is_bin instead of
@type.

This patch doesn't introduce any behavior changes.  This completely
isolates the distinction between the two sysfs file types in the sysfs
layer proper.

Signed-off-by: Tejun Heo <tj@kernel.org>
---
 fs/sysfs/file.c  | 23 ++++++++---------------
 fs/sysfs/group.c |  5 ++---
 fs/sysfs/inode.c |  1 -
 fs/sysfs/sysfs.h | 11 ++++-------
 4 files changed, 14 insertions(+), 26 deletions(-)

diff --git a/fs/sysfs/file.c b/fs/sysfs/file.c
index 8dd2d96..efa547d 100644
--- a/fs/sysfs/file.c
+++ b/fs/sysfs/file.c
@@ -47,11 +47,6 @@ struct sysfs_open_dirent {
 	struct list_head	files; /* goes through sysfs_open_file.list */
 };
 
-static bool sysfs_is_bin(struct sysfs_dirent *sd)
-{
-	return sysfs_type(sd) == SYSFS_KOBJ_BIN_ATTR;
-}
-
 static struct sysfs_open_file *sysfs_of(struct file *file)
 {
 	return ((struct seq_file *)file->private_data)->private;
@@ -904,7 +899,7 @@ static const struct kernfs_ops sysfs_bin_kfops_rw = {
 };
 
 int sysfs_add_file_mode_ns(struct sysfs_dirent *dir_sd,
-			   const struct attribute *attr, int type,
+			   const struct attribute *attr, bool is_bin,
 			   umode_t amode, const void *ns)
 {
 	umode_t mode = (amode & S_IALLUGO) | S_IFREG;
@@ -914,7 +909,7 @@ int sysfs_add_file_mode_ns(struct sysfs_dirent *dir_sd,
 	loff_t size;
 	int rc;
 
-	if (type == SYSFS_KOBJ_ATTR) {
+	if (!is_bin) {
 		struct kobject *kobj = dir_sd->priv;
 		const struct sysfs_ops *sysfs_ops = kobj->ktype->sysfs_ops;
 
@@ -949,7 +944,7 @@ int sysfs_add_file_mode_ns(struct sysfs_dirent *dir_sd,
 		size = battr->size;
 	}
 
-	sd = sysfs_new_dirent(attr->name, mode, type);
+	sd = sysfs_new_dirent(attr->name, mode, SYSFS_KOBJ_ATTR);
 	if (!sd)
 		return -ENOMEM;
 
@@ -979,11 +974,10 @@ int sysfs_add_file_mode_ns(struct sysfs_dirent *dir_sd,
 	return rc;
 }
 
-
 int sysfs_add_file(struct sysfs_dirent *dir_sd, const struct attribute *attr,
-		   int type)
+		   bool is_bin)
 {
-	return sysfs_add_file_mode_ns(dir_sd, attr, type, attr->mode, NULL);
+	return sysfs_add_file_mode_ns(dir_sd, attr, is_bin, attr->mode, NULL);
 }
 
 /**
@@ -997,8 +991,7 @@ int sysfs_create_file_ns(struct kobject *kobj, const struct attribute *attr,
 {
 	BUG_ON(!kobj || !kobj->sd || !attr);
 
-	return sysfs_add_file_mode_ns(kobj->sd, attr, SYSFS_KOBJ_ATTR,
-				      attr->mode, ns);
+	return sysfs_add_file_mode_ns(kobj->sd, attr, false, attr->mode, ns);
 
 }
 EXPORT_SYMBOL_GPL(sysfs_create_file_ns);
@@ -1037,7 +1030,7 @@ int sysfs_add_file_to_group(struct kobject *kobj,
 	if (!dir_sd)
 		return -ENOENT;
 
-	error = sysfs_add_file(dir_sd, attr, SYSFS_KOBJ_ATTR);
+	error = sysfs_add_file(dir_sd, attr, false);
 	sysfs_put(dir_sd);
 
 	return error;
@@ -1129,7 +1122,7 @@ int sysfs_create_bin_file(struct kobject *kobj,
 {
 	BUG_ON(!kobj || !kobj->sd || !attr);
 
-	return sysfs_add_file(kobj->sd, &attr->attr, SYSFS_KOBJ_BIN_ATTR);
+	return sysfs_add_file(kobj->sd, &attr->attr, true);
 }
 EXPORT_SYMBOL_GPL(sysfs_create_bin_file);
 
diff --git a/fs/sysfs/group.c b/fs/sysfs/group.c
index 065689d..9f65cd9 100644
--- a/fs/sysfs/group.c
+++ b/fs/sysfs/group.c
@@ -55,8 +55,7 @@ static int create_files(struct sysfs_dirent *dir_sd, struct kobject *kobj,
 				if (!mode)
 					continue;
 			}
-			error = sysfs_add_file_mode_ns(dir_sd, *attr,
-						       SYSFS_KOBJ_ATTR,
+			error = sysfs_add_file_mode_ns(dir_sd, *attr, false,
 						       (*attr)->mode | mode,
 						       NULL);
 			if (unlikely(error))
@@ -269,7 +268,7 @@ int sysfs_merge_group(struct kobject *kobj,
 		return -ENOENT;
 
 	for ((i = 0, attr = grp->attrs); *attr && !error; (++i, ++attr))
-		error = sysfs_add_file(dir_sd, *attr, SYSFS_KOBJ_ATTR);
+		error = sysfs_add_file(dir_sd, *attr, false);
 	if (error) {
 		while (--i >= 0)
 			kernfs_remove_by_name(dir_sd, (*--attr)->name);
diff --git a/fs/sysfs/inode.c b/fs/sysfs/inode.c
index 037a892..b3c717a 100644
--- a/fs/sysfs/inode.c
+++ b/fs/sysfs/inode.c
@@ -269,7 +269,6 @@ static void sysfs_init_inode(struct sysfs_dirent *sd, struct inode *inode)
 		inode->i_fop = &sysfs_dir_operations;
 		break;
 	case SYSFS_KOBJ_ATTR:
-	case SYSFS_KOBJ_BIN_ATTR:
 		inode->i_size = sd->s_attr.size;
 		inode->i_fop = &kernfs_file_operations;
 		break;
diff --git a/fs/sysfs/sysfs.h b/fs/sysfs/sysfs.h
index bf5685e..d1102c0 100644
--- a/fs/sysfs/sysfs.h
+++ b/fs/sysfs/sysfs.h
@@ -83,10 +83,9 @@ struct sysfs_dirent {
 #define SYSFS_TYPE_MASK			0x00ff
 #define SYSFS_DIR			0x0001
 #define SYSFS_KOBJ_ATTR			0x0002
-#define SYSFS_KOBJ_BIN_ATTR		0x0004
 #define SYSFS_KOBJ_LINK			0x0008
 #define SYSFS_COPY_NAME			(SYSFS_DIR | SYSFS_KOBJ_LINK)
-#define SYSFS_ACTIVE_REF		(SYSFS_KOBJ_ATTR | SYSFS_KOBJ_BIN_ATTR)
+#define SYSFS_ACTIVE_REF		SYSFS_KOBJ_ATTR
 
 #define SYSFS_FLAG_MASK			~SYSFS_TYPE_MASK
 #define SYSFS_FLAG_HAS_NS		0x01000
@@ -115,10 +114,8 @@ do {								\
 static inline bool sysfs_ignore_lockdep(struct sysfs_dirent *sd)
 {
 	struct attribute *attr = sd->priv;
-	int type = sysfs_type(sd);
 
-	return (type == SYSFS_KOBJ_ATTR || type == SYSFS_KOBJ_BIN_ATTR) &&
-		attr->ignore_lockdep;
+	return sysfs_type(sd) == SYSFS_KOBJ_ATTR && attr->ignore_lockdep;
 }
 
 #else
@@ -219,10 +216,10 @@ int sysfs_inode_init(void);
 extern const struct file_operations kernfs_file_operations;
 
 int sysfs_add_file(struct sysfs_dirent *dir_sd,
-		   const struct attribute *attr, int type);
+		   const struct attribute *attr, bool is_bin);
 
 int sysfs_add_file_mode_ns(struct sysfs_dirent *dir_sd,
-			   const struct attribute *attr, int type,
+			   const struct attribute *attr, bool is_bin,
 			   umode_t amode, const void *ns);
 void sysfs_unmap_bin_file(struct sysfs_dirent *sd);
 
-- 
1.8.3.1


^ permalink raw reply related	[flat|nested] 46+ messages in thread

* [PATCH 23/34] sysfs, kernfs: introduce kernfs_create_file[_ns]()
  2013-10-24 15:49 Subject: [PATCHSET driver-core-next] sysfs: separate out kernfs, part #1 Tejun Heo
                   ` (21 preceding siblings ...)
  2013-10-24 15:49 ` [PATCH 22/34] sysfs, kernfs: remove SYSFS_KOBJ_BIN_ATTR Tejun Heo
@ 2013-10-24 15:49 ` Tejun Heo
  2013-10-24 15:49 ` [PATCH 24/34] sysfs, kernfs: remove sysfs_add_one() Tejun Heo
                   ` (12 subsequent siblings)
  35 siblings, 0 replies; 46+ messages in thread
From: Tejun Heo @ 2013-10-24 15:49 UTC (permalink / raw)
  To: gregkh; +Cc: kay, linux-kernel, ebiederm, bhelgaas, Tejun Heo

Introduce kernfs interface to create a file which takes and returns
sysfs_dirents.

The actual file creation part is separated out from
sysfs_add_file_mode_ns() into kernfs_create_file_ns().  The former now
only decides the kernfs_ops to use and the file's size and invokes the
latter.

This patch doesn't introduce behavior changes.

Signed-off-by: Tejun Heo <tj@kernel.org>
---
 fs/sysfs/file.c        | 53 +++++++++++++++++++++++++++++++++++++++-----------
 include/linux/kernfs.h | 18 +++++++++++++++++
 2 files changed, 60 insertions(+), 11 deletions(-)

diff --git a/fs/sysfs/file.c b/fs/sysfs/file.c
index efa547d..62d620a 100644
--- a/fs/sysfs/file.c
+++ b/fs/sysfs/file.c
@@ -900,14 +900,11 @@ static const struct kernfs_ops sysfs_bin_kfops_rw = {
 
 int sysfs_add_file_mode_ns(struct sysfs_dirent *dir_sd,
 			   const struct attribute *attr, bool is_bin,
-			   umode_t amode, const void *ns)
+			   umode_t mode, const void *ns)
 {
-	umode_t mode = (amode & S_IALLUGO) | S_IFREG;
 	const struct kernfs_ops *ops;
-	struct sysfs_addrm_cxt acxt;
 	struct sysfs_dirent *sd;
 	loff_t size;
-	int rc;
 
 	if (!is_bin) {
 		struct kobject *kobj = dir_sd->priv;
@@ -944,14 +941,47 @@ int sysfs_add_file_mode_ns(struct sysfs_dirent *dir_sd,
 		size = battr->size;
 	}
 
-	sd = sysfs_new_dirent(attr->name, mode, SYSFS_KOBJ_ATTR);
+	sd = kernfs_create_file_ns(dir_sd, attr->name, mode, size,
+				   ops, (void *)attr, ns);
+	if (IS_ERR(sd)) {
+		if (PTR_ERR(sd) == -EEXIST)
+			sysfs_warn_dup(dir_sd, attr->name);
+		return PTR_ERR(sd);
+	}
+	return 0;
+}
+
+/**
+ * kernfs_create_file_ns - create a file
+ * @parent: directory to create the file in
+ * @name: name of the file
+ * @mode: mode of the file
+ * @size: size of the file
+ * @ops: kernfs operations for the file
+ * @priv: private data for the file
+ * @ns: optional namespace tag of the file
+ *
+ * Returns the created node on success, ERR_PTR() value on error.
+ */
+struct sysfs_dirent *kernfs_create_file_ns(struct sysfs_dirent *parent,
+					   const char *name,
+					   umode_t mode, loff_t size,
+					   const struct kernfs_ops *ops,
+					   void *priv, const void *ns)
+{
+	struct sysfs_addrm_cxt acxt;
+	struct sysfs_dirent *sd;
+	int rc;
+
+	sd = sysfs_new_dirent(name, (mode & S_IALLUGO) | S_IFREG,
+			      SYSFS_KOBJ_ATTR);
 	if (!sd)
-		return -ENOMEM;
+		return ERR_PTR(-ENOMEM);
 
 	sd->s_attr.ops = ops;
 	sd->s_attr.size = size;
 	sd->s_ns = ns;
-	sd->priv = (void *)attr;
+	sd->priv = priv;
 	sysfs_dirent_init_lockdep(sd);
 
 	/*
@@ -965,13 +995,14 @@ int sysfs_add_file_mode_ns(struct sysfs_dirent *dir_sd,
 		sd->s_flags |= SYSFS_FLAG_HAS_MMAP;
 
 	sysfs_addrm_start(&acxt);
-	rc = sysfs_add_one(&acxt, sd, dir_sd);
+	rc = __sysfs_add_one(&acxt, sd, parent);
 	sysfs_addrm_finish(&acxt);
 
-	if (rc)
+	if (rc) {
 		sysfs_put(sd);
-
-	return rc;
+		return ERR_PTR(rc);
+	}
+	return sd;
 }
 
 int sysfs_add_file(struct sysfs_dirent *dir_sd, const struct attribute *attr,
diff --git a/include/linux/kernfs.h b/include/linux/kernfs.h
index ecf2d57..a1b94b7 100644
--- a/include/linux/kernfs.h
+++ b/include/linux/kernfs.h
@@ -63,6 +63,11 @@ struct kernfs_ops {
 struct sysfs_dirent *kernfs_create_dir_ns(struct sysfs_dirent *parent,
 					  const char *name, void *priv,
 					  const void *ns);
+struct sysfs_dirent *kernfs_create_file_ns(struct sysfs_dirent *parent,
+					   const char *name,
+					   umode_t mode, loff_t size,
+					   const struct kernfs_ops *ops,
+					   void *priv, const void *ns);
 struct sysfs_dirent *kernfs_create_link(struct sysfs_dirent *parent,
 					const char *name,
 					struct sysfs_dirent *target);
@@ -81,6 +86,12 @@ kernfs_create_dir_ns(struct sysfs_dirent *parent, const char *name, void *priv,
 { return NULL; }
 
 static inline struct sysfs_dirent *
+kernfs_create_file_ns(struct sysfs_dirent *parent, const char *name,
+		      umode_t mode, loff_t size, const struct kernfs_ops *ops,
+		      void *priv, const void *ns)
+{ return NULL; }
+
+static inline struct sysfs_dirent *
 kernfs_create_link(struct sysfs_dirent *parent, const char *name,
 		   struct sysfs_dirent *target)
 { return NULL; }
@@ -108,6 +119,13 @@ kernfs_create_dir(struct sysfs_dirent *parent, const char *name, void *priv)
 	return kernfs_create_dir_ns(parent, name, priv, NULL);
 }
 
+static inline struct sysfs_dirent *
+kernfs_create_file(struct sysfs_dirent *parent, const char *name, umode_t mode,
+		   loff_t size, const struct kernfs_ops *ops, void *priv)
+{
+	return kernfs_create_file_ns(parent, name, mode, size, ops, priv, NULL);
+}
+
 static inline int kernfs_remove_by_name(struct sysfs_dirent *parent,
 					const char *name)
 {
-- 
1.8.3.1


^ permalink raw reply related	[flat|nested] 46+ messages in thread

* [PATCH 24/34] sysfs, kernfs: remove sysfs_add_one()
  2013-10-24 15:49 Subject: [PATCHSET driver-core-next] sysfs: separate out kernfs, part #1 Tejun Heo
                   ` (22 preceding siblings ...)
  2013-10-24 15:49 ` [PATCH 23/34] sysfs, kernfs: introduce kernfs_create_file[_ns]() Tejun Heo
@ 2013-10-24 15:49 ` Tejun Heo
  2013-10-24 15:49 ` [PATCH 25/34] sysfs, kernfs: add kernfs_ops->seq_{start|next|stop}() Tejun Heo
                   ` (11 subsequent siblings)
  35 siblings, 0 replies; 46+ messages in thread
From: Tejun Heo @ 2013-10-24 15:49 UTC (permalink / raw)
  To: gregkh; +Cc: kay, linux-kernel, ebiederm, bhelgaas, Tejun Heo

sysfs_add_one() is a wrapper around __sysfs_add_one() which prints out
duplicate name warning if __sysfs_add_one() fails with -EEXIST.  The
previous kernfs conversions moved all dup warnings to sysfs interface
functions and sysfs_add_one() doesn't have any user left.

Remove sysfs_add_one() and update __sysfs_add_one() to take its name.

This patch doesn't make any functional changes.

Signed-off-by: Tejun Heo <tj@kernel.org>
---
 fs/sysfs/dir.c     | 41 ++++-------------------------------------
 fs/sysfs/file.c    |  2 +-
 fs/sysfs/symlink.c |  2 +-
 fs/sysfs/sysfs.h   |  2 --
 4 files changed, 6 insertions(+), 41 deletions(-)

diff --git a/fs/sysfs/dir.c b/fs/sysfs/dir.c
index 0a1e1da..312e0d6 100644
--- a/fs/sysfs/dir.c
+++ b/fs/sysfs/dir.c
@@ -406,7 +406,7 @@ void sysfs_addrm_start(struct sysfs_addrm_cxt *acxt)
 }
 
 /**
- *	__sysfs_add_one - add sysfs_dirent to parent without warning
+ *	sysfs_add_one - add sysfs_dirent to parent without warning
  *	@acxt: addrm context to use
  *	@sd: sysfs_dirent to be added
  *	@parent_sd: the parent sysfs_dirent to add @sd to
@@ -426,8 +426,8 @@ void sysfs_addrm_start(struct sysfs_addrm_cxt *acxt)
  *	0 on success, -EEXIST if entry with the given name already
  *	exists.
  */
-int __sysfs_add_one(struct sysfs_addrm_cxt *acxt, struct sysfs_dirent *sd,
-		    struct sysfs_dirent *parent_sd)
+int sysfs_add_one(struct sysfs_addrm_cxt *acxt, struct sysfs_dirent *sd,
+		  struct sysfs_dirent *parent_sd)
 {
 	struct sysfs_inode_attrs *ps_iattr;
 	int ret;
@@ -491,39 +491,6 @@ void sysfs_warn_dup(struct sysfs_dirent *parent, const char *name)
 }
 
 /**
- *	sysfs_add_one - add sysfs_dirent to parent
- *	@acxt: addrm context to use
- *	@sd: sysfs_dirent to be added
- *	@parent_sd: the parent sysfs_dirent to add @sd to
- *
- *	Get @parent_sd and set @sd->s_parent to it and increment nlink of
- *	the parent inode if @sd is a directory and link into the children
- *	list of the parent.
- *
- *	This function should be called between calls to
- *	sysfs_addrm_start() and sysfs_addrm_finish() and should be
- *	passed the same @acxt as passed to sysfs_addrm_start().
- *
- *	LOCKING:
- *	Determined by sysfs_addrm_start().
- *
- *	RETURNS:
- *	0 on success, -EEXIST if entry with the given name already
- *	exists.
- */
-int sysfs_add_one(struct sysfs_addrm_cxt *acxt, struct sysfs_dirent *sd,
-		  struct sysfs_dirent *parent_sd)
-{
-	int ret;
-
-	ret = __sysfs_add_one(acxt, sd, parent_sd);
-
-	if (ret == -EEXIST)
-		sysfs_warn_dup(parent_sd, sd->s_name);
-	return ret;
-}
-
-/**
  *	sysfs_remove_one - remove sysfs_dirent from parent
  *	@acxt: addrm context to use
  *	@sd: sysfs_dirent to be removed
@@ -689,7 +656,7 @@ struct sysfs_dirent *kernfs_create_dir_ns(struct sysfs_dirent *parent,
 
 	/* link in */
 	sysfs_addrm_start(&acxt);
-	rc = __sysfs_add_one(&acxt, sd, parent);
+	rc = sysfs_add_one(&acxt, sd, parent);
 	sysfs_addrm_finish(&acxt);
 
 	if (!rc)
diff --git a/fs/sysfs/file.c b/fs/sysfs/file.c
index 62d620a..fa4d63d 100644
--- a/fs/sysfs/file.c
+++ b/fs/sysfs/file.c
@@ -995,7 +995,7 @@ struct sysfs_dirent *kernfs_create_file_ns(struct sysfs_dirent *parent,
 		sd->s_flags |= SYSFS_FLAG_HAS_MMAP;
 
 	sysfs_addrm_start(&acxt);
-	rc = __sysfs_add_one(&acxt, sd, parent);
+	rc = sysfs_add_one(&acxt, sd, parent);
 	sysfs_addrm_finish(&acxt);
 
 	if (rc) {
diff --git a/fs/sysfs/symlink.c b/fs/sysfs/symlink.c
index b44d2ce..5600560 100644
--- a/fs/sysfs/symlink.c
+++ b/fs/sysfs/symlink.c
@@ -46,7 +46,7 @@ struct sysfs_dirent *kernfs_create_link(struct sysfs_dirent *parent,
 	sysfs_get(target);	/* ref owned by symlink */
 
 	sysfs_addrm_start(&acxt);
-	error = __sysfs_add_one(&acxt, sd, parent);
+	error = sysfs_add_one(&acxt, sd, parent);
 	sysfs_addrm_finish(&acxt);
 
 	if (!error)
diff --git a/fs/sysfs/sysfs.h b/fs/sysfs/sysfs.h
index d1102c0..dce319c 100644
--- a/fs/sysfs/sysfs.h
+++ b/fs/sysfs/sysfs.h
@@ -167,8 +167,6 @@ struct sysfs_dirent *sysfs_get_active(struct sysfs_dirent *sd);
 void sysfs_put_active(struct sysfs_dirent *sd);
 void sysfs_addrm_start(struct sysfs_addrm_cxt *acxt);
 void sysfs_warn_dup(struct sysfs_dirent *parent, const char *name);
-int __sysfs_add_one(struct sysfs_addrm_cxt *acxt, struct sysfs_dirent *sd,
-		    struct sysfs_dirent *parent_sd);
 int sysfs_add_one(struct sysfs_addrm_cxt *acxt, struct sysfs_dirent *sd,
 		  struct sysfs_dirent *parent_sd);
 void sysfs_addrm_finish(struct sysfs_addrm_cxt *acxt);
-- 
1.8.3.1


^ permalink raw reply related	[flat|nested] 46+ messages in thread

* [PATCH 25/34] sysfs, kernfs: add kernfs_ops->seq_{start|next|stop}()
  2013-10-24 15:49 Subject: [PATCHSET driver-core-next] sysfs: separate out kernfs, part #1 Tejun Heo
                   ` (23 preceding siblings ...)
  2013-10-24 15:49 ` [PATCH 24/34] sysfs, kernfs: remove sysfs_add_one() Tejun Heo
@ 2013-10-24 15:49 ` Tejun Heo
  2013-10-28 13:04   ` [PATCH v2 " Tejun Heo
  2013-10-24 15:49 ` [PATCH 26/34] sysfs, kernfs: introduce kernfs_notify() Tejun Heo
                   ` (10 subsequent siblings)
  35 siblings, 1 reply; 46+ messages in thread
From: Tejun Heo @ 2013-10-24 15:49 UTC (permalink / raw)
  To: gregkh; +Cc: kay, linux-kernel, ebiederm, bhelgaas, Tejun Heo

kernfs_ops currently only supports single_open() behavior which is
pretty restrictive.  Add optional callbacks ->seq_{start|next|stop}()
which, when implemented, are invoked for seq_file traversal.  This
allows full seq_file functionality for kernfs users.  This currently
doesn't have any user and doesn't change any behavior.

Signed-off-by: Tejun Heo <tj@kernel.org>
---
 fs/sysfs/file.c        | 23 +++++++++++++++++++----
 include/linux/kernfs.h |  9 +++++++--
 2 files changed, 26 insertions(+), 6 deletions(-)

diff --git a/fs/sysfs/file.c b/fs/sysfs/file.c
index fa4d63d..89134fc 100644
--- a/fs/sysfs/file.c
+++ b/fs/sysfs/file.c
@@ -146,6 +146,7 @@ static ssize_t sysfs_kf_bin_read(struct sysfs_open_file *of, char *buf,
 static void *kernfs_seq_start(struct seq_file *sf, loff_t *ppos)
 {
 	struct sysfs_open_file *of = sf->private;
+	const struct kernfs_ops *ops;
 
 	/*
 	 * @of->mutex nests outside active ref and is just to ensure that
@@ -157,19 +158,33 @@ static void *kernfs_seq_start(struct seq_file *sf, loff_t *ppos)
 		return ERR_PTR(-ENODEV);
 	}
 
-	/* the same behavior as single_open() */
-	return NULL + !*ppos;
+	ops = kernfs_ops(of->sd);
+	if (ops->seq_start)
+		return ops->seq_start(sf, ppos);
+	else
+		return NULL + !*ppos;	/* the same behavior as single_open() */
 }
 
 static void *kernfs_seq_next(struct seq_file *sf, void *v, loff_t *ppos)
 {
-	++*ppos;
-	return NULL;
+	struct sysfs_open_file *of = sf->private;
+	const struct kernfs_ops *ops = kernfs_ops(of->sd);
+
+	if (ops->seq_next) {
+		return ops->seq_next(sf, v, ppos);
+	} else {
+		++*ppos;
+		return NULL;
+	}
 }
 
 static void kernfs_seq_stop(struct seq_file *sf, void *v)
 {
 	struct sysfs_open_file *of = sf->private;
+	const struct kernfs_ops *ops = kernfs_ops(of->sd);
+
+	if (ops->seq_stop)
+		ops->seq_stop(sf, v);
 
 	sysfs_put_active(of->sd);
 	mutex_unlock(&of->mutex);
diff --git a/include/linux/kernfs.h b/include/linux/kernfs.h
index a1b94b7..222040c 100644
--- a/include/linux/kernfs.h
+++ b/include/linux/kernfs.h
@@ -36,8 +36,9 @@ struct kernfs_ops {
 	/*
 	 * Read is handled by either seq_file or raw_read().
 	 *
-	 * If seq_show() is present, seq_file path is active.  The behavior
-	 * is equivalent to single_open().  @sf->private points to the
+	 * If seq_show() is present, seq_file path is active.  Other seq
+	 * operations are optional and if not implemented, the behavior is
+	 * equivalent to single_open().  @sf->private points to the
 	 * associated sysfs_open_file.
 	 *
 	 * read() is bounced through kernel buffer and a read larger than
@@ -45,6 +46,10 @@ struct kernfs_ops {
 	 */
 	int (*seq_show)(struct seq_file *sf, void *v);
 
+	void *(*seq_start)(struct seq_file *sf, loff_t *ppos);
+	void *(*seq_next)(struct seq_file *sf, void *v, loff_t *ppos);
+	void (*seq_stop)(struct seq_file *sf, void *v);
+
 	ssize_t (*read)(struct sysfs_open_file *of, char *buf, size_t bytes,
 			loff_t off);
 
-- 
1.8.3.1


^ permalink raw reply related	[flat|nested] 46+ messages in thread

* [PATCH 26/34] sysfs, kernfs: introduce kernfs_notify()
  2013-10-24 15:49 Subject: [PATCHSET driver-core-next] sysfs: separate out kernfs, part #1 Tejun Heo
                   ` (24 preceding siblings ...)
  2013-10-24 15:49 ` [PATCH 25/34] sysfs, kernfs: add kernfs_ops->seq_{start|next|stop}() Tejun Heo
@ 2013-10-24 15:49 ` Tejun Heo
  2013-10-24 15:49 ` [PATCH 27/34] sysfs, kernfs: reorganize SYSFS_* constants Tejun Heo
                   ` (9 subsequent siblings)
  35 siblings, 0 replies; 46+ messages in thread
From: Tejun Heo @ 2013-10-24 15:49 UTC (permalink / raw)
  To: gregkh; +Cc: kay, linux-kernel, ebiederm, bhelgaas, Tejun Heo

Introduce kernfs interface to wake up poll(2) which takes and returns
sysfs_dirents.

sysfs_notify_dirent() is renamed to kernfs_notify() and sysfs_notify()
is updated so that it doesn't directly grab sysfs_mutex but acquires
the target sysfs_dirents using sysfs_get_dirent().
sysfs_notify_dirent() is reimplemented as a dumb inline wrapper around
kernfs_notify().

This patch doesn't introduce any behavior changes.

Signed-off-by: Tejun Heo <tj@kernel.org>
---
 fs/sysfs/file.c        | 33 ++++++++++++++++++++++-----------
 include/linux/kernfs.h |  3 +++
 include/linux/sysfs.h  |  9 +++++----
 3 files changed, 30 insertions(+), 15 deletions(-)

diff --git a/fs/sysfs/file.c b/fs/sysfs/file.c
index 89134fc..9271180 100644
--- a/fs/sysfs/file.c
+++ b/fs/sysfs/file.c
@@ -837,7 +837,13 @@ static unsigned int kernfs_file_poll(struct file *filp, poll_table *wait)
 	return DEFAULT_POLLMASK|POLLERR|POLLPRI;
 }
 
-void sysfs_notify_dirent(struct sysfs_dirent *sd)
+/**
+ * kernfs_notify - notify a kernfs file
+ * @sd: file to notify
+ *
+ * Notify @sd such that poll(2) on @sd wakes up.
+ */
+void kernfs_notify(struct sysfs_dirent *sd)
 {
 	struct sysfs_open_dirent *od;
 	unsigned long flags;
@@ -854,22 +860,27 @@ void sysfs_notify_dirent(struct sysfs_dirent *sd)
 
 	spin_unlock_irqrestore(&sysfs_open_dirent_lock, flags);
 }
-EXPORT_SYMBOL_GPL(sysfs_notify_dirent);
+EXPORT_SYMBOL_GPL(kernfs_notify);
 
 void sysfs_notify(struct kobject *k, const char *dir, const char *attr)
 {
-	struct sysfs_dirent *sd = k->sd;
-
-	mutex_lock(&sysfs_mutex);
+	struct sysfs_dirent *sd = k->sd, *tmp;
 
 	if (sd && dir)
-		sd = sysfs_find_dirent(sd, dir, NULL);
-	if (sd && attr)
-		sd = sysfs_find_dirent(sd, attr, NULL);
-	if (sd)
-		sysfs_notify_dirent(sd);
+		sd = sysfs_get_dirent(sd, dir);
+	else
+		sysfs_get(sd);
 
-	mutex_unlock(&sysfs_mutex);
+	if (sd && attr) {
+		tmp = sysfs_get_dirent(sd, attr);
+		sysfs_put(sd);
+		sd = tmp;
+	}
+
+	if (sd) {
+		kernfs_notify(sd);
+		sysfs_put(sd);
+	}
 }
 EXPORT_SYMBOL_GPL(sysfs_notify);
 
diff --git a/include/linux/kernfs.h b/include/linux/kernfs.h
index 222040c..89993ad 100644
--- a/include/linux/kernfs.h
+++ b/include/linux/kernfs.h
@@ -82,6 +82,7 @@ int kernfs_remove_by_name_ns(struct sysfs_dirent *parent, const char *name,
 int kernfs_rename_ns(struct sysfs_dirent *sd, struct sysfs_dirent *new_parent,
 		     const char *new_name, const void *new_ns);
 int kernfs_setattr(struct sysfs_dirent *sd, const struct iattr *iattr);
+void kernfs_notify(struct sysfs_dirent *sd);
 
 #else	/* CONFIG_SYSFS */
 
@@ -116,6 +117,8 @@ static inline int kernfs_setattr(struct sysfs_dirent *sd,
 				 const struct iattr *iattr)
 { return 0; }
 
+static inline void kernfs_notify(struct sysfs_dirent *sd) { }
+
 #endif	/* CONFIG_SYSFS */
 
 static inline struct sysfs_dirent *
diff --git a/include/linux/sysfs.h b/include/linux/sysfs.h
index 2bc735d..0ab2b02 100644
--- a/include/linux/sysfs.h
+++ b/include/linux/sysfs.h
@@ -243,7 +243,6 @@ void sysfs_remove_link_from_group(struct kobject *kobj, const char *group_name,
 				  const char *link_name);
 
 void sysfs_notify(struct kobject *kobj, const char *dir, const char *attr);
-void sysfs_notify_dirent(struct sysfs_dirent *sd);
 struct sysfs_dirent *sysfs_get_dirent_ns(struct sysfs_dirent *parent_sd,
 					 const unsigned char *name,
 					 const void *ns);
@@ -418,9 +417,6 @@ static inline void sysfs_notify(struct kobject *kobj, const char *dir,
 				const char *attr)
 {
 }
-static inline void sysfs_notify_dirent(struct sysfs_dirent *sd)
-{
-}
 static inline struct sysfs_dirent *
 sysfs_get_dirent_ns(struct sysfs_dirent *parent_sd, const unsigned char *name,
 		    const void *ns)
@@ -466,4 +462,9 @@ sysfs_get_dirent(struct sysfs_dirent *parent_sd, const unsigned char *name)
 	return sysfs_get_dirent_ns(parent_sd, name, NULL);
 }
 
+static inline void sysfs_notify_dirent(struct sysfs_dirent *sd)
+{
+	kernfs_notify(sd);
+}
+
 #endif /* _SYSFS_H_ */
-- 
1.8.3.1


^ permalink raw reply related	[flat|nested] 46+ messages in thread

* [PATCH 27/34] sysfs, kernfs: reorganize SYSFS_* constants
  2013-10-24 15:49 Subject: [PATCHSET driver-core-next] sysfs: separate out kernfs, part #1 Tejun Heo
                   ` (25 preceding siblings ...)
  2013-10-24 15:49 ` [PATCH 26/34] sysfs, kernfs: introduce kernfs_notify() Tejun Heo
@ 2013-10-24 15:49 ` Tejun Heo
  2013-10-24 15:49 ` [PATCH 28/34] sysfs, kernfs: revamp sysfs_dirent active_ref lockdep annotation Tejun Heo
                   ` (8 subsequent siblings)
  35 siblings, 0 replies; 46+ messages in thread
From: Tejun Heo @ 2013-10-24 15:49 UTC (permalink / raw)
  To: gregkh; +Cc: kay, linux-kernel, ebiederm, bhelgaas, Tejun Heo

We want to add one more SYSFS_FLAG_* but we can't use the next higher
bit, 0x10000, as the flag field is 16bits wide.  The flags are
currently arranged weirdly - 8 bits are set aside for the type flags
when there are only three three used, the first flag starts at 0x1000
instead of 0x0100 and flag literals have 5 digits (20 bits) when only
4 digits can be used.

Rearrange them so that type bits are only the lowest four, flags start
at 0x0010 and similar flags are grouped.

This patch doesn't cause any behavior difference.

Signed-off-by: Tejun Heo <tj@kernel.org>
---
 fs/sysfs/sysfs.h | 12 ++++++------
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/fs/sysfs/sysfs.h b/fs/sysfs/sysfs.h
index dce319c..3c7d0e6 100644
--- a/fs/sysfs/sysfs.h
+++ b/fs/sysfs/sysfs.h
@@ -80,18 +80,18 @@ struct sysfs_dirent {
 
 #define SD_DEACTIVATED_BIAS		INT_MIN
 
-#define SYSFS_TYPE_MASK			0x00ff
+#define SYSFS_TYPE_MASK			0x000f
 #define SYSFS_DIR			0x0001
 #define SYSFS_KOBJ_ATTR			0x0002
-#define SYSFS_KOBJ_LINK			0x0008
+#define SYSFS_KOBJ_LINK			0x0004
 #define SYSFS_COPY_NAME			(SYSFS_DIR | SYSFS_KOBJ_LINK)
 #define SYSFS_ACTIVE_REF		SYSFS_KOBJ_ATTR
 
 #define SYSFS_FLAG_MASK			~SYSFS_TYPE_MASK
-#define SYSFS_FLAG_HAS_NS		0x01000
-#define SYSFS_FLAG_REMOVED		0x02000
-#define SYSFS_FLAG_HAS_SEQ_SHOW		0x04000
-#define SYSFS_FLAG_HAS_MMAP		0x08000
+#define SYSFS_FLAG_REMOVED		0x0010
+#define SYSFS_FLAG_HAS_NS		0x0020
+#define SYSFS_FLAG_HAS_SEQ_SHOW		0x0040
+#define SYSFS_FLAG_HAS_MMAP		0x0080
 
 static inline unsigned int sysfs_type(struct sysfs_dirent *sd)
 {
-- 
1.8.3.1


^ permalink raw reply related	[flat|nested] 46+ messages in thread

* [PATCH 28/34] sysfs, kernfs: revamp sysfs_dirent active_ref lockdep annotation
  2013-10-24 15:49 Subject: [PATCHSET driver-core-next] sysfs: separate out kernfs, part #1 Tejun Heo
                   ` (26 preceding siblings ...)
  2013-10-24 15:49 ` [PATCH 27/34] sysfs, kernfs: reorganize SYSFS_* constants Tejun Heo
@ 2013-10-24 15:49 ` Tejun Heo
  2013-10-24 15:49 ` [PATCH 29/34] sysfs, kernfs: introduce kernfs[_find_and]_get() and kernfs_put() Tejun Heo
                   ` (7 subsequent siblings)
  35 siblings, 0 replies; 46+ messages in thread
From: Tejun Heo @ 2013-10-24 15:49 UTC (permalink / raw)
  To: gregkh; +Cc: kay, linux-kernel, ebiederm, bhelgaas, Tejun Heo

Currently, sysfs_dirent active_ref lockdep annotation uses
attribute->[s]key as the lockdep key, which forces
kernfs_create_file_ns() to assume that sysfs_dirent->priv is pointing
to a struct attribute which may not be true for non-sysfs users.  This
patch restructures the lockdep annotation such that

* kernfs_ops contains lockdep_key which is used by default for files
  created kernfs_create_file_ns().

* kernfs_create_file_ns_key() is introduced which takes an extra @key
  argument.  The created file will use the specified key for
  active_ref lockdep annotation.  If NULL is specified, lockdep for
  the file is disabled.

* sysfs_add_file_mode_ns() is updated to use
  kernfs_create_file_ns_key() with the appropriate key from the
  attribute or NULL if ignore_lockdep is set.

This makes the lockdep annotation properly contained in kernfs while
allowing sysfs to cleanly keep its current behavior.  This patch
doesn't introduce any behavior differences.

Signed-off-by: Tejun Heo <tj@kernel.org>
---
 fs/sysfs/dir.c         |  4 ++--
 fs/sysfs/file.c        | 35 ++++++++++++++++++++++++-----------
 fs/sysfs/sysfs.h       | 32 +-------------------------------
 include/linux/kernfs.h | 37 +++++++++++++++++++++++++++++--------
 4 files changed, 56 insertions(+), 52 deletions(-)

diff --git a/fs/sysfs/dir.c b/fs/sysfs/dir.c
index 312e0d6..f541efa 100644
--- a/fs/sysfs/dir.c
+++ b/fs/sysfs/dir.c
@@ -162,7 +162,7 @@ struct sysfs_dirent *sysfs_get_active(struct sysfs_dirent *sd)
 	if (!atomic_inc_unless_negative(&sd->s_active))
 		return NULL;
 
-	if (likely(!sysfs_ignore_lockdep(sd)))
+	if (sd->s_flags & SYSFS_FLAG_LOCKDEP)
 		rwsem_acquire_read(&sd->dep_map, 0, 1, _RET_IP_);
 	return sd;
 }
@@ -181,7 +181,7 @@ void sysfs_put_active(struct sysfs_dirent *sd)
 	if (unlikely(!sd))
 		return;
 
-	if (likely(!sysfs_ignore_lockdep(sd)))
+	if (sd->s_flags & SYSFS_FLAG_LOCKDEP)
 		rwsem_release(&sd->dep_map, 1, _RET_IP_);
 	v = atomic_dec_return(&sd->s_active);
 	if (likely(v != SD_DEACTIVATED_BIAS))
diff --git a/fs/sysfs/file.c b/fs/sysfs/file.c
index 9271180..fc58883 100644
--- a/fs/sysfs/file.c
+++ b/fs/sysfs/file.c
@@ -58,7 +58,7 @@ static struct sysfs_open_file *sysfs_of(struct file *file)
  */
 static const struct kernfs_ops *kernfs_ops(struct sysfs_dirent *sd)
 {
-	if (!sysfs_ignore_lockdep(sd))
+	if (sd->s_flags & SYSFS_FLAG_LOCKDEP)
 		lockdep_assert_held(sd);
 	return sd->s_attr.ops;
 }
@@ -71,7 +71,7 @@ static const struct sysfs_ops *sysfs_file_ops(struct sysfs_dirent *sd)
 {
 	struct kobject *kobj = sd->s_parent->priv;
 
-	if (!sysfs_ignore_lockdep(sd))
+	if (sd->s_flags & SYSFS_FLAG_LOCKDEP)
 		lockdep_assert_held(sd);
 	return kobj->ktype ? kobj->ktype->sysfs_ops : NULL;
 }
@@ -928,6 +928,7 @@ int sysfs_add_file_mode_ns(struct sysfs_dirent *dir_sd,
 			   const struct attribute *attr, bool is_bin,
 			   umode_t mode, const void *ns)
 {
+	struct lock_class_key *key = NULL;
 	const struct kernfs_ops *ops;
 	struct sysfs_dirent *sd;
 	loff_t size;
@@ -967,8 +968,12 @@ int sysfs_add_file_mode_ns(struct sysfs_dirent *dir_sd,
 		size = battr->size;
 	}
 
-	sd = kernfs_create_file_ns(dir_sd, attr->name, mode, size,
-				   ops, (void *)attr, ns);
+#ifdef CONFIG_DEBUG_LOCK_ALLOC
+	if (!attr->ignore_lockdep)
+		key = attr->key ?: (struct lock_class_key *)&attr->skey;
+#endif
+	sd = kernfs_create_file_ns_key(dir_sd, attr->name, mode, size,
+				       ops, (void *)attr, ns, key);
 	if (IS_ERR(sd)) {
 		if (PTR_ERR(sd) == -EEXIST)
 			sysfs_warn_dup(dir_sd, attr->name);
@@ -978,7 +983,7 @@ int sysfs_add_file_mode_ns(struct sysfs_dirent *dir_sd,
 }
 
 /**
- * kernfs_create_file_ns - create a file
+ * kernfs_create_file_ns_key - create a file
  * @parent: directory to create the file in
  * @name: name of the file
  * @mode: mode of the file
@@ -986,14 +991,16 @@ int sysfs_add_file_mode_ns(struct sysfs_dirent *dir_sd,
  * @ops: kernfs operations for the file
  * @priv: private data for the file
  * @ns: optional namespace tag of the file
+ * @key: lockdep key for the file's active_ref, %NULL to disable lockdep
  *
  * Returns the created node on success, ERR_PTR() value on error.
  */
-struct sysfs_dirent *kernfs_create_file_ns(struct sysfs_dirent *parent,
-					   const char *name,
-					   umode_t mode, loff_t size,
-					   const struct kernfs_ops *ops,
-					   void *priv, const void *ns)
+struct sysfs_dirent *kernfs_create_file_ns_key(struct sysfs_dirent *parent,
+					       const char *name,
+					       umode_t mode, loff_t size,
+					       const struct kernfs_ops *ops,
+					       void *priv, const void *ns,
+					       struct lock_class_key *key)
 {
 	struct sysfs_addrm_cxt acxt;
 	struct sysfs_dirent *sd;
@@ -1008,7 +1015,13 @@ struct sysfs_dirent *kernfs_create_file_ns(struct sysfs_dirent *parent,
 	sd->s_attr.size = size;
 	sd->s_ns = ns;
 	sd->priv = priv;
-	sysfs_dirent_init_lockdep(sd);
+
+#ifdef CONFIG_DEBUG_LOCK_ALLOC
+	if (key) {
+		lockdep_init_map(&sd->dep_map, "s_active", key, 0);
+		sd->s_flags |= SYSFS_FLAG_LOCKDEP;
+	}
+#endif
 
 	/*
 	 * sd->s_attr.ops is accesible only while holding active ref.  We
diff --git a/fs/sysfs/sysfs.h b/fs/sysfs/sysfs.h
index 3c7d0e6..fdfc603 100644
--- a/fs/sysfs/sysfs.h
+++ b/fs/sysfs/sysfs.h
@@ -92,43 +92,13 @@ struct sysfs_dirent {
 #define SYSFS_FLAG_HAS_NS		0x0020
 #define SYSFS_FLAG_HAS_SEQ_SHOW		0x0040
 #define SYSFS_FLAG_HAS_MMAP		0x0080
+#define SYSFS_FLAG_LOCKDEP		0x0100
 
 static inline unsigned int sysfs_type(struct sysfs_dirent *sd)
 {
 	return sd->s_flags & SYSFS_TYPE_MASK;
 }
 
-#ifdef CONFIG_DEBUG_LOCK_ALLOC
-
-#define sysfs_dirent_init_lockdep(sd)				\
-do {								\
-	struct attribute *attr = sd->priv;			\
-	struct lock_class_key *key = attr->key;			\
-	if (!key)						\
-		key = &attr->skey;				\
-								\
-	lockdep_init_map(&sd->dep_map, "s_active", key, 0);	\
-} while (0)
-
-/* Test for attributes that want to ignore lockdep for read-locking */
-static inline bool sysfs_ignore_lockdep(struct sysfs_dirent *sd)
-{
-	struct attribute *attr = sd->priv;
-
-	return sysfs_type(sd) == SYSFS_KOBJ_ATTR && attr->ignore_lockdep;
-}
-
-#else
-
-#define sysfs_dirent_init_lockdep(sd) do {} while (0)
-
-static inline bool sysfs_ignore_lockdep(struct sysfs_dirent *sd)
-{
-	return true;
-}
-
-#endif
-
 /*
  * Context structure to be used while adding/removing nodes.
  */
diff --git a/include/linux/kernfs.h b/include/linux/kernfs.h
index 89993ad..69f95b3 100644
--- a/include/linux/kernfs.h
+++ b/include/linux/kernfs.h
@@ -10,6 +10,7 @@
 #include <linux/kernel.h>
 #include <linux/list.h>
 #include <linux/mutex.h>
+#include <linux/lockdep.h>
 
 struct file;
 struct iattr;
@@ -61,6 +62,10 @@ struct kernfs_ops {
 			 loff_t off);
 
 	int (*mmap)(struct sysfs_open_file *of, struct vm_area_struct *vma);
+
+#ifdef CONFIG_DEBUG_LOCK_ALLOC
+	struct lock_class_key	lockdep_key;
+#endif
 };
 
 #ifdef CONFIG_SYSFS
@@ -68,11 +73,12 @@ struct kernfs_ops {
 struct sysfs_dirent *kernfs_create_dir_ns(struct sysfs_dirent *parent,
 					  const char *name, void *priv,
 					  const void *ns);
-struct sysfs_dirent *kernfs_create_file_ns(struct sysfs_dirent *parent,
-					   const char *name,
-					   umode_t mode, loff_t size,
-					   const struct kernfs_ops *ops,
-					   void *priv, const void *ns);
+struct sysfs_dirent *kernfs_create_file_ns_key(struct sysfs_dirent *parent,
+					       const char *name,
+					       umode_t mode, loff_t size,
+					       const struct kernfs_ops *ops,
+					       void *priv, const void *ns,
+					       struct lock_class_key *key);
 struct sysfs_dirent *kernfs_create_link(struct sysfs_dirent *parent,
 					const char *name,
 					struct sysfs_dirent *target);
@@ -92,9 +98,10 @@ kernfs_create_dir_ns(struct sysfs_dirent *parent, const char *name, void *priv,
 { return NULL; }
 
 static inline struct sysfs_dirent *
-kernfs_create_file_ns(struct sysfs_dirent *parent, const char *name,
-		      umode_t mode, loff_t size, const struct kernfs_ops *ops,
-		      void *priv, const void *ns)
+kernfs_create_file_ns_key(struct sysfs_dirent *parent, const char *name,
+			  umode_t mode, loff_t size,
+			  const struct kernfs_ops *ops, void *priv,
+			  const void *ns, struct lock_class_key *key)
 { return NULL; }
 
 static inline struct sysfs_dirent *
@@ -128,6 +135,20 @@ kernfs_create_dir(struct sysfs_dirent *parent, const char *name, void *priv)
 }
 
 static inline struct sysfs_dirent *
+kernfs_create_file_ns(struct sysfs_dirent *parent, const char *name,
+		      umode_t mode, loff_t size, const struct kernfs_ops *ops,
+		      void *priv, const void *ns)
+{
+	struct lock_class_key *key = NULL;
+
+#ifdef CONFIG_DEBUG_LOCK_ALLOC
+	key = (struct lock_class_key *)&ops->lockdep_key;
+#endif
+	return kernfs_create_file_ns_key(parent, name, mode, size, ops, priv,
+					 ns, key);
+}
+
+static inline struct sysfs_dirent *
 kernfs_create_file(struct sysfs_dirent *parent, const char *name, umode_t mode,
 		   loff_t size, const struct kernfs_ops *ops, void *priv)
 {
-- 
1.8.3.1


^ permalink raw reply related	[flat|nested] 46+ messages in thread

* [PATCH 29/34] sysfs, kernfs: introduce kernfs[_find_and]_get() and kernfs_put()
  2013-10-24 15:49 Subject: [PATCHSET driver-core-next] sysfs: separate out kernfs, part #1 Tejun Heo
                   ` (27 preceding siblings ...)
  2013-10-24 15:49 ` [PATCH 28/34] sysfs, kernfs: revamp sysfs_dirent active_ref lockdep annotation Tejun Heo
@ 2013-10-24 15:49 ` Tejun Heo
  2013-10-24 15:49 ` [PATCH 30/34] sysfs, kernfs: move internal decls to fs/kernfs/kernfs-internal.h Tejun Heo
                   ` (6 subsequent siblings)
  35 siblings, 0 replies; 46+ messages in thread
From: Tejun Heo @ 2013-10-24 15:49 UTC (permalink / raw)
  To: gregkh; +Cc: kay, linux-kernel, ebiederm, bhelgaas, Tejun Heo

Introduce kernfs interface for finding, getting and putting
sysfs_dirents.

* sysfs_find_dirent() is renamed to kernfs_find_ns() and lockdep
  assertion for sysfs_mutex is added.

* sysfs_get_dirent_ns() is renamed to kernfs_find_and_get().

* Macro inline dancing around __sysfs_get/put() are removed and
  kernfs_get/put() are made proper functions implemented in
  fs/sysfs/dir.c.

While the conversions are mostly equivalent, there's one difference -
kernfs_get() doesn't return the input param as its return value.  This
change is intentional.  While passing through the input increases
writability in some areas, it is unnecessary and has been shown to
cause confusion regarding how the last ref is handled.

Signed-off-by: Tejun Heo <tj@kernel.org>
---
 fs/sysfs/dir.c         | 113 ++++++++++++++++++++++++++++---------------------
 fs/sysfs/file.c        |  41 ++++++++++--------
 fs/sysfs/group.c       |  30 +++++++------
 fs/sysfs/inode.c       |   5 ++-
 fs/sysfs/mount.c       |  14 ------
 fs/sysfs/symlink.c     |  16 ++++---
 fs/sysfs/sysfs.h       |  22 ----------
 include/linux/kernfs.h |  21 +++++++++
 include/linux/sysfs.h  |  35 ++++++---------
 9 files changed, 151 insertions(+), 146 deletions(-)

diff --git a/fs/sysfs/dir.c b/fs/sysfs/dir.c
index f541efa..0e46ff8 100644
--- a/fs/sysfs/dir.c
+++ b/fs/sysfs/dir.c
@@ -252,10 +252,31 @@ static void sysfs_free_ino(unsigned int ino)
 	spin_unlock(&sysfs_ino_lock);
 }
 
-void release_sysfs_dirent(struct sysfs_dirent *sd)
+/**
+ * kernfs_get - get a reference count on a sysfs_dirent
+ * @sd: the target sysfs_dirent
+ */
+void kernfs_get(struct sysfs_dirent *sd)
+{
+	if (sd) {
+		WARN_ON(!atomic_read(&sd->s_count));
+		atomic_inc(&sd->s_count);
+	}
+}
+EXPORT_SYMBOL_GPL(kernfs_get);
+
+/**
+ * kernfs_put - put a reference count on a sysfs_dirent
+ * @sd: the target sysfs_dirent
+ *
+ * Put a reference count of @sd and destroy it if it reached zero.
+ */
+void kernfs_put(struct sysfs_dirent *sd)
 {
 	struct sysfs_dirent *parent_sd;
 
+	if (!sd || !atomic_dec_and_test(&sd->s_count))
+		return;
  repeat:
 	/* Moving/renaming is always done while holding reference.
 	 * sd->s_parent won't change beneath us.
@@ -267,7 +288,7 @@ void release_sysfs_dirent(struct sysfs_dirent *sd)
 		parent_sd ? parent_sd->s_name : "", sd->s_name);
 
 	if (sysfs_type(sd) == SYSFS_KOBJ_LINK)
-		sysfs_put(sd->s_symlink.target_sd);
+		kernfs_put(sd->s_symlink.target_sd);
 	if (sysfs_type(sd) & SYSFS_COPY_NAME)
 		kfree(sd->s_name);
 	if (sd->s_iattr && sd->s_iattr->ia_secdata)
@@ -281,6 +302,7 @@ void release_sysfs_dirent(struct sysfs_dirent *sd)
 	if (sd && atomic_dec_and_test(&sd->s_count))
 		goto repeat;
 }
+EXPORT_SYMBOL_GPL(kernfs_put);
 
 static int sysfs_dentry_delete(const struct dentry *dentry)
 {
@@ -342,7 +364,7 @@ out_bad:
 
 static void sysfs_dentry_release(struct dentry *dentry)
 {
-	sysfs_put(dentry->d_fsdata);
+	kernfs_put(dentry->d_fsdata);
 }
 
 const struct dentry_operations sysfs_dentry_ops = {
@@ -436,7 +458,8 @@ int sysfs_add_one(struct sysfs_addrm_cxt *acxt, struct sysfs_dirent *sd,
 		return -EINVAL;
 
 	sd->s_hash = sysfs_name_hash(sd->s_name, sd->s_ns);
-	sd->s_parent = sysfs_get(parent_sd);
+	sd->s_parent = parent_sd;
+	kernfs_get(parent_sd);
 
 	ret = sysfs_link_sibling(sd);
 	if (ret)
@@ -556,31 +579,28 @@ void sysfs_addrm_finish(struct sysfs_addrm_cxt *acxt)
 
 		sysfs_deactivate(sd);
 		sysfs_unmap_bin_file(sd);
-		sysfs_put(sd);
+		kernfs_put(sd);
 	}
 }
 
 /**
- *	sysfs_find_dirent - find sysfs_dirent with the given name
- *	@parent_sd: sysfs_dirent to search under
- *	@name: name to look for
- *	@ns: the namespace tag to use
- *
- *	Look for sysfs_dirent with name @name under @parent_sd.
- *
- *	LOCKING:
- *	mutex_lock(sysfs_mutex)
+ * kernfs_find_ns - find sysfs_dirent with the given name
+ * @parent: sysfs_dirent to search under
+ * @name: name to look for
+ * @ns: the namespace tag to use
  *
- *	RETURNS:
- *	Pointer to sysfs_dirent if found, NULL if not.
+ * Look for sysfs_dirent with name @name under @parent.  Returns pointer to
+ * the found sysfs_dirent on success, %NULL on failure.
  */
-struct sysfs_dirent *sysfs_find_dirent(struct sysfs_dirent *parent_sd,
-				       const unsigned char *name,
-				       const void *ns)
+static struct sysfs_dirent *kernfs_find_ns(struct sysfs_dirent *parent,
+					   const unsigned char *name,
+					   const void *ns)
 {
-	struct rb_node *node = parent_sd->s_dir.children.rb_node;
+	struct rb_node *node = parent->s_dir.children.rb_node;
 	unsigned int hash;
 
+	lockdep_assert_held(&sysfs_mutex);
+
 	hash = sysfs_name_hash(name, ns);
 	while (node) {
 		struct sysfs_dirent *sd;
@@ -599,34 +619,28 @@ struct sysfs_dirent *sysfs_find_dirent(struct sysfs_dirent *parent_sd,
 }
 
 /**
- *	sysfs_get_dirent_ns - find and get sysfs_dirent with the given name
- *	@parent_sd: sysfs_dirent to search under
- *	@name: name to look for
- *	@ns: the namespace tag to use
- *
- *	Look for sysfs_dirent with name @name under @parent_sd and get
- *	it if found.
- *
- *	LOCKING:
- *	Kernel thread context (may sleep).  Grabs sysfs_mutex.
+ * kernfs_find_and_get_ns - find and get sysfs_dirent with the given name
+ * @parent: sysfs_dirent to search under
+ * @name: name to look for
+ * @ns: the namespace tag to use
  *
- *	RETURNS:
- *	Pointer to sysfs_dirent if found, NULL if not.
+ * Look for sysfs_dirent with name @name under @parent and get a reference
+ * if found.  This function may sleep and returns pointer to the found
+ * sysfs_dirent on success, %NULL on failure.
  */
-struct sysfs_dirent *sysfs_get_dirent_ns(struct sysfs_dirent *parent_sd,
-					 const unsigned char *name,
-					 const void *ns)
+struct sysfs_dirent *kernfs_find_and_get_ns(struct sysfs_dirent *parent,
+					    const char *name, const void *ns)
 {
 	struct sysfs_dirent *sd;
 
 	mutex_lock(&sysfs_mutex);
-	sd = sysfs_find_dirent(parent_sd, name, ns);
-	sysfs_get(sd);
+	sd = kernfs_find_ns(parent, name, ns);
+	kernfs_get(sd);
 	mutex_unlock(&sysfs_mutex);
 
 	return sd;
 }
-EXPORT_SYMBOL_GPL(sysfs_get_dirent_ns);
+EXPORT_SYMBOL_GPL(kernfs_find_and_get_ns);
 
 /**
  * kernfs_create_dir_ns - create a directory
@@ -662,7 +676,7 @@ struct sysfs_dirent *kernfs_create_dir_ns(struct sysfs_dirent *parent,
 	if (!rc)
 		return sd;
 
-	sysfs_put(sd);
+	kernfs_put(sd);
 	return ERR_PTR(rc);
 }
 
@@ -711,14 +725,15 @@ static struct dentry *sysfs_lookup(struct inode *dir, struct dentry *dentry,
 	if (parent_sd->s_flags & SYSFS_FLAG_HAS_NS)
 		ns = sysfs_info(dir->i_sb)->ns;
 
-	sd = sysfs_find_dirent(parent_sd, dentry->d_name.name, ns);
+	sd = kernfs_find_ns(parent_sd, dentry->d_name.name, ns);
 
 	/* no such entry */
 	if (!sd) {
 		ret = ERR_PTR(-ENOENT);
 		goto out_unlock;
 	}
-	dentry->d_fsdata = sysfs_get(sd);
+	kernfs_get(sd);
+	dentry->d_fsdata = sd;
 
 	/* attach dentry and inode */
 	inode = sysfs_get_inode(dir->i_sb, sd);
@@ -854,7 +869,7 @@ int kernfs_remove_by_name_ns(struct sysfs_dirent *dir_sd, const char *name,
 
 	sysfs_addrm_start(&acxt);
 
-	sd = sysfs_find_dirent(dir_sd, name, ns);
+	sd = kernfs_find_ns(dir_sd, name, ns);
 	if (sd)
 		__kernfs_remove(&acxt, sd);
 
@@ -908,7 +923,7 @@ int kernfs_rename_ns(struct sysfs_dirent *sd, struct sysfs_dirent *new_parent,
 		goto out;	/* nothing to rename */
 
 	error = -EEXIST;
-	if (sysfs_find_dirent(new_parent, new_name, new_ns))
+	if (kernfs_find_ns(new_parent, new_name, new_ns))
 		goto out;
 
 	/* rename sysfs_dirent */
@@ -926,8 +941,8 @@ int kernfs_rename_ns(struct sysfs_dirent *sd, struct sysfs_dirent *new_parent,
 	 * Move to the appropriate place in the appropriate directories rbtree.
 	 */
 	sysfs_unlink_sibling(sd);
-	sysfs_get(new_parent);
-	sysfs_put(sd->s_parent);
+	kernfs_get(new_parent);
+	kernfs_put(sd->s_parent);
 	sd->s_ns = new_ns;
 	sd->s_hash = sysfs_name_hash(sd->s_name, sd->s_ns);
 	sd->s_parent = new_parent;
@@ -968,7 +983,7 @@ static inline unsigned char dt_type(struct sysfs_dirent *sd)
 
 static int sysfs_dir_release(struct inode *inode, struct file *filp)
 {
-	sysfs_put(filp->private_data);
+	kernfs_put(filp->private_data);
 	return 0;
 }
 
@@ -979,7 +994,7 @@ static struct sysfs_dirent *sysfs_dir_pos(const void *ns,
 		int valid = !(pos->s_flags & SYSFS_FLAG_REMOVED) &&
 			pos->s_parent == parent_sd &&
 			hash == pos->s_hash;
-		sysfs_put(pos);
+		kernfs_put(pos);
 		if (!valid)
 			pos = NULL;
 	}
@@ -1043,8 +1058,10 @@ static int sysfs_readdir(struct file *file, struct dir_context *ctx)
 		unsigned int type = dt_type(pos);
 		int len = strlen(name);
 		ino_t ino = pos->s_ino;
+
 		ctx->pos = pos->s_hash;
-		file->private_data = sysfs_get(pos);
+		file->private_data = pos;
+		kernfs_get(pos);
 
 		mutex_unlock(&sysfs_mutex);
 		if (!dir_emit(ctx, name, len, ino, type))
diff --git a/fs/sysfs/file.c b/fs/sysfs/file.c
index fc58883..2b9cfb1 100644
--- a/fs/sysfs/file.c
+++ b/fs/sysfs/file.c
@@ -867,19 +867,19 @@ void sysfs_notify(struct kobject *k, const char *dir, const char *attr)
 	struct sysfs_dirent *sd = k->sd, *tmp;
 
 	if (sd && dir)
-		sd = sysfs_get_dirent(sd, dir);
+		sd = kernfs_find_and_get(sd, dir);
 	else
-		sysfs_get(sd);
+		kernfs_get(sd);
 
 	if (sd && attr) {
-		tmp = sysfs_get_dirent(sd, attr);
-		sysfs_put(sd);
+		tmp = kernfs_find_and_get(sd, attr);
+		kernfs_put(sd);
 		sd = tmp;
 	}
 
 	if (sd) {
 		kernfs_notify(sd);
-		sysfs_put(sd);
+		kernfs_put(sd);
 	}
 }
 EXPORT_SYMBOL_GPL(sysfs_notify);
@@ -1038,7 +1038,7 @@ struct sysfs_dirent *kernfs_create_file_ns_key(struct sysfs_dirent *parent,
 	sysfs_addrm_finish(&acxt);
 
 	if (rc) {
-		sysfs_put(sd);
+		kernfs_put(sd);
 		return ERR_PTR(rc);
 	}
 	return sd;
@@ -1092,16 +1092,18 @@ int sysfs_add_file_to_group(struct kobject *kobj,
 	struct sysfs_dirent *dir_sd;
 	int error;
 
-	if (group)
-		dir_sd = sysfs_get_dirent(kobj->sd, group);
-	else
-		dir_sd = sysfs_get(kobj->sd);
+	if (group) {
+		dir_sd = kernfs_find_and_get(kobj->sd, group);
+	} else {
+		dir_sd = kobj->sd;
+		kernfs_get(dir_sd);
+	}
 
 	if (!dir_sd)
 		return -ENOENT;
 
 	error = sysfs_add_file(dir_sd, attr, false);
-	sysfs_put(dir_sd);
+	kernfs_put(dir_sd);
 
 	return error;
 }
@@ -1121,7 +1123,7 @@ int sysfs_chmod_file(struct kobject *kobj, const struct attribute *attr,
 	struct iattr newattrs;
 	int rc;
 
-	sd = sysfs_get_dirent(kobj->sd, attr->name);
+	sd = kernfs_find_and_get(kobj->sd, attr->name);
 	if (!sd)
 		return -ENOENT;
 
@@ -1130,7 +1132,7 @@ int sysfs_chmod_file(struct kobject *kobj, const struct attribute *attr,
 
 	rc = kernfs_setattr(sd, &newattrs);
 
-	sysfs_put(sd);
+	kernfs_put(sd);
 	return rc;
 }
 EXPORT_SYMBOL_GPL(sysfs_chmod_file);
@@ -1171,13 +1173,16 @@ void sysfs_remove_file_from_group(struct kobject *kobj,
 {
 	struct sysfs_dirent *dir_sd;
 
-	if (group)
-		dir_sd = sysfs_get_dirent(kobj->sd, group);
-	else
-		dir_sd = sysfs_get(kobj->sd);
+	if (group) {
+		dir_sd = kernfs_find_and_get(kobj->sd, group);
+	} else {
+		dir_sd = kobj->sd;
+		kernfs_get(dir_sd);
+	}
+
 	if (dir_sd) {
 		kernfs_remove_by_name(dir_sd, attr->name);
-		sysfs_put(dir_sd);
+		kernfs_put(dir_sd);
 	}
 }
 EXPORT_SYMBOL_GPL(sysfs_remove_file_from_group);
diff --git a/fs/sysfs/group.c b/fs/sysfs/group.c
index 9f65cd9..7177532 100644
--- a/fs/sysfs/group.c
+++ b/fs/sysfs/group.c
@@ -108,13 +108,13 @@ static int internal_create_group(struct kobject *kobj, int update,
 		}
 	} else
 		sd = kobj->sd;
-	sysfs_get(sd);
+	kernfs_get(sd);
 	error = create_files(sd, kobj, grp, update);
 	if (error) {
 		if (grp->name)
 			kernfs_remove(sd);
 	}
-	sysfs_put(sd);
+	kernfs_put(sd);
 	return error;
 }
 
@@ -208,21 +208,23 @@ void sysfs_remove_group(struct kobject *kobj,
 	struct sysfs_dirent *sd;
 
 	if (grp->name) {
-		sd = sysfs_get_dirent(dir_sd, grp->name);
+		sd = kernfs_find_and_get(dir_sd, grp->name);
 		if (!sd) {
 			WARN(!sd, KERN_WARNING
 			     "sysfs group %p not found for kobject '%s'\n",
 			     grp, kobject_name(kobj));
 			return;
 		}
-	} else
-		sd = sysfs_get(dir_sd);
+	} else {
+		sd = dir_sd;
+		kernfs_get(sd);
+	}
 
 	remove_files(sd, kobj, grp);
 	if (grp->name)
 		kernfs_remove(sd);
 
-	sysfs_put(sd);
+	kernfs_put(sd);
 }
 EXPORT_SYMBOL_GPL(sysfs_remove_group);
 
@@ -263,7 +265,7 @@ int sysfs_merge_group(struct kobject *kobj,
 	struct attribute *const *attr;
 	int i;
 
-	dir_sd = sysfs_get_dirent(kobj->sd, grp->name);
+	dir_sd = kernfs_find_and_get(kobj->sd, grp->name);
 	if (!dir_sd)
 		return -ENOENT;
 
@@ -273,7 +275,7 @@ int sysfs_merge_group(struct kobject *kobj,
 		while (--i >= 0)
 			kernfs_remove_by_name(dir_sd, (*--attr)->name);
 	}
-	sysfs_put(dir_sd);
+	kernfs_put(dir_sd);
 
 	return error;
 }
@@ -290,11 +292,11 @@ void sysfs_unmerge_group(struct kobject *kobj,
 	struct sysfs_dirent *dir_sd;
 	struct attribute *const *attr;
 
-	dir_sd = sysfs_get_dirent(kobj->sd, grp->name);
+	dir_sd = kernfs_find_and_get(kobj->sd, grp->name);
 	if (dir_sd) {
 		for (attr = grp->attrs; *attr; ++attr)
 			kernfs_remove_by_name(dir_sd, (*attr)->name);
-		sysfs_put(dir_sd);
+		kernfs_put(dir_sd);
 	}
 }
 EXPORT_SYMBOL_GPL(sysfs_unmerge_group);
@@ -312,12 +314,12 @@ int sysfs_add_link_to_group(struct kobject *kobj, const char *group_name,
 	struct sysfs_dirent *dir_sd;
 	int error = 0;
 
-	dir_sd = sysfs_get_dirent(kobj->sd, group_name);
+	dir_sd = kernfs_find_and_get(kobj->sd, group_name);
 	if (!dir_sd)
 		return -ENOENT;
 
 	error = sysfs_create_link_sd(dir_sd, target, link_name);
-	sysfs_put(dir_sd);
+	kernfs_put(dir_sd);
 
 	return error;
 }
@@ -334,10 +336,10 @@ void sysfs_remove_link_from_group(struct kobject *kobj, const char *group_name,
 {
 	struct sysfs_dirent *dir_sd;
 
-	dir_sd = sysfs_get_dirent(kobj->sd, group_name);
+	dir_sd = kernfs_find_and_get(kobj->sd, group_name);
 	if (dir_sd) {
 		kernfs_remove_by_name(dir_sd, link_name);
-		sysfs_put(dir_sd);
+		kernfs_put(dir_sd);
 	}
 }
 EXPORT_SYMBOL_GPL(sysfs_remove_link_from_group);
diff --git a/fs/sysfs/inode.c b/fs/sysfs/inode.c
index b3c717a..bfe4478 100644
--- a/fs/sysfs/inode.c
+++ b/fs/sysfs/inode.c
@@ -254,7 +254,8 @@ int sysfs_getattr(struct vfsmount *mnt, struct dentry *dentry,
 
 static void sysfs_init_inode(struct sysfs_dirent *sd, struct inode *inode)
 {
-	inode->i_private = sysfs_get(sd);
+	kernfs_get(sd);
+	inode->i_private = sd;
 	inode->i_mapping->a_ops = &sysfs_aops;
 	inode->i_mapping->backing_dev_info = &sysfs_backing_dev_info;
 	inode->i_op = &sysfs_inode_operations;
@@ -321,7 +322,7 @@ void sysfs_evict_inode(struct inode *inode)
 
 	truncate_inode_pages(&inode->i_data, 0);
 	clear_inode(inode);
-	sysfs_put(sd);
+	kernfs_put(sd);
 }
 
 int sysfs_permission(struct inode *inode, int mask)
diff --git a/fs/sysfs/mount.c b/fs/sysfs/mount.c
index 8c24bce..852d115 100644
--- a/fs/sysfs/mount.c
+++ b/fs/sysfs/mount.c
@@ -184,17 +184,3 @@ out_err:
 	sysfs_dir_cachep = NULL;
 	goto out;
 }
-
-#undef sysfs_get
-struct sysfs_dirent *sysfs_get(struct sysfs_dirent *sd)
-{
-	return __sysfs_get(sd);
-}
-EXPORT_SYMBOL_GPL(sysfs_get);
-
-#undef sysfs_put
-void sysfs_put(struct sysfs_dirent *sd)
-{
-	__sysfs_put(sd);
-}
-EXPORT_SYMBOL_GPL(sysfs_put);
diff --git a/fs/sysfs/symlink.c b/fs/sysfs/symlink.c
index 5600560..1aae6ff 100644
--- a/fs/sysfs/symlink.c
+++ b/fs/sysfs/symlink.c
@@ -43,7 +43,7 @@ struct sysfs_dirent *kernfs_create_link(struct sysfs_dirent *parent,
 
 	sd->s_ns = target->s_ns;
 	sd->s_symlink.target_sd = target;
-	sysfs_get(target);	/* ref owned by symlink */
+	kernfs_get(target);	/* ref owned by symlink */
 
 	sysfs_addrm_start(&acxt);
 	error = sysfs_add_one(&acxt, sd, parent);
@@ -52,7 +52,7 @@ struct sysfs_dirent *kernfs_create_link(struct sysfs_dirent *parent,
 	if (!error)
 		return sd;
 
-	sysfs_put(sd);
+	kernfs_put(sd);
 	return ERR_PTR(error);
 }
 
@@ -69,15 +69,17 @@ static int sysfs_do_create_link_sd(struct sysfs_dirent *parent_sd,
 	 * sysfs_assoc_lock.  Fetch target_sd from it.
 	 */
 	spin_lock(&sysfs_assoc_lock);
-	if (target->sd)
-		target_sd = sysfs_get(target->sd);
+	if (target->sd) {
+		target_sd = target->sd;
+		kernfs_get(target_sd);
+	}
 	spin_unlock(&sysfs_assoc_lock);
 
 	if (!target_sd)
 		return -ENOENT;
 
 	sd = kernfs_create_link(parent_sd, name, target_sd);
-	sysfs_put(target_sd);
+	kernfs_put(target_sd);
 
 	if (!IS_ERR(sd))
 		return 0;
@@ -207,7 +209,7 @@ int sysfs_rename_link_ns(struct kobject *kobj, struct kobject *targ,
 		old_ns = targ->sd->s_ns;
 
 	result = -ENOENT;
-	sd = sysfs_get_dirent_ns(parent_sd, old, old_ns);
+	sd = kernfs_find_and_get_ns(parent_sd, old, old_ns);
 	if (!sd)
 		goto out;
 
@@ -220,7 +222,7 @@ int sysfs_rename_link_ns(struct kobject *kobj, struct kobject *targ,
 	result = kernfs_rename_ns(sd, parent_sd, new, new_ns);
 
 out:
-	sysfs_put(sd);
+	kernfs_put(sd);
 	return result;
 }
 EXPORT_SYMBOL_GPL(sysfs_rename_link_ns);
diff --git a/fs/sysfs/sysfs.h b/fs/sysfs/sysfs.h
index fdfc603..0b1a781 100644
--- a/fs/sysfs/sysfs.h
+++ b/fs/sysfs/sysfs.h
@@ -141,30 +141,8 @@ int sysfs_add_one(struct sysfs_addrm_cxt *acxt, struct sysfs_dirent *sd,
 		  struct sysfs_dirent *parent_sd);
 void sysfs_addrm_finish(struct sysfs_addrm_cxt *acxt);
 
-struct sysfs_dirent *sysfs_find_dirent(struct sysfs_dirent *parent_sd,
-				       const unsigned char *name,
-				       const void *ns);
 struct sysfs_dirent *sysfs_new_dirent(const char *name, umode_t mode, int type);
 
-void release_sysfs_dirent(struct sysfs_dirent *sd);
-
-static inline struct sysfs_dirent *__sysfs_get(struct sysfs_dirent *sd)
-{
-	if (sd) {
-		WARN_ON(!atomic_read(&sd->s_count));
-		atomic_inc(&sd->s_count);
-	}
-	return sd;
-}
-#define sysfs_get(sd) __sysfs_get(sd)
-
-static inline void __sysfs_put(struct sysfs_dirent *sd)
-{
-	if (sd && atomic_dec_and_test(&sd->s_count))
-		release_sysfs_dirent(sd);
-}
-#define sysfs_put(sd) __sysfs_put(sd)
-
 /*
  * inode.c
  */
diff --git a/include/linux/kernfs.h b/include/linux/kernfs.h
index 69f95b3..3d1de86 100644
--- a/include/linux/kernfs.h
+++ b/include/linux/kernfs.h
@@ -70,6 +70,11 @@ struct kernfs_ops {
 
 #ifdef CONFIG_SYSFS
 
+struct sysfs_dirent *kernfs_find_and_get_ns(struct sysfs_dirent *parent,
+					    const char *name, const void *ns);
+void kernfs_get(struct sysfs_dirent *sd);
+void kernfs_put(struct sysfs_dirent *sd);
+
 struct sysfs_dirent *kernfs_create_dir_ns(struct sysfs_dirent *parent,
 					  const char *name, void *priv,
 					  const void *ns);
@@ -93,6 +98,16 @@ void kernfs_notify(struct sysfs_dirent *sd);
 #else	/* CONFIG_SYSFS */
 
 static inline struct sysfs_dirent *
+kernfs_find_and_get_ns(struct sysfs_dirent *parent, const char *name,
+		       const void *ns)
+{
+	return NULL;
+}
+
+static inline void kernfs_get(struct sysfs_dirent *sd) { }
+static inline void kernfs_put(struct sysfs_dirent *sd) { }
+
+static inline struct sysfs_dirent *
 kernfs_create_dir_ns(struct sysfs_dirent *parent, const char *name, void *priv,
 		     const void *ns)
 { return NULL; }
@@ -129,6 +144,12 @@ static inline void kernfs_notify(struct sysfs_dirent *sd) { }
 #endif	/* CONFIG_SYSFS */
 
 static inline struct sysfs_dirent *
+kernfs_find_and_get(struct sysfs_dirent *sd, const char *name)
+{
+	return kernfs_find_and_get_ns(sd, name, NULL);
+}
+
+static inline struct sysfs_dirent *
 kernfs_create_dir(struct sysfs_dirent *parent, const char *name, void *priv)
 {
 	return kernfs_create_dir_ns(parent, name, priv, NULL);
diff --git a/include/linux/sysfs.h b/include/linux/sysfs.h
index 0ab2b02..cd8f90b 100644
--- a/include/linux/sysfs.h
+++ b/include/linux/sysfs.h
@@ -243,11 +243,6 @@ void sysfs_remove_link_from_group(struct kobject *kobj, const char *group_name,
 				  const char *link_name);
 
 void sysfs_notify(struct kobject *kobj, const char *dir, const char *attr);
-struct sysfs_dirent *sysfs_get_dirent_ns(struct sysfs_dirent *parent_sd,
-					 const unsigned char *name,
-					 const void *ns);
-struct sysfs_dirent *sysfs_get(struct sysfs_dirent *sd);
-void sysfs_put(struct sysfs_dirent *sd);
 
 int __must_check sysfs_init(void);
 
@@ -417,19 +412,6 @@ static inline void sysfs_notify(struct kobject *kobj, const char *dir,
 				const char *attr)
 {
 }
-static inline struct sysfs_dirent *
-sysfs_get_dirent_ns(struct sysfs_dirent *parent_sd, const unsigned char *name,
-		    const void *ns)
-{
-	return NULL;
-}
-static inline struct sysfs_dirent *sysfs_get(struct sysfs_dirent *sd)
-{
-	return NULL;
-}
-static inline void sysfs_put(struct sysfs_dirent *sd)
-{
-}
 
 static inline int __must_check sysfs_init(void)
 {
@@ -456,15 +438,26 @@ static inline int sysfs_rename_link(struct kobject *kobj, struct kobject *target
 	return sysfs_rename_link_ns(kobj, target, old_name, new_name, NULL);
 }
 
+static inline void sysfs_notify_dirent(struct sysfs_dirent *sd)
+{
+	kernfs_notify(sd);
+}
+
 static inline struct sysfs_dirent *
 sysfs_get_dirent(struct sysfs_dirent *parent_sd, const unsigned char *name)
 {
-	return sysfs_get_dirent_ns(parent_sd, name, NULL);
+	return kernfs_find_and_get(parent_sd, name);
 }
 
-static inline void sysfs_notify_dirent(struct sysfs_dirent *sd)
+static inline struct sysfs_dirent *sysfs_get(struct sysfs_dirent *sd)
 {
-	kernfs_notify(sd);
+	kernfs_get(sd);
+	return sd;
+}
+
+static inline void sysfs_put(struct sysfs_dirent *sd)
+{
+	kernfs_put(sd);
 }
 
 #endif /* _SYSFS_H_ */
-- 
1.8.3.1


^ permalink raw reply related	[flat|nested] 46+ messages in thread

* [PATCH 30/34] sysfs, kernfs: move internal decls to fs/kernfs/kernfs-internal.h
  2013-10-24 15:49 Subject: [PATCHSET driver-core-next] sysfs: separate out kernfs, part #1 Tejun Heo
                   ` (28 preceding siblings ...)
  2013-10-24 15:49 ` [PATCH 29/34] sysfs, kernfs: introduce kernfs[_find_and]_get() and kernfs_put() Tejun Heo
@ 2013-10-24 15:49 ` Tejun Heo
  2013-10-24 15:49 ` [PATCH 31/34] sysfs, kernfs: move inode code to fs/kernfs/inode.c Tejun Heo
                   ` (5 subsequent siblings)
  35 siblings, 0 replies; 46+ messages in thread
From: Tejun Heo @ 2013-10-24 15:49 UTC (permalink / raw)
  To: gregkh; +Cc: kay, linux-kernel, ebiederm, bhelgaas, Tejun Heo

Move data structure, constant and basic accessor declarations from
fs/sysfs/sysfs.h to fs/kernfs/kernfs-internal.h.  The two files
currently include each other.  Once kernfs / sysfs separation is
complete, the cross inclusions will be removed.  Inclusion protectors
are added to fs/sysfs/sysfs.h to allow cross-inclusion.

This patch doesn't introduce any functional changes.

Signed-off-by: Tejun Heo <tj@kernel.org>
---
 fs/kernfs/kernfs-internal.h | 115 ++++++++++++++++++++++++++++++++++++++++++++
 fs/sysfs/sysfs.h            | 102 +++------------------------------------
 2 files changed, 121 insertions(+), 96 deletions(-)
 create mode 100644 fs/kernfs/kernfs-internal.h

diff --git a/fs/kernfs/kernfs-internal.h b/fs/kernfs/kernfs-internal.h
new file mode 100644
index 0000000..6bb8c12
--- /dev/null
+++ b/fs/kernfs/kernfs-internal.h
@@ -0,0 +1,115 @@
+/*
+ * fs/kernfs/kernfs-internal.h - kernfs internal header file
+ *
+ * Copyright (c) 2001-3 Patrick Mochel
+ * Copyright (c) 2007 SUSE Linux Products GmbH
+ * Copyright (c) 2007, 2013 Tejun Heo <teheo@suse.de>
+ *
+ * This file is released under the GPLv2.
+ */
+
+#ifndef __KERNFS_INTERNAL_H
+#define __KERNFS_INTERNAL_H
+
+#include <linux/lockdep.h>
+#include <linux/fs.h>
+#include <linux/rbtree.h>
+
+#include <linux/kernfs.h>
+
+struct sysfs_open_dirent;
+
+/* type-specific structures for sysfs_dirent->s_* union members */
+struct sysfs_elem_dir {
+	unsigned long		subdirs;
+	/* children rbtree starts here and goes through sd->s_rb */
+	struct rb_root		children;
+};
+
+struct sysfs_elem_symlink {
+	struct sysfs_dirent	*target_sd;
+};
+
+struct sysfs_elem_attr {
+	const struct kernfs_ops	*ops;
+	struct sysfs_open_dirent *open;
+	loff_t			size;
+};
+
+struct sysfs_inode_attrs {
+	struct iattr	ia_iattr;
+	void		*ia_secdata;
+	u32		ia_secdata_len;
+};
+
+/*
+ * sysfs_dirent - the building block of sysfs hierarchy.  Each and
+ * every sysfs node is represented by single sysfs_dirent.
+ *
+ * As long as s_count reference is held, the sysfs_dirent itself is
+ * accessible.  Dereferencing s_elem or any other outer entity
+ * requires s_active reference.
+ */
+struct sysfs_dirent {
+	atomic_t		s_count;
+	atomic_t		s_active;
+#ifdef CONFIG_DEBUG_LOCK_ALLOC
+	struct lockdep_map	dep_map;
+#endif
+	struct sysfs_dirent	*s_parent;
+	const char		*s_name;
+
+	struct rb_node		s_rb;
+
+	union {
+		struct completion	*completion;
+		struct sysfs_dirent	*removed_list;
+	} u;
+
+	const void		*s_ns; /* namespace tag */
+	unsigned int		s_hash; /* ns + name hash */
+	union {
+		struct sysfs_elem_dir		s_dir;
+		struct sysfs_elem_symlink	s_symlink;
+		struct sysfs_elem_attr		s_attr;
+	};
+
+	void			*priv;
+
+	unsigned short		s_flags;
+	umode_t			s_mode;
+	unsigned int		s_ino;
+	struct sysfs_inode_attrs *s_iattr;
+};
+
+#define SD_DEACTIVATED_BIAS		INT_MIN
+
+#define SYSFS_TYPE_MASK			0x000f
+#define SYSFS_DIR			0x0001
+#define SYSFS_KOBJ_ATTR			0x0002
+#define SYSFS_KOBJ_LINK			0x0004
+#define SYSFS_COPY_NAME			(SYSFS_DIR | SYSFS_KOBJ_LINK)
+#define SYSFS_ACTIVE_REF		SYSFS_KOBJ_ATTR
+
+#define SYSFS_FLAG_MASK			~SYSFS_TYPE_MASK
+#define SYSFS_FLAG_REMOVED		0x0010
+#define SYSFS_FLAG_HAS_NS		0x0020
+#define SYSFS_FLAG_HAS_SEQ_SHOW		0x0040
+#define SYSFS_FLAG_HAS_MMAP		0x0080
+#define SYSFS_FLAG_LOCKDEP		0x0100
+
+static inline unsigned int sysfs_type(struct sysfs_dirent *sd)
+{
+	return sd->s_flags & SYSFS_TYPE_MASK;
+}
+
+/*
+ * Context structure to be used while adding/removing nodes.
+ */
+struct sysfs_addrm_cxt {
+	struct sysfs_dirent	*removed;
+};
+
+#include "../sysfs/sysfs.h"
+
+#endif	/* __KERNFS_INTERNAL_H */
diff --git a/fs/sysfs/sysfs.h b/fs/sysfs/sysfs.h
index 0b1a781..00538ac 100644
--- a/fs/sysfs/sysfs.h
+++ b/fs/sysfs/sysfs.h
@@ -8,103 +8,11 @@
  * This file is released under the GPLv2.
  */
 
-#include <linux/lockdep.h>
-#include <linux/kobject_ns.h>
-#include <linux/fs.h>
-#include <linux/rbtree.h>
+#ifndef __SYSFS_INTERNAL_H
+#define __SYSFS_INTERNAL_H
 
-struct sysfs_open_dirent;
-
-/* type-specific structures for sysfs_dirent->s_* union members */
-struct sysfs_elem_dir {
-	unsigned long		subdirs;
-	/* children rbtree starts here and goes through sd->s_rb */
-	struct rb_root		children;
-};
-
-struct sysfs_elem_symlink {
-	struct sysfs_dirent	*target_sd;
-};
-
-struct sysfs_elem_attr {
-	const struct kernfs_ops	*ops;
-	struct sysfs_open_dirent *open;
-	loff_t			size;
-};
-
-struct sysfs_inode_attrs {
-	struct iattr	ia_iattr;
-	void		*ia_secdata;
-	u32		ia_secdata_len;
-};
-
-/*
- * sysfs_dirent - the building block of sysfs hierarchy.  Each and
- * every sysfs node is represented by single sysfs_dirent.
- *
- * As long as s_count reference is held, the sysfs_dirent itself is
- * accessible.  Dereferencing s_elem or any other outer entity
- * requires s_active reference.
- */
-struct sysfs_dirent {
-	atomic_t		s_count;
-	atomic_t		s_active;
-#ifdef CONFIG_DEBUG_LOCK_ALLOC
-	struct lockdep_map	dep_map;
-#endif
-	struct sysfs_dirent	*s_parent;
-	const char		*s_name;
-
-	struct rb_node		s_rb;
-
-	union {
-		struct completion	*completion;
-		struct sysfs_dirent	*removed_list;
-	} u;
-
-	const void		*s_ns; /* namespace tag */
-	unsigned int		s_hash; /* ns + name hash */
-	union {
-		struct sysfs_elem_dir		s_dir;
-		struct sysfs_elem_symlink	s_symlink;
-		struct sysfs_elem_attr		s_attr;
-	};
-
-	void			*priv;
-
-	unsigned short		s_flags;
-	umode_t			s_mode;
-	unsigned int		s_ino;
-	struct sysfs_inode_attrs *s_iattr;
-};
-
-#define SD_DEACTIVATED_BIAS		INT_MIN
-
-#define SYSFS_TYPE_MASK			0x000f
-#define SYSFS_DIR			0x0001
-#define SYSFS_KOBJ_ATTR			0x0002
-#define SYSFS_KOBJ_LINK			0x0004
-#define SYSFS_COPY_NAME			(SYSFS_DIR | SYSFS_KOBJ_LINK)
-#define SYSFS_ACTIVE_REF		SYSFS_KOBJ_ATTR
-
-#define SYSFS_FLAG_MASK			~SYSFS_TYPE_MASK
-#define SYSFS_FLAG_REMOVED		0x0010
-#define SYSFS_FLAG_HAS_NS		0x0020
-#define SYSFS_FLAG_HAS_SEQ_SHOW		0x0040
-#define SYSFS_FLAG_HAS_MMAP		0x0080
-#define SYSFS_FLAG_LOCKDEP		0x0100
-
-static inline unsigned int sysfs_type(struct sysfs_dirent *sd)
-{
-	return sd->s_flags & SYSFS_TYPE_MASK;
-}
-
-/*
- * Context structure to be used while adding/removing nodes.
- */
-struct sysfs_addrm_cxt {
-	struct sysfs_dirent	*removed;
-};
+#include "../kernfs/kernfs-internal.h"
+#include <linux/sysfs.h>
 
 /*
  * mount.c
@@ -175,3 +83,5 @@ void sysfs_unmap_bin_file(struct sysfs_dirent *sd);
 extern const struct inode_operations sysfs_symlink_inode_operations;
 int sysfs_create_link_sd(struct sysfs_dirent *sd, struct kobject *target,
 			 const char *name);
+
+#endif	/* __SYSFS_INTERNAL_H */
-- 
1.8.3.1


^ permalink raw reply related	[flat|nested] 46+ messages in thread

* [PATCH 31/34] sysfs, kernfs: move inode code to fs/kernfs/inode.c
  2013-10-24 15:49 Subject: [PATCHSET driver-core-next] sysfs: separate out kernfs, part #1 Tejun Heo
                   ` (29 preceding siblings ...)
  2013-10-24 15:49 ` [PATCH 30/34] sysfs, kernfs: move internal decls to fs/kernfs/kernfs-internal.h Tejun Heo
@ 2013-10-24 15:49 ` Tejun Heo
  2013-10-24 15:49 ` [PATCH 32/34] sysfs, kernfs: move dir core code to fs/kernfs/dir.c Tejun Heo
                   ` (4 subsequent siblings)
  35 siblings, 0 replies; 46+ messages in thread
From: Tejun Heo @ 2013-10-24 15:49 UTC (permalink / raw)
  To: gregkh; +Cc: kay, linux-kernel, ebiederm, bhelgaas, Tejun Heo

There's nothing sysfs-specific in fs/sysfs/inode.c.  Move everything
in it to fs/kernfs/inode.c.  The respective declarations in
fs/sysfs/sysfs.h are moved to fs/kernfs/kernfs-internal.h.

This is pure relocation.

Signed-off-by: Tejun Heo <tj@kernel.org>
---
 fs/kernfs/inode.c           | 327 ++++++++++++++++++++++++++++++++++++++++++
 fs/kernfs/kernfs-internal.h |  13 ++
 fs/sysfs/Makefile           |   2 +-
 fs/sysfs/inode.c            | 342 --------------------------------------------
 fs/sysfs/sysfs.h            |  13 --
 5 files changed, 341 insertions(+), 356 deletions(-)
 delete mode 100644 fs/sysfs/inode.c

diff --git a/fs/kernfs/inode.c b/fs/kernfs/inode.c
index 86bfeea..9d4fab4 100644
--- a/fs/kernfs/inode.c
+++ b/fs/kernfs/inode.c
@@ -7,3 +7,330 @@
  *
  * This file is released under the GPLv2.
  */
+
+#include <linux/pagemap.h>
+#include <linux/backing-dev.h>
+#include <linux/capability.h>
+#include <linux/errno.h>
+#include <linux/slab.h>
+#include <linux/xattr.h>
+#include <linux/security.h>
+
+#include "kernfs-internal.h"
+
+static const struct address_space_operations sysfs_aops = {
+	.readpage	= simple_readpage,
+	.write_begin	= simple_write_begin,
+	.write_end	= simple_write_end,
+};
+
+static struct backing_dev_info sysfs_backing_dev_info = {
+	.name		= "sysfs",
+	.ra_pages	= 0,	/* No readahead */
+	.capabilities	= BDI_CAP_NO_ACCT_AND_WRITEBACK,
+};
+
+static const struct inode_operations sysfs_inode_operations = {
+	.permission	= sysfs_permission,
+	.setattr	= sysfs_setattr,
+	.getattr	= sysfs_getattr,
+	.setxattr	= sysfs_setxattr,
+};
+
+int __init sysfs_inode_init(void)
+{
+	return bdi_init(&sysfs_backing_dev_info);
+}
+
+static struct sysfs_inode_attrs *sysfs_init_inode_attrs(struct sysfs_dirent *sd)
+{
+	struct sysfs_inode_attrs *attrs;
+	struct iattr *iattrs;
+
+	attrs = kzalloc(sizeof(struct sysfs_inode_attrs), GFP_KERNEL);
+	if (!attrs)
+		return NULL;
+	iattrs = &attrs->ia_iattr;
+
+	/* assign default attributes */
+	iattrs->ia_mode = sd->s_mode;
+	iattrs->ia_uid = GLOBAL_ROOT_UID;
+	iattrs->ia_gid = GLOBAL_ROOT_GID;
+	iattrs->ia_atime = iattrs->ia_mtime = iattrs->ia_ctime = CURRENT_TIME;
+
+	return attrs;
+}
+
+static int __kernfs_setattr(struct sysfs_dirent *sd, const struct iattr *iattr)
+{
+	struct sysfs_inode_attrs *sd_attrs;
+	struct iattr *iattrs;
+	unsigned int ia_valid = iattr->ia_valid;
+
+	sd_attrs = sd->s_iattr;
+
+	if (!sd_attrs) {
+		/* setting attributes for the first time, allocate now */
+		sd_attrs = sysfs_init_inode_attrs(sd);
+		if (!sd_attrs)
+			return -ENOMEM;
+		sd->s_iattr = sd_attrs;
+	}
+	/* attributes were changed at least once in past */
+	iattrs = &sd_attrs->ia_iattr;
+
+	if (ia_valid & ATTR_UID)
+		iattrs->ia_uid = iattr->ia_uid;
+	if (ia_valid & ATTR_GID)
+		iattrs->ia_gid = iattr->ia_gid;
+	if (ia_valid & ATTR_ATIME)
+		iattrs->ia_atime = iattr->ia_atime;
+	if (ia_valid & ATTR_MTIME)
+		iattrs->ia_mtime = iattr->ia_mtime;
+	if (ia_valid & ATTR_CTIME)
+		iattrs->ia_ctime = iattr->ia_ctime;
+	if (ia_valid & ATTR_MODE) {
+		umode_t mode = iattr->ia_mode;
+		iattrs->ia_mode = sd->s_mode = mode;
+	}
+	return 0;
+}
+
+/**
+ * kernfs_setattr - set iattr on a node
+ * @sd: target node
+ * @iattr: iattr to set
+ *
+ * Returns 0 on success, -errno on failure.
+ */
+int kernfs_setattr(struct sysfs_dirent *sd, const struct iattr *iattr)
+{
+	int ret;
+
+	mutex_lock(&sysfs_mutex);
+	ret = __kernfs_setattr(sd, iattr);
+	mutex_unlock(&sysfs_mutex);
+	return ret;
+}
+
+int sysfs_setattr(struct dentry *dentry, struct iattr *iattr)
+{
+	struct inode *inode = dentry->d_inode;
+	struct sysfs_dirent *sd = dentry->d_fsdata;
+	int error;
+
+	if (!sd)
+		return -EINVAL;
+
+	mutex_lock(&sysfs_mutex);
+	error = inode_change_ok(inode, iattr);
+	if (error)
+		goto out;
+
+	error = __kernfs_setattr(sd, iattr);
+	if (error)
+		goto out;
+
+	/* this ignores size changes */
+	setattr_copy(inode, iattr);
+
+out:
+	mutex_unlock(&sysfs_mutex);
+	return error;
+}
+
+static int sysfs_sd_setsecdata(struct sysfs_dirent *sd, void **secdata,
+			       u32 *secdata_len)
+{
+	struct sysfs_inode_attrs *iattrs;
+	void *old_secdata;
+	size_t old_secdata_len;
+
+	if (!sd->s_iattr) {
+		sd->s_iattr = sysfs_init_inode_attrs(sd);
+		if (!sd->s_iattr)
+			return -ENOMEM;
+	}
+
+	iattrs = sd->s_iattr;
+	old_secdata = iattrs->ia_secdata;
+	old_secdata_len = iattrs->ia_secdata_len;
+
+	iattrs->ia_secdata = *secdata;
+	iattrs->ia_secdata_len = *secdata_len;
+
+	*secdata = old_secdata;
+	*secdata_len = old_secdata_len;
+	return 0;
+}
+
+int sysfs_setxattr(struct dentry *dentry, const char *name, const void *value,
+		size_t size, int flags)
+{
+	struct sysfs_dirent *sd = dentry->d_fsdata;
+	void *secdata;
+	int error;
+	u32 secdata_len = 0;
+
+	if (!sd)
+		return -EINVAL;
+
+	if (!strncmp(name, XATTR_SECURITY_PREFIX, XATTR_SECURITY_PREFIX_LEN)) {
+		const char *suffix = name + XATTR_SECURITY_PREFIX_LEN;
+		error = security_inode_setsecurity(dentry->d_inode, suffix,
+						value, size, flags);
+		if (error)
+			goto out;
+		error = security_inode_getsecctx(dentry->d_inode,
+						&secdata, &secdata_len);
+		if (error)
+			goto out;
+
+		mutex_lock(&sysfs_mutex);
+		error = sysfs_sd_setsecdata(sd, &secdata, &secdata_len);
+		mutex_unlock(&sysfs_mutex);
+
+		if (secdata)
+			security_release_secctx(secdata, secdata_len);
+	} else
+		return -EINVAL;
+out:
+	return error;
+}
+
+static inline void set_default_inode_attr(struct inode *inode, umode_t mode)
+{
+	inode->i_mode = mode;
+	inode->i_atime = inode->i_mtime = inode->i_ctime = CURRENT_TIME;
+}
+
+static inline void set_inode_attr(struct inode *inode, struct iattr *iattr)
+{
+	inode->i_uid = iattr->ia_uid;
+	inode->i_gid = iattr->ia_gid;
+	inode->i_atime = iattr->ia_atime;
+	inode->i_mtime = iattr->ia_mtime;
+	inode->i_ctime = iattr->ia_ctime;
+}
+
+static void sysfs_refresh_inode(struct sysfs_dirent *sd, struct inode *inode)
+{
+	struct sysfs_inode_attrs *iattrs = sd->s_iattr;
+
+	inode->i_mode = sd->s_mode;
+	if (iattrs) {
+		/* sysfs_dirent has non-default attributes
+		 * get them from persistent copy in sysfs_dirent
+		 */
+		set_inode_attr(inode, &iattrs->ia_iattr);
+		security_inode_notifysecctx(inode,
+					    iattrs->ia_secdata,
+					    iattrs->ia_secdata_len);
+	}
+
+	if (sysfs_type(sd) == SYSFS_DIR)
+		set_nlink(inode, sd->s_dir.subdirs + 2);
+}
+
+int sysfs_getattr(struct vfsmount *mnt, struct dentry *dentry,
+		  struct kstat *stat)
+{
+	struct sysfs_dirent *sd = dentry->d_fsdata;
+	struct inode *inode = dentry->d_inode;
+
+	mutex_lock(&sysfs_mutex);
+	sysfs_refresh_inode(sd, inode);
+	mutex_unlock(&sysfs_mutex);
+
+	generic_fillattr(inode, stat);
+	return 0;
+}
+
+static void sysfs_init_inode(struct sysfs_dirent *sd, struct inode *inode)
+{
+	kernfs_get(sd);
+	inode->i_private = sd;
+	inode->i_mapping->a_ops = &sysfs_aops;
+	inode->i_mapping->backing_dev_info = &sysfs_backing_dev_info;
+	inode->i_op = &sysfs_inode_operations;
+
+	set_default_inode_attr(inode, sd->s_mode);
+	sysfs_refresh_inode(sd, inode);
+
+	/* initialize inode according to type */
+	switch (sysfs_type(sd)) {
+	case SYSFS_DIR:
+		inode->i_op = &sysfs_dir_inode_operations;
+		inode->i_fop = &sysfs_dir_operations;
+		break;
+	case SYSFS_KOBJ_ATTR:
+		inode->i_size = sd->s_attr.size;
+		inode->i_fop = &kernfs_file_operations;
+		break;
+	case SYSFS_KOBJ_LINK:
+		inode->i_op = &sysfs_symlink_inode_operations;
+		break;
+	default:
+		BUG();
+	}
+
+	unlock_new_inode(inode);
+}
+
+/**
+ *	sysfs_get_inode - get inode for sysfs_dirent
+ *	@sb: super block
+ *	@sd: sysfs_dirent to allocate inode for
+ *
+ *	Get inode for @sd.  If such inode doesn't exist, a new inode
+ *	is allocated and basics are initialized.  New inode is
+ *	returned locked.
+ *
+ *	LOCKING:
+ *	Kernel thread context (may sleep).
+ *
+ *	RETURNS:
+ *	Pointer to allocated inode on success, NULL on failure.
+ */
+struct inode *sysfs_get_inode(struct super_block *sb, struct sysfs_dirent *sd)
+{
+	struct inode *inode;
+
+	inode = iget_locked(sb, sd->s_ino);
+	if (inode && (inode->i_state & I_NEW))
+		sysfs_init_inode(sd, inode);
+
+	return inode;
+}
+
+/*
+ * The sysfs_dirent serves as both an inode and a directory entry for sysfs.
+ * To prevent the sysfs inode numbers from being freed prematurely we take a
+ * reference to sysfs_dirent from the sysfs inode.  A
+ * super_operations.evict_inode() implementation is needed to drop that
+ * reference upon inode destruction.
+ */
+void sysfs_evict_inode(struct inode *inode)
+{
+	struct sysfs_dirent *sd  = inode->i_private;
+
+	truncate_inode_pages(&inode->i_data, 0);
+	clear_inode(inode);
+	kernfs_put(sd);
+}
+
+int sysfs_permission(struct inode *inode, int mask)
+{
+	struct sysfs_dirent *sd;
+
+	if (mask & MAY_NOT_BLOCK)
+		return -ECHILD;
+
+	sd = inode->i_private;
+
+	mutex_lock(&sysfs_mutex);
+	sysfs_refresh_inode(sd, inode);
+	mutex_unlock(&sysfs_mutex);
+
+	return generic_permission(inode, mask);
+}
diff --git a/fs/kernfs/kernfs-internal.h b/fs/kernfs/kernfs-internal.h
index 6bb8c12..5ea2564 100644
--- a/fs/kernfs/kernfs-internal.h
+++ b/fs/kernfs/kernfs-internal.h
@@ -112,4 +112,17 @@ struct sysfs_addrm_cxt {
 
 #include "../sysfs/sysfs.h"
 
+/*
+ * inode.c
+ */
+struct inode *sysfs_get_inode(struct super_block *sb, struct sysfs_dirent *sd);
+void sysfs_evict_inode(struct inode *inode);
+int sysfs_permission(struct inode *inode, int mask);
+int sysfs_setattr(struct dentry *dentry, struct iattr *iattr);
+int sysfs_getattr(struct vfsmount *mnt, struct dentry *dentry,
+		  struct kstat *stat);
+int sysfs_setxattr(struct dentry *dentry, const char *name, const void *value,
+		   size_t size, int flags);
+int sysfs_inode_init(void);
+
 #endif	/* __KERNFS_INTERNAL_H */
diff --git a/fs/sysfs/Makefile b/fs/sysfs/Makefile
index 8876ac1..6eff6e1 100644
--- a/fs/sysfs/Makefile
+++ b/fs/sysfs/Makefile
@@ -2,4 +2,4 @@
 # Makefile for the sysfs virtual filesystem
 #
 
-obj-y		:= inode.o file.o dir.o symlink.o mount.o group.o
+obj-y		:= file.o dir.o symlink.o mount.o group.o
diff --git a/fs/sysfs/inode.c b/fs/sysfs/inode.c
deleted file mode 100644
index bfe4478..0000000
--- a/fs/sysfs/inode.c
+++ /dev/null
@@ -1,342 +0,0 @@
-/*
- * fs/sysfs/inode.c - basic sysfs inode and dentry operations
- *
- * Copyright (c) 2001-3 Patrick Mochel
- * Copyright (c) 2007 SUSE Linux Products GmbH
- * Copyright (c) 2007 Tejun Heo <teheo@suse.de>
- *
- * This file is released under the GPLv2.
- *
- * Please see Documentation/filesystems/sysfs.txt for more information.
- */
-
-#undef DEBUG
-
-#include <linux/pagemap.h>
-#include <linux/namei.h>
-#include <linux/backing-dev.h>
-#include <linux/capability.h>
-#include <linux/errno.h>
-#include <linux/sched.h>
-#include <linux/slab.h>
-#include <linux/sysfs.h>
-#include <linux/xattr.h>
-#include <linux/security.h>
-#include "sysfs.h"
-
-static const struct address_space_operations sysfs_aops = {
-	.readpage	= simple_readpage,
-	.write_begin	= simple_write_begin,
-	.write_end	= simple_write_end,
-};
-
-static struct backing_dev_info sysfs_backing_dev_info = {
-	.name		= "sysfs",
-	.ra_pages	= 0,	/* No readahead */
-	.capabilities	= BDI_CAP_NO_ACCT_AND_WRITEBACK,
-};
-
-static const struct inode_operations sysfs_inode_operations = {
-	.permission	= sysfs_permission,
-	.setattr	= sysfs_setattr,
-	.getattr	= sysfs_getattr,
-	.setxattr	= sysfs_setxattr,
-};
-
-int __init sysfs_inode_init(void)
-{
-	return bdi_init(&sysfs_backing_dev_info);
-}
-
-static struct sysfs_inode_attrs *sysfs_init_inode_attrs(struct sysfs_dirent *sd)
-{
-	struct sysfs_inode_attrs *attrs;
-	struct iattr *iattrs;
-
-	attrs = kzalloc(sizeof(struct sysfs_inode_attrs), GFP_KERNEL);
-	if (!attrs)
-		return NULL;
-	iattrs = &attrs->ia_iattr;
-
-	/* assign default attributes */
-	iattrs->ia_mode = sd->s_mode;
-	iattrs->ia_uid = GLOBAL_ROOT_UID;
-	iattrs->ia_gid = GLOBAL_ROOT_GID;
-	iattrs->ia_atime = iattrs->ia_mtime = iattrs->ia_ctime = CURRENT_TIME;
-
-	return attrs;
-}
-
-static int __kernfs_setattr(struct sysfs_dirent *sd, const struct iattr *iattr)
-{
-	struct sysfs_inode_attrs *sd_attrs;
-	struct iattr *iattrs;
-	unsigned int ia_valid = iattr->ia_valid;
-
-	sd_attrs = sd->s_iattr;
-
-	if (!sd_attrs) {
-		/* setting attributes for the first time, allocate now */
-		sd_attrs = sysfs_init_inode_attrs(sd);
-		if (!sd_attrs)
-			return -ENOMEM;
-		sd->s_iattr = sd_attrs;
-	}
-	/* attributes were changed at least once in past */
-	iattrs = &sd_attrs->ia_iattr;
-
-	if (ia_valid & ATTR_UID)
-		iattrs->ia_uid = iattr->ia_uid;
-	if (ia_valid & ATTR_GID)
-		iattrs->ia_gid = iattr->ia_gid;
-	if (ia_valid & ATTR_ATIME)
-		iattrs->ia_atime = iattr->ia_atime;
-	if (ia_valid & ATTR_MTIME)
-		iattrs->ia_mtime = iattr->ia_mtime;
-	if (ia_valid & ATTR_CTIME)
-		iattrs->ia_ctime = iattr->ia_ctime;
-	if (ia_valid & ATTR_MODE) {
-		umode_t mode = iattr->ia_mode;
-		iattrs->ia_mode = sd->s_mode = mode;
-	}
-	return 0;
-}
-
-/**
- * kernfs_setattr - set iattr on a node
- * @sd: target node
- * @iattr: iattr to set
- *
- * Returns 0 on success, -errno on failure.
- */
-int kernfs_setattr(struct sysfs_dirent *sd, const struct iattr *iattr)
-{
-	int ret;
-
-	mutex_lock(&sysfs_mutex);
-	ret = __kernfs_setattr(sd, iattr);
-	mutex_unlock(&sysfs_mutex);
-	return ret;
-}
-
-int sysfs_setattr(struct dentry *dentry, struct iattr *iattr)
-{
-	struct inode *inode = dentry->d_inode;
-	struct sysfs_dirent *sd = dentry->d_fsdata;
-	int error;
-
-	if (!sd)
-		return -EINVAL;
-
-	mutex_lock(&sysfs_mutex);
-	error = inode_change_ok(inode, iattr);
-	if (error)
-		goto out;
-
-	error = __kernfs_setattr(sd, iattr);
-	if (error)
-		goto out;
-
-	/* this ignores size changes */
-	setattr_copy(inode, iattr);
-
-out:
-	mutex_unlock(&sysfs_mutex);
-	return error;
-}
-
-static int sysfs_sd_setsecdata(struct sysfs_dirent *sd, void **secdata,
-			       u32 *secdata_len)
-{
-	struct sysfs_inode_attrs *iattrs;
-	void *old_secdata;
-	size_t old_secdata_len;
-
-	if (!sd->s_iattr) {
-		sd->s_iattr = sysfs_init_inode_attrs(sd);
-		if (!sd->s_iattr)
-			return -ENOMEM;
-	}
-
-	iattrs = sd->s_iattr;
-	old_secdata = iattrs->ia_secdata;
-	old_secdata_len = iattrs->ia_secdata_len;
-
-	iattrs->ia_secdata = *secdata;
-	iattrs->ia_secdata_len = *secdata_len;
-
-	*secdata = old_secdata;
-	*secdata_len = old_secdata_len;
-	return 0;
-}
-
-int sysfs_setxattr(struct dentry *dentry, const char *name, const void *value,
-		size_t size, int flags)
-{
-	struct sysfs_dirent *sd = dentry->d_fsdata;
-	void *secdata;
-	int error;
-	u32 secdata_len = 0;
-
-	if (!sd)
-		return -EINVAL;
-
-	if (!strncmp(name, XATTR_SECURITY_PREFIX, XATTR_SECURITY_PREFIX_LEN)) {
-		const char *suffix = name + XATTR_SECURITY_PREFIX_LEN;
-		error = security_inode_setsecurity(dentry->d_inode, suffix,
-						value, size, flags);
-		if (error)
-			goto out;
-		error = security_inode_getsecctx(dentry->d_inode,
-						&secdata, &secdata_len);
-		if (error)
-			goto out;
-
-		mutex_lock(&sysfs_mutex);
-		error = sysfs_sd_setsecdata(sd, &secdata, &secdata_len);
-		mutex_unlock(&sysfs_mutex);
-
-		if (secdata)
-			security_release_secctx(secdata, secdata_len);
-	} else
-		return -EINVAL;
-out:
-	return error;
-}
-
-static inline void set_default_inode_attr(struct inode *inode, umode_t mode)
-{
-	inode->i_mode = mode;
-	inode->i_atime = inode->i_mtime = inode->i_ctime = CURRENT_TIME;
-}
-
-static inline void set_inode_attr(struct inode *inode, struct iattr *iattr)
-{
-	inode->i_uid = iattr->ia_uid;
-	inode->i_gid = iattr->ia_gid;
-	inode->i_atime = iattr->ia_atime;
-	inode->i_mtime = iattr->ia_mtime;
-	inode->i_ctime = iattr->ia_ctime;
-}
-
-static void sysfs_refresh_inode(struct sysfs_dirent *sd, struct inode *inode)
-{
-	struct sysfs_inode_attrs *iattrs = sd->s_iattr;
-
-	inode->i_mode = sd->s_mode;
-	if (iattrs) {
-		/* sysfs_dirent has non-default attributes
-		 * get them from persistent copy in sysfs_dirent
-		 */
-		set_inode_attr(inode, &iattrs->ia_iattr);
-		security_inode_notifysecctx(inode,
-					    iattrs->ia_secdata,
-					    iattrs->ia_secdata_len);
-	}
-
-	if (sysfs_type(sd) == SYSFS_DIR)
-		set_nlink(inode, sd->s_dir.subdirs + 2);
-}
-
-int sysfs_getattr(struct vfsmount *mnt, struct dentry *dentry,
-		  struct kstat *stat)
-{
-	struct sysfs_dirent *sd = dentry->d_fsdata;
-	struct inode *inode = dentry->d_inode;
-
-	mutex_lock(&sysfs_mutex);
-	sysfs_refresh_inode(sd, inode);
-	mutex_unlock(&sysfs_mutex);
-
-	generic_fillattr(inode, stat);
-	return 0;
-}
-
-static void sysfs_init_inode(struct sysfs_dirent *sd, struct inode *inode)
-{
-	kernfs_get(sd);
-	inode->i_private = sd;
-	inode->i_mapping->a_ops = &sysfs_aops;
-	inode->i_mapping->backing_dev_info = &sysfs_backing_dev_info;
-	inode->i_op = &sysfs_inode_operations;
-
-	set_default_inode_attr(inode, sd->s_mode);
-	sysfs_refresh_inode(sd, inode);
-
-	/* initialize inode according to type */
-	switch (sysfs_type(sd)) {
-	case SYSFS_DIR:
-		inode->i_op = &sysfs_dir_inode_operations;
-		inode->i_fop = &sysfs_dir_operations;
-		break;
-	case SYSFS_KOBJ_ATTR:
-		inode->i_size = sd->s_attr.size;
-		inode->i_fop = &kernfs_file_operations;
-		break;
-	case SYSFS_KOBJ_LINK:
-		inode->i_op = &sysfs_symlink_inode_operations;
-		break;
-	default:
-		BUG();
-	}
-
-	unlock_new_inode(inode);
-}
-
-/**
- *	sysfs_get_inode - get inode for sysfs_dirent
- *	@sb: super block
- *	@sd: sysfs_dirent to allocate inode for
- *
- *	Get inode for @sd.  If such inode doesn't exist, a new inode
- *	is allocated and basics are initialized.  New inode is
- *	returned locked.
- *
- *	LOCKING:
- *	Kernel thread context (may sleep).
- *
- *	RETURNS:
- *	Pointer to allocated inode on success, NULL on failure.
- */
-struct inode *sysfs_get_inode(struct super_block *sb, struct sysfs_dirent *sd)
-{
-	struct inode *inode;
-
-	inode = iget_locked(sb, sd->s_ino);
-	if (inode && (inode->i_state & I_NEW))
-		sysfs_init_inode(sd, inode);
-
-	return inode;
-}
-
-/*
- * The sysfs_dirent serves as both an inode and a directory entry for sysfs.
- * To prevent the sysfs inode numbers from being freed prematurely we take a
- * reference to sysfs_dirent from the sysfs inode.  A
- * super_operations.evict_inode() implementation is needed to drop that
- * reference upon inode destruction.
- */
-void sysfs_evict_inode(struct inode *inode)
-{
-	struct sysfs_dirent *sd  = inode->i_private;
-
-	truncate_inode_pages(&inode->i_data, 0);
-	clear_inode(inode);
-	kernfs_put(sd);
-}
-
-int sysfs_permission(struct inode *inode, int mask)
-{
-	struct sysfs_dirent *sd;
-
-	if (mask & MAY_NOT_BLOCK)
-		return -ECHILD;
-
-	sd = inode->i_private;
-
-	mutex_lock(&sysfs_mutex);
-	sysfs_refresh_inode(sd, inode);
-	mutex_unlock(&sysfs_mutex);
-
-	return generic_permission(inode, mask);
-}
diff --git a/fs/sysfs/sysfs.h b/fs/sysfs/sysfs.h
index 00538ac..4c79173 100644
--- a/fs/sysfs/sysfs.h
+++ b/fs/sysfs/sysfs.h
@@ -52,19 +52,6 @@ void sysfs_addrm_finish(struct sysfs_addrm_cxt *acxt);
 struct sysfs_dirent *sysfs_new_dirent(const char *name, umode_t mode, int type);
 
 /*
- * inode.c
- */
-struct inode *sysfs_get_inode(struct super_block *sb, struct sysfs_dirent *sd);
-void sysfs_evict_inode(struct inode *inode);
-int sysfs_permission(struct inode *inode, int mask);
-int sysfs_setattr(struct dentry *dentry, struct iattr *iattr);
-int sysfs_getattr(struct vfsmount *mnt, struct dentry *dentry,
-		  struct kstat *stat);
-int sysfs_setxattr(struct dentry *dentry, const char *name, const void *value,
-		   size_t size, int flags);
-int sysfs_inode_init(void);
-
-/*
  * file.c
  */
 extern const struct file_operations kernfs_file_operations;
-- 
1.8.3.1


^ permalink raw reply related	[flat|nested] 46+ messages in thread

* [PATCH 32/34] sysfs, kernfs: move dir core code to fs/kernfs/dir.c
  2013-10-24 15:49 Subject: [PATCHSET driver-core-next] sysfs: separate out kernfs, part #1 Tejun Heo
                   ` (30 preceding siblings ...)
  2013-10-24 15:49 ` [PATCH 31/34] sysfs, kernfs: move inode code to fs/kernfs/inode.c Tejun Heo
@ 2013-10-24 15:49 ` Tejun Heo
  2013-10-24 15:49 ` [PATCH 33/34] sysfs, kernfs: move file core code to fs/kernfs/file.c Tejun Heo
                   ` (3 subsequent siblings)
  35 siblings, 0 replies; 46+ messages in thread
From: Tejun Heo @ 2013-10-24 15:49 UTC (permalink / raw)
  To: gregkh; +Cc: kay, linux-kernel, ebiederm, bhelgaas, Tejun Heo

Move core dir code to fs/kernfs/dir.c.  fs/sysfs/dir.c now only
contains sysfs_warn_dup() and sysfs wrappers around kernfs interfaces.
The respective declarations in fs/sysfs/sysfs.h are moved to
fs/kernfs/kernfs-internal.h.

This is pure relocation.

Signed-off-by: Tejun Heo <tj@kernel.org>
---
 fs/kernfs/dir.c             | 969 ++++++++++++++++++++++++++++++++++++++++++++
 fs/kernfs/kernfs-internal.h |  17 +
 fs/sysfs/dir.c              | 969 --------------------------------------------
 fs/sysfs/sysfs.h            |  13 -
 4 files changed, 986 insertions(+), 982 deletions(-)

diff --git a/fs/kernfs/dir.c b/fs/kernfs/dir.c
index 1061602..2ab6ace 100644
--- a/fs/kernfs/dir.c
+++ b/fs/kernfs/dir.c
@@ -7,3 +7,972 @@
  *
  * This file is released under the GPLv2.
  */
+
+#include <linux/fs.h>
+#include <linux/namei.h>
+#include <linux/idr.h>
+#include <linux/slab.h>
+#include <linux/security.h>
+#include <linux/hash.h>
+
+#include "kernfs-internal.h"
+
+DEFINE_MUTEX(sysfs_mutex);
+DEFINE_SPINLOCK(sysfs_assoc_lock);
+
+#define to_sysfs_dirent(X) rb_entry((X), struct sysfs_dirent, s_rb)
+
+static DEFINE_SPINLOCK(sysfs_ino_lock);
+static DEFINE_IDA(sysfs_ino_ida);
+
+/**
+ *	sysfs_name_hash
+ *	@name: Null terminated string to hash
+ *	@ns:   Namespace tag to hash
+ *
+ *	Returns 31 bit hash of ns + name (so it fits in an off_t )
+ */
+static unsigned int sysfs_name_hash(const char *name, const void *ns)
+{
+	unsigned long hash = init_name_hash();
+	unsigned int len = strlen(name);
+	while (len--)
+		hash = partial_name_hash(*name++, hash);
+	hash = (end_name_hash(hash) ^ hash_ptr((void *)ns, 31));
+	hash &= 0x7fffffffU;
+	/* Reserve hash numbers 0, 1 and INT_MAX for magic directory entries */
+	if (hash < 1)
+		hash += 2;
+	if (hash >= INT_MAX)
+		hash = INT_MAX - 1;
+	return hash;
+}
+
+static int sysfs_name_compare(unsigned int hash, const char *name,
+			      const void *ns, const struct sysfs_dirent *sd)
+{
+	if (hash != sd->s_hash)
+		return hash - sd->s_hash;
+	if (ns != sd->s_ns)
+		return ns - sd->s_ns;
+	return strcmp(name, sd->s_name);
+}
+
+static int sysfs_sd_compare(const struct sysfs_dirent *left,
+			    const struct sysfs_dirent *right)
+{
+	return sysfs_name_compare(left->s_hash, left->s_name, left->s_ns,
+				  right);
+}
+
+/**
+ *	sysfs_link_sibling - link sysfs_dirent into sibling rbtree
+ *	@sd: sysfs_dirent of interest
+ *
+ *	Link @sd into its sibling rbtree which starts from
+ *	sd->s_parent->s_dir.children.
+ *
+ *	Locking:
+ *	mutex_lock(sysfs_mutex)
+ *
+ *	RETURNS:
+ *	0 on susccess -EEXIST on failure.
+ */
+static int sysfs_link_sibling(struct sysfs_dirent *sd)
+{
+	struct rb_node **node = &sd->s_parent->s_dir.children.rb_node;
+	struct rb_node *parent = NULL;
+
+	if (sysfs_type(sd) == SYSFS_DIR)
+		sd->s_parent->s_dir.subdirs++;
+
+	while (*node) {
+		struct sysfs_dirent *pos;
+		int result;
+
+		pos = to_sysfs_dirent(*node);
+		parent = *node;
+		result = sysfs_sd_compare(sd, pos);
+		if (result < 0)
+			node = &pos->s_rb.rb_left;
+		else if (result > 0)
+			node = &pos->s_rb.rb_right;
+		else
+			return -EEXIST;
+	}
+	/* add new node and rebalance the tree */
+	rb_link_node(&sd->s_rb, parent, node);
+	rb_insert_color(&sd->s_rb, &sd->s_parent->s_dir.children);
+
+	/* if @sd has ns tag, mark the parent to enable ns filtering */
+	if (sd->s_ns)
+		sd->s_parent->s_flags |= SYSFS_FLAG_HAS_NS;
+
+	return 0;
+}
+
+/**
+ *	sysfs_unlink_sibling - unlink sysfs_dirent from sibling rbtree
+ *	@sd: sysfs_dirent of interest
+ *
+ *	Unlink @sd from its sibling rbtree which starts from
+ *	sd->s_parent->s_dir.children.
+ *
+ *	Locking:
+ *	mutex_lock(sysfs_mutex)
+ */
+static void sysfs_unlink_sibling(struct sysfs_dirent *sd)
+{
+	if (sysfs_type(sd) == SYSFS_DIR)
+		sd->s_parent->s_dir.subdirs--;
+
+	rb_erase(&sd->s_rb, &sd->s_parent->s_dir.children);
+
+	/*
+	 * Either all or none of the children have tags.  Clearing HAS_NS
+	 * when there's no child left is enough to keep the flag synced.
+	 */
+	if (RB_EMPTY_ROOT(&sd->s_parent->s_dir.children))
+		sd->s_parent->s_flags &= ~SYSFS_FLAG_HAS_NS;
+}
+
+/**
+ *	sysfs_get_active - get an active reference to sysfs_dirent
+ *	@sd: sysfs_dirent to get an active reference to
+ *
+ *	Get an active reference of @sd.  This function is noop if @sd
+ *	is NULL.
+ *
+ *	RETURNS:
+ *	Pointer to @sd on success, NULL on failure.
+ */
+struct sysfs_dirent *sysfs_get_active(struct sysfs_dirent *sd)
+{
+	if (unlikely(!sd))
+		return NULL;
+
+	if (!atomic_inc_unless_negative(&sd->s_active))
+		return NULL;
+
+	if (sd->s_flags & SYSFS_FLAG_LOCKDEP)
+		rwsem_acquire_read(&sd->dep_map, 0, 1, _RET_IP_);
+	return sd;
+}
+
+/**
+ *	sysfs_put_active - put an active reference to sysfs_dirent
+ *	@sd: sysfs_dirent to put an active reference to
+ *
+ *	Put an active reference to @sd.  This function is noop if @sd
+ *	is NULL.
+ */
+void sysfs_put_active(struct sysfs_dirent *sd)
+{
+	int v;
+
+	if (unlikely(!sd))
+		return;
+
+	if (sd->s_flags & SYSFS_FLAG_LOCKDEP)
+		rwsem_release(&sd->dep_map, 1, _RET_IP_);
+	v = atomic_dec_return(&sd->s_active);
+	if (likely(v != SD_DEACTIVATED_BIAS))
+		return;
+
+	/* atomic_dec_return() is a mb(), we'll always see the updated
+	 * sd->u.completion.
+	 */
+	complete(sd->u.completion);
+}
+
+/**
+ *	sysfs_deactivate - deactivate sysfs_dirent
+ *	@sd: sysfs_dirent to deactivate
+ *
+ *	Deny new active references and drain existing ones.
+ */
+static void sysfs_deactivate(struct sysfs_dirent *sd)
+{
+	DECLARE_COMPLETION_ONSTACK(wait);
+	int v;
+
+	BUG_ON(!(sd->s_flags & SYSFS_FLAG_REMOVED));
+
+	if (!(sysfs_type(sd) & SYSFS_ACTIVE_REF))
+		return;
+
+	sd->u.completion = (void *)&wait;
+
+	rwsem_acquire(&sd->dep_map, 0, 0, _RET_IP_);
+	/* atomic_add_return() is a mb(), put_active() will always see
+	 * the updated sd->u.completion.
+	 */
+	v = atomic_add_return(SD_DEACTIVATED_BIAS, &sd->s_active);
+
+	if (v != SD_DEACTIVATED_BIAS) {
+		lock_contended(&sd->dep_map, _RET_IP_);
+		wait_for_completion(&wait);
+	}
+
+	lock_acquired(&sd->dep_map, _RET_IP_);
+	rwsem_release(&sd->dep_map, 1, _RET_IP_);
+}
+
+static int sysfs_alloc_ino(unsigned int *pino)
+{
+	int ino, rc;
+
+ retry:
+	spin_lock(&sysfs_ino_lock);
+	rc = ida_get_new_above(&sysfs_ino_ida, 2, &ino);
+	spin_unlock(&sysfs_ino_lock);
+
+	if (rc == -EAGAIN) {
+		if (ida_pre_get(&sysfs_ino_ida, GFP_KERNEL))
+			goto retry;
+		rc = -ENOMEM;
+	}
+
+	*pino = ino;
+	return rc;
+}
+
+static void sysfs_free_ino(unsigned int ino)
+{
+	spin_lock(&sysfs_ino_lock);
+	ida_remove(&sysfs_ino_ida, ino);
+	spin_unlock(&sysfs_ino_lock);
+}
+
+/**
+ * kernfs_get - get a reference count on a sysfs_dirent
+ * @sd: the target sysfs_dirent
+ */
+void kernfs_get(struct sysfs_dirent *sd)
+{
+	if (sd) {
+		WARN_ON(!atomic_read(&sd->s_count));
+		atomic_inc(&sd->s_count);
+	}
+}
+EXPORT_SYMBOL_GPL(kernfs_get);
+
+/**
+ * kernfs_put - put a reference count on a sysfs_dirent
+ * @sd: the target sysfs_dirent
+ *
+ * Put a reference count of @sd and destroy it if it reached zero.
+ */
+void kernfs_put(struct sysfs_dirent *sd)
+{
+	struct sysfs_dirent *parent_sd;
+
+	if (!sd || !atomic_dec_and_test(&sd->s_count))
+		return;
+ repeat:
+	/* Moving/renaming is always done while holding reference.
+	 * sd->s_parent won't change beneath us.
+	 */
+	parent_sd = sd->s_parent;
+
+	WARN(!(sd->s_flags & SYSFS_FLAG_REMOVED),
+		"sysfs: free using entry: %s/%s\n",
+		parent_sd ? parent_sd->s_name : "", sd->s_name);
+
+	if (sysfs_type(sd) == SYSFS_KOBJ_LINK)
+		kernfs_put(sd->s_symlink.target_sd);
+	if (sysfs_type(sd) & SYSFS_COPY_NAME)
+		kfree(sd->s_name);
+	if (sd->s_iattr && sd->s_iattr->ia_secdata)
+		security_release_secctx(sd->s_iattr->ia_secdata,
+					sd->s_iattr->ia_secdata_len);
+	kfree(sd->s_iattr);
+	sysfs_free_ino(sd->s_ino);
+	kmem_cache_free(sysfs_dir_cachep, sd);
+
+	sd = parent_sd;
+	if (sd && atomic_dec_and_test(&sd->s_count))
+		goto repeat;
+}
+EXPORT_SYMBOL_GPL(kernfs_put);
+
+static int sysfs_dentry_delete(const struct dentry *dentry)
+{
+	struct sysfs_dirent *sd = dentry->d_fsdata;
+	return !(sd && !(sd->s_flags & SYSFS_FLAG_REMOVED));
+}
+
+static int sysfs_dentry_revalidate(struct dentry *dentry, unsigned int flags)
+{
+	struct sysfs_dirent *sd;
+
+	if (flags & LOOKUP_RCU)
+		return -ECHILD;
+
+	sd = dentry->d_fsdata;
+	mutex_lock(&sysfs_mutex);
+
+	/* The sysfs dirent has been deleted */
+	if (sd->s_flags & SYSFS_FLAG_REMOVED)
+		goto out_bad;
+
+	/* The sysfs dirent has been moved? */
+	if (dentry->d_parent->d_fsdata != sd->s_parent)
+		goto out_bad;
+
+	/* The sysfs dirent has been renamed */
+	if (strcmp(dentry->d_name.name, sd->s_name) != 0)
+		goto out_bad;
+
+	/* The sysfs dirent has been moved to a different namespace */
+	if (sd->s_ns && sd->s_ns != sysfs_info(dentry->d_sb)->ns)
+		goto out_bad;
+
+	mutex_unlock(&sysfs_mutex);
+out_valid:
+	return 1;
+out_bad:
+	/* Remove the dentry from the dcache hashes.
+	 * If this is a deleted dentry we use d_drop instead of d_delete
+	 * so sysfs doesn't need to cope with negative dentries.
+	 *
+	 * If this is a dentry that has simply been renamed we
+	 * use d_drop to remove it from the dcache lookup on its
+	 * old parent.  If this dentry persists later when a lookup
+	 * is performed at its new name the dentry will be readded
+	 * to the dcache hashes.
+	 */
+	mutex_unlock(&sysfs_mutex);
+
+	/* If we have submounts we must allow the vfs caches
+	 * to lie about the state of the filesystem to prevent
+	 * leaks and other nasty things.
+	 */
+	if (check_submounts_and_drop(dentry) != 0)
+		goto out_valid;
+
+	return 0;
+}
+
+static void sysfs_dentry_release(struct dentry *dentry)
+{
+	kernfs_put(dentry->d_fsdata);
+}
+
+const struct dentry_operations sysfs_dentry_ops = {
+	.d_revalidate	= sysfs_dentry_revalidate,
+	.d_delete	= sysfs_dentry_delete,
+	.d_release	= sysfs_dentry_release,
+};
+
+struct sysfs_dirent *sysfs_new_dirent(const char *name, umode_t mode, int type)
+{
+	char *dup_name = NULL;
+	struct sysfs_dirent *sd;
+
+	if (type & SYSFS_COPY_NAME) {
+		name = dup_name = kstrdup(name, GFP_KERNEL);
+		if (!name)
+			return NULL;
+	}
+
+	sd = kmem_cache_zalloc(sysfs_dir_cachep, GFP_KERNEL);
+	if (!sd)
+		goto err_out1;
+
+	if (sysfs_alloc_ino(&sd->s_ino))
+		goto err_out2;
+
+	atomic_set(&sd->s_count, 1);
+	atomic_set(&sd->s_active, 0);
+
+	sd->s_name = name;
+	sd->s_mode = mode;
+	sd->s_flags = type | SYSFS_FLAG_REMOVED;
+
+	return sd;
+
+ err_out2:
+	kmem_cache_free(sysfs_dir_cachep, sd);
+ err_out1:
+	kfree(dup_name);
+	return NULL;
+}
+
+/**
+ *	sysfs_addrm_start - prepare for sysfs_dirent add/remove
+ *	@acxt: pointer to sysfs_addrm_cxt to be used
+ *
+ *	This function is called when the caller is about to add or remove
+ *	sysfs_dirent.  This function acquires sysfs_mutex.  @acxt is used
+ *	to keep and pass context to other addrm functions.
+ *
+ *	LOCKING:
+ *	Kernel thread context (may sleep).  sysfs_mutex is locked on
+ *	return.
+ */
+void sysfs_addrm_start(struct sysfs_addrm_cxt *acxt)
+	__acquires(sysfs_mutex)
+{
+	memset(acxt, 0, sizeof(*acxt));
+
+	mutex_lock(&sysfs_mutex);
+}
+
+/**
+ *	sysfs_add_one - add sysfs_dirent to parent without warning
+ *	@acxt: addrm context to use
+ *	@sd: sysfs_dirent to be added
+ *	@parent_sd: the parent sysfs_dirent to add @sd to
+ *
+ *	Get @parent_sd and set @sd->s_parent to it and increment nlink of
+ *	the parent inode if @sd is a directory and link into the children
+ *	list of the parent.
+ *
+ *	This function should be called between calls to
+ *	sysfs_addrm_start() and sysfs_addrm_finish() and should be
+ *	passed the same @acxt as passed to sysfs_addrm_start().
+ *
+ *	LOCKING:
+ *	Determined by sysfs_addrm_start().
+ *
+ *	RETURNS:
+ *	0 on success, -EEXIST if entry with the given name already
+ *	exists.
+ */
+int sysfs_add_one(struct sysfs_addrm_cxt *acxt, struct sysfs_dirent *sd,
+		  struct sysfs_dirent *parent_sd)
+{
+	struct sysfs_inode_attrs *ps_iattr;
+	int ret;
+
+	if (sysfs_type(parent_sd) != SYSFS_DIR)
+		return -EINVAL;
+
+	sd->s_hash = sysfs_name_hash(sd->s_name, sd->s_ns);
+	sd->s_parent = parent_sd;
+	kernfs_get(parent_sd);
+
+	ret = sysfs_link_sibling(sd);
+	if (ret)
+		return ret;
+
+	/* Update timestamps on the parent */
+	ps_iattr = parent_sd->s_iattr;
+	if (ps_iattr) {
+		struct iattr *ps_iattrs = &ps_iattr->ia_iattr;
+		ps_iattrs->ia_ctime = ps_iattrs->ia_mtime = CURRENT_TIME;
+	}
+
+	/* Mark the entry added into directory tree */
+	sd->s_flags &= ~SYSFS_FLAG_REMOVED;
+
+	return 0;
+}
+
+/**
+ *	sysfs_remove_one - remove sysfs_dirent from parent
+ *	@acxt: addrm context to use
+ *	@sd: sysfs_dirent to be removed
+ *
+ *	Mark @sd removed and drop nlink of parent inode if @sd is a
+ *	directory.  @sd is unlinked from the children list.
+ *
+ *	This function should be called between calls to
+ *	sysfs_addrm_start() and sysfs_addrm_finish() and should be
+ *	passed the same @acxt as passed to sysfs_addrm_start().
+ *
+ *	LOCKING:
+ *	Determined by sysfs_addrm_start().
+ */
+static void sysfs_remove_one(struct sysfs_addrm_cxt *acxt,
+			     struct sysfs_dirent *sd)
+{
+	struct sysfs_inode_attrs *ps_iattr;
+
+	/*
+	 * Removal can be called multiple times on the same node.  Only the
+	 * first invocation is effective and puts the base ref.
+	 */
+	if (sd->s_flags & SYSFS_FLAG_REMOVED)
+		return;
+
+	sysfs_unlink_sibling(sd);
+
+	/* Update timestamps on the parent */
+	ps_iattr = sd->s_parent->s_iattr;
+	if (ps_iattr) {
+		struct iattr *ps_iattrs = &ps_iattr->ia_iattr;
+		ps_iattrs->ia_ctime = ps_iattrs->ia_mtime = CURRENT_TIME;
+	}
+
+	sd->s_flags |= SYSFS_FLAG_REMOVED;
+	sd->u.removed_list = acxt->removed;
+	acxt->removed = sd;
+}
+
+/**
+ *	sysfs_addrm_finish - finish up sysfs_dirent add/remove
+ *	@acxt: addrm context to finish up
+ *
+ *	Finish up sysfs_dirent add/remove.  Resources acquired by
+ *	sysfs_addrm_start() are released and removed sysfs_dirents are
+ *	cleaned up.
+ *
+ *	LOCKING:
+ *	sysfs_mutex is released.
+ */
+void sysfs_addrm_finish(struct sysfs_addrm_cxt *acxt)
+	__releases(sysfs_mutex)
+{
+	/* release resources acquired by sysfs_addrm_start() */
+	mutex_unlock(&sysfs_mutex);
+
+	/* kill removed sysfs_dirents */
+	while (acxt->removed) {
+		struct sysfs_dirent *sd = acxt->removed;
+
+		acxt->removed = sd->u.removed_list;
+
+		sysfs_deactivate(sd);
+		sysfs_unmap_bin_file(sd);
+		kernfs_put(sd);
+	}
+}
+
+/**
+ * kernfs_find_ns - find sysfs_dirent with the given name
+ * @parent: sysfs_dirent to search under
+ * @name: name to look for
+ * @ns: the namespace tag to use
+ *
+ * Look for sysfs_dirent with name @name under @parent.  Returns pointer to
+ * the found sysfs_dirent on success, %NULL on failure.
+ */
+static struct sysfs_dirent *kernfs_find_ns(struct sysfs_dirent *parent,
+					   const unsigned char *name,
+					   const void *ns)
+{
+	struct rb_node *node = parent->s_dir.children.rb_node;
+	unsigned int hash;
+
+	lockdep_assert_held(&sysfs_mutex);
+
+	hash = sysfs_name_hash(name, ns);
+	while (node) {
+		struct sysfs_dirent *sd;
+		int result;
+
+		sd = to_sysfs_dirent(node);
+		result = sysfs_name_compare(hash, name, ns, sd);
+		if (result < 0)
+			node = node->rb_left;
+		else if (result > 0)
+			node = node->rb_right;
+		else
+			return sd;
+	}
+	return NULL;
+}
+
+/**
+ * kernfs_find_and_get_ns - find and get sysfs_dirent with the given name
+ * @parent: sysfs_dirent to search under
+ * @name: name to look for
+ * @ns: the namespace tag to use
+ *
+ * Look for sysfs_dirent with name @name under @parent and get a reference
+ * if found.  This function may sleep and returns pointer to the found
+ * sysfs_dirent on success, %NULL on failure.
+ */
+struct sysfs_dirent *kernfs_find_and_get_ns(struct sysfs_dirent *parent,
+					    const char *name, const void *ns)
+{
+	struct sysfs_dirent *sd;
+
+	mutex_lock(&sysfs_mutex);
+	sd = kernfs_find_ns(parent, name, ns);
+	kernfs_get(sd);
+	mutex_unlock(&sysfs_mutex);
+
+	return sd;
+}
+EXPORT_SYMBOL_GPL(kernfs_find_and_get_ns);
+
+/**
+ * kernfs_create_dir_ns - create a directory
+ * @parent: parent in which to create a new directory
+ * @name: name of the new directory
+ * @priv: opaque data associated with the new directory
+ * @ns: optional namespace tag of the directory
+ *
+ * Returns the created node on success, ERR_PTR() value on failure.
+ */
+struct sysfs_dirent *kernfs_create_dir_ns(struct sysfs_dirent *parent,
+					  const char *name, void *priv,
+					  const void *ns)
+{
+	umode_t mode = S_IFDIR | S_IRWXU | S_IRUGO | S_IXUGO;
+	struct sysfs_addrm_cxt acxt;
+	struct sysfs_dirent *sd;
+	int rc;
+
+	/* allocate */
+	sd = sysfs_new_dirent(name, mode, SYSFS_DIR);
+	if (!sd)
+		return ERR_PTR(-ENOMEM);
+
+	sd->s_ns = ns;
+	sd->priv = priv;
+
+	/* link in */
+	sysfs_addrm_start(&acxt);
+	rc = sysfs_add_one(&acxt, sd, parent);
+	sysfs_addrm_finish(&acxt);
+
+	if (!rc)
+		return sd;
+
+	kernfs_put(sd);
+	return ERR_PTR(rc);
+}
+
+static struct dentry *sysfs_lookup(struct inode *dir, struct dentry *dentry,
+				   unsigned int flags)
+{
+	struct dentry *ret = NULL;
+	struct dentry *parent = dentry->d_parent;
+	struct sysfs_dirent *parent_sd = parent->d_fsdata;
+	struct sysfs_dirent *sd;
+	struct inode *inode;
+	const void *ns = NULL;
+
+	mutex_lock(&sysfs_mutex);
+
+	if (parent_sd->s_flags & SYSFS_FLAG_HAS_NS)
+		ns = sysfs_info(dir->i_sb)->ns;
+
+	sd = kernfs_find_ns(parent_sd, dentry->d_name.name, ns);
+
+	/* no such entry */
+	if (!sd) {
+		ret = ERR_PTR(-ENOENT);
+		goto out_unlock;
+	}
+	kernfs_get(sd);
+	dentry->d_fsdata = sd;
+
+	/* attach dentry and inode */
+	inode = sysfs_get_inode(dir->i_sb, sd);
+	if (!inode) {
+		ret = ERR_PTR(-ENOMEM);
+		goto out_unlock;
+	}
+
+	/* instantiate and hash dentry */
+	ret = d_materialise_unique(dentry, inode);
+ out_unlock:
+	mutex_unlock(&sysfs_mutex);
+	return ret;
+}
+
+const struct inode_operations sysfs_dir_inode_operations = {
+	.lookup		= sysfs_lookup,
+	.permission	= sysfs_permission,
+	.setattr	= sysfs_setattr,
+	.getattr	= sysfs_getattr,
+	.setxattr	= sysfs_setxattr,
+};
+
+static struct sysfs_dirent *sysfs_leftmost_descendant(struct sysfs_dirent *pos)
+{
+	struct sysfs_dirent *last;
+
+	while (true) {
+		struct rb_node *rbn;
+
+		last = pos;
+
+		if (sysfs_type(pos) != SYSFS_DIR)
+			break;
+
+		rbn = rb_first(&pos->s_dir.children);
+		if (!rbn)
+			break;
+
+		pos = to_sysfs_dirent(rbn);
+	}
+
+	return last;
+}
+
+/**
+ * sysfs_next_descendant_post - find the next descendant for post-order walk
+ * @pos: the current position (%NULL to initiate traversal)
+ * @root: sysfs_dirent whose descendants to walk
+ *
+ * Find the next descendant to visit for post-order traversal of @root's
+ * descendants.  @root is included in the iteration and the last node to be
+ * visited.
+ */
+static struct sysfs_dirent *sysfs_next_descendant_post(struct sysfs_dirent *pos,
+						       struct sysfs_dirent *root)
+{
+	struct rb_node *rbn;
+
+	lockdep_assert_held(&sysfs_mutex);
+
+	/* if first iteration, visit leftmost descendant which may be root */
+	if (!pos)
+		return sysfs_leftmost_descendant(root);
+
+	/* if we visited @root, we're done */
+	if (pos == root)
+		return NULL;
+
+	/* if there's an unvisited sibling, visit its leftmost descendant */
+	rbn = rb_next(&pos->s_rb);
+	if (rbn)
+		return sysfs_leftmost_descendant(to_sysfs_dirent(rbn));
+
+	/* no sibling left, visit parent */
+	return pos->s_parent;
+}
+
+static void __kernfs_remove(struct sysfs_addrm_cxt *acxt,
+			    struct sysfs_dirent *sd)
+{
+	struct sysfs_dirent *pos, *next;
+
+	if (!sd)
+		return;
+
+	pr_debug("sysfs %s: removing\n", sd->s_name);
+
+	next = NULL;
+	do {
+		pos = next;
+		next = sysfs_next_descendant_post(pos, sd);
+		if (pos)
+			sysfs_remove_one(acxt, pos);
+	} while (next);
+}
+
+/**
+ * kernfs_remove - remove a sysfs_dirent recursively
+ * @sd: the sysfs_dirent to remove
+ *
+ * Remove @sd along with all its subdirectories and files.
+ */
+void kernfs_remove(struct sysfs_dirent *sd)
+{
+	struct sysfs_addrm_cxt acxt;
+
+	sysfs_addrm_start(&acxt);
+	__kernfs_remove(&acxt, sd);
+	sysfs_addrm_finish(&acxt);
+}
+
+/**
+ * kernfs_remove_by_name_ns - find a sysfs_dirent by name and remove it
+ * @dir_sd: parent of the target
+ * @name: name of the sysfs_dirent to remove
+ * @ns: namespace tag of the sysfs_dirent to remove
+ *
+ * Look for the sysfs_dirent with @name and @ns under @dir_sd and remove
+ * it.  Returns 0 on success, -ENOENT if such entry doesn't exist.
+ */
+int kernfs_remove_by_name_ns(struct sysfs_dirent *dir_sd, const char *name,
+			     const void *ns)
+{
+	struct sysfs_addrm_cxt acxt;
+	struct sysfs_dirent *sd;
+
+	if (!dir_sd) {
+		WARN(1, KERN_WARNING "sysfs: can not remove '%s', no directory\n",
+			name);
+		return -ENOENT;
+	}
+
+	sysfs_addrm_start(&acxt);
+
+	sd = kernfs_find_ns(dir_sd, name, ns);
+	if (sd)
+		__kernfs_remove(&acxt, sd);
+
+	sysfs_addrm_finish(&acxt);
+
+	if (sd)
+		return 0;
+	else
+		return -ENOENT;
+}
+
+/**
+ * kernfs_rename_ns - move and rename a kernfs_node
+ * @sd: target node
+ * @new_parent: new parent to put @sd under
+ * @new_name: new name
+ * @new_ns: new namespace tag
+ */
+int kernfs_rename_ns(struct sysfs_dirent *sd, struct sysfs_dirent *new_parent,
+		     const char *new_name, const void *new_ns)
+{
+	int error;
+
+	mutex_lock(&sysfs_mutex);
+
+	error = 0;
+	if ((sd->s_parent == new_parent) && (sd->s_ns == new_ns) &&
+	    (strcmp(sd->s_name, new_name) == 0))
+		goto out;	/* nothing to rename */
+
+	error = -EEXIST;
+	if (kernfs_find_ns(new_parent, new_name, new_ns))
+		goto out;
+
+	/* rename sysfs_dirent */
+	if (strcmp(sd->s_name, new_name) != 0) {
+		error = -ENOMEM;
+		new_name = kstrdup(new_name, GFP_KERNEL);
+		if (!new_name)
+			goto out;
+
+		kfree(sd->s_name);
+		sd->s_name = new_name;
+	}
+
+	/*
+	 * Move to the appropriate place in the appropriate directories rbtree.
+	 */
+	sysfs_unlink_sibling(sd);
+	kernfs_get(new_parent);
+	kernfs_put(sd->s_parent);
+	sd->s_ns = new_ns;
+	sd->s_hash = sysfs_name_hash(sd->s_name, sd->s_ns);
+	sd->s_parent = new_parent;
+	sysfs_link_sibling(sd);
+
+	error = 0;
+ out:
+	mutex_unlock(&sysfs_mutex);
+	return error;
+}
+
+/* Relationship between s_mode and the DT_xxx types */
+static inline unsigned char dt_type(struct sysfs_dirent *sd)
+{
+	return (sd->s_mode >> 12) & 15;
+}
+
+static int sysfs_dir_release(struct inode *inode, struct file *filp)
+{
+	kernfs_put(filp->private_data);
+	return 0;
+}
+
+static struct sysfs_dirent *sysfs_dir_pos(const void *ns,
+	struct sysfs_dirent *parent_sd,	loff_t hash, struct sysfs_dirent *pos)
+{
+	if (pos) {
+		int valid = !(pos->s_flags & SYSFS_FLAG_REMOVED) &&
+			pos->s_parent == parent_sd &&
+			hash == pos->s_hash;
+		kernfs_put(pos);
+		if (!valid)
+			pos = NULL;
+	}
+	if (!pos && (hash > 1) && (hash < INT_MAX)) {
+		struct rb_node *node = parent_sd->s_dir.children.rb_node;
+		while (node) {
+			pos = to_sysfs_dirent(node);
+
+			if (hash < pos->s_hash)
+				node = node->rb_left;
+			else if (hash > pos->s_hash)
+				node = node->rb_right;
+			else
+				break;
+		}
+	}
+	/* Skip over entries in the wrong namespace */
+	while (pos && pos->s_ns != ns) {
+		struct rb_node *node = rb_next(&pos->s_rb);
+		if (!node)
+			pos = NULL;
+		else
+			pos = to_sysfs_dirent(node);
+	}
+	return pos;
+}
+
+static struct sysfs_dirent *sysfs_dir_next_pos(const void *ns,
+	struct sysfs_dirent *parent_sd,	ino_t ino, struct sysfs_dirent *pos)
+{
+	pos = sysfs_dir_pos(ns, parent_sd, ino, pos);
+	if (pos)
+		do {
+			struct rb_node *node = rb_next(&pos->s_rb);
+			if (!node)
+				pos = NULL;
+			else
+				pos = to_sysfs_dirent(node);
+		} while (pos && pos->s_ns != ns);
+	return pos;
+}
+
+static int sysfs_readdir(struct file *file, struct dir_context *ctx)
+{
+	struct dentry *dentry = file->f_path.dentry;
+	struct sysfs_dirent *parent_sd = dentry->d_fsdata;
+	struct sysfs_dirent *pos = file->private_data;
+	const void *ns = NULL;
+
+	if (!dir_emit_dots(file, ctx))
+		return 0;
+	mutex_lock(&sysfs_mutex);
+
+	if (parent_sd->s_flags & SYSFS_FLAG_HAS_NS)
+		ns = sysfs_info(dentry->d_sb)->ns;
+
+	for (pos = sysfs_dir_pos(ns, parent_sd, ctx->pos, pos);
+	     pos;
+	     pos = sysfs_dir_next_pos(ns, parent_sd, ctx->pos, pos)) {
+		const char *name = pos->s_name;
+		unsigned int type = dt_type(pos);
+		int len = strlen(name);
+		ino_t ino = pos->s_ino;
+
+		ctx->pos = pos->s_hash;
+		file->private_data = pos;
+		kernfs_get(pos);
+
+		mutex_unlock(&sysfs_mutex);
+		if (!dir_emit(ctx, name, len, ino, type))
+			return 0;
+		mutex_lock(&sysfs_mutex);
+	}
+	mutex_unlock(&sysfs_mutex);
+	file->private_data = NULL;
+	ctx->pos = INT_MAX;
+	return 0;
+}
+
+static loff_t sysfs_dir_llseek(struct file *file, loff_t offset, int whence)
+{
+	struct inode *inode = file_inode(file);
+	loff_t ret;
+
+	mutex_lock(&inode->i_mutex);
+	ret = generic_file_llseek(file, offset, whence);
+	mutex_unlock(&inode->i_mutex);
+
+	return ret;
+}
+
+const struct file_operations sysfs_dir_operations = {
+	.read		= generic_read_dir,
+	.iterate	= sysfs_readdir,
+	.release	= sysfs_dir_release,
+	.llseek		= sysfs_dir_llseek,
+};
diff --git a/fs/kernfs/kernfs-internal.h b/fs/kernfs/kernfs-internal.h
index 5ea2564..67b111b 100644
--- a/fs/kernfs/kernfs-internal.h
+++ b/fs/kernfs/kernfs-internal.h
@@ -14,6 +14,7 @@
 #include <linux/lockdep.h>
 #include <linux/fs.h>
 #include <linux/rbtree.h>
+#include <linux/mutex.h>
 
 #include <linux/kernfs.h>
 
@@ -125,4 +126,20 @@ int sysfs_setxattr(struct dentry *dentry, const char *name, const void *value,
 		   size_t size, int flags);
 int sysfs_inode_init(void);
 
+/*
+ * dir.c
+ */
+extern struct mutex sysfs_mutex;
+extern const struct dentry_operations sysfs_dentry_ops;
+extern const struct file_operations sysfs_dir_operations;
+extern const struct inode_operations sysfs_dir_inode_operations;
+
+struct sysfs_dirent *sysfs_get_active(struct sysfs_dirent *sd);
+void sysfs_put_active(struct sysfs_dirent *sd);
+void sysfs_addrm_start(struct sysfs_addrm_cxt *acxt);
+int sysfs_add_one(struct sysfs_addrm_cxt *acxt, struct sysfs_dirent *sd,
+		  struct sysfs_dirent *parent_sd);
+void sysfs_addrm_finish(struct sysfs_addrm_cxt *acxt);
+struct sysfs_dirent *sysfs_new_dirent(const char *name, umode_t mode, int type);
+
 #endif	/* __KERNFS_INTERNAL_H */
diff --git a/fs/sysfs/dir.c b/fs/sysfs/dir.c
index 0e46ff8..6de6c0c 100644
--- a/fs/sysfs/dir.c
+++ b/fs/sysfs/dir.c
@@ -12,472 +12,10 @@
 
 #undef DEBUG
 
-#include <linux/fs.h>
-#include <linux/mount.h>
-#include <linux/module.h>
 #include <linux/kobject.h>
-#include <linux/namei.h>
-#include <linux/idr.h>
-#include <linux/completion.h>
-#include <linux/mutex.h>
 #include <linux/slab.h>
-#include <linux/security.h>
-#include <linux/hash.h>
 #include "sysfs.h"
 
-DEFINE_MUTEX(sysfs_mutex);
-DEFINE_SPINLOCK(sysfs_assoc_lock);
-
-#define to_sysfs_dirent(X) rb_entry((X), struct sysfs_dirent, s_rb)
-
-static DEFINE_SPINLOCK(sysfs_ino_lock);
-static DEFINE_IDA(sysfs_ino_ida);
-
-/**
- *	sysfs_name_hash
- *	@name: Null terminated string to hash
- *	@ns:   Namespace tag to hash
- *
- *	Returns 31 bit hash of ns + name (so it fits in an off_t )
- */
-static unsigned int sysfs_name_hash(const char *name, const void *ns)
-{
-	unsigned long hash = init_name_hash();
-	unsigned int len = strlen(name);
-	while (len--)
-		hash = partial_name_hash(*name++, hash);
-	hash = (end_name_hash(hash) ^ hash_ptr((void *)ns, 31));
-	hash &= 0x7fffffffU;
-	/* Reserve hash numbers 0, 1 and INT_MAX for magic directory entries */
-	if (hash < 1)
-		hash += 2;
-	if (hash >= INT_MAX)
-		hash = INT_MAX - 1;
-	return hash;
-}
-
-static int sysfs_name_compare(unsigned int hash, const char *name,
-			      const void *ns, const struct sysfs_dirent *sd)
-{
-	if (hash != sd->s_hash)
-		return hash - sd->s_hash;
-	if (ns != sd->s_ns)
-		return ns - sd->s_ns;
-	return strcmp(name, sd->s_name);
-}
-
-static int sysfs_sd_compare(const struct sysfs_dirent *left,
-			    const struct sysfs_dirent *right)
-{
-	return sysfs_name_compare(left->s_hash, left->s_name, left->s_ns,
-				  right);
-}
-
-/**
- *	sysfs_link_sibling - link sysfs_dirent into sibling rbtree
- *	@sd: sysfs_dirent of interest
- *
- *	Link @sd into its sibling rbtree which starts from
- *	sd->s_parent->s_dir.children.
- *
- *	Locking:
- *	mutex_lock(sysfs_mutex)
- *
- *	RETURNS:
- *	0 on susccess -EEXIST on failure.
- */
-static int sysfs_link_sibling(struct sysfs_dirent *sd)
-{
-	struct rb_node **node = &sd->s_parent->s_dir.children.rb_node;
-	struct rb_node *parent = NULL;
-
-	if (sysfs_type(sd) == SYSFS_DIR)
-		sd->s_parent->s_dir.subdirs++;
-
-	while (*node) {
-		struct sysfs_dirent *pos;
-		int result;
-
-		pos = to_sysfs_dirent(*node);
-		parent = *node;
-		result = sysfs_sd_compare(sd, pos);
-		if (result < 0)
-			node = &pos->s_rb.rb_left;
-		else if (result > 0)
-			node = &pos->s_rb.rb_right;
-		else
-			return -EEXIST;
-	}
-	/* add new node and rebalance the tree */
-	rb_link_node(&sd->s_rb, parent, node);
-	rb_insert_color(&sd->s_rb, &sd->s_parent->s_dir.children);
-
-	/* if @sd has ns tag, mark the parent to enable ns filtering */
-	if (sd->s_ns)
-		sd->s_parent->s_flags |= SYSFS_FLAG_HAS_NS;
-
-	return 0;
-}
-
-/**
- *	sysfs_unlink_sibling - unlink sysfs_dirent from sibling rbtree
- *	@sd: sysfs_dirent of interest
- *
- *	Unlink @sd from its sibling rbtree which starts from
- *	sd->s_parent->s_dir.children.
- *
- *	Locking:
- *	mutex_lock(sysfs_mutex)
- */
-static void sysfs_unlink_sibling(struct sysfs_dirent *sd)
-{
-	if (sysfs_type(sd) == SYSFS_DIR)
-		sd->s_parent->s_dir.subdirs--;
-
-	rb_erase(&sd->s_rb, &sd->s_parent->s_dir.children);
-
-	/*
-	 * Either all or none of the children have tags.  Clearing HAS_NS
-	 * when there's no child left is enough to keep the flag synced.
-	 */
-	if (RB_EMPTY_ROOT(&sd->s_parent->s_dir.children))
-		sd->s_parent->s_flags &= ~SYSFS_FLAG_HAS_NS;
-}
-
-/**
- *	sysfs_get_active - get an active reference to sysfs_dirent
- *	@sd: sysfs_dirent to get an active reference to
- *
- *	Get an active reference of @sd.  This function is noop if @sd
- *	is NULL.
- *
- *	RETURNS:
- *	Pointer to @sd on success, NULL on failure.
- */
-struct sysfs_dirent *sysfs_get_active(struct sysfs_dirent *sd)
-{
-	if (unlikely(!sd))
-		return NULL;
-
-	if (!atomic_inc_unless_negative(&sd->s_active))
-		return NULL;
-
-	if (sd->s_flags & SYSFS_FLAG_LOCKDEP)
-		rwsem_acquire_read(&sd->dep_map, 0, 1, _RET_IP_);
-	return sd;
-}
-
-/**
- *	sysfs_put_active - put an active reference to sysfs_dirent
- *	@sd: sysfs_dirent to put an active reference to
- *
- *	Put an active reference to @sd.  This function is noop if @sd
- *	is NULL.
- */
-void sysfs_put_active(struct sysfs_dirent *sd)
-{
-	int v;
-
-	if (unlikely(!sd))
-		return;
-
-	if (sd->s_flags & SYSFS_FLAG_LOCKDEP)
-		rwsem_release(&sd->dep_map, 1, _RET_IP_);
-	v = atomic_dec_return(&sd->s_active);
-	if (likely(v != SD_DEACTIVATED_BIAS))
-		return;
-
-	/* atomic_dec_return() is a mb(), we'll always see the updated
-	 * sd->u.completion.
-	 */
-	complete(sd->u.completion);
-}
-
-/**
- *	sysfs_deactivate - deactivate sysfs_dirent
- *	@sd: sysfs_dirent to deactivate
- *
- *	Deny new active references and drain existing ones.
- */
-static void sysfs_deactivate(struct sysfs_dirent *sd)
-{
-	DECLARE_COMPLETION_ONSTACK(wait);
-	int v;
-
-	BUG_ON(!(sd->s_flags & SYSFS_FLAG_REMOVED));
-
-	if (!(sysfs_type(sd) & SYSFS_ACTIVE_REF))
-		return;
-
-	sd->u.completion = (void *)&wait;
-
-	rwsem_acquire(&sd->dep_map, 0, 0, _RET_IP_);
-	/* atomic_add_return() is a mb(), put_active() will always see
-	 * the updated sd->u.completion.
-	 */
-	v = atomic_add_return(SD_DEACTIVATED_BIAS, &sd->s_active);
-
-	if (v != SD_DEACTIVATED_BIAS) {
-		lock_contended(&sd->dep_map, _RET_IP_);
-		wait_for_completion(&wait);
-	}
-
-	lock_acquired(&sd->dep_map, _RET_IP_);
-	rwsem_release(&sd->dep_map, 1, _RET_IP_);
-}
-
-static int sysfs_alloc_ino(unsigned int *pino)
-{
-	int ino, rc;
-
- retry:
-	spin_lock(&sysfs_ino_lock);
-	rc = ida_get_new_above(&sysfs_ino_ida, 2, &ino);
-	spin_unlock(&sysfs_ino_lock);
-
-	if (rc == -EAGAIN) {
-		if (ida_pre_get(&sysfs_ino_ida, GFP_KERNEL))
-			goto retry;
-		rc = -ENOMEM;
-	}
-
-	*pino = ino;
-	return rc;
-}
-
-static void sysfs_free_ino(unsigned int ino)
-{
-	spin_lock(&sysfs_ino_lock);
-	ida_remove(&sysfs_ino_ida, ino);
-	spin_unlock(&sysfs_ino_lock);
-}
-
-/**
- * kernfs_get - get a reference count on a sysfs_dirent
- * @sd: the target sysfs_dirent
- */
-void kernfs_get(struct sysfs_dirent *sd)
-{
-	if (sd) {
-		WARN_ON(!atomic_read(&sd->s_count));
-		atomic_inc(&sd->s_count);
-	}
-}
-EXPORT_SYMBOL_GPL(kernfs_get);
-
-/**
- * kernfs_put - put a reference count on a sysfs_dirent
- * @sd: the target sysfs_dirent
- *
- * Put a reference count of @sd and destroy it if it reached zero.
- */
-void kernfs_put(struct sysfs_dirent *sd)
-{
-	struct sysfs_dirent *parent_sd;
-
-	if (!sd || !atomic_dec_and_test(&sd->s_count))
-		return;
- repeat:
-	/* Moving/renaming is always done while holding reference.
-	 * sd->s_parent won't change beneath us.
-	 */
-	parent_sd = sd->s_parent;
-
-	WARN(!(sd->s_flags & SYSFS_FLAG_REMOVED),
-		"sysfs: free using entry: %s/%s\n",
-		parent_sd ? parent_sd->s_name : "", sd->s_name);
-
-	if (sysfs_type(sd) == SYSFS_KOBJ_LINK)
-		kernfs_put(sd->s_symlink.target_sd);
-	if (sysfs_type(sd) & SYSFS_COPY_NAME)
-		kfree(sd->s_name);
-	if (sd->s_iattr && sd->s_iattr->ia_secdata)
-		security_release_secctx(sd->s_iattr->ia_secdata,
-					sd->s_iattr->ia_secdata_len);
-	kfree(sd->s_iattr);
-	sysfs_free_ino(sd->s_ino);
-	kmem_cache_free(sysfs_dir_cachep, sd);
-
-	sd = parent_sd;
-	if (sd && atomic_dec_and_test(&sd->s_count))
-		goto repeat;
-}
-EXPORT_SYMBOL_GPL(kernfs_put);
-
-static int sysfs_dentry_delete(const struct dentry *dentry)
-{
-	struct sysfs_dirent *sd = dentry->d_fsdata;
-	return !(sd && !(sd->s_flags & SYSFS_FLAG_REMOVED));
-}
-
-static int sysfs_dentry_revalidate(struct dentry *dentry, unsigned int flags)
-{
-	struct sysfs_dirent *sd;
-
-	if (flags & LOOKUP_RCU)
-		return -ECHILD;
-
-	sd = dentry->d_fsdata;
-	mutex_lock(&sysfs_mutex);
-
-	/* The sysfs dirent has been deleted */
-	if (sd->s_flags & SYSFS_FLAG_REMOVED)
-		goto out_bad;
-
-	/* The sysfs dirent has been moved? */
-	if (dentry->d_parent->d_fsdata != sd->s_parent)
-		goto out_bad;
-
-	/* The sysfs dirent has been renamed */
-	if (strcmp(dentry->d_name.name, sd->s_name) != 0)
-		goto out_bad;
-
-	/* The sysfs dirent has been moved to a different namespace */
-	if (sd->s_ns && sd->s_ns != sysfs_info(dentry->d_sb)->ns)
-		goto out_bad;
-
-	mutex_unlock(&sysfs_mutex);
-out_valid:
-	return 1;
-out_bad:
-	/* Remove the dentry from the dcache hashes.
-	 * If this is a deleted dentry we use d_drop instead of d_delete
-	 * so sysfs doesn't need to cope with negative dentries.
-	 *
-	 * If this is a dentry that has simply been renamed we
-	 * use d_drop to remove it from the dcache lookup on its
-	 * old parent.  If this dentry persists later when a lookup
-	 * is performed at its new name the dentry will be readded
-	 * to the dcache hashes.
-	 */
-	mutex_unlock(&sysfs_mutex);
-
-	/* If we have submounts we must allow the vfs caches
-	 * to lie about the state of the filesystem to prevent
-	 * leaks and other nasty things.
-	 */
-	if (check_submounts_and_drop(dentry) != 0)
-		goto out_valid;
-
-	return 0;
-}
-
-static void sysfs_dentry_release(struct dentry *dentry)
-{
-	kernfs_put(dentry->d_fsdata);
-}
-
-const struct dentry_operations sysfs_dentry_ops = {
-	.d_revalidate	= sysfs_dentry_revalidate,
-	.d_delete	= sysfs_dentry_delete,
-	.d_release	= sysfs_dentry_release,
-};
-
-struct sysfs_dirent *sysfs_new_dirent(const char *name, umode_t mode, int type)
-{
-	char *dup_name = NULL;
-	struct sysfs_dirent *sd;
-
-	if (type & SYSFS_COPY_NAME) {
-		name = dup_name = kstrdup(name, GFP_KERNEL);
-		if (!name)
-			return NULL;
-	}
-
-	sd = kmem_cache_zalloc(sysfs_dir_cachep, GFP_KERNEL);
-	if (!sd)
-		goto err_out1;
-
-	if (sysfs_alloc_ino(&sd->s_ino))
-		goto err_out2;
-
-	atomic_set(&sd->s_count, 1);
-	atomic_set(&sd->s_active, 0);
-
-	sd->s_name = name;
-	sd->s_mode = mode;
-	sd->s_flags = type | SYSFS_FLAG_REMOVED;
-
-	return sd;
-
- err_out2:
-	kmem_cache_free(sysfs_dir_cachep, sd);
- err_out1:
-	kfree(dup_name);
-	return NULL;
-}
-
-/**
- *	sysfs_addrm_start - prepare for sysfs_dirent add/remove
- *	@acxt: pointer to sysfs_addrm_cxt to be used
- *
- *	This function is called when the caller is about to add or remove
- *	sysfs_dirent.  This function acquires sysfs_mutex.  @acxt is used
- *	to keep and pass context to other addrm functions.
- *
- *	LOCKING:
- *	Kernel thread context (may sleep).  sysfs_mutex is locked on
- *	return.
- */
-void sysfs_addrm_start(struct sysfs_addrm_cxt *acxt)
-	__acquires(sysfs_mutex)
-{
-	memset(acxt, 0, sizeof(*acxt));
-
-	mutex_lock(&sysfs_mutex);
-}
-
-/**
- *	sysfs_add_one - add sysfs_dirent to parent without warning
- *	@acxt: addrm context to use
- *	@sd: sysfs_dirent to be added
- *	@parent_sd: the parent sysfs_dirent to add @sd to
- *
- *	Get @parent_sd and set @sd->s_parent to it and increment nlink of
- *	the parent inode if @sd is a directory and link into the children
- *	list of the parent.
- *
- *	This function should be called between calls to
- *	sysfs_addrm_start() and sysfs_addrm_finish() and should be
- *	passed the same @acxt as passed to sysfs_addrm_start().
- *
- *	LOCKING:
- *	Determined by sysfs_addrm_start().
- *
- *	RETURNS:
- *	0 on success, -EEXIST if entry with the given name already
- *	exists.
- */
-int sysfs_add_one(struct sysfs_addrm_cxt *acxt, struct sysfs_dirent *sd,
-		  struct sysfs_dirent *parent_sd)
-{
-	struct sysfs_inode_attrs *ps_iattr;
-	int ret;
-
-	if (sysfs_type(parent_sd) != SYSFS_DIR)
-		return -EINVAL;
-
-	sd->s_hash = sysfs_name_hash(sd->s_name, sd->s_ns);
-	sd->s_parent = parent_sd;
-	kernfs_get(parent_sd);
-
-	ret = sysfs_link_sibling(sd);
-	if (ret)
-		return ret;
-
-	/* Update timestamps on the parent */
-	ps_iattr = parent_sd->s_iattr;
-	if (ps_iattr) {
-		struct iattr *ps_iattrs = &ps_iattr->ia_iattr;
-		ps_iattrs->ia_ctime = ps_iattrs->ia_mtime = CURRENT_TIME;
-	}
-
-	/* Mark the entry added into directory tree */
-	sd->s_flags &= ~SYSFS_FLAG_REMOVED;
-
-	return 0;
-}
-
 /**
  *	sysfs_pathname - return full path to sysfs dirent
  *	@sd: sysfs_dirent whose path we want
@@ -514,173 +52,6 @@ void sysfs_warn_dup(struct sysfs_dirent *parent, const char *name)
 }
 
 /**
- *	sysfs_remove_one - remove sysfs_dirent from parent
- *	@acxt: addrm context to use
- *	@sd: sysfs_dirent to be removed
- *
- *	Mark @sd removed and drop nlink of parent inode if @sd is a
- *	directory.  @sd is unlinked from the children list.
- *
- *	This function should be called between calls to
- *	sysfs_addrm_start() and sysfs_addrm_finish() and should be
- *	passed the same @acxt as passed to sysfs_addrm_start().
- *
- *	LOCKING:
- *	Determined by sysfs_addrm_start().
- */
-static void sysfs_remove_one(struct sysfs_addrm_cxt *acxt,
-			     struct sysfs_dirent *sd)
-{
-	struct sysfs_inode_attrs *ps_iattr;
-
-	/*
-	 * Removal can be called multiple times on the same node.  Only the
-	 * first invocation is effective and puts the base ref.
-	 */
-	if (sd->s_flags & SYSFS_FLAG_REMOVED)
-		return;
-
-	sysfs_unlink_sibling(sd);
-
-	/* Update timestamps on the parent */
-	ps_iattr = sd->s_parent->s_iattr;
-	if (ps_iattr) {
-		struct iattr *ps_iattrs = &ps_iattr->ia_iattr;
-		ps_iattrs->ia_ctime = ps_iattrs->ia_mtime = CURRENT_TIME;
-	}
-
-	sd->s_flags |= SYSFS_FLAG_REMOVED;
-	sd->u.removed_list = acxt->removed;
-	acxt->removed = sd;
-}
-
-/**
- *	sysfs_addrm_finish - finish up sysfs_dirent add/remove
- *	@acxt: addrm context to finish up
- *
- *	Finish up sysfs_dirent add/remove.  Resources acquired by
- *	sysfs_addrm_start() are released and removed sysfs_dirents are
- *	cleaned up.
- *
- *	LOCKING:
- *	sysfs_mutex is released.
- */
-void sysfs_addrm_finish(struct sysfs_addrm_cxt *acxt)
-	__releases(sysfs_mutex)
-{
-	/* release resources acquired by sysfs_addrm_start() */
-	mutex_unlock(&sysfs_mutex);
-
-	/* kill removed sysfs_dirents */
-	while (acxt->removed) {
-		struct sysfs_dirent *sd = acxt->removed;
-
-		acxt->removed = sd->u.removed_list;
-
-		sysfs_deactivate(sd);
-		sysfs_unmap_bin_file(sd);
-		kernfs_put(sd);
-	}
-}
-
-/**
- * kernfs_find_ns - find sysfs_dirent with the given name
- * @parent: sysfs_dirent to search under
- * @name: name to look for
- * @ns: the namespace tag to use
- *
- * Look for sysfs_dirent with name @name under @parent.  Returns pointer to
- * the found sysfs_dirent on success, %NULL on failure.
- */
-static struct sysfs_dirent *kernfs_find_ns(struct sysfs_dirent *parent,
-					   const unsigned char *name,
-					   const void *ns)
-{
-	struct rb_node *node = parent->s_dir.children.rb_node;
-	unsigned int hash;
-
-	lockdep_assert_held(&sysfs_mutex);
-
-	hash = sysfs_name_hash(name, ns);
-	while (node) {
-		struct sysfs_dirent *sd;
-		int result;
-
-		sd = to_sysfs_dirent(node);
-		result = sysfs_name_compare(hash, name, ns, sd);
-		if (result < 0)
-			node = node->rb_left;
-		else if (result > 0)
-			node = node->rb_right;
-		else
-			return sd;
-	}
-	return NULL;
-}
-
-/**
- * kernfs_find_and_get_ns - find and get sysfs_dirent with the given name
- * @parent: sysfs_dirent to search under
- * @name: name to look for
- * @ns: the namespace tag to use
- *
- * Look for sysfs_dirent with name @name under @parent and get a reference
- * if found.  This function may sleep and returns pointer to the found
- * sysfs_dirent on success, %NULL on failure.
- */
-struct sysfs_dirent *kernfs_find_and_get_ns(struct sysfs_dirent *parent,
-					    const char *name, const void *ns)
-{
-	struct sysfs_dirent *sd;
-
-	mutex_lock(&sysfs_mutex);
-	sd = kernfs_find_ns(parent, name, ns);
-	kernfs_get(sd);
-	mutex_unlock(&sysfs_mutex);
-
-	return sd;
-}
-EXPORT_SYMBOL_GPL(kernfs_find_and_get_ns);
-
-/**
- * kernfs_create_dir_ns - create a directory
- * @parent: parent in which to create a new directory
- * @name: name of the new directory
- * @priv: opaque data associated with the new directory
- * @ns: optional namespace tag of the directory
- *
- * Returns the created node on success, ERR_PTR() value on failure.
- */
-struct sysfs_dirent *kernfs_create_dir_ns(struct sysfs_dirent *parent,
-					  const char *name, void *priv,
-					  const void *ns)
-{
-	umode_t mode = S_IFDIR | S_IRWXU | S_IRUGO | S_IXUGO;
-	struct sysfs_addrm_cxt acxt;
-	struct sysfs_dirent *sd;
-	int rc;
-
-	/* allocate */
-	sd = sysfs_new_dirent(name, mode, SYSFS_DIR);
-	if (!sd)
-		return ERR_PTR(-ENOMEM);
-
-	sd->s_ns = ns;
-	sd->priv = priv;
-
-	/* link in */
-	sysfs_addrm_start(&acxt);
-	rc = sysfs_add_one(&acxt, sd, parent);
-	sysfs_addrm_finish(&acxt);
-
-	if (!rc)
-		return sd;
-
-	kernfs_put(sd);
-	return ERR_PTR(rc);
-}
-
-/**
  * sysfs_create_dir_ns - create a directory for an object with a namespace tag
  * @kobj: object we're creating directory for
  * @ns: the namespace tag to use
@@ -710,177 +81,6 @@ int sysfs_create_dir_ns(struct kobject *kobj, const void *ns)
 	return 0;
 }
 
-static struct dentry *sysfs_lookup(struct inode *dir, struct dentry *dentry,
-				   unsigned int flags)
-{
-	struct dentry *ret = NULL;
-	struct dentry *parent = dentry->d_parent;
-	struct sysfs_dirent *parent_sd = parent->d_fsdata;
-	struct sysfs_dirent *sd;
-	struct inode *inode;
-	const void *ns = NULL;
-
-	mutex_lock(&sysfs_mutex);
-
-	if (parent_sd->s_flags & SYSFS_FLAG_HAS_NS)
-		ns = sysfs_info(dir->i_sb)->ns;
-
-	sd = kernfs_find_ns(parent_sd, dentry->d_name.name, ns);
-
-	/* no such entry */
-	if (!sd) {
-		ret = ERR_PTR(-ENOENT);
-		goto out_unlock;
-	}
-	kernfs_get(sd);
-	dentry->d_fsdata = sd;
-
-	/* attach dentry and inode */
-	inode = sysfs_get_inode(dir->i_sb, sd);
-	if (!inode) {
-		ret = ERR_PTR(-ENOMEM);
-		goto out_unlock;
-	}
-
-	/* instantiate and hash dentry */
-	ret = d_materialise_unique(dentry, inode);
- out_unlock:
-	mutex_unlock(&sysfs_mutex);
-	return ret;
-}
-
-const struct inode_operations sysfs_dir_inode_operations = {
-	.lookup		= sysfs_lookup,
-	.permission	= sysfs_permission,
-	.setattr	= sysfs_setattr,
-	.getattr	= sysfs_getattr,
-	.setxattr	= sysfs_setxattr,
-};
-
-static struct sysfs_dirent *sysfs_leftmost_descendant(struct sysfs_dirent *pos)
-{
-	struct sysfs_dirent *last;
-
-	while (true) {
-		struct rb_node *rbn;
-
-		last = pos;
-
-		if (sysfs_type(pos) != SYSFS_DIR)
-			break;
-
-		rbn = rb_first(&pos->s_dir.children);
-		if (!rbn)
-			break;
-
-		pos = to_sysfs_dirent(rbn);
-	}
-
-	return last;
-}
-
-/**
- * sysfs_next_descendant_post - find the next descendant for post-order walk
- * @pos: the current position (%NULL to initiate traversal)
- * @root: sysfs_dirent whose descendants to walk
- *
- * Find the next descendant to visit for post-order traversal of @root's
- * descendants.  @root is included in the iteration and the last node to be
- * visited.
- */
-static struct sysfs_dirent *sysfs_next_descendant_post(struct sysfs_dirent *pos,
-						       struct sysfs_dirent *root)
-{
-	struct rb_node *rbn;
-
-	lockdep_assert_held(&sysfs_mutex);
-
-	/* if first iteration, visit leftmost descendant which may be root */
-	if (!pos)
-		return sysfs_leftmost_descendant(root);
-
-	/* if we visited @root, we're done */
-	if (pos == root)
-		return NULL;
-
-	/* if there's an unvisited sibling, visit its leftmost descendant */
-	rbn = rb_next(&pos->s_rb);
-	if (rbn)
-		return sysfs_leftmost_descendant(to_sysfs_dirent(rbn));
-
-	/* no sibling left, visit parent */
-	return pos->s_parent;
-}
-
-static void __kernfs_remove(struct sysfs_addrm_cxt *acxt,
-			    struct sysfs_dirent *sd)
-{
-	struct sysfs_dirent *pos, *next;
-
-	if (!sd)
-		return;
-
-	pr_debug("sysfs %s: removing\n", sd->s_name);
-
-	next = NULL;
-	do {
-		pos = next;
-		next = sysfs_next_descendant_post(pos, sd);
-		if (pos)
-			sysfs_remove_one(acxt, pos);
-	} while (next);
-}
-
-/**
- * kernfs_remove - remove a sysfs_dirent recursively
- * @sd: the sysfs_dirent to remove
- *
- * Remove @sd along with all its subdirectories and files.
- */
-void kernfs_remove(struct sysfs_dirent *sd)
-{
-	struct sysfs_addrm_cxt acxt;
-
-	sysfs_addrm_start(&acxt);
-	__kernfs_remove(&acxt, sd);
-	sysfs_addrm_finish(&acxt);
-}
-
-/**
- * kernfs_remove_by_name_ns - find a sysfs_dirent by name and remove it
- * @dir_sd: parent of the target
- * @name: name of the sysfs_dirent to remove
- * @ns: namespace tag of the sysfs_dirent to remove
- *
- * Look for the sysfs_dirent with @name and @ns under @dir_sd and remove
- * it.  Returns 0 on success, -ENOENT if such entry doesn't exist.
- */
-int kernfs_remove_by_name_ns(struct sysfs_dirent *dir_sd, const char *name,
-			     const void *ns)
-{
-	struct sysfs_addrm_cxt acxt;
-	struct sysfs_dirent *sd;
-
-	if (!dir_sd) {
-		WARN(1, KERN_WARNING "sysfs: can not remove '%s', no directory\n",
-			name);
-		return -ENOENT;
-	}
-
-	sysfs_addrm_start(&acxt);
-
-	sd = kernfs_find_ns(dir_sd, name, ns);
-	if (sd)
-		__kernfs_remove(&acxt, sd);
-
-	sysfs_addrm_finish(&acxt);
-
-	if (sd)
-		return 0;
-	else
-		return -ENOENT;
-}
-
 /**
  *	sysfs_remove_dir - remove an object's directory.
  *	@kobj:	object.
@@ -903,57 +103,6 @@ void sysfs_remove_dir(struct kobject *kobj)
 	}
 }
 
-/**
- * kernfs_rename_ns - move and rename a kernfs_node
- * @sd: target node
- * @new_parent: new parent to put @sd under
- * @new_name: new name
- * @new_ns: new namespace tag
- */
-int kernfs_rename_ns(struct sysfs_dirent *sd, struct sysfs_dirent *new_parent,
-		     const char *new_name, const void *new_ns)
-{
-	int error;
-
-	mutex_lock(&sysfs_mutex);
-
-	error = 0;
-	if ((sd->s_parent == new_parent) && (sd->s_ns == new_ns) &&
-	    (strcmp(sd->s_name, new_name) == 0))
-		goto out;	/* nothing to rename */
-
-	error = -EEXIST;
-	if (kernfs_find_ns(new_parent, new_name, new_ns))
-		goto out;
-
-	/* rename sysfs_dirent */
-	if (strcmp(sd->s_name, new_name) != 0) {
-		error = -ENOMEM;
-		new_name = kstrdup(new_name, GFP_KERNEL);
-		if (!new_name)
-			goto out;
-
-		kfree(sd->s_name);
-		sd->s_name = new_name;
-	}
-
-	/*
-	 * Move to the appropriate place in the appropriate directories rbtree.
-	 */
-	sysfs_unlink_sibling(sd);
-	kernfs_get(new_parent);
-	kernfs_put(sd->s_parent);
-	sd->s_ns = new_ns;
-	sd->s_hash = sysfs_name_hash(sd->s_name, sd->s_ns);
-	sd->s_parent = new_parent;
-	sysfs_link_sibling(sd);
-
-	error = 0;
- out:
-	mutex_unlock(&sysfs_mutex);
-	return error;
-}
-
 int sysfs_rename_dir_ns(struct kobject *kobj, const char *new_name,
 			const void *new_ns)
 {
@@ -974,121 +123,3 @@ int sysfs_move_dir_ns(struct kobject *kobj, struct kobject *new_parent_kobj,
 
 	return kernfs_rename_ns(sd, new_parent_sd, sd->s_name, new_ns);
 }
-
-/* Relationship between s_mode and the DT_xxx types */
-static inline unsigned char dt_type(struct sysfs_dirent *sd)
-{
-	return (sd->s_mode >> 12) & 15;
-}
-
-static int sysfs_dir_release(struct inode *inode, struct file *filp)
-{
-	kernfs_put(filp->private_data);
-	return 0;
-}
-
-static struct sysfs_dirent *sysfs_dir_pos(const void *ns,
-	struct sysfs_dirent *parent_sd,	loff_t hash, struct sysfs_dirent *pos)
-{
-	if (pos) {
-		int valid = !(pos->s_flags & SYSFS_FLAG_REMOVED) &&
-			pos->s_parent == parent_sd &&
-			hash == pos->s_hash;
-		kernfs_put(pos);
-		if (!valid)
-			pos = NULL;
-	}
-	if (!pos && (hash > 1) && (hash < INT_MAX)) {
-		struct rb_node *node = parent_sd->s_dir.children.rb_node;
-		while (node) {
-			pos = to_sysfs_dirent(node);
-
-			if (hash < pos->s_hash)
-				node = node->rb_left;
-			else if (hash > pos->s_hash)
-				node = node->rb_right;
-			else
-				break;
-		}
-	}
-	/* Skip over entries in the wrong namespace */
-	while (pos && pos->s_ns != ns) {
-		struct rb_node *node = rb_next(&pos->s_rb);
-		if (!node)
-			pos = NULL;
-		else
-			pos = to_sysfs_dirent(node);
-	}
-	return pos;
-}
-
-static struct sysfs_dirent *sysfs_dir_next_pos(const void *ns,
-	struct sysfs_dirent *parent_sd,	ino_t ino, struct sysfs_dirent *pos)
-{
-	pos = sysfs_dir_pos(ns, parent_sd, ino, pos);
-	if (pos)
-		do {
-			struct rb_node *node = rb_next(&pos->s_rb);
-			if (!node)
-				pos = NULL;
-			else
-				pos = to_sysfs_dirent(node);
-		} while (pos && pos->s_ns != ns);
-	return pos;
-}
-
-static int sysfs_readdir(struct file *file, struct dir_context *ctx)
-{
-	struct dentry *dentry = file->f_path.dentry;
-	struct sysfs_dirent *parent_sd = dentry->d_fsdata;
-	struct sysfs_dirent *pos = file->private_data;
-	const void *ns = NULL;
-
-	if (!dir_emit_dots(file, ctx))
-		return 0;
-	mutex_lock(&sysfs_mutex);
-
-	if (parent_sd->s_flags & SYSFS_FLAG_HAS_NS)
-		ns = sysfs_info(dentry->d_sb)->ns;
-
-	for (pos = sysfs_dir_pos(ns, parent_sd, ctx->pos, pos);
-	     pos;
-	     pos = sysfs_dir_next_pos(ns, parent_sd, ctx->pos, pos)) {
-		const char *name = pos->s_name;
-		unsigned int type = dt_type(pos);
-		int len = strlen(name);
-		ino_t ino = pos->s_ino;
-
-		ctx->pos = pos->s_hash;
-		file->private_data = pos;
-		kernfs_get(pos);
-
-		mutex_unlock(&sysfs_mutex);
-		if (!dir_emit(ctx, name, len, ino, type))
-			return 0;
-		mutex_lock(&sysfs_mutex);
-	}
-	mutex_unlock(&sysfs_mutex);
-	file->private_data = NULL;
-	ctx->pos = INT_MAX;
-	return 0;
-}
-
-static loff_t sysfs_dir_llseek(struct file *file, loff_t offset, int whence)
-{
-	struct inode *inode = file_inode(file);
-	loff_t ret;
-
-	mutex_lock(&inode->i_mutex);
-	ret = generic_file_llseek(file, offset, whence);
-	mutex_unlock(&inode->i_mutex);
-
-	return ret;
-}
-
-const struct file_operations sysfs_dir_operations = {
-	.read		= generic_read_dir,
-	.iterate	= sysfs_readdir,
-	.release	= sysfs_dir_release,
-	.llseek		= sysfs_dir_llseek,
-};
diff --git a/fs/sysfs/sysfs.h b/fs/sysfs/sysfs.h
index 4c79173..aef8180 100644
--- a/fs/sysfs/sysfs.h
+++ b/fs/sysfs/sysfs.h
@@ -34,22 +34,9 @@ extern struct kmem_cache *sysfs_dir_cachep;
 /*
  * dir.c
  */
-extern struct mutex sysfs_mutex;
 extern spinlock_t sysfs_assoc_lock;
-extern const struct dentry_operations sysfs_dentry_ops;
 
-extern const struct file_operations sysfs_dir_operations;
-extern const struct inode_operations sysfs_dir_inode_operations;
-
-struct sysfs_dirent *sysfs_get_active(struct sysfs_dirent *sd);
-void sysfs_put_active(struct sysfs_dirent *sd);
-void sysfs_addrm_start(struct sysfs_addrm_cxt *acxt);
 void sysfs_warn_dup(struct sysfs_dirent *parent, const char *name);
-int sysfs_add_one(struct sysfs_addrm_cxt *acxt, struct sysfs_dirent *sd,
-		  struct sysfs_dirent *parent_sd);
-void sysfs_addrm_finish(struct sysfs_addrm_cxt *acxt);
-
-struct sysfs_dirent *sysfs_new_dirent(const char *name, umode_t mode, int type);
 
 /*
  * file.c
-- 
1.8.3.1


^ permalink raw reply related	[flat|nested] 46+ messages in thread

* [PATCH 33/34] sysfs, kernfs: move file core code to fs/kernfs/file.c
  2013-10-24 15:49 Subject: [PATCHSET driver-core-next] sysfs: separate out kernfs, part #1 Tejun Heo
                   ` (31 preceding siblings ...)
  2013-10-24 15:49 ` [PATCH 32/34] sysfs, kernfs: move dir core code to fs/kernfs/dir.c Tejun Heo
@ 2013-10-24 15:49 ` Tejun Heo
  2013-10-28 13:04   ` [PATCH v2 " Tejun Heo
  2013-10-28 16:34   ` [PATCH v3 " Tejun Heo
  2013-10-24 15:49 ` [PATCH 34/34] sysfs, kernfs: move symlink core code to fs/kernfs/symlink.c Tejun Heo
                   ` (2 subsequent siblings)
  35 siblings, 2 replies; 46+ messages in thread
From: Tejun Heo @ 2013-10-24 15:49 UTC (permalink / raw)
  To: gregkh; +Cc: kay, linux-kernel, ebiederm, bhelgaas, Tejun Heo

Move core file code to fs/kernfs/file.c.  fs/sysfs/file.c now contains
sysfs kernfs_ops callbacks, sysfs wrappers around kernfs interfaces,
and sysfs_schedule_callback().  The respective declarations in
fs/sysfs/sysfs.h are moved to fs/kernfs/kernfs-internal.h.

This is pure relocation.

Signed-off-by: Tejun Heo <tj@kernel.org>
---
 fs/kernfs/file.c            | 791 ++++++++++++++++++++++++++++++++++++++++++++
 fs/kernfs/kernfs-internal.h |   7 +
 fs/sysfs/file.c             | 788 +------------------------------------------
 fs/sysfs/sysfs.h            |   4 -
 4 files changed, 799 insertions(+), 791 deletions(-)

diff --git a/fs/kernfs/file.c b/fs/kernfs/file.c
index 90b1e88..e7bd5c1 100644
--- a/fs/kernfs/file.c
+++ b/fs/kernfs/file.c
@@ -7,3 +7,794 @@
  *
  * This file is released under the GPLv2.
  */
+
+#include <linux/fs.h>
+#include <linux/seq_file.h>
+#include <linux/slab.h>
+#include <linux/poll.h>
+#include <linux/pagemap.h>
+#include <linux/poll.h>
+#include <linux/sched.h>
+
+#include "kernfs-internal.h"
+
+/*
+ * There's one sysfs_open_file for each open file and one sysfs_open_dirent
+ * for each sysfs_dirent with one or more open files.
+ *
+ * sysfs_dirent->s_attr.open points to sysfs_open_dirent.  s_attr.open is
+ * protected by sysfs_open_dirent_lock.
+ *
+ * filp->private_data points to seq_file whose ->private points to
+ * sysfs_open_file.  sysfs_open_files are chained at
+ * sysfs_open_dirent->files, which is protected by sysfs_open_file_mutex.
+ */
+static DEFINE_SPINLOCK(sysfs_open_dirent_lock);
+static DEFINE_MUTEX(sysfs_open_file_mutex);
+
+struct sysfs_open_dirent {
+	atomic_t		refcnt;
+	atomic_t		event;
+	wait_queue_head_t	poll;
+	struct list_head	files; /* goes through sysfs_open_file.list */
+};
+
+static struct sysfs_open_file *sysfs_of(struct file *file)
+{
+	return ((struct seq_file *)file->private_data)->private;
+}
+
+/*
+ * Determine the kernfs_ops for the given sysfs_dirent.  This function must
+ * be called while holding an active reference.
+ */
+static const struct kernfs_ops *kernfs_ops(struct sysfs_dirent *sd)
+{
+	if (sd->s_flags & SYSFS_FLAG_LOCKDEP)
+		lockdep_assert_held(sd);
+	return sd->s_attr.ops;
+}
+
+static void *kernfs_seq_start(struct seq_file *sf, loff_t *ppos)
+{
+	struct sysfs_open_file *of = sf->private;
+	const struct kernfs_ops *ops;
+
+	/*
+	 * @of->mutex nests outside active ref and is just to ensure that
+	 * the ops aren't called concurrently for the same open file.
+	 */
+	mutex_lock(&of->mutex);
+	if (!sysfs_get_active(of->sd)) {
+		mutex_unlock(&of->mutex);
+		return ERR_PTR(-ENODEV);
+	}
+
+	ops = kernfs_ops(of->sd);
+	if (ops->seq_start)
+		return ops->seq_start(sf, ppos);
+	else
+		return NULL + !*ppos;	/* the same behavior as single_open() */
+}
+
+static void *kernfs_seq_next(struct seq_file *sf, void *v, loff_t *ppos)
+{
+	struct sysfs_open_file *of = sf->private;
+	const struct kernfs_ops *ops = kernfs_ops(of->sd);
+
+	if (ops->seq_next) {
+		return ops->seq_next(sf, v, ppos);
+	} else {
+		++*ppos;
+		return NULL;
+	}
+}
+
+static void kernfs_seq_stop(struct seq_file *sf, void *v)
+{
+	struct sysfs_open_file *of = sf->private;
+	const struct kernfs_ops *ops = kernfs_ops(of->sd);
+
+	if (ops->seq_stop)
+		ops->seq_stop(sf, v);
+
+	sysfs_put_active(of->sd);
+	mutex_unlock(&of->mutex);
+}
+
+static int kernfs_seq_show(struct seq_file *sf, void *v)
+{
+	struct sysfs_open_file *of = sf->private;
+
+	of->event = atomic_read(&of->sd->s_attr.open->event);
+
+	return of->sd->s_attr.ops->seq_show(sf, v);
+}
+
+static const struct seq_operations kernfs_seq_ops = {
+	.start = kernfs_seq_start,
+	.next = kernfs_seq_next,
+	.stop = kernfs_seq_stop,
+	.show = kernfs_seq_show,
+};
+
+/*
+ * As reading a bin file can have side-effects, the exact offset and bytes
+ * specified in read(2) call should be passed to the read callback making
+ * it difficult to use seq_file.  Implement simplistic custom buffering for
+ * bin files.
+ */
+static ssize_t kernfs_file_direct_read(struct sysfs_open_file *of,
+				       char __user *user_buf, size_t count,
+				       loff_t *ppos)
+{
+	ssize_t len = min_t(size_t, count, PAGE_SIZE);
+	const struct kernfs_ops *ops;
+	char *buf;
+
+	buf = kmalloc(len, GFP_KERNEL);
+	if (!buf)
+		return -ENOMEM;
+
+	/*
+	 * @of->mutex nests outside active ref and is just to ensure that
+	 * the ops aren't called concurrently for the same open file.
+	 */
+	mutex_lock(&of->mutex);
+	if (!sysfs_get_active(of->sd)) {
+		len = -ENODEV;
+		mutex_unlock(&of->mutex);
+		goto out_free;
+	}
+
+	ops = kernfs_ops(of->sd);
+	if (ops->read)
+		len = ops->read(of, buf, len, *ppos);
+	else
+		len = -EINVAL;
+
+	sysfs_put_active(of->sd);
+	mutex_unlock(&of->mutex);
+
+	if (len < 0)
+		goto out_free;
+
+	if (copy_to_user(user_buf, buf, len)) {
+		len = -EFAULT;
+		goto out_free;
+	}
+
+	*ppos += len;
+
+ out_free:
+	kfree(buf);
+	return len;
+}
+
+/**
+ * kernfs_file_read - kernfs vfs read callback
+ * @file: file pointer
+ * @user_buf: data to write
+ * @count: number of bytes
+ * @ppos: starting offset
+ */
+static ssize_t kernfs_file_read(struct file *file, char __user *user_buf,
+				size_t count, loff_t *ppos)
+{
+	struct sysfs_open_file *of = sysfs_of(file);
+
+	if (of->sd->s_flags & SYSFS_FLAG_HAS_SEQ_SHOW)
+		return seq_read(file, user_buf, count, ppos);
+	else
+		return kernfs_file_direct_read(of, user_buf, count, ppos);
+}
+
+/**
+ * kernfs_file_write - kernfs vfs write callback
+ * @file: file pointer
+ * @user_buf: data to write
+ * @count: number of bytes
+ * @ppos: starting offset
+ *
+ * Copy data in from userland and pass it to the matching kernfs write
+ * operation.
+ *
+ * There is no easy way for us to know if userspace is only doing a partial
+ * write, so we don't support them. We expect the entire buffer to come on
+ * the first write.  Hint: if you're writing a value, first read the file,
+ * modify only the the value you're changing, then write entire buffer
+ * back.
+ */
+static ssize_t kernfs_file_write(struct file *file, const char __user *user_buf,
+				 size_t count, loff_t *ppos)
+{
+	struct sysfs_open_file *of = sysfs_of(file);
+	ssize_t len = min_t(size_t, count, PAGE_SIZE);
+	const struct kernfs_ops *ops;
+	char *buf;
+
+	buf = kmalloc(len + 1, GFP_KERNEL);
+	if (!buf)
+		return -ENOMEM;
+
+	if (copy_from_user(buf, user_buf, len)) {
+		len = -EFAULT;
+		goto out_free;
+	}
+	buf[len] = '\0';	/* guarantee string termination */
+
+	/*
+	 * @of->mutex nests outside active ref and is just to ensure that
+	 * the ops aren't called concurrently for the same open file.
+	 */
+	mutex_lock(&of->mutex);
+	if (!sysfs_get_active(of->sd)) {
+		mutex_unlock(&of->mutex);
+		len = -ENODEV;
+		goto out_free;
+	}
+
+	ops = kernfs_ops(of->sd);
+	if (ops->write)
+		len = ops->write(of, buf, len, *ppos);
+	else
+		len = -EINVAL;
+
+	sysfs_put_active(of->sd);
+	mutex_unlock(&of->mutex);
+
+	if (len > 0)
+		*ppos += len;
+out_free:
+	kfree(buf);
+	return len;
+}
+
+static loff_t kernfs_file_llseek(struct file *file, loff_t off, int whence)
+{
+	struct sysfs_open_file *of = sysfs_of(file);
+
+	if (of->sd->s_flags & SYSFS_FLAG_HAS_SEQ_SHOW)
+		return seq_lseek(file, off, whence);
+	else
+		return generic_file_llseek(file, off, whence);
+}
+
+static void kernfs_vma_open(struct vm_area_struct *vma)
+{
+	struct file *file = vma->vm_file;
+	struct sysfs_open_file *of = sysfs_of(file);
+
+	if (!of->vm_ops)
+		return;
+
+	if (!sysfs_get_active(of->sd))
+		return;
+
+	if (of->vm_ops->open)
+		of->vm_ops->open(vma);
+
+	sysfs_put_active(of->sd);
+}
+
+static int kernfs_vma_fault(struct vm_area_struct *vma, struct vm_fault *vmf)
+{
+	struct file *file = vma->vm_file;
+	struct sysfs_open_file *of = sysfs_of(file);
+	int ret;
+
+	if (!of->vm_ops)
+		return VM_FAULT_SIGBUS;
+
+	if (!sysfs_get_active(of->sd))
+		return VM_FAULT_SIGBUS;
+
+	ret = VM_FAULT_SIGBUS;
+	if (of->vm_ops->fault)
+		ret = of->vm_ops->fault(vma, vmf);
+
+	sysfs_put_active(of->sd);
+	return ret;
+}
+
+static int kernfs_vma_page_mkwrite(struct vm_area_struct *vma,
+				   struct vm_fault *vmf)
+{
+	struct file *file = vma->vm_file;
+	struct sysfs_open_file *of = sysfs_of(file);
+	int ret;
+
+	if (!of->vm_ops)
+		return VM_FAULT_SIGBUS;
+
+	if (!sysfs_get_active(of->sd))
+		return VM_FAULT_SIGBUS;
+
+	ret = 0;
+	if (of->vm_ops->page_mkwrite)
+		ret = of->vm_ops->page_mkwrite(vma, vmf);
+	else
+		file_update_time(file);
+
+	sysfs_put_active(of->sd);
+	return ret;
+}
+
+static int kernfs_vma_access(struct vm_area_struct *vma, unsigned long addr,
+			     void *buf, int len, int write)
+{
+	struct file *file = vma->vm_file;
+	struct sysfs_open_file *of = sysfs_of(file);
+	int ret;
+
+	if (!of->vm_ops)
+		return -EINVAL;
+
+	if (!sysfs_get_active(of->sd))
+		return -EINVAL;
+
+	ret = -EINVAL;
+	if (of->vm_ops->access)
+		ret = of->vm_ops->access(vma, addr, buf, len, write);
+
+	sysfs_put_active(of->sd);
+	return ret;
+}
+
+#ifdef CONFIG_NUMA
+static int kernfs_vma_set_policy(struct vm_area_struct *vma,
+				 struct mempolicy *new)
+{
+	struct file *file = vma->vm_file;
+	struct sysfs_open_file *of = sysfs_of(file);
+	int ret;
+
+	if (!of->vm_ops)
+		return 0;
+
+	if (!sysfs_get_active(of->sd))
+		return -EINVAL;
+
+	ret = 0;
+	if (of->vm_ops->set_policy)
+		ret = of->vm_ops->set_policy(vma, new);
+
+	sysfs_put_active(of->sd);
+	return ret;
+}
+
+static struct mempolicy *kernfs_vma_get_policy(struct vm_area_struct *vma,
+					       unsigned long addr)
+{
+	struct file *file = vma->vm_file;
+	struct sysfs_open_file *of = sysfs_of(file);
+	struct mempolicy *pol;
+
+	if (!of->vm_ops)
+		return vma->vm_policy;
+
+	if (!sysfs_get_active(of->sd))
+		return vma->vm_policy;
+
+	pol = vma->vm_policy;
+	if (of->vm_ops->get_policy)
+		pol = of->vm_ops->get_policy(vma, addr);
+
+	sysfs_put_active(of->sd);
+	return pol;
+}
+
+static int kernfs_vma_migrate(struct vm_area_struct *vma,
+			      const nodemask_t *from, const nodemask_t *to,
+			      unsigned long flags)
+{
+	struct file *file = vma->vm_file;
+	struct sysfs_open_file *of = sysfs_of(file);
+	int ret;
+
+	if (!of->vm_ops)
+		return 0;
+
+	if (!sysfs_get_active(of->sd))
+		return 0;
+
+	ret = 0;
+	if (of->vm_ops->migrate)
+		ret = of->vm_ops->migrate(vma, from, to, flags);
+
+	sysfs_put_active(of->sd);
+	return ret;
+}
+#endif
+
+static const struct vm_operations_struct kernfs_vm_ops = {
+	.open		= kernfs_vma_open,
+	.fault		= kernfs_vma_fault,
+	.page_mkwrite	= kernfs_vma_page_mkwrite,
+	.access		= kernfs_vma_access,
+#ifdef CONFIG_NUMA
+	.set_policy	= kernfs_vma_set_policy,
+	.get_policy	= kernfs_vma_get_policy,
+	.migrate	= kernfs_vma_migrate,
+#endif
+};
+
+static int kernfs_file_mmap(struct file *file, struct vm_area_struct *vma)
+{
+	struct sysfs_open_file *of = sysfs_of(file);
+	const struct kernfs_ops *ops;
+	int rc;
+
+	mutex_lock(&of->mutex);
+
+	rc = -ENODEV;
+	if (!sysfs_get_active(of->sd))
+		goto out_unlock;
+
+	ops = kernfs_ops(of->sd);
+	if (ops->mmap)
+		rc = ops->mmap(of, vma);
+	if (rc)
+		goto out_put;
+
+	/*
+	 * PowerPC's pci_mmap of legacy_mem uses shmem_zero_setup()
+	 * to satisfy versions of X which crash if the mmap fails: that
+	 * substitutes a new vm_file, and we don't then want bin_vm_ops.
+	 */
+	if (vma->vm_file != file)
+		goto out_put;
+
+	rc = -EINVAL;
+	if (of->mmapped && of->vm_ops != vma->vm_ops)
+		goto out_put;
+
+	/*
+	 * It is not possible to successfully wrap close.
+	 * So error if someone is trying to use close.
+	 */
+	rc = -EINVAL;
+	if (vma->vm_ops && vma->vm_ops->close)
+		goto out_put;
+
+	rc = 0;
+	of->mmapped = 1;
+	of->vm_ops = vma->vm_ops;
+	vma->vm_ops = &kernfs_vm_ops;
+out_put:
+	sysfs_put_active(of->sd);
+out_unlock:
+	mutex_unlock(&of->mutex);
+
+	return rc;
+}
+
+/**
+ *	sysfs_get_open_dirent - get or create sysfs_open_dirent
+ *	@sd: target sysfs_dirent
+ *	@of: sysfs_open_file for this instance of open
+ *
+ *	If @sd->s_attr.open exists, increment its reference count;
+ *	otherwise, create one.  @of is chained to the files list.
+ *
+ *	LOCKING:
+ *	Kernel thread context (may sleep).
+ *
+ *	RETURNS:
+ *	0 on success, -errno on failure.
+ */
+static int sysfs_get_open_dirent(struct sysfs_dirent *sd,
+				 struct sysfs_open_file *of)
+{
+	struct sysfs_open_dirent *od, *new_od = NULL;
+
+ retry:
+	mutex_lock(&sysfs_open_file_mutex);
+	spin_lock_irq(&sysfs_open_dirent_lock);
+
+	if (!sd->s_attr.open && new_od) {
+		sd->s_attr.open = new_od;
+		new_od = NULL;
+	}
+
+	od = sd->s_attr.open;
+	if (od) {
+		atomic_inc(&od->refcnt);
+		list_add_tail(&of->list, &od->files);
+	}
+
+	spin_unlock_irq(&sysfs_open_dirent_lock);
+	mutex_unlock(&sysfs_open_file_mutex);
+
+	if (od) {
+		kfree(new_od);
+		return 0;
+	}
+
+	/* not there, initialize a new one and retry */
+	new_od = kmalloc(sizeof(*new_od), GFP_KERNEL);
+	if (!new_od)
+		return -ENOMEM;
+
+	atomic_set(&new_od->refcnt, 0);
+	atomic_set(&new_od->event, 1);
+	init_waitqueue_head(&new_od->poll);
+	INIT_LIST_HEAD(&new_od->files);
+	goto retry;
+}
+
+/**
+ *	sysfs_put_open_dirent - put sysfs_open_dirent
+ *	@sd: target sysfs_dirent
+ *	@of: associated sysfs_open_file
+ *
+ *	Put @sd->s_attr.open and unlink @of from the files list.  If
+ *	reference count reaches zero, disassociate and free it.
+ *
+ *	LOCKING:
+ *	None.
+ */
+static void sysfs_put_open_dirent(struct sysfs_dirent *sd,
+				  struct sysfs_open_file *of)
+{
+	struct sysfs_open_dirent *od = sd->s_attr.open;
+	unsigned long flags;
+
+	mutex_lock(&sysfs_open_file_mutex);
+	spin_lock_irqsave(&sysfs_open_dirent_lock, flags);
+
+	if (of)
+		list_del(&of->list);
+
+	if (atomic_dec_and_test(&od->refcnt))
+		sd->s_attr.open = NULL;
+	else
+		od = NULL;
+
+	spin_unlock_irqrestore(&sysfs_open_dirent_lock, flags);
+	mutex_unlock(&sysfs_open_file_mutex);
+
+	kfree(od);
+}
+
+static int kernfs_file_open(struct inode *inode, struct file *file)
+{
+	struct sysfs_dirent *attr_sd = file->f_path.dentry->d_fsdata;
+	const struct kernfs_ops *ops;
+	struct sysfs_open_file *of;
+	bool has_read, has_write;
+	int error = -EACCES;
+
+	if (!sysfs_get_active(attr_sd))
+		return -ENODEV;
+
+	ops = kernfs_ops(attr_sd);
+
+	has_read = ops->seq_show || ops->read || ops->mmap;
+	has_write = ops->write || ops->mmap;
+
+	/* check perms and supported operations */
+	if ((file->f_mode & FMODE_WRITE) &&
+	    (!(inode->i_mode & S_IWUGO) || !has_write))
+		goto err_out;
+
+	if ((file->f_mode & FMODE_READ) &&
+	    (!(inode->i_mode & S_IRUGO) || !has_read))
+		goto err_out;
+
+	/* allocate a sysfs_open_file for the file */
+	error = -ENOMEM;
+	of = kzalloc(sizeof(struct sysfs_open_file), GFP_KERNEL);
+	if (!of)
+		goto err_out;
+
+	mutex_init(&of->mutex);
+	of->sd = attr_sd;
+	of->file = file;
+
+	/*
+	 * Always instantiate seq_file even if read access doesn't use
+	 * seq_file or is not requested.  This unifies private data access
+	 * and readable regular files are the vast majority anyway.
+	 */
+	if (ops->seq_show)
+		error = seq_open(file, &kernfs_seq_ops);
+	else
+		error = seq_open(file, NULL);
+	if (error)
+		goto err_free;
+
+	((struct seq_file *)file->private_data)->private = of;
+
+	/* seq_file clears PWRITE unconditionally, restore it if WRITE */
+	if (file->f_mode & FMODE_WRITE)
+		file->f_mode |= FMODE_PWRITE;
+
+	/* make sure we have open dirent struct */
+	error = sysfs_get_open_dirent(attr_sd, of);
+	if (error)
+		goto err_close;
+
+	/* open succeeded, put active references */
+	sysfs_put_active(attr_sd);
+	return 0;
+
+err_close:
+	seq_release(inode, file);
+err_free:
+	kfree(of);
+err_out:
+	sysfs_put_active(attr_sd);
+	return error;
+}
+
+static int kernfs_file_release(struct inode *inode, struct file *filp)
+{
+	struct sysfs_dirent *sd = filp->f_path.dentry->d_fsdata;
+	struct sysfs_open_file *of = sysfs_of(filp);
+
+	sysfs_put_open_dirent(sd, of);
+	seq_release(inode, filp);
+	kfree(of);
+
+	return 0;
+}
+
+void sysfs_unmap_bin_file(struct sysfs_dirent *sd)
+{
+	struct sysfs_open_dirent *od;
+	struct sysfs_open_file *of;
+
+	if (!(sd->s_flags & SYSFS_FLAG_HAS_MMAP))
+		return;
+
+	spin_lock_irq(&sysfs_open_dirent_lock);
+	od = sd->s_attr.open;
+	if (od)
+		atomic_inc(&od->refcnt);
+	spin_unlock_irq(&sysfs_open_dirent_lock);
+	if (!od)
+		return;
+
+	mutex_lock(&sysfs_open_file_mutex);
+	list_for_each_entry(of, &od->files, list) {
+		struct inode *inode = file_inode(of->file);
+		unmap_mapping_range(inode->i_mapping, 0, 0, 1);
+	}
+	mutex_unlock(&sysfs_open_file_mutex);
+
+	sysfs_put_open_dirent(sd, NULL);
+}
+
+/* Sysfs attribute files are pollable.  The idea is that you read
+ * the content and then you use 'poll' or 'select' to wait for
+ * the content to change.  When the content changes (assuming the
+ * manager for the kobject supports notification), poll will
+ * return POLLERR|POLLPRI, and select will return the fd whether
+ * it is waiting for read, write, or exceptions.
+ * Once poll/select indicates that the value has changed, you
+ * need to close and re-open the file, or seek to 0 and read again.
+ * Reminder: this only works for attributes which actively support
+ * it, and it is not possible to test an attribute from userspace
+ * to see if it supports poll (Neither 'poll' nor 'select' return
+ * an appropriate error code).  When in doubt, set a suitable timeout value.
+ */
+static unsigned int kernfs_file_poll(struct file *filp, poll_table *wait)
+{
+	struct sysfs_open_file *of = sysfs_of(filp);
+	struct sysfs_dirent *attr_sd = filp->f_path.dentry->d_fsdata;
+	struct sysfs_open_dirent *od = attr_sd->s_attr.open;
+
+	/* need parent for the kobj, grab both */
+	if (!sysfs_get_active(attr_sd))
+		goto trigger;
+
+	poll_wait(filp, &od->poll, wait);
+
+	sysfs_put_active(attr_sd);
+
+	if (of->event != atomic_read(&od->event))
+		goto trigger;
+
+	return DEFAULT_POLLMASK;
+
+ trigger:
+	return DEFAULT_POLLMASK|POLLERR|POLLPRI;
+}
+
+/**
+ * kernfs_notify - notify a kernfs file
+ * @sd: file to notify
+ *
+ * Notify @sd such that poll(2) on @sd wakes up.
+ */
+void kernfs_notify(struct sysfs_dirent *sd)
+{
+	struct sysfs_open_dirent *od;
+	unsigned long flags;
+
+	spin_lock_irqsave(&sysfs_open_dirent_lock, flags);
+
+	if (!WARN_ON(sysfs_type(sd) != SYSFS_KOBJ_ATTR)) {
+		od = sd->s_attr.open;
+		if (od) {
+			atomic_inc(&od->event);
+			wake_up_interruptible(&od->poll);
+		}
+	}
+
+	spin_unlock_irqrestore(&sysfs_open_dirent_lock, flags);
+}
+EXPORT_SYMBOL_GPL(kernfs_notify);
+
+const struct file_operations kernfs_file_operations = {
+	.read		= kernfs_file_read,
+	.write		= kernfs_file_write,
+	.llseek		= kernfs_file_llseek,
+	.mmap		= kernfs_file_mmap,
+	.open		= kernfs_file_open,
+	.release	= kernfs_file_release,
+	.poll		= kernfs_file_poll,
+};
+
+/**
+ * kernfs_create_file_ns_key - create a file
+ * @parent: directory to create the file in
+ * @name: name of the file
+ * @mode: mode of the file
+ * @size: size of the file
+ * @ops: kernfs operations for the file
+ * @priv: private data for the file
+ * @ns: optional namespace tag of the file
+ * @key: lockdep key for the file's active_ref, %NULL to disable lockdep
+ *
+ * Returns the created node on success, ERR_PTR() value on error.
+ */
+struct sysfs_dirent *kernfs_create_file_ns_key(struct sysfs_dirent *parent,
+					       const char *name,
+					       umode_t mode, loff_t size,
+					       const struct kernfs_ops *ops,
+					       void *priv, const void *ns,
+					       struct lock_class_key *key)
+{
+	struct sysfs_addrm_cxt acxt;
+	struct sysfs_dirent *sd;
+	int rc;
+
+	sd = sysfs_new_dirent(name, (mode & S_IALLUGO) | S_IFREG,
+			      SYSFS_KOBJ_ATTR);
+	if (!sd)
+		return ERR_PTR(-ENOMEM);
+
+	sd->s_attr.ops = ops;
+	sd->s_attr.size = size;
+	sd->s_ns = ns;
+	sd->priv = priv;
+
+#ifdef CONFIG_DEBUG_LOCK_ALLOC
+	if (key) {
+		lockdep_init_map(&sd->dep_map, "s_active", key, 0);
+		sd->s_flags |= SYSFS_FLAG_LOCKDEP;
+	}
+#endif
+
+	/*
+	 * sd->s_attr.ops is accesible only while holding active ref.  We
+	 * need to know whether some ops are implemented outside active
+	 * ref.  Cache their existence in flags.
+	 */
+	if (ops->seq_show)
+		sd->s_flags |= SYSFS_FLAG_HAS_SEQ_SHOW;
+	if (ops->mmap)
+		sd->s_flags |= SYSFS_FLAG_HAS_MMAP;
+
+	sysfs_addrm_start(&acxt);
+	rc = sysfs_add_one(&acxt, sd, parent);
+	sysfs_addrm_finish(&acxt);
+
+	if (rc) {
+		kernfs_put(sd);
+		return ERR_PTR(rc);
+	}
+	return sd;
+}
diff --git a/fs/kernfs/kernfs-internal.h b/fs/kernfs/kernfs-internal.h
index 67b111b..1046624 100644
--- a/fs/kernfs/kernfs-internal.h
+++ b/fs/kernfs/kernfs-internal.h
@@ -142,4 +142,11 @@ int sysfs_add_one(struct sysfs_addrm_cxt *acxt, struct sysfs_dirent *sd,
 void sysfs_addrm_finish(struct sysfs_addrm_cxt *acxt);
 struct sysfs_dirent *sysfs_new_dirent(const char *name, umode_t mode, int type);
 
+/*
+ * file.c
+ */
+extern const struct file_operations kernfs_file_operations;
+
+void sysfs_unmap_bin_file(struct sysfs_dirent *sd);
+
 #endif	/* __KERNFS_INTERNAL_H */
diff --git a/fs/sysfs/file.c b/fs/sysfs/file.c
index 2b9cfb1..9056aef 100644
--- a/fs/sysfs/file.c
+++ b/fs/sysfs/file.c
@@ -14,54 +14,12 @@
 #include <linux/kobject.h>
 #include <linux/kallsyms.h>
 #include <linux/slab.h>
-#include <linux/fsnotify.h>
-#include <linux/namei.h>
-#include <linux/poll.h>
 #include <linux/list.h>
 #include <linux/mutex.h>
-#include <linux/limits.h>
-#include <linux/uaccess.h>
 #include <linux/seq_file.h>
-#include <linux/mm.h>
 
 #include "sysfs.h"
-
-/*
- * There's one sysfs_open_file for each open file and one sysfs_open_dirent
- * for each sysfs_dirent with one or more open files.
- *
- * sysfs_dirent->s_attr.open points to sysfs_open_dirent.  s_attr.open is
- * protected by sysfs_open_dirent_lock.
- *
- * filp->private_data points to seq_file whose ->private points to
- * sysfs_open_file.  sysfs_open_files are chained at
- * sysfs_open_dirent->files, which is protected by sysfs_open_file_mutex.
- */
-static DEFINE_SPINLOCK(sysfs_open_dirent_lock);
-static DEFINE_MUTEX(sysfs_open_file_mutex);
-
-struct sysfs_open_dirent {
-	atomic_t		refcnt;
-	atomic_t		event;
-	wait_queue_head_t	poll;
-	struct list_head	files; /* goes through sysfs_open_file.list */
-};
-
-static struct sysfs_open_file *sysfs_of(struct file *file)
-{
-	return ((struct seq_file *)file->private_data)->private;
-}
-
-/*
- * Determine the kernfs_ops for the given sysfs_dirent.  This function must
- * be called while holding an active reference.
- */
-static const struct kernfs_ops *kernfs_ops(struct sysfs_dirent *sd)
-{
-	if (sd->s_flags & SYSFS_FLAG_LOCKDEP)
-		lockdep_assert_held(sd);
-	return sd->s_attr.ops;
-}
+#include "../kernfs/kernfs-internal.h"
 
 /*
  * Determine ktype->sysfs_ops for the given sysfs_dirent.  This function
@@ -143,140 +101,6 @@ static ssize_t sysfs_kf_bin_read(struct sysfs_open_file *of, char *buf,
 	return battr->read(of->file, kobj, battr, buf, pos, count);
 }
 
-static void *kernfs_seq_start(struct seq_file *sf, loff_t *ppos)
-{
-	struct sysfs_open_file *of = sf->private;
-	const struct kernfs_ops *ops;
-
-	/*
-	 * @of->mutex nests outside active ref and is just to ensure that
-	 * the ops aren't called concurrently for the same open file.
-	 */
-	mutex_lock(&of->mutex);
-	if (!sysfs_get_active(of->sd)) {
-		mutex_unlock(&of->mutex);
-		return ERR_PTR(-ENODEV);
-	}
-
-	ops = kernfs_ops(of->sd);
-	if (ops->seq_start)
-		return ops->seq_start(sf, ppos);
-	else
-		return NULL + !*ppos;	/* the same behavior as single_open() */
-}
-
-static void *kernfs_seq_next(struct seq_file *sf, void *v, loff_t *ppos)
-{
-	struct sysfs_open_file *of = sf->private;
-	const struct kernfs_ops *ops = kernfs_ops(of->sd);
-
-	if (ops->seq_next) {
-		return ops->seq_next(sf, v, ppos);
-	} else {
-		++*ppos;
-		return NULL;
-	}
-}
-
-static void kernfs_seq_stop(struct seq_file *sf, void *v)
-{
-	struct sysfs_open_file *of = sf->private;
-	const struct kernfs_ops *ops = kernfs_ops(of->sd);
-
-	if (ops->seq_stop)
-		ops->seq_stop(sf, v);
-
-	sysfs_put_active(of->sd);
-	mutex_unlock(&of->mutex);
-}
-
-static int kernfs_seq_show(struct seq_file *sf, void *v)
-{
-	struct sysfs_open_file *of = sf->private;
-
-	of->event = atomic_read(&of->sd->s_attr.open->event);
-
-	return of->sd->s_attr.ops->seq_show(sf, v);
-}
-
-static const struct seq_operations kernfs_seq_ops = {
-	.start = kernfs_seq_start,
-	.next = kernfs_seq_next,
-	.stop = kernfs_seq_stop,
-	.show = kernfs_seq_show,
-};
-
-/*
- * As reading a bin file can have side-effects, the exact offset and bytes
- * specified in read(2) call should be passed to the read callback making
- * it difficult to use seq_file.  Implement simplistic custom buffering for
- * bin files.
- */
-static ssize_t kernfs_file_direct_read(struct sysfs_open_file *of,
-				       char __user *user_buf, size_t count,
-				       loff_t *ppos)
-{
-	ssize_t len = min_t(size_t, count, PAGE_SIZE);
-	const struct kernfs_ops *ops;
-	char *buf;
-
-	buf = kmalloc(len, GFP_KERNEL);
-	if (!buf)
-		return -ENOMEM;
-
-	/*
-	 * @of->mutex nests outside active ref and is just to ensure that
-	 * the ops aren't called concurrently for the same open file.
-	 */
-	mutex_lock(&of->mutex);
-	if (!sysfs_get_active(of->sd)) {
-		len = -ENODEV;
-		mutex_unlock(&of->mutex);
-		goto out_free;
-	}
-
-	ops = kernfs_ops(of->sd);
-	if (ops->read)
-		len = ops->read(of, buf, len, *ppos);
-	else
-		len = -EINVAL;
-
-	sysfs_put_active(of->sd);
-	mutex_unlock(&of->mutex);
-
-	if (len < 0)
-		goto out_free;
-
-	if (copy_to_user(user_buf, buf, len)) {
-		len = -EFAULT;
-		goto out_free;
-	}
-
-	*ppos += len;
-
- out_free:
-	kfree(buf);
-	return len;
-}
-
-/**
- * kernfs_file_read - kernfs vfs read callback
- * @file: file pointer
- * @user_buf: data to write
- * @count: number of bytes
- * @ppos: starting offset
- */
-static ssize_t kernfs_file_read(struct file *file, char __user *user_buf,
-				size_t count, loff_t *ppos)
-{
-	struct sysfs_open_file *of = sysfs_of(file);
-
-	if (of->sd->s_flags & SYSFS_FLAG_HAS_SEQ_SHOW)
-		return seq_read(file, user_buf, count, ppos);
-	else
-		return kernfs_file_direct_read(of, user_buf, count, ppos);
-}
-
 /* kernfs write callback for regular sysfs files */
 static ssize_t sysfs_kf_write(struct sysfs_open_file *of, char *buf,
 			      size_t count, loff_t pos)
@@ -312,77 +136,6 @@ static ssize_t sysfs_kf_bin_write(struct sysfs_open_file *of, char *buf,
 	return battr->write(of->file, kobj, battr, buf, pos, count);
 }
 
-/**
- * kernfs_file_write - kernfs vfs write callback
- * @file: file pointer
- * @user_buf: data to write
- * @count: number of bytes
- * @ppos: starting offset
- *
- * Copy data in from userland and pass it to the matching kernfs write
- * operation.
- *
- * There is no easy way for us to know if userspace is only doing a partial
- * write, so we don't support them. We expect the entire buffer to come on
- * the first write.  Hint: if you're writing a value, first read the file,
- * modify only the the value you're changing, then write entire buffer
- * back.
- */
-static ssize_t kernfs_file_write(struct file *file, const char __user *user_buf,
-				 size_t count, loff_t *ppos)
-{
-	struct sysfs_open_file *of = sysfs_of(file);
-	ssize_t len = min_t(size_t, count, PAGE_SIZE);
-	const struct kernfs_ops *ops;
-	char *buf;
-
-	buf = kmalloc(len + 1, GFP_KERNEL);
-	if (!buf)
-		return -ENOMEM;
-
-	if (copy_from_user(buf, user_buf, len)) {
-		len = -EFAULT;
-		goto out_free;
-	}
-	buf[len] = '\0';	/* guarantee string termination */
-
-	/*
-	 * @of->mutex nests outside active ref and is just to ensure that
-	 * the ops aren't called concurrently for the same open file.
-	 */
-	mutex_lock(&of->mutex);
-	if (!sysfs_get_active(of->sd)) {
-		mutex_unlock(&of->mutex);
-		len = -ENODEV;
-		goto out_free;
-	}
-
-	ops = kernfs_ops(of->sd);
-	if (ops->write)
-		len = ops->write(of, buf, len, *ppos);
-	else
-		len = -EINVAL;
-
-	sysfs_put_active(of->sd);
-	mutex_unlock(&of->mutex);
-
-	if (len > 0)
-		*ppos += len;
-out_free:
-	kfree(buf);
-	return len;
-}
-
-static loff_t kernfs_file_llseek(struct file *file, loff_t off, int whence)
-{
-	struct sysfs_open_file *of = sysfs_of(file);
-
-	if (of->sd->s_flags & SYSFS_FLAG_HAS_SEQ_SHOW)
-		return seq_lseek(file, off, whence);
-	else
-		return generic_file_llseek(file, off, whence);
-}
-
 static int sysfs_kf_bin_mmap(struct sysfs_open_file *of,
 			     struct vm_area_struct *vma)
 {
@@ -395,473 +148,6 @@ static int sysfs_kf_bin_mmap(struct sysfs_open_file *of,
 	return battr->mmap(of->file, kobj, battr, vma);
 }
 
-static void kernfs_vma_open(struct vm_area_struct *vma)
-{
-	struct file *file = vma->vm_file;
-	struct sysfs_open_file *of = sysfs_of(file);
-
-	if (!of->vm_ops)
-		return;
-
-	if (!sysfs_get_active(of->sd))
-		return;
-
-	if (of->vm_ops->open)
-		of->vm_ops->open(vma);
-
-	sysfs_put_active(of->sd);
-}
-
-static int kernfs_vma_fault(struct vm_area_struct *vma, struct vm_fault *vmf)
-{
-	struct file *file = vma->vm_file;
-	struct sysfs_open_file *of = sysfs_of(file);
-	int ret;
-
-	if (!of->vm_ops)
-		return VM_FAULT_SIGBUS;
-
-	if (!sysfs_get_active(of->sd))
-		return VM_FAULT_SIGBUS;
-
-	ret = VM_FAULT_SIGBUS;
-	if (of->vm_ops->fault)
-		ret = of->vm_ops->fault(vma, vmf);
-
-	sysfs_put_active(of->sd);
-	return ret;
-}
-
-static int kernfs_vma_page_mkwrite(struct vm_area_struct *vma,
-				   struct vm_fault *vmf)
-{
-	struct file *file = vma->vm_file;
-	struct sysfs_open_file *of = sysfs_of(file);
-	int ret;
-
-	if (!of->vm_ops)
-		return VM_FAULT_SIGBUS;
-
-	if (!sysfs_get_active(of->sd))
-		return VM_FAULT_SIGBUS;
-
-	ret = 0;
-	if (of->vm_ops->page_mkwrite)
-		ret = of->vm_ops->page_mkwrite(vma, vmf);
-	else
-		file_update_time(file);
-
-	sysfs_put_active(of->sd);
-	return ret;
-}
-
-static int kernfs_vma_access(struct vm_area_struct *vma, unsigned long addr,
-			     void *buf, int len, int write)
-{
-	struct file *file = vma->vm_file;
-	struct sysfs_open_file *of = sysfs_of(file);
-	int ret;
-
-	if (!of->vm_ops)
-		return -EINVAL;
-
-	if (!sysfs_get_active(of->sd))
-		return -EINVAL;
-
-	ret = -EINVAL;
-	if (of->vm_ops->access)
-		ret = of->vm_ops->access(vma, addr, buf, len, write);
-
-	sysfs_put_active(of->sd);
-	return ret;
-}
-
-#ifdef CONFIG_NUMA
-static int kernfs_vma_set_policy(struct vm_area_struct *vma,
-				 struct mempolicy *new)
-{
-	struct file *file = vma->vm_file;
-	struct sysfs_open_file *of = sysfs_of(file);
-	int ret;
-
-	if (!of->vm_ops)
-		return 0;
-
-	if (!sysfs_get_active(of->sd))
-		return -EINVAL;
-
-	ret = 0;
-	if (of->vm_ops->set_policy)
-		ret = of->vm_ops->set_policy(vma, new);
-
-	sysfs_put_active(of->sd);
-	return ret;
-}
-
-static struct mempolicy *kernfs_vma_get_policy(struct vm_area_struct *vma,
-					       unsigned long addr)
-{
-	struct file *file = vma->vm_file;
-	struct sysfs_open_file *of = sysfs_of(file);
-	struct mempolicy *pol;
-
-	if (!of->vm_ops)
-		return vma->vm_policy;
-
-	if (!sysfs_get_active(of->sd))
-		return vma->vm_policy;
-
-	pol = vma->vm_policy;
-	if (of->vm_ops->get_policy)
-		pol = of->vm_ops->get_policy(vma, addr);
-
-	sysfs_put_active(of->sd);
-	return pol;
-}
-
-static int kernfs_vma_migrate(struct vm_area_struct *vma,
-			      const nodemask_t *from, const nodemask_t *to,
-			      unsigned long flags)
-{
-	struct file *file = vma->vm_file;
-	struct sysfs_open_file *of = sysfs_of(file);
-	int ret;
-
-	if (!of->vm_ops)
-		return 0;
-
-	if (!sysfs_get_active(of->sd))
-		return 0;
-
-	ret = 0;
-	if (of->vm_ops->migrate)
-		ret = of->vm_ops->migrate(vma, from, to, flags);
-
-	sysfs_put_active(of->sd);
-	return ret;
-}
-#endif
-
-static const struct vm_operations_struct kernfs_vm_ops = {
-	.open		= kernfs_vma_open,
-	.fault		= kernfs_vma_fault,
-	.page_mkwrite	= kernfs_vma_page_mkwrite,
-	.access		= kernfs_vma_access,
-#ifdef CONFIG_NUMA
-	.set_policy	= kernfs_vma_set_policy,
-	.get_policy	= kernfs_vma_get_policy,
-	.migrate	= kernfs_vma_migrate,
-#endif
-};
-
-static int kernfs_file_mmap(struct file *file, struct vm_area_struct *vma)
-{
-	struct sysfs_open_file *of = sysfs_of(file);
-	const struct kernfs_ops *ops;
-	int rc;
-
-	mutex_lock(&of->mutex);
-
-	rc = -ENODEV;
-	if (!sysfs_get_active(of->sd))
-		goto out_unlock;
-
-	ops = kernfs_ops(of->sd);
-	if (ops->mmap)
-		rc = ops->mmap(of, vma);
-	if (rc)
-		goto out_put;
-
-	/*
-	 * PowerPC's pci_mmap of legacy_mem uses shmem_zero_setup()
-	 * to satisfy versions of X which crash if the mmap fails: that
-	 * substitutes a new vm_file, and we don't then want bin_vm_ops.
-	 */
-	if (vma->vm_file != file)
-		goto out_put;
-
-	rc = -EINVAL;
-	if (of->mmapped && of->vm_ops != vma->vm_ops)
-		goto out_put;
-
-	/*
-	 * It is not possible to successfully wrap close.
-	 * So error if someone is trying to use close.
-	 */
-	rc = -EINVAL;
-	if (vma->vm_ops && vma->vm_ops->close)
-		goto out_put;
-
-	rc = 0;
-	of->mmapped = 1;
-	of->vm_ops = vma->vm_ops;
-	vma->vm_ops = &kernfs_vm_ops;
-out_put:
-	sysfs_put_active(of->sd);
-out_unlock:
-	mutex_unlock(&of->mutex);
-
-	return rc;
-}
-
-/**
- *	sysfs_get_open_dirent - get or create sysfs_open_dirent
- *	@sd: target sysfs_dirent
- *	@of: sysfs_open_file for this instance of open
- *
- *	If @sd->s_attr.open exists, increment its reference count;
- *	otherwise, create one.  @of is chained to the files list.
- *
- *	LOCKING:
- *	Kernel thread context (may sleep).
- *
- *	RETURNS:
- *	0 on success, -errno on failure.
- */
-static int sysfs_get_open_dirent(struct sysfs_dirent *sd,
-				 struct sysfs_open_file *of)
-{
-	struct sysfs_open_dirent *od, *new_od = NULL;
-
- retry:
-	mutex_lock(&sysfs_open_file_mutex);
-	spin_lock_irq(&sysfs_open_dirent_lock);
-
-	if (!sd->s_attr.open && new_od) {
-		sd->s_attr.open = new_od;
-		new_od = NULL;
-	}
-
-	od = sd->s_attr.open;
-	if (od) {
-		atomic_inc(&od->refcnt);
-		list_add_tail(&of->list, &od->files);
-	}
-
-	spin_unlock_irq(&sysfs_open_dirent_lock);
-	mutex_unlock(&sysfs_open_file_mutex);
-
-	if (od) {
-		kfree(new_od);
-		return 0;
-	}
-
-	/* not there, initialize a new one and retry */
-	new_od = kmalloc(sizeof(*new_od), GFP_KERNEL);
-	if (!new_od)
-		return -ENOMEM;
-
-	atomic_set(&new_od->refcnt, 0);
-	atomic_set(&new_od->event, 1);
-	init_waitqueue_head(&new_od->poll);
-	INIT_LIST_HEAD(&new_od->files);
-	goto retry;
-}
-
-/**
- *	sysfs_put_open_dirent - put sysfs_open_dirent
- *	@sd: target sysfs_dirent
- *	@of: associated sysfs_open_file
- *
- *	Put @sd->s_attr.open and unlink @of from the files list.  If
- *	reference count reaches zero, disassociate and free it.
- *
- *	LOCKING:
- *	None.
- */
-static void sysfs_put_open_dirent(struct sysfs_dirent *sd,
-				  struct sysfs_open_file *of)
-{
-	struct sysfs_open_dirent *od = sd->s_attr.open;
-	unsigned long flags;
-
-	mutex_lock(&sysfs_open_file_mutex);
-	spin_lock_irqsave(&sysfs_open_dirent_lock, flags);
-
-	if (of)
-		list_del(&of->list);
-
-	if (atomic_dec_and_test(&od->refcnt))
-		sd->s_attr.open = NULL;
-	else
-		od = NULL;
-
-	spin_unlock_irqrestore(&sysfs_open_dirent_lock, flags);
-	mutex_unlock(&sysfs_open_file_mutex);
-
-	kfree(od);
-}
-
-static int kernfs_file_open(struct inode *inode, struct file *file)
-{
-	struct sysfs_dirent *attr_sd = file->f_path.dentry->d_fsdata;
-	const struct kernfs_ops *ops;
-	struct sysfs_open_file *of;
-	bool has_read, has_write;
-	int error = -EACCES;
-
-	if (!sysfs_get_active(attr_sd))
-		return -ENODEV;
-
-	ops = kernfs_ops(attr_sd);
-
-	has_read = ops->seq_show || ops->read || ops->mmap;
-	has_write = ops->write || ops->mmap;
-
-	/* check perms and supported operations */
-	if ((file->f_mode & FMODE_WRITE) &&
-	    (!(inode->i_mode & S_IWUGO) || !has_write))
-		goto err_out;
-
-	if ((file->f_mode & FMODE_READ) &&
-	    (!(inode->i_mode & S_IRUGO) || !has_read))
-		goto err_out;
-
-	/* allocate a sysfs_open_file for the file */
-	error = -ENOMEM;
-	of = kzalloc(sizeof(struct sysfs_open_file), GFP_KERNEL);
-	if (!of)
-		goto err_out;
-
-	mutex_init(&of->mutex);
-	of->sd = attr_sd;
-	of->file = file;
-
-	/*
-	 * Always instantiate seq_file even if read access doesn't use
-	 * seq_file or is not requested.  This unifies private data access
-	 * and readable regular files are the vast majority anyway.
-	 */
-	if (ops->seq_show)
-		error = seq_open(file, &kernfs_seq_ops);
-	else
-		error = seq_open(file, NULL);
-	if (error)
-		goto err_free;
-
-	((struct seq_file *)file->private_data)->private = of;
-
-	/* seq_file clears PWRITE unconditionally, restore it if WRITE */
-	if (file->f_mode & FMODE_WRITE)
-		file->f_mode |= FMODE_PWRITE;
-
-	/* make sure we have open dirent struct */
-	error = sysfs_get_open_dirent(attr_sd, of);
-	if (error)
-		goto err_close;
-
-	/* open succeeded, put active references */
-	sysfs_put_active(attr_sd);
-	return 0;
-
-err_close:
-	seq_release(inode, file);
-err_free:
-	kfree(of);
-err_out:
-	sysfs_put_active(attr_sd);
-	return error;
-}
-
-static int kernfs_file_release(struct inode *inode, struct file *filp)
-{
-	struct sysfs_dirent *sd = filp->f_path.dentry->d_fsdata;
-	struct sysfs_open_file *of = sysfs_of(filp);
-
-	sysfs_put_open_dirent(sd, of);
-	seq_release(inode, filp);
-	kfree(of);
-
-	return 0;
-}
-
-void sysfs_unmap_bin_file(struct sysfs_dirent *sd)
-{
-	struct sysfs_open_dirent *od;
-	struct sysfs_open_file *of;
-
-	if (!(sd->s_flags & SYSFS_FLAG_HAS_MMAP))
-		return;
-
-	spin_lock_irq(&sysfs_open_dirent_lock);
-	od = sd->s_attr.open;
-	if (od)
-		atomic_inc(&od->refcnt);
-	spin_unlock_irq(&sysfs_open_dirent_lock);
-	if (!od)
-		return;
-
-	mutex_lock(&sysfs_open_file_mutex);
-	list_for_each_entry(of, &od->files, list) {
-		struct inode *inode = file_inode(of->file);
-		unmap_mapping_range(inode->i_mapping, 0, 0, 1);
-	}
-	mutex_unlock(&sysfs_open_file_mutex);
-
-	sysfs_put_open_dirent(sd, NULL);
-}
-
-/* Sysfs attribute files are pollable.  The idea is that you read
- * the content and then you use 'poll' or 'select' to wait for
- * the content to change.  When the content changes (assuming the
- * manager for the kobject supports notification), poll will
- * return POLLERR|POLLPRI, and select will return the fd whether
- * it is waiting for read, write, or exceptions.
- * Once poll/select indicates that the value has changed, you
- * need to close and re-open the file, or seek to 0 and read again.
- * Reminder: this only works for attributes which actively support
- * it, and it is not possible to test an attribute from userspace
- * to see if it supports poll (Neither 'poll' nor 'select' return
- * an appropriate error code).  When in doubt, set a suitable timeout value.
- */
-static unsigned int kernfs_file_poll(struct file *filp, poll_table *wait)
-{
-	struct sysfs_open_file *of = sysfs_of(filp);
-	struct sysfs_dirent *attr_sd = filp->f_path.dentry->d_fsdata;
-	struct sysfs_open_dirent *od = attr_sd->s_attr.open;
-
-	/* need parent for the kobj, grab both */
-	if (!sysfs_get_active(attr_sd))
-		goto trigger;
-
-	poll_wait(filp, &od->poll, wait);
-
-	sysfs_put_active(attr_sd);
-
-	if (of->event != atomic_read(&od->event))
-		goto trigger;
-
-	return DEFAULT_POLLMASK;
-
- trigger:
-	return DEFAULT_POLLMASK|POLLERR|POLLPRI;
-}
-
-/**
- * kernfs_notify - notify a kernfs file
- * @sd: file to notify
- *
- * Notify @sd such that poll(2) on @sd wakes up.
- */
-void kernfs_notify(struct sysfs_dirent *sd)
-{
-	struct sysfs_open_dirent *od;
-	unsigned long flags;
-
-	spin_lock_irqsave(&sysfs_open_dirent_lock, flags);
-
-	if (!WARN_ON(sysfs_type(sd) != SYSFS_KOBJ_ATTR)) {
-		od = sd->s_attr.open;
-		if (od) {
-			atomic_inc(&od->event);
-			wake_up_interruptible(&od->poll);
-		}
-	}
-
-	spin_unlock_irqrestore(&sysfs_open_dirent_lock, flags);
-}
-EXPORT_SYMBOL_GPL(kernfs_notify);
-
 void sysfs_notify(struct kobject *k, const char *dir, const char *attr)
 {
 	struct sysfs_dirent *sd = k->sd, *tmp;
@@ -884,16 +170,6 @@ void sysfs_notify(struct kobject *k, const char *dir, const char *attr)
 }
 EXPORT_SYMBOL_GPL(sysfs_notify);
 
-const struct file_operations kernfs_file_operations = {
-	.read		= kernfs_file_read,
-	.write		= kernfs_file_write,
-	.llseek		= kernfs_file_llseek,
-	.mmap		= kernfs_file_mmap,
-	.open		= kernfs_file_open,
-	.release	= kernfs_file_release,
-	.poll		= kernfs_file_poll,
-};
-
 static const struct kernfs_ops sysfs_file_kfops_empty = {
 };
 
@@ -982,68 +258,6 @@ int sysfs_add_file_mode_ns(struct sysfs_dirent *dir_sd,
 	return 0;
 }
 
-/**
- * kernfs_create_file_ns_key - create a file
- * @parent: directory to create the file in
- * @name: name of the file
- * @mode: mode of the file
- * @size: size of the file
- * @ops: kernfs operations for the file
- * @priv: private data for the file
- * @ns: optional namespace tag of the file
- * @key: lockdep key for the file's active_ref, %NULL to disable lockdep
- *
- * Returns the created node on success, ERR_PTR() value on error.
- */
-struct sysfs_dirent *kernfs_create_file_ns_key(struct sysfs_dirent *parent,
-					       const char *name,
-					       umode_t mode, loff_t size,
-					       const struct kernfs_ops *ops,
-					       void *priv, const void *ns,
-					       struct lock_class_key *key)
-{
-	struct sysfs_addrm_cxt acxt;
-	struct sysfs_dirent *sd;
-	int rc;
-
-	sd = sysfs_new_dirent(name, (mode & S_IALLUGO) | S_IFREG,
-			      SYSFS_KOBJ_ATTR);
-	if (!sd)
-		return ERR_PTR(-ENOMEM);
-
-	sd->s_attr.ops = ops;
-	sd->s_attr.size = size;
-	sd->s_ns = ns;
-	sd->priv = priv;
-
-#ifdef CONFIG_DEBUG_LOCK_ALLOC
-	if (key) {
-		lockdep_init_map(&sd->dep_map, "s_active", key, 0);
-		sd->s_flags |= SYSFS_FLAG_LOCKDEP;
-	}
-#endif
-
-	/*
-	 * sd->s_attr.ops is accesible only while holding active ref.  We
-	 * need to know whether some ops are implemented outside active
-	 * ref.  Cache their existence in flags.
-	 */
-	if (ops->seq_show)
-		sd->s_flags |= SYSFS_FLAG_HAS_SEQ_SHOW;
-	if (ops->mmap)
-		sd->s_flags |= SYSFS_FLAG_HAS_MMAP;
-
-	sysfs_addrm_start(&acxt);
-	rc = sysfs_add_one(&acxt, sd, parent);
-	sysfs_addrm_finish(&acxt);
-
-	if (rc) {
-		kernfs_put(sd);
-		return ERR_PTR(rc);
-	}
-	return sd;
-}
-
 int sysfs_add_file(struct sysfs_dirent *dir_sd, const struct attribute *attr,
 		   bool is_bin)
 {
diff --git a/fs/sysfs/sysfs.h b/fs/sysfs/sysfs.h
index aef8180..95cc989 100644
--- a/fs/sysfs/sysfs.h
+++ b/fs/sysfs/sysfs.h
@@ -41,15 +41,11 @@ void sysfs_warn_dup(struct sysfs_dirent *parent, const char *name);
 /*
  * file.c
  */
-extern const struct file_operations kernfs_file_operations;
-
 int sysfs_add_file(struct sysfs_dirent *dir_sd,
 		   const struct attribute *attr, bool is_bin);
-
 int sysfs_add_file_mode_ns(struct sysfs_dirent *dir_sd,
 			   const struct attribute *attr, bool is_bin,
 			   umode_t amode, const void *ns);
-void sysfs_unmap_bin_file(struct sysfs_dirent *sd);
 
 /*
  * symlink.c
-- 
1.8.3.1


^ permalink raw reply related	[flat|nested] 46+ messages in thread

* [PATCH 34/34] sysfs, kernfs: move symlink core code to fs/kernfs/symlink.c
  2013-10-24 15:49 Subject: [PATCHSET driver-core-next] sysfs: separate out kernfs, part #1 Tejun Heo
                   ` (32 preceding siblings ...)
  2013-10-24 15:49 ` [PATCH 33/34] sysfs, kernfs: move file core code to fs/kernfs/file.c Tejun Heo
@ 2013-10-24 15:49 ` Tejun Heo
  2013-10-28 13:06 ` Subject: [PATCHSET driver-core-next] sysfs: separate out kernfs, part #1 Tejun Heo
  2013-10-29 22:29 ` Greg KH
  35 siblings, 0 replies; 46+ messages in thread
From: Tejun Heo @ 2013-10-24 15:49 UTC (permalink / raw)
  To: gregkh; +Cc: kay, linux-kernel, ebiederm, bhelgaas, Tejun Heo

Move core symlink code to fs/kernfs/symlink.c.  fs/sysfs/symlink.c now
only contains sysfs wrappers around kernfs interfaces.  The respective
declarations in fs/sysfs/sysfs.h are moved to
fs/kernfs/kernfs-internal.h.

This is pure relocation.

Signed-off-by: Tejun Heo <tj@kernel.org>
---
 fs/kernfs/kernfs-internal.h |   5 ++
 fs/kernfs/symlink.c         | 138 ++++++++++++++++++++++++++++++++++++++++++++
 fs/sysfs/symlink.c          | 136 -------------------------------------------
 fs/sysfs/sysfs.h            |   1 -
 4 files changed, 143 insertions(+), 137 deletions(-)

diff --git a/fs/kernfs/kernfs-internal.h b/fs/kernfs/kernfs-internal.h
index 1046624..c711072 100644
--- a/fs/kernfs/kernfs-internal.h
+++ b/fs/kernfs/kernfs-internal.h
@@ -149,4 +149,9 @@ extern const struct file_operations kernfs_file_operations;
 
 void sysfs_unmap_bin_file(struct sysfs_dirent *sd);
 
+/*
+ * symlink.c
+ */
+extern const struct inode_operations sysfs_symlink_inode_operations;
+
 #endif	/* __KERNFS_INTERNAL_H */
diff --git a/fs/kernfs/symlink.c b/fs/kernfs/symlink.c
index 2578715..00e4fdb 100644
--- a/fs/kernfs/symlink.c
+++ b/fs/kernfs/symlink.c
@@ -7,3 +7,141 @@
  *
  * This file is released under the GPLv2.
  */
+
+#include <linux/fs.h>
+#include <linux/gfp.h>
+#include <linux/namei.h>
+
+#include "kernfs-internal.h"
+
+/**
+ * kernfs_create_link - create a symlink
+ * @parent: directory to create the symlink in
+ * @name: name of the symlink
+ * @target: target node for the symlink to point to
+ *
+ * Returns the created node on success, ERR_PTR() value on error.
+ */
+struct sysfs_dirent *kernfs_create_link(struct sysfs_dirent *parent,
+					const char *name,
+					struct sysfs_dirent *target)
+{
+	struct sysfs_dirent *sd;
+	struct sysfs_addrm_cxt acxt;
+	int error;
+
+	sd = sysfs_new_dirent(name, S_IFLNK|S_IRWXUGO, SYSFS_KOBJ_LINK);
+	if (!sd)
+		return ERR_PTR(-ENOMEM);
+
+	sd->s_ns = target->s_ns;
+	sd->s_symlink.target_sd = target;
+	kernfs_get(target);	/* ref owned by symlink */
+
+	sysfs_addrm_start(&acxt);
+	error = sysfs_add_one(&acxt, sd, parent);
+	sysfs_addrm_finish(&acxt);
+
+	if (!error)
+		return sd;
+
+	kernfs_put(sd);
+	return ERR_PTR(error);
+}
+
+static int sysfs_get_target_path(struct sysfs_dirent *parent_sd,
+				 struct sysfs_dirent *target_sd, char *path)
+{
+	struct sysfs_dirent *base, *sd;
+	char *s = path;
+	int len = 0;
+
+	/* go up to the root, stop at the base */
+	base = parent_sd;
+	while (base->s_parent) {
+		sd = target_sd->s_parent;
+		while (sd->s_parent && base != sd)
+			sd = sd->s_parent;
+
+		if (base == sd)
+			break;
+
+		strcpy(s, "../");
+		s += 3;
+		base = base->s_parent;
+	}
+
+	/* determine end of target string for reverse fillup */
+	sd = target_sd;
+	while (sd->s_parent && sd != base) {
+		len += strlen(sd->s_name) + 1;
+		sd = sd->s_parent;
+	}
+
+	/* check limits */
+	if (len < 2)
+		return -EINVAL;
+	len--;
+	if ((s - path) + len > PATH_MAX)
+		return -ENAMETOOLONG;
+
+	/* reverse fillup of target string from target to base */
+	sd = target_sd;
+	while (sd->s_parent && sd != base) {
+		int slen = strlen(sd->s_name);
+
+		len -= slen;
+		strncpy(s + len, sd->s_name, slen);
+		if (len)
+			s[--len] = '/';
+
+		sd = sd->s_parent;
+	}
+
+	return 0;
+}
+
+static int sysfs_getlink(struct dentry *dentry, char *path)
+{
+	struct sysfs_dirent *sd = dentry->d_fsdata;
+	struct sysfs_dirent *parent_sd = sd->s_parent;
+	struct sysfs_dirent *target_sd = sd->s_symlink.target_sd;
+	int error;
+
+	mutex_lock(&sysfs_mutex);
+	error = sysfs_get_target_path(parent_sd, target_sd, path);
+	mutex_unlock(&sysfs_mutex);
+
+	return error;
+}
+
+static void *sysfs_follow_link(struct dentry *dentry, struct nameidata *nd)
+{
+	int error = -ENOMEM;
+	unsigned long page = get_zeroed_page(GFP_KERNEL);
+	if (page) {
+		error = sysfs_getlink(dentry, (char *) page);
+		if (error < 0)
+			free_page((unsigned long)page);
+	}
+	nd_set_link(nd, error ? ERR_PTR(error) : (char *)page);
+	return NULL;
+}
+
+static void sysfs_put_link(struct dentry *dentry, struct nameidata *nd,
+			   void *cookie)
+{
+	char *page = nd_get_link(nd);
+	if (!IS_ERR(page))
+		free_page((unsigned long)page);
+}
+
+const struct inode_operations sysfs_symlink_inode_operations = {
+	.setxattr	= sysfs_setxattr,
+	.readlink	= generic_readlink,
+	.follow_link	= sysfs_follow_link,
+	.put_link	= sysfs_put_link,
+	.setattr	= sysfs_setattr,
+	.getattr	= sysfs_getattr,
+	.permission	= sysfs_permission,
+};
diff --git a/fs/sysfs/symlink.c b/fs/sysfs/symlink.c
index 1aae6ff..e70202d 100644
--- a/fs/sysfs/symlink.c
+++ b/fs/sysfs/symlink.c
@@ -11,52 +11,13 @@
  */
 
 #include <linux/fs.h>
-#include <linux/gfp.h>
-#include <linux/mount.h>
 #include <linux/module.h>
 #include <linux/kobject.h>
-#include <linux/namei.h>
 #include <linux/mutex.h>
 #include <linux/security.h>
 
 #include "sysfs.h"
 
-/**
- * kernfs_create_link - create a symlink
- * @parent: directory to create the symlink in
- * @name: name of the symlink
- * @target: target node for the symlink to point to
- *
- * Returns the created node on success, ERR_PTR() value on error.
- */
-struct sysfs_dirent *kernfs_create_link(struct sysfs_dirent *parent,
-					const char *name,
-					struct sysfs_dirent *target)
-{
-	struct sysfs_dirent *sd;
-	struct sysfs_addrm_cxt acxt;
-	int error;
-
-	sd = sysfs_new_dirent(name, S_IFLNK|S_IRWXUGO, SYSFS_KOBJ_LINK);
-	if (!sd)
-		return ERR_PTR(-ENOMEM);
-
-	sd->s_ns = target->s_ns;
-	sd->s_symlink.target_sd = target;
-	kernfs_get(target);	/* ref owned by symlink */
-
-	sysfs_addrm_start(&acxt);
-	error = sysfs_add_one(&acxt, sd, parent);
-	sysfs_addrm_finish(&acxt);
-
-	if (!error)
-		return sd;
-
-	kernfs_put(sd);
-	return ERR_PTR(error);
-}
-
-
 static int sysfs_do_create_link_sd(struct sysfs_dirent *parent_sd,
 				   struct kobject *target,
 				   const char *name, int warn)
@@ -226,100 +187,3 @@ out:
 	return result;
 }
 EXPORT_SYMBOL_GPL(sysfs_rename_link_ns);
-
-static int sysfs_get_target_path(struct sysfs_dirent *parent_sd,
-				 struct sysfs_dirent *target_sd, char *path)
-{
-	struct sysfs_dirent *base, *sd;
-	char *s = path;
-	int len = 0;
-
-	/* go up to the root, stop at the base */
-	base = parent_sd;
-	while (base->s_parent) {
-		sd = target_sd->s_parent;
-		while (sd->s_parent && base != sd)
-			sd = sd->s_parent;
-
-		if (base == sd)
-			break;
-
-		strcpy(s, "../");
-		s += 3;
-		base = base->s_parent;
-	}
-
-	/* determine end of target string for reverse fillup */
-	sd = target_sd;
-	while (sd->s_parent && sd != base) {
-		len += strlen(sd->s_name) + 1;
-		sd = sd->s_parent;
-	}
-
-	/* check limits */
-	if (len < 2)
-		return -EINVAL;
-	len--;
-	if ((s - path) + len > PATH_MAX)
-		return -ENAMETOOLONG;
-
-	/* reverse fillup of target string from target to base */
-	sd = target_sd;
-	while (sd->s_parent && sd != base) {
-		int slen = strlen(sd->s_name);
-
-		len -= slen;
-		strncpy(s + len, sd->s_name, slen);
-		if (len)
-			s[--len] = '/';
-
-		sd = sd->s_parent;
-	}
-
-	return 0;
-}
-
-static int sysfs_getlink(struct dentry *dentry, char *path)
-{
-	struct sysfs_dirent *sd = dentry->d_fsdata;
-	struct sysfs_dirent *parent_sd = sd->s_parent;
-	struct sysfs_dirent *target_sd = sd->s_symlink.target_sd;
-	int error;
-
-	mutex_lock(&sysfs_mutex);
-	error = sysfs_get_target_path(parent_sd, target_sd, path);
-	mutex_unlock(&sysfs_mutex);
-
-	return error;
-}
-
-static void *sysfs_follow_link(struct dentry *dentry, struct nameidata *nd)
-{
-	int error = -ENOMEM;
-	unsigned long page = get_zeroed_page(GFP_KERNEL);
-	if (page) {
-		error = sysfs_getlink(dentry, (char *) page);
-		if (error < 0)
-			free_page((unsigned long)page);
-	}
-	nd_set_link(nd, error ? ERR_PTR(error) : (char *)page);
-	return NULL;
-}
-
-static void sysfs_put_link(struct dentry *dentry, struct nameidata *nd,
-			   void *cookie)
-{
-	char *page = nd_get_link(nd);
-	if (!IS_ERR(page))
-		free_page((unsigned long)page);
-}
-
-const struct inode_operations sysfs_symlink_inode_operations = {
-	.setxattr	= sysfs_setxattr,
-	.readlink	= generic_readlink,
-	.follow_link	= sysfs_follow_link,
-	.put_link	= sysfs_put_link,
-	.setattr	= sysfs_setattr,
-	.getattr	= sysfs_getattr,
-	.permission	= sysfs_permission,
-};
diff --git a/fs/sysfs/sysfs.h b/fs/sysfs/sysfs.h
index 95cc989..3feb575 100644
--- a/fs/sysfs/sysfs.h
+++ b/fs/sysfs/sysfs.h
@@ -50,7 +50,6 @@ int sysfs_add_file_mode_ns(struct sysfs_dirent *dir_sd,
 /*
  * symlink.c
  */
-extern const struct inode_operations sysfs_symlink_inode_operations;
 int sysfs_create_link_sd(struct sysfs_dirent *sd, struct kobject *target,
 			 const char *name);
 
-- 
1.8.3.1


^ permalink raw reply related	[flat|nested] 46+ messages in thread

* Re: [PATCH 14/34] sysfs, kernfs: prepare read path for kernfs
  2013-10-24 15:49 ` [PATCH 14/34] sysfs, kernfs: prepare read path for kernfs Tejun Heo
@ 2013-10-26 11:32   ` Pavel Machek
  2013-10-27 11:30     ` Tejun Heo
  2013-10-28 13:03   ` [PATCH v2 " Tejun Heo
  2013-10-28 16:33   ` [PATCH v3 " Tejun Heo
  2 siblings, 1 reply; 46+ messages in thread
From: Pavel Machek @ 2013-10-26 11:32 UTC (permalink / raw)
  To: Tejun Heo; +Cc: gregkh, kay, linux-kernel, ebiederm, bhelgaas

Hi!

> We're in the process of separating out core sysfs functionality into
> kernfs which will deal with sysfs_dirents directly.  This patch
> rearranges read path so that the kernfs and sysfs parts are separate.

> +static int sysfs_kf_seq_show(struct seq_file *sf, void *v)
>  {
>  	struct sysfs_open_file *of = sf->private;
>  	struct kobject *kobj = of->sd->s_parent->priv;
> -	const struct sysfs_ops *ops;
> +	const struct sysfs_ops *ops = sysfs_file_ops(of->sd);
> +	ssize_t count = 0;
>  	char *buf;
> -	ssize_t count;
>  
>  	/* acquire buffer and ensure that it's >= PAGE_SIZE */
>  	count = seq_get_buf(sf, &buf);

Initing count to zero is superfluous here.

> +	/* the same behavior as single_open() */
> +	return NULL + !*ppos;
> +}

That's certainly interesting code.

						Pavel

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH 14/34] sysfs, kernfs: prepare read path for kernfs
  2013-10-26 11:32   ` Pavel Machek
@ 2013-10-27 11:30     ` Tejun Heo
  0 siblings, 0 replies; 46+ messages in thread
From: Tejun Heo @ 2013-10-27 11:30 UTC (permalink / raw)
  To: Pavel Machek; +Cc: gregkh, kay, linux-kernel, ebiederm, bhelgaas

On Sat, Oct 26, 2013 at 01:32:21PM +0200, Pavel Machek wrote:
> Hi!
> 
> > We're in the process of separating out core sysfs functionality into
> > kernfs which will deal with sysfs_dirents directly.  This patch
> > rearranges read path so that the kernfs and sysfs parts are separate.
> 
> > +static int sysfs_kf_seq_show(struct seq_file *sf, void *v)
> >  {
> >  	struct sysfs_open_file *of = sf->private;
> >  	struct kobject *kobj = of->sd->s_parent->priv;
> > -	const struct sysfs_ops *ops;
> > +	const struct sysfs_ops *ops = sysfs_file_ops(of->sd);
> > +	ssize_t count = 0;
> >  	char *buf;
> > -	ssize_t count;
> >  
> >  	/* acquire buffer and ensure that it's >= PAGE_SIZE */
> >  	count = seq_get_buf(sf, &buf);
> 
> Initing count to zero is superfluous here.

Ooh, right.

> > +	/* the same behavior as single_open() */
> > +	return NULL + !*ppos;
> > +}
> 
> That's certainly interesting code.

Directly copied from single_*() functions.  Will add some comments but
what it's doing is inherently slightly tricky.

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 46+ messages in thread

* [PATCH v2 14/34] sysfs, kernfs: prepare read path for kernfs
  2013-10-24 15:49 ` [PATCH 14/34] sysfs, kernfs: prepare read path for kernfs Tejun Heo
  2013-10-26 11:32   ` Pavel Machek
@ 2013-10-28 13:03   ` Tejun Heo
  2013-10-28 16:33   ` [PATCH v3 " Tejun Heo
  2 siblings, 0 replies; 46+ messages in thread
From: Tejun Heo @ 2013-10-28 13:03 UTC (permalink / raw)
  To: gregkh; +Cc: kay, linux-kernel, ebiederm, bhelgaas, Pavel Machek

We're in the process of separating out core sysfs functionality into
kernfs which will deal with sysfs_dirents directly.  This patch
rearranges read path so that the kernfs and sysfs parts are separate.

* Regular file read path is refactored such that
  kernfs_seq_start/next/stop/show() handle all the boilerplate work
  including locking and updating event count for poll, while
  sysfs_kf_seq_show() deals with interaction with kobj show method.

* Bin file read path is refactored such that kernfs_file_direct_read()
  handles all the boilerplate work including buffer management and
  locking, while sysfs_kf_bin_read() deals with interaction with
  bin_attribute read method.

kernfs_file_read() is added.  It invokes either the seq_file or direct
read path depending on the file type.  This will eventually allow
using the same file_operations for both file types, which is necessary
to separate out kernfs.

While this patch changes the order of some operations, it shouldn't
change any visible behavior.

v2: Dropped unnecessary zeroing of @count from sysfs_kf_seq_show().
    Add comments explaining single_open() behavior.  Both suggested by
    Pavel.

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Pavel Machek <pavel@ucw.cz>
---
 fs/sysfs/file.c | 193 +++++++++++++++++++++++++++++++++++++-------------------
 1 file changed, 128 insertions(+), 65 deletions(-)

diff --git a/fs/sysfs/file.c b/fs/sysfs/file.c
index 444615d..6759d57 100644
--- a/fs/sysfs/file.c
+++ b/fs/sysfs/file.c
@@ -86,13 +86,13 @@ static const struct sysfs_ops *sysfs_file_ops(struct sysfs_dirent *sd)
  * details like buffering and seeking.  The following function pipes
  * sysfs_ops->show() result through seq_file.
  */
-static int sysfs_seq_show(struct seq_file *sf, void *v)
+static int sysfs_kf_seq_show(struct seq_file *sf, void *v)
 {
 	struct sysfs_open_file *of = sf->private;
 	struct kobject *kobj = of->sd->s_parent->priv;
-	const struct sysfs_ops *ops;
-	char *buf;
+	const struct sysfs_ops *ops = sysfs_file_ops(of->sd);
 	ssize_t count;
+	char *buf;
 
 	/* acquire buffer and ensure that it's >= PAGE_SIZE */
 	count = seq_get_buf(sf, &buf);
@@ -102,33 +102,14 @@ static int sysfs_seq_show(struct seq_file *sf, void *v)
 	}
 
 	/*
-	 * Need @of->sd for attr and ops, its parent for kobj.  @of->mutex
-	 * nests outside active ref and is just to ensure that the ops
-	 * aren't called concurrently for the same open file.
+	 * Invoke show().  Control may reach here via seq file lseek even
+	 * if @ops->show() isn't implemented.
 	 */
-	mutex_lock(&of->mutex);
-	if (!sysfs_get_active(of->sd)) {
-		mutex_unlock(&of->mutex);
-		return -ENODEV;
-	}
-
-	of->event = atomic_read(&of->sd->s_attr.open->event);
-
-	/*
-	 * Lookup @ops and invoke show().  Control may reach here via seq
-	 * file lseek even if @ops->show() isn't implemented.
-	 */
-	ops = sysfs_file_ops(of->sd);
-	if (ops->show)
+	if (ops->show) {
 		count = ops->show(kobj, of->sd->priv, buf);
-	else
-		count = 0;
-
-	sysfs_put_active(of->sd);
-	mutex_unlock(&of->mutex);
-
-	if (count < 0)
-		return count;
+		if (count < 0)
+			return count;
+	}
 
 	/*
 	 * The code works fine with PAGE_SIZE return but it's likely to
@@ -144,68 +125,148 @@ static int sysfs_seq_show(struct seq_file *sf, void *v)
 	return 0;
 }
 
-/*
- * Read method for bin files.  As reading a bin file can have side-effects,
- * the exact offset and bytes specified in read(2) call should be passed to
- * the read callback making it difficult to use seq_file.  Implement
- * simplistic custom buffering for bin files.
- */
-static ssize_t sysfs_bin_read(struct file *file, char __user *userbuf,
-			      size_t bytes, loff_t *off)
+static ssize_t sysfs_kf_bin_read(struct sysfs_open_file *of, char *buf,
+				 size_t count, loff_t pos)
 {
-	struct sysfs_open_file *of = sysfs_of(file);
 	struct bin_attribute *battr = of->sd->priv;
 	struct kobject *kobj = of->sd->s_parent->priv;
-	loff_t size = file_inode(file)->i_size;
-	int count = min_t(size_t, bytes, PAGE_SIZE);
-	loff_t offs = *off;
-	char *buf;
+	loff_t size = file_inode(of->file)->i_size;
 
-	if (!bytes)
+	if (!count)
 		return 0;
 
 	if (size) {
-		if (offs > size)
+		if (pos > size)
 			return 0;
-		if (offs + count > size)
-			count = size - offs;
+		if (pos + count > size)
+			count = size - pos;
 	}
 
-	buf = kmalloc(count, GFP_KERNEL);
+	if (!battr->read)
+		return -EIO;
+
+	return battr->read(of->file, kobj, battr, buf, pos, count);
+}
+
+static void *kernfs_seq_start(struct seq_file *sf, loff_t *ppos)
+{
+	struct sysfs_open_file *of = sf->private;
+
+	/*
+	 * @of->mutex nests outside active ref and is just to ensure that
+	 * the ops aren't called concurrently for the same open file.
+	 */
+	mutex_lock(&of->mutex);
+	if (!sysfs_get_active(of->sd)) {
+		mutex_unlock(&of->mutex);
+		return ERR_PTR(-ENODEV);
+	}
+
+	/*
+	 * The same behavior and code as single_open().  Returns !NULL if
+	 * pos is at the beginning; otherwise, NULL.
+	 */
+	return NULL + !*ppos;
+}
+
+static void *kernfs_seq_next(struct seq_file *sf, void *v, loff_t *ppos)
+{
+	/*
+	 * The same behavior and code as single_open(), always terminate
+	 * after the initial read.
+	 */
+	++*ppos;
+	return NULL;
+}
+
+static void kernfs_seq_stop(struct seq_file *sf, void *v)
+{
+	struct sysfs_open_file *of = sf->private;
+
+	sysfs_put_active(of->sd);
+	mutex_unlock(&of->mutex);
+}
+
+static int kernfs_seq_show(struct seq_file *sf, void *v)
+{
+	struct sysfs_open_file *of = sf->private;
+
+	of->event = atomic_read(&of->sd->s_attr.open->event);
+
+	return sysfs_kf_seq_show(sf, v);
+}
+
+static const struct seq_operations kernfs_seq_ops = {
+	.start = kernfs_seq_start,
+	.next = kernfs_seq_next,
+	.stop = kernfs_seq_stop,
+	.show = kernfs_seq_show,
+};
+
+/*
+ * As reading a bin file can have side-effects, the exact offset and bytes
+ * specified in read(2) call should be passed to the read callback making
+ * it difficult to use seq_file.  Implement simplistic custom buffering for
+ * bin files.
+ */
+static ssize_t kernfs_file_direct_read(struct sysfs_open_file *of,
+				       char __user *user_buf, size_t count,
+				       loff_t *ppos)
+{
+	ssize_t len = min_t(size_t, count, PAGE_SIZE);
+	char *buf;
+
+	buf = kmalloc(len, GFP_KERNEL);
 	if (!buf)
 		return -ENOMEM;
 
-	/* need of->sd for battr, its parent for kobj */
+	/*
+	 * @of->mutex nests outside active ref and is just to ensure that
+	 * the ops aren't called concurrently for the same open file.
+	 */
 	mutex_lock(&of->mutex);
 	if (!sysfs_get_active(of->sd)) {
-		count = -ENODEV;
+		len = -ENODEV;
 		mutex_unlock(&of->mutex);
 		goto out_free;
 	}
 
-	if (battr->read)
-		count = battr->read(file, kobj, battr, buf, offs, count);
-	else
-		count = -EIO;
+	len = sysfs_kf_bin_read(of, buf, len, *ppos);
 
 	sysfs_put_active(of->sd);
 	mutex_unlock(&of->mutex);
 
-	if (count < 0)
+	if (len < 0)
 		goto out_free;
 
-	if (copy_to_user(userbuf, buf, count)) {
-		count = -EFAULT;
+	if (copy_to_user(user_buf, buf, len)) {
+		len = -EFAULT;
 		goto out_free;
 	}
 
-	pr_debug("offs = %lld, *off = %lld, count = %d\n", offs, *off, count);
-
-	*off = offs + count;
+	*ppos += len;
 
  out_free:
 	kfree(buf);
-	return count;
+	return len;
+}
+
+/**
+ * kernfs_file_read - kernfs vfs read callback
+ * @file: file pointer
+ * @user_buf: data to write
+ * @count: number of bytes
+ * @ppos: starting offset
+ */
+static ssize_t kernfs_file_read(struct file *file, char __user *user_buf,
+				size_t count, loff_t *ppos)
+{
+	struct sysfs_open_file *of = sysfs_of(file);
+
+	if (sysfs_is_bin(of->sd))
+		return kernfs_file_direct_read(of, user_buf, count, ppos);
+	else
+		return seq_read(file, user_buf, count, ppos);
 }
 
 /**
@@ -660,12 +721,14 @@ static int sysfs_open_file(struct inode *inode, struct file *file)
 	 * and readable regular files are the vast majority anyway.
 	 */
 	if (sysfs_is_bin(attr_sd))
-		error = single_open(file, NULL, of);
+		error = seq_open(file, NULL);
 	else
-		error = single_open(file, sysfs_seq_show, of);
+		error = seq_open(file, &kernfs_seq_ops);
 	if (error)
 		goto err_free;
 
+	((struct seq_file *)file->private_data)->private = of;
+
 	/* seq_file clears PWRITE unconditionally, restore it if WRITE */
 	if (file->f_mode & FMODE_WRITE)
 		file->f_mode |= FMODE_PWRITE;
@@ -680,7 +743,7 @@ static int sysfs_open_file(struct inode *inode, struct file *file)
 	return 0;
 
 err_close:
-	single_release(inode, file);
+	seq_release(inode, file);
 err_free:
 	kfree(of);
 err_out:
@@ -694,7 +757,7 @@ static int sysfs_release(struct inode *inode, struct file *filp)
 	struct sysfs_open_file *of = sysfs_of(filp);
 
 	sysfs_put_open_dirent(sd, of);
-	single_release(inode, filp);
+	seq_release(inode, filp);
 	kfree(of);
 
 	return 0;
@@ -799,7 +862,7 @@ void sysfs_notify(struct kobject *k, const char *dir, const char *attr)
 EXPORT_SYMBOL_GPL(sysfs_notify);
 
 const struct file_operations sysfs_file_operations = {
-	.read		= seq_read,
+	.read		= kernfs_file_read,
 	.write		= sysfs_write_file,
 	.llseek		= seq_lseek,
 	.open		= sysfs_open_file,
@@ -808,7 +871,7 @@ const struct file_operations sysfs_file_operations = {
 };
 
 const struct file_operations sysfs_bin_operations = {
-	.read		= sysfs_bin_read,
+	.read		= kernfs_file_read,
 	.write		= sysfs_write_file,
 	.llseek		= generic_file_llseek,
 	.mmap		= sysfs_bin_mmap,
-- 
1.8.3.1


^ permalink raw reply related	[flat|nested] 46+ messages in thread

* [PATCH v2 25/34] sysfs, kernfs: add kernfs_ops->seq_{start|next|stop}()
  2013-10-24 15:49 ` [PATCH 25/34] sysfs, kernfs: add kernfs_ops->seq_{start|next|stop}() Tejun Heo
@ 2013-10-28 13:04   ` Tejun Heo
  0 siblings, 0 replies; 46+ messages in thread
From: Tejun Heo @ 2013-10-28 13:04 UTC (permalink / raw)
  To: gregkh; +Cc: kay, linux-kernel, ebiederm, bhelgaas

kernfs_ops currently only supports single_open() behavior which is
pretty restrictive.  Add optional callbacks ->seq_{start|next|stop}()
which, when implemented, are invoked for seq_file traversal.  This
allows full seq_file functionality for kernfs users.  This currently
doesn't have any user and doesn't change any behavior.

v2: Refreshed on top of the updated "sysfs, kernfs: prepare read path
    for kernfs".

Signed-off-by: Tejun Heo <tj@kernel.org>
---
 fs/sysfs/file.c        | 39 ++++++++++++++++++++++++++++-----------
 include/linux/kernfs.h |  9 +++++++--
 2 files changed, 35 insertions(+), 13 deletions(-)

diff --git a/fs/sysfs/file.c b/fs/sysfs/file.c
index 971f1d8..9b79160 100644
--- a/fs/sysfs/file.c
+++ b/fs/sysfs/file.c
@@ -146,6 +146,7 @@ static ssize_t sysfs_kf_bin_read(struct sysfs_open_file *of, char *buf,
 static void *kernfs_seq_start(struct seq_file *sf, loff_t *ppos)
 {
 	struct sysfs_open_file *of = sf->private;
+	const struct kernfs_ops *ops;
 
 	/*
 	 * @of->mutex nests outside active ref and is just to ensure that
@@ -157,26 +158,42 @@ static void *kernfs_seq_start(struct seq_file *sf, loff_t *ppos)
 		return ERR_PTR(-ENODEV);
 	}
 
-	/*
-	 * The same behavior and code as single_open().  Returns !NULL if
-	 * pos is at the beginning; otherwise, NULL.
-	 */
-	return NULL + !*ppos;
+	ops = kernfs_ops(of->sd);
+	if (ops->seq_start) {
+		return ops->seq_start(sf, ppos);
+	} else {
+		/*
+		 * The same behavior and code as single_open().  Returns
+		 * !NULL if pos is at the beginning; otherwise, NULL.
+		 */
+		return NULL + !*ppos;
+	}
 }
 
 static void *kernfs_seq_next(struct seq_file *sf, void *v, loff_t *ppos)
 {
-	/*
-	 * The same behavior and code as single_open(), always terminate
-	 * after the initial read.
-	 */
-	++*ppos;
-	return NULL;
+	struct sysfs_open_file *of = sf->private;
+	const struct kernfs_ops *ops = kernfs_ops(of->sd);
+
+	if (ops->seq_next) {
+		return ops->seq_next(sf, v, ppos);
+	} else {
+		/*
+		 * The same behavior and code as single_open(), always
+		 * terminate after the initial read.
+		 */
+		++*ppos;
+		return NULL;
+	}
 }
 
 static void kernfs_seq_stop(struct seq_file *sf, void *v)
 {
 	struct sysfs_open_file *of = sf->private;
+	const struct kernfs_ops *ops = kernfs_ops(of->sd);
+
+	if (ops->seq_stop)
+		ops->seq_stop(sf, v);
 
 	sysfs_put_active(of->sd);
 	mutex_unlock(&of->mutex);
diff --git a/include/linux/kernfs.h b/include/linux/kernfs.h
index a1b94b7..222040c 100644
--- a/include/linux/kernfs.h
+++ b/include/linux/kernfs.h
@@ -36,8 +36,9 @@ struct kernfs_ops {
 	/*
 	 * Read is handled by either seq_file or raw_read().
 	 *
-	 * If seq_show() is present, seq_file path is active.  The behavior
-	 * is equivalent to single_open().  @sf->private points to the
+	 * If seq_show() is present, seq_file path is active.  Other seq
+	 * operations are optional and if not implemented, the behavior is
+	 * equivalent to single_open().  @sf->private points to the
 	 * associated sysfs_open_file.
 	 *
 	 * read() is bounced through kernel buffer and a read larger than
@@ -45,6 +46,10 @@ struct kernfs_ops {
 	 */
 	int (*seq_show)(struct seq_file *sf, void *v);
 
+	void *(*seq_start)(struct seq_file *sf, loff_t *ppos);
+	void *(*seq_next)(struct seq_file *sf, void *v, loff_t *ppos);
+	void (*seq_stop)(struct seq_file *sf, void *v);
+
 	ssize_t (*read)(struct sysfs_open_file *of, char *buf, size_t bytes,
 			loff_t off);
 
-- 
1.8.3.1


^ permalink raw reply related	[flat|nested] 46+ messages in thread

* [PATCH v2 33/34] sysfs, kernfs: move file core code to fs/kernfs/file.c
  2013-10-24 15:49 ` [PATCH 33/34] sysfs, kernfs: move file core code to fs/kernfs/file.c Tejun Heo
@ 2013-10-28 13:04   ` Tejun Heo
  2013-10-28 16:34   ` [PATCH v3 " Tejun Heo
  1 sibling, 0 replies; 46+ messages in thread
From: Tejun Heo @ 2013-10-28 13:04 UTC (permalink / raw)
  To: gregkh; +Cc: kay, linux-kernel, ebiederm, bhelgaas

Move core file code to fs/kernfs/file.c.  fs/sysfs/file.c now contains
sysfs kernfs_ops callbacks, sysfs wrappers around kernfs interfaces,
and sysfs_schedule_callback().  The respective declarations in
fs/sysfs/sysfs.h are moved to fs/kernfs/kernfs-internal.h.

This is pure relocation.

v2: Refreshed on top of the updated "sysfs, kernfs: prepare read path
    for kernfs".

Signed-off-by: Tejun Heo <tj@kernel.org>
---
 fs/kernfs/file.c            | 800 ++++++++++++++++++++++++++++++++++++++++++++
 fs/kernfs/kernfs-internal.h |   7 +
 fs/sysfs/file.c             | 797 +------------------------------------------
 fs/sysfs/sysfs.h            |   4 -
 4 files changed, 808 insertions(+), 800 deletions(-)

diff --git a/fs/kernfs/file.c b/fs/kernfs/file.c
index 90b1e88..6011222 100644
--- a/fs/kernfs/file.c
+++ b/fs/kernfs/file.c
@@ -7,3 +7,803 @@
  *
  * This file is released under the GPLv2.
  */
+
+#include <linux/fs.h>
+#include <linux/seq_file.h>
+#include <linux/slab.h>
+#include <linux/poll.h>
+#include <linux/pagemap.h>
+#include <linux/poll.h>
+#include <linux/sched.h>
+
+#include "kernfs-internal.h"
+
+/*
+ * There's one sysfs_open_file for each open file and one sysfs_open_dirent
+ * for each sysfs_dirent with one or more open files.
+ *
+ * sysfs_dirent->s_attr.open points to sysfs_open_dirent.  s_attr.open is
+ * protected by sysfs_open_dirent_lock.
+ *
+ * filp->private_data points to seq_file whose ->private points to
+ * sysfs_open_file.  sysfs_open_files are chained at
+ * sysfs_open_dirent->files, which is protected by sysfs_open_file_mutex.
+ */
+static DEFINE_SPINLOCK(sysfs_open_dirent_lock);
+static DEFINE_MUTEX(sysfs_open_file_mutex);
+
+struct sysfs_open_dirent {
+	atomic_t		refcnt;
+	atomic_t		event;
+	wait_queue_head_t	poll;
+	struct list_head	files; /* goes through sysfs_open_file.list */
+};
+
+static struct sysfs_open_file *sysfs_of(struct file *file)
+{
+	return ((struct seq_file *)file->private_data)->private;
+}
+
+/*
+ * Determine the kernfs_ops for the given sysfs_dirent.  This function must
+ * be called while holding an active reference.
+ */
+static const struct kernfs_ops *kernfs_ops(struct sysfs_dirent *sd)
+{
+	if (sd->s_flags & SYSFS_FLAG_LOCKDEP)
+		lockdep_assert_held(sd);
+	return sd->s_attr.ops;
+}
+
+static void *kernfs_seq_start(struct seq_file *sf, loff_t *ppos)
+{
+	struct sysfs_open_file *of = sf->private;
+	const struct kernfs_ops *ops;
+
+	/*
+	 * @of->mutex nests outside active ref and is just to ensure that
+	 * the ops aren't called concurrently for the same open file.
+	 */
+	mutex_lock(&of->mutex);
+	if (!sysfs_get_active(of->sd)) {
+		mutex_unlock(&of->mutex);
+		return ERR_PTR(-ENODEV);
+	}
+
+	ops = kernfs_ops(of->sd);
+	if (ops->seq_start) {
+		return ops->seq_start(sf, ppos);
+	} else {
+		/*
+		 * The same behavior and code as single_open().  Returns
+		 * !NULL if pos is at the beginning; otherwise, NULL.
+		 */
+		return NULL + !*ppos;
+	}
+}
+
+static void *kernfs_seq_next(struct seq_file *sf, void *v, loff_t *ppos)
+{
+	struct sysfs_open_file *of = sf->private;
+	const struct kernfs_ops *ops = kernfs_ops(of->sd);
+
+	if (ops->seq_next) {
+		return ops->seq_next(sf, v, ppos);
+	} else {
+		/*
+		 * The same behavior and code as single_open(), always
+		 * terminate after the initial read.
+		 */
+		++*ppos;
+		return NULL;
+	}
+}
+
+static void kernfs_seq_stop(struct seq_file *sf, void *v)
+{
+	struct sysfs_open_file *of = sf->private;
+	const struct kernfs_ops *ops = kernfs_ops(of->sd);
+
+	if (ops->seq_stop)
+		ops->seq_stop(sf, v);
+
+	sysfs_put_active(of->sd);
+	mutex_unlock(&of->mutex);
+}
+
+static int kernfs_seq_show(struct seq_file *sf, void *v)
+{
+	struct sysfs_open_file *of = sf->private;
+
+	of->event = atomic_read(&of->sd->s_attr.open->event);
+
+	return of->sd->s_attr.ops->seq_show(sf, v);
+}
+
+static const struct seq_operations kernfs_seq_ops = {
+	.start = kernfs_seq_start,
+	.next = kernfs_seq_next,
+	.stop = kernfs_seq_stop,
+	.show = kernfs_seq_show,
+};
+
+/*
+ * As reading a bin file can have side-effects, the exact offset and bytes
+ * specified in read(2) call should be passed to the read callback making
+ * it difficult to use seq_file.  Implement simplistic custom buffering for
+ * bin files.
+ */
+static ssize_t kernfs_file_direct_read(struct sysfs_open_file *of,
+				       char __user *user_buf, size_t count,
+				       loff_t *ppos)
+{
+	ssize_t len = min_t(size_t, count, PAGE_SIZE);
+	const struct kernfs_ops *ops;
+	char *buf;
+
+	buf = kmalloc(len, GFP_KERNEL);
+	if (!buf)
+		return -ENOMEM;
+
+	/*
+	 * @of->mutex nests outside active ref and is just to ensure that
+	 * the ops aren't called concurrently for the same open file.
+	 */
+	mutex_lock(&of->mutex);
+	if (!sysfs_get_active(of->sd)) {
+		len = -ENODEV;
+		mutex_unlock(&of->mutex);
+		goto out_free;
+	}
+
+	ops = kernfs_ops(of->sd);
+	if (ops->read)
+		len = ops->read(of, buf, len, *ppos);
+	else
+		len = -EINVAL;
+
+	sysfs_put_active(of->sd);
+	mutex_unlock(&of->mutex);
+
+	if (len < 0)
+		goto out_free;
+
+	if (copy_to_user(user_buf, buf, len)) {
+		len = -EFAULT;
+		goto out_free;
+	}
+
+	*ppos += len;
+
+ out_free:
+	kfree(buf);
+	return len;
+}
+
+/**
+ * kernfs_file_read - kernfs vfs read callback
+ * @file: file pointer
+ * @user_buf: data to write
+ * @count: number of bytes
+ * @ppos: starting offset
+ */
+static ssize_t kernfs_file_read(struct file *file, char __user *user_buf,
+				size_t count, loff_t *ppos)
+{
+	struct sysfs_open_file *of = sysfs_of(file);
+
+	if (of->sd->s_flags & SYSFS_FLAG_HAS_SEQ_SHOW)
+		return seq_read(file, user_buf, count, ppos);
+	else
+		return kernfs_file_direct_read(of, user_buf, count, ppos);
+}
+
+/**
+ * kernfs_file_write - kernfs vfs write callback
+ * @file: file pointer
+ * @user_buf: data to write
+ * @count: number of bytes
+ * @ppos: starting offset
+ *
+ * Copy data in from userland and pass it to the matching kernfs write
+ * operation.
+ *
+ * There is no easy way for us to know if userspace is only doing a partial
+ * write, so we don't support them. We expect the entire buffer to come on
+ * the first write.  Hint: if you're writing a value, first read the file,
+ * modify only the the value you're changing, then write entire buffer
+ * back.
+ */
+static ssize_t kernfs_file_write(struct file *file, const char __user *user_buf,
+				 size_t count, loff_t *ppos)
+{
+	struct sysfs_open_file *of = sysfs_of(file);
+	ssize_t len = min_t(size_t, count, PAGE_SIZE);
+	const struct kernfs_ops *ops;
+	char *buf;
+
+	buf = kmalloc(len + 1, GFP_KERNEL);
+	if (!buf)
+		return -ENOMEM;
+
+	if (copy_from_user(buf, user_buf, len)) {
+		len = -EFAULT;
+		goto out_free;
+	}
+	buf[len] = '\0';	/* guarantee string termination */
+
+	/*
+	 * @of->mutex nests outside active ref and is just to ensure that
+	 * the ops aren't called concurrently for the same open file.
+	 */
+	mutex_lock(&of->mutex);
+	if (!sysfs_get_active(of->sd)) {
+		mutex_unlock(&of->mutex);
+		len = -ENODEV;
+		goto out_free;
+	}
+
+	ops = kernfs_ops(of->sd);
+	if (ops->write)
+		len = ops->write(of, buf, len, *ppos);
+	else
+		len = -EINVAL;
+
+	sysfs_put_active(of->sd);
+	mutex_unlock(&of->mutex);
+
+	if (len > 0)
+		*ppos += len;
+out_free:
+	kfree(buf);
+	return len;
+}
+
+static loff_t kernfs_file_llseek(struct file *file, loff_t off, int whence)
+{
+	struct sysfs_open_file *of = sysfs_of(file);
+
+	if (of->sd->s_flags & SYSFS_FLAG_HAS_SEQ_SHOW)
+		return seq_lseek(file, off, whence);
+	else
+		return generic_file_llseek(file, off, whence);
+}
+
+static void kernfs_vma_open(struct vm_area_struct *vma)
+{
+	struct file *file = vma->vm_file;
+	struct sysfs_open_file *of = sysfs_of(file);
+
+	if (!of->vm_ops)
+		return;
+
+	if (!sysfs_get_active(of->sd))
+		return;
+
+	if (of->vm_ops->open)
+		of->vm_ops->open(vma);
+
+	sysfs_put_active(of->sd);
+}
+
+static int kernfs_vma_fault(struct vm_area_struct *vma, struct vm_fault *vmf)
+{
+	struct file *file = vma->vm_file;
+	struct sysfs_open_file *of = sysfs_of(file);
+	int ret;
+
+	if (!of->vm_ops)
+		return VM_FAULT_SIGBUS;
+
+	if (!sysfs_get_active(of->sd))
+		return VM_FAULT_SIGBUS;
+
+	ret = VM_FAULT_SIGBUS;
+	if (of->vm_ops->fault)
+		ret = of->vm_ops->fault(vma, vmf);
+
+	sysfs_put_active(of->sd);
+	return ret;
+}
+
+static int kernfs_vma_page_mkwrite(struct vm_area_struct *vma,
+				   struct vm_fault *vmf)
+{
+	struct file *file = vma->vm_file;
+	struct sysfs_open_file *of = sysfs_of(file);
+	int ret;
+
+	if (!of->vm_ops)
+		return VM_FAULT_SIGBUS;
+
+	if (!sysfs_get_active(of->sd))
+		return VM_FAULT_SIGBUS;
+
+	ret = 0;
+	if (of->vm_ops->page_mkwrite)
+		ret = of->vm_ops->page_mkwrite(vma, vmf);
+	else
+		file_update_time(file);
+
+	sysfs_put_active(of->sd);
+	return ret;
+}
+
+static int kernfs_vma_access(struct vm_area_struct *vma, unsigned long addr,
+			     void *buf, int len, int write)
+{
+	struct file *file = vma->vm_file;
+	struct sysfs_open_file *of = sysfs_of(file);
+	int ret;
+
+	if (!of->vm_ops)
+		return -EINVAL;
+
+	if (!sysfs_get_active(of->sd))
+		return -EINVAL;
+
+	ret = -EINVAL;
+	if (of->vm_ops->access)
+		ret = of->vm_ops->access(vma, addr, buf, len, write);
+
+	sysfs_put_active(of->sd);
+	return ret;
+}
+
+#ifdef CONFIG_NUMA
+static int kernfs_vma_set_policy(struct vm_area_struct *vma,
+				 struct mempolicy *new)
+{
+	struct file *file = vma->vm_file;
+	struct sysfs_open_file *of = sysfs_of(file);
+	int ret;
+
+	if (!of->vm_ops)
+		return 0;
+
+	if (!sysfs_get_active(of->sd))
+		return -EINVAL;
+
+	ret = 0;
+	if (of->vm_ops->set_policy)
+		ret = of->vm_ops->set_policy(vma, new);
+
+	sysfs_put_active(of->sd);
+	return ret;
+}
+
+static struct mempolicy *kernfs_vma_get_policy(struct vm_area_struct *vma,
+					       unsigned long addr)
+{
+	struct file *file = vma->vm_file;
+	struct sysfs_open_file *of = sysfs_of(file);
+	struct mempolicy *pol;
+
+	if (!of->vm_ops)
+		return vma->vm_policy;
+
+	if (!sysfs_get_active(of->sd))
+		return vma->vm_policy;
+
+	pol = vma->vm_policy;
+	if (of->vm_ops->get_policy)
+		pol = of->vm_ops->get_policy(vma, addr);
+
+	sysfs_put_active(of->sd);
+	return pol;
+}
+
+static int kernfs_vma_migrate(struct vm_area_struct *vma,
+			      const nodemask_t *from, const nodemask_t *to,
+			      unsigned long flags)
+{
+	struct file *file = vma->vm_file;
+	struct sysfs_open_file *of = sysfs_of(file);
+	int ret;
+
+	if (!of->vm_ops)
+		return 0;
+
+	if (!sysfs_get_active(of->sd))
+		return 0;
+
+	ret = 0;
+	if (of->vm_ops->migrate)
+		ret = of->vm_ops->migrate(vma, from, to, flags);
+
+	sysfs_put_active(of->sd);
+	return ret;
+}
+#endif
+
+static const struct vm_operations_struct kernfs_vm_ops = {
+	.open		= kernfs_vma_open,
+	.fault		= kernfs_vma_fault,
+	.page_mkwrite	= kernfs_vma_page_mkwrite,
+	.access		= kernfs_vma_access,
+#ifdef CONFIG_NUMA
+	.set_policy	= kernfs_vma_set_policy,
+	.get_policy	= kernfs_vma_get_policy,
+	.migrate	= kernfs_vma_migrate,
+#endif
+};
+
+static int kernfs_file_mmap(struct file *file, struct vm_area_struct *vma)
+{
+	struct sysfs_open_file *of = sysfs_of(file);
+	const struct kernfs_ops *ops;
+	int rc;
+
+	mutex_lock(&of->mutex);
+
+	rc = -ENODEV;
+	if (!sysfs_get_active(of->sd))
+		goto out_unlock;
+
+	ops = kernfs_ops(of->sd);
+	if (ops->mmap)
+		rc = ops->mmap(of, vma);
+	if (rc)
+		goto out_put;
+
+	/*
+	 * PowerPC's pci_mmap of legacy_mem uses shmem_zero_setup()
+	 * to satisfy versions of X which crash if the mmap fails: that
+	 * substitutes a new vm_file, and we don't then want bin_vm_ops.
+	 */
+	if (vma->vm_file != file)
+		goto out_put;
+
+	rc = -EINVAL;
+	if (of->mmapped && of->vm_ops != vma->vm_ops)
+		goto out_put;
+
+	/*
+	 * It is not possible to successfully wrap close.
+	 * So error if someone is trying to use close.
+	 */
+	rc = -EINVAL;
+	if (vma->vm_ops && vma->vm_ops->close)
+		goto out_put;
+
+	rc = 0;
+	of->mmapped = 1;
+	of->vm_ops = vma->vm_ops;
+	vma->vm_ops = &kernfs_vm_ops;
+out_put:
+	sysfs_put_active(of->sd);
+out_unlock:
+	mutex_unlock(&of->mutex);
+
+	return rc;
+}
+
+/**
+ *	sysfs_get_open_dirent - get or create sysfs_open_dirent
+ *	@sd: target sysfs_dirent
+ *	@of: sysfs_open_file for this instance of open
+ *
+ *	If @sd->s_attr.open exists, increment its reference count;
+ *	otherwise, create one.  @of is chained to the files list.
+ *
+ *	LOCKING:
+ *	Kernel thread context (may sleep).
+ *
+ *	RETURNS:
+ *	0 on success, -errno on failure.
+ */
+static int sysfs_get_open_dirent(struct sysfs_dirent *sd,
+				 struct sysfs_open_file *of)
+{
+	struct sysfs_open_dirent *od, *new_od = NULL;
+
+ retry:
+	mutex_lock(&sysfs_open_file_mutex);
+	spin_lock_irq(&sysfs_open_dirent_lock);
+
+	if (!sd->s_attr.open && new_od) {
+		sd->s_attr.open = new_od;
+		new_od = NULL;
+	}
+
+	od = sd->s_attr.open;
+	if (od) {
+		atomic_inc(&od->refcnt);
+		list_add_tail(&of->list, &od->files);
+	}
+
+	spin_unlock_irq(&sysfs_open_dirent_lock);
+	mutex_unlock(&sysfs_open_file_mutex);
+
+	if (od) {
+		kfree(new_od);
+		return 0;
+	}
+
+	/* not there, initialize a new one and retry */
+	new_od = kmalloc(sizeof(*new_od), GFP_KERNEL);
+	if (!new_od)
+		return -ENOMEM;
+
+	atomic_set(&new_od->refcnt, 0);
+	atomic_set(&new_od->event, 1);
+	init_waitqueue_head(&new_od->poll);
+	INIT_LIST_HEAD(&new_od->files);
+	goto retry;
+}
+
+/**
+ *	sysfs_put_open_dirent - put sysfs_open_dirent
+ *	@sd: target sysfs_dirent
+ *	@of: associated sysfs_open_file
+ *
+ *	Put @sd->s_attr.open and unlink @of from the files list.  If
+ *	reference count reaches zero, disassociate and free it.
+ *
+ *	LOCKING:
+ *	None.
+ */
+static void sysfs_put_open_dirent(struct sysfs_dirent *sd,
+				  struct sysfs_open_file *of)
+{
+	struct sysfs_open_dirent *od = sd->s_attr.open;
+	unsigned long flags;
+
+	mutex_lock(&sysfs_open_file_mutex);
+	spin_lock_irqsave(&sysfs_open_dirent_lock, flags);
+
+	if (of)
+		list_del(&of->list);
+
+	if (atomic_dec_and_test(&od->refcnt))
+		sd->s_attr.open = NULL;
+	else
+		od = NULL;
+
+	spin_unlock_irqrestore(&sysfs_open_dirent_lock, flags);
+	mutex_unlock(&sysfs_open_file_mutex);
+
+	kfree(od);
+}
+
+static int kernfs_file_open(struct inode *inode, struct file *file)
+{
+	struct sysfs_dirent *attr_sd = file->f_path.dentry->d_fsdata;
+	const struct kernfs_ops *ops;
+	struct sysfs_open_file *of;
+	bool has_read, has_write;
+	int error = -EACCES;
+
+	if (!sysfs_get_active(attr_sd))
+		return -ENODEV;
+
+	ops = kernfs_ops(attr_sd);
+
+	has_read = ops->seq_show || ops->read || ops->mmap;
+	has_write = ops->write || ops->mmap;
+
+	/* check perms and supported operations */
+	if ((file->f_mode & FMODE_WRITE) &&
+	    (!(inode->i_mode & S_IWUGO) || !has_write))
+		goto err_out;
+
+	if ((file->f_mode & FMODE_READ) &&
+	    (!(inode->i_mode & S_IRUGO) || !has_read))
+		goto err_out;
+
+	/* allocate a sysfs_open_file for the file */
+	error = -ENOMEM;
+	of = kzalloc(sizeof(struct sysfs_open_file), GFP_KERNEL);
+	if (!of)
+		goto err_out;
+
+	mutex_init(&of->mutex);
+	of->sd = attr_sd;
+	of->file = file;
+
+	/*
+	 * Always instantiate seq_file even if read access doesn't use
+	 * seq_file or is not requested.  This unifies private data access
+	 * and readable regular files are the vast majority anyway.
+	 */
+	if (ops->seq_show)
+		error = seq_open(file, &kernfs_seq_ops);
+	else
+		error = seq_open(file, NULL);
+	if (error)
+		goto err_free;
+
+	((struct seq_file *)file->private_data)->private = of;
+
+	/* seq_file clears PWRITE unconditionally, restore it if WRITE */
+	if (file->f_mode & FMODE_WRITE)
+		file->f_mode |= FMODE_PWRITE;
+
+	/* make sure we have open dirent struct */
+	error = sysfs_get_open_dirent(attr_sd, of);
+	if (error)
+		goto err_close;
+
+	/* open succeeded, put active references */
+	sysfs_put_active(attr_sd);
+	return 0;
+
+err_close:
+	seq_release(inode, file);
+err_free:
+	kfree(of);
+err_out:
+	sysfs_put_active(attr_sd);
+	return error;
+}
+
+static int kernfs_file_release(struct inode *inode, struct file *filp)
+{
+	struct sysfs_dirent *sd = filp->f_path.dentry->d_fsdata;
+	struct sysfs_open_file *of = sysfs_of(filp);
+
+	sysfs_put_open_dirent(sd, of);
+	seq_release(inode, filp);
+	kfree(of);
+
+	return 0;
+}
+
+void sysfs_unmap_bin_file(struct sysfs_dirent *sd)
+{
+	struct sysfs_open_dirent *od;
+	struct sysfs_open_file *of;
+
+	if (!(sd->s_flags & SYSFS_FLAG_HAS_MMAP))
+		return;
+
+	spin_lock_irq(&sysfs_open_dirent_lock);
+	od = sd->s_attr.open;
+	if (od)
+		atomic_inc(&od->refcnt);
+	spin_unlock_irq(&sysfs_open_dirent_lock);
+	if (!od)
+		return;
+
+	mutex_lock(&sysfs_open_file_mutex);
+	list_for_each_entry(of, &od->files, list) {
+		struct inode *inode = file_inode(of->file);
+		unmap_mapping_range(inode->i_mapping, 0, 0, 1);
+	}
+	mutex_unlock(&sysfs_open_file_mutex);
+
+	sysfs_put_open_dirent(sd, NULL);
+}
+
+/* Sysfs attribute files are pollable.  The idea is that you read
+ * the content and then you use 'poll' or 'select' to wait for
+ * the content to change.  When the content changes (assuming the
+ * manager for the kobject supports notification), poll will
+ * return POLLERR|POLLPRI, and select will return the fd whether
+ * it is waiting for read, write, or exceptions.
+ * Once poll/select indicates that the value has changed, you
+ * need to close and re-open the file, or seek to 0 and read again.
+ * Reminder: this only works for attributes which actively support
+ * it, and it is not possible to test an attribute from userspace
+ * to see if it supports poll (Neither 'poll' nor 'select' return
+ * an appropriate error code).  When in doubt, set a suitable timeout value.
+ */
+static unsigned int kernfs_file_poll(struct file *filp, poll_table *wait)
+{
+	struct sysfs_open_file *of = sysfs_of(filp);
+	struct sysfs_dirent *attr_sd = filp->f_path.dentry->d_fsdata;
+	struct sysfs_open_dirent *od = attr_sd->s_attr.open;
+
+	/* need parent for the kobj, grab both */
+	if (!sysfs_get_active(attr_sd))
+		goto trigger;
+
+	poll_wait(filp, &od->poll, wait);
+
+	sysfs_put_active(attr_sd);
+
+	if (of->event != atomic_read(&od->event))
+		goto trigger;
+
+	return DEFAULT_POLLMASK;
+
+ trigger:
+	return DEFAULT_POLLMASK|POLLERR|POLLPRI;
+}
+
+/**
+ * kernfs_notify - notify a kernfs file
+ * @sd: file to notify
+ *
+ * Notify @sd such that poll(2) on @sd wakes up.
+ */
+void kernfs_notify(struct sysfs_dirent *sd)
+{
+	struct sysfs_open_dirent *od;
+	unsigned long flags;
+
+	spin_lock_irqsave(&sysfs_open_dirent_lock, flags);
+
+	if (!WARN_ON(sysfs_type(sd) != SYSFS_KOBJ_ATTR)) {
+		od = sd->s_attr.open;
+		if (od) {
+			atomic_inc(&od->event);
+			wake_up_interruptible(&od->poll);
+		}
+	}
+
+	spin_unlock_irqrestore(&sysfs_open_dirent_lock, flags);
+}
+EXPORT_SYMBOL_GPL(kernfs_notify);
+
+const struct file_operations kernfs_file_operations = {
+	.read		= kernfs_file_read,
+	.write		= kernfs_file_write,
+	.llseek		= kernfs_file_llseek,
+	.mmap		= kernfs_file_mmap,
+	.open		= kernfs_file_open,
+	.release	= kernfs_file_release,
+	.poll		= kernfs_file_poll,
+};
+
+/**
+ * kernfs_create_file_ns_key - create a file
+ * @parent: directory to create the file in
+ * @name: name of the file
+ * @mode: mode of the file
+ * @size: size of the file
+ * @ops: kernfs operations for the file
+ * @priv: private data for the file
+ * @ns: optional namespace tag of the file
+ * @key: lockdep key for the file's active_ref, %NULL to disable lockdep
+ *
+ * Returns the created node on success, ERR_PTR() value on error.
+ */
+struct sysfs_dirent *kernfs_create_file_ns_key(struct sysfs_dirent *parent,
+					       const char *name,
+					       umode_t mode, loff_t size,
+					       const struct kernfs_ops *ops,
+					       void *priv, const void *ns,
+					       struct lock_class_key *key)
+{
+	struct sysfs_addrm_cxt acxt;
+	struct sysfs_dirent *sd;
+	int rc;
+
+	sd = sysfs_new_dirent(name, (mode & S_IALLUGO) | S_IFREG,
+			      SYSFS_KOBJ_ATTR);
+	if (!sd)
+		return ERR_PTR(-ENOMEM);
+
+	sd->s_attr.ops = ops;
+	sd->s_attr.size = size;
+	sd->s_ns = ns;
+	sd->priv = priv;
+
+#ifdef CONFIG_DEBUG_LOCK_ALLOC
+	if (key) {
+		lockdep_init_map(&sd->dep_map, "s_active", key, 0);
+		sd->s_flags |= SYSFS_FLAG_LOCKDEP;
+	}
+#endif
+
+	/*
+	 * sd->s_attr.ops is accesible only while holding active ref.  We
+	 * need to know whether some ops are implemented outside active
+	 * ref.  Cache their existence in flags.
+	 */
+	if (ops->seq_show)
+		sd->s_flags |= SYSFS_FLAG_HAS_SEQ_SHOW;
+	if (ops->mmap)
+		sd->s_flags |= SYSFS_FLAG_HAS_MMAP;
+
+	sysfs_addrm_start(&acxt);
+	rc = sysfs_add_one(&acxt, sd, parent);
+	sysfs_addrm_finish(&acxt);
+
+	if (rc) {
+		kernfs_put(sd);
+		return ERR_PTR(rc);
+	}
+	return sd;
+}
diff --git a/fs/kernfs/kernfs-internal.h b/fs/kernfs/kernfs-internal.h
index 67b111b..1046624 100644
--- a/fs/kernfs/kernfs-internal.h
+++ b/fs/kernfs/kernfs-internal.h
@@ -142,4 +142,11 @@ int sysfs_add_one(struct sysfs_addrm_cxt *acxt, struct sysfs_dirent *sd,
 void sysfs_addrm_finish(struct sysfs_addrm_cxt *acxt);
 struct sysfs_dirent *sysfs_new_dirent(const char *name, umode_t mode, int type);
 
+/*
+ * file.c
+ */
+extern const struct file_operations kernfs_file_operations;
+
+void sysfs_unmap_bin_file(struct sysfs_dirent *sd);
+
 #endif	/* __KERNFS_INTERNAL_H */
diff --git a/fs/sysfs/file.c b/fs/sysfs/file.c
index 8ee2e27..6634a9f 100644
--- a/fs/sysfs/file.c
+++ b/fs/sysfs/file.c
@@ -14,54 +14,12 @@
 #include <linux/kobject.h>
 #include <linux/kallsyms.h>
 #include <linux/slab.h>
-#include <linux/fsnotify.h>
-#include <linux/namei.h>
-#include <linux/poll.h>
 #include <linux/list.h>
 #include <linux/mutex.h>
-#include <linux/limits.h>
-#include <linux/uaccess.h>
 #include <linux/seq_file.h>
-#include <linux/mm.h>
 
 #include "sysfs.h"
-
-/*
- * There's one sysfs_open_file for each open file and one sysfs_open_dirent
- * for each sysfs_dirent with one or more open files.
- *
- * sysfs_dirent->s_attr.open points to sysfs_open_dirent.  s_attr.open is
- * protected by sysfs_open_dirent_lock.
- *
- * filp->private_data points to seq_file whose ->private points to
- * sysfs_open_file.  sysfs_open_files are chained at
- * sysfs_open_dirent->files, which is protected by sysfs_open_file_mutex.
- */
-static DEFINE_SPINLOCK(sysfs_open_dirent_lock);
-static DEFINE_MUTEX(sysfs_open_file_mutex);
-
-struct sysfs_open_dirent {
-	atomic_t		refcnt;
-	atomic_t		event;
-	wait_queue_head_t	poll;
-	struct list_head	files; /* goes through sysfs_open_file.list */
-};
-
-static struct sysfs_open_file *sysfs_of(struct file *file)
-{
-	return ((struct seq_file *)file->private_data)->private;
-}
-
-/*
- * Determine the kernfs_ops for the given sysfs_dirent.  This function must
- * be called while holding an active reference.
- */
-static const struct kernfs_ops *kernfs_ops(struct sysfs_dirent *sd)
-{
-	if (sd->s_flags & SYSFS_FLAG_LOCKDEP)
-		lockdep_assert_held(sd);
-	return sd->s_attr.ops;
-}
+#include "../kernfs/kernfs-internal.h"
 
 /*
  * Determine ktype->sysfs_ops for the given sysfs_dirent.  This function
@@ -143,149 +101,6 @@ static ssize_t sysfs_kf_bin_read(struct sysfs_open_file *of, char *buf,
 	return battr->read(of->file, kobj, battr, buf, pos, count);
 }
 
-static void *kernfs_seq_start(struct seq_file *sf, loff_t *ppos)
-{
-	struct sysfs_open_file *of = sf->private;
-	const struct kernfs_ops *ops;
-
-	/*
-	 * @of->mutex nests outside active ref and is just to ensure that
-	 * the ops aren't called concurrently for the same open file.
-	 */
-	mutex_lock(&of->mutex);
-	if (!sysfs_get_active(of->sd)) {
-		mutex_unlock(&of->mutex);
-		return ERR_PTR(-ENODEV);
-	}
-
-	ops = kernfs_ops(of->sd);
-	if (ops->seq_start) {
-		return ops->seq_start(sf, ppos);
-	} else {
-		/*
-		 * The same behavior and code as single_open().  Returns
-		 * !NULL if pos is at the beginning; otherwise, NULL.
-		 */
-		return NULL + !*ppos;
-	}
-}
-
-static void *kernfs_seq_next(struct seq_file *sf, void *v, loff_t *ppos)
-{
-	struct sysfs_open_file *of = sf->private;
-	const struct kernfs_ops *ops = kernfs_ops(of->sd);
-
-	if (ops->seq_next) {
-		return ops->seq_next(sf, v, ppos);
-	} else {
-		/*
-		 * The same behavior and code as single_open(), always
-		 * terminate after the initial read.
-		 */
-		++*ppos;
-		return NULL;
-	}
-}
-
-static void kernfs_seq_stop(struct seq_file *sf, void *v)
-{
-	struct sysfs_open_file *of = sf->private;
-	const struct kernfs_ops *ops = kernfs_ops(of->sd);
-
-	if (ops->seq_stop)
-		ops->seq_stop(sf, v);
-
-	sysfs_put_active(of->sd);
-	mutex_unlock(&of->mutex);
-}
-
-static int kernfs_seq_show(struct seq_file *sf, void *v)
-{
-	struct sysfs_open_file *of = sf->private;
-
-	of->event = atomic_read(&of->sd->s_attr.open->event);
-
-	return of->sd->s_attr.ops->seq_show(sf, v);
-}
-
-static const struct seq_operations kernfs_seq_ops = {
-	.start = kernfs_seq_start,
-	.next = kernfs_seq_next,
-	.stop = kernfs_seq_stop,
-	.show = kernfs_seq_show,
-};
-
-/*
- * As reading a bin file can have side-effects, the exact offset and bytes
- * specified in read(2) call should be passed to the read callback making
- * it difficult to use seq_file.  Implement simplistic custom buffering for
- * bin files.
- */
-static ssize_t kernfs_file_direct_read(struct sysfs_open_file *of,
-				       char __user *user_buf, size_t count,
-				       loff_t *ppos)
-{
-	ssize_t len = min_t(size_t, count, PAGE_SIZE);
-	const struct kernfs_ops *ops;
-	char *buf;
-
-	buf = kmalloc(len, GFP_KERNEL);
-	if (!buf)
-		return -ENOMEM;
-
-	/*
-	 * @of->mutex nests outside active ref and is just to ensure that
-	 * the ops aren't called concurrently for the same open file.
-	 */
-	mutex_lock(&of->mutex);
-	if (!sysfs_get_active(of->sd)) {
-		len = -ENODEV;
-		mutex_unlock(&of->mutex);
-		goto out_free;
-	}
-
-	ops = kernfs_ops(of->sd);
-	if (ops->read)
-		len = ops->read(of, buf, len, *ppos);
-	else
-		len = -EINVAL;
-
-	sysfs_put_active(of->sd);
-	mutex_unlock(&of->mutex);
-
-	if (len < 0)
-		goto out_free;
-
-	if (copy_to_user(user_buf, buf, len)) {
-		len = -EFAULT;
-		goto out_free;
-	}
-
-	*ppos += len;
-
- out_free:
-	kfree(buf);
-	return len;
-}
-
-/**
- * kernfs_file_read - kernfs vfs read callback
- * @file: file pointer
- * @user_buf: data to write
- * @count: number of bytes
- * @ppos: starting offset
- */
-static ssize_t kernfs_file_read(struct file *file, char __user *user_buf,
-				size_t count, loff_t *ppos)
-{
-	struct sysfs_open_file *of = sysfs_of(file);
-
-	if (of->sd->s_flags & SYSFS_FLAG_HAS_SEQ_SHOW)
-		return seq_read(file, user_buf, count, ppos);
-	else
-		return kernfs_file_direct_read(of, user_buf, count, ppos);
-}
-
 /* kernfs write callback for regular sysfs files */
 static ssize_t sysfs_kf_write(struct sysfs_open_file *of, char *buf,
 			      size_t count, loff_t pos)
@@ -321,77 +136,6 @@ static ssize_t sysfs_kf_bin_write(struct sysfs_open_file *of, char *buf,
 	return battr->write(of->file, kobj, battr, buf, pos, count);
 }
 
-/**
- * kernfs_file_write - kernfs vfs write callback
- * @file: file pointer
- * @user_buf: data to write
- * @count: number of bytes
- * @ppos: starting offset
- *
- * Copy data in from userland and pass it to the matching kernfs write
- * operation.
- *
- * There is no easy way for us to know if userspace is only doing a partial
- * write, so we don't support them. We expect the entire buffer to come on
- * the first write.  Hint: if you're writing a value, first read the file,
- * modify only the the value you're changing, then write entire buffer
- * back.
- */
-static ssize_t kernfs_file_write(struct file *file, const char __user *user_buf,
-				 size_t count, loff_t *ppos)
-{
-	struct sysfs_open_file *of = sysfs_of(file);
-	ssize_t len = min_t(size_t, count, PAGE_SIZE);
-	const struct kernfs_ops *ops;
-	char *buf;
-
-	buf = kmalloc(len + 1, GFP_KERNEL);
-	if (!buf)
-		return -ENOMEM;
-
-	if (copy_from_user(buf, user_buf, len)) {
-		len = -EFAULT;
-		goto out_free;
-	}
-	buf[len] = '\0';	/* guarantee string termination */
-
-	/*
-	 * @of->mutex nests outside active ref and is just to ensure that
-	 * the ops aren't called concurrently for the same open file.
-	 */
-	mutex_lock(&of->mutex);
-	if (!sysfs_get_active(of->sd)) {
-		mutex_unlock(&of->mutex);
-		len = -ENODEV;
-		goto out_free;
-	}
-
-	ops = kernfs_ops(of->sd);
-	if (ops->write)
-		len = ops->write(of, buf, len, *ppos);
-	else
-		len = -EINVAL;
-
-	sysfs_put_active(of->sd);
-	mutex_unlock(&of->mutex);
-
-	if (len > 0)
-		*ppos += len;
-out_free:
-	kfree(buf);
-	return len;
-}
-
-static loff_t kernfs_file_llseek(struct file *file, loff_t off, int whence)
-{
-	struct sysfs_open_file *of = sysfs_of(file);
-
-	if (of->sd->s_flags & SYSFS_FLAG_HAS_SEQ_SHOW)
-		return seq_lseek(file, off, whence);
-	else
-		return generic_file_llseek(file, off, whence);
-}
-
 static int sysfs_kf_bin_mmap(struct sysfs_open_file *of,
 			     struct vm_area_struct *vma)
 {
@@ -404,473 +148,6 @@ static int sysfs_kf_bin_mmap(struct sysfs_open_file *of,
 	return battr->mmap(of->file, kobj, battr, vma);
 }
 
-static void kernfs_vma_open(struct vm_area_struct *vma)
-{
-	struct file *file = vma->vm_file;
-	struct sysfs_open_file *of = sysfs_of(file);
-
-	if (!of->vm_ops)
-		return;
-
-	if (!sysfs_get_active(of->sd))
-		return;
-
-	if (of->vm_ops->open)
-		of->vm_ops->open(vma);
-
-	sysfs_put_active(of->sd);
-}
-
-static int kernfs_vma_fault(struct vm_area_struct *vma, struct vm_fault *vmf)
-{
-	struct file *file = vma->vm_file;
-	struct sysfs_open_file *of = sysfs_of(file);
-	int ret;
-
-	if (!of->vm_ops)
-		return VM_FAULT_SIGBUS;
-
-	if (!sysfs_get_active(of->sd))
-		return VM_FAULT_SIGBUS;
-
-	ret = VM_FAULT_SIGBUS;
-	if (of->vm_ops->fault)
-		ret = of->vm_ops->fault(vma, vmf);
-
-	sysfs_put_active(of->sd);
-	return ret;
-}
-
-static int kernfs_vma_page_mkwrite(struct vm_area_struct *vma,
-				   struct vm_fault *vmf)
-{
-	struct file *file = vma->vm_file;
-	struct sysfs_open_file *of = sysfs_of(file);
-	int ret;
-
-	if (!of->vm_ops)
-		return VM_FAULT_SIGBUS;
-
-	if (!sysfs_get_active(of->sd))
-		return VM_FAULT_SIGBUS;
-
-	ret = 0;
-	if (of->vm_ops->page_mkwrite)
-		ret = of->vm_ops->page_mkwrite(vma, vmf);
-	else
-		file_update_time(file);
-
-	sysfs_put_active(of->sd);
-	return ret;
-}
-
-static int kernfs_vma_access(struct vm_area_struct *vma, unsigned long addr,
-			     void *buf, int len, int write)
-{
-	struct file *file = vma->vm_file;
-	struct sysfs_open_file *of = sysfs_of(file);
-	int ret;
-
-	if (!of->vm_ops)
-		return -EINVAL;
-
-	if (!sysfs_get_active(of->sd))
-		return -EINVAL;
-
-	ret = -EINVAL;
-	if (of->vm_ops->access)
-		ret = of->vm_ops->access(vma, addr, buf, len, write);
-
-	sysfs_put_active(of->sd);
-	return ret;
-}
-
-#ifdef CONFIG_NUMA
-static int kernfs_vma_set_policy(struct vm_area_struct *vma,
-				 struct mempolicy *new)
-{
-	struct file *file = vma->vm_file;
-	struct sysfs_open_file *of = sysfs_of(file);
-	int ret;
-
-	if (!of->vm_ops)
-		return 0;
-
-	if (!sysfs_get_active(of->sd))
-		return -EINVAL;
-
-	ret = 0;
-	if (of->vm_ops->set_policy)
-		ret = of->vm_ops->set_policy(vma, new);
-
-	sysfs_put_active(of->sd);
-	return ret;
-}
-
-static struct mempolicy *kernfs_vma_get_policy(struct vm_area_struct *vma,
-					       unsigned long addr)
-{
-	struct file *file = vma->vm_file;
-	struct sysfs_open_file *of = sysfs_of(file);
-	struct mempolicy *pol;
-
-	if (!of->vm_ops)
-		return vma->vm_policy;
-
-	if (!sysfs_get_active(of->sd))
-		return vma->vm_policy;
-
-	pol = vma->vm_policy;
-	if (of->vm_ops->get_policy)
-		pol = of->vm_ops->get_policy(vma, addr);
-
-	sysfs_put_active(of->sd);
-	return pol;
-}
-
-static int kernfs_vma_migrate(struct vm_area_struct *vma,
-			      const nodemask_t *from, const nodemask_t *to,
-			      unsigned long flags)
-{
-	struct file *file = vma->vm_file;
-	struct sysfs_open_file *of = sysfs_of(file);
-	int ret;
-
-	if (!of->vm_ops)
-		return 0;
-
-	if (!sysfs_get_active(of->sd))
-		return 0;
-
-	ret = 0;
-	if (of->vm_ops->migrate)
-		ret = of->vm_ops->migrate(vma, from, to, flags);
-
-	sysfs_put_active(of->sd);
-	return ret;
-}
-#endif
-
-static const struct vm_operations_struct kernfs_vm_ops = {
-	.open		= kernfs_vma_open,
-	.fault		= kernfs_vma_fault,
-	.page_mkwrite	= kernfs_vma_page_mkwrite,
-	.access		= kernfs_vma_access,
-#ifdef CONFIG_NUMA
-	.set_policy	= kernfs_vma_set_policy,
-	.get_policy	= kernfs_vma_get_policy,
-	.migrate	= kernfs_vma_migrate,
-#endif
-};
-
-static int kernfs_file_mmap(struct file *file, struct vm_area_struct *vma)
-{
-	struct sysfs_open_file *of = sysfs_of(file);
-	const struct kernfs_ops *ops;
-	int rc;
-
-	mutex_lock(&of->mutex);
-
-	rc = -ENODEV;
-	if (!sysfs_get_active(of->sd))
-		goto out_unlock;
-
-	ops = kernfs_ops(of->sd);
-	if (ops->mmap)
-		rc = ops->mmap(of, vma);
-	if (rc)
-		goto out_put;
-
-	/*
-	 * PowerPC's pci_mmap of legacy_mem uses shmem_zero_setup()
-	 * to satisfy versions of X which crash if the mmap fails: that
-	 * substitutes a new vm_file, and we don't then want bin_vm_ops.
-	 */
-	if (vma->vm_file != file)
-		goto out_put;
-
-	rc = -EINVAL;
-	if (of->mmapped && of->vm_ops != vma->vm_ops)
-		goto out_put;
-
-	/*
-	 * It is not possible to successfully wrap close.
-	 * So error if someone is trying to use close.
-	 */
-	rc = -EINVAL;
-	if (vma->vm_ops && vma->vm_ops->close)
-		goto out_put;
-
-	rc = 0;
-	of->mmapped = 1;
-	of->vm_ops = vma->vm_ops;
-	vma->vm_ops = &kernfs_vm_ops;
-out_put:
-	sysfs_put_active(of->sd);
-out_unlock:
-	mutex_unlock(&of->mutex);
-
-	return rc;
-}
-
-/**
- *	sysfs_get_open_dirent - get or create sysfs_open_dirent
- *	@sd: target sysfs_dirent
- *	@of: sysfs_open_file for this instance of open
- *
- *	If @sd->s_attr.open exists, increment its reference count;
- *	otherwise, create one.  @of is chained to the files list.
- *
- *	LOCKING:
- *	Kernel thread context (may sleep).
- *
- *	RETURNS:
- *	0 on success, -errno on failure.
- */
-static int sysfs_get_open_dirent(struct sysfs_dirent *sd,
-				 struct sysfs_open_file *of)
-{
-	struct sysfs_open_dirent *od, *new_od = NULL;
-
- retry:
-	mutex_lock(&sysfs_open_file_mutex);
-	spin_lock_irq(&sysfs_open_dirent_lock);
-
-	if (!sd->s_attr.open && new_od) {
-		sd->s_attr.open = new_od;
-		new_od = NULL;
-	}
-
-	od = sd->s_attr.open;
-	if (od) {
-		atomic_inc(&od->refcnt);
-		list_add_tail(&of->list, &od->files);
-	}
-
-	spin_unlock_irq(&sysfs_open_dirent_lock);
-	mutex_unlock(&sysfs_open_file_mutex);
-
-	if (od) {
-		kfree(new_od);
-		return 0;
-	}
-
-	/* not there, initialize a new one and retry */
-	new_od = kmalloc(sizeof(*new_od), GFP_KERNEL);
-	if (!new_od)
-		return -ENOMEM;
-
-	atomic_set(&new_od->refcnt, 0);
-	atomic_set(&new_od->event, 1);
-	init_waitqueue_head(&new_od->poll);
-	INIT_LIST_HEAD(&new_od->files);
-	goto retry;
-}
-
-/**
- *	sysfs_put_open_dirent - put sysfs_open_dirent
- *	@sd: target sysfs_dirent
- *	@of: associated sysfs_open_file
- *
- *	Put @sd->s_attr.open and unlink @of from the files list.  If
- *	reference count reaches zero, disassociate and free it.
- *
- *	LOCKING:
- *	None.
- */
-static void sysfs_put_open_dirent(struct sysfs_dirent *sd,
-				  struct sysfs_open_file *of)
-{
-	struct sysfs_open_dirent *od = sd->s_attr.open;
-	unsigned long flags;
-
-	mutex_lock(&sysfs_open_file_mutex);
-	spin_lock_irqsave(&sysfs_open_dirent_lock, flags);
-
-	if (of)
-		list_del(&of->list);
-
-	if (atomic_dec_and_test(&od->refcnt))
-		sd->s_attr.open = NULL;
-	else
-		od = NULL;
-
-	spin_unlock_irqrestore(&sysfs_open_dirent_lock, flags);
-	mutex_unlock(&sysfs_open_file_mutex);
-
-	kfree(od);
-}
-
-static int kernfs_file_open(struct inode *inode, struct file *file)
-{
-	struct sysfs_dirent *attr_sd = file->f_path.dentry->d_fsdata;
-	const struct kernfs_ops *ops;
-	struct sysfs_open_file *of;
-	bool has_read, has_write;
-	int error = -EACCES;
-
-	if (!sysfs_get_active(attr_sd))
-		return -ENODEV;
-
-	ops = kernfs_ops(attr_sd);
-
-	has_read = ops->seq_show || ops->read || ops->mmap;
-	has_write = ops->write || ops->mmap;
-
-	/* check perms and supported operations */
-	if ((file->f_mode & FMODE_WRITE) &&
-	    (!(inode->i_mode & S_IWUGO) || !has_write))
-		goto err_out;
-
-	if ((file->f_mode & FMODE_READ) &&
-	    (!(inode->i_mode & S_IRUGO) || !has_read))
-		goto err_out;
-
-	/* allocate a sysfs_open_file for the file */
-	error = -ENOMEM;
-	of = kzalloc(sizeof(struct sysfs_open_file), GFP_KERNEL);
-	if (!of)
-		goto err_out;
-
-	mutex_init(&of->mutex);
-	of->sd = attr_sd;
-	of->file = file;
-
-	/*
-	 * Always instantiate seq_file even if read access doesn't use
-	 * seq_file or is not requested.  This unifies private data access
-	 * and readable regular files are the vast majority anyway.
-	 */
-	if (ops->seq_show)
-		error = seq_open(file, &kernfs_seq_ops);
-	else
-		error = seq_open(file, NULL);
-	if (error)
-		goto err_free;
-
-	((struct seq_file *)file->private_data)->private = of;
-
-	/* seq_file clears PWRITE unconditionally, restore it if WRITE */
-	if (file->f_mode & FMODE_WRITE)
-		file->f_mode |= FMODE_PWRITE;
-
-	/* make sure we have open dirent struct */
-	error = sysfs_get_open_dirent(attr_sd, of);
-	if (error)
-		goto err_close;
-
-	/* open succeeded, put active references */
-	sysfs_put_active(attr_sd);
-	return 0;
-
-err_close:
-	seq_release(inode, file);
-err_free:
-	kfree(of);
-err_out:
-	sysfs_put_active(attr_sd);
-	return error;
-}
-
-static int kernfs_file_release(struct inode *inode, struct file *filp)
-{
-	struct sysfs_dirent *sd = filp->f_path.dentry->d_fsdata;
-	struct sysfs_open_file *of = sysfs_of(filp);
-
-	sysfs_put_open_dirent(sd, of);
-	seq_release(inode, filp);
-	kfree(of);
-
-	return 0;
-}
-
-void sysfs_unmap_bin_file(struct sysfs_dirent *sd)
-{
-	struct sysfs_open_dirent *od;
-	struct sysfs_open_file *of;
-
-	if (!(sd->s_flags & SYSFS_FLAG_HAS_MMAP))
-		return;
-
-	spin_lock_irq(&sysfs_open_dirent_lock);
-	od = sd->s_attr.open;
-	if (od)
-		atomic_inc(&od->refcnt);
-	spin_unlock_irq(&sysfs_open_dirent_lock);
-	if (!od)
-		return;
-
-	mutex_lock(&sysfs_open_file_mutex);
-	list_for_each_entry(of, &od->files, list) {
-		struct inode *inode = file_inode(of->file);
-		unmap_mapping_range(inode->i_mapping, 0, 0, 1);
-	}
-	mutex_unlock(&sysfs_open_file_mutex);
-
-	sysfs_put_open_dirent(sd, NULL);
-}
-
-/* Sysfs attribute files are pollable.  The idea is that you read
- * the content and then you use 'poll' or 'select' to wait for
- * the content to change.  When the content changes (assuming the
- * manager for the kobject supports notification), poll will
- * return POLLERR|POLLPRI, and select will return the fd whether
- * it is waiting for read, write, or exceptions.
- * Once poll/select indicates that the value has changed, you
- * need to close and re-open the file, or seek to 0 and read again.
- * Reminder: this only works for attributes which actively support
- * it, and it is not possible to test an attribute from userspace
- * to see if it supports poll (Neither 'poll' nor 'select' return
- * an appropriate error code).  When in doubt, set a suitable timeout value.
- */
-static unsigned int kernfs_file_poll(struct file *filp, poll_table *wait)
-{
-	struct sysfs_open_file *of = sysfs_of(filp);
-	struct sysfs_dirent *attr_sd = filp->f_path.dentry->d_fsdata;
-	struct sysfs_open_dirent *od = attr_sd->s_attr.open;
-
-	/* need parent for the kobj, grab both */
-	if (!sysfs_get_active(attr_sd))
-		goto trigger;
-
-	poll_wait(filp, &od->poll, wait);
-
-	sysfs_put_active(attr_sd);
-
-	if (of->event != atomic_read(&od->event))
-		goto trigger;
-
-	return DEFAULT_POLLMASK;
-
- trigger:
-	return DEFAULT_POLLMASK|POLLERR|POLLPRI;
-}
-
-/**
- * kernfs_notify - notify a kernfs file
- * @sd: file to notify
- *
- * Notify @sd such that poll(2) on @sd wakes up.
- */
-void kernfs_notify(struct sysfs_dirent *sd)
-{
-	struct sysfs_open_dirent *od;
-	unsigned long flags;
-
-	spin_lock_irqsave(&sysfs_open_dirent_lock, flags);
-
-	if (!WARN_ON(sysfs_type(sd) != SYSFS_KOBJ_ATTR)) {
-		od = sd->s_attr.open;
-		if (od) {
-			atomic_inc(&od->event);
-			wake_up_interruptible(&od->poll);
-		}
-	}
-
-	spin_unlock_irqrestore(&sysfs_open_dirent_lock, flags);
-}
-EXPORT_SYMBOL_GPL(kernfs_notify);
-
 void sysfs_notify(struct kobject *k, const char *dir, const char *attr)
 {
 	struct sysfs_dirent *sd = k->sd, *tmp;
@@ -893,16 +170,6 @@ void sysfs_notify(struct kobject *k, const char *dir, const char *attr)
 }
 EXPORT_SYMBOL_GPL(sysfs_notify);
 
-const struct file_operations kernfs_file_operations = {
-	.read		= kernfs_file_read,
-	.write		= kernfs_file_write,
-	.llseek		= kernfs_file_llseek,
-	.mmap		= kernfs_file_mmap,
-	.open		= kernfs_file_open,
-	.release	= kernfs_file_release,
-	.poll		= kernfs_file_poll,
-};
-
 static const struct kernfs_ops sysfs_file_kfops_empty = {
 };
 
@@ -991,68 +258,6 @@ int sysfs_add_file_mode_ns(struct sysfs_dirent *dir_sd,
 	return 0;
 }
 
-/**
- * kernfs_create_file_ns_key - create a file
- * @parent: directory to create the file in
- * @name: name of the file
- * @mode: mode of the file
- * @size: size of the file
- * @ops: kernfs operations for the file
- * @priv: private data for the file
- * @ns: optional namespace tag of the file
- * @key: lockdep key for the file's active_ref, %NULL to disable lockdep
- *
- * Returns the created node on success, ERR_PTR() value on error.
- */
-struct sysfs_dirent *kernfs_create_file_ns_key(struct sysfs_dirent *parent,
-					       const char *name,
-					       umode_t mode, loff_t size,
-					       const struct kernfs_ops *ops,
-					       void *priv, const void *ns,
-					       struct lock_class_key *key)
-{
-	struct sysfs_addrm_cxt acxt;
-	struct sysfs_dirent *sd;
-	int rc;
-
-	sd = sysfs_new_dirent(name, (mode & S_IALLUGO) | S_IFREG,
-			      SYSFS_KOBJ_ATTR);
-	if (!sd)
-		return ERR_PTR(-ENOMEM);
-
-	sd->s_attr.ops = ops;
-	sd->s_attr.size = size;
-	sd->s_ns = ns;
-	sd->priv = priv;
-
-#ifdef CONFIG_DEBUG_LOCK_ALLOC
-	if (key) {
-		lockdep_init_map(&sd->dep_map, "s_active", key, 0);
-		sd->s_flags |= SYSFS_FLAG_LOCKDEP;
-	}
-#endif
-
-	/*
-	 * sd->s_attr.ops is accesible only while holding active ref.  We
-	 * need to know whether some ops are implemented outside active
-	 * ref.  Cache their existence in flags.
-	 */
-	if (ops->seq_show)
-		sd->s_flags |= SYSFS_FLAG_HAS_SEQ_SHOW;
-	if (ops->mmap)
-		sd->s_flags |= SYSFS_FLAG_HAS_MMAP;
-
-	sysfs_addrm_start(&acxt);
-	rc = sysfs_add_one(&acxt, sd, parent);
-	sysfs_addrm_finish(&acxt);
-
-	if (rc) {
-		kernfs_put(sd);
-		return ERR_PTR(rc);
-	}
-	return sd;
-}
-
 int sysfs_add_file(struct sysfs_dirent *dir_sd, const struct attribute *attr,
 		   bool is_bin)
 {
diff --git a/fs/sysfs/sysfs.h b/fs/sysfs/sysfs.h
index aef8180..95cc989 100644
--- a/fs/sysfs/sysfs.h
+++ b/fs/sysfs/sysfs.h
@@ -41,15 +41,11 @@ void sysfs_warn_dup(struct sysfs_dirent *parent, const char *name);
 /*
  * file.c
  */
-extern const struct file_operations kernfs_file_operations;
-
 int sysfs_add_file(struct sysfs_dirent *dir_sd,
 		   const struct attribute *attr, bool is_bin);
-
 int sysfs_add_file_mode_ns(struct sysfs_dirent *dir_sd,
 			   const struct attribute *attr, bool is_bin,
 			   umode_t amode, const void *ns);
-void sysfs_unmap_bin_file(struct sysfs_dirent *sd);
 
 /*
  * symlink.c
-- 
1.8.3.1


^ permalink raw reply related	[flat|nested] 46+ messages in thread

* Re: Subject: [PATCHSET driver-core-next] sysfs: separate out kernfs, part #1
  2013-10-24 15:49 Subject: [PATCHSET driver-core-next] sysfs: separate out kernfs, part #1 Tejun Heo
                   ` (33 preceding siblings ...)
  2013-10-24 15:49 ` [PATCH 34/34] sysfs, kernfs: move symlink core code to fs/kernfs/symlink.c Tejun Heo
@ 2013-10-28 13:06 ` Tejun Heo
  2013-10-28 16:35   ` Tejun Heo
  2013-10-29 22:29 ` Greg KH
  35 siblings, 1 reply; 46+ messages in thread
From: Tejun Heo @ 2013-10-28 13:06 UTC (permalink / raw)
  To: gregkh; +Cc: kay, linux-kernel, ebiederm, bhelgaas

Hello, guys.

Just posted updated patches incorporating Pavel's suggestions.  Only
the patches which need to be updated are reposted.  The whole series
can be found in the following branch.  It's based on top of today's
driver-core-next 6fffcfa7c0fc ("devres: restore zeroing behavior of
devres_alloc()").

  git://git.kernel.org/pub/scm/linux/kernel/git/tj/misc.git review-separate-out-kernfs

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 46+ messages in thread

* [PATCH v3 14/34] sysfs, kernfs: prepare read path for kernfs
  2013-10-24 15:49 ` [PATCH 14/34] sysfs, kernfs: prepare read path for kernfs Tejun Heo
  2013-10-26 11:32   ` Pavel Machek
  2013-10-28 13:03   ` [PATCH v2 " Tejun Heo
@ 2013-10-28 16:33   ` Tejun Heo
  2 siblings, 0 replies; 46+ messages in thread
From: Tejun Heo @ 2013-10-28 16:33 UTC (permalink / raw)
  To: gregkh; +Cc: kay, linux-kernel, ebiederm, bhelgaas, Pavel Machek, Fengguang Wu

We're in the process of separating out core sysfs functionality into
kernfs which will deal with sysfs_dirents directly.  This patch
rearranges read path so that the kernfs and sysfs parts are separate.

* Regular file read path is refactored such that
  kernfs_seq_start/next/stop/show() handle all the boilerplate work
  including locking and updating event count for poll, while
  sysfs_kf_seq_show() deals with interaction with kobj show method.

* Bin file read path is refactored such that kernfs_file_direct_read()
  handles all the boilerplate work including buffer management and
  locking, while sysfs_kf_bin_read() deals with interaction with
  bin_attribute read method.

kernfs_file_read() is added.  It invokes either the seq_file or direct
read path depending on the file type.  This will eventually allow
using the same file_operations for both file types, which is necessary
to separate out kernfs.

While this patch changes the order of some operations, it shouldn't
change any visible behavior.

v2: Dropped unnecessary zeroing of @count from sysfs_kf_seq_show().
    Add comments explaining single_open() behavior.  Both suggested by
    Pavel.

v3: seq_stop() is called even after seq_start() failed.
    kernfs_seq_start() updated so that it doesn't unlock
    sysfs_open_file->mutex on failure so that kernfs_seq_stop()
    doesn't try to unlock an already unlocked mutex.  Reported by
    Fengguang.

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Pavel Machek <pavel@ucw.cz>
Cc: Fengguang Wu <fengguang.wu@intel.com>
---
 fs/sysfs/file.c | 191 +++++++++++++++++++++++++++++++++++++-------------------
 1 file changed, 126 insertions(+), 65 deletions(-)

diff --git a/fs/sysfs/file.c b/fs/sysfs/file.c
index 444615d..dcf9155 100644
--- a/fs/sysfs/file.c
+++ b/fs/sysfs/file.c
@@ -86,13 +86,13 @@ static const struct sysfs_ops *sysfs_file_ops(struct sysfs_dirent *sd)
  * details like buffering and seeking.  The following function pipes
  * sysfs_ops->show() result through seq_file.
  */
-static int sysfs_seq_show(struct seq_file *sf, void *v)
+static int sysfs_kf_seq_show(struct seq_file *sf, void *v)
 {
 	struct sysfs_open_file *of = sf->private;
 	struct kobject *kobj = of->sd->s_parent->priv;
-	const struct sysfs_ops *ops;
-	char *buf;
+	const struct sysfs_ops *ops = sysfs_file_ops(of->sd);
 	ssize_t count;
+	char *buf;
 
 	/* acquire buffer and ensure that it's >= PAGE_SIZE */
 	count = seq_get_buf(sf, &buf);
@@ -102,33 +102,14 @@ static int sysfs_seq_show(struct seq_file *sf, void *v)
 	}
 
 	/*
-	 * Need @of->sd for attr and ops, its parent for kobj.  @of->mutex
-	 * nests outside active ref and is just to ensure that the ops
-	 * aren't called concurrently for the same open file.
+	 * Invoke show().  Control may reach here via seq file lseek even
+	 * if @ops->show() isn't implemented.
 	 */
-	mutex_lock(&of->mutex);
-	if (!sysfs_get_active(of->sd)) {
-		mutex_unlock(&of->mutex);
-		return -ENODEV;
-	}
-
-	of->event = atomic_read(&of->sd->s_attr.open->event);
-
-	/*
-	 * Lookup @ops and invoke show().  Control may reach here via seq
-	 * file lseek even if @ops->show() isn't implemented.
-	 */
-	ops = sysfs_file_ops(of->sd);
-	if (ops->show)
+	if (ops->show) {
 		count = ops->show(kobj, of->sd->priv, buf);
-	else
-		count = 0;
-
-	sysfs_put_active(of->sd);
-	mutex_unlock(&of->mutex);
-
-	if (count < 0)
-		return count;
+		if (count < 0)
+			return count;
+	}
 
 	/*
 	 * The code works fine with PAGE_SIZE return but it's likely to
@@ -144,68 +125,146 @@ static int sysfs_seq_show(struct seq_file *sf, void *v)
 	return 0;
 }
 
-/*
- * Read method for bin files.  As reading a bin file can have side-effects,
- * the exact offset and bytes specified in read(2) call should be passed to
- * the read callback making it difficult to use seq_file.  Implement
- * simplistic custom buffering for bin files.
- */
-static ssize_t sysfs_bin_read(struct file *file, char __user *userbuf,
-			      size_t bytes, loff_t *off)
+static ssize_t sysfs_kf_bin_read(struct sysfs_open_file *of, char *buf,
+				 size_t count, loff_t pos)
 {
-	struct sysfs_open_file *of = sysfs_of(file);
 	struct bin_attribute *battr = of->sd->priv;
 	struct kobject *kobj = of->sd->s_parent->priv;
-	loff_t size = file_inode(file)->i_size;
-	int count = min_t(size_t, bytes, PAGE_SIZE);
-	loff_t offs = *off;
-	char *buf;
+	loff_t size = file_inode(of->file)->i_size;
 
-	if (!bytes)
+	if (!count)
 		return 0;
 
 	if (size) {
-		if (offs > size)
+		if (pos > size)
 			return 0;
-		if (offs + count > size)
-			count = size - offs;
+		if (pos + count > size)
+			count = size - pos;
 	}
 
-	buf = kmalloc(count, GFP_KERNEL);
+	if (!battr->read)
+		return -EIO;
+
+	return battr->read(of->file, kobj, battr, buf, pos, count);
+}
+
+static void *kernfs_seq_start(struct seq_file *sf, loff_t *ppos)
+{
+	struct sysfs_open_file *of = sf->private;
+
+	/*
+	 * @of->mutex nests outside active ref and is just to ensure that
+	 * the ops aren't called concurrently for the same open file.
+	 */
+	mutex_lock(&of->mutex);
+	if (!sysfs_get_active(of->sd))
+		return ERR_PTR(-ENODEV);
+
+	/*
+	 * The same behavior and code as single_open().  Returns !NULL if
+	 * pos is at the beginning; otherwise, NULL.
+	 */
+	return NULL + !*ppos;
+}
+
+static void *kernfs_seq_next(struct seq_file *sf, void *v, loff_t *ppos)
+{
+	/*
+	 * The same behavior and code as single_open(), always terminate
+	 * after the initial read.
+	 */
+	++*ppos;
+	return NULL;
+}
+
+static void kernfs_seq_stop(struct seq_file *sf, void *v)
+{
+	struct sysfs_open_file *of = sf->private;
+
+	sysfs_put_active(of->sd);
+	mutex_unlock(&of->mutex);
+}
+
+static int kernfs_seq_show(struct seq_file *sf, void *v)
+{
+	struct sysfs_open_file *of = sf->private;
+
+	of->event = atomic_read(&of->sd->s_attr.open->event);
+
+	return sysfs_kf_seq_show(sf, v);
+}
+
+static const struct seq_operations kernfs_seq_ops = {
+	.start = kernfs_seq_start,
+	.next = kernfs_seq_next,
+	.stop = kernfs_seq_stop,
+	.show = kernfs_seq_show,
+};
+
+/*
+ * As reading a bin file can have side-effects, the exact offset and bytes
+ * specified in read(2) call should be passed to the read callback making
+ * it difficult to use seq_file.  Implement simplistic custom buffering for
+ * bin files.
+ */
+static ssize_t kernfs_file_direct_read(struct sysfs_open_file *of,
+				       char __user *user_buf, size_t count,
+				       loff_t *ppos)
+{
+	ssize_t len = min_t(size_t, count, PAGE_SIZE);
+	char *buf;
+
+	buf = kmalloc(len, GFP_KERNEL);
 	if (!buf)
 		return -ENOMEM;
 
-	/* need of->sd for battr, its parent for kobj */
+	/*
+	 * @of->mutex nests outside active ref and is just to ensure that
+	 * the ops aren't called concurrently for the same open file.
+	 */
 	mutex_lock(&of->mutex);
 	if (!sysfs_get_active(of->sd)) {
-		count = -ENODEV;
+		len = -ENODEV;
 		mutex_unlock(&of->mutex);
 		goto out_free;
 	}
 
-	if (battr->read)
-		count = battr->read(file, kobj, battr, buf, offs, count);
-	else
-		count = -EIO;
+	len = sysfs_kf_bin_read(of, buf, len, *ppos);
 
 	sysfs_put_active(of->sd);
 	mutex_unlock(&of->mutex);
 
-	if (count < 0)
+	if (len < 0)
 		goto out_free;
 
-	if (copy_to_user(userbuf, buf, count)) {
-		count = -EFAULT;
+	if (copy_to_user(user_buf, buf, len)) {
+		len = -EFAULT;
 		goto out_free;
 	}
 
-	pr_debug("offs = %lld, *off = %lld, count = %d\n", offs, *off, count);
-
-	*off = offs + count;
+	*ppos += len;
 
  out_free:
 	kfree(buf);
-	return count;
+	return len;
+}
+
+/**
+ * kernfs_file_read - kernfs vfs read callback
+ * @file: file pointer
+ * @user_buf: data to write
+ * @count: number of bytes
+ * @ppos: starting offset
+ */
+static ssize_t kernfs_file_read(struct file *file, char __user *user_buf,
+				size_t count, loff_t *ppos)
+{
+	struct sysfs_open_file *of = sysfs_of(file);
+
+	if (sysfs_is_bin(of->sd))
+		return kernfs_file_direct_read(of, user_buf, count, ppos);
+	else
+		return seq_read(file, user_buf, count, ppos);
 }
 
 /**
@@ -660,12 +719,14 @@ static int sysfs_open_file(struct inode *inode, struct file *file)
 	 * and readable regular files are the vast majority anyway.
 	 */
 	if (sysfs_is_bin(attr_sd))
-		error = single_open(file, NULL, of);
+		error = seq_open(file, NULL);
 	else
-		error = single_open(file, sysfs_seq_show, of);
+		error = seq_open(file, &kernfs_seq_ops);
 	if (error)
 		goto err_free;
 
+	((struct seq_file *)file->private_data)->private = of;
+
 	/* seq_file clears PWRITE unconditionally, restore it if WRITE */
 	if (file->f_mode & FMODE_WRITE)
 		file->f_mode |= FMODE_PWRITE;
@@ -680,7 +741,7 @@ static int sysfs_open_file(struct inode *inode, struct file *file)
 	return 0;
 
 err_close:
-	single_release(inode, file);
+	seq_release(inode, file);
 err_free:
 	kfree(of);
 err_out:
@@ -694,7 +755,7 @@ static int sysfs_release(struct inode *inode, struct file *filp)
 	struct sysfs_open_file *of = sysfs_of(filp);
 
 	sysfs_put_open_dirent(sd, of);
-	single_release(inode, filp);
+	seq_release(inode, filp);
 	kfree(of);
 
 	return 0;
@@ -799,7 +860,7 @@ void sysfs_notify(struct kobject *k, const char *dir, const char *attr)
 EXPORT_SYMBOL_GPL(sysfs_notify);
 
 const struct file_operations sysfs_file_operations = {
-	.read		= seq_read,
+	.read		= kernfs_file_read,
 	.write		= sysfs_write_file,
 	.llseek		= seq_lseek,
 	.open		= sysfs_open_file,
@@ -808,7 +869,7 @@ const struct file_operations sysfs_file_operations = {
 };
 
 const struct file_operations sysfs_bin_operations = {
-	.read		= sysfs_bin_read,
+	.read		= kernfs_file_read,
 	.write		= sysfs_write_file,
 	.llseek		= generic_file_llseek,
 	.mmap		= sysfs_bin_mmap,
-- 
1.8.3.1


^ permalink raw reply related	[flat|nested] 46+ messages in thread

* [PATCH v3 33/34] sysfs, kernfs: move file core code to fs/kernfs/file.c
  2013-10-24 15:49 ` [PATCH 33/34] sysfs, kernfs: move file core code to fs/kernfs/file.c Tejun Heo
  2013-10-28 13:04   ` [PATCH v2 " Tejun Heo
@ 2013-10-28 16:34   ` Tejun Heo
  1 sibling, 0 replies; 46+ messages in thread
From: Tejun Heo @ 2013-10-28 16:34 UTC (permalink / raw)
  To: gregkh; +Cc: kay, linux-kernel, ebiederm, bhelgaas

Move core file code to fs/kernfs/file.c.  fs/sysfs/file.c now contains
sysfs kernfs_ops callbacks, sysfs wrappers around kernfs interfaces,
and sysfs_schedule_callback().  The respective declarations in
fs/sysfs/sysfs.h are moved to fs/kernfs/kernfs-internal.h.

This is pure relocation.

v2: Refreshed on top of the v2 of "sysfs, kernfs: prepare read path
    for kernfs".

v3: Refreshed on top of the v3 of "sysfs, kernfs: prepare read path
    for kernfs".

Signed-off-by: Tejun Heo <tj@kernel.org>
---
 fs/kernfs/file.c            | 798 ++++++++++++++++++++++++++++++++++++++++++++
 fs/kernfs/kernfs-internal.h |   7 +
 fs/sysfs/file.c             | 795 +------------------------------------------
 fs/sysfs/sysfs.h            |   4 -
 4 files changed, 806 insertions(+), 798 deletions(-)

diff --git a/fs/kernfs/file.c b/fs/kernfs/file.c
index 90b1e88..bfa1486b2 100644
--- a/fs/kernfs/file.c
+++ b/fs/kernfs/file.c
@@ -7,3 +7,801 @@
  *
  * This file is released under the GPLv2.
  */
+
+#include <linux/fs.h>
+#include <linux/seq_file.h>
+#include <linux/slab.h>
+#include <linux/poll.h>
+#include <linux/pagemap.h>
+#include <linux/poll.h>
+#include <linux/sched.h>
+
+#include "kernfs-internal.h"
+
+/*
+ * There's one sysfs_open_file for each open file and one sysfs_open_dirent
+ * for each sysfs_dirent with one or more open files.
+ *
+ * sysfs_dirent->s_attr.open points to sysfs_open_dirent.  s_attr.open is
+ * protected by sysfs_open_dirent_lock.
+ *
+ * filp->private_data points to seq_file whose ->private points to
+ * sysfs_open_file.  sysfs_open_files are chained at
+ * sysfs_open_dirent->files, which is protected by sysfs_open_file_mutex.
+ */
+static DEFINE_SPINLOCK(sysfs_open_dirent_lock);
+static DEFINE_MUTEX(sysfs_open_file_mutex);
+
+struct sysfs_open_dirent {
+	atomic_t		refcnt;
+	atomic_t		event;
+	wait_queue_head_t	poll;
+	struct list_head	files; /* goes through sysfs_open_file.list */
+};
+
+static struct sysfs_open_file *sysfs_of(struct file *file)
+{
+	return ((struct seq_file *)file->private_data)->private;
+}
+
+/*
+ * Determine the kernfs_ops for the given sysfs_dirent.  This function must
+ * be called while holding an active reference.
+ */
+static const struct kernfs_ops *kernfs_ops(struct sysfs_dirent *sd)
+{
+	if (sd->s_flags & SYSFS_FLAG_LOCKDEP)
+		lockdep_assert_held(sd);
+	return sd->s_attr.ops;
+}
+
+static void *kernfs_seq_start(struct seq_file *sf, loff_t *ppos)
+{
+	struct sysfs_open_file *of = sf->private;
+	const struct kernfs_ops *ops;
+
+	/*
+	 * @of->mutex nests outside active ref and is just to ensure that
+	 * the ops aren't called concurrently for the same open file.
+	 */
+	mutex_lock(&of->mutex);
+	if (!sysfs_get_active(of->sd))
+		return ERR_PTR(-ENODEV);
+
+	ops = kernfs_ops(of->sd);
+	if (ops->seq_start) {
+		return ops->seq_start(sf, ppos);
+	} else {
+		/*
+		 * The same behavior and code as single_open().  Returns
+		 * !NULL if pos is at the beginning; otherwise, NULL.
+		 */
+		return NULL + !*ppos;
+	}
+}
+
+static void *kernfs_seq_next(struct seq_file *sf, void *v, loff_t *ppos)
+{
+	struct sysfs_open_file *of = sf->private;
+	const struct kernfs_ops *ops = kernfs_ops(of->sd);
+
+	if (ops->seq_next) {
+		return ops->seq_next(sf, v, ppos);
+	} else {
+		/*
+		 * The same behavior and code as single_open(), always
+		 * terminate after the initial read.
+		 */
+		++*ppos;
+		return NULL;
+	}
+}
+
+static void kernfs_seq_stop(struct seq_file *sf, void *v)
+{
+	struct sysfs_open_file *of = sf->private;
+	const struct kernfs_ops *ops = kernfs_ops(of->sd);
+
+	if (ops->seq_stop)
+		ops->seq_stop(sf, v);
+
+	sysfs_put_active(of->sd);
+	mutex_unlock(&of->mutex);
+}
+
+static int kernfs_seq_show(struct seq_file *sf, void *v)
+{
+	struct sysfs_open_file *of = sf->private;
+
+	of->event = atomic_read(&of->sd->s_attr.open->event);
+
+	return of->sd->s_attr.ops->seq_show(sf, v);
+}
+
+static const struct seq_operations kernfs_seq_ops = {
+	.start = kernfs_seq_start,
+	.next = kernfs_seq_next,
+	.stop = kernfs_seq_stop,
+	.show = kernfs_seq_show,
+};
+
+/*
+ * As reading a bin file can have side-effects, the exact offset and bytes
+ * specified in read(2) call should be passed to the read callback making
+ * it difficult to use seq_file.  Implement simplistic custom buffering for
+ * bin files.
+ */
+static ssize_t kernfs_file_direct_read(struct sysfs_open_file *of,
+				       char __user *user_buf, size_t count,
+				       loff_t *ppos)
+{
+	ssize_t len = min_t(size_t, count, PAGE_SIZE);
+	const struct kernfs_ops *ops;
+	char *buf;
+
+	buf = kmalloc(len, GFP_KERNEL);
+	if (!buf)
+		return -ENOMEM;
+
+	/*
+	 * @of->mutex nests outside active ref and is just to ensure that
+	 * the ops aren't called concurrently for the same open file.
+	 */
+	mutex_lock(&of->mutex);
+	if (!sysfs_get_active(of->sd)) {
+		len = -ENODEV;
+		mutex_unlock(&of->mutex);
+		goto out_free;
+	}
+
+	ops = kernfs_ops(of->sd);
+	if (ops->read)
+		len = ops->read(of, buf, len, *ppos);
+	else
+		len = -EINVAL;
+
+	sysfs_put_active(of->sd);
+	mutex_unlock(&of->mutex);
+
+	if (len < 0)
+		goto out_free;
+
+	if (copy_to_user(user_buf, buf, len)) {
+		len = -EFAULT;
+		goto out_free;
+	}
+
+	*ppos += len;
+
+ out_free:
+	kfree(buf);
+	return len;
+}
+
+/**
+ * kernfs_file_read - kernfs vfs read callback
+ * @file: file pointer
+ * @user_buf: data to write
+ * @count: number of bytes
+ * @ppos: starting offset
+ */
+static ssize_t kernfs_file_read(struct file *file, char __user *user_buf,
+				size_t count, loff_t *ppos)
+{
+	struct sysfs_open_file *of = sysfs_of(file);
+
+	if (of->sd->s_flags & SYSFS_FLAG_HAS_SEQ_SHOW)
+		return seq_read(file, user_buf, count, ppos);
+	else
+		return kernfs_file_direct_read(of, user_buf, count, ppos);
+}
+
+/**
+ * kernfs_file_write - kernfs vfs write callback
+ * @file: file pointer
+ * @user_buf: data to write
+ * @count: number of bytes
+ * @ppos: starting offset
+ *
+ * Copy data in from userland and pass it to the matching kernfs write
+ * operation.
+ *
+ * There is no easy way for us to know if userspace is only doing a partial
+ * write, so we don't support them. We expect the entire buffer to come on
+ * the first write.  Hint: if you're writing a value, first read the file,
+ * modify only the the value you're changing, then write entire buffer
+ * back.
+ */
+static ssize_t kernfs_file_write(struct file *file, const char __user *user_buf,
+				 size_t count, loff_t *ppos)
+{
+	struct sysfs_open_file *of = sysfs_of(file);
+	ssize_t len = min_t(size_t, count, PAGE_SIZE);
+	const struct kernfs_ops *ops;
+	char *buf;
+
+	buf = kmalloc(len + 1, GFP_KERNEL);
+	if (!buf)
+		return -ENOMEM;
+
+	if (copy_from_user(buf, user_buf, len)) {
+		len = -EFAULT;
+		goto out_free;
+	}
+	buf[len] = '\0';	/* guarantee string termination */
+
+	/*
+	 * @of->mutex nests outside active ref and is just to ensure that
+	 * the ops aren't called concurrently for the same open file.
+	 */
+	mutex_lock(&of->mutex);
+	if (!sysfs_get_active(of->sd)) {
+		mutex_unlock(&of->mutex);
+		len = -ENODEV;
+		goto out_free;
+	}
+
+	ops = kernfs_ops(of->sd);
+	if (ops->write)
+		len = ops->write(of, buf, len, *ppos);
+	else
+		len = -EINVAL;
+
+	sysfs_put_active(of->sd);
+	mutex_unlock(&of->mutex);
+
+	if (len > 0)
+		*ppos += len;
+out_free:
+	kfree(buf);
+	return len;
+}
+
+static loff_t kernfs_file_llseek(struct file *file, loff_t off, int whence)
+{
+	struct sysfs_open_file *of = sysfs_of(file);
+
+	if (of->sd->s_flags & SYSFS_FLAG_HAS_SEQ_SHOW)
+		return seq_lseek(file, off, whence);
+	else
+		return generic_file_llseek(file, off, whence);
+}
+
+static void kernfs_vma_open(struct vm_area_struct *vma)
+{
+	struct file *file = vma->vm_file;
+	struct sysfs_open_file *of = sysfs_of(file);
+
+	if (!of->vm_ops)
+		return;
+
+	if (!sysfs_get_active(of->sd))
+		return;
+
+	if (of->vm_ops->open)
+		of->vm_ops->open(vma);
+
+	sysfs_put_active(of->sd);
+}
+
+static int kernfs_vma_fault(struct vm_area_struct *vma, struct vm_fault *vmf)
+{
+	struct file *file = vma->vm_file;
+	struct sysfs_open_file *of = sysfs_of(file);
+	int ret;
+
+	if (!of->vm_ops)
+		return VM_FAULT_SIGBUS;
+
+	if (!sysfs_get_active(of->sd))
+		return VM_FAULT_SIGBUS;
+
+	ret = VM_FAULT_SIGBUS;
+	if (of->vm_ops->fault)
+		ret = of->vm_ops->fault(vma, vmf);
+
+	sysfs_put_active(of->sd);
+	return ret;
+}
+
+static int kernfs_vma_page_mkwrite(struct vm_area_struct *vma,
+				   struct vm_fault *vmf)
+{
+	struct file *file = vma->vm_file;
+	struct sysfs_open_file *of = sysfs_of(file);
+	int ret;
+
+	if (!of->vm_ops)
+		return VM_FAULT_SIGBUS;
+
+	if (!sysfs_get_active(of->sd))
+		return VM_FAULT_SIGBUS;
+
+	ret = 0;
+	if (of->vm_ops->page_mkwrite)
+		ret = of->vm_ops->page_mkwrite(vma, vmf);
+	else
+		file_update_time(file);
+
+	sysfs_put_active(of->sd);
+	return ret;
+}
+
+static int kernfs_vma_access(struct vm_area_struct *vma, unsigned long addr,
+			     void *buf, int len, int write)
+{
+	struct file *file = vma->vm_file;
+	struct sysfs_open_file *of = sysfs_of(file);
+	int ret;
+
+	if (!of->vm_ops)
+		return -EINVAL;
+
+	if (!sysfs_get_active(of->sd))
+		return -EINVAL;
+
+	ret = -EINVAL;
+	if (of->vm_ops->access)
+		ret = of->vm_ops->access(vma, addr, buf, len, write);
+
+	sysfs_put_active(of->sd);
+	return ret;
+}
+
+#ifdef CONFIG_NUMA
+static int kernfs_vma_set_policy(struct vm_area_struct *vma,
+				 struct mempolicy *new)
+{
+	struct file *file = vma->vm_file;
+	struct sysfs_open_file *of = sysfs_of(file);
+	int ret;
+
+	if (!of->vm_ops)
+		return 0;
+
+	if (!sysfs_get_active(of->sd))
+		return -EINVAL;
+
+	ret = 0;
+	if (of->vm_ops->set_policy)
+		ret = of->vm_ops->set_policy(vma, new);
+
+	sysfs_put_active(of->sd);
+	return ret;
+}
+
+static struct mempolicy *kernfs_vma_get_policy(struct vm_area_struct *vma,
+					       unsigned long addr)
+{
+	struct file *file = vma->vm_file;
+	struct sysfs_open_file *of = sysfs_of(file);
+	struct mempolicy *pol;
+
+	if (!of->vm_ops)
+		return vma->vm_policy;
+
+	if (!sysfs_get_active(of->sd))
+		return vma->vm_policy;
+
+	pol = vma->vm_policy;
+	if (of->vm_ops->get_policy)
+		pol = of->vm_ops->get_policy(vma, addr);
+
+	sysfs_put_active(of->sd);
+	return pol;
+}
+
+static int kernfs_vma_migrate(struct vm_area_struct *vma,
+			      const nodemask_t *from, const nodemask_t *to,
+			      unsigned long flags)
+{
+	struct file *file = vma->vm_file;
+	struct sysfs_open_file *of = sysfs_of(file);
+	int ret;
+
+	if (!of->vm_ops)
+		return 0;
+
+	if (!sysfs_get_active(of->sd))
+		return 0;
+
+	ret = 0;
+	if (of->vm_ops->migrate)
+		ret = of->vm_ops->migrate(vma, from, to, flags);
+
+	sysfs_put_active(of->sd);
+	return ret;
+}
+#endif
+
+static const struct vm_operations_struct kernfs_vm_ops = {
+	.open		= kernfs_vma_open,
+	.fault		= kernfs_vma_fault,
+	.page_mkwrite	= kernfs_vma_page_mkwrite,
+	.access		= kernfs_vma_access,
+#ifdef CONFIG_NUMA
+	.set_policy	= kernfs_vma_set_policy,
+	.get_policy	= kernfs_vma_get_policy,
+	.migrate	= kernfs_vma_migrate,
+#endif
+};
+
+static int kernfs_file_mmap(struct file *file, struct vm_area_struct *vma)
+{
+	struct sysfs_open_file *of = sysfs_of(file);
+	const struct kernfs_ops *ops;
+	int rc;
+
+	mutex_lock(&of->mutex);
+
+	rc = -ENODEV;
+	if (!sysfs_get_active(of->sd))
+		goto out_unlock;
+
+	ops = kernfs_ops(of->sd);
+	if (ops->mmap)
+		rc = ops->mmap(of, vma);
+	if (rc)
+		goto out_put;
+
+	/*
+	 * PowerPC's pci_mmap of legacy_mem uses shmem_zero_setup()
+	 * to satisfy versions of X which crash if the mmap fails: that
+	 * substitutes a new vm_file, and we don't then want bin_vm_ops.
+	 */
+	if (vma->vm_file != file)
+		goto out_put;
+
+	rc = -EINVAL;
+	if (of->mmapped && of->vm_ops != vma->vm_ops)
+		goto out_put;
+
+	/*
+	 * It is not possible to successfully wrap close.
+	 * So error if someone is trying to use close.
+	 */
+	rc = -EINVAL;
+	if (vma->vm_ops && vma->vm_ops->close)
+		goto out_put;
+
+	rc = 0;
+	of->mmapped = 1;
+	of->vm_ops = vma->vm_ops;
+	vma->vm_ops = &kernfs_vm_ops;
+out_put:
+	sysfs_put_active(of->sd);
+out_unlock:
+	mutex_unlock(&of->mutex);
+
+	return rc;
+}
+
+/**
+ *	sysfs_get_open_dirent - get or create sysfs_open_dirent
+ *	@sd: target sysfs_dirent
+ *	@of: sysfs_open_file for this instance of open
+ *
+ *	If @sd->s_attr.open exists, increment its reference count;
+ *	otherwise, create one.  @of is chained to the files list.
+ *
+ *	LOCKING:
+ *	Kernel thread context (may sleep).
+ *
+ *	RETURNS:
+ *	0 on success, -errno on failure.
+ */
+static int sysfs_get_open_dirent(struct sysfs_dirent *sd,
+				 struct sysfs_open_file *of)
+{
+	struct sysfs_open_dirent *od, *new_od = NULL;
+
+ retry:
+	mutex_lock(&sysfs_open_file_mutex);
+	spin_lock_irq(&sysfs_open_dirent_lock);
+
+	if (!sd->s_attr.open && new_od) {
+		sd->s_attr.open = new_od;
+		new_od = NULL;
+	}
+
+	od = sd->s_attr.open;
+	if (od) {
+		atomic_inc(&od->refcnt);
+		list_add_tail(&of->list, &od->files);
+	}
+
+	spin_unlock_irq(&sysfs_open_dirent_lock);
+	mutex_unlock(&sysfs_open_file_mutex);
+
+	if (od) {
+		kfree(new_od);
+		return 0;
+	}
+
+	/* not there, initialize a new one and retry */
+	new_od = kmalloc(sizeof(*new_od), GFP_KERNEL);
+	if (!new_od)
+		return -ENOMEM;
+
+	atomic_set(&new_od->refcnt, 0);
+	atomic_set(&new_od->event, 1);
+	init_waitqueue_head(&new_od->poll);
+	INIT_LIST_HEAD(&new_od->files);
+	goto retry;
+}
+
+/**
+ *	sysfs_put_open_dirent - put sysfs_open_dirent
+ *	@sd: target sysfs_dirent
+ *	@of: associated sysfs_open_file
+ *
+ *	Put @sd->s_attr.open and unlink @of from the files list.  If
+ *	reference count reaches zero, disassociate and free it.
+ *
+ *	LOCKING:
+ *	None.
+ */
+static void sysfs_put_open_dirent(struct sysfs_dirent *sd,
+				  struct sysfs_open_file *of)
+{
+	struct sysfs_open_dirent *od = sd->s_attr.open;
+	unsigned long flags;
+
+	mutex_lock(&sysfs_open_file_mutex);
+	spin_lock_irqsave(&sysfs_open_dirent_lock, flags);
+
+	if (of)
+		list_del(&of->list);
+
+	if (atomic_dec_and_test(&od->refcnt))
+		sd->s_attr.open = NULL;
+	else
+		od = NULL;
+
+	spin_unlock_irqrestore(&sysfs_open_dirent_lock, flags);
+	mutex_unlock(&sysfs_open_file_mutex);
+
+	kfree(od);
+}
+
+static int kernfs_file_open(struct inode *inode, struct file *file)
+{
+	struct sysfs_dirent *attr_sd = file->f_path.dentry->d_fsdata;
+	const struct kernfs_ops *ops;
+	struct sysfs_open_file *of;
+	bool has_read, has_write;
+	int error = -EACCES;
+
+	if (!sysfs_get_active(attr_sd))
+		return -ENODEV;
+
+	ops = kernfs_ops(attr_sd);
+
+	has_read = ops->seq_show || ops->read || ops->mmap;
+	has_write = ops->write || ops->mmap;
+
+	/* check perms and supported operations */
+	if ((file->f_mode & FMODE_WRITE) &&
+	    (!(inode->i_mode & S_IWUGO) || !has_write))
+		goto err_out;
+
+	if ((file->f_mode & FMODE_READ) &&
+	    (!(inode->i_mode & S_IRUGO) || !has_read))
+		goto err_out;
+
+	/* allocate a sysfs_open_file for the file */
+	error = -ENOMEM;
+	of = kzalloc(sizeof(struct sysfs_open_file), GFP_KERNEL);
+	if (!of)
+		goto err_out;
+
+	mutex_init(&of->mutex);
+	of->sd = attr_sd;
+	of->file = file;
+
+	/*
+	 * Always instantiate seq_file even if read access doesn't use
+	 * seq_file or is not requested.  This unifies private data access
+	 * and readable regular files are the vast majority anyway.
+	 */
+	if (ops->seq_show)
+		error = seq_open(file, &kernfs_seq_ops);
+	else
+		error = seq_open(file, NULL);
+	if (error)
+		goto err_free;
+
+	((struct seq_file *)file->private_data)->private = of;
+
+	/* seq_file clears PWRITE unconditionally, restore it if WRITE */
+	if (file->f_mode & FMODE_WRITE)
+		file->f_mode |= FMODE_PWRITE;
+
+	/* make sure we have open dirent struct */
+	error = sysfs_get_open_dirent(attr_sd, of);
+	if (error)
+		goto err_close;
+
+	/* open succeeded, put active references */
+	sysfs_put_active(attr_sd);
+	return 0;
+
+err_close:
+	seq_release(inode, file);
+err_free:
+	kfree(of);
+err_out:
+	sysfs_put_active(attr_sd);
+	return error;
+}
+
+static int kernfs_file_release(struct inode *inode, struct file *filp)
+{
+	struct sysfs_dirent *sd = filp->f_path.dentry->d_fsdata;
+	struct sysfs_open_file *of = sysfs_of(filp);
+
+	sysfs_put_open_dirent(sd, of);
+	seq_release(inode, filp);
+	kfree(of);
+
+	return 0;
+}
+
+void sysfs_unmap_bin_file(struct sysfs_dirent *sd)
+{
+	struct sysfs_open_dirent *od;
+	struct sysfs_open_file *of;
+
+	if (!(sd->s_flags & SYSFS_FLAG_HAS_MMAP))
+		return;
+
+	spin_lock_irq(&sysfs_open_dirent_lock);
+	od = sd->s_attr.open;
+	if (od)
+		atomic_inc(&od->refcnt);
+	spin_unlock_irq(&sysfs_open_dirent_lock);
+	if (!od)
+		return;
+
+	mutex_lock(&sysfs_open_file_mutex);
+	list_for_each_entry(of, &od->files, list) {
+		struct inode *inode = file_inode(of->file);
+		unmap_mapping_range(inode->i_mapping, 0, 0, 1);
+	}
+	mutex_unlock(&sysfs_open_file_mutex);
+
+	sysfs_put_open_dirent(sd, NULL);
+}
+
+/* Sysfs attribute files are pollable.  The idea is that you read
+ * the content and then you use 'poll' or 'select' to wait for
+ * the content to change.  When the content changes (assuming the
+ * manager for the kobject supports notification), poll will
+ * return POLLERR|POLLPRI, and select will return the fd whether
+ * it is waiting for read, write, or exceptions.
+ * Once poll/select indicates that the value has changed, you
+ * need to close and re-open the file, or seek to 0 and read again.
+ * Reminder: this only works for attributes which actively support
+ * it, and it is not possible to test an attribute from userspace
+ * to see if it supports poll (Neither 'poll' nor 'select' return
+ * an appropriate error code).  When in doubt, set a suitable timeout value.
+ */
+static unsigned int kernfs_file_poll(struct file *filp, poll_table *wait)
+{
+	struct sysfs_open_file *of = sysfs_of(filp);
+	struct sysfs_dirent *attr_sd = filp->f_path.dentry->d_fsdata;
+	struct sysfs_open_dirent *od = attr_sd->s_attr.open;
+
+	/* need parent for the kobj, grab both */
+	if (!sysfs_get_active(attr_sd))
+		goto trigger;
+
+	poll_wait(filp, &od->poll, wait);
+
+	sysfs_put_active(attr_sd);
+
+	if (of->event != atomic_read(&od->event))
+		goto trigger;
+
+	return DEFAULT_POLLMASK;
+
+ trigger:
+	return DEFAULT_POLLMASK|POLLERR|POLLPRI;
+}
+
+/**
+ * kernfs_notify - notify a kernfs file
+ * @sd: file to notify
+ *
+ * Notify @sd such that poll(2) on @sd wakes up.
+ */
+void kernfs_notify(struct sysfs_dirent *sd)
+{
+	struct sysfs_open_dirent *od;
+	unsigned long flags;
+
+	spin_lock_irqsave(&sysfs_open_dirent_lock, flags);
+
+	if (!WARN_ON(sysfs_type(sd) != SYSFS_KOBJ_ATTR)) {
+		od = sd->s_attr.open;
+		if (od) {
+			atomic_inc(&od->event);
+			wake_up_interruptible(&od->poll);
+		}
+	}
+
+	spin_unlock_irqrestore(&sysfs_open_dirent_lock, flags);
+}
+EXPORT_SYMBOL_GPL(kernfs_notify);
+
+const struct file_operations kernfs_file_operations = {
+	.read		= kernfs_file_read,
+	.write		= kernfs_file_write,
+	.llseek		= kernfs_file_llseek,
+	.mmap		= kernfs_file_mmap,
+	.open		= kernfs_file_open,
+	.release	= kernfs_file_release,
+	.poll		= kernfs_file_poll,
+};
+
+/**
+ * kernfs_create_file_ns_key - create a file
+ * @parent: directory to create the file in
+ * @name: name of the file
+ * @mode: mode of the file
+ * @size: size of the file
+ * @ops: kernfs operations for the file
+ * @priv: private data for the file
+ * @ns: optional namespace tag of the file
+ * @key: lockdep key for the file's active_ref, %NULL to disable lockdep
+ *
+ * Returns the created node on success, ERR_PTR() value on error.
+ */
+struct sysfs_dirent *kernfs_create_file_ns_key(struct sysfs_dirent *parent,
+					       const char *name,
+					       umode_t mode, loff_t size,
+					       const struct kernfs_ops *ops,
+					       void *priv, const void *ns,
+					       struct lock_class_key *key)
+{
+	struct sysfs_addrm_cxt acxt;
+	struct sysfs_dirent *sd;
+	int rc;
+
+	sd = sysfs_new_dirent(name, (mode & S_IALLUGO) | S_IFREG,
+			      SYSFS_KOBJ_ATTR);
+	if (!sd)
+		return ERR_PTR(-ENOMEM);
+
+	sd->s_attr.ops = ops;
+	sd->s_attr.size = size;
+	sd->s_ns = ns;
+	sd->priv = priv;
+
+#ifdef CONFIG_DEBUG_LOCK_ALLOC
+	if (key) {
+		lockdep_init_map(&sd->dep_map, "s_active", key, 0);
+		sd->s_flags |= SYSFS_FLAG_LOCKDEP;
+	}
+#endif
+
+	/*
+	 * sd->s_attr.ops is accesible only while holding active ref.  We
+	 * need to know whether some ops are implemented outside active
+	 * ref.  Cache their existence in flags.
+	 */
+	if (ops->seq_show)
+		sd->s_flags |= SYSFS_FLAG_HAS_SEQ_SHOW;
+	if (ops->mmap)
+		sd->s_flags |= SYSFS_FLAG_HAS_MMAP;
+
+	sysfs_addrm_start(&acxt);
+	rc = sysfs_add_one(&acxt, sd, parent);
+	sysfs_addrm_finish(&acxt);
+
+	if (rc) {
+		kernfs_put(sd);
+		return ERR_PTR(rc);
+	}
+	return sd;
+}
diff --git a/fs/kernfs/kernfs-internal.h b/fs/kernfs/kernfs-internal.h
index 67b111b..1046624 100644
--- a/fs/kernfs/kernfs-internal.h
+++ b/fs/kernfs/kernfs-internal.h
@@ -142,4 +142,11 @@ int sysfs_add_one(struct sysfs_addrm_cxt *acxt, struct sysfs_dirent *sd,
 void sysfs_addrm_finish(struct sysfs_addrm_cxt *acxt);
 struct sysfs_dirent *sysfs_new_dirent(const char *name, umode_t mode, int type);
 
+/*
+ * file.c
+ */
+extern const struct file_operations kernfs_file_operations;
+
+void sysfs_unmap_bin_file(struct sysfs_dirent *sd);
+
 #endif	/* __KERNFS_INTERNAL_H */
diff --git a/fs/sysfs/file.c b/fs/sysfs/file.c
index 94452de..6634a9f 100644
--- a/fs/sysfs/file.c
+++ b/fs/sysfs/file.c
@@ -14,54 +14,12 @@
 #include <linux/kobject.h>
 #include <linux/kallsyms.h>
 #include <linux/slab.h>
-#include <linux/fsnotify.h>
-#include <linux/namei.h>
-#include <linux/poll.h>
 #include <linux/list.h>
 #include <linux/mutex.h>
-#include <linux/limits.h>
-#include <linux/uaccess.h>
 #include <linux/seq_file.h>
-#include <linux/mm.h>
 
 #include "sysfs.h"
-
-/*
- * There's one sysfs_open_file for each open file and one sysfs_open_dirent
- * for each sysfs_dirent with one or more open files.
- *
- * sysfs_dirent->s_attr.open points to sysfs_open_dirent.  s_attr.open is
- * protected by sysfs_open_dirent_lock.
- *
- * filp->private_data points to seq_file whose ->private points to
- * sysfs_open_file.  sysfs_open_files are chained at
- * sysfs_open_dirent->files, which is protected by sysfs_open_file_mutex.
- */
-static DEFINE_SPINLOCK(sysfs_open_dirent_lock);
-static DEFINE_MUTEX(sysfs_open_file_mutex);
-
-struct sysfs_open_dirent {
-	atomic_t		refcnt;
-	atomic_t		event;
-	wait_queue_head_t	poll;
-	struct list_head	files; /* goes through sysfs_open_file.list */
-};
-
-static struct sysfs_open_file *sysfs_of(struct file *file)
-{
-	return ((struct seq_file *)file->private_data)->private;
-}
-
-/*
- * Determine the kernfs_ops for the given sysfs_dirent.  This function must
- * be called while holding an active reference.
- */
-static const struct kernfs_ops *kernfs_ops(struct sysfs_dirent *sd)
-{
-	if (sd->s_flags & SYSFS_FLAG_LOCKDEP)
-		lockdep_assert_held(sd);
-	return sd->s_attr.ops;
-}
+#include "../kernfs/kernfs-internal.h"
 
 /*
  * Determine ktype->sysfs_ops for the given sysfs_dirent.  This function
@@ -143,147 +101,6 @@ static ssize_t sysfs_kf_bin_read(struct sysfs_open_file *of, char *buf,
 	return battr->read(of->file, kobj, battr, buf, pos, count);
 }
 
-static void *kernfs_seq_start(struct seq_file *sf, loff_t *ppos)
-{
-	struct sysfs_open_file *of = sf->private;
-	const struct kernfs_ops *ops;
-
-	/*
-	 * @of->mutex nests outside active ref and is just to ensure that
-	 * the ops aren't called concurrently for the same open file.
-	 */
-	mutex_lock(&of->mutex);
-	if (!sysfs_get_active(of->sd))
-		return ERR_PTR(-ENODEV);
-
-	ops = kernfs_ops(of->sd);
-	if (ops->seq_start) {
-		return ops->seq_start(sf, ppos);
-	} else {
-		/*
-		 * The same behavior and code as single_open().  Returns
-		 * !NULL if pos is at the beginning; otherwise, NULL.
-		 */
-		return NULL + !*ppos;
-	}
-}
-
-static void *kernfs_seq_next(struct seq_file *sf, void *v, loff_t *ppos)
-{
-	struct sysfs_open_file *of = sf->private;
-	const struct kernfs_ops *ops = kernfs_ops(of->sd);
-
-	if (ops->seq_next) {
-		return ops->seq_next(sf, v, ppos);
-	} else {
-		/*
-		 * The same behavior and code as single_open(), always
-		 * terminate after the initial read.
-		 */
-		++*ppos;
-		return NULL;
-	}
-}
-
-static void kernfs_seq_stop(struct seq_file *sf, void *v)
-{
-	struct sysfs_open_file *of = sf->private;
-	const struct kernfs_ops *ops = kernfs_ops(of->sd);
-
-	if (ops->seq_stop)
-		ops->seq_stop(sf, v);
-
-	sysfs_put_active(of->sd);
-	mutex_unlock(&of->mutex);
-}
-
-static int kernfs_seq_show(struct seq_file *sf, void *v)
-{
-	struct sysfs_open_file *of = sf->private;
-
-	of->event = atomic_read(&of->sd->s_attr.open->event);
-
-	return of->sd->s_attr.ops->seq_show(sf, v);
-}
-
-static const struct seq_operations kernfs_seq_ops = {
-	.start = kernfs_seq_start,
-	.next = kernfs_seq_next,
-	.stop = kernfs_seq_stop,
-	.show = kernfs_seq_show,
-};
-
-/*
- * As reading a bin file can have side-effects, the exact offset and bytes
- * specified in read(2) call should be passed to the read callback making
- * it difficult to use seq_file.  Implement simplistic custom buffering for
- * bin files.
- */
-static ssize_t kernfs_file_direct_read(struct sysfs_open_file *of,
-				       char __user *user_buf, size_t count,
-				       loff_t *ppos)
-{
-	ssize_t len = min_t(size_t, count, PAGE_SIZE);
-	const struct kernfs_ops *ops;
-	char *buf;
-
-	buf = kmalloc(len, GFP_KERNEL);
-	if (!buf)
-		return -ENOMEM;
-
-	/*
-	 * @of->mutex nests outside active ref and is just to ensure that
-	 * the ops aren't called concurrently for the same open file.
-	 */
-	mutex_lock(&of->mutex);
-	if (!sysfs_get_active(of->sd)) {
-		len = -ENODEV;
-		mutex_unlock(&of->mutex);
-		goto out_free;
-	}
-
-	ops = kernfs_ops(of->sd);
-	if (ops->read)
-		len = ops->read(of, buf, len, *ppos);
-	else
-		len = -EINVAL;
-
-	sysfs_put_active(of->sd);
-	mutex_unlock(&of->mutex);
-
-	if (len < 0)
-		goto out_free;
-
-	if (copy_to_user(user_buf, buf, len)) {
-		len = -EFAULT;
-		goto out_free;
-	}
-
-	*ppos += len;
-
- out_free:
-	kfree(buf);
-	return len;
-}
-
-/**
- * kernfs_file_read - kernfs vfs read callback
- * @file: file pointer
- * @user_buf: data to write
- * @count: number of bytes
- * @ppos: starting offset
- */
-static ssize_t kernfs_file_read(struct file *file, char __user *user_buf,
-				size_t count, loff_t *ppos)
-{
-	struct sysfs_open_file *of = sysfs_of(file);
-
-	if (of->sd->s_flags & SYSFS_FLAG_HAS_SEQ_SHOW)
-		return seq_read(file, user_buf, count, ppos);
-	else
-		return kernfs_file_direct_read(of, user_buf, count, ppos);
-}
-
 /* kernfs write callback for regular sysfs files */
 static ssize_t sysfs_kf_write(struct sysfs_open_file *of, char *buf,
 			      size_t count, loff_t pos)
@@ -319,77 +136,6 @@ static ssize_t sysfs_kf_bin_write(struct sysfs_open_file *of, char *buf,
 	return battr->write(of->file, kobj, battr, buf, pos, count);
 }
 
-/**
- * kernfs_file_write - kernfs vfs write callback
- * @file: file pointer
- * @user_buf: data to write
- * @count: number of bytes
- * @ppos: starting offset
- *
- * Copy data in from userland and pass it to the matching kernfs write
- * operation.
- *
- * There is no easy way for us to know if userspace is only doing a partial
- * write, so we don't support them. We expect the entire buffer to come on
- * the first write.  Hint: if you're writing a value, first read the file,
- * modify only the the value you're changing, then write entire buffer
- * back.
- */
-static ssize_t kernfs_file_write(struct file *file, const char __user *user_buf,
-				 size_t count, loff_t *ppos)
-{
-	struct sysfs_open_file *of = sysfs_of(file);
-	ssize_t len = min_t(size_t, count, PAGE_SIZE);
-	const struct kernfs_ops *ops;
-	char *buf;
-
-	buf = kmalloc(len + 1, GFP_KERNEL);
-	if (!buf)
-		return -ENOMEM;
-
-	if (copy_from_user(buf, user_buf, len)) {
-		len = -EFAULT;
-		goto out_free;
-	}
-	buf[len] = '\0';	/* guarantee string termination */
-
-	/*
-	 * @of->mutex nests outside active ref and is just to ensure that
-	 * the ops aren't called concurrently for the same open file.
-	 */
-	mutex_lock(&of->mutex);
-	if (!sysfs_get_active(of->sd)) {
-		mutex_unlock(&of->mutex);
-		len = -ENODEV;
-		goto out_free;
-	}
-
-	ops = kernfs_ops(of->sd);
-	if (ops->write)
-		len = ops->write(of, buf, len, *ppos);
-	else
-		len = -EINVAL;
-
-	sysfs_put_active(of->sd);
-	mutex_unlock(&of->mutex);
-
-	if (len > 0)
-		*ppos += len;
-out_free:
-	kfree(buf);
-	return len;
-}
-
-static loff_t kernfs_file_llseek(struct file *file, loff_t off, int whence)
-{
-	struct sysfs_open_file *of = sysfs_of(file);
-
-	if (of->sd->s_flags & SYSFS_FLAG_HAS_SEQ_SHOW)
-		return seq_lseek(file, off, whence);
-	else
-		return generic_file_llseek(file, off, whence);
-}
-
 static int sysfs_kf_bin_mmap(struct sysfs_open_file *of,
 			     struct vm_area_struct *vma)
 {
@@ -402,473 +148,6 @@ static int sysfs_kf_bin_mmap(struct sysfs_open_file *of,
 	return battr->mmap(of->file, kobj, battr, vma);
 }
 
-static void kernfs_vma_open(struct vm_area_struct *vma)
-{
-	struct file *file = vma->vm_file;
-	struct sysfs_open_file *of = sysfs_of(file);
-
-	if (!of->vm_ops)
-		return;
-
-	if (!sysfs_get_active(of->sd))
-		return;
-
-	if (of->vm_ops->open)
-		of->vm_ops->open(vma);
-
-	sysfs_put_active(of->sd);
-}
-
-static int kernfs_vma_fault(struct vm_area_struct *vma, struct vm_fault *vmf)
-{
-	struct file *file = vma->vm_file;
-	struct sysfs_open_file *of = sysfs_of(file);
-	int ret;
-
-	if (!of->vm_ops)
-		return VM_FAULT_SIGBUS;
-
-	if (!sysfs_get_active(of->sd))
-		return VM_FAULT_SIGBUS;
-
-	ret = VM_FAULT_SIGBUS;
-	if (of->vm_ops->fault)
-		ret = of->vm_ops->fault(vma, vmf);
-
-	sysfs_put_active(of->sd);
-	return ret;
-}
-
-static int kernfs_vma_page_mkwrite(struct vm_area_struct *vma,
-				   struct vm_fault *vmf)
-{
-	struct file *file = vma->vm_file;
-	struct sysfs_open_file *of = sysfs_of(file);
-	int ret;
-
-	if (!of->vm_ops)
-		return VM_FAULT_SIGBUS;
-
-	if (!sysfs_get_active(of->sd))
-		return VM_FAULT_SIGBUS;
-
-	ret = 0;
-	if (of->vm_ops->page_mkwrite)
-		ret = of->vm_ops->page_mkwrite(vma, vmf);
-	else
-		file_update_time(file);
-
-	sysfs_put_active(of->sd);
-	return ret;
-}
-
-static int kernfs_vma_access(struct vm_area_struct *vma, unsigned long addr,
-			     void *buf, int len, int write)
-{
-	struct file *file = vma->vm_file;
-	struct sysfs_open_file *of = sysfs_of(file);
-	int ret;
-
-	if (!of->vm_ops)
-		return -EINVAL;
-
-	if (!sysfs_get_active(of->sd))
-		return -EINVAL;
-
-	ret = -EINVAL;
-	if (of->vm_ops->access)
-		ret = of->vm_ops->access(vma, addr, buf, len, write);
-
-	sysfs_put_active(of->sd);
-	return ret;
-}
-
-#ifdef CONFIG_NUMA
-static int kernfs_vma_set_policy(struct vm_area_struct *vma,
-				 struct mempolicy *new)
-{
-	struct file *file = vma->vm_file;
-	struct sysfs_open_file *of = sysfs_of(file);
-	int ret;
-
-	if (!of->vm_ops)
-		return 0;
-
-	if (!sysfs_get_active(of->sd))
-		return -EINVAL;
-
-	ret = 0;
-	if (of->vm_ops->set_policy)
-		ret = of->vm_ops->set_policy(vma, new);
-
-	sysfs_put_active(of->sd);
-	return ret;
-}
-
-static struct mempolicy *kernfs_vma_get_policy(struct vm_area_struct *vma,
-					       unsigned long addr)
-{
-	struct file *file = vma->vm_file;
-	struct sysfs_open_file *of = sysfs_of(file);
-	struct mempolicy *pol;
-
-	if (!of->vm_ops)
-		return vma->vm_policy;
-
-	if (!sysfs_get_active(of->sd))
-		return vma->vm_policy;
-
-	pol = vma->vm_policy;
-	if (of->vm_ops->get_policy)
-		pol = of->vm_ops->get_policy(vma, addr);
-
-	sysfs_put_active(of->sd);
-	return pol;
-}
-
-static int kernfs_vma_migrate(struct vm_area_struct *vma,
-			      const nodemask_t *from, const nodemask_t *to,
-			      unsigned long flags)
-{
-	struct file *file = vma->vm_file;
-	struct sysfs_open_file *of = sysfs_of(file);
-	int ret;
-
-	if (!of->vm_ops)
-		return 0;
-
-	if (!sysfs_get_active(of->sd))
-		return 0;
-
-	ret = 0;
-	if (of->vm_ops->migrate)
-		ret = of->vm_ops->migrate(vma, from, to, flags);
-
-	sysfs_put_active(of->sd);
-	return ret;
-}
-#endif
-
-static const struct vm_operations_struct kernfs_vm_ops = {
-	.open		= kernfs_vma_open,
-	.fault		= kernfs_vma_fault,
-	.page_mkwrite	= kernfs_vma_page_mkwrite,
-	.access		= kernfs_vma_access,
-#ifdef CONFIG_NUMA
-	.set_policy	= kernfs_vma_set_policy,
-	.get_policy	= kernfs_vma_get_policy,
-	.migrate	= kernfs_vma_migrate,
-#endif
-};
-
-static int kernfs_file_mmap(struct file *file, struct vm_area_struct *vma)
-{
-	struct sysfs_open_file *of = sysfs_of(file);
-	const struct kernfs_ops *ops;
-	int rc;
-
-	mutex_lock(&of->mutex);
-
-	rc = -ENODEV;
-	if (!sysfs_get_active(of->sd))
-		goto out_unlock;
-
-	ops = kernfs_ops(of->sd);
-	if (ops->mmap)
-		rc = ops->mmap(of, vma);
-	if (rc)
-		goto out_put;
-
-	/*
-	 * PowerPC's pci_mmap of legacy_mem uses shmem_zero_setup()
-	 * to satisfy versions of X which crash if the mmap fails: that
-	 * substitutes a new vm_file, and we don't then want bin_vm_ops.
-	 */
-	if (vma->vm_file != file)
-		goto out_put;
-
-	rc = -EINVAL;
-	if (of->mmapped && of->vm_ops != vma->vm_ops)
-		goto out_put;
-
-	/*
-	 * It is not possible to successfully wrap close.
-	 * So error if someone is trying to use close.
-	 */
-	rc = -EINVAL;
-	if (vma->vm_ops && vma->vm_ops->close)
-		goto out_put;
-
-	rc = 0;
-	of->mmapped = 1;
-	of->vm_ops = vma->vm_ops;
-	vma->vm_ops = &kernfs_vm_ops;
-out_put:
-	sysfs_put_active(of->sd);
-out_unlock:
-	mutex_unlock(&of->mutex);
-
-	return rc;
-}
-
-/**
- *	sysfs_get_open_dirent - get or create sysfs_open_dirent
- *	@sd: target sysfs_dirent
- *	@of: sysfs_open_file for this instance of open
- *
- *	If @sd->s_attr.open exists, increment its reference count;
- *	otherwise, create one.  @of is chained to the files list.
- *
- *	LOCKING:
- *	Kernel thread context (may sleep).
- *
- *	RETURNS:
- *	0 on success, -errno on failure.
- */
-static int sysfs_get_open_dirent(struct sysfs_dirent *sd,
-				 struct sysfs_open_file *of)
-{
-	struct sysfs_open_dirent *od, *new_od = NULL;
-
- retry:
-	mutex_lock(&sysfs_open_file_mutex);
-	spin_lock_irq(&sysfs_open_dirent_lock);
-
-	if (!sd->s_attr.open && new_od) {
-		sd->s_attr.open = new_od;
-		new_od = NULL;
-	}
-
-	od = sd->s_attr.open;
-	if (od) {
-		atomic_inc(&od->refcnt);
-		list_add_tail(&of->list, &od->files);
-	}
-
-	spin_unlock_irq(&sysfs_open_dirent_lock);
-	mutex_unlock(&sysfs_open_file_mutex);
-
-	if (od) {
-		kfree(new_od);
-		return 0;
-	}
-
-	/* not there, initialize a new one and retry */
-	new_od = kmalloc(sizeof(*new_od), GFP_KERNEL);
-	if (!new_od)
-		return -ENOMEM;
-
-	atomic_set(&new_od->refcnt, 0);
-	atomic_set(&new_od->event, 1);
-	init_waitqueue_head(&new_od->poll);
-	INIT_LIST_HEAD(&new_od->files);
-	goto retry;
-}
-
-/**
- *	sysfs_put_open_dirent - put sysfs_open_dirent
- *	@sd: target sysfs_dirent
- *	@of: associated sysfs_open_file
- *
- *	Put @sd->s_attr.open and unlink @of from the files list.  If
- *	reference count reaches zero, disassociate and free it.
- *
- *	LOCKING:
- *	None.
- */
-static void sysfs_put_open_dirent(struct sysfs_dirent *sd,
-				  struct sysfs_open_file *of)
-{
-	struct sysfs_open_dirent *od = sd->s_attr.open;
-	unsigned long flags;
-
-	mutex_lock(&sysfs_open_file_mutex);
-	spin_lock_irqsave(&sysfs_open_dirent_lock, flags);
-
-	if (of)
-		list_del(&of->list);
-
-	if (atomic_dec_and_test(&od->refcnt))
-		sd->s_attr.open = NULL;
-	else
-		od = NULL;
-
-	spin_unlock_irqrestore(&sysfs_open_dirent_lock, flags);
-	mutex_unlock(&sysfs_open_file_mutex);
-
-	kfree(od);
-}
-
-static int kernfs_file_open(struct inode *inode, struct file *file)
-{
-	struct sysfs_dirent *attr_sd = file->f_path.dentry->d_fsdata;
-	const struct kernfs_ops *ops;
-	struct sysfs_open_file *of;
-	bool has_read, has_write;
-	int error = -EACCES;
-
-	if (!sysfs_get_active(attr_sd))
-		return -ENODEV;
-
-	ops = kernfs_ops(attr_sd);
-
-	has_read = ops->seq_show || ops->read || ops->mmap;
-	has_write = ops->write || ops->mmap;
-
-	/* check perms and supported operations */
-	if ((file->f_mode & FMODE_WRITE) &&
-	    (!(inode->i_mode & S_IWUGO) || !has_write))
-		goto err_out;
-
-	if ((file->f_mode & FMODE_READ) &&
-	    (!(inode->i_mode & S_IRUGO) || !has_read))
-		goto err_out;
-
-	/* allocate a sysfs_open_file for the file */
-	error = -ENOMEM;
-	of = kzalloc(sizeof(struct sysfs_open_file), GFP_KERNEL);
-	if (!of)
-		goto err_out;
-
-	mutex_init(&of->mutex);
-	of->sd = attr_sd;
-	of->file = file;
-
-	/*
-	 * Always instantiate seq_file even if read access doesn't use
-	 * seq_file or is not requested.  This unifies private data access
-	 * and readable regular files are the vast majority anyway.
-	 */
-	if (ops->seq_show)
-		error = seq_open(file, &kernfs_seq_ops);
-	else
-		error = seq_open(file, NULL);
-	if (error)
-		goto err_free;
-
-	((struct seq_file *)file->private_data)->private = of;
-
-	/* seq_file clears PWRITE unconditionally, restore it if WRITE */
-	if (file->f_mode & FMODE_WRITE)
-		file->f_mode |= FMODE_PWRITE;
-
-	/* make sure we have open dirent struct */
-	error = sysfs_get_open_dirent(attr_sd, of);
-	if (error)
-		goto err_close;
-
-	/* open succeeded, put active references */
-	sysfs_put_active(attr_sd);
-	return 0;
-
-err_close:
-	seq_release(inode, file);
-err_free:
-	kfree(of);
-err_out:
-	sysfs_put_active(attr_sd);
-	return error;
-}
-
-static int kernfs_file_release(struct inode *inode, struct file *filp)
-{
-	struct sysfs_dirent *sd = filp->f_path.dentry->d_fsdata;
-	struct sysfs_open_file *of = sysfs_of(filp);
-
-	sysfs_put_open_dirent(sd, of);
-	seq_release(inode, filp);
-	kfree(of);
-
-	return 0;
-}
-
-void sysfs_unmap_bin_file(struct sysfs_dirent *sd)
-{
-	struct sysfs_open_dirent *od;
-	struct sysfs_open_file *of;
-
-	if (!(sd->s_flags & SYSFS_FLAG_HAS_MMAP))
-		return;
-
-	spin_lock_irq(&sysfs_open_dirent_lock);
-	od = sd->s_attr.open;
-	if (od)
-		atomic_inc(&od->refcnt);
-	spin_unlock_irq(&sysfs_open_dirent_lock);
-	if (!od)
-		return;
-
-	mutex_lock(&sysfs_open_file_mutex);
-	list_for_each_entry(of, &od->files, list) {
-		struct inode *inode = file_inode(of->file);
-		unmap_mapping_range(inode->i_mapping, 0, 0, 1);
-	}
-	mutex_unlock(&sysfs_open_file_mutex);
-
-	sysfs_put_open_dirent(sd, NULL);
-}
-
-/* Sysfs attribute files are pollable.  The idea is that you read
- * the content and then you use 'poll' or 'select' to wait for
- * the content to change.  When the content changes (assuming the
- * manager for the kobject supports notification), poll will
- * return POLLERR|POLLPRI, and select will return the fd whether
- * it is waiting for read, write, or exceptions.
- * Once poll/select indicates that the value has changed, you
- * need to close and re-open the file, or seek to 0 and read again.
- * Reminder: this only works for attributes which actively support
- * it, and it is not possible to test an attribute from userspace
- * to see if it supports poll (Neither 'poll' nor 'select' return
- * an appropriate error code).  When in doubt, set a suitable timeout value.
- */
-static unsigned int kernfs_file_poll(struct file *filp, poll_table *wait)
-{
-	struct sysfs_open_file *of = sysfs_of(filp);
-	struct sysfs_dirent *attr_sd = filp->f_path.dentry->d_fsdata;
-	struct sysfs_open_dirent *od = attr_sd->s_attr.open;
-
-	/* need parent for the kobj, grab both */
-	if (!sysfs_get_active(attr_sd))
-		goto trigger;
-
-	poll_wait(filp, &od->poll, wait);
-
-	sysfs_put_active(attr_sd);
-
-	if (of->event != atomic_read(&od->event))
-		goto trigger;
-
-	return DEFAULT_POLLMASK;
-
- trigger:
-	return DEFAULT_POLLMASK|POLLERR|POLLPRI;
-}
-
-/**
- * kernfs_notify - notify a kernfs file
- * @sd: file to notify
- *
- * Notify @sd such that poll(2) on @sd wakes up.
- */
-void kernfs_notify(struct sysfs_dirent *sd)
-{
-	struct sysfs_open_dirent *od;
-	unsigned long flags;
-
-	spin_lock_irqsave(&sysfs_open_dirent_lock, flags);
-
-	if (!WARN_ON(sysfs_type(sd) != SYSFS_KOBJ_ATTR)) {
-		od = sd->s_attr.open;
-		if (od) {
-			atomic_inc(&od->event);
-			wake_up_interruptible(&od->poll);
-		}
-	}
-
-	spin_unlock_irqrestore(&sysfs_open_dirent_lock, flags);
-}
-EXPORT_SYMBOL_GPL(kernfs_notify);
-
 void sysfs_notify(struct kobject *k, const char *dir, const char *attr)
 {
 	struct sysfs_dirent *sd = k->sd, *tmp;
@@ -891,16 +170,6 @@ void sysfs_notify(struct kobject *k, const char *dir, const char *attr)
 }
 EXPORT_SYMBOL_GPL(sysfs_notify);
 
-const struct file_operations kernfs_file_operations = {
-	.read		= kernfs_file_read,
-	.write		= kernfs_file_write,
-	.llseek		= kernfs_file_llseek,
-	.mmap		= kernfs_file_mmap,
-	.open		= kernfs_file_open,
-	.release	= kernfs_file_release,
-	.poll		= kernfs_file_poll,
-};
-
 static const struct kernfs_ops sysfs_file_kfops_empty = {
 };
 
@@ -989,68 +258,6 @@ int sysfs_add_file_mode_ns(struct sysfs_dirent *dir_sd,
 	return 0;
 }
 
-/**
- * kernfs_create_file_ns_key - create a file
- * @parent: directory to create the file in
- * @name: name of the file
- * @mode: mode of the file
- * @size: size of the file
- * @ops: kernfs operations for the file
- * @priv: private data for the file
- * @ns: optional namespace tag of the file
- * @key: lockdep key for the file's active_ref, %NULL to disable lockdep
- *
- * Returns the created node on success, ERR_PTR() value on error.
- */
-struct sysfs_dirent *kernfs_create_file_ns_key(struct sysfs_dirent *parent,
-					       const char *name,
-					       umode_t mode, loff_t size,
-					       const struct kernfs_ops *ops,
-					       void *priv, const void *ns,
-					       struct lock_class_key *key)
-{
-	struct sysfs_addrm_cxt acxt;
-	struct sysfs_dirent *sd;
-	int rc;
-
-	sd = sysfs_new_dirent(name, (mode & S_IALLUGO) | S_IFREG,
-			      SYSFS_KOBJ_ATTR);
-	if (!sd)
-		return ERR_PTR(-ENOMEM);
-
-	sd->s_attr.ops = ops;
-	sd->s_attr.size = size;
-	sd->s_ns = ns;
-	sd->priv = priv;
-
-#ifdef CONFIG_DEBUG_LOCK_ALLOC
-	if (key) {
-		lockdep_init_map(&sd->dep_map, "s_active", key, 0);
-		sd->s_flags |= SYSFS_FLAG_LOCKDEP;
-	}
-#endif
-
-	/*
-	 * sd->s_attr.ops is accesible only while holding active ref.  We
-	 * need to know whether some ops are implemented outside active
-	 * ref.  Cache their existence in flags.
-	 */
-	if (ops->seq_show)
-		sd->s_flags |= SYSFS_FLAG_HAS_SEQ_SHOW;
-	if (ops->mmap)
-		sd->s_flags |= SYSFS_FLAG_HAS_MMAP;
-
-	sysfs_addrm_start(&acxt);
-	rc = sysfs_add_one(&acxt, sd, parent);
-	sysfs_addrm_finish(&acxt);
-
-	if (rc) {
-		kernfs_put(sd);
-		return ERR_PTR(rc);
-	}
-	return sd;
-}
-
 int sysfs_add_file(struct sysfs_dirent *dir_sd, const struct attribute *attr,
 		   bool is_bin)
 {
diff --git a/fs/sysfs/sysfs.h b/fs/sysfs/sysfs.h
index aef8180..95cc989 100644
--- a/fs/sysfs/sysfs.h
+++ b/fs/sysfs/sysfs.h
@@ -41,15 +41,11 @@ void sysfs_warn_dup(struct sysfs_dirent *parent, const char *name);
 /*
  * file.c
  */
-extern const struct file_operations kernfs_file_operations;
-
 int sysfs_add_file(struct sysfs_dirent *dir_sd,
 		   const struct attribute *attr, bool is_bin);
-
 int sysfs_add_file_mode_ns(struct sysfs_dirent *dir_sd,
 			   const struct attribute *attr, bool is_bin,
 			   umode_t amode, const void *ns);
-void sysfs_unmap_bin_file(struct sysfs_dirent *sd);
 
 /*
  * symlink.c
-- 
1.8.3.1


^ permalink raw reply related	[flat|nested] 46+ messages in thread

* Re: Subject: [PATCHSET driver-core-next] sysfs: separate out kernfs, part #1
  2013-10-28 13:06 ` Subject: [PATCHSET driver-core-next] sysfs: separate out kernfs, part #1 Tejun Heo
@ 2013-10-28 16:35   ` Tejun Heo
  0 siblings, 0 replies; 46+ messages in thread
From: Tejun Heo @ 2013-10-28 16:35 UTC (permalink / raw)
  To: gregkh; +Cc: kay, linux-kernel, ebiederm, bhelgaas

On Mon, Oct 28, 2013 at 09:06:54AM -0400, Tejun Heo wrote:
> Hello, guys.
> 
> Just posted updated patches incorporating Pavel's suggestions.  Only
> the patches which need to be updated are reposted.  The whole series
> can be found in the following branch.  It's based on top of today's
> driver-core-next 6fffcfa7c0fc ("devres: restore zeroing behavior of
> devres_alloc()").
> 
>   git://git.kernel.org/pub/scm/linux/kernel/git/tj/misc.git review-separate-out-kernfs

Updated again to reflect bug report from Fengguang.  The git tree has
been updated accordingly.  Please give me a holler if someone wants
the whole series to be reposted.

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: Subject: [PATCHSET driver-core-next] sysfs: separate out kernfs, part #1
  2013-10-24 15:49 Subject: [PATCHSET driver-core-next] sysfs: separate out kernfs, part #1 Tejun Heo
                   ` (34 preceding siblings ...)
  2013-10-28 13:06 ` Subject: [PATCHSET driver-core-next] sysfs: separate out kernfs, part #1 Tejun Heo
@ 2013-10-29 22:29 ` Greg KH
  2013-10-30 14:41   ` Tejun Heo
  35 siblings, 1 reply; 46+ messages in thread
From: Greg KH @ 2013-10-29 22:29 UTC (permalink / raw)
  To: Tejun Heo; +Cc: kay, linux-kernel, ebiederm, bhelgaas

On Thu, Oct 24, 2013 at 11:49:06AM -0400, Tejun Heo wrote:
> Hello,
> 
> This patchset contains 34 patches to separate out core sysfs features
> into kernfs.  kernfs exclusively deals with sysfs_dirent, which will
> be later renamed to kernfs_node, and kernfs_ops.  sysfs becomes a
> wrapping layer over sysfs which interfaces kobject and
> [bin_]attribute.
> 
> The goal of these changes is to allow other users to make use of the
> core features of sysfs instead of rolling their own pseudo filesystem
> implementation which usually fails to deal with issues with file
> shutdowns, locking separation from vfs layer and so on.  This patchset
> refactors sysfs and separates out most core functionalities to kernfs;
> however, the mount code hasn't been updated yet and it can't be used
> by other users yet.  The patchset is pretty big already and the two
> steps can be separated relatively well, so I think this is a good
> split point.

I've applied the first 6 patches here, as they only did cleanup stuff.
As it's late in the development cycle (i.e. 3.12 is going to be out in
days), I'd prefer to wait on the rest until 3.13-rc1 is out so that it
gets lots of testing in linux-next.

So, I'll hold off on the rest of these until then, ok?

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: Subject: [PATCHSET driver-core-next] sysfs: separate out kernfs, part #1
  2013-10-29 22:29 ` Greg KH
@ 2013-10-30 14:41   ` Tejun Heo
  0 siblings, 0 replies; 46+ messages in thread
From: Tejun Heo @ 2013-10-30 14:41 UTC (permalink / raw)
  To: Greg KH; +Cc: kay, linux-kernel, ebiederm, bhelgaas

Hello, Greg.

On Tue, Oct 29, 2013 at 03:29:50PM -0700, Greg KH wrote:
> I've applied the first 6 patches here, as they only did cleanup stuff.
> As it's late in the development cycle (i.e. 3.12 is going to be out in
> days), I'd prefer to wait on the rest until 3.13-rc1 is out so that it
> gets lots of testing in linux-next.
> 
> So, I'll hold off on the rest of these until then, ok?

Yeah, sounds good to me.  kernfs isn't functional yet anyway and there
will be more sweeping changes so it probably is the least painful if
we do all the big changes in one cycle anyway.

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 46+ messages in thread

end of thread, other threads:[~2013-10-30 14:47 UTC | newest]

Thread overview: 46+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2013-10-24 15:49 Subject: [PATCHSET driver-core-next] sysfs: separate out kernfs, part #1 Tejun Heo
2013-10-24 15:49 ` [PATCH 01/34] sysfs: merge sysfs_elem_bin_attr into sysfs_elem_attr Tejun Heo
2013-10-24 15:49 ` [PATCH 02/34] sysfs: honor bin_attr.attr.ignore_lockdep Tejun Heo
2013-10-24 15:49 ` [PATCH 03/34] sysfs: remove unused sysfs_get_dentry() prototype Tejun Heo
2013-10-24 15:49 ` [PATCH 04/34] sysfs: move sysfs_hash_and_remove() to fs/sysfs/dir.c Tejun Heo
2013-10-24 15:49 ` [PATCH 05/34] sysfs: separate out dup filename warning into a separate function Tejun Heo
2013-10-24 15:49 ` [PATCH 06/34] sysfs: make __sysfs_add_one() fail if the parent isn't a directory Tejun Heo
2013-10-24 15:49 ` [PATCH 07/34] sysfs, kernfs: add skeletons for kernfs Tejun Heo
2013-10-24 15:49 ` [PATCH 08/34] sysfs, kernfs: introduce kernfs_remove[_by_name[_ns]]() Tejun Heo
2013-10-24 15:49 ` [PATCH 09/34] sysfs, kernfs: introduce kernfs_create_link() Tejun Heo
2013-10-24 15:49 ` [PATCH 10/34] sysfs, kernfs: introduce kernfs_rename[_ns]() Tejun Heo
2013-10-24 15:49 ` [PATCH 11/34] sysfs, kernfs: introduce kernfs_setattr() Tejun Heo
2013-10-24 15:49 ` [PATCH 12/34] sysfs, kernfs: replace sysfs_dirent->s_dir.kobj and ->s_attr.[bin_]attr with ->priv Tejun Heo
2013-10-24 15:49 ` [PATCH 13/34] sysfs, kernfs: introduce kernfs_create_dir[_ns]() Tejun Heo
2013-10-24 15:49 ` [PATCH 14/34] sysfs, kernfs: prepare read path for kernfs Tejun Heo
2013-10-26 11:32   ` Pavel Machek
2013-10-27 11:30     ` Tejun Heo
2013-10-28 13:03   ` [PATCH v2 " Tejun Heo
2013-10-28 16:33   ` [PATCH v3 " Tejun Heo
2013-10-24 15:49 ` [PATCH 15/34] sysfs, kernfs: prepare write " Tejun Heo
2013-10-24 15:49 ` [PATCH 16/34] sysfs, kernfs: prepare llseek " Tejun Heo
2013-10-24 15:49 ` [PATCH 17/34] sysfs, kernfs: prepare mmap " Tejun Heo
2013-10-24 15:49 ` [PATCH 18/34] sysfs, kernfs: prepare open, release, poll paths " Tejun Heo
2013-10-24 15:49 ` [PATCH 19/34] sysfs, kernfs: move sysfs_open_file to include/linux/kernfs.h Tejun Heo
2013-10-24 15:49 ` [PATCH 20/34] sysfs, kernfs: introduce kernfs_ops Tejun Heo
2013-10-24 15:49 ` [PATCH 21/34] sysfs, kernfs: add sysfs_dirent->s_attr.size Tejun Heo
2013-10-24 15:49 ` [PATCH 22/34] sysfs, kernfs: remove SYSFS_KOBJ_BIN_ATTR Tejun Heo
2013-10-24 15:49 ` [PATCH 23/34] sysfs, kernfs: introduce kernfs_create_file[_ns]() Tejun Heo
2013-10-24 15:49 ` [PATCH 24/34] sysfs, kernfs: remove sysfs_add_one() Tejun Heo
2013-10-24 15:49 ` [PATCH 25/34] sysfs, kernfs: add kernfs_ops->seq_{start|next|stop}() Tejun Heo
2013-10-28 13:04   ` [PATCH v2 " Tejun Heo
2013-10-24 15:49 ` [PATCH 26/34] sysfs, kernfs: introduce kernfs_notify() Tejun Heo
2013-10-24 15:49 ` [PATCH 27/34] sysfs, kernfs: reorganize SYSFS_* constants Tejun Heo
2013-10-24 15:49 ` [PATCH 28/34] sysfs, kernfs: revamp sysfs_dirent active_ref lockdep annotation Tejun Heo
2013-10-24 15:49 ` [PATCH 29/34] sysfs, kernfs: introduce kernfs[_find_and]_get() and kernfs_put() Tejun Heo
2013-10-24 15:49 ` [PATCH 30/34] sysfs, kernfs: move internal decls to fs/kernfs/kernfs-internal.h Tejun Heo
2013-10-24 15:49 ` [PATCH 31/34] sysfs, kernfs: move inode code to fs/kernfs/inode.c Tejun Heo
2013-10-24 15:49 ` [PATCH 32/34] sysfs, kernfs: move dir core code to fs/kernfs/dir.c Tejun Heo
2013-10-24 15:49 ` [PATCH 33/34] sysfs, kernfs: move file core code to fs/kernfs/file.c Tejun Heo
2013-10-28 13:04   ` [PATCH v2 " Tejun Heo
2013-10-28 16:34   ` [PATCH v3 " Tejun Heo
2013-10-24 15:49 ` [PATCH 34/34] sysfs, kernfs: move symlink core code to fs/kernfs/symlink.c Tejun Heo
2013-10-28 13:06 ` Subject: [PATCHSET driver-core-next] sysfs: separate out kernfs, part #1 Tejun Heo
2013-10-28 16:35   ` Tejun Heo
2013-10-29 22:29 ` Greg KH
2013-10-30 14:41   ` Tejun Heo

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).