linux-fsdevel.vger.kernel.org archive mirror
* [PATCH 0/20] Sysfs cleanups
@ 2009-05-20  4:09 Eric W. Biederman
  2009-05-20 15:37 ` Greg KH
                   ` (6 more replies)
  0 siblings, 7 replies; 200+ messages in thread
From: Eric W. Biederman @ 2009-05-20  4:09 UTC (permalink / raw)
  To: Andrew Morton, Greg Kroah-Hartman
  Cc: linux-kernel, Tejun Heo, Cornelia Huck, linux-fsdevel


The following patch series cleans up sysfs to the point where it is
generally a good citizen of the vfs layer.  The big theme is lazy
synchronization from the sysfs data structures to the vfs data
structures using the same techniques as most other distributed
filesystems.

This allows the complete removal of i_mutex from the sysfs code,
the death of lookup_one_noperm, and probably a few other weird
cases that slip my mind.

Included in this is the latest version of my work that merges
sysfs_move_dir and sysfs_rename_dir to simplify maintenance.

I have been running these patches for several months so there should
be no really nasty surprises in here.

 drivers/base/core.c   |   18 +-
 fs/namei.c            |   22 --
 fs/sysfs/dir.c        |  565 ++++++++++++++-----------------------------------
 fs/sysfs/file.c       |   47 +----
 fs/sysfs/inode.c      |  154 ++++++++------
 fs/sysfs/mount.c      |   20 +-
 fs/sysfs/symlink.c    |   71 +++----
 fs/sysfs/sysfs.h      |   25 +--
 include/linux/namei.h |    1 -
 include/linux/sysfs.h |    9 +
 10 files changed, 325 insertions(+), 607 deletions(-)

Eric


^ permalink raw reply	[flat|nested] 200+ messages in thread

* Re: [PATCH 0/20] Sysfs cleanups
  2009-05-20  4:09 [PATCH 0/20] Sysfs cleanups Eric W. Biederman
@ 2009-05-20 15:37 ` Greg KH
  2009-05-20 23:04   ` Eric W. Biederman
  2009-05-21  0:27 ` [PATCH 01/20] sysfs: Implement sysfs_rename_link Eric W. Biederman
                   ` (5 subsequent siblings)
  6 siblings, 1 reply; 200+ messages in thread
From: Greg KH @ 2009-05-20 15:37 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: Andrew Morton, linux-kernel, Tejun Heo, Cornelia Huck, linux-fsdevel

On Tue, May 19, 2009 at 09:09:43PM -0700, Eric W. Biederman wrote:
> 
> The following patch series cleans up sysfs to the point where it is
> generally a good citizen of the vfs layer.  The big theme is lazy
> synchronization from the sysfs data structures to the vfs data
> structures using the same techniques as most other distributed
> filesystems.

Hm, I only seem to have gotten patch 0/20, none of the actual patches.
Did I somehow miss them?

confused,

greg k-h


* Re: [PATCH 0/20] Sysfs cleanups
  2009-05-20 15:37 ` Greg KH
@ 2009-05-20 23:04   ` Eric W. Biederman
  0 siblings, 0 replies; 200+ messages in thread
From: Eric W. Biederman @ 2009-05-20 23:04 UTC (permalink / raw)
  To: Greg KH
  Cc: Andrew Morton, linux-kernel, Tejun Heo, Cornelia Huck, linux-fsdevel

Greg KH <gregkh@suse.de> writes:

> On Tue, May 19, 2009 at 09:09:43PM -0700, Eric W. Biederman wrote:
>> 
>> The following patch series cleans up sysfs to the point where it is
>> generally a good citizen of the vfs layer.  The big theme is lazy
>> synchronization from the sysfs data structures to the vfs data
>> structures using the same techniques as most other distributed
>> filesystems.
>
> Hm, I only seem to have gotten patch 0/20, none of the actual patches.
> Did I somehow miss them?
>
> confused,

Me too.

git-send-email ran.  I got copies of the mail sent.  But fsdevel
and linux-kernel appear not to have received them.  Hopefully I can figure
out what went wrong shortly.  My apologies if anyone receives several copies.

Eric


* [PATCH 01/20] sysfs: Implement sysfs_rename_link
  2009-05-20  4:09 [PATCH 0/20] Sysfs cleanups Eric W. Biederman
  2009-05-20 15:37 ` Greg KH
@ 2009-05-21  0:27 ` Eric W. Biederman
  2009-05-21  0:27   ` [PATCH 02/20] driver core: Use sysfs_rename_link in device_rename Eric W. Biederman
                     ` (3 more replies)
  2009-05-23 20:13 ` [PATCH 21/20] sysfs: Rename sysfs_mv_dir sysfs_rename Eric W. Biederman
                   ` (4 subsequent siblings)
  6 siblings, 4 replies; 200+ messages in thread
From: Eric W. Biederman @ 2009-05-21  0:27 UTC (permalink / raw)
  To: Andrew Morton, Greg Kroah-Hartman
  Cc: linux-kernel, Tejun Heo, Cornelia Huck, linux-fsdevel,
	Eric W. Biederman, Benjamin Thery, Daniel Lezcano,
	Eric W. Biederman

From: Eric W. Biederman <ebiederm@xmission.com>

Because of rename ordering problems we occasionally give false
warnings about invalid sysfs operations, so implement a helper
function for this common sysfs idiom.

This is a stripped down version of an earlier patch that
also added sysfs_delete_link.

Cc: Benjamin Thery <benjamin.thery@bull.net>
Cc: Daniel Lezcano <dlezcano@fr.ibm.com>
Cc: Tejun Heo <tj@kernel.org>
Signed-off-by: Eric W. Biederman <ebiederm@aristanetworks.com>
---
 fs/sysfs/symlink.c    |   16 ++++++++++++++++
 include/linux/sysfs.h |    9 +++++++++
 2 files changed, 25 insertions(+), 0 deletions(-)

diff --git a/fs/sysfs/symlink.c b/fs/sysfs/symlink.c
index a3ba217..11c4da5 100644
--- a/fs/sysfs/symlink.c
+++ b/fs/sysfs/symlink.c
@@ -122,6 +122,22 @@ void sysfs_remove_link(struct kobject * kobj, const char * name)
 	sysfs_hash_and_remove(parent_sd, name);
 }
 
+/**
+ *	sysfs_rename_link - rename symlink in object's directory.
+ *	@kobj:	object we're acting for.
+ *	@targ:	object we're pointing to.
+ *	@old:	previous name of the symlink.
+ *	@new:	new name of the symlink.
+ *
+ *	A helper function for the common rename symlink idiom.
+ */
+int sysfs_rename_link(struct kobject *kobj, struct kobject *targ,
+			const char *old, const char *new)
+{
+	sysfs_remove_link(kobj, old);
+	return sysfs_create_link(kobj, targ, new);
+}
+
 static int sysfs_get_target_path(struct sysfs_dirent *parent_sd,
 				 struct sysfs_dirent *target_sd, char *path)
 {
diff --git a/include/linux/sysfs.h b/include/linux/sysfs.h
index 9d68fed..18c8e70 100644
--- a/include/linux/sysfs.h
+++ b/include/linux/sysfs.h
@@ -109,6 +109,9 @@ int __must_check sysfs_create_link_nowarn(struct kobject *kobj,
 					  const char *name);
 void sysfs_remove_link(struct kobject *kobj, const char *name);
 
+int sysfs_rename_link(struct kobject *kobj, struct kobject *target,
+			const char *old_name, const char *new_name);
+
 int __must_check sysfs_create_group(struct kobject *kobj,
 				    const struct attribute_group *grp);
 int sysfs_update_group(struct kobject *kobj,
@@ -202,6 +205,12 @@ static inline void sysfs_remove_link(struct kobject *kobj, const char *name)
 {
 }
 
+static inline int sysfs_rename_link(struct kobject *k, struct kobject *t,
+				    const char *old_name, const char *new_name)
+{
+	return 0;
+}
+
 static inline int sysfs_create_group(struct kobject *kobj,
 				     const struct attribute_group *grp)
 {
-- 
1.6.1.2.350.g88cc
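
The remove-then-create ordering used by sysfs_rename_link above can be
modelled in userspace.  This is only a sketch under stated assumptions:
struct model_dir and the model_* functions are hypothetical names, not
kernel APIs, and a directory is reduced to a small fixed table of link
names.  It shows that removing the old name first means the create step
cannot collide with the entry being renamed.

```c
#include <assert.h>
#include <errno.h>
#include <string.h>

/* Userspace model only: all model_* names are hypothetical. */
#define MODEL_MAX_LINKS 8
#define MODEL_NAME_LEN  32

struct model_dir {
	char names[MODEL_MAX_LINKS][MODEL_NAME_LEN];	/* "" marks a free slot */
};

/* Return the slot holding name, or -1 if it is not present. */
static int model_find(const struct model_dir *dir, const char *name)
{
	for (int i = 0; i < MODEL_MAX_LINKS; i++)
		if (dir->names[i][0] && strcmp(dir->names[i], name) == 0)
			return i;
	return -1;
}

/* Fails with -EEXIST on a duplicate name, -ENOSPC when the table is full. */
static int model_create_link(struct model_dir *dir, const char *name)
{
	assert(name[0] != '\0' && strlen(name) < MODEL_NAME_LEN);
	if (model_find(dir, name) >= 0)
		return -EEXIST;
	for (int i = 0; i < MODEL_MAX_LINKS; i++) {
		if (!dir->names[i][0]) {
			strcpy(dir->names[i], name);
			return 0;
		}
	}
	return -ENOSPC;
}

/* Removing a name that is not present is silently ignored. */
static void model_remove_link(struct model_dir *dir, const char *name)
{
	int i = model_find(dir, name);

	if (i >= 0)
		dir->names[i][0] = '\0';
}

/* Remove first, then create: the new name never collides with the old. */
static int model_rename_link(struct model_dir *dir,
			     const char *old, const char *new)
{
	model_remove_link(dir, old);
	return model_create_link(dir, new);
}
```

In this model, renaming "eth0" to "eth0" succeeds, whereas a
create-then-remove ordering would report -EEXIST for the same request.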



* [PATCH 02/20] driver core: Use sysfs_rename_link in device_rename
  2009-05-21  0:27 ` [PATCH 01/20] sysfs: Implement sysfs_rename_link Eric W. Biederman
@ 2009-05-21  0:27   ` Eric W. Biederman
  2009-05-21  0:27     ` [PATCH 03/20] sysfs: Remove now unnecessary error reporting suppression Eric W. Biederman
  2009-05-21  1:49   ` [PATCH 01/20] sysfs: Implement sysfs_rename_link Tejun Heo
                     ` (2 subsequent siblings)
  3 siblings, 1 reply; 200+ messages in thread
From: Eric W. Biederman @ 2009-05-21  0:27 UTC (permalink / raw)
  To: Andrew Morton, Greg Kroah-Hartman
  Cc: linux-kernel, Tejun Heo, Cornelia Huck, linux-fsdevel,
	Eric W. Biederman, Eric W. Biederman

From: Eric W. Biederman <ebiederm@xmission.com>

Don't open-code the renaming of symlinks in sysfs;
instead use the new helper function sysfs_rename_link.

Signed-off-by: Eric W. Biederman <ebiederm@aristanetworks.com>
---
 drivers/base/core.c |   18 ++++++------------
 1 files changed, 6 insertions(+), 12 deletions(-)

diff --git a/drivers/base/core.c b/drivers/base/core.c
index 4aa527b..8a1569c 100644
--- a/drivers/base/core.c
+++ b/drivers/base/core.c
@@ -1490,22 +1490,16 @@ int device_rename(struct device *dev, char *new_name)
 	if (old_class_name) {
 		new_class_name = make_class_name(dev->class->name, &dev->kobj);
 		if (new_class_name) {
-			error = sysfs_create_link_nowarn(&dev->parent->kobj,
-							 &dev->kobj,
-							 new_class_name);
-			if (error)
-				goto out;
-			sysfs_remove_link(&dev->parent->kobj, old_class_name);
+			error = sysfs_rename_link(&dev->parent->kobj,
+						  &dev->kobj,
+						  old_class_name,
+						  new_class_name);
 		}
 	}
 #else
 	if (dev->class) {
-		error = sysfs_create_link_nowarn(&dev->class->p->class_subsys.kobj,
-						 &dev->kobj, dev_name(dev));
-		if (error)
-			goto out;
-		sysfs_remove_link(&dev->class->p->class_subsys.kobj,
-				  old_device_name);
+		error = sysfs_rename_link(&dev->class->p->class_subsys.kobj,
+					  &dev->kobj, old_device_name, new_name);
 	}
 #endif
 
-- 
1.6.1.2.350.g88cc



* [PATCH 03/20] sysfs: Remove now unnecessary error reporting suppression.
  2009-05-21  0:27   ` [PATCH 02/20] driver core: Use sysfs_rename_link in device_rename Eric W. Biederman
@ 2009-05-21  0:27     ` Eric W. Biederman
  2009-05-21  0:27       ` [PATCH 04/20] sysfs: Handle the general case of removing of directories with subdirectories Eric W. Biederman
  2009-05-21  5:37       ` [PATCH 03/20] sysfs: Remove now unnecessary error reporting suppression Tejun Heo
  0 siblings, 2 replies; 200+ messages in thread
From: Eric W. Biederman @ 2009-05-21  0:27 UTC (permalink / raw)
  To: Andrew Morton, Greg Kroah-Hartman
  Cc: linux-kernel, Tejun Heo, Cornelia Huck, linux-fsdevel,
	Eric W. Biederman, Eric W. Biederman

From: Eric W. Biederman <ebiederm@xmission.com>

Now that we use sysfs_rename_link in the places we previously
used sysfs_create_link_nowarn we can remove sysfs_create_link_nowarn
and all its supporting infrastructure.

Signed-off-by: Eric W. Biederman <ebiederm@aristanetworks.com>
---
 fs/sysfs/dir.c     |   54 +++++++++++----------------------------------------
 fs/sysfs/symlink.c |   42 ++++++++-------------------------------
 fs/sysfs/sysfs.h   |    1 -
 3 files changed, 21 insertions(+), 76 deletions(-)

diff --git a/fs/sysfs/dir.c b/fs/sysfs/dir.c
index d88d0fa..b95cc07 100644
--- a/fs/sysfs/dir.c
+++ b/fs/sysfs/dir.c
@@ -397,43 +397,6 @@ void sysfs_addrm_start(struct sysfs_addrm_cxt *acxt,
 }
 
 /**
- *	__sysfs_add_one - add sysfs_dirent to parent without warning
- *	@acxt: addrm context to use
- *	@sd: sysfs_dirent to be added
- *
- *	Get @acxt->parent_sd and set sd->s_parent to it and increment
- *	nlink of parent inode if @sd is a directory and link into the
- *	children list of the parent.
- *
- *	This function should be called between calls to
- *	sysfs_addrm_start() and sysfs_addrm_finish() and should be
- *	passed the same @acxt as passed to sysfs_addrm_start().
- *
- *	LOCKING:
- *	Determined by sysfs_addrm_start().
- *
- *	RETURNS:
- *	0 on success, -EEXIST if entry with the given name already
- *	exists.
- */
-int __sysfs_add_one(struct sysfs_addrm_cxt *acxt, struct sysfs_dirent *sd)
-{
-	if (sysfs_find_dirent(acxt->parent_sd, sd->s_name))
-		return -EEXIST;
-
-	sd->s_parent = sysfs_get(acxt->parent_sd);
-
-	if (sysfs_type(sd) == SYSFS_DIR && acxt->parent_inode)
-		inc_nlink(acxt->parent_inode);
-
-	acxt->cnt++;
-
-	sysfs_link_sibling(sd);
-
-	return 0;
-}
-
-/**
  *	sysfs_pathname - return full path to sysfs dirent
  *	@sd: sysfs_dirent whose path we want
  *	@path: caller allocated buffer
@@ -475,10 +438,7 @@ static char *sysfs_pathname(struct sysfs_dirent *sd, char *path)
  */
 int sysfs_add_one(struct sysfs_addrm_cxt *acxt, struct sysfs_dirent *sd)
 {
-	int ret;
-
-	ret = __sysfs_add_one(acxt, sd);
-	if (ret == -EEXIST) {
+	if (sysfs_find_dirent(acxt->parent_sd, sd->s_name)) {
 		char *path = kzalloc(PATH_MAX, GFP_KERNEL);
 		WARN(1, KERN_WARNING
 		     "sysfs: cannot create duplicate filename '%s'\n",
@@ -486,9 +446,19 @@ int sysfs_add_one(struct sysfs_addrm_cxt *acxt, struct sysfs_dirent *sd)
 		     strcat(strcat(sysfs_pathname(acxt->parent_sd, path), "/"),
 		            sd->s_name));
 		kfree(path);
+		return -EEXIST;
 	}
 
-	return ret;
+	sd->s_parent = sysfs_get(acxt->parent_sd);
+
+	if (sysfs_type(sd) == SYSFS_DIR && acxt->parent_inode)
+		inc_nlink(acxt->parent_inode);
+
+	acxt->cnt++;
+
+	sysfs_link_sibling(sd);
+
+	return 0;
 }
 
 /**
diff --git a/fs/sysfs/symlink.c b/fs/sysfs/symlink.c
index 11c4da5..ac13e61 100644
--- a/fs/sysfs/symlink.c
+++ b/fs/sysfs/symlink.c
@@ -19,8 +19,14 @@
 
 #include "sysfs.h"
 
-static int sysfs_do_create_link(struct kobject *kobj, struct kobject *target,
-				const char *name, int warn)
+/**
+ *	sysfs_create_link - create symlink between two objects.
+ *	@kobj:	object whose directory we're creating the link in.
+ *	@target:	object we're pointing to.
+ *	@name:		name of the symlink.
+ */
+int sysfs_create_link(struct kobject *kobj, struct kobject *target,
+			const char *name)
 {
 	struct sysfs_dirent *parent_sd = NULL;
 	struct sysfs_dirent *target_sd = NULL;
@@ -60,10 +66,7 @@ static int sysfs_do_create_link(struct kobject *kobj, struct kobject *target,
 	target_sd = NULL;	/* reference is now owned by the symlink */
 
 	sysfs_addrm_start(&acxt, parent_sd);
-	if (warn)
-		error = sysfs_add_one(&acxt, sd);
-	else
-		error = __sysfs_add_one(&acxt, sd);
+	error = sysfs_add_one(&acxt, sd);
 	sysfs_addrm_finish(&acxt);
 
 	if (error)
@@ -78,33 +81,6 @@ static int sysfs_do_create_link(struct kobject *kobj, struct kobject *target,
 }
 
 /**
- *	sysfs_create_link - create symlink between two objects.
- *	@kobj:	object whose directory we're creating the link in.
- *	@target:	object we're pointing to.
- *	@name:		name of the symlink.
- */
-int sysfs_create_link(struct kobject *kobj, struct kobject *target,
-		      const char *name)
-{
-	return sysfs_do_create_link(kobj, target, name, 1);
-}
-
-/**
- *	sysfs_create_link_nowarn - create symlink between two objects.
- *	@kobj:	object whose directory we're creating the link in.
- *	@target:	object we're pointing to.
- *	@name:		name of the symlink.
- *
- *	This function does the same as sysf_create_link(), but it
- *	doesn't warn if the link already exists.
- */
-int sysfs_create_link_nowarn(struct kobject *kobj, struct kobject *target,
-			     const char *name)
-{
-	return sysfs_do_create_link(kobj, target, name, 0);
-}
-
-/**
  *	sysfs_remove_link - remove symlink in object's directory.
  *	@kobj:	object we're acting for.
  *	@name:	name of the symlink to remove.
diff --git a/fs/sysfs/sysfs.h b/fs/sysfs/sysfs.h
index 3fa0d98..abf05f4 100644
--- a/fs/sysfs/sysfs.h
+++ b/fs/sysfs/sysfs.h
@@ -108,7 +108,6 @@ struct sysfs_dirent *sysfs_get_active_two(struct sysfs_dirent *sd);
 void sysfs_put_active_two(struct sysfs_dirent *sd);
 void sysfs_addrm_start(struct sysfs_addrm_cxt *acxt,
 		       struct sysfs_dirent *parent_sd);
-int __sysfs_add_one(struct sysfs_addrm_cxt *acxt, struct sysfs_dirent *sd);
 int sysfs_add_one(struct sysfs_addrm_cxt *acxt, struct sysfs_dirent *sd);
 void sysfs_remove_one(struct sysfs_addrm_cxt *acxt, struct sysfs_dirent *sd);
 void sysfs_addrm_finish(struct sysfs_addrm_cxt *acxt);
-- 
1.6.1.2.350.g88cc



* [PATCH 04/20] sysfs: Handle the general case of removing of directories with subdirectories
  2009-05-21  0:27     ` [PATCH 03/20] sysfs: Remove now unnecessary error reporting suppression Eric W. Biederman
@ 2009-05-21  0:27       ` Eric W. Biederman
  2009-05-21  0:27         ` [PATCH 05/20] sysfs: Rename sysfs_d_iput to sysfs_dentry_iput Eric W. Biederman
  2009-05-21  6:23         ` [PATCH 04/20] sysfs: Handle the general case of removing of directories with subdirectories Tejun Heo
  2009-05-21  5:37       ` [PATCH 03/20] sysfs: Remove now unnecessary error reporting suppression Tejun Heo
  1 sibling, 2 replies; 200+ messages in thread
From: Eric W. Biederman @ 2009-05-21  0:27 UTC (permalink / raw)
  To: Andrew Morton, Greg Kroah-Hartman
  Cc: linux-kernel, Tejun Heo, Cornelia Huck, linux-fsdevel,
	Eric W. Biederman, Eric W. Biederman

From: Eric W. Biederman <ebiederm@xmission.com>

Modify sysfs to properly remove directories containing attributes and
subdirectories.  The code is relatively simple and means we don't have
to worry about what might use this logic.

In a quick survey I have only found /sys/dev/char and /sys/dev/block that are
removing non-empty directories today (and they are exclusively filled with symlinks).
So only removing empty directories does not appear to be an option.

I don't hold sysfs_mutex across the entire operation as that is unneeded
for coherence at the sysfs level and some level of coordination is expected
at the upper layers.

Signed-off-by: Eric W. Biederman <ebiederm@aristanetworks.com>
---
 fs/sysfs/dir.c |   54 ++++++++++++++++++++++++++----------------------------
 1 files changed, 26 insertions(+), 28 deletions(-)

diff --git a/fs/sysfs/dir.c b/fs/sysfs/dir.c
index b95cc07..50702b3 100644
--- a/fs/sysfs/dir.c
+++ b/fs/sysfs/dir.c
@@ -732,43 +732,41 @@ const struct inode_operations sysfs_dir_inode_operations = {
 	.setattr	= sysfs_setattr,
 };
 
-static void remove_dir(struct sysfs_dirent *sd)
+static struct sysfs_dirent *sysfs_get_one(struct sysfs_dirent *dir_sd)
 {
-	struct sysfs_addrm_cxt acxt;
-
-	sysfs_addrm_start(&acxt, sd->s_parent);
-	sysfs_remove_one(&acxt, sd);
-	sysfs_addrm_finish(&acxt);
-}
-
-void sysfs_remove_subdir(struct sysfs_dirent *sd)
-{
-	remove_dir(sd);
+	struct sysfs_dirent *sd = dir_sd;
+	mutex_lock(&sysfs_mutex);
+	while ((sysfs_type(sd) == SYSFS_DIR) && sd->s_dir.children)
+		sd = sd->s_dir.children;
+	if (sd != dir_sd)
+		sysfs_get(sd);
+	else
+		sd = NULL;
+	mutex_unlock(&sysfs_mutex);
+	return sd;
 }
 
-
-static void __sysfs_remove_dir(struct sysfs_dirent *dir_sd)
+static void remove_dir(struct sysfs_dirent *dir_sd)
 {
 	struct sysfs_addrm_cxt acxt;
-	struct sysfs_dirent **pos;
-
-	if (!dir_sd)
-		return;
+	struct sysfs_dirent *sd;
 
 	pr_debug("sysfs %s: removing dir\n", dir_sd->s_name);
-	sysfs_addrm_start(&acxt, dir_sd);
-	pos = &dir_sd->s_dir.children;
-	while (*pos) {
-		struct sysfs_dirent *sd = *pos;
 
-		if (sysfs_type(sd) != SYSFS_DIR)
-			sysfs_remove_one(&acxt, sd);
-		else
-			pos = &(*pos)->s_sibling;
+	while ((sd = sysfs_get_one(dir_sd))) {
+		sysfs_addrm_start(&acxt, sd->s_parent);
+		sysfs_remove_one(&acxt, sd);
+		sysfs_addrm_finish(&acxt);
+		sysfs_put(sd);
 	}
+	sysfs_addrm_start(&acxt, dir_sd->s_parent);
+	sysfs_remove_one(&acxt, dir_sd);
 	sysfs_addrm_finish(&acxt);
+}
 
-	remove_dir(dir_sd);
+void sysfs_remove_subdir(struct sysfs_dirent *sd)
+{
+	remove_dir(sd);
 }
 
 /**
@@ -788,7 +786,7 @@ void sysfs_remove_dir(struct kobject * kobj)
 	kobj->sd = NULL;
 	spin_unlock(&sysfs_assoc_lock);
 
-	__sysfs_remove_dir(sd);
+	remove_dir(sd);
 }
 
 int sysfs_rename_dir(struct kobject * kobj, const char *new_name)
-- 
1.6.1.2.350.g88cc
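
The leaf-first removal loop in the patch above can be sketched in
userspace as follows.  This is a simplified model, not the kernel code:
struct model_node and the model_* helpers are hypothetical, and locking
and reference counting are left out entirely.  The idea it illustrates
is the one in sysfs_get_one: follow first-child pointers down to an
entry with nothing beneath it, remove that entry, and repeat until the
directory is empty.

```c
#include <assert.h>
#include <stddef.h>

/* Userspace model: children form a singly linked list via ->sibling. */
struct model_node {
	struct model_node *parent;
	struct model_node *children;	/* first child, NULL if none */
	struct model_node *sibling;	/* next child of the same parent */
	int is_dir;
	int removed;
};

static void model_add_child(struct model_node *parent, struct model_node *child)
{
	assert(parent->is_dir);
	child->parent = parent;
	child->sibling = parent->children;
	parent->children = child;
}

/* Follow first-child pointers down to a node with no children.
 * Returns NULL once dir itself has nothing left beneath it. */
static struct model_node *model_get_one(struct model_node *dir)
{
	struct model_node *n = dir;

	while (n->is_dir && n->children)
		n = n->children;
	return n != dir ? n : NULL;
}

/* Unlink one childless node from its parent's child list. */
static void model_remove_one(struct model_node *n)
{
	struct model_node **pos = &n->parent->children;

	assert(n->children == NULL);
	while (*pos != n)
		pos = &(*pos)->sibling;
	*pos = n->sibling;
	n->sibling = NULL;
	n->removed = 1;
}

/* Remove everything below dir, one leaf at a time; returns the count.
 * dir itself is left for the caller to unlink from its own parent. */
static int model_remove_below(struct model_node *dir)
{
	struct model_node *n;
	int count = 0;

	while ((n = model_get_one(dir)) != NULL) {
		model_remove_one(n);
		count++;
	}
	return count;
}
```

Because each pass restarts from the top, the model never holds a
pointer into a subtree across a removal, which mirrors why the patch
does not need to hold sysfs_mutex across the whole operation.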



* [PATCH 05/20] sysfs: Rename sysfs_d_iput to sysfs_dentry_iput
  2009-05-21  0:27       ` [PATCH 04/20] sysfs: Handle the general case of removing of directories with subdirectories Eric W. Biederman
@ 2009-05-21  0:27         ` Eric W. Biederman
  2009-05-21  0:28           ` [PATCH 06/20] sysfs: Use dentry_ops instead of directly playing with the dcache Eric W. Biederman
  2009-05-21  6:24           ` [PATCH 05/20] sysfs: Rename sysfs_d_iput to sysfs_dentry_iput Tejun Heo
  2009-05-21  6:23         ` [PATCH 04/20] sysfs: Handle the general case of removing of directories with subdirectories Tejun Heo
  1 sibling, 2 replies; 200+ messages in thread
From: Eric W. Biederman @ 2009-05-21  0:27 UTC (permalink / raw)
  To: Andrew Morton, Greg Kroah-Hartman
  Cc: linux-kernel, Tejun Heo, Cornelia Huck, linux-fsdevel,
	Eric W. Biederman, Eric W. Biederman

From: Eric W. Biederman <ebiederm@xmission.com>

Using dentry instead of d in the function name is what
several other filesystems are doing and it seems to be
a more readable convention.

Signed-off-by: Eric W. Biederman <ebiederm@aristanetworks.com>
---
 fs/sysfs/dir.c |    4 ++--
 1 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/fs/sysfs/dir.c b/fs/sysfs/dir.c
index 50702b3..01b1e40 100644
--- a/fs/sysfs/dir.c
+++ b/fs/sysfs/dir.c
@@ -294,7 +294,7 @@ void release_sysfs_dirent(struct sysfs_dirent * sd)
 		goto repeat;
 }
 
-static void sysfs_d_iput(struct dentry * dentry, struct inode * inode)
+static void sysfs_dentry_iput(struct dentry * dentry, struct inode * inode)
 {
 	struct sysfs_dirent * sd = dentry->d_fsdata;
 
@@ -303,7 +303,7 @@ static void sysfs_d_iput(struct dentry * dentry, struct inode * inode)
 }
 
 static const struct dentry_operations sysfs_dentry_ops = {
-	.d_iput		= sysfs_d_iput,
+	.d_iput		= sysfs_dentry_iput,
 };
 
 struct sysfs_dirent *sysfs_new_dirent(const char *name, umode_t mode, int type)
-- 
1.6.1.2.350.g88cc



* [PATCH 06/20] sysfs: Use dentry_ops instead of directly playing with the dcache
  2009-05-21  0:27         ` [PATCH 05/20] sysfs: Rename sysfs_d_iput to sysfs_dentry_iput Eric W. Biederman
@ 2009-05-21  0:28           ` Eric W. Biederman
  2009-05-21  0:28             ` [PATCH 07/20] sysfs: Simplify sysfs_chmod_file semantics Eric W. Biederman
  2009-05-21  6:41             ` [PATCH 06/20] sysfs: Use dentry_ops instead of directly playing with the dcache Tejun Heo
  2009-05-21  6:24           ` [PATCH 05/20] sysfs: Rename sysfs_d_iput to sysfs_dentry_iput Tejun Heo
  1 sibling, 2 replies; 200+ messages in thread
From: Eric W. Biederman @ 2009-05-21  0:28 UTC (permalink / raw)
  To: Andrew Morton, Greg Kroah-Hartman
  Cc: linux-kernel, Tejun Heo, Cornelia Huck, linux-fsdevel,
	Eric W. Biederman, Eric W. Biederman

From: Eric W. Biederman <ebiederm@xmission.com>

Calling d_drop unconditionally when a sysfs_dirent is deleted has
the potential to leak mounts, so instead implement dentry delete
and revalidate operations that cause sysfs dentries to be removed
at the appropriate time.

Signed-off-by: Eric W. Biederman <ebiederm@aristanetworks.com>
---
 fs/sysfs/dir.c |   73 +++++++++++++++++++++++++++++++++++--------------------
 1 files changed, 46 insertions(+), 27 deletions(-)

diff --git a/fs/sysfs/dir.c b/fs/sysfs/dir.c
index 01b1e40..8dd2abf 100644
--- a/fs/sysfs/dir.c
+++ b/fs/sysfs/dir.c
@@ -294,6 +294,46 @@ void release_sysfs_dirent(struct sysfs_dirent * sd)
 		goto repeat;
 }
 
+static int sysfs_dentry_delete(struct dentry *dentry)
+{
+	struct sysfs_dirent *sd = dentry->d_fsdata;
+	return !!(sd->s_flags & SYSFS_FLAG_REMOVED);
+}
+
+static int sysfs_dentry_revalidate(struct dentry *dentry, struct nameidata *nd)
+{
+	struct sysfs_dirent *sd = dentry->d_fsdata;
+	int is_dir;
+
+	mutex_lock(&sysfs_mutex);
+
+	/* The sysfs dirent has been deleted */
+	if (sd->s_flags & SYSFS_FLAG_REMOVED)
+		goto out_bad;
+
+	mutex_unlock(&sysfs_mutex);
+out_valid:
+	return 1;
+out_bad:
+	/* Remove the dentry from the dcache hashes.
+	 * If this is a deleted dentry we use d_drop instead of d_delete
+	 * so sysfs doesn't need to cope with negative dentries.
+	 */
+	is_dir = (sysfs_type(sd) == SYSFS_DIR);
+	mutex_unlock(&sysfs_mutex);
+	if (is_dir) {
+		/* If we have submounts we must allow the vfs caches
+		 * to lie about the state of the filesystem to prevent
+		 * leaks and other nasty things.
+		 */
+		if (have_submounts(dentry))
+			goto out_valid;
+		shrink_dcache_parent(dentry);
+	}
+	d_drop(dentry);
+	return 0;
+}
+
 static void sysfs_dentry_iput(struct dentry * dentry, struct inode * inode)
 {
 	struct sysfs_dirent * sd = dentry->d_fsdata;
@@ -303,6 +343,8 @@ static void sysfs_dentry_iput(struct dentry * dentry, struct inode * inode)
 }
 
 static const struct dentry_operations sysfs_dentry_ops = {
+	.d_revalidate	= sysfs_dentry_revalidate,
+	.d_delete	= sysfs_dentry_delete,
 	.d_iput		= sysfs_dentry_iput,
 };
 
@@ -493,44 +535,21 @@ void sysfs_remove_one(struct sysfs_addrm_cxt *acxt, struct sysfs_dirent *sd)
 }
 
 /**
- *	sysfs_drop_dentry - drop dentry for the specified sysfs_dirent
+ *	sysfs_dec_nlink - Decrement link count for the specified sysfs_dirent
  *	@sd: target sysfs_dirent
  *
- *	Drop dentry for @sd.  @sd must have been unlinked from its
+ *	Decrement nlink for @sd.  @sd must have been unlinked from its
  *	parent on entry to this function such that it can't be looked
  *	up anymore.
  */
-static void sysfs_drop_dentry(struct sysfs_dirent *sd)
+static void sysfs_dec_nlink(struct sysfs_dirent *sd)
 {
 	struct inode *inode;
-	struct dentry *dentry;
 
 	inode = ilookup(sysfs_sb, sd->s_ino);
 	if (!inode)
 		return;
 
-	/* Drop any existing dentries associated with sd.
-	 *
-	 * For the dentry to be properly freed we need to grab a
-	 * reference to the dentry under the dcache lock,  unhash it,
-	 * and then put it.  The playing with the dentry count allows
-	 * dput to immediately free the dentry  if it is not in use.
-	 */
-repeat:
-	spin_lock(&dcache_lock);
-	list_for_each_entry(dentry, &inode->i_dentry, d_alias) {
-		if (d_unhashed(dentry))
-			continue;
-		dget_locked(dentry);
-		spin_lock(&dentry->d_lock);
-		__d_drop(dentry);
-		spin_unlock(&dentry->d_lock);
-		spin_unlock(&dcache_lock);
-		dput(dentry);
-		goto repeat;
-	}
-	spin_unlock(&dcache_lock);
-
 	/* adjust nlink and update timestamp */
 	mutex_lock(&inode->i_mutex);
 
@@ -577,7 +596,7 @@ void sysfs_addrm_finish(struct sysfs_addrm_cxt *acxt)
 		acxt->removed = sd->s_sibling;
 		sd->s_sibling = NULL;
 
-		sysfs_drop_dentry(sd);
+		sysfs_dec_nlink(sd);
 		sysfs_deactivate(sd);
 		unmap_bin_file(sd);
 		sysfs_put(sd);
-- 
1.6.1.2.350.g88cc
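
The lazy invalidation scheme in the patch above can be modelled in
userspace.  The sketch below is an assumption-laden simplification:
the model_* types and functions are hypothetical, and the submount
check is applied to every entry rather than only to directories.  It
shows the division of labour: removal only flags the backing object,
and the cached entry is dropped later, when a lookup revalidates it or
when its last reference goes away.

```c
#include <assert.h>
#include <stddef.h>

#define MODEL_FLAG_REMOVED 0x1

/* Backing object: the filesystem's own record of an entry. */
struct model_dirent {
	unsigned int flags;
};

/* Cache entry: what the VFS layer would hold on to. */
struct model_dentry {
	struct model_dirent *backing;
	int hashed;		/* 1 while lookups can still find it */
	int has_submounts;	/* 1 if something is mounted below */
};

/* Removal only marks the backing object; the cache is not touched. */
static void model_remove(struct model_dirent *sd)
{
	sd->flags |= MODEL_FLAG_REMOVED;
}

/* Called on the next cache hit.  Returns 1 if the entry may be used,
 * 0 if it was stale and has been unhashed. */
static int model_revalidate(struct model_dentry *dentry)
{
	assert(dentry->backing != NULL);
	if (!(dentry->backing->flags & MODEL_FLAG_REMOVED))
		return 1;
	/* Keep stale entries with submounts so the mounts stay reachable. */
	if (dentry->has_submounts)
		return 1;
	dentry->hashed = 0;
	return 0;
}

/* Called when the last reference goes away: 1 means "do not cache". */
static int model_delete(const struct model_dentry *dentry)
{
	return !!(dentry->backing->flags & MODEL_FLAG_REMOVED);
}
```

The design choice this illustrates is that the remover never walks the
cache; stale entries are discovered by whoever touches them next.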



* [PATCH 07/20] sysfs: Simplify sysfs_chmod_file semantics
  2009-05-21  0:28           ` [PATCH 06/20] sysfs: Use dentry_ops instead of directly playing with the dcache Eric W. Biederman
@ 2009-05-21  0:28             ` Eric W. Biederman
  2009-05-21  0:28               ` [PATCH 08/20] sysfs: Optimize just changing the sysfs file mode Eric W. Biederman
  2009-05-21  6:42               ` [PATCH 07/20] sysfs: Simplify sysfs_chmod_file semantics Tejun Heo
  2009-05-21  6:41             ` [PATCH 06/20] sysfs: Use dentry_ops instead of directly playing with the dcache Tejun Heo
  1 sibling, 2 replies; 200+ messages in thread
From: Eric W. Biederman @ 2009-05-21  0:28 UTC (permalink / raw)
  To: Andrew Morton, Greg Kroah-Hartman
  Cc: linux-kernel, Tejun Heo, Cornelia Huck, linux-fsdevel,
	Eric W. Biederman, Eric W. Biederman

From: Eric W. Biederman <ebiederm@xmission.com>

Currently every caller of sysfs_chmod_file runs either at
file creation time, to set a non-default mode, or in response
to a specific change in policy requested from user space.  That
makes the timestamp of when the chmod happens, and notification
of the file changing mode, uninteresting.

Remove the unnecessary timestamp and filesystem change
notification, which also removes the last of the explicit inotify
and dnotify support from sysfs.

Signed-off-by: Eric W. Biederman <ebiederm@aristanetworks.com>
---
 fs/sysfs/file.c |   10 +---------
 1 files changed, 1 insertions(+), 9 deletions(-)

diff --git a/fs/sysfs/file.c b/fs/sysfs/file.c
index b1606e0..0786b41 100644
--- a/fs/sysfs/file.c
+++ b/fs/sysfs/file.c
@@ -602,17 +602,9 @@ int sysfs_chmod_file(struct kobject *kobj, struct attribute *attr, mode_t mode)
 	mutex_lock(&inode->i_mutex);
 
 	newattrs.ia_mode = (mode & S_IALLUGO) | (inode->i_mode & ~S_IALLUGO);
-	newattrs.ia_valid = ATTR_MODE | ATTR_CTIME;
-	newattrs.ia_ctime = current_fs_time(inode->i_sb);
+	newattrs.ia_valid = ATTR_MODE;
 	rc = sysfs_setattr(victim, &newattrs);
 
-	if (rc == 0) {
-		fsnotify_change(victim, newattrs.ia_valid);
-		mutex_lock(&sysfs_mutex);
-		victim_sd->s_mode = newattrs.ia_mode;
-		mutex_unlock(&sysfs_mutex);
-	}
-
 	mutex_unlock(&inode->i_mutex);
  out:
 	dput(victim);
-- 
1.6.1.2.350.g88cc



* [PATCH 08/20] sysfs: Optimize just changing the sysfs file mode.
  2009-05-21  0:28             ` [PATCH 07/20] sysfs: Simplify sysfs_chmod_file semantics Eric W. Biederman
@ 2009-05-21  0:28               ` Eric W. Biederman
  2009-05-21  0:28                 ` [PATCH 09/20] sysfs: Simplify iattr assignments Eric W. Biederman
  2009-05-21  7:29                 ` [PATCH 08/20] sysfs: Optimize just changing the sysfs file mode Tejun Heo
  2009-05-21  6:42               ` [PATCH 07/20] sysfs: Simplify sysfs_chmod_file semantics Tejun Heo
  1 sibling, 2 replies; 200+ messages in thread
From: Eric W. Biederman @ 2009-05-21  0:28 UTC (permalink / raw)
  To: Andrew Morton, Greg Kroah-Hartman
  Cc: linux-kernel, Tejun Heo, Cornelia Huck, linux-fsdevel,
	Eric W. Biederman, Eric W. Biederman

From: Eric W. Biederman <ebiederm@xmission.com>

Don't allocate a struct iattr for the sysfs dentry if just
the mode changes because we have a field for that on the
sysfs_dirent, and we can trigger that case with sysfs_chmod_file.

Signed-off-by: Eric W. Biederman <ebiederm@aristanetworks.com>
---
 fs/sysfs/inode.c |   22 ++++++++++++++--------
 1 files changed, 14 insertions(+), 8 deletions(-)

diff --git a/fs/sysfs/inode.c b/fs/sysfs/inode.c
index 555f0ff..70ff2a2 100644
--- a/fs/sysfs/inode.c
+++ b/fs/sysfs/inode.c
@@ -60,12 +60,16 @@ int sysfs_setattr(struct dentry * dentry, struct iattr * iattr)
 		return error;
 
 	iattr->ia_valid &= ~ATTR_SIZE; /* ignore size changes */
+	if (iattr->ia_valid & ATTR_MODE) {
+		if (!in_group_p(inode->i_gid) && !capable(CAP_FSETID))
+			iattr->ia_mode &= ~S_ISGID;
+	}
 
 	error = inode_setattr(inode, iattr);
 	if (error)
 		return error;
 
-	if (!sd_iattr) {
+	if (!sd_iattr && (ia_valid & ~ATTR_MODE)) {
 		/* setting attributes for the first time, allocate now */
 		sd_iattr = kzalloc(sizeof(struct iattr), GFP_KERNEL);
 		if (!sd_iattr)
@@ -78,6 +82,13 @@ int sysfs_setattr(struct dentry * dentry, struct iattr * iattr)
 		sd->s_iattr = sd_iattr;
 	}
 
+	if (ia_valid & ATTR_MODE)
+		sd->s_mode = iattr->ia_mode;
+
+	/* If we don't need the extra attributes leave */
+	if (!sd_iattr)
+		return 0;
+
 	/* attributes were changed atleast once in past */
 
 	if (ia_valid & ATTR_UID)
@@ -93,13 +104,8 @@ int sysfs_setattr(struct dentry * dentry, struct iattr * iattr)
 	if (ia_valid & ATTR_CTIME)
 		sd_iattr->ia_ctime = timespec_trunc(iattr->ia_ctime,
 						inode->i_sb->s_time_gran);
-	if (ia_valid & ATTR_MODE) {
-		umode_t mode = iattr->ia_mode;
-
-		if (!in_group_p(inode->i_gid) && !capable(CAP_FSETID))
-			mode &= ~S_ISGID;
-		sd_iattr->ia_mode = sd->s_mode = mode;
-	}
+	if (ia_valid & ATTR_MODE)
+		sd_iattr->ia_mode = iattr->ia_mode;
 
 	return error;
 }
-- 
1.6.1.2.350.g88cc
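
The allocation policy described in the commit message above can be
sketched in userspace.  All names here (struct model_dirent, struct
model_iattr, model_setattr, the MODEL_ATTR_* flags) are hypothetical
and the permission checks are omitted; the sketch only shows that a
mode-only change is stored in the always-present mode field and does
not trigger allocation of the optional attribute block.

```c
#include <assert.h>
#include <stdlib.h>

#define MODEL_ATTR_MODE 0x1
#define MODEL_ATTR_UID  0x2

struct model_iattr {		/* optional, heap-allocated extras */
	unsigned int uid;
	unsigned int mode;
};

struct model_dirent {
	unsigned int mode;		/* always present */
	struct model_iattr *iattr;	/* NULL until more than mode changes */
};

/* Apply the fields selected by valid.
 * Returns 0 on success, -1 if the allocation fails. */
static int model_setattr(struct model_dirent *sd, unsigned int valid,
			 unsigned int mode, unsigned int uid)
{
	assert(sd != NULL);

	/* Allocate only when a field other than the mode is being set. */
	if (!sd->iattr && (valid & ~MODEL_ATTR_MODE)) {
		sd->iattr = calloc(1, sizeof(*sd->iattr));
		if (!sd->iattr)
			return -1;
		sd->iattr->mode = sd->mode;
	}

	if (valid & MODEL_ATTR_MODE)
		sd->mode = mode;

	/* Mode-only change on an entry with no extras: nothing more to do. */
	if (!sd->iattr)
		return 0;

	if (valid & MODEL_ATTR_UID)
		sd->iattr->uid = uid;
	if (valid & MODEL_ATTR_MODE)
		sd->iattr->mode = mode;
	return 0;
}
```

The caller owns sd->iattr once it has been allocated and is expected
to free it when the entry goes away.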



* [PATCH 09/20] sysfs: Simplify iattr assignments
  2009-05-21  0:28               ` [PATCH 08/20] sysfs: Optimize just changing the sysfs file mode Eric W. Biederman
@ 2009-05-21  0:28                 ` Eric W. Biederman
  2009-05-21  0:28                   ` [PATCH 10/20] sysfs: Fix locking and factor out sysfs_sd_setattr Eric W. Biederman
  2009-05-21  7:31                   ` [PATCH 09/20] sysfs: Simplify iattr assignments Tejun Heo
  2009-05-21  7:29                 ` [PATCH 08/20] sysfs: Optimize just changing the sysfs file mode Tejun Heo
  1 sibling, 2 replies; 200+ messages in thread
From: Eric W. Biederman @ 2009-05-21  0:28 UTC (permalink / raw)
  To: Andrew Morton, Greg Kroah-Hartman
  Cc: linux-kernel, Tejun Heo, Cornelia Huck, linux-fsdevel,
	Eric W. Biederman, Eric W. Biederman

From: Eric W. Biederman <ebiederm@xmission.com>

The granularity of sysfs time, when we keep it, is 1 ns, which
makes timespec_trunc a nop.  So remove the unnecessary function
calls, making sysfs_setattr slightly easier to read.
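
As a rough illustration of why truncation at 1 ns granularity changes
nothing, here is a userspace sketch.  It is a simplified model, not the
kernel's implementation: struct model_ts, model_trunc and
model_trunc_is_nop are hypothetical names made up for this example.

```c
#include <assert.h>

/* Userspace stand-in for a timestamp; not the kernel's struct timespec. */
struct model_ts {
	long sec;
	long nsec;
};

/* Round nsec down to a multiple of gran (in nanoseconds). */
static struct model_ts model_trunc(struct model_ts t, long gran)
{
	assert(gran >= 1);
	t.nsec -= t.nsec % gran;
	return t;
}

/* Returns 1 when truncating at the given granularity leaves t unchanged. */
static int model_trunc_is_nop(struct model_ts t, long gran)
{
	struct model_ts r = model_trunc(t, gran);

	return r.sec == t.sec && r.nsec == t.nsec;
}
```

Every nanosecond count is a multiple of 1, so with gran == 1 the
subtraction always removes zero.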

Signed-off-by: Eric W. Biederman <ebiederm@aristanetworks.com>
---
 fs/sysfs/inode.c |    9 +++------
 1 files changed, 3 insertions(+), 6 deletions(-)

diff --git a/fs/sysfs/inode.c b/fs/sysfs/inode.c
index 70ff2a2..5020a1d 100644
--- a/fs/sysfs/inode.c
+++ b/fs/sysfs/inode.c
@@ -96,14 +96,11 @@ int sysfs_setattr(struct dentry * dentry, struct iattr * iattr)
 	if (ia_valid & ATTR_GID)
 		sd_iattr->ia_gid = iattr->ia_gid;
 	if (ia_valid & ATTR_ATIME)
-		sd_iattr->ia_atime = timespec_trunc(iattr->ia_atime,
-						inode->i_sb->s_time_gran);
+		sd_iattr->ia_atime = iattr->ia_atime;
 	if (ia_valid & ATTR_MTIME)
-		sd_iattr->ia_mtime = timespec_trunc(iattr->ia_mtime,
-						inode->i_sb->s_time_gran);
+		sd_iattr->ia_mtime = iattr->ia_mtime;
 	if (ia_valid & ATTR_CTIME)
-		sd_iattr->ia_ctime = timespec_trunc(iattr->ia_ctime,
-						inode->i_sb->s_time_gran);
+		sd_iattr->ia_ctime = iattr->ia_ctime;
 	if (ia_valid & ATTR_MODE)
 		sd_iattr->ia_mode = iattr->ia_mode;
 
-- 
1.6.1.2.350.g88cc


^ permalink raw reply related	[flat|nested] 200+ messages in thread

* [PATCH 10/20] sysfs: Fix locking and factor out sysfs_sd_setattr
  2009-05-21  0:28                 ` [PATCH 09/20] sysfs: Simplify iattr assignments Eric W. Biederman
@ 2009-05-21  0:28                   ` Eric W. Biederman
  2009-05-21  0:28                     ` [PATCH 11/20] sysfs: Update s_iattr on link and unlink Eric W. Biederman
  2009-05-21  7:42                     ` [PATCH 10/20] sysfs: Fix locking and factor out sysfs_sd_setattr Tejun Heo
  2009-05-21  7:31                   ` [PATCH 09/20] sysfs: Simplify iattr assignments Tejun Heo
  1 sibling, 2 replies; 200+ messages in thread
From: Eric W. Biederman @ 2009-05-21  0:28 UTC (permalink / raw)
  To: Andrew Morton, Greg Kroah-Hartman
  Cc: linux-kernel, Tejun Heo, Cornelia Huck, linux-fsdevel,
	Eric W. Biederman, Eric W. Biederman

From: Eric W. Biederman <ebiederm@xmission.com>

Cleanly separate the work that is specific to setting the
attributes of a sysfs_dirent from what is needed to update
the attributes of a vfs inode.

Additionally grab the sysfs_mutex to keep any nasties from
surprising us when updating the sysfs_dirent.

Signed-off-by: Eric W. Biederman <ebiederm@aristanetworks.com>
---
 fs/sysfs/inode.c |   52 ++++++++++++++++++++++++++++++----------------------
 fs/sysfs/sysfs.h |    1 +
 2 files changed, 31 insertions(+), 22 deletions(-)

diff --git a/fs/sysfs/inode.c b/fs/sysfs/inode.c
index 5020a1d..dd154cb 100644
--- a/fs/sysfs/inode.c
+++ b/fs/sysfs/inode.c
@@ -42,33 +42,12 @@ int __init sysfs_inode_init(void)
 	return bdi_init(&sysfs_backing_dev_info);
 }
 
-int sysfs_setattr(struct dentry * dentry, struct iattr * iattr)
+int sysfs_sd_setattr(struct sysfs_dirent *sd, struct iattr * iattr)
 {
-	struct inode * inode = dentry->d_inode;
-	struct sysfs_dirent * sd = dentry->d_fsdata;
 	struct iattr * sd_iattr;
 	unsigned int ia_valid = iattr->ia_valid;
-	int error;
-
-	if (!sd)
-		return -EINVAL;
 
 	sd_iattr = sd->s_iattr;
-
-	error = inode_change_ok(inode, iattr);
-	if (error)
-		return error;
-
-	iattr->ia_valid &= ~ATTR_SIZE; /* ignore size changes */
-	if (iattr->ia_valid & ATTR_MODE) {
-		if (!in_group_p(inode->i_gid) && !capable(CAP_FSETID))
-			iattr->ia_mode &= ~S_ISGID;
-	}
-
-	error = inode_setattr(inode, iattr);
-	if (error)
-		return error;
-
 	if (!sd_iattr && (ia_valid & ~ATTR_MODE)) {
 		/* setting attributes for the first time, allocate now */
 		sd_iattr = kzalloc(sizeof(struct iattr), GFP_KERNEL);
@@ -103,6 +82,35 @@ int sysfs_setattr(struct dentry * dentry, struct iattr * iattr)
 		sd_iattr->ia_ctime = iattr->ia_ctime;
 	if (ia_valid & ATTR_MODE)
 		sd_iattr->ia_mode = iattr->ia_mode;
+	return 0;
+}
+
+int sysfs_setattr(struct dentry * dentry, struct iattr * iattr)
+{
+	struct inode * inode = dentry->d_inode;
+	struct sysfs_dirent * sd = dentry->d_fsdata;
+	int error;
+
+	if (!sd)
+		return -EINVAL;
+
+	error = inode_change_ok(inode, iattr);
+	if (error)
+		return error;
+
+	iattr->ia_valid &= ~ATTR_SIZE; /* ignore size changes */
+	if (iattr->ia_valid & ATTR_MODE) {
+		if (!in_group_p(inode->i_gid) && !capable(CAP_FSETID))
+			iattr->ia_mode &= ~S_ISGID;
+	}
+
+	error = inode_setattr(inode, iattr);
+	if (error)
+		return error;
+
+	mutex_lock(&sysfs_mutex);
+	error = sysfs_sd_setattr(sd, iattr);
+	mutex_unlock(&sysfs_mutex);
 
 	return error;
 }
diff --git a/fs/sysfs/sysfs.h b/fs/sysfs/sysfs.h
index abf05f4..043bb13 100644
--- a/fs/sysfs/sysfs.h
+++ b/fs/sysfs/sysfs.h
@@ -146,6 +146,7 @@ static inline void __sysfs_put(struct sysfs_dirent *sd)
  */
 struct inode *sysfs_get_inode(struct sysfs_dirent *sd);
 void sysfs_delete_inode(struct inode *inode);
+int sysfs_sd_setattr(struct sysfs_dirent *sd, struct iattr *iattr);
 int sysfs_setattr(struct dentry *dentry, struct iattr *iattr);
 int sysfs_hash_and_remove(struct sysfs_dirent *dir_sd, const char *name);
 int sysfs_inode_init(void);
-- 
1.6.1.2.350.g88cc


^ permalink raw reply related	[flat|nested] 200+ messages in thread

* [PATCH 11/20] sysfs: Update s_iattr on link and unlink.
  2009-05-21  0:28                   ` [PATCH 10/20] sysfs: Fix locking and factor out sysfs_sd_setattr Eric W. Biederman
@ 2009-05-21  0:28                     ` Eric W. Biederman
  2009-05-21  0:28                       ` [PATCH 12/20] sysfs: Nicely indent sysfs_symlink_inode_operations Eric W. Biederman
  2009-05-21  8:42                       ` [PATCH 11/20] sysfs: Update s_iattr on link and unlink Tejun Heo
  2009-05-21  7:42                     ` [PATCH 10/20] sysfs: Fix locking and factor out sysfs_sd_setattr Tejun Heo
  1 sibling, 2 replies; 200+ messages in thread
From: Eric W. Biederman @ 2009-05-21  0:28 UTC (permalink / raw)
  To: Andrew Morton, Greg Kroah-Hartman
  Cc: linux-kernel, Tejun Heo, Cornelia Huck, linux-fsdevel,
	Eric W. Biederman, Eric W. Biederman

From: Eric W. Biederman <ebiederm@xmission.com>

Currently sysfs updates the timestamps on the vfs directory
inode when we create or remove a directory entry but doesn't
update the cached copy on the sysfs_dirent.  Fix that oversight.
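
The update itself can be pictured with a small userspace sketch.  The
types and the model_touch_parent helper are hypothetical stand-ins for
this example; the only behaviour taken from the patch is that the
timestamps are written only when the parent already has an allocated
attribute copy.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for the optional attribute copy hanging off a directory entry. */
struct model_iattr {
	long ctime;
	long mtime;
};

struct model_dirent {
	struct model_iattr *iattr;	/* NULL until attributes are first set */
};

/* Record a link/unlink time on the parent, if it keeps attributes at all. */
static void model_touch_parent(struct model_dirent *parent, long now)
{
	assert(parent != NULL);
	if (parent->iattr)
		parent->iattr->ctime = parent->iattr->mtime = now;
}
```

A parent without an attribute copy is simply skipped, which mirrors the
NULL check in the diff below.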

Signed-off-by: Eric W. Biederman <ebiederm@aristanetworks.com>
---
 fs/sysfs/dir.c |   14 ++++++++++++++
 1 files changed, 14 insertions(+), 0 deletions(-)

diff --git a/fs/sysfs/dir.c b/fs/sysfs/dir.c
index 8dd2abf..94b926f 100644
--- a/fs/sysfs/dir.c
+++ b/fs/sysfs/dir.c
@@ -480,6 +480,8 @@ static char *sysfs_pathname(struct sysfs_dirent *sd, char *path)
  */
 int sysfs_add_one(struct sysfs_addrm_cxt *acxt, struct sysfs_dirent *sd)
 {
+	struct iattr *ps_iattr;
+
 	if (sysfs_find_dirent(acxt->parent_sd, sd->s_name)) {
 		char *path = kzalloc(PATH_MAX, GFP_KERNEL);
 		WARN(1, KERN_WARNING
@@ -500,6 +502,11 @@ int sysfs_add_one(struct sysfs_addrm_cxt *acxt, struct sysfs_dirent *sd)
 
 	sysfs_link_sibling(sd);
 
+	/* Update timestamps on the parent */
+	ps_iattr = acxt->parent_sd->s_iattr;
+	if (ps_iattr)
+		ps_iattr->ia_ctime = ps_iattr->ia_mtime = CURRENT_TIME;
+
 	return 0;
 }
 
@@ -520,10 +527,17 @@ int sysfs_add_one(struct sysfs_addrm_cxt *acxt, struct sysfs_dirent *sd)
  */
 void sysfs_remove_one(struct sysfs_addrm_cxt *acxt, struct sysfs_dirent *sd)
 {
+	struct iattr *ps_iattr;
+
 	BUG_ON(sd->s_flags & SYSFS_FLAG_REMOVED);
 
 	sysfs_unlink_sibling(sd);
 
+	/* Update timestamps on the parent */
+	ps_iattr = acxt->parent_sd->s_iattr;
+	if (ps_iattr)
+		ps_iattr->ia_ctime = ps_iattr->ia_mtime = CURRENT_TIME;
+
 	sd->s_flags |= SYSFS_FLAG_REMOVED;
 	sd->s_sibling = acxt->removed;
 	acxt->removed = sd;
-- 
1.6.1.2.350.g88cc


^ permalink raw reply related	[flat|nested] 200+ messages in thread

* [PATCH 12/20] sysfs: Nicely indent sysfs_symlink_inode_operations
  2009-05-21  0:28                     ` [PATCH 11/20] sysfs: Update s_iattr on link and unlink Eric W. Biederman
@ 2009-05-21  0:28                       ` Eric W. Biederman
  2009-05-21  0:28                         ` [PATCH 13/20] sysfs: Implement sysfs_getattr & sysfs_permission Eric W. Biederman
  2009-05-21  7:42                         ` [PATCH 12/20] sysfs: Nicely indent sysfs_symlink_inode_operations Tejun Heo
  2009-05-21  8:42                       ` [PATCH 11/20] sysfs: Update s_iattr on link and unlink Tejun Heo
  1 sibling, 2 replies; 200+ messages in thread
From: Eric W. Biederman @ 2009-05-21  0:28 UTC (permalink / raw)
  To: Andrew Morton, Greg Kroah-Hartman
  Cc: linux-kernel, Tejun Heo, Cornelia Huck, linux-fsdevel,
	Eric W. Biederman, Eric W. Biederman

From: Eric W. Biederman <ebiederm@xmission.com>

Lining up the functions in sysfs_symlink_inode_operations
follows the pattern in the rest of sysfs and makes things
slightly more readable.

Signed-off-by: Eric W. Biederman <ebiederm@aristanetworks.com>
---
 fs/sysfs/symlink.c |    6 +++---
 1 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/fs/sysfs/symlink.c b/fs/sysfs/symlink.c
index ac13e61..0367ed1 100644
--- a/fs/sysfs/symlink.c
+++ b/fs/sysfs/symlink.c
@@ -198,9 +198,9 @@ static void sysfs_put_link(struct dentry *dentry, struct nameidata *nd, void *co
 }
 
 const struct inode_operations sysfs_symlink_inode_operations = {
-	.readlink = generic_readlink,
-	.follow_link = sysfs_follow_link,
-	.put_link = sysfs_put_link,
+	.readlink	= generic_readlink,
+	.follow_link	= sysfs_follow_link,
+	.put_link	= sysfs_put_link,
 };
 
 
-- 
1.6.1.2.350.g88cc


^ permalink raw reply related	[flat|nested] 200+ messages in thread

* [PATCH 13/20] sysfs: Implement sysfs_getattr & sysfs_permission
  2009-05-21  0:28                       ` [PATCH 12/20] sysfs: Nicely indent sysfs_symlink_inode_operations Eric W. Biederman
@ 2009-05-21  0:28                         ` Eric W. Biederman
  2009-05-21  0:28                           ` [PATCH 14/20] sysfs: In sysfs_chmod_file lazily propagate the mode change Eric W. Biederman
  2009-05-21  9:14                           ` [PATCH 13/20] sysfs: Implement sysfs_getattr & sysfs_permission Tejun Heo
  2009-05-21  7:42                         ` [PATCH 12/20] sysfs: Nicely indent sysfs_symlink_inode_operations Tejun Heo
  1 sibling, 2 replies; 200+ messages in thread
From: Eric W. Biederman @ 2009-05-21  0:28 UTC (permalink / raw)
  To: Andrew Morton, Greg Kroah-Hartman
  Cc: linux-kernel, Tejun Heo, Cornelia Huck, linux-fsdevel,
	Eric W. Biederman, Eric W. Biederman

From: Eric W. Biederman <ebiederm@xmission.com>

With the implementation of sysfs_getattr and sysfs_permission
sysfs becomes able to lazily propagate inode attribute changes
from the sysfs_dirents to the vfs inodes.   This paves the way
for deleting significant chunks of now unnecessary code.
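
The lazy scheme can be sketched in userspace as follows.  This is a
model under stated assumptions, not the kernel code: the model_* types
and functions are hypothetical, locking is left out, and only the mode,
uid and gid fields are carried.

```c
#include <assert.h>
#include <stddef.h>

struct model_iattr {
	unsigned uid;
	unsigned gid;
};

/* Authoritative copy, playing the role of the sysfs_dirent. */
struct model_dirent {
	unsigned mode;
	struct model_iattr *iattr;	/* NULL means default attributes */
};

/* Cached copy, playing the role of the vfs inode. */
struct model_inode {
	unsigned mode;
	unsigned uid;
	unsigned gid;
};

/* Pull the current attributes from the dirent into the cached inode. */
static void model_refresh_inode(const struct model_dirent *sd,
				struct model_inode *inode)
{
	assert(sd != NULL && inode != NULL);
	inode->mode = sd->mode;
	if (sd->iattr) {
		inode->uid = sd->iattr->uid;
		inode->gid = sd->iattr->gid;
	}
}

/* Readers refresh first, so writers never have to touch the inode. */
static unsigned model_getattr_mode(const struct model_dirent *sd,
				   struct model_inode *inode)
{
	model_refresh_inode(sd, inode);
	return inode->mode;
}
```

The point of the design is that a change to the dirent leaves the inode
stale until the next reader asks, at which point it is brought up to
date.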

Signed-off-by: Eric W. Biederman <ebiederm@aristanetworks.com>
---
 fs/sysfs/dir.c     |    2 +
 fs/sysfs/inode.c   |   54 ++++++++++++++++++++++++++++++++++++++++-----------
 fs/sysfs/symlink.c |    3 ++
 fs/sysfs/sysfs.h   |    2 +
 4 files changed, 49 insertions(+), 12 deletions(-)

diff --git a/fs/sysfs/dir.c b/fs/sysfs/dir.c
index 94b926f..d1b52b2 100644
--- a/fs/sysfs/dir.c
+++ b/fs/sysfs/dir.c
@@ -763,6 +763,8 @@ static struct dentry * sysfs_lookup(struct inode *dir, struct dentry *dentry,
 const struct inode_operations sysfs_dir_inode_operations = {
 	.lookup		= sysfs_lookup,
 	.setattr	= sysfs_setattr,
+	.getattr	= sysfs_getattr,
+	.permission	= sysfs_permission,
 };
 
 static struct sysfs_dirent *sysfs_get_one(struct sysfs_dirent *dir_sd)
diff --git a/fs/sysfs/inode.c b/fs/sysfs/inode.c
index dd154cb..1b7ed3c 100644
--- a/fs/sysfs/inode.c
+++ b/fs/sysfs/inode.c
@@ -35,6 +35,8 @@ static struct backing_dev_info sysfs_backing_dev_info = {
 
 static const struct inode_operations sysfs_inode_operations ={
 	.setattr	= sysfs_setattr,
+	.getattr	= sysfs_getattr,
+	.permission	= sysfs_permission,
 };
 
 int __init sysfs_inode_init(void)
@@ -123,7 +125,6 @@ static inline void set_default_inode_attr(struct inode * inode, mode_t mode)
 
 static inline void set_inode_attr(struct inode * inode, struct iattr * iattr)
 {
-	inode->i_mode = iattr->ia_mode;
 	inode->i_uid = iattr->ia_uid;
 	inode->i_gid = iattr->ia_gid;
 	inode->i_atime = iattr->ia_atime;
@@ -154,6 +155,33 @@ static int sysfs_count_nlink(struct sysfs_dirent *sd)
 	return nr + 2;
 }
 
+static void sysfs_refresh_inode(struct sysfs_dirent *sd, struct inode *inode)
+{
+	inode->i_mode = sd->s_mode;
+	if (sd->s_iattr) {
+		/* sysfs_dirent has non-default attributes
+		 * get them from persistent copy in sysfs_dirent
+		 */
+		set_inode_attr(inode, sd->s_iattr);
+	}
+
+	if (sysfs_type(sd) == SYSFS_DIR)
+		inode->i_nlink = sysfs_count_nlink(sd);
+}
+
+int sysfs_getattr(struct vfsmount *mnt, struct dentry *dentry, struct kstat *stat)
+{
+	struct sysfs_dirent *sd = dentry->d_fsdata;
+	struct inode *inode = dentry->d_inode;
+
+	mutex_lock(&sysfs_mutex);
+	sysfs_refresh_inode(sd, inode);
+	mutex_unlock(&sysfs_mutex);
+
+	generic_fillattr(inode, stat);
+	return 0;
+}
+
 static void sysfs_init_inode(struct sysfs_dirent *sd, struct inode *inode)
 {
 	struct bin_attribute *bin_attr;
@@ -162,25 +190,16 @@ static void sysfs_init_inode(struct sysfs_dirent *sd, struct inode *inode)
 	inode->i_mapping->a_ops = &sysfs_aops;
 	inode->i_mapping->backing_dev_info = &sysfs_backing_dev_info;
 	inode->i_op = &sysfs_inode_operations;
-	inode->i_ino = sd->s_ino;
 	lockdep_set_class(&inode->i_mutex, &sysfs_inode_imutex_key);
 
-	if (sd->s_iattr) {
-		/* sysfs_dirent has non-default attributes
-		 * get them for the new inode from persistent copy
-		 * in sysfs_dirent
-		 */
-		set_inode_attr(inode, sd->s_iattr);
-	} else
-		set_default_inode_attr(inode, sd->s_mode);
-
+	set_default_inode_attr(inode, sd->s_mode);
+	sysfs_refresh_inode(sd, inode);
 
 	/* initialize inode according to type */
 	switch (sysfs_type(sd)) {
 	case SYSFS_DIR:
 		inode->i_op = &sysfs_dir_inode_operations;
 		inode->i_fop = &sysfs_dir_operations;
-		inode->i_nlink = sysfs_count_nlink(sd);
 		break;
 	case SYSFS_KOBJ_ATTR:
 		inode->i_size = PAGE_SIZE;
@@ -263,3 +282,14 @@ int sysfs_hash_and_remove(struct sysfs_dirent *dir_sd, const char *name)
 	else
 		return -ENOENT;
 }
+
+int sysfs_permission(struct inode *inode, int mask)
+{
+	struct sysfs_dirent *sd = inode->i_private;
+
+	mutex_lock(&sysfs_mutex);
+	sysfs_refresh_inode(sd, inode);
+	mutex_unlock(&sysfs_mutex);
+
+	return generic_permission(inode, mask, NULL);
+}
diff --git a/fs/sysfs/symlink.c b/fs/sysfs/symlink.c
index 0367ed1..05e4984 100644
--- a/fs/sysfs/symlink.c
+++ b/fs/sysfs/symlink.c
@@ -201,6 +201,9 @@ const struct inode_operations sysfs_symlink_inode_operations = {
 	.readlink	= generic_readlink,
 	.follow_link	= sysfs_follow_link,
 	.put_link	= sysfs_put_link,
+	.setattr	= sysfs_setattr,
+	.getattr	= sysfs_getattr,
+	.permission	= sysfs_permission,
 };
 
 
diff --git a/fs/sysfs/sysfs.h b/fs/sysfs/sysfs.h
index 043bb13..f5b53cf 100644
--- a/fs/sysfs/sysfs.h
+++ b/fs/sysfs/sysfs.h
@@ -148,6 +148,8 @@ struct inode *sysfs_get_inode(struct sysfs_dirent *sd);
 void sysfs_delete_inode(struct inode *inode);
 int sysfs_sd_setattr(struct sysfs_dirent *sd, struct iattr *iattr);
 int sysfs_setattr(struct dentry *dentry, struct iattr *iattr);
+int sysfs_getattr(struct vfsmount *mnt, struct dentry *dentry, struct kstat *stat);
+int sysfs_permission(struct inode *inode, int mask);
 int sysfs_hash_and_remove(struct sysfs_dirent *dir_sd, const char *name);
 int sysfs_inode_init(void);
 
-- 
1.6.1.2.350.g88cc


^ permalink raw reply related	[flat|nested] 200+ messages in thread

* [PATCH 14/20] sysfs: In sysfs_chmod_file lazily propagate the mode change.
  2009-05-21  0:28                         ` [PATCH 13/20] sysfs: Implement sysfs_getattr & sysfs_permission Eric W. Biederman
@ 2009-05-21  0:28                           ` Eric W. Biederman
  2009-05-21  0:28                             ` [PATCH 15/20] sysfs: Kill sysfs_addrm_start and sysfs_addrm_finish Eric W. Biederman
  2009-05-21  9:16                             ` [PATCH 14/20] sysfs: In sysfs_chmod_file lazily propagate the mode change Tejun Heo
  2009-05-21  9:14                           ` [PATCH 13/20] sysfs: Implement sysfs_getattr & sysfs_permission Tejun Heo
  1 sibling, 2 replies; 200+ messages in thread
From: Eric W. Biederman @ 2009-05-21  0:28 UTC (permalink / raw)
  To: Andrew Morton, Greg Kroah-Hartman
  Cc: linux-kernel, Tejun Heo, Cornelia Huck, linux-fsdevel,
	Eric W. Biederman, Eric W. Biederman

From: Eric W. Biederman <ebiederm@xmission.com>

Now that sysfs_getattr and sysfs_permission refresh the vfs
inode there is no need to immediately push the mode change
into the vfs cache.  This reduces the amount of work needed
and simplifies the locking.
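
The mode arithmetic used by sysfs_chmod_file can be checked with a small
userspace sketch.  The MODEL_* constants and model_chmod_mode are
hypothetical names for this example; the values mirror the usual octal
definitions (07777 for the permission and setuid/setgid/sticky bits,
0170000 for the file type mask).

```c
#include <assert.h>

#define MODEL_S_IFMT	0170000u	/* file type bits */
#define MODEL_S_IALLUGO	0007777u	/* permission + suid/sgid/sticky bits */

/* Take permission bits from new_perm, keep everything else from old_mode. */
static unsigned model_chmod_mode(unsigned old_mode, unsigned new_perm)
{
	unsigned merged = (new_perm & MODEL_S_IALLUGO) |
			  (old_mode & ~MODEL_S_IALLUGO);

	/* The file type must survive a chmod untouched. */
	assert((merged & MODEL_S_IFMT) == (old_mode & MODEL_S_IFMT));
	return merged;
}
```

Because the merge needs only the old mode, it can read that from the
sysfs_dirent instead of the inode, which is what lets the patch drop the
dentry lookup and i_mutex.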

Signed-off-by: Eric W. Biederman <ebiederm@aristanetworks.com>
---
 fs/sysfs/file.c |   31 ++++++++-----------------------
 1 files changed, 8 insertions(+), 23 deletions(-)

diff --git a/fs/sysfs/file.c b/fs/sysfs/file.c
index 0786b41..31cfe1d 100644
--- a/fs/sysfs/file.c
+++ b/fs/sysfs/file.c
@@ -577,38 +577,23 @@ EXPORT_SYMBOL_GPL(sysfs_add_file_to_group);
  */
 int sysfs_chmod_file(struct kobject *kobj, struct attribute *attr, mode_t mode)
 {
-	struct sysfs_dirent *victim_sd = NULL;
-	struct dentry *victim = NULL;
-	struct inode * inode;
+	struct sysfs_dirent *sd;
 	struct iattr newattrs;
 	int rc;
 
-	rc = -ENOENT;
-	victim_sd = sysfs_get_dirent(kobj->sd, attr->name);
-	if (!victim_sd)
-		goto out;
+	mutex_lock(&sysfs_mutex);
 
-	mutex_lock(&sysfs_rename_mutex);
-	victim = sysfs_get_dentry(victim_sd);
-	mutex_unlock(&sysfs_rename_mutex);
-	if (IS_ERR(victim)) {
-		rc = PTR_ERR(victim);
-		victim = NULL;
+	rc = -ENOENT;
+	sd = sysfs_find_dirent(kobj->sd, attr->name);
+	if (!sd)
 		goto out;
-	}
-
-	inode = victim->d_inode;
 
-	mutex_lock(&inode->i_mutex);
-
-	newattrs.ia_mode = (mode & S_IALLUGO) | (inode->i_mode & ~S_IALLUGO);
+	newattrs.ia_mode = (mode & S_IALLUGO) | (sd->s_mode & ~S_IALLUGO);
 	newattrs.ia_valid = ATTR_MODE;
-	rc = sysfs_setattr(victim, &newattrs);
+	rc = sysfs_sd_setattr(sd, &newattrs);
 
-	mutex_unlock(&inode->i_mutex);
  out:
-	dput(victim);
-	sysfs_put(victim_sd);
+	mutex_unlock(&sysfs_mutex);
 	return rc;
 }
 EXPORT_SYMBOL_GPL(sysfs_chmod_file);
-- 
1.6.1.2.350.g88cc


^ permalink raw reply related	[flat|nested] 200+ messages in thread

* [PATCH 15/20] sysfs: Kill sysfs_addrm_start and sysfs_addrm_finish
  2009-05-21  0:28                           ` [PATCH 14/20] sysfs: In sysfs_chmod_file lazily propagate the mode change Eric W. Biederman
@ 2009-05-21  0:28                             ` Eric W. Biederman
  2009-05-21  0:28                               ` [PATCH 16/20] sysfs: Propagate renames to the vfs on demand Eric W. Biederman
  2009-05-21  9:31                               ` [PATCH 15/20] sysfs: Kill sysfs_addrm_start and sysfs_addrm_finish Tejun Heo
  2009-05-21  9:16                             ` [PATCH 14/20] sysfs: In sysfs_chmod_file lazily propagate the mode change Tejun Heo
  1 sibling, 2 replies; 200+ messages in thread
From: Eric W. Biederman @ 2009-05-21  0:28 UTC (permalink / raw)
  To: Andrew Morton, Greg Kroah-Hartman
  Cc: linux-kernel, Tejun Heo, Cornelia Huck, linux-fsdevel,
	Eric W. Biederman, Eric W. Biederman

From: Eric W. Biederman <ebiederm@xmission.com>

With lazy inode updates and dentry operations bringing everything
into sync on demand there is no longer any need to immediately
update the vfs or grab i_mutex to protect those updates as we
make changes to sysfs.

So stop updating the vfs inodes and move what remains of
sysfs_addrm_start and sysfs_addrm_finish (just barely more than taking
the sysfs_mutex) into sysfs_add_one and sysfs_remove_one.

Signed-off-by: Eric W. Biederman <ebiederm@aristanetworks.com>
---
 fs/sysfs/dir.c     |  192 +++++++---------------------------------------------
 fs/sysfs/file.c    |    6 +--
 fs/sysfs/inode.c   |   16 ++---
 fs/sysfs/symlink.c |    6 +--
 fs/sysfs/sysfs.h   |   17 +----
 5 files changed, 34 insertions(+), 203 deletions(-)

diff --git a/fs/sysfs/dir.c b/fs/sysfs/dir.c
index d1b52b2..e4973c3 100644
--- a/fs/sysfs/dir.c
+++ b/fs/sysfs/dir.c
@@ -382,62 +382,6 @@ struct sysfs_dirent *sysfs_new_dirent(const char *name, umode_t mode, int type)
 	return NULL;
 }
 
-static int sysfs_ilookup_test(struct inode *inode, void *arg)
-{
-	struct sysfs_dirent *sd = arg;
-	return inode->i_ino == sd->s_ino;
-}
-
-/**
- *	sysfs_addrm_start - prepare for sysfs_dirent add/remove
- *	@acxt: pointer to sysfs_addrm_cxt to be used
- *	@parent_sd: parent sysfs_dirent
- *
- *	This function is called when the caller is about to add or
- *	remove sysfs_dirent under @parent_sd.  This function acquires
- *	sysfs_mutex, grabs inode for @parent_sd if available and lock
- *	i_mutex of it.  @acxt is used to keep and pass context to
- *	other addrm functions.
- *
- *	LOCKING:
- *	Kernel thread context (may sleep).  sysfs_mutex is locked on
- *	return.  i_mutex of parent inode is locked on return if
- *	available.
- */
-void sysfs_addrm_start(struct sysfs_addrm_cxt *acxt,
-		       struct sysfs_dirent *parent_sd)
-{
-	struct inode *inode;
-
-	memset(acxt, 0, sizeof(*acxt));
-	acxt->parent_sd = parent_sd;
-
-	/* Lookup parent inode.  inode initialization is protected by
-	 * sysfs_mutex, so inode existence can be determined by
-	 * looking up inode while holding sysfs_mutex.
-	 */
-	mutex_lock(&sysfs_mutex);
-
-	inode = ilookup5(sysfs_sb, parent_sd->s_ino, sysfs_ilookup_test,
-			 parent_sd);
-	if (inode) {
-		WARN_ON(inode->i_state & I_NEW);
-
-		/* parent inode available */
-		acxt->parent_inode = inode;
-
-		/* sysfs_mutex is below i_mutex in lock hierarchy.
-		 * First, trylock i_mutex.  If fails, unlock
-		 * sysfs_mutex and lock them in order.
-		 */
-		if (!mutex_trylock(&inode->i_mutex)) {
-			mutex_unlock(&sysfs_mutex);
-			mutex_lock(&inode->i_mutex);
-			mutex_lock(&sysfs_mutex);
-		}
-	}
-}
-
 /**
  *	sysfs_pathname - return full path to sysfs dirent
  *	@sd: sysfs_dirent whose path we want
@@ -460,161 +404,83 @@ static char *sysfs_pathname(struct sysfs_dirent *sd, char *path)
 
 /**
  *	sysfs_add_one - add sysfs_dirent to parent
- *	@acxt: addrm context to use
+ *	@parent_sd: directory to add @sd into
  *	@sd: sysfs_dirent to be added
  *
- *	Get @acxt->parent_sd and set sd->s_parent to it and increment
+ *	Get @parent_sd and set sd->s_parent to it and increment
  *	nlink of parent inode if @sd is a directory and link into the
  *	children list of the parent.
  *
- *	This function should be called between calls to
- *	sysfs_addrm_start() and sysfs_addrm_finish() and should be
- *	passed the same @acxt as passed to sysfs_addrm_start().
- *
  *	LOCKING:
- *	Determined by sysfs_addrm_start().
+ *	Kernel thread context (may sleep).  Grabs sysfs_mutex.
  *
  *	RETURNS:
  *	0 on success, -EEXIST if entry with the given name already
  *	exists.
  */
-int sysfs_add_one(struct sysfs_addrm_cxt *acxt, struct sysfs_dirent *sd)
+int sysfs_add_one(struct sysfs_dirent *parent_sd, struct sysfs_dirent *sd)
 {
 	struct iattr *ps_iattr;
 
-	if (sysfs_find_dirent(acxt->parent_sd, sd->s_name)) {
-		char *path = kzalloc(PATH_MAX, GFP_KERNEL);
+	mutex_lock(&sysfs_mutex);
+	if (sysfs_find_dirent(parent_sd, sd->s_name)) {
+		char *path;
+		mutex_unlock(&sysfs_mutex);
+
+		path = kzalloc(PATH_MAX, GFP_KERNEL);
 		WARN(1, KERN_WARNING
 		     "sysfs: cannot create duplicate filename '%s'\n",
 		     (path == NULL) ? sd->s_name :
-		     strcat(strcat(sysfs_pathname(acxt->parent_sd, path), "/"),
+		     strcat(strcat(sysfs_pathname(parent_sd, path), "/"),
 		            sd->s_name));
 		kfree(path);
 		return -EEXIST;
 	}
 
-	sd->s_parent = sysfs_get(acxt->parent_sd);
-
-	if (sysfs_type(sd) == SYSFS_DIR && acxt->parent_inode)
-		inc_nlink(acxt->parent_inode);
-
-	acxt->cnt++;
-
+	sd->s_parent = sysfs_get(parent_sd);
 	sysfs_link_sibling(sd);
 
 	/* Update timestamps on the parent */
-	ps_iattr = acxt->parent_sd->s_iattr;
+	ps_iattr = parent_sd->s_iattr;
 	if (ps_iattr)
 		ps_iattr->ia_ctime = ps_iattr->ia_mtime = CURRENT_TIME;
 
+	mutex_unlock(&sysfs_mutex);
 	return 0;
 }
 
 /**
  *	sysfs_remove_one - remove sysfs_dirent from parent
- *	@acxt: addrm context to use
  *	@sd: sysfs_dirent to be removed
  *
  *	Mark @sd removed and drop nlink of parent inode if @sd is a
  *	directory.  @sd is unlinked from the children list.
  *
- *	This function should be called between calls to
- *	sysfs_addrm_start() and sysfs_addrm_finish() and should be
- *	passed the same @acxt as passed to sysfs_addrm_start().
- *
  *	LOCKING:
- *	Determined by sysfs_addrm_start().
+ *	Kernel thread context (may sleep).  Grabs sysfs_mutex.
  */
-void sysfs_remove_one(struct sysfs_addrm_cxt *acxt, struct sysfs_dirent *sd)
+void sysfs_remove_one(struct sysfs_dirent *sd)
 {
 	struct iattr *ps_iattr;
 
 	BUG_ON(sd->s_flags & SYSFS_FLAG_REMOVED);
 
+	mutex_lock(&sysfs_mutex);
+
 	sysfs_unlink_sibling(sd);
 
 	/* Update timestamps on the parent */
-	ps_iattr = acxt->parent_sd->s_iattr;
+	ps_iattr = sd->s_parent->s_iattr;
 	if (ps_iattr)
 		ps_iattr->ia_ctime = ps_iattr->ia_mtime = CURRENT_TIME;
 
 	sd->s_flags |= SYSFS_FLAG_REMOVED;
-	sd->s_sibling = acxt->removed;
-	acxt->removed = sd;
-
-	if (sysfs_type(sd) == SYSFS_DIR && acxt->parent_inode)
-		drop_nlink(acxt->parent_inode);
-
-	acxt->cnt++;
-}
 
-/**
- *	sysfs_dec_nlink - Decrement link count for the specified sysfs_dirent
- *	@sd: target sysfs_dirent
- *
- *	Decrement nlink for @sd.  @sd must have been unlinked from its
- *	parent on entry to this function such that it can't be looked
- *	up anymore.
- */
-static void sysfs_dec_nlink(struct sysfs_dirent *sd)
-{
-	struct inode *inode;
-
-	inode = ilookup(sysfs_sb, sd->s_ino);
-	if (!inode)
-		return;
-
-	/* adjust nlink and update timestamp */
-	mutex_lock(&inode->i_mutex);
-
-	inode->i_ctime = CURRENT_TIME;
-	drop_nlink(inode);
-	if (sysfs_type(sd) == SYSFS_DIR)
-		drop_nlink(inode);
-
-	mutex_unlock(&inode->i_mutex);
-
-	iput(inode);
-}
-
-/**
- *	sysfs_addrm_finish - finish up sysfs_dirent add/remove
- *	@acxt: addrm context to finish up
- *
- *	Finish up sysfs_dirent add/remove.  Resources acquired by
- *	sysfs_addrm_start() are released and removed sysfs_dirents are
- *	cleaned up.  Timestamps on the parent inode are updated.
- *
- *	LOCKING:
- *	All mutexes acquired by sysfs_addrm_start() are released.
- */
-void sysfs_addrm_finish(struct sysfs_addrm_cxt *acxt)
-{
-	/* release resources acquired by sysfs_addrm_start() */
 	mutex_unlock(&sysfs_mutex);
-	if (acxt->parent_inode) {
-		struct inode *inode = acxt->parent_inode;
 
-		/* if added/removed, update timestamps on the parent */
-		if (acxt->cnt)
-			inode->i_ctime = inode->i_mtime = CURRENT_TIME;
-
-		mutex_unlock(&inode->i_mutex);
-		iput(inode);
-	}
-
-	/* kill removed sysfs_dirents */
-	while (acxt->removed) {
-		struct sysfs_dirent *sd = acxt->removed;
-
-		acxt->removed = sd->s_sibling;
-		sd->s_sibling = NULL;
-
-		sysfs_dec_nlink(sd);
-		sysfs_deactivate(sd);
-		unmap_bin_file(sd);
-		sysfs_put(sd);
-	}
+	sysfs_deactivate(sd);
+	unmap_bin_file(sd);
+	sysfs_put(sd);
 }
 
 /**
@@ -673,7 +539,6 @@ static int create_dir(struct kobject *kobj, struct sysfs_dirent *parent_sd,
 		      const char *name, struct sysfs_dirent **p_sd)
 {
 	umode_t mode = S_IFDIR| S_IRWXU | S_IRUGO | S_IXUGO;
-	struct sysfs_addrm_cxt acxt;
 	struct sysfs_dirent *sd;
 	int rc;
 
@@ -684,10 +549,8 @@ static int create_dir(struct kobject *kobj, struct sysfs_dirent *parent_sd,
 	sd->s_dir.kobj = kobj;
 
 	/* link in */
-	sysfs_addrm_start(&acxt, parent_sd);
-	rc = sysfs_add_one(&acxt, sd);
-	sysfs_addrm_finish(&acxt);
 
+	rc = sysfs_add_one(parent_sd, sd);
 	if (rc == 0)
 		*p_sd = sd;
 	else
@@ -783,20 +646,15 @@ static struct sysfs_dirent *sysfs_get_one(struct sysfs_dirent *dir_sd)
 
 static void remove_dir(struct sysfs_dirent *dir_sd)
 {
-	struct sysfs_addrm_cxt acxt;
 	struct sysfs_dirent *sd;
 
 	pr_debug("sysfs %s: removing dir\n", dir_sd->s_name);
 
 	while ((sd = sysfs_get_one(dir_sd))) {
-		sysfs_addrm_start(&acxt, sd->s_parent);
-		sysfs_remove_one(&acxt, sd);
-		sysfs_addrm_finish(&acxt);
+		sysfs_remove_one(sd);
 		sysfs_put(sd);
 	}
-	sysfs_addrm_start(&acxt, dir_sd->s_parent);
-	sysfs_remove_one(&acxt, dir_sd);
-	sysfs_addrm_finish(&acxt);
+	sysfs_remove_one(dir_sd);
 }
 
 void sysfs_remove_subdir(struct sysfs_dirent *sd)
diff --git a/fs/sysfs/file.c b/fs/sysfs/file.c
index 31cfe1d..b512ce6 100644
--- a/fs/sysfs/file.c
+++ b/fs/sysfs/file.c
@@ -499,7 +499,6 @@ int sysfs_add_file_mode(struct sysfs_dirent *dir_sd,
 			const struct attribute *attr, int type, mode_t amode)
 {
 	umode_t mode = (amode & S_IALLUGO) | S_IFREG;
-	struct sysfs_addrm_cxt acxt;
 	struct sysfs_dirent *sd;
 	int rc;
 
@@ -508,10 +507,7 @@ int sysfs_add_file_mode(struct sysfs_dirent *dir_sd,
 		return -ENOMEM;
 	sd->s_attr.attr = (void *)attr;
 
-	sysfs_addrm_start(&acxt, dir_sd);
-	rc = sysfs_add_one(&acxt, sd);
-	sysfs_addrm_finish(&acxt);
-
+	rc = sysfs_add_one(dir_sd, sd);
 	if (rc)
 		sysfs_put(sd);
 
diff --git a/fs/sysfs/inode.c b/fs/sysfs/inode.c
index 1b7ed3c..ad9a30d 100644
--- a/fs/sysfs/inode.c
+++ b/fs/sysfs/inode.c
@@ -263,23 +263,17 @@ void sysfs_delete_inode(struct inode *inode)
 
 int sysfs_hash_and_remove(struct sysfs_dirent *dir_sd, const char *name)
 {
-	struct sysfs_addrm_cxt acxt;
 	struct sysfs_dirent *sd;
 
 	if (!dir_sd)
 		return -ENOENT;
 
-	sysfs_addrm_start(&acxt, dir_sd);
-
-	sd = sysfs_find_dirent(dir_sd, name);
-	if (sd)
-		sysfs_remove_one(&acxt, sd);
-
-	sysfs_addrm_finish(&acxt);
-
-	if (sd)
+	sd = sysfs_get_dirent(dir_sd, name);
+	if (sd) {
+		sysfs_remove_one(sd);
+		sysfs_put(sd);
 		return 0;
-	else
+	} else
 		return -ENOENT;
 }
 
diff --git a/fs/sysfs/symlink.c b/fs/sysfs/symlink.c
index 05e4984..fc5fc86 100644
--- a/fs/sysfs/symlink.c
+++ b/fs/sysfs/symlink.c
@@ -31,7 +31,6 @@ int sysfs_create_link(struct kobject *kobj, struct kobject *target,
 	struct sysfs_dirent *parent_sd = NULL;
 	struct sysfs_dirent *target_sd = NULL;
 	struct sysfs_dirent *sd = NULL;
-	struct sysfs_addrm_cxt acxt;
 	int error;
 
 	BUG_ON(!name);
@@ -65,10 +64,7 @@ int sysfs_create_link(struct kobject *kobj, struct kobject *target,
 	sd->s_symlink.target_sd = target_sd;
 	target_sd = NULL;	/* reference is now owned by the symlink */
 
-	sysfs_addrm_start(&acxt, parent_sd);
-	error = sysfs_add_one(&acxt, sd);
-	sysfs_addrm_finish(&acxt);
-
+	error = sysfs_add_one(parent_sd, sd);
 	if (error)
 		goto out_put;
 
diff --git a/fs/sysfs/sysfs.h b/fs/sysfs/sysfs.h
index f5b53cf..f17ebb8 100644
--- a/fs/sysfs/sysfs.h
+++ b/fs/sysfs/sysfs.h
@@ -77,16 +77,6 @@ static inline unsigned int sysfs_type(struct sysfs_dirent *sd)
 }
 
 /*
- * Context structure to be used while adding/removing nodes.
- */
-struct sysfs_addrm_cxt {
-	struct sysfs_dirent	*parent_sd;
-	struct inode		*parent_inode;
-	struct sysfs_dirent	*removed;
-	int			cnt;
-};
-
-/*
  * mount.c
  */
 extern struct sysfs_dirent sysfs_root;
@@ -106,11 +96,8 @@ extern const struct inode_operations sysfs_dir_inode_operations;
 struct dentry *sysfs_get_dentry(struct sysfs_dirent *sd);
 struct sysfs_dirent *sysfs_get_active_two(struct sysfs_dirent *sd);
 void sysfs_put_active_two(struct sysfs_dirent *sd);
-void sysfs_addrm_start(struct sysfs_addrm_cxt *acxt,
-		       struct sysfs_dirent *parent_sd);
-int sysfs_add_one(struct sysfs_addrm_cxt *acxt, struct sysfs_dirent *sd);
-void sysfs_remove_one(struct sysfs_addrm_cxt *acxt, struct sysfs_dirent *sd);
-void sysfs_addrm_finish(struct sysfs_addrm_cxt *acxt);
+int sysfs_add_one(struct sysfs_dirent *parent_sd, struct sysfs_dirent *sd);
+void sysfs_remove_one(struct sysfs_dirent *sd);
 
 struct sysfs_dirent *sysfs_find_dirent(struct sysfs_dirent *parent_sd,
 				       const unsigned char *name);
-- 
1.6.1.2.350.g88cc


^ permalink raw reply related	[flat|nested] 200+ messages in thread

* [PATCH 16/20] sysfs: Propagate renames to the vfs on demand
  2009-05-21  0:28                             ` [PATCH 15/20] sysfs: Kill sysfs_addrm_start and sysfs_addrm_finish Eric W. Biederman
@ 2009-05-21  0:28                               ` Eric W. Biederman
  2009-05-21  0:28                                 ` [PATCH 17/20] sysfs: Merge sysfs_rename_dir and sysfs_move_dir Eric W. Biederman
  2009-05-21  9:41                                 ` [PATCH 16/20] sysfs: Propagate renames to the vfs on demand Tejun Heo
  2009-05-21  9:31                               ` [PATCH 15/20] sysfs: Kill sysfs_addrm_start and sysfs_addrm_finish Tejun Heo
  1 sibling, 2 replies; 200+ messages in thread
From: Eric W. Biederman @ 2009-05-21  0:28 UTC (permalink / raw)
  To: Andrew Morton, Greg Kroah-Hartman
  Cc: linux-kernel, Tejun Heo, Cornelia Huck, linux-fsdevel,
	Eric W. Biederman, Eric W. Biederman

From: Eric W. Biederman <ebiederm@xmission.com>

By teaching sysfs_revalidate to hide a dentry for
a sysfs_dirent if the sysfs_dirent has been renamed,
and by teaching sysfs_lookup to return the original
dentry if the sysfs_dirent has been renamed, I can
show the results of renames correctly without having to
update the dcache during the directory rename.

This massively simplifies the rename logic, allowing a lot
of weird sysfs special cases to be removed along with
a lot of now unnecessary helper code.

Signed-off-by: Eric W. Biederman <ebiederm@aristanetworks.com>
---
 fs/namei.c            |   22 -------
 fs/sysfs/dir.c        |  156 ++++++++++---------------------------------------
 fs/sysfs/inode.c      |   12 ----
 fs/sysfs/sysfs.h      |    1 -
 include/linux/namei.h |    1 -
 5 files changed, 32 insertions(+), 160 deletions(-)

diff --git a/fs/namei.c b/fs/namei.c
index 78f253c..69f559a 100644
--- a/fs/namei.c
+++ b/fs/namei.c
@@ -1260,28 +1260,6 @@ struct dentry *lookup_one_len(const char *name, struct dentry *base, int len)
 	return __lookup_hash(&this, base, NULL);
 }
 
-/**
- * lookup_one_noperm - bad hack for sysfs
- * @name:	pathname component to lookup
- * @base:	base directory to lookup from
- *
- * This is a variant of lookup_one_len that doesn't perform any permission
- * checks.   It's a horrible hack to work around the braindead sysfs
- * architecture and should not be used anywhere else.
- *
- * DON'T USE THIS FUNCTION EVER, thanks.
- */
-struct dentry *lookup_one_noperm(const char *name, struct dentry *base)
-{
-	int err;
-	struct qstr this;
-
-	err = __lookup_one_len(name, &this, base, strlen(name));
-	if (err)
-		return ERR_PTR(err);
-	return __lookup_hash(&this, base, NULL);
-}
-
 int user_path_at(int dfd, const char __user *name, unsigned flags,
 		 struct path *path)
 {
diff --git a/fs/sysfs/dir.c b/fs/sysfs/dir.c
index e4973c3..3289fd8 100644
--- a/fs/sysfs/dir.c
+++ b/fs/sysfs/dir.c
@@ -24,7 +24,6 @@
 #include "sysfs.h"
 
 DEFINE_MUTEX(sysfs_mutex);
-DEFINE_MUTEX(sysfs_rename_mutex);
 DEFINE_SPINLOCK(sysfs_assoc_lock);
 
 static DEFINE_SPINLOCK(sysfs_ino_lock);
@@ -84,46 +83,6 @@ static void sysfs_unlink_sibling(struct sysfs_dirent *sd)
 }
 
 /**
- *	sysfs_get_dentry - get dentry for the given sysfs_dirent
- *	@sd: sysfs_dirent of interest
- *
- *	Get dentry for @sd.  Dentry is looked up if currently not
- *	present.  This function descends from the root looking up
- *	dentry for each step.
- *
- *	LOCKING:
- *	mutex_lock(sysfs_rename_mutex)
- *
- *	RETURNS:
- *	Pointer to found dentry on success, ERR_PTR() value on error.
- */
-struct dentry *sysfs_get_dentry(struct sysfs_dirent *sd)
-{
-	struct dentry *dentry = dget(sysfs_sb->s_root);
-
-	while (dentry->d_fsdata != sd) {
-		struct sysfs_dirent *cur;
-		struct dentry *parent;
-
-		/* find the first ancestor which hasn't been looked up */
-		cur = sd;
-		while (cur->s_parent != dentry->d_fsdata)
-			cur = cur->s_parent;
-
-		/* look it up */
-		parent = dentry;
-		mutex_lock(&parent->d_inode->i_mutex);
-		dentry = lookup_one_noperm(cur->s_name, parent);
-		mutex_unlock(&parent->d_inode->i_mutex);
-		dput(parent);
-
-		if (IS_ERR(dentry))
-			break;
-	}
-	return dentry;
-}
-
-/**
  *	sysfs_get_active - get an active reference to sysfs_dirent
  *	@sd: sysfs_dirent to get an active reference to
  *
@@ -311,6 +270,14 @@ static int sysfs_dentry_revalidate(struct dentry *dentry, struct nameidata *nd)
 	if (sd->s_flags & SYSFS_FLAG_REMOVED)
 		goto out_bad;
 
+	/* The sysfs dirent has been moved? */
+	if (dentry->d_parent->d_fsdata != sd->s_parent)
+		goto out_bad;
+
+	/* The sysfs dirent has been renamed */
+	if (strcmp(dentry->d_name.name, sd->s_name) != 0)
+		goto out_bad;
+
 	mutex_unlock(&sysfs_mutex);
 out_valid:
 	return 1;
@@ -318,6 +285,12 @@ out_bad:
 	/* Remove the dentry from the dcache hashes.
 	 * If this is a deleted dentry we use d_drop instead of d_delete
 	 * so sysfs doesn't need to cope with negative dentries.
+	 *
+	 * If this is a dentry that has simply been renamed we
+	 * use d_drop to remove it from the dcache lookup on its
+	 * old parent.  If this dentry persists later when a lookup
+	 * is performed at its new name the dentry will be re-added
+	 * to the dcache hashes.
 	 */
 	is_dir = (sysfs_type(sd) == SYSFS_DIR);
 	mutex_unlock(&sysfs_mutex);
@@ -613,10 +586,15 @@ static struct dentry * sysfs_lookup(struct inode *dir, struct dentry *dentry,
 	}
 
 	/* instantiate and hash dentry */
-	dentry->d_op = &sysfs_dentry_ops;
-	dentry->d_fsdata = sysfs_get(sd);
-	d_instantiate(dentry, inode);
-	d_rehash(dentry);
+	ret = d_find_alias(inode);
+	if (!ret) {
+		dentry->d_op = &sysfs_dentry_ops;
+		dentry->d_fsdata = sysfs_get(sd);
+		d_add(dentry, inode);
+	} else {
+		d_move(ret, dentry);
+		iput(inode);
+	}
 
  out_unlock:
 	mutex_unlock(&sysfs_mutex);
@@ -685,62 +663,32 @@ void sysfs_remove_dir(struct kobject * kobj)
 int sysfs_rename_dir(struct kobject * kobj, const char *new_name)
 {
 	struct sysfs_dirent *sd = kobj->sd;
-	struct dentry *parent = NULL;
-	struct dentry *old_dentry = NULL, *new_dentry = NULL;
 	const char *dup_name = NULL;
 	int error;
 
-	mutex_lock(&sysfs_rename_mutex);
+	mutex_lock(&sysfs_mutex);
 
 	error = 0;
 	if (strcmp(sd->s_name, new_name) == 0)
 		goto out;	/* nothing to rename */
 
-	/* get the original dentry */
-	old_dentry = sysfs_get_dentry(sd);
-	if (IS_ERR(old_dentry)) {
-		error = PTR_ERR(old_dentry);
-		old_dentry = NULL;
-		goto out;
-	}
-
-	parent = old_dentry->d_parent;
-
-	/* lock parent and get dentry for new name */
-	mutex_lock(&parent->d_inode->i_mutex);
-	mutex_lock(&sysfs_mutex);
-
 	error = -EEXIST;
 	if (sysfs_find_dirent(sd->s_parent, new_name))
-		goto out_unlock;
-
-	error = -ENOMEM;
-	new_dentry = d_alloc_name(parent, new_name);
-	if (!new_dentry)
-		goto out_unlock;
+		goto out;
 
 	/* rename sysfs_dirent */
 	error = -ENOMEM;
 	new_name = dup_name = kstrdup(new_name, GFP_KERNEL);
 	if (!new_name)
-		goto out_unlock;
+		goto out;
 
 	dup_name = sd->s_name;
 	sd->s_name = new_name;
 
-	/* rename */
-	d_add(new_dentry, NULL);
-	d_move(old_dentry, new_dentry);
-
 	error = 0;
- out_unlock:
+ out:
 	mutex_unlock(&sysfs_mutex);
-	mutex_unlock(&parent->d_inode->i_mutex);
 	kfree(dup_name);
-	dput(old_dentry);
-	dput(new_dentry);
- out:
-	mutex_unlock(&sysfs_rename_mutex);
 	return error;
 }
 
@@ -748,54 +696,20 @@ int sysfs_move_dir(struct kobject *kobj, struct kobject *new_parent_kobj)
 {
 	struct sysfs_dirent *sd = kobj->sd;
 	struct sysfs_dirent *new_parent_sd;
-	struct dentry *old_parent, *new_parent = NULL;
-	struct dentry *old_dentry = NULL, *new_dentry = NULL;
 	int error;
 
-	mutex_lock(&sysfs_rename_mutex);
 	BUG_ON(!sd->s_parent);
+
+	mutex_lock(&sysfs_mutex);
 	new_parent_sd = new_parent_kobj->sd ? new_parent_kobj->sd : &sysfs_root;
 
 	error = 0;
 	if (sd->s_parent == new_parent_sd)
 		goto out;	/* nothing to move */
 
-	/* get dentries */
-	old_dentry = sysfs_get_dentry(sd);
-	if (IS_ERR(old_dentry)) {
-		error = PTR_ERR(old_dentry);
-		old_dentry = NULL;
-		goto out;
-	}
-	old_parent = old_dentry->d_parent;
-
-	new_parent = sysfs_get_dentry(new_parent_sd);
-	if (IS_ERR(new_parent)) {
-		error = PTR_ERR(new_parent);
-		new_parent = NULL;
-		goto out;
-	}
-
-again:
-	mutex_lock(&old_parent->d_inode->i_mutex);
-	if (!mutex_trylock(&new_parent->d_inode->i_mutex)) {
-		mutex_unlock(&old_parent->d_inode->i_mutex);
-		goto again;
-	}
-	mutex_lock(&sysfs_mutex);
-
 	error = -EEXIST;
 	if (sysfs_find_dirent(new_parent_sd, sd->s_name))
-		goto out_unlock;
-
-	error = -ENOMEM;
-	new_dentry = d_alloc_name(new_parent, sd->s_name);
-	if (!new_dentry)
-		goto out_unlock;
-
-	error = 0;
-	d_add(new_dentry, NULL);
-	d_move(old_dentry, new_dentry);
+		goto out;
 
 	/* Remove from old parent's list and insert into new parent's list. */
 	sysfs_unlink_sibling(sd);
@@ -804,15 +718,9 @@ again:
 	sd->s_parent = new_parent_sd;
 	sysfs_link_sibling(sd);
 
- out_unlock:
+	error = 0;
+out:
 	mutex_unlock(&sysfs_mutex);
-	mutex_unlock(&new_parent->d_inode->i_mutex);
-	mutex_unlock(&old_parent->d_inode->i_mutex);
- out:
-	dput(new_parent);
-	dput(old_dentry);
-	dput(new_dentry);
-	mutex_unlock(&sysfs_rename_mutex);
 	return error;
 }
 
diff --git a/fs/sysfs/inode.c b/fs/sysfs/inode.c
index ad9a30d..a1917b5 100644
--- a/fs/sysfs/inode.c
+++ b/fs/sysfs/inode.c
@@ -132,17 +132,6 @@ static inline void set_inode_attr(struct inode * inode, struct iattr * iattr)
 	inode->i_ctime = iattr->ia_ctime;
 }
 
-
-/*
- * sysfs has a different i_mutex lock order behavior for i_mutex than other
- * filesystems; sysfs i_mutex is called in many places with subsystem locks
- * held. At the same time, many of the VFS locking rules do not apply to
- * sysfs at all (cross directory rename for example). To untangle this mess
- * (which gives false positives in lockdep), we're giving sysfs inodes their
- * own class for i_mutex.
- */
-static struct lock_class_key sysfs_inode_imutex_key;
-
 static int sysfs_count_nlink(struct sysfs_dirent *sd)
 {
 	struct sysfs_dirent *child;
@@ -190,7 +179,6 @@ static void sysfs_init_inode(struct sysfs_dirent *sd, struct inode *inode)
 	inode->i_mapping->a_ops = &sysfs_aops;
 	inode->i_mapping->backing_dev_info = &sysfs_backing_dev_info;
 	inode->i_op = &sysfs_inode_operations;
-	lockdep_set_class(&inode->i_mutex, &sysfs_inode_imutex_key);
 
 	set_default_inode_attr(inode, sd->s_mode);
 	sysfs_refresh_inode(sd, inode);
diff --git a/fs/sysfs/sysfs.h b/fs/sysfs/sysfs.h
index f17ebb8..2db952c 100644
--- a/fs/sysfs/sysfs.h
+++ b/fs/sysfs/sysfs.h
@@ -87,7 +87,6 @@ extern struct kmem_cache *sysfs_dir_cachep;
  * dir.c
  */
 extern struct mutex sysfs_mutex;
-extern struct mutex sysfs_rename_mutex;
 extern spinlock_t sysfs_assoc_lock;
 
 extern const struct file_operations sysfs_dir_operations;
diff --git a/include/linux/namei.h b/include/linux/namei.h
index fc2e035..758ecfb 100644
--- a/include/linux/namei.h
+++ b/include/linux/namei.h
@@ -76,7 +76,6 @@ extern struct file *nameidata_to_filp(struct nameidata *nd, int flags);
 extern void release_open_intent(struct nameidata *);
 
 extern struct dentry *lookup_one_len(const char *, struct dentry *, int);
-extern struct dentry *lookup_one_noperm(const char *, struct dentry *);
 
 extern int follow_down(struct vfsmount **, struct dentry **);
 extern int follow_up(struct vfsmount **, struct dentry **);
-- 
1.6.1.2.350.g88cc



* [PATCH 17/20] sysfs: Merge sysfs_rename_dir and sysfs_move_dir
  2009-05-21  0:28                               ` [PATCH 16/20] sysfs: Propagate renames to the vfs on demand Eric W. Biederman
@ 2009-05-21  0:28                                 ` Eric W. Biederman
  2009-05-21  0:28                                   ` [PATCH 18/20] sysfs: Pass super_block to sysfs_get_inode Eric W. Biederman
  2009-05-21  9:42                                   ` [PATCH 17/20] sysfs: Merge sysfs_rename_dir and sysfs_move_dir Tejun Heo
  2009-05-21  9:41                                 ` [PATCH 16/20] sysfs: Propagate renames to the vfs on demand Tejun Heo
  1 sibling, 2 replies; 200+ messages in thread
From: Eric W. Biederman @ 2009-05-21  0:28 UTC (permalink / raw)
  To: Andrew Morton, Greg Kroah-Hartman
  Cc: linux-kernel, Tejun Heo, Cornelia Huck, linux-fsdevel,
	Eric W. Biederman, Eric W. Biederman

From: Eric W. Biederman <ebiederm@xmission.com>

These two functions do 90% of the same work and it doesn't significantly
obfuscate the function to allow both the parent dir and the name to change
at the same time.  So merge them together to simplify maintenance, and
increase testing.

Signed-off-by: Eric W. Biederman <ebiederm@aristanetworks.com>
---
 fs/sysfs/dir.c |   66 +++++++++++++++++++++++--------------------------------
 1 files changed, 28 insertions(+), 38 deletions(-)

diff --git a/fs/sysfs/dir.c b/fs/sysfs/dir.c
index 3289fd8..3ed4489 100644
--- a/fs/sysfs/dir.c
+++ b/fs/sysfs/dir.c
@@ -660,30 +660,42 @@ void sysfs_remove_dir(struct kobject * kobj)
 	remove_dir(sd);
 }
 
-int sysfs_rename_dir(struct kobject * kobj, const char *new_name)
+static int sysfs_mv_dir(struct sysfs_dirent *sd,
+	struct sysfs_dirent *new_parent_sd, const char *new_name)
 {
-	struct sysfs_dirent *sd = kobj->sd;
 	const char *dup_name = NULL;
 	int error;
 
 	mutex_lock(&sysfs_mutex);
 
 	error = 0;
-	if (strcmp(sd->s_name, new_name) == 0)
+	if ((sd->s_parent == new_parent_sd) &&
+	    (strcmp(sd->s_name, new_name) == 0))
 		goto out;	/* nothing to rename */
 
 	error = -EEXIST;
-	if (sysfs_find_dirent(sd->s_parent, new_name))
+	if (sysfs_find_dirent(new_parent_sd, new_name))
 		goto out;
 
 	/* rename sysfs_dirent */
-	error = -ENOMEM;
-	new_name = dup_name = kstrdup(new_name, GFP_KERNEL);
-	if (!new_name)
-		goto out;
+	if (strcmp(sd->s_name, new_name) != 0) {
+		error = -ENOMEM;
+		new_name = dup_name = kstrdup(new_name, GFP_KERNEL);
+		if (!new_name)
+			goto out;
+
+		dup_name = sd->s_name;
+		sd->s_name = new_name;
+	}
 
-	dup_name = sd->s_name;
-	sd->s_name = new_name;
+	/* Remove from old parent's list and insert into new parent's list. */
+	if (sd->s_parent != new_parent_sd) {
+		sysfs_unlink_sibling(sd);
+		sysfs_get(new_parent_sd);
+		sysfs_put(sd->s_parent);
+		sd->s_parent = new_parent_sd;
+		sysfs_link_sibling(sd);
+	}
 
 	error = 0;
  out:
@@ -692,36 +704,14 @@ int sysfs_rename_dir(struct kobject * kobj, const char *new_name)
 	return error;
 }
 
-int sysfs_move_dir(struct kobject *kobj, struct kobject *new_parent_kobj)
+int sysfs_rename_dir(struct kobject * kobj, const char *new_name)
 {
-	struct sysfs_dirent *sd = kobj->sd;
-	struct sysfs_dirent *new_parent_sd;
-	int error;
-
-	BUG_ON(!sd->s_parent);
-
-	mutex_lock(&sysfs_mutex);
-	new_parent_sd = new_parent_kobj->sd ? new_parent_kobj->sd : &sysfs_root;
-
-	error = 0;
-	if (sd->s_parent == new_parent_sd)
-		goto out;	/* nothing to move */
-
-	error = -EEXIST;
-	if (sysfs_find_dirent(new_parent_sd, sd->s_name))
-		goto out;
-
-	/* Remove from old parent's list and insert into new parent's list. */
-	sysfs_unlink_sibling(sd);
-	sysfs_get(new_parent_sd);
-	sysfs_put(sd->s_parent);
-	sd->s_parent = new_parent_sd;
-	sysfs_link_sibling(sd);
+	return sysfs_mv_dir(kobj->sd, kobj->sd->s_parent, new_name);
+}
 
-	error = 0;
-out:
-	mutex_unlock(&sysfs_mutex);
-	return error;
+int sysfs_move_dir(struct kobject *kobj, struct kobject *new_parent_kobj)
+{
+	return sysfs_mv_dir(kobj->sd, new_parent_kobj->sd, kobj->sd->s_name);
 }
 
 /* Relationship between s_mode and the DT_xxx types */
-- 
1.6.1.2.350.g88cc
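
A user-space sketch of the merge described above, under loud
assumptions: model_node, model_find, model_mv, model_rename and
model_move are hypothetical names, nodes live in a flat table instead of
sibling lists, and names are fixed-size buffers so no allocation is
needed.  It only illustrates how one helper can change the name, the
parent, or both, with the two public entry points reduced to thin
wrappers.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

struct model_node {
	char name[32];
	struct model_node *parent;
};

/* Look for a child of @parent called @name in a flat table of nodes. */
static struct model_node *model_find(struct model_node *table, size_t n,
				     const struct model_node *parent,
				     const char *name)
{
	size_t i;

	for (i = 0; i < n; i++)
		if (table[i].parent == parent &&
		    strcmp(table[i].name, name) == 0)
			return &table[i];
	return NULL;
}

/* One helper that can change the name, the parent, or both. */
static int model_mv(struct model_node *table, size_t n,
		    struct model_node *node,
		    struct model_node *new_parent, const char *new_name)
{
	assert(node && new_parent && new_name);

	if (node->parent == new_parent && strcmp(node->name, new_name) == 0)
		return 0;		/* nothing to rename or move */
	if (strlen(new_name) >= sizeof(node->name))
		return -ENAMETOOLONG;
	if (model_find(table, n, new_parent, new_name))
		return -EEXIST;

	/* Copy only when the name really changes; this also avoids
	 * copying a buffer onto itself when called from model_move(). */
	if (strcmp(node->name, new_name) != 0)
		strcpy(node->name, new_name);
	node->parent = new_parent;
	return 0;
}

static int model_rename(struct model_node *table, size_t n,
			struct model_node *node, const char *new_name)
{
	return model_mv(table, n, node, node->parent, new_name);
}

static int model_move(struct model_node *table, size_t n,
		      struct model_node *node, struct model_node *new_parent)
{
	return model_mv(table, n, node, new_parent, node->name);
}
```

The sketch asserts that the new parent is non-NULL.  The pre-merge
sysfs_move_dir shown as removed lines above fell back to &sysfs_root
when new_parent_kobj->sd was NULL, so a real wrapper would presumably
need to keep some equivalent handling.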



* [PATCH 18/20] sysfs: Pass super_block to sysfs_get_inode
  2009-05-21  0:28                                 ` [PATCH 17/20] sysfs: Merge sysfs_rename_dir and sysfs_move_dir Eric W. Biederman
@ 2009-05-21  0:28                                   ` Eric W. Biederman
  2009-05-21  0:28                                     ` [PATCH 19/20] sysfs: Kill unused sysfs_sb variable Eric W. Biederman
  2009-05-21  9:42                                   ` [PATCH 17/20] sysfs: Merge sysfs_rename_dir and sysfs_move_dir Tejun Heo
  1 sibling, 1 reply; 200+ messages in thread
From: Eric W. Biederman @ 2009-05-21  0:28 UTC (permalink / raw)
  To: Andrew Morton, Greg Kroah-Hartman
  Cc: linux-kernel, Tejun Heo, Cornelia Huck, linux-fsdevel,
	Eric W. Biederman, Eric W. Biederman

From: Eric W. Biederman <ebiederm@xmission.com>

Currently sysfs_get_inode magically returns an inode on
sysfs_sb.  Make the super_block parameter explicit and
the code becomes clearer.

Signed-off-by: Eric W. Biederman <ebiederm@aristanetworks.com>
---
 fs/sysfs/dir.c   |    2 +-
 fs/sysfs/inode.c |    5 +++--
 fs/sysfs/mount.c |    2 +-
 fs/sysfs/sysfs.h |    2 +-
 4 files changed, 6 insertions(+), 5 deletions(-)

diff --git a/fs/sysfs/dir.c b/fs/sysfs/dir.c
index 3ed4489..7aa8890 100644
--- a/fs/sysfs/dir.c
+++ b/fs/sysfs/dir.c
@@ -579,7 +579,7 @@ static struct dentry * sysfs_lookup(struct inode *dir, struct dentry *dentry,
 	}
 
 	/* attach dentry and inode */
-	inode = sysfs_get_inode(sd);
+	inode = sysfs_get_inode(dir->i_sb, sd);
 	if (!inode) {
 		ret = ERR_PTR(-ENOMEM);
 		goto out_unlock;
diff --git a/fs/sysfs/inode.c b/fs/sysfs/inode.c
index a1917b5..c725aeb 100644
--- a/fs/sysfs/inode.c
+++ b/fs/sysfs/inode.c
@@ -210,6 +210,7 @@ static void sysfs_init_inode(struct sysfs_dirent *sd, struct inode *inode)
 
 /**
  *	sysfs_get_inode - get inode for sysfs_dirent
+ *	@sb: super block
  *	@sd: sysfs_dirent to allocate inode for
  *
  *	Get inode for @sd.  If such inode doesn't exist, a new inode
@@ -222,11 +223,11 @@ static void sysfs_init_inode(struct sysfs_dirent *sd, struct inode *inode)
  *	RETURNS:
  *	Pointer to allocated inode on success, NULL on failure.
  */
-struct inode * sysfs_get_inode(struct sysfs_dirent *sd)
+struct inode * sysfs_get_inode(struct super_block *sb, struct sysfs_dirent *sd)
 {
 	struct inode *inode;
 
-	inode = iget_locked(sysfs_sb, sd->s_ino);
+	inode = iget_locked(sb, sd->s_ino);
 	if (inode && (inode->i_state & I_NEW))
 		sysfs_init_inode(sd, inode);
 
diff --git a/fs/sysfs/mount.c b/fs/sysfs/mount.c
index 4974995..89db07e 100644
--- a/fs/sysfs/mount.c
+++ b/fs/sysfs/mount.c
@@ -54,7 +54,7 @@ static int sysfs_fill_super(struct super_block *sb, void *data, int silent)
 
 	/* get root inode, initialize and unlock it */
 	mutex_lock(&sysfs_mutex);
-	inode = sysfs_get_inode(&sysfs_root);
+	inode = sysfs_get_inode(sb, &sysfs_root);
 	mutex_unlock(&sysfs_mutex);
 	if (!inode) {
 		pr_debug("sysfs: could not get root inode\n");
diff --git a/fs/sysfs/sysfs.h b/fs/sysfs/sysfs.h
index 2db952c..cf21b06 100644
--- a/fs/sysfs/sysfs.h
+++ b/fs/sysfs/sysfs.h
@@ -130,7 +130,7 @@ static inline void __sysfs_put(struct sysfs_dirent *sd)
 /*
  * inode.c
  */
-struct inode *sysfs_get_inode(struct sysfs_dirent *sd);
+struct inode *sysfs_get_inode(struct super_block *sb, struct sysfs_dirent *sd);
 void sysfs_delete_inode(struct inode *inode);
 int sysfs_sd_setattr(struct sysfs_dirent *sd, struct iattr *iattr);
 int sysfs_setattr(struct dentry *dentry, struct iattr *iattr);
-- 
1.6.1.2.350.g88cc



* [PATCH 19/20] sysfs: Kill unused sysfs_sb variable.
  2009-05-21  0:28                                   ` [PATCH 18/20] sysfs: Pass super_block to sysfs_get_inode Eric W. Biederman
@ 2009-05-21  0:28                                     ` Eric W. Biederman
  2009-05-21  0:28                                       ` [PATCH 20/20] sysfs: Normalize error handling in sysfs_fill_inode Eric W. Biederman
  0 siblings, 1 reply; 200+ messages in thread
From: Eric W. Biederman @ 2009-05-21  0:28 UTC (permalink / raw)
  To: Andrew Morton, Greg Kroah-Hartman
  Cc: linux-kernel, Tejun Heo, Cornelia Huck, linux-fsdevel,
	Eric W. Biederman, Eric W. Biederman

From: Eric W. Biederman <ebiederm@xmission.com>

Now that there are no more users we can remove
the sysfs_sb variable.

Signed-off-by: Eric W. Biederman <ebiederm@aristanetworks.com>
---
 fs/sysfs/mount.c |    2 --
 fs/sysfs/sysfs.h |    1 -
 2 files changed, 0 insertions(+), 3 deletions(-)

diff --git a/fs/sysfs/mount.c b/fs/sysfs/mount.c
index 89db07e..0cb1088 100644
--- a/fs/sysfs/mount.c
+++ b/fs/sysfs/mount.c
@@ -23,7 +23,6 @@
 
 
 static struct vfsmount *sysfs_mount;
-struct super_block * sysfs_sb = NULL;
 struct kmem_cache *sysfs_dir_cachep;
 
 static const struct super_operations sysfs_ops = {
@@ -50,7 +49,6 @@ static int sysfs_fill_super(struct super_block *sb, void *data, int silent)
 	sb->s_magic = SYSFS_MAGIC;
 	sb->s_op = &sysfs_ops;
 	sb->s_time_gran = 1;
-	sysfs_sb = sb;
 
 	/* get root inode, initialize and unlock it */
 	mutex_lock(&sysfs_mutex);
diff --git a/fs/sysfs/sysfs.h b/fs/sysfs/sysfs.h
index cf21b06..5dd8168 100644
--- a/fs/sysfs/sysfs.h
+++ b/fs/sysfs/sysfs.h
@@ -80,7 +80,6 @@ static inline unsigned int sysfs_type(struct sysfs_dirent *sd)
  * mount.c
  */
 extern struct sysfs_dirent sysfs_root;
-extern struct super_block *sysfs_sb;
 extern struct kmem_cache *sysfs_dir_cachep;
 
 /*
-- 
1.6.1.2.350.g88cc



* [PATCH 20/20] sysfs: Normalize error handling in sysfs_fill_inode
  2009-05-21  0:28                                     ` [PATCH 19/20] sysfs: Kill unused sysfs_sb variable Eric W. Biederman
@ 2009-05-21  0:28                                       ` Eric W. Biederman
  2009-05-21  9:43                                         ` Tejun Heo
  0 siblings, 1 reply; 200+ messages in thread
From: Eric W. Biederman @ 2009-05-21  0:28 UTC (permalink / raw)
  To: Andrew Morton, Greg Kroah-Hartman
  Cc: linux-kernel, Tejun Heo, Cornelia Huck, linux-fsdevel,
	Eric W. Biederman, Eric W. Biederman

From: Eric W. Biederman <ebiederm@xmission.com>

Use a single error exit path instead of doing whatever
cleanup is required at each point where we find an error.
Ultimately this should make the code more maintainable.

Signed-off-by: Eric W. Biederman <ebiederm@aristanetworks.com>
---
 fs/sysfs/mount.c |   16 +++++++++++-----
 1 files changed, 11 insertions(+), 5 deletions(-)

diff --git a/fs/sysfs/mount.c b/fs/sysfs/mount.c
index 0cb1088..1dd023a 100644
--- a/fs/sysfs/mount.c
+++ b/fs/sysfs/mount.c
@@ -41,8 +41,9 @@ struct sysfs_dirent sysfs_root = {
 
 static int sysfs_fill_super(struct super_block *sb, void *data, int silent)
 {
-	struct inode *inode;
-	struct dentry *root;
+	struct inode *inode = NULL;
+	struct dentry *root = NULL;
+	int error;
 
 	sb->s_blocksize = PAGE_CACHE_SIZE;
 	sb->s_blocksize_bits = PAGE_CACHE_SHIFT;
@@ -51,24 +52,29 @@ static int sysfs_fill_super(struct super_block *sb, void *data, int silent)
 	sb->s_time_gran = 1;
 
 	/* get root inode, initialize and unlock it */
+	error = -ENOMEM;
 	mutex_lock(&sysfs_mutex);
 	inode = sysfs_get_inode(sb, &sysfs_root);
 	mutex_unlock(&sysfs_mutex);
 	if (!inode) {
 		pr_debug("sysfs: could not get root inode\n");
-		return -ENOMEM;
+		goto err_out;
 	}
 
 	/* instantiate and link root dentry */
+	error = -ENOMEM;
 	root = d_alloc_root(inode);
 	if (!root) {
 		pr_debug("%s: could not get root dentry!\n",__func__);
-		iput(inode);
-		return -ENOMEM;
+		goto err_out;
 	}
 	root->d_fsdata = &sysfs_root;
 	sb->s_root = root;
 	return 0;
+err_out:
+	dput(root);
+	iput(inode);
+	return error;
 }
 
 static int sysfs_get_sb(struct file_system_type *fs_type,
-- 
1.6.1.2.350.g88cc
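
The single-exit pattern used in the patch above can be tried outside the
kernel.  In this sketch model_super, model_fill_super and the fail_step
argument are all hypothetical; malloc/free stand in for the inode and
dentry allocation, and fail_step is just a fault-injection hook so both
error branches can be exercised.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

struct model_super {
	int *inode;
	int *root;
};

/*
 * fail_step (hypothetical test hook): 0 = succeed, 1 = fail the first
 * allocation, 2 = fail the second allocation.
 */
static int model_fill_super(struct model_super *sb, int fail_step)
{
	int *inode = NULL;
	int *root = NULL;
	int error;

	assert(sb);

	error = -ENOMEM;
	inode = (fail_step == 1) ? NULL : malloc(sizeof(*inode));
	if (!inode)
		goto err_out;

	error = -ENOMEM;
	root = (fail_step == 2) ? NULL : malloc(sizeof(*root));
	if (!root)
		goto err_out;

	sb->inode = inode;
	sb->root = root;
	return 0;

err_out:
	/* Both pointers start out NULL and free(NULL) is a no-op, so
	 * one label can clean up after a failure at any step. */
	free(root);
	free(inode);
	return error;
}
```

The design relies on every resource pointer being initialized to NULL
and on the release function accepting NULL, which is why the patch also
changes the declarations to "= NULL".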



* Re: [PATCH 01/20] sysfs: Implement sysfs_rename_link
  2009-05-21  0:27 ` [PATCH 01/20] sysfs: Implement sysfs_rename_link Eric W. Biederman
  2009-05-21  0:27   ` [PATCH 02/20] driver core: Use sysfs_rename_link in device_rename Eric W. Biederman
@ 2009-05-21  1:49   ` Tejun Heo
  2009-05-21  5:35   ` Tejun Heo
  2009-05-28  0:14   ` Greg KH
  3 siblings, 0 replies; 200+ messages in thread
From: Tejun Heo @ 2009-05-21  1:49 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: Andrew Morton, Greg Kroah-Hartman, linux-kernel, Cornelia Huck,
	linux-fsdevel, Benjamin Thery, Daniel Lezcano, Eric W. Biederman

Eric W. Biederman wrote:
> From: Eric W. Biederman <ebiederm@xmission.com>
> 
> Because of rename ordering problems we occasionally give false
> warnings about invalid sysfs operations, so implement a helper
> function for this common sysfs idiom.
> 
> This is a stripped down version of an earlier patch that
> also added sysfs_delete_link.
> 
> Cc: Benjamin Thery <benjamin.thery@bull.net>
> Cc: Daniel Lezcano <dlezcano@fr.ibm.com>
> Cc: Tejun Heo <tj@kernel.org>
> Signed-off-by: Eric W. Biederman <ebiederm@aristanetworks.com>

I think this has been brought up before but can you please use
--no-chain-reply-to?  It's painful to follow the deep nesting.

Thanks.

-- 
tejun


* Re: [PATCH 01/20] sysfs: Implement sysfs_rename_link
  2009-05-21  0:27 ` [PATCH 01/20] sysfs: Implement sysfs_rename_link Eric W. Biederman
  2009-05-21  0:27   ` [PATCH 02/20] driver core: Use sysfs_rename_link in device_rename Eric W. Biederman
  2009-05-21  1:49   ` [PATCH 01/20] sysfs: Implement sysfs_rename_link Tejun Heo
@ 2009-05-21  5:35   ` Tejun Heo
  2009-05-21 10:06     ` Kay Sievers
  2009-05-28  0:14   ` Greg KH
  3 siblings, 1 reply; 200+ messages in thread
From: Tejun Heo @ 2009-05-21  5:35 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: Andrew Morton, Greg Kroah-Hartman, linux-kernel, Cornelia Huck,
	linux-fsdevel, Benjamin Thery, Daniel Lezcano, Eric W. Biederman

Hello,

Eric W. Biederman wrote:
> +int sysfs_rename_link(struct kobject *kobj, struct kobject *targ,
> +			const char *old, const char *new)
> +{
> +	sysfs_remove_link(kobj, old);
> +	return sysfs_create_link(kobj, targ, new);
> +}
> +

Removal and creation are done in the reverse order compared to the one
used in device rename.  The important difference is that previously a
failed operation was a noop, whereas it would now remove the current
link.  I think the old order is correct.

Thanks.

-- 
tejun
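
The ordering concern raised in the reply above can be reproduced with a
small user-space model.  Everything here is hypothetical: model_links is
a toy name table, fail_create is a fault-injection switch, and the two
rename_* functions only mimic the two orderings under discussion.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define MODEL_MAX_LINKS 4

struct model_links {
	const char *names[MODEL_MAX_LINKS];	/* NULL marks a free slot */
	bool fail_create;			/* hypothetical fault injection */
};

static bool model_has(const struct model_links *d, const char *name)
{
	size_t i;

	for (i = 0; i < MODEL_MAX_LINKS; i++)
		if (d->names[i] && strcmp(d->names[i], name) == 0)
			return true;
	return false;
}

static void model_remove(struct model_links *d, const char *name)
{
	size_t i;

	for (i = 0; i < MODEL_MAX_LINKS; i++)
		if (d->names[i] && strcmp(d->names[i], name) == 0)
			d->names[i] = NULL;
}

static int model_create(struct model_links *d, const char *name)
{
	size_t i;

	assert(d && name);
	if (d->fail_create)
		return -ENOMEM;
	if (model_has(d, name))
		return -EEXIST;
	for (i = 0; i < MODEL_MAX_LINKS; i++) {
		if (!d->names[i]) {
			d->names[i] = name;
			return 0;
		}
	}
	return -ENOSPC;
}

/* Order used by the posted helper: remove the old name, then create. */
static int rename_remove_first(struct model_links *d, const char *old,
			       const char *new_name)
{
	model_remove(d, old);
	return model_create(d, new_name);
}

/* Order described for the old device rename code: create, then remove. */
static int rename_create_first(struct model_links *d, const char *old,
			       const char *new_name)
{
	int error = model_create(d, new_name);

	if (error)
		return error;	/* the old link is untouched */
	model_remove(d, old);
	return 0;
}
```

With fail_create set, rename_create_first() returns an error and leaves
the old name in place, while rename_remove_first() returns the same
error with neither name present.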


* Re: [PATCH 03/20] sysfs: Remove now unnecessary error reporting suppression.
  2009-05-21  0:27     ` [PATCH 03/20] sysfs: Remove now unnecessary error reporting suppression Eric W. Biederman
  2009-05-21  0:27       ` [PATCH 04/20] sysfs: Handle the general case of removing of directories with subdirectories Eric W. Biederman
@ 2009-05-21  5:37       ` Tejun Heo
  2009-05-21  6:12         ` Eric W. Biederman
  1 sibling, 1 reply; 200+ messages in thread
From: Tejun Heo @ 2009-05-21  5:37 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: Andrew Morton, Greg Kroah-Hartman, linux-kernel, Cornelia Huck,
	linux-fsdevel, Eric W. Biederman

Eric W. Biederman wrote:
> From: Eric W. Biederman <ebiederm@xmission.com>
> 
> Now that we use sysfs_rename_link in the places we previously
> used sysfs_create_link_nowarn we can remove sysfs_create_link_nowarn
> and all its supporting infrastructure.

I'm not entirely sure why implementing a rename helper means that we
don't need nowarn version anymore.  Nothing really changed or is it
that the nowarn version wasn't too necessary anyway?

Thanks.

-- 
tejun


* Re: [PATCH 03/20] sysfs: Remove now unnecessary error reporting suppression.
  2009-05-21  5:37       ` [PATCH 03/20] sysfs: Remove now unnecessary error reporting suppression Tejun Heo
@ 2009-05-21  6:12         ` Eric W. Biederman
  2009-05-21  6:20           ` Tejun Heo
  0 siblings, 1 reply; 200+ messages in thread
From: Eric W. Biederman @ 2009-05-21  6:12 UTC (permalink / raw)
  To: Tejun Heo
  Cc: Andrew Morton, Greg Kroah-Hartman, linux-kernel, Cornelia Huck,
	linux-fsdevel, Eric W. Biederman

Tejun Heo <tj@kernel.org> writes:

> Eric W. Biederman wrote:
>> From: Eric W. Biederman <ebiederm@xmission.com>
>> 
>> Now that we use sysfs_rename_link in the places we previously
>> used sysfs_create_link_nowarn we can remove sysfs_create_link_nowarn
>> and all its supporting infrastructure.
>
> I'm not entirely sure why implementing a rename helper means that we
> don't need nowarn version anymore.  Nothing really changed or is it
> that the nowarn version wasn't too necessary anyway?

nowarn was used exclusively in the hand-coded version of rename.  By
switching the order I was able to perform the operations such that even
if the operation is ultimately a noop and we attempt to recreate the
same link we won't have problems.

The two callers of device_rename are required to (and do) perform locking
to ensure the rename operation is safe.  So the exact implementation
in sysfs does not matter, although making it atomic would be
ideal.

The nowarn helpers existed because the order was backwards in
device rename, and when a noop rename happened sysfs would mistakenly
think there was a problem and complain.  I think the upper
layers suppress that case now, but for a while at least it led to
a lot of spurious warnings.

Eric


* Re: [PATCH 03/20] sysfs: Remove now unnecessary error reporting suppression.
  2009-05-21  6:12         ` Eric W. Biederman
@ 2009-05-21  6:20           ` Tejun Heo
  0 siblings, 0 replies; 200+ messages in thread
From: Tejun Heo @ 2009-05-21  6:20 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: Andrew Morton, Greg Kroah-Hartman, linux-kernel, Cornelia Huck,
	linux-fsdevel, Eric W. Biederman

Eric W. Biederman wrote:
> Tejun Heo <tj@kernel.org> writes:
> 
>> Eric W. Biederman wrote:
>>> From: Eric W. Biederman <ebiederm@xmission.com>
>>>
>>> Now that we use sysfs_rename_link in the places we previously
>>> used sysfs_create_link_nowarn we can remove sysfs_create_link_nowarn
>>> and all its supporting infrastructure.
>> I'm not entirely sure why implementing a rename helper means that we
>> don't need nowarn version anymore.  Nothing really changed or is it
>> that the nowarn version wasn't too necessary anyway?
> 
> nowarn was used exclusively in the hand-coded version of rename.  By
> switching the order I was able to perform the operations such that even
> if the operation is ultimately a noop and we attempt to recreate the
> same link we won't have problems.
> 
> The two callers of device_rename are required to (and do) perform locking
> to ensure the rename operation is safe.  So the exact implementation
> in sysfs does not matter, although making it atomic would be
> ideal.
> 
> The nowarn helpers existed because the order was backwards in
> device rename, and when a noop rename happened sysfs would mistakenly
> think there was a problem and complain.  I think the upper
> layers suppress that case now, but for a while at least it led to
> a lot of spurious warnings.

But, still, removing the original link on failure doesn't sound too
enticing.  Wouldn't it be better to detect the noop special case and
do nothing instead of swapping the order?

Thanks.

-- 
tejun
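
The alternative suggested above (detect the noop and keep the
create-then-remove order) might look roughly like this in a user-space
model.  model_dir, model_create, model_remove and model_rename_link are
hypothetical; the warnings counter stands in for the complaint sysfs
emits when asked to create a name that already exists.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

#define MODEL_SLOTS 4

struct model_dir {
	const char *names[MODEL_SLOTS];	/* NULL marks a free slot */
	int warnings;			/* counts duplicate-name complaints */
};

static int model_create(struct model_dir *d, const char *name)
{
	size_t i, free_slot = MODEL_SLOTS;

	for (i = 0; i < MODEL_SLOTS; i++) {
		if (!d->names[i]) {
			if (free_slot == MODEL_SLOTS)
				free_slot = i;
		} else if (strcmp(d->names[i], name) == 0) {
			d->warnings++;	/* models the spurious warning */
			return -EEXIST;
		}
	}
	if (free_slot == MODEL_SLOTS)
		return -ENOSPC;
	d->names[free_slot] = name;
	return 0;
}

static void model_remove(struct model_dir *d, const char *name)
{
	size_t i;

	for (i = 0; i < MODEL_SLOTS; i++)
		if (d->names[i] && strcmp(d->names[i], name) == 0)
			d->names[i] = NULL;
}

static int model_rename_link(struct model_dir *d, const char *old,
			     const char *new_name)
{
	int error;

	assert(d && old && new_name);

	if (strcmp(old, new_name) == 0)
		return 0;	/* noop rename: leave the link alone */

	error = model_create(d, new_name);
	if (error)
		return error;	/* the old link is still in place */
	model_remove(d, old);
	return 0;
}
```

In this model a same-name rename never reaches model_create(), so no
warning is counted, and a failed create still leaves the old name
behind.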


* Re: [PATCH 04/20] sysfs: Handle the general case of removing of directories with subdirectories
  2009-05-21  0:27       ` [PATCH 04/20] sysfs: Handle the general case of removing of directories with subdirectories Eric W. Biederman
  2009-05-21  0:27         ` [PATCH 05/20] sysfs: Rename sysfs_d_iput to sysfs_dentry_iput Eric W. Biederman
@ 2009-05-21  6:23         ` Tejun Heo
  2009-05-21  7:29           ` Eric W. Biederman
  1 sibling, 1 reply; 200+ messages in thread
From: Tejun Heo @ 2009-05-21  6:23 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: Andrew Morton, Greg Kroah-Hartman, linux-kernel, Cornelia Huck,
	linux-fsdevel, Eric W. Biederman

Eric W. Biederman wrote:
> From: Eric W. Biederman <ebiederm@xmission.com>
> 
> Modify sysfs to properly remove directories containing attributes and
> subdirectories.  The code is relatively simple and means we don't have
> to worry about what might use this logic.
> 
> In a quick survey I have only found /sys/dev/char and /sys/dev/block that are
> removing non-empty directories today (and they are exclusively filled with symlinks).
> So only removing empty directories does not appear to be an option.
> 
> I don't hold sysfs_mutex across the entire operation as that is unneeded
> for coherence at the sysfs level and some level of coordination is expected
> at the upper layers.
> 
> Signed-off-by: Eric W. Biederman <ebiederm@aristanetworks.com>
...
> -void sysfs_remove_subdir(struct sysfs_dirent *sd)
> -{
> -	remove_dir(sd);
> +	struct sysfs_dirent *sd = dir_sd;
> +	mutex_lock(&sysfs_mutex);
> +	while ((sysfs_type(sd) == SYSFS_DIR) && sd->s_dir.children)
> +		sd = sd->s_dir.children;
> +	if (sd != dir_sd)
> +		sysfs_get(sd);
> +	else
> +		sd = NULL;
> +	mutex_unlock(&sysfs_mutex);
> +	return sd;
>  }

Some blank lines wouldn't hurt, especially after local variable
declaration.

> -static void __sysfs_remove_dir(struct sysfs_dirent *dir_sd)
> +static void remove_dir(struct sysfs_dirent *dir_sd)
>  {
>  	struct sysfs_addrm_cxt acxt;
> -	struct sysfs_dirent **pos;
> -
> -	if (!dir_sd)
> -		return;
> +	struct sysfs_dirent *sd;
>  
>  	pr_debug("sysfs %s: removing dir\n", dir_sd->s_name);
> -	sysfs_addrm_start(&acxt, dir_sd);
> -	pos = &dir_sd->s_dir.children;
> -	while (*pos) {
> -		struct sysfs_dirent *sd = *pos;
>  
> -		if (sysfs_type(sd) != SYSFS_DIR)
> -			sysfs_remove_one(&acxt, sd);
> -		else
> -			pos = &(*pos)->s_sibling;
> +	while ((sd = sysfs_get_one(dir_sd))) {
> +		sysfs_addrm_start(&acxt, sd->s_parent);
> +		sysfs_remove_one(&acxt, sd);
> +		sysfs_addrm_finish(&acxt);
> +		sysfs_put(sd);
>  	}
> +	sysfs_addrm_start(&acxt, dir_sd->s_parent);
> +	sysfs_remove_one(&acxt, dir_sd);
>  	sysfs_addrm_finish(&acxt);
> +}

I agree we should be heading this way but what happens to attributes
or directories living below the subdirectories?  If it's gonna handle
the recursive case, I think it had better do it properly.  I had patches of
similar effect.

 http://article.gmane.org/gmane.linux.kernel/582151
 http://article.gmane.org/gmane.linux.kernel/582155

The patchset didn't really go anywhere but the recursive atomic
removal should be usable.

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 200+ messages in thread

* Re: [PATCH 05/20] sysfs: Rename sysfs_d_iput to sysfs_dentry_iput
  2009-05-21  0:27         ` [PATCH 05/20] sysfs: Rename sysfs_d_iput to sysfs_dentry_iput Eric W. Biederman
  2009-05-21  0:28           ` [PATCH 06/20] sysfs: Use dentry_ops instead of directly playing with the dcache Eric W. Biederman
@ 2009-05-21  6:24           ` Tejun Heo
  1 sibling, 0 replies; 200+ messages in thread
From: Tejun Heo @ 2009-05-21  6:24 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: Andrew Morton, Greg Kroah-Hartman, linux-kernel, Cornelia Huck,
	linux-fsdevel, Eric W. Biederman

Eric W. Biederman wrote:
> From: Eric W. Biederman <ebiederm@xmission.com>
> 
> Using dentry instead of d in the function name is what
> several other filesystems are doing and it seems to be
> a more readable convention.
> 
> Signed-off-by: Eric W. Biederman <ebiederm@aristanetworks.com>

Acked-by: Tejun Heo <tj@kernel.org>

-- 
tejun

^ permalink raw reply	[flat|nested] 200+ messages in thread

* Re: [PATCH 06/20] sysfs: Use dentry_ops instead of directly playing with the dcache
  2009-05-21  0:28           ` [PATCH 06/20] sysfs: Use dentry_ops instead of directly playing with the dcache Eric W. Biederman
  2009-05-21  0:28             ` [PATCH 07/20] sysfs: Simplify sysfs_chmod_file semantics Eric W. Biederman
@ 2009-05-21  6:41             ` Tejun Heo
  2009-05-21  7:37               ` Eric W. Biederman
  1 sibling, 1 reply; 200+ messages in thread
From: Tejun Heo @ 2009-05-21  6:41 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: Andrew Morton, Greg Kroah-Hartman, linux-kernel, Cornelia Huck,
	linux-fsdevel, Eric W. Biederman

Hello,

Eric W. Biederman wrote:
> Calling d_drop unconditionally when a sysfs_dirent is deleted has
> the potential to leak mounts, so instead implement dentry delete
> and revalidate operations that cause sysfs dentries to be removed
> at the appropriate time.
> 
> Signed-off-by: Eric W. Biederman <ebiederm@aristanetworks.com>

Great, thanks for doing this.  It's much better than the fragile vfs
tinkering sysfs has been doing.

> +static int sysfs_dentry_revalidate(struct dentry *dentry, struct nameidata *nd)
> +{
> +	struct sysfs_dirent *sd = dentry->d_fsdata;
> +	int is_dir;
> +
> +	mutex_lock(&sysfs_mutex);
> +
> +	/* The sysfs dirent has been deleted */
> +	if (sd->s_flags & SYSFS_FLAG_REMOVED)
> +		goto out_bad;
> +
> +	mutex_unlock(&sysfs_mutex);
> +out_valid:
> +	return 1;
> +out_bad:
> +	/* Remove the dentry from the dcache hashes.
> +	 * If this is a deleted dentry we use d_drop instead of d_delete
> +	 * so sysfs doesn't need to cope with negative dentries.
> +	 */
> +	is_dir = (sysfs_type(sd) == SYSFS_DIR);
> +	mutex_unlock(&sysfs_mutex);
> +	if (is_dir) {
> +		/* If we have submounts we must allow the vfs caches
> +		 * to lie about the state of the filesystem to prevent
> +		 * leaks and other nasty things.
> +		 */
> +		if (have_submounts(dentry))
> +			goto out_valid;
> +		shrink_dcache_parent(dentry);
> +	}
> +	d_drop(dentry);
> +	return 0;
> +}

Ummm... what happens if sysfs recreates those identical nodes again
while the old dentries are lingering?  The dead ones will linger till
the submounts are gone and then look ups after that will show the new
ones, right?

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 200+ messages in thread

* Re: [PATCH 07/20] sysfs: Simplify sysfs_chmod_file semantics
  2009-05-21  0:28             ` [PATCH 07/20] sysfs: Simplify sysfs_chmod_file semantics Eric W. Biederman
  2009-05-21  0:28               ` [PATCH 08/20] sysfs: Optimize just changing the sysfs file mode Eric W. Biederman
@ 2009-05-21  6:42               ` Tejun Heo
  1 sibling, 0 replies; 200+ messages in thread
From: Tejun Heo @ 2009-05-21  6:42 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: Andrew Morton, Greg Kroah-Hartman, linux-kernel, Cornelia Huck,
	linux-fsdevel, Eric W. Biederman

Eric W. Biederman wrote:
> From: Eric W. Biederman <ebiederm@xmission.com>
> 
> Currently every caller of sysfs_chmod_file happens at either
> file creation time to set a non-default mode or in response
> to a specific user space requested change in policy.  That makes
> timestamps of when the chmod happens and notification of
> a file changing mode uninteresting.
> 
> Remove the unnecessary time stamp and filesystem change
> notification, and remove the last of the explicit inotify
> and dnotify support from sysfs.
> 
> Signed-off-by: Eric W. Biederman <ebiederm@aristanetworks.com>

If Greg is okay with this,

Acked-by: Tejun Heo <tj@kernel.org>

-- 
tejun

^ permalink raw reply	[flat|nested] 200+ messages in thread

* Re: [PATCH 08/20] sysfs: Optimize just changing the sysfs file mode.
  2009-05-21  0:28               ` [PATCH 08/20] sysfs: Optimize just changing the sysfs file mode Eric W. Biederman
  2009-05-21  0:28                 ` [PATCH 09/20] sysfs: Simplify iattr assignments Eric W. Biederman
@ 2009-05-21  7:29                 ` Tejun Heo
  2009-05-21  7:54                   ` Eric W. Biederman
  1 sibling, 1 reply; 200+ messages in thread
From: Tejun Heo @ 2009-05-21  7:29 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: Andrew Morton, Greg Kroah-Hartman, linux-kernel, Cornelia Huck,
	linux-fsdevel, Eric W. Biederman

Eric W. Biederman wrote:
> From: Eric W. Biederman <ebiederm@xmission.com>
> 
> Don't allocate a struct iattr for the sysfs dentry if just
> the mode changes because we have a field for that on the
> sysfs_dirent, and we can trigger that case with sysfs_chmod_file.
> 
> Signed-off-by: Eric W. Biederman <ebiederm@aristanetworks.com>
> ---
>  fs/sysfs/inode.c |   22 ++++++++++++++--------
>  1 files changed, 14 insertions(+), 8 deletions(-)
> 
> diff --git a/fs/sysfs/inode.c b/fs/sysfs/inode.c
> index 555f0ff..70ff2a2 100644
> --- a/fs/sysfs/inode.c
> +++ b/fs/sysfs/inode.c
> @@ -60,12 +60,16 @@ int sysfs_setattr(struct dentry * dentry, struct iattr * iattr)
>  		return error;
>  
>  	iattr->ia_valid &= ~ATTR_SIZE; /* ignore size changes */
> +	if (iattr->ia_valid & ATTR_MODE) {
> +		if (!in_group_p(inode->i_gid) && !capable(CAP_FSETID))
> +			iattr->ia_mode &= ~S_ISGID;
> +	}
>  
>  	error = inode_setattr(inode, iattr);
>  	if (error)
>  		return error;
>  
> -	if (!sd_iattr) {
> +	if (!sd_iattr && (ia_valid & ~ATTR_MODE)) {
>  		/* setting attributes for the first time, allocate now */
>  		sd_iattr = kzalloc(sizeof(struct iattr), GFP_KERNEL);
>  		if (!sd_iattr)
> @@ -78,6 +82,13 @@ int sysfs_setattr(struct dentry * dentry, struct iattr * iattr)
>  		sd->s_iattr = sd_iattr;
>  	}
>  
> +	if (ia_valid & ATTR_MODE)
> +		sd->s_mode = iattr->ia_mode;
> +
> +	/* If we don't need the extra attributes leave */
> +	if (!sd_iattr)
> +		return 0;

One visible difference is lack of timestamp update.  Is there any use
case where sysfs file mode changing needs to be fast?

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 200+ messages in thread

* Re: [PATCH 04/20] sysfs: Handle the general case of removing of directories with subdirectories
  2009-05-21  6:23         ` [PATCH 04/20] sysfs: Handle the general case of removing of directories with subdirectories Tejun Heo
@ 2009-05-21  7:29           ` Eric W. Biederman
  2009-05-21  7:36             ` Tejun Heo
  0 siblings, 1 reply; 200+ messages in thread
From: Eric W. Biederman @ 2009-05-21  7:29 UTC (permalink / raw)
  To: Tejun Heo
  Cc: Andrew Morton, Greg Kroah-Hartman, linux-kernel, Cornelia Huck,
	linux-fsdevel, Eric W. Biederman

Tejun Heo <tj@kernel.org> writes:

> Eric W. Biederman wrote:
>> From: Eric W. Biederman <ebiederm@xmission.com>
>> 
>> Modify sysfs to properly remove directories containing attributes and
>> subdirectories.  The code is relatively simple and means we don't have
>> to worry about what might use this logic.
>> 
>> In a quick survey I have only found /sys/dev/char and /sys/dev/block that are
>> removing non-empty directories today (and they are exclusively filled with symlinks).
>> So only removing empty directories does not appear to be an option.
>> 
>> I don't hold sysfs_mutex across the entire operation as that is unneeded
>> for coherence at the sysfs level and some level of coordination is expected
>> at the upper layers.
>> 
>> Signed-off-by: Eric W. Biederman <ebiederm@aristanetworks.com>
> ...
>> -void sysfs_remove_subdir(struct sysfs_dirent *sd)
>> -{
>> -	remove_dir(sd);
>> +	struct sysfs_dirent *sd = dir_sd;
>> +	mutex_lock(&sysfs_mutex);
>> +	while ((sysfs_type(sd) == SYSFS_DIR) && sd->s_dir.children)
>> +		sd = sd->s_dir.children;
>> +	if (sd != dir_sd)
>> +		sysfs_get(sd);
>> +	else
>> +		sd = NULL;
>> +	mutex_unlock(&sysfs_mutex);
>> +	return sd;
>>  }
>
> Some blank lines wouldn't hurt, especially after local variable
> declaration.
>
>> -static void __sysfs_remove_dir(struct sysfs_dirent *dir_sd)
>> +static void remove_dir(struct sysfs_dirent *dir_sd)
>>  {
>>  	struct sysfs_addrm_cxt acxt;
>> -	struct sysfs_dirent **pos;
>> -
>> -	if (!dir_sd)
>> -		return;
>> +	struct sysfs_dirent *sd;
>>  
>>  	pr_debug("sysfs %s: removing dir\n", dir_sd->s_name);
>> -	sysfs_addrm_start(&acxt, dir_sd);
>> -	pos = &dir_sd->s_dir.children;
>> -	while (*pos) {
>> -		struct sysfs_dirent *sd = *pos;
>>  
>> -		if (sysfs_type(sd) != SYSFS_DIR)
>> -			sysfs_remove_one(&acxt, sd);
>> -		else
>> -			pos = &(*pos)->s_sibling;
>> +	while ((sd = sysfs_get_one(dir_sd))) {
>> +		sysfs_addrm_start(&acxt, sd->s_parent);
>> +		sysfs_remove_one(&acxt, sd);
>> +		sysfs_addrm_finish(&acxt);
>> +		sysfs_put(sd);
>>  	}
>> +	sysfs_addrm_start(&acxt, dir_sd->s_parent);
>> +	sysfs_remove_one(&acxt, dir_sd);
>>  	sysfs_addrm_finish(&acxt);
>> +}
>
> I agree we should be heading this way but what happens to attributes
> or directories living below the subdirectories?  If it's gonna handle
> recursive case, I think it better do it properly.  I had patches of
> similar effect.

I do handle it properly.  sysfs_get_one finds the deepest child of the
first directory entry.  Then I remove it.  And I repeat until done.
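
The loop described above can be sketched in plain userspace C.  This is
only an illustration under simplifying assumptions: struct node and the
helpers node_add, get_one, remove_one and remove_below are hypothetical
stand-ins for sysfs_dirent and the sysfs helpers, and all locking and
reference counting is left out.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-in for a sysfs_dirent (hypothetical, not the kernel struct). */
struct node {
	struct node *parent;
	struct node *children;	/* head of this directory's child list */
	struct node *sibling;	/* next entry in the parent's child list */
	int is_dir;
	int removed;
};

/* Link child at the head of dir's child list. */
static void node_add(struct node *dir, struct node *child)
{
	child->parent = dir;
	child->sibling = dir->children;
	dir->children = child;
}

/* Follow first-child pointers down to the deepest entry; NULL if dir is empty. */
static struct node *get_one(struct node *dir)
{
	struct node *n = dir;

	while (n->is_dir && n->children)
		n = n->children;

	return n != dir ? n : NULL;
}

/* Unlink n from its parent's child list and mark it removed. */
static void remove_one(struct node *n)
{
	struct node **pos = &n->parent->children;

	while (*pos != n)
		pos = &(*pos)->sibling;
	*pos = n->sibling;
	n->sibling = NULL;
	n->removed = 1;
}

/* Remove everything below dir, one leaf at a time; returns how many went. */
static int remove_below(struct node *dir)
{
	struct node *n;
	int count = 0;

	while ((n = get_one(dir))) {
		remove_one(n);
		count++;
	}
	return count;
}
```

Each pass only ever removes an entry that has no children left, so a
subdirectory goes away right after its last child does.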

The locking is correct, something that is much more difficult to
tell with your version.

By grabbing and dropping the sysfs_mutex things are simpler, and they
get even simpler in future patches.



Now looking at that code in detail, there is a question of what happens if
we add a directory entry while we are recursively deleting a directory.
Neither your patch, my patch, nor the existing code handles that case
(assuming the sysfs_dirent was looked up before it is removed from its
parent directory).  I expect another patch is called for to plug that
theoretical gap.

I expect the way to close that hole is to have an extra flag that says
we are removing a directory entry and refuse to add if that flag is
set.
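
A rough userspace sketch of that idea, assuming a single lock protects
the tree.  DIR_FLAG_REMOVED, struct dir, tree_lock, dir_add_child and
dir_remove are made-up names for illustration, not existing sysfs
symbols, and the -ENOENT return is likewise an assumption.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>

#define DIR_FLAG_REMOVED 0x1	/* hypothetical flag, only touched under tree_lock */

static pthread_mutex_t tree_lock = PTHREAD_MUTEX_INITIALIZER;

struct dir {
	unsigned int flags;
	int nr_children;
};

/* Add one child unless the parent has already been marked removed. */
static int dir_add_child(struct dir *parent)
{
	int ret = 0;

	pthread_mutex_lock(&tree_lock);
	if (parent->flags & DIR_FLAG_REMOVED)
		ret = -ENOENT;	/* parent was looked up earlier, removed since */
	else
		parent->nr_children++;
	pthread_mutex_unlock(&tree_lock);

	return ret;
}

/* Mark the directory removed first, so later adds fail, then empty it. */
static void dir_remove(struct dir *d)
{
	pthread_mutex_lock(&tree_lock);
	d->flags |= DIR_FLAG_REMOVED;
	d->nr_children = 0;
	pthread_mutex_unlock(&tree_lock);
}
```

The check has to happen under the same lock that removal takes, since
the caller may have looked up the parent well before taking it.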

I would prefer to only remove empty directories.  But when I
instrumented things up I found cases where that does indeed happen.

Eric

^ permalink raw reply	[flat|nested] 200+ messages in thread

* Re: [PATCH 09/20] sysfs: Simplify iattr assignments
  2009-05-21  0:28                 ` [PATCH 09/20] sysfs: Simplify iattr assignments Eric W. Biederman
  2009-05-21  0:28                   ` [PATCH 10/20] sysfs: Fix locking and factor out sysfs_sd_setattr Eric W. Biederman
@ 2009-05-21  7:31                   ` Tejun Heo
  1 sibling, 0 replies; 200+ messages in thread
From: Tejun Heo @ 2009-05-21  7:31 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: Andrew Morton, Greg Kroah-Hartman, linux-kernel, Cornelia Huck,
	linux-fsdevel, Eric W. Biederman

Eric W. Biederman wrote:
> From: Eric W. Biederman <ebiederm@xmission.com>
> 
> The granularity of sysfs time when we keep it is 1 ns.  Which
> when passed to timestamp_trunc results in a nop.  So remove
> the unnecessary function call making sysfs_setattr slightly
> easier to read.
> 
> Signed-off-by: Eric W. Biederman <ebiederm@aristanetworks.com>

Acked-by: Tejun Heo <tj@kernel.org>

-- 
tejun

^ permalink raw reply	[flat|nested] 200+ messages in thread

* Re: [PATCH 04/20] sysfs: Handle the general case of removing of directories with subdirectories
  2009-05-21  7:29           ` Eric W. Biederman
@ 2009-05-21  7:36             ` Tejun Heo
  2009-05-21  8:04               ` Eric W. Biederman
  0 siblings, 1 reply; 200+ messages in thread
From: Tejun Heo @ 2009-05-21  7:36 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: Andrew Morton, Greg Kroah-Hartman, linux-kernel, Cornelia Huck,
	linux-fsdevel, Eric W. Biederman

Eric W. Biederman wrote:
>> I agree we should be heading this way but what happens to attributes
>> or directories living below the subdirectories?  If it's gonna handle
>> recursive case, I think it better do it properly.  I had patches of
>> similar effect.
> 
> I do handle it properly.  sysfs_get_one finds the deepest child of the
> first directory entry.  Then I remove it.  And I repeat until done.
> 
> The locking is correct, something that is much more difficult to
> tell with your version.

Why? :-)

> By grabbing and dropping the sysfs_mutex things are simpler, and they
> get even simpler in future patches.
> 
> Now looking at that code in detail there is a question of what happens if
> we add a directory entry while we are recursively deleting a directory.
> Neither your patch, my patch, nor the existing code handle that case
> (assuming the sysfs_dirent) was looked up before it is removed from it's
> parent directory.  I expect another patch is called for to plug that
> theoretical gap.  
> 
> I expect the way to close that hole is to have an extra flag that says
> we are removing a directory entry and refuse to add if that flag is
> set.
> 
> I would prefer to only remove empty directories.  But when I
> instrumented things up I found cases where that does indeed happen.

IIRC, my version did the whole thing while holding sysfs_mutex, so
it's safe against such races.  I can't really see why ops like this
can't be atomic in sysfs.  I don't really care how things are done but
please make it atomic.

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 200+ messages in thread

* Re: [PATCH 06/20] sysfs: Use dentry_ops instead of directly playing with the dcache
  2009-05-21  6:41             ` [PATCH 06/20] sysfs: Use dentry_ops instead of directly playing with the dcache Tejun Heo
@ 2009-05-21  7:37               ` Eric W. Biederman
  2009-05-21  7:40                 ` Tejun Heo
  0 siblings, 1 reply; 200+ messages in thread
From: Eric W. Biederman @ 2009-05-21  7:37 UTC (permalink / raw)
  To: Tejun Heo
  Cc: Andrew Morton, Greg Kroah-Hartman, linux-kernel, Cornelia Huck,
	linux-fsdevel, Eric W. Biederman

Tejun Heo <tj@kernel.org> writes:

>> +static int sysfs_dentry_revalidate(struct dentry *dentry, struct nameidata *nd)
>> +{
>> +	struct sysfs_dirent *sd = dentry->d_fsdata;
>> +	int is_dir;
>> +
>> +	mutex_lock(&sysfs_mutex);
>> +
>> +	/* The sysfs dirent has been deleted */
>> +	if (sd->s_flags & SYSFS_FLAG_REMOVED)
>> +		goto out_bad;
>> +
>> +	mutex_unlock(&sysfs_mutex);
>> +out_valid:
>> +	return 1;
>> +out_bad:
>> +	/* Remove the dentry from the dcache hashes.
>> +	 * If this is a deleted dentry we use d_drop instead of d_delete
>> +	 * so sysfs doesn't need to cope with negative dentries.
>> +	 */
>> +	is_dir = (sysfs_type(sd) == SYSFS_DIR);
>> +	mutex_unlock(&sysfs_mutex);
>> +	if (is_dir) {
>> +		/* If we have submounts we must allow the vfs caches
>> +		 * to lie about the state of the filesystem to prevent
>> +		 * leaks and other nasty things.
>> +		 */
>> +		if (have_submounts(dentry))
>> +			goto out_valid;
>> +		shrink_dcache_parent(dentry);
>> +	}
>> +	d_drop(dentry);
>> +	return 0;
>> +}
>
> Ummm... what happens if sysfs recreates those identical nodes again
> while the old dentries are lingering?  The dead ones will linger till
> the submounts are gone and then look ups after that will show the new
> ones, right?

Yep.  On the vfs level.  The sysfs dirent tree will reflect what is
going on with the hardware.

This is a vfs misfeature, that I hope someday we will get fixed.
But for now it is better not to leak mount points.  Especially
since no one actually mounts things on sysfs.
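
Stripped of the locking and the dcache calls, the decision the quoted
revalidate function makes can be sketched as below.  struct entry_state
and revalidate are illustrative names only, and the three booleans stand
in for SYSFS_FLAG_REMOVED, the SYSFS_DIR type check and have_submounts().

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical summary of what the revalidate hook looks at. */
struct entry_state {
	bool removed;		/* backing sysfs entry has been deleted */
	bool is_dir;
	bool has_submounts;	/* something is mounted at or below it */
};

/* Returns 1 to keep the cached dentry, 0 to drop it. */
static int revalidate(const struct entry_state *s)
{
	if (!s->removed)
		return 1;
	if (s->is_dir && s->has_submounts)
		return 1;	/* stale, but kept so the mount is not leaked */
	return 0;
}
```

Only the removed directory with something mounted under it stays
visible at the vfs level; every other removed entry is dropped.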

Eric

^ permalink raw reply	[flat|nested] 200+ messages in thread

* Re: [PATCH 06/20] sysfs: Use dentry_ops instead of directly playing with the dcache
  2009-05-21  7:37               ` Eric W. Biederman
@ 2009-05-21  7:40                 ` Tejun Heo
  0 siblings, 0 replies; 200+ messages in thread
From: Tejun Heo @ 2009-05-21  7:40 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: Andrew Morton, Greg Kroah-Hartman, linux-kernel, Cornelia Huck,
	linux-fsdevel, Eric W. Biederman

Eric W. Biederman wrote:
>> Ummm... what happens if sysfs recreates those identical nodes again
>> while the old dentries are lingering?  The dead ones will linger till
>> the submounts are gone and then look ups after that will show the new
>> ones, right?
> 
> Yep.  On the vfs level.  The sysfs dirent tree will reflect what is
> going on with the hardware.
> 
> This is a vfs misfeature, that I hope someday we will get fixed.
> But for now it is better not to leak mount points.  Especially
> since no one actually mounts things on sysfs.

fuse and debugfs do.  :-P

Acked-by: Tejun Heo <tj@kernel.org>

-- 
tejun

^ permalink raw reply	[flat|nested] 200+ messages in thread

* Re: [PATCH 10/20] sysfs: Fix locking and factor out sysfs_sd_setattr
  2009-05-21  0:28                   ` [PATCH 10/20] sysfs: Fix locking and factor out sysfs_sd_setattr Eric W. Biederman
  2009-05-21  0:28                     ` [PATCH 11/20] sysfs: Update s_iattr on link and unlink Eric W. Biederman
@ 2009-05-21  7:42                     ` Tejun Heo
  1 sibling, 0 replies; 200+ messages in thread
From: Tejun Heo @ 2009-05-21  7:42 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: Andrew Morton, Greg Kroah-Hartman, linux-kernel, Cornelia Huck,
	linux-fsdevel, Eric W. Biederman

Eric W. Biederman wrote:
> From: Eric W. Biederman <ebiederm@xmission.com>
> 
> Cleanly separate the work that is specific to setting the
> attributes of a sysfs_dirent from what is needed to update
> the attributes of a vfs inode.
> 
> Additionally grab the sysfs_mutex to keep any nasties from
> surprising us when updating the sysfs_dirent.
> 
> Signed-off-by: Eric W. Biederman <ebiederm@aristanetworks.com>

Acked-by: Tejun Heo <tj@kernel.org>

-- 
tejun

^ permalink raw reply	[flat|nested] 200+ messages in thread

* Re: [PATCH 12/20] sysfs: Nicely indent sysfs_symlink_inode_operations
  2009-05-21  0:28                       ` [PATCH 12/20] sysfs: Nicely indent sysfs_symlink_inode_operations Eric W. Biederman
  2009-05-21  0:28                         ` [PATCH 13/20] sysfs: Implement sysfs_getattr & sysfs_permission Eric W. Biederman
@ 2009-05-21  7:42                         ` Tejun Heo
  1 sibling, 0 replies; 200+ messages in thread
From: Tejun Heo @ 2009-05-21  7:42 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: Andrew Morton, Greg Kroah-Hartman, linux-kernel, Cornelia Huck,
	linux-fsdevel, Eric W. Biederman

Eric W. Biederman wrote:
> From: Eric W. Biederman <ebiederm@xmission.com>
> 
> Lining up the functions in sysfs_symlink_inode_operations
> follows the pattern in the rest of sysfs and makes things
> slightly more readable.
> 
> Signed-off-by: Eric W. Biederman <ebiederm@aristanetworks.com>

Acked-by: Tejun Heo <tj@kernel.org>

-- 
tejun

^ permalink raw reply	[flat|nested] 200+ messages in thread

* Re: [PATCH 08/20] sysfs: Optimize just changing the sysfs file mode.
  2009-05-21  7:29                 ` [PATCH 08/20] sysfs: Optimize just changing the sysfs file mode Tejun Heo
@ 2009-05-21  7:54                   ` Eric W. Biederman
  2009-05-21  8:41                     ` Tejun Heo
  0 siblings, 1 reply; 200+ messages in thread
From: Eric W. Biederman @ 2009-05-21  7:54 UTC (permalink / raw)
  To: Tejun Heo
  Cc: Andrew Morton, Greg Kroah-Hartman, linux-kernel, Cornelia Huck,
	linux-fsdevel, Eric W. Biederman

Tejun Heo <tj@kernel.org> writes:

> Eric W. Biederman wrote:
>> From: Eric W. Biederman <ebiederm@xmission.com>
>> 
>> Don't allocate a struct iattr for the sysfs dentry if just
>> the mode changes because we have a field for that on the
>> sysfs_dirent, and we can trigger that case with sysfs_chmod_file.
>> 
>> Signed-off-by: Eric W. Biederman <ebiederm@aristanetworks.com>
>> ---
>>  fs/sysfs/inode.c |   22 ++++++++++++++--------
>>  1 files changed, 14 insertions(+), 8 deletions(-)
>> 
>> diff --git a/fs/sysfs/inode.c b/fs/sysfs/inode.c
>> index 555f0ff..70ff2a2 100644
>> --- a/fs/sysfs/inode.c
>> +++ b/fs/sysfs/inode.c
>> @@ -60,12 +60,16 @@ int sysfs_setattr(struct dentry * dentry, struct iattr * iattr)
>>  		return error;
>>  
>>  	iattr->ia_valid &= ~ATTR_SIZE; /* ignore size changes */
>> +	if (iattr->ia_valid & ATTR_MODE) {
>> +		if (!in_group_p(inode->i_gid) && !capable(CAP_FSETID))
>> +			iattr->ia_mode &= ~S_ISGID;
>> +	}
>>  
>>  	error = inode_setattr(inode, iattr);
>>  	if (error)
>>  		return error;
>>  
>> -	if (!sd_iattr) {
>> +	if (!sd_iattr && (ia_valid & ~ATTR_MODE)) {
>>  		/* setting attributes for the first time, allocate now */
>>  		sd_iattr = kzalloc(sizeof(struct iattr), GFP_KERNEL);
>>  		if (!sd_iattr)
>> @@ -78,6 +82,13 @@ int sysfs_setattr(struct dentry * dentry, struct iattr * iattr)
>>  		sd->s_iattr = sd_iattr;
>>  	}
>>  
>> +	if (ia_valid & ATTR_MODE)
>> +		sd->s_mode = iattr->ia_mode;
>> +
>> +	/* If we don't need the extra attributes leave */
>> +	if (!sd_iattr)
>> +		return 0;
>
> One visible difference is lack of timestamp update.  Is there any use
> case where sysfs file mode changing needs to be fast?

Not really.  If the time changes we set something besides ATTR_MODE
like ATTR_MTIME or ATTR_CTIME.  If we come in through any of the
user space entry points ATTR_CTIME appears to be set so this optimization
will not trigger.

I think there are cases where we only opportunistically track time
changes, when the structure is already allocated, and this patch
changes those, but that is a very small percentage of the time.

The practical effect of my changes should be that we only track timestamps
when user space actually performs an explicit change to the file.

If someone was depending on some weird indirect side effect like that
on one of the 5-6 files that call sysfs_chmod_file, let's make it explicit.

For me this isn't about making this go faster.  This is about keeping
the sysfs data structures small when we can.
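
As a userspace sketch of the size argument: the extra attribute block
is only allocated once something other than the mode is set.  struct
entry, struct extra_attrs and entry_setattr are hypothetical names, and
the ATTR_* values here are illustrative rather than the kernel's.

```c
#include <assert.h>
#include <stdlib.h>

#define ATTR_MODE  0x1	/* illustrative values, not taken from the kernel */
#define ATTR_MTIME 0x2

struct extra_attrs {
	long mtime;
};

struct entry {
	unsigned int mode;		/* always present, like sd->s_mode */
	struct extra_attrs *extra;	/* NULL until something beyond mode is set */
};

/* Apply the attributes named in valid; returns 0 on success, -1 on failure. */
static int entry_setattr(struct entry *e, unsigned int valid,
			 unsigned int mode, long mtime)
{
	/* Allocate the extra block only for changes other than the mode. */
	if (!e->extra && (valid & ~ATTR_MODE)) {
		e->extra = calloc(1, sizeof(*e->extra));
		if (!e->extra)
			return -1;
	}

	if (valid & ATTR_MODE)
		e->mode = mode;

	/* If we don't need the extra attributes, leave. */
	if (!e->extra)
		return 0;

	if (valid & ATTR_MTIME)
		e->extra->mtime = mtime;

	return 0;
}
```

A mode-only change leaves extra at NULL, so entries that are only ever
chmod'ed never pay for the larger structure.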

It doesn't really complicate the code and we wind up doing the obvious thing.

Eric

^ permalink raw reply	[flat|nested] 200+ messages in thread

* Re: [PATCH 04/20] sysfs: Handle the general case of removing of directories with subdirectories
  2009-05-21  7:36             ` Tejun Heo
@ 2009-05-21  8:04               ` Eric W. Biederman
  2009-05-21  8:37                 ` Tejun Heo
  0 siblings, 1 reply; 200+ messages in thread
From: Eric W. Biederman @ 2009-05-21  8:04 UTC (permalink / raw)
  To: Tejun Heo
  Cc: Andrew Morton, Greg Kroah-Hartman, linux-kernel, Cornelia Huck,
	linux-fsdevel, Eric W. Biederman

Tejun Heo <tj@kernel.org> writes:

> Eric W. Biederman wrote:
>>> I agree we should be heading this way but what happens to attributes
>>> or directories living below the subdirectories?  If it's gonna handle
>>> recursive case, I think it better do it properly.  I had patches of
>>> similar effect.
>> 
>> I do handle it properly.  sysfs_get_one finds the deepest child of the
>> first directory entry.  Then I remove it.  And I repeat until done.
>> 
>> The locking is correct, something that is much more difficult to
>> tell with your version.
>
> Why? :-)

Because mine is all in a single place and there is no optimization
to get locks I don't need.

Unless I have misread your patch you are failing to get the
i_mutex for child directories, if it is possible to get it.

That is something it is trivial to see I always do correctly,
simply because the distance between the lock and where I depend on
it is so small.

>> By grabbing and dropping the sysfs_mutex things are simpler, and they
>> get even simpler in future patches.
>> 
>> Now looking at that code in detail there is a question of what happens if
>> we add a directory entry while we are recursively deleting a directory.
>> Neither your patch, my patch, nor the existing code handle that case
>> (assuming the sysfs_dirent) was looked up before it is removed from it's
>> parent directory.  I expect another patch is called for to plug that
>> theoretical gap.  
>> 
>> I expect the way to close that hole is to have an extra flag that says
>> we are removing a directory entry and refuse to add if that flag is
>> set.
>> 
>> I would prefer to only remove empty directories.  But when I
>> instrumented things up I found cases where that does indeed happen.
>
> IIRC, my version did the whole thing while holding sysfs_mutex, so
> it's safe against such races.  I can't really see why ops like this
> can't be atomic in sysfs.  I don't really care how things are done but
> please make it atomic.

Nope.  Holding the sysfs_mutex does not make you safe from such races.
It actually makes you more prone to someone adding a directory entry to
a deleted directory and not having it deleted.  I have a chance of
deleting the added directory entry.

The problem is that sysfs_add_one takes two sysfs_dirents.  The look up
of the directory is done before we take the sysfs_mutex.  So the
sysfs_dirent could be grabbed at any time.

Eric

^ permalink raw reply	[flat|nested] 200+ messages in thread

* Re: [PATCH 04/20] sysfs: Handle the general case of removing of directories with subdirectories
  2009-05-21  8:04               ` Eric W. Biederman
@ 2009-05-21  8:37                 ` Tejun Heo
  2009-05-21  9:18                   ` Eric W. Biederman
  0 siblings, 1 reply; 200+ messages in thread
From: Tejun Heo @ 2009-05-21  8:37 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: Andrew Morton, Greg Kroah-Hartman, linux-kernel, Cornelia Huck,
	linux-fsdevel, Eric W. Biederman

Hi,

Eric W. Biederman wrote:
>>> The locking is correct, something that is much more difficult to
>>> tell with your version.
>> Why? :-)
> 
> Because mine is all in a single place and there is no optimization
> to get locks I don't need.
> 
> Unless I have misread your patch you are failing to get the
> i_mutex for child directories, if it possible to get it.
> 
> Something that it is trivial to see that I always do correctly.
> Simply because the distance between the lock and where I depend on
> it is so small.

If this patch series works out, we don't need to grab i_mutexes while
manipulating sd's, right?

>>> I would prefer to only remove empty directories.  But when I
>>> instrumented things up I found cases where that does indeed happen.
>> IIRC, my version did the whole thing while holding sysfs_mutex, so
>> it's safe against such races.  I can't really see why ops like this
>> can't be atomic in sysfs.  I don't really care how things are done but
>> please make it atomic.
> 
> Nope.  Holding the sysfs_mutex does not make you safe from such races.
> It actually makes you more prone to someone adding a directory entry to
> a deleted directory and not having it deleted.  I have a chance of
> deleting the added directory entry.
> 
> The problem is that sysfs_add_one takes to sysfs_dirents.  The look up
> of the directory is done before we take the sysfs_mutex.  So the
> sysfs_dirent could be grabbed at any time.

Well, it can be trivially fixed by checking the removed flag.  The
add/rm thing is designed to help additions and removals of multiple
nodes at one go and I'd really like to see it working that way.  Any
chance you can change code toward that direction?

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 200+ messages in thread

* Re: [PATCH 08/20] sysfs: Optimize just changing the sysfs file mode.
  2009-05-21  7:54                   ` Eric W. Biederman
@ 2009-05-21  8:41                     ` Tejun Heo
  0 siblings, 0 replies; 200+ messages in thread
From: Tejun Heo @ 2009-05-21  8:41 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: Andrew Morton, Greg Kroah-Hartman, linux-kernel, Cornelia Huck,
	linux-fsdevel, Eric W. Biederman

Hello,

Eric W. Biederman wrote:
>> One visible difference is lack of timestamp update.  Is there any use
>> case where sysfs file mode changing needs to be fast?
> 
> Not really.  If the time changes we set something besides ATTR_MODE
> like ATTR_MTIME or ATTR_CTIME.  If we come in through any of the
> user space entry points ATTR_CTIME appears to be set so this optimization
> will not trigger.
> 
> I think there are cases where we only opportunistically track time
> changes, when the structure is allocated that this changes but it
> is a very small percentage of the time.
> 
> The practical effect of my changes should be that we only track timestamps
> when user space actually performs an explicit change to the file.
> 
> If someone was depending on some weird indirect side effect like that
> on one of the 5-6 files that calls sysfs_chmod let's make it explicit.
> 
> For me this isn't about making this go faster.  This is about keeping
> the sysfs data structures small when we can.
> 
> It doesn't really complicate the code and we wind up doing the obvious thing.

Well, it doesn't add a lot of complexity but also seems pointless when
there basically is no use case which would benefit from this change.
I suppose it's up to the maintainer.  Greg?

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 200+ messages in thread

* Re: [PATCH 11/20] sysfs: Update s_iattr on link and unlink.
  2009-05-21  0:28                     ` [PATCH 11/20] sysfs: Update s_iattr on link and unlink Eric W. Biederman
  2009-05-21  0:28                       ` [PATCH 12/20] sysfs: Nicely indent sysfs_symlink_inode_operations Eric W. Biederman
@ 2009-05-21  8:42                       ` Tejun Heo
  1 sibling, 0 replies; 200+ messages in thread
From: Tejun Heo @ 2009-05-21  8:42 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: Andrew Morton, Greg Kroah-Hartman, linux-kernel, Cornelia Huck,
	linux-fsdevel, Eric W. Biederman

Eric W. Biederman wrote:
> From: Eric W. Biederman <ebiederm@xmission.com>
> 
> Currently sysfs updates the timestamps on the vfs directory
> inode when we create or remove a directory entry but doesn't
> update the cached copy on the sysfs_dirent.  Fix that oversight.
> 
> Signed-off-by: Eric W. Biederman <ebiederm@aristanetworks.com>

Acked-by: Tejun Heo <tj@kernel.org>

-- 
tejun

^ permalink raw reply	[flat|nested] 200+ messages in thread

* Re: [PATCH 13/20] sysfs: Implement sysfs_getattr & sysfs_permission
  2009-05-21  0:28                         ` [PATCH 13/20] sysfs: Implement sysfs_getattr & sysfs_permission Eric W. Biederman
  2009-05-21  0:28                           ` [PATCH 14/20] sysfs: In sysfs_chmod_file lazily propagate the mode change Eric W. Biederman
@ 2009-05-21  9:14                           ` Tejun Heo
  1 sibling, 0 replies; 200+ messages in thread
From: Tejun Heo @ 2009-05-21  9:14 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: Andrew Morton, Greg Kroah-Hartman, linux-kernel, Cornelia Huck,
	linux-fsdevel, Eric W. Biederman

Eric W. Biederman wrote:
> From: Eric W. Biederman <ebiederm@xmission.com>
> 
> With the implementation of sysfs_getattr and sysfs_permission
> sysfs becomes able to lazily propagate inode attribute changes
> from the sysfs_dirents to the vfs inodes.   This paves the way
> for deleting significant chunks of now unnecessary code.
> 
> Signed-off-by: Eric W. Biederman <ebiederm@aristanetworks.com>

Acked-by: Tejun Heo <tj@kernel.org>

-- 
tejun

^ permalink raw reply	[flat|nested] 200+ messages in thread

* Re: [PATCH 14/20] sysfs: In sysfs_chmod_file lazily propagate the mode change.
  2009-05-21  0:28                           ` [PATCH 14/20] sysfs: In sysfs_chmod_file lazily propagate the mode change Eric W. Biederman
  2009-05-21  0:28                             ` [PATCH 15/20] sysfs: Kill sysfs_addrm_start and sysfs_addrm_finish Eric W. Biederman
@ 2009-05-21  9:16                             ` Tejun Heo
  1 sibling, 0 replies; 200+ messages in thread
From: Tejun Heo @ 2009-05-21  9:16 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: Andrew Morton, Greg Kroah-Hartman, linux-kernel, Cornelia Huck,
	linux-fsdevel, Eric W. Biederman

Eric W. Biederman wrote:
> From: Eric W. Biederman <ebiederm@xmission.com>
> 
> Now that sysfs_getattr and sysfs_permission refresh the vfs
> inode there is no need to immediately push the mode change
> into the vfs cache.  This reduces the amount of work needed and
> simplifies the locking.
> 
> Signed-off-by: Eric W. Biederman <ebiederm@aristanetworks.com>

Cool.

Acked-by: Tejun Heo <tj@kernel.org>

-- 
tejun

^ permalink raw reply	[flat|nested] 200+ messages in thread
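
The lazy propagation described in patches 13 and 14 can be pictured with
a small user-space model.  This is only a sketch under assumptions:
toy_dirent, toy_inode, toy_chmod, toy_refresh and toy_getattr are
hypothetical names standing in for the sysfs_dirent, the cached vfs
inode and the chmod/getattr paths.  None of it is the kernel code from
the patches.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-ins; these are not the kernel's real types. */
struct toy_dirent {		/* plays the role of a sysfs_dirent */
	unsigned int mode;	/* authoritative permission bits */
};

struct toy_inode {		/* plays the role of the cached vfs inode */
	unsigned int mode;	/* possibly stale copy of the mode */
	struct toy_dirent *sd;	/* backing dirent */
};

/* chmod touches only the authoritative copy; the cache is left alone. */
static void toy_chmod(struct toy_dirent *sd, unsigned int mode)
{
	assert(sd != NULL);
	sd->mode = mode;
}

/* Copy the authoritative attributes into the cached inode. */
static void toy_refresh(struct toy_inode *inode)
{
	inode->mode = inode->sd->mode;
}

/* getattr refreshes the cache first, then reports the cached value. */
static unsigned int toy_getattr(struct toy_inode *inode)
{
	toy_refresh(inode);
	return inode->mode;
}
```

In this model the cached mode stays stale after toy_chmod until the next
toy_getattr call, which is the point of doing the update lazily: the
writer never has to lock or even find the cached inode.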

* Re: [PATCH 04/20] sysfs: Handle the general case of removing of directories with subdirectories
  2009-05-21  8:37                 ` Tejun Heo
@ 2009-05-21  9:18                   ` Eric W. Biederman
  2009-05-21  9:28                     ` Tejun Heo
  0 siblings, 1 reply; 200+ messages in thread
From: Eric W. Biederman @ 2009-05-21  9:18 UTC (permalink / raw)
  To: Tejun Heo
  Cc: Andrew Morton, Greg Kroah-Hartman, linux-kernel, Cornelia Huck,
	linux-fsdevel, Eric W. Biederman

Tejun Heo <tj@kernel.org> writes:

> Well, it can be trivially fixed by checking the removed flag.  The
> add/rm thing is designed to help additions and removals of multiple
> nodes at one go and I'd really like to see it working that way.  Any
> chance you can change code toward that direction?

Yes.  We definitely need to check the removed flag in sysfs_add_one.
Regardless of anything else.

I need to sleep on this but I am inclined to get rid of the rest of
the complications simply by failing the removal of non-empty
directories.  Going through the upper layers and making them properly
responsible for their actions.

I am afraid friendlier in this circumstance might equate to easier
to misuse and let code bugs pile up.

Eric

^ permalink raw reply	[flat|nested] 200+ messages in thread

* Re: [PATCH 04/20] sysfs: Handle the general case of removing of directories with subdirectories
  2009-05-21  9:18                   ` Eric W. Biederman
@ 2009-05-21  9:28                     ` Tejun Heo
  2009-05-23  6:33                       ` Eric W. Biederman
  0 siblings, 1 reply; 200+ messages in thread
From: Tejun Heo @ 2009-05-21  9:28 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: Andrew Morton, Greg Kroah-Hartman, linux-kernel, Cornelia Huck,
	linux-fsdevel, Eric W. Biederman

Hello, Eric.

Eric W. Biederman wrote:
> Tejun Heo <tj@kernel.org> writes:
> 
>> Well, it can be trivially fixed by checking the removed flag.  The
>> add/rm thing is designed to help additions and removals of multiple
>> nodes at one go and I'd really like to see it working that way.  Any
>> chance you can change code toward that direction?
> 
> Yes.  We definitely need to check the removed flag in sysfs_add_one.
> Regardless of anything else.
> 
> I need to sleep on this but I am inclined to get rid of the rest of
> the complications simply by failing the removal of non-empty
> directories.  Going through the upper layers and making them properly
> responsible for their actions.
> 
> I am afraid friendlier in this circumstance might equate to easier
> to misuse and let code bugs pile up.

I'm going through the latter part of the patchset and the code around
this area gets much simpler there.  Would it be possible to make it
atomic after the simplification?  Requiring recursive deletion from
all the callers is silly and error prone.

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 200+ messages in thread

* Re: [PATCH 15/20] sysfs: Kill sysfs_addrm_start and sysfs_addrm_finish
  2009-05-21  0:28                             ` [PATCH 15/20] sysfs: Kill sysfs_addrm_start and sysfs_addrm_finish Eric W. Biederman
  2009-05-21  0:28                               ` [PATCH 16/20] sysfs: Propagate renames to the vfs on demand Eric W. Biederman
@ 2009-05-21  9:31                               ` Tejun Heo
  1 sibling, 0 replies; 200+ messages in thread
From: Tejun Heo @ 2009-05-21  9:31 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: Andrew Morton, Greg Kroah-Hartman, linux-kernel, Cornelia Huck,
	linux-fsdevel, Eric W. Biederman

Eric W. Biederman wrote:
> From: Eric W. Biederman <ebiederm@xmission.com>
> 
> With lazy inode updates and dentry operations bringing everything
> into sync on demand there is no longer any need to immediately
> update the vfs or grab i_mutex to protect those updates as we
> make changes to sysfs.
> 
> So stop updating the vfs inodes and move what remains of
> sysfs_addrm_start and sysfs_addrm_finish (just barely more than taking
> the sysfs_mutex) into sysfs_add_one and sysfs_remove_one.
> 
> Signed-off-by: Eric W. Biederman <ebiederm@aristanetworks.com>

This looks so much better than the original code.  :-)

Acked-by: Tejun Heo <tj@kernel.org>

-- 
tejun

^ permalink raw reply	[flat|nested] 200+ messages in thread

* Re: [PATCH 16/20] sysfs: Propagate renames to the vfs on demand
  2009-05-21  0:28                               ` [PATCH 16/20] sysfs: Propagate renames to the vfs on demand Eric W. Biederman
  2009-05-21  0:28                                 ` [PATCH 17/20] sysfs: Merge sysfs_rename_dir and sysfs_move_dir Eric W. Biederman
@ 2009-05-21  9:41                                 ` Tejun Heo
  1 sibling, 0 replies; 200+ messages in thread
From: Tejun Heo @ 2009-05-21  9:41 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: Andrew Morton, Greg Kroah-Hartman, linux-kernel, Cornelia Huck,
	linux-fsdevel, Eric W. Biederman

Eric W. Biederman wrote:
> From: Eric W. Biederman <ebiederm@xmission.com>
> 
> By teaching sysfs_revalidate to hide a dentry for
> a sysfs_dirent if the sysfs_dirent has been renamed,
> and by teaching sysfs_lookup to return the original
> dentry if the sysfs dirent has been renamed, I can
> show the results of renames correctly without having to
> update the dcache during the directory rename.
> 
> This massively simplifies the rename logic allowing a lot
> of weird sysfs special cases to be removed along with
> a lot of now unnecessary helper code.
> 
> Signed-off-by: Eric W. Biederman <ebiederm@aristanetworks.com>

It's probably a good idea to cc Al Viro?

> @@ -311,6 +270,14 @@ static int sysfs_dentry_revalidate(struct dentry *dentry, struct nameidata *nd)
>  	if (sd->s_flags & SYSFS_FLAG_REMOVED)
>  		goto out_bad;
>  
> +	/* The sysfs dirent has been moved? */
> +	if (dentry->d_parent->d_fsdata != sd->s_parent)
> +		goto out_bad;
> +
> +	/* The sysfs dirent has been renamed */
> +	if (strcmp(dentry->d_name.name, sd->s_name) != 0)
> +		goto out_bad;
> +
>  	mutex_unlock(&sysfs_mutex);
>  out_valid:
>  	return 1;
> @@ -318,6 +285,12 @@ out_bad:
>  	/* Remove the dentry from the dcache hashes.
>  	 * If this is a deleted dentry we use d_drop instead of d_delete
>  	 * so sysfs doesn't need to cope with negative dentries.
> +	 *
> +	 * If this is a dentry that has simply been renamed we
> +	 * use d_drop to remove it from the dcache lookup on it's
							     ^^^^
							     its

> +	 * old parent.  If this dentry persists later when a lookup
> +	 * is performed at it's new name the dentry will be readded
                           ^^^^
			   its

Other than the above comment changes,

Acked-by: Tejun Heo <tj@kernel.org>

-- 
tejun

^ permalink raw reply	[flat|nested] 200+ messages in thread
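
The three checks in the revalidate hunk quoted above (removed, moved,
renamed) can be modelled in user space as follows.  This is a sketch
under assumptions: toy_dirent, toy_dentry, TOY_FLAG_REMOVED and
toy_revalidate are hypothetical names, the dentry's parent and name are
flattened into plain fields, and no locking is shown.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define TOY_FLAG_REMOVED 0x1	/* hypothetical stand-in for SYSFS_FLAG_REMOVED */

struct toy_dirent {		/* authoritative entry */
	struct toy_dirent *parent;
	const char *name;
	unsigned int flags;
};

struct toy_dentry {		/* cached view, filled in at lookup time */
	struct toy_dirent *parent_sd;	/* dirent of the parent at lookup time */
	const char *name;		/* name the entry was looked up under */
	struct toy_dirent *sd;		/* entry this dentry refers to */
};

/* Return true if the cached dentry still matches the authoritative entry. */
static bool toy_revalidate(const struct toy_dentry *dentry)
{
	const struct toy_dirent *sd = dentry->sd;

	/* The entry has been removed. */
	if (sd->flags & TOY_FLAG_REMOVED)
		return false;

	/* The entry has been moved to another parent. */
	if (dentry->parent_sd != sd->parent)
		return false;

	/* The entry has been renamed. */
	if (strcmp(dentry->name, sd->name) != 0)
		return false;

	return true;
}
```

A dentry that fails any check is simply dropped by the caller, and a
later lookup under the new parent or name builds a fresh view, so the
rename path itself never has to touch the cache.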

* Re: [PATCH 17/20] sysfs: Merge sysfs_rename_dir and sysfs_move_dir
  2009-05-21  0:28                                 ` [PATCH 17/20] sysfs: Merge sysfs_rename_dir and sysfs_move_dir Eric W. Biederman
  2009-05-21  0:28                                   ` [PATCH 18/20] sysfs: Pass super_block to sysfs_get_inode Eric W. Biederman
@ 2009-05-21  9:42                                   ` Tejun Heo
  1 sibling, 0 replies; 200+ messages in thread
From: Tejun Heo @ 2009-05-21  9:42 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: Andrew Morton, Greg Kroah-Hartman, linux-kernel, Cornelia Huck,
	linux-fsdevel, Eric W. Biederman

Eric W. Biederman wrote:
> From: Eric W. Biederman <ebiederm@xmission.com>
> 
> These two functions do 90% of the same work and it doesn't significantly
> obfuscate the function to allow both the parent dir and the name to change
> at the same time.  So merge them together to simplify maintenance, and
> increase testing.
> 
> Signed-off-by: Eric W. Biederman <ebiederm@aristanetworks.com>

Acked-by: Tejun Heo <tj@kernel.org>

-- 
tejun

^ permalink raw reply	[flat|nested] 200+ messages in thread

* Re: [PATCH 20/20] sysfs: Normalize error handling in sysfs_fill_inode
  2009-05-21  0:28                                       ` [PATCH 20/20] sysfs: Normalize error handling in sysfs_fill_inode Eric W. Biederman
@ 2009-05-21  9:43                                         ` Tejun Heo
  2009-05-21 10:29                                           ` Eric W. Biederman
  0 siblings, 1 reply; 200+ messages in thread
From: Tejun Heo @ 2009-05-21  9:43 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: Andrew Morton, Greg Kroah-Hartman, linux-kernel, Cornelia Huck,
	linux-fsdevel, Eric W. Biederman

Eric W. Biederman wrote:
> From: Eric W. Biederman <ebiederm@xmission.com>
> 
> Use a single error exit path instead of doing whatever
> is the required cleanup at each point we find the error.
> Ultimately this should make the code more maintainable.
> 
> Signed-off-by: Eric W. Biederman <ebiederm@aristanetworks.com>

For 0018-0020,

Acked-by: Tejun Heo <tj@kernel.org>

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 200+ messages in thread
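
The "single error exit path" style that patch 20 describes looks roughly
like the following in plain C.  This is a generic illustration, not the
sysfs_fill_inode code: toy_obj, toy_obj_create and toy_obj_destroy are
hypothetical names, and malloc/free stand in for whatever resources the
real function acquires.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <string.h>

struct toy_obj {
	char *name;
	int *data;
};

/* Returns 0 on success or a negative errno value; *out is set only on success. */
static int toy_obj_create(const char *name, size_t n, struct toy_obj **out)
{
	struct toy_obj *obj = NULL;
	int error;

	error = -EINVAL;
	if (!name || n == 0)
		goto out_err;

	error = -ENOMEM;
	obj = calloc(1, sizeof(*obj));
	if (!obj)
		goto out_err;

	obj->name = malloc(strlen(name) + 1);
	if (!obj->name)
		goto out_err;
	strcpy(obj->name, name);

	obj->data = calloc(n, sizeof(*obj->data));
	if (!obj->data)
		goto out_err;

	*out = obj;
	return 0;

out_err:
	/* One cleanup site: free(NULL) is a no-op, so partial setup is safe. */
	if (obj) {
		free(obj->name);
		free(obj->data);
		free(obj);
	}
	return error;
}

/* Release everything toy_obj_create handed out. */
static void toy_obj_destroy(struct toy_obj *obj)
{
	if (!obj)
		return;
	free(obj->name);
	free(obj->data);
	free(obj);
}
```

Each failure point only sets the error code and jumps, so adding a new
resource means touching one acquisition site and the one cleanup block.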

* Re: [PATCH 01/20] sysfs: Implement sysfs_rename_link
  2009-05-21  5:35   ` Tejun Heo
@ 2009-05-21 10:06     ` Kay Sievers
  2009-05-21 10:29       ` Eric W. Biederman
  0 siblings, 1 reply; 200+ messages in thread
From: Kay Sievers @ 2009-05-21 10:06 UTC (permalink / raw)
  To: Tejun Heo
  Cc: Eric W. Biederman, Andrew Morton, Greg Kroah-Hartman,
	linux-kernel, Cornelia Huck, linux-fsdevel, Benjamin Thery,
	Daniel Lezcano, Eric W. Biederman

On Thu, May 21, 2009 at 07:35, Tejun Heo <tj@kernel.org> wrote:
> Eric W. Biederman wrote:
>> +int sysfs_rename_link(struct kobject *kobj, struct kobject *targ,
>> +                     const char *old, const char *new)
>> +{
>> +     sysfs_remove_link(kobj, old);
>> +     return sysfs_create_link(kobj, targ, new);
>> +}
>> +
>
> Removal and creation are done in the reverse order compared to the one
> used in device rename.  The important difference is that previously
> failed operation was noop whereas it now would remove the current
> link.  I think the old order is correct.

The target string is composed on-demand, and it always points to the
same kobject and *targ is not needed, right?

Can't we just change the name of the link, instead of removing and
re-creating the entire thing, and all these issues go away?

Thanks,
Kay
--
To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 200+ messages in thread

* Re: [PATCH 01/20] sysfs: Implement sysfs_rename_link
  2009-05-21 10:06     ` Kay Sievers
@ 2009-05-21 10:29       ` Eric W. Biederman
  2009-05-21 11:40         ` Kay Sievers
  0 siblings, 1 reply; 200+ messages in thread
From: Eric W. Biederman @ 2009-05-21 10:29 UTC (permalink / raw)
  To: Kay Sievers
  Cc: Tejun Heo, Andrew Morton, Greg Kroah-Hartman, linux-kernel,
	Cornelia Huck, linux-fsdevel, Benjamin Thery, Daniel Lezcano,
	Eric W. Biederman

Kay Sievers <kay.sievers@vrfy.org> writes:

> On Thu, May 21, 2009 at 07:35, Tejun Heo <tj@kernel.org> wrote:
>> Eric W. Biederman wrote:
>>> +int sysfs_rename_link(struct kobject *kobj, struct kobject *targ,
>>> +                     const char *old, const char *new)
>>> +{
>>> +     sysfs_remove_link(kobj, old);
>>> +     return sysfs_create_link(kobj, targ, new);
>>> +}
>>> +
>>
>> Removal and creation are done in the reverse order compared to the one
>> used in device rename.  The important difference is that previously
>> failed operation was noop whereas it now would remove the current
>> link.  I think the old order is correct.
>
> The target string is composed on-demand, and it always points to the
> same kobject and *targ is not needed, right?
>
> Can't we just change the name of the link, instead of removing and
> re-creating the entire thing, and all these issues go away?

Good point.  It looks like I can generalize sysfs_mv_dir into simply
being sysfs_rename.  All of the existing logic looks like it can handle
that.  I will look at doing an incremental patch for that.

Eric

^ permalink raw reply	[flat|nested] 200+ messages in thread

* Re: [PATCH 20/20] sysfs: Normalize error handling in sysfs_fill_inode
  2009-05-21  9:43                                         ` Tejun Heo
@ 2009-05-21 10:29                                           ` Eric W. Biederman
  0 siblings, 0 replies; 200+ messages in thread
From: Eric W. Biederman @ 2009-05-21 10:29 UTC (permalink / raw)
  To: Tejun Heo
  Cc: Andrew Morton, Greg Kroah-Hartman, linux-kernel, Cornelia Huck,
	linux-fsdevel, Eric W. Biederman

Tejun Heo <tj@kernel.org> writes:

> Eric W. Biederman wrote:
>> From: Eric W. Biederman <ebiederm@xmission.com>
>> 
>> Use a single error exit path instead of doing whatever
>> is the required cleanup at each point we find the error.
>> Ultimately this should make the code more maintainable.
>> 
>> Signed-off-by: Eric W. Biederman <ebiederm@aristanetworks.com>
>
> For 0018-0020,
>
> Acked-by: Tejun Heo <tj@kernel.org>

Thanks for taking the time to review these patches.

Eric

^ permalink raw reply	[flat|nested] 200+ messages in thread

* Re: [PATCH 01/20] sysfs: Implement sysfs_rename_link
  2009-05-21 10:29       ` Eric W. Biederman
@ 2009-05-21 11:40         ` Kay Sievers
  0 siblings, 0 replies; 200+ messages in thread
From: Kay Sievers @ 2009-05-21 11:40 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: Tejun Heo, Andrew Morton, Greg Kroah-Hartman, linux-kernel,
	Cornelia Huck, linux-fsdevel, Benjamin Thery, Daniel Lezcano,
	Eric W. Biederman

On Thu, May 21, 2009 at 12:29, Eric W. Biederman <ebiederm@xmission.com> wrote:
> Kay Sievers <kay.sievers@vrfy.org> writes:

>> The target string is composed on-demand, and it always points to the
>> same kobject and *targ is not needed, right?
>>
>> Can't we just change the name of the link, instead of removing and
>> re-creating the entire thing, and all these issues go away?
>
> Good point.  It looks like I can generalize sysfs_mv_dir into simply
> being sysfs_rename.  All of the existing logic looks like it can handle
> that.  I will look at doing an incremental patch for that.

Nice, it would be great if that can be made to work.  It would be much
better than the current logic, and there would be no ordering problem,
like Tejun mentioned, to solve anymore.

As we don't need to change the target for link, maybe:
  sysfs_rename_file(kobj_parent, old_name, new_name);
that can act on symlinks and regular files would be useful?

Thanks,
Kay

^ permalink raw reply	[flat|nested] 200+ messages in thread

* Re: [PATCH 04/20] sysfs: Handle the general case of removing of directories with subdirectories
  2009-05-21  9:28                     ` Tejun Heo
@ 2009-05-23  6:33                       ` Eric W. Biederman
  2009-05-23 11:35                         ` Kay Sievers
  0 siblings, 1 reply; 200+ messages in thread
From: Eric W. Biederman @ 2009-05-23  6:33 UTC (permalink / raw)
  To: Tejun Heo
  Cc: Andrew Morton, Greg Kroah-Hartman, linux-kernel, Cornelia Huck,
	linux-fsdevel, Eric W. Biederman

Tejun Heo <tj@kernel.org> writes:

> Hello, Eric.
>
> Eric W. Biederman wrote:
>> Tejun Heo <tj@kernel.org> writes:
>> 
>>> Well, it can be trivially fixed by checking the removed flag.  The
>>> add/rm thing is designed to help additions and removals of multiple
>>> nodes at one go and I'd really like to see it working that way.  Any
>>> chance you can change code toward that direction?
>> 
>> Yes.  We definitely need to check the removed flag in sysfs_add_one.
>> Regardless of anything else.
>> 
>> I need to sleep on this but I am inclined to get rid of the rest of
>> the complications simply by failing the removal of non-empty
>> directories.  Going through the upper layers and making them properly
>> responsible for their actions.
>> 
>> I am afraid friendlier in this circumstance might equate to easier
>> to misuse and let code bugs pile up.
>
> I'm going through the latter part of the patchset and the code around
> this area gets much simpler there.  Would it be possible to make it
> atomic after the simplification?  Requiring recursive deletion from
> all the callers is silly and error prone.

I have slept and looked at this in some detail.

There may be some virtue in better support from sysfs for deleting
objects.  At this point my observation is that support comes from the
kobject and device layers, where you can define all of the attributes
of a device up front.

My goal is to make the current sysfs as simple and as correct as I can
before changes are made to either its interface or otherwise make it
better suited to work.

The case I have been worried about is someone removing a subsystem
before unregistering its devices, or otherwise removing a real parent
before removing its children.

After a little more investigation that is exactly what is happening
today with /sys/dev/char and /sys/dev/block.

Those ordering issues we must handle because quite frequently there
are real hardware complications and that is exactly the case that the
kernel device tree was built to address.  So I expect any amount of
getting it wrong in sysfs is actually getting it wrong in the device
tree.

There is another problem with relying on recursive delete.  When we
come to delete one of our objects that someone else recursively
deleted we will hit the BUG_ON in sysfs_remove_one, and then attempt
to run operations that we have already run.  Not maintaining that the
prerequisite things exist for the lifetime of something in sysfs
sounds really icky.

My plan going forward is to fix the ordering problems with deleting
/sys/dev/char and /sys/dev/block.  Add a WARN_ON if we delete a
non-empty directory.  Ensure we don't add something to an already
deleted directory.

Eric

^ permalink raw reply	[flat|nested] 200+ messages in thread
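
The double-deletion hazard and the stricter "only remove empty
directories" rule discussed above can be sketched with a toy model.
Everything here is an assumption made for illustration: toy_node,
TOY_FLAG_REMOVED, toy_remove_one and toy_remove_dir_strict are
hypothetical names, and the model returns error codes where the message
above says the kernel hits a BUG_ON (and where the plan is to add a
WARN_ON).

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define TOY_FLAG_REMOVED 0x1	/* hypothetical "already removed" marker */

struct toy_node {
	unsigned int flags;
	struct toy_node *parent;
	int nr_live_children;	/* children not yet removed */
};

/*
 * Remove a single node.  A second removal of the same node is reported
 * as -ENOENT here instead of crashing.
 */
static int toy_remove_one(struct toy_node *node)
{
	if (node->flags & TOY_FLAG_REMOVED)
		return -ENOENT;

	node->flags |= TOY_FLAG_REMOVED;
	if (node->parent)
		node->parent->nr_live_children--;
	return 0;
}

/*
 * Strict directory removal: refuse while children are still present, so
 * every owner stays responsible for deleting what it created.
 */
static int toy_remove_dir_strict(struct toy_node *dir)
{
	if (dir->nr_live_children > 0)
		return -ENOTEMPTY;
	return toy_remove_one(dir);
}
```

With the strict rule the owner of each child removes it exactly once and
the directory goes last; a recursive delete by the parent would instead
leave the owner's later removal to run into the "already removed" case.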

* Re: [PATCH 04/20] sysfs: Handle the general case of removing of directories with subdirectories
  2009-05-23  6:33                       ` Eric W. Biederman
@ 2009-05-23 11:35                         ` Kay Sievers
  2009-05-23 20:09                           ` Eric W. Biederman
  0 siblings, 1 reply; 200+ messages in thread
From: Kay Sievers @ 2009-05-23 11:35 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: Tejun Heo, Andrew Morton, Greg Kroah-Hartman, linux-kernel,
	Cornelia Huck, linux-fsdevel, Eric W. Biederman

On Sat, May 23, 2009 at 08:33, Eric W. Biederman <ebiederm@xmission.com> wrote:

> My plan going forward is to fix the ordering problems with deleting
> /sys/dev/char and /sys/dev/block.  Add a WARN_ON if we delete a
> non-empty directory.  Ensure we don't add something to an already
> deleted directory.

What's the problem in /sys/dev/? There are just a bunch of symlinks,
one for every device with a dev_t, and all in flat directories, and no
directory to remove.

Thanks,
Kay

^ permalink raw reply	[flat|nested] 200+ messages in thread

* Re: [PATCH 04/20] sysfs: Handle the general case of removing of  directories with subdirectories
  2009-05-23 11:35                         ` Kay Sievers
@ 2009-05-23 20:09                           ` Eric W. Biederman
  2009-05-23 20:46                             ` Kay Sievers
  0 siblings, 1 reply; 200+ messages in thread
From: Eric W. Biederman @ 2009-05-23 20:09 UTC (permalink / raw)
  To: Kay Sievers
  Cc: Tejun Heo, Andrew Morton, Greg Kroah-Hartman, linux-kernel,
	Cornelia Huck, linux-fsdevel, Eric W. Biederman

Kay Sievers <kay.sievers@vrfy.org> writes:

> On Sat, May 23, 2009 at 08:33, Eric W. Biederman <ebiederm@xmission.com> wrote:
>
>> My plan going forward is to fix the ordering problems with deleting
>> /sys/dev/char and /sys/dev/block.  Add a WARN_ON if we delete a
>> non-empty directory.  Ensure we don't add something to an already
>> deleted directory.
>
> What's the problem in /sys/dev/? There are just a bunch of symlinks,
> one for every device with a dev_t, and all in flat directories, and no
> directory to remove.

device_shutdown called during reboot removes /sys/dev/block and /sys/dev/char.
The current sysfs_remove_dir (because it empties directories)
removes all of those symlinks.

The problem is that it is the device objects for each individual
device that owns those symlinks, and normally removes those symlinks.

Which means that in theory we could have double deletion going on.

In practice today it doesn't matter because this is at reboot.

And as far as that goes it is wrong to remove anything from sysfs during
device_shutdown so the fix is just to not call kobject_put there.

Eric

^ permalink raw reply	[flat|nested] 200+ messages in thread

* [PATCH 21/20] sysfs: Rename sysfs_mv_dir sysfs_rename
  2009-05-20  4:09 [PATCH 0/20] Sysfs cleanups Eric W. Biederman
  2009-05-20 15:37 ` Greg KH
  2009-05-21  0:27 ` [PATCH 01/20] sysfs: Implement sysfs_rename_link Eric W. Biederman
@ 2009-05-23 20:13 ` Eric W. Biederman
  2009-05-23 20:13 ` [PATCH 22/20] sysfs: Make sysfs_rename_link atomic Eric W. Biederman
                   ` (3 subsequent siblings)
  6 siblings, 0 replies; 200+ messages in thread
From: Eric W. Biederman @ 2009-05-23 20:13 UTC (permalink / raw)
  To: Andrew Morton, Greg Kroah-Hartman
  Cc: linux-kernel, Tejun Heo, Cornelia Huck, linux-fsdevel,
	Eric W. Biederman, Eric W. Biederman

From: Eric W. Biederman <ebiederm@xmission.com>

It turns out that sysfs_mv_dir actually makes no assumptions that what
is being renamed is a directory.  So rename sysfs_mv_dir to sysfs_rename to
reflect the function's general utility.  Later we will use it to rename symlinks
in sysfs.

Signed-off-by: Eric W. Biederman <ebiederm@aristanetworks.com>
---
 fs/sysfs/dir.c   |    6 +++---
 fs/sysfs/sysfs.h |    3 +++
 2 files changed, 6 insertions(+), 3 deletions(-)

diff --git a/fs/sysfs/dir.c b/fs/sysfs/dir.c
index 7aa8890..4da20a3 100644
--- a/fs/sysfs/dir.c
+++ b/fs/sysfs/dir.c
@@ -660,7 +660,7 @@ void sysfs_remove_dir(struct kobject * kobj)
 	remove_dir(sd);
 }
 
-static int sysfs_mv_dir(struct sysfs_dirent *sd,
+int sysfs_rename(struct sysfs_dirent *sd,
 	struct sysfs_dirent *new_parent_sd, const char *new_name)
 {
 	const char *dup_name = NULL;
@@ -706,12 +706,12 @@ static int sysfs_mv_dir(struct sysfs_dirent *sd,
 
 int sysfs_rename_dir(struct kobject * kobj, const char *new_name)
 {
-	return sysfs_mv_dir(kobj->sd, kobj->sd->s_parent, new_name);
+	return sysfs_rename(kobj->sd, kobj->sd->s_parent, new_name);
 }
 
 int sysfs_move_dir(struct kobject *kobj, struct kobject *new_parent_kobj)
 {
-	return sysfs_mv_dir(kobj->sd, new_parent_kobj->sd, kobj->sd->s_name);
+	return sysfs_rename(kobj->sd, new_parent_kobj->sd, kobj->sd->s_name);
 }
 
 /* Relationship between s_mode and the DT_xxx types */
diff --git a/fs/sysfs/sysfs.h b/fs/sysfs/sysfs.h
index 5dd8168..be1d932 100644
--- a/fs/sysfs/sysfs.h
+++ b/fs/sysfs/sysfs.h
@@ -109,6 +109,9 @@ int sysfs_create_subdir(struct kobject *kobj, const char *name,
 			struct sysfs_dirent **p_sd);
 void sysfs_remove_subdir(struct sysfs_dirent *sd);
 
+int sysfs_rename(struct sysfs_dirent *sd,
+	struct sysfs_dirent *new_parent_sd, const char *new_name);
+
 static inline struct sysfs_dirent *__sysfs_get(struct sysfs_dirent *sd)
 {
 	if (sd) {
-- 
1.6.3.1.54.g99dd.dirty


^ permalink raw reply related	[flat|nested] 200+ messages in thread

* [PATCH 22/20] sysfs: Make sysfs_rename_link atomic
  2009-05-20  4:09 [PATCH 0/20] Sysfs cleanups Eric W. Biederman
                   ` (2 preceding siblings ...)
  2009-05-23 20:13 ` [PATCH 21/20] sysfs: Rename sysfs_mv_dir sysfs_rename Eric W. Biederman
@ 2009-05-23 20:13 ` Eric W. Biederman
  2009-05-23 21:32   ` Kay Sievers
  2009-05-23 20:13 ` [PATCH 23/20] driver core: Don't remove kobjects in device_shutdown Eric W. Biederman
                   ` (2 subsequent siblings)
  6 siblings, 1 reply; 200+ messages in thread
From: Eric W. Biederman @ 2009-05-23 20:13 UTC (permalink / raw)
  To: Andrew Morton, Greg Kroah-Hartman
  Cc: linux-kernel, Tejun Heo, Cornelia Huck, linux-fsdevel,
	Eric W. Biederman, Eric W. Biederman

From: Eric W. Biederman <ebiederm@xmission.com>

Use the existing sysfs_rename to make sysfs_rename_link an atomic
operation that does less work.  While I am at it, add additional sanity
checking to ensure it is a symlink I am renaming.

Signed-off-by: Eric W. Biederman <ebiederm@aristanetworks.com>
---
 fs/sysfs/symlink.c |   26 ++++++++++++++++++++++++--
 1 files changed, 24 insertions(+), 2 deletions(-)

diff --git a/fs/sysfs/symlink.c b/fs/sysfs/symlink.c
index fc5fc86..39d050b 100644
--- a/fs/sysfs/symlink.c
+++ b/fs/sysfs/symlink.c
@@ -106,8 +106,30 @@ void sysfs_remove_link(struct kobject * kobj, const char * name)
 int sysfs_rename_link(struct kobject *kobj, struct kobject *targ,
 			const char *old, const char *new)
 {
-	sysfs_remove_link(kobj, old);
-	return sysfs_create_link(kobj, targ, new);
+	struct sysfs_dirent *parent_sd, *sd = NULL;
+	int result;
+
+	if (!kobj)
+		parent_sd = &sysfs_root;
+	else
+		parent_sd = kobj->sd;
+
+	result = -ENOENT;
+	sd = sysfs_get_dirent(parent_sd, old);
+	if (!sd)
+		goto out;
+
+	result = -EINVAL;
+	if (sysfs_type(sd) != SYSFS_KOBJ_LINK)
+		goto out;
+	if (sd->s_symlink.target_sd->s_dir.kobj != targ)
+		goto out;
+
+	result = sysfs_rename(sd, parent_sd, new);
+
+out:
+	sysfs_put(sd);
+	return result;
 }
 
 static int sysfs_get_target_path(struct sysfs_dirent *parent_sd,
-- 
1.6.3.1.54.g99dd.dirty


^ permalink raw reply related	[flat|nested] 200+ messages in thread
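
The difference between the original remove-then-create version of
sysfs_rename_link and an in-place rename can be shown with a toy
directory.  This is a user-space sketch under assumptions: toy_dir and
the toy_* functions are hypothetical, the directory is a fixed array of
names, and no locking is modelled.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

#define TOY_MAX_ENTRIES 4
#define TOY_NAME_LEN 16

struct toy_dir {
	char names[TOY_MAX_ENTRIES][TOY_NAME_LEN];	/* "" marks a free slot */
};

/* Return the slot index holding name, or -1 if it is not present. */
static int toy_find(const struct toy_dir *dir, const char *name)
{
	int i;

	for (i = 0; i < TOY_MAX_ENTRIES; i++)
		if (dir->names[i][0] != '\0' && strcmp(dir->names[i], name) == 0)
			return i;
	return -1;
}

static int toy_create(struct toy_dir *dir, const char *name)
{
	int i;

	if (name[0] == '\0' || strlen(name) >= TOY_NAME_LEN)
		return -EINVAL;
	if (toy_find(dir, name) >= 0)
		return -EEXIST;
	for (i = 0; i < TOY_MAX_ENTRIES; i++) {
		if (dir->names[i][0] == '\0') {
			strcpy(dir->names[i], name);
			return 0;
		}
	}
	return -ENOSPC;
}

static void toy_remove(struct toy_dir *dir, const char *name)
{
	int i = toy_find(dir, name);

	if (i >= 0)
		dir->names[i][0] = '\0';
}

/* Remove-then-create: if the create fails, the old entry is already gone. */
static int toy_rename_two_step(struct toy_dir *dir, const char *old_name,
			       const char *new_name)
{
	toy_remove(dir, old_name);
	return toy_create(dir, new_name);
}

/* In-place rename: every check runs before the entry is modified. */
static int toy_rename_in_place(struct toy_dir *dir, const char *old_name,
			       const char *new_name)
{
	int i = toy_find(dir, old_name);

	if (i < 0)
		return -ENOENT;
	if (new_name[0] == '\0' || strlen(new_name) >= TOY_NAME_LEN)
		return -EINVAL;
	if (toy_find(dir, new_name) >= 0)
		return -EEXIST;
	strcpy(dir->names[i], new_name);
	return 0;
}
```

When the new name already exists, the two-step variant returns an error
and has also lost the old entry, while the in-place variant fails as a
no-op.  That is the ordering concern raised in the review of patch 01.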

* [PATCH 23/20] driver core: Don't remove kobjects in device_shutdown.
  2009-05-20  4:09 [PATCH 0/20] Sysfs cleanups Eric W. Biederman
                   ` (3 preceding siblings ...)
  2009-05-23 20:13 ` [PATCH 22/20] sysfs: Make sysfs_rename_link atomic Eric W. Biederman
@ 2009-05-23 20:13 ` Eric W. Biederman
  2009-05-23 22:15   ` Kay Sievers
  2009-05-23 20:13 ` [PATCH 24/20] sysfs: In sysfs_add_one fail if the targe directory has been removed Eric W. Biederman
  2009-05-23 20:13 ` [PATCH 25/20] sysfs: Only support removing emtpy sysfs directories Eric W. Biederman
  6 siblings, 1 reply; 200+ messages in thread
From: Eric W. Biederman @ 2009-05-23 20:13 UTC (permalink / raw)
  To: Andrew Morton, Greg Kroah-Hartman
  Cc: linux-kernel, Tejun Heo, Cornelia Huck, linux-fsdevel,
	Eric W. Biederman, Eric W. Biederman

From: Eric W. Biederman <ebiederm@xmission.com>

device_shutdown is defined to just shut down the hardware and to not
clean up any kernel data structures.  Therefore don't put the kobjects
for /sys/dev and /sys/dev/block and /sys/dev/char.

This ensures we don't remove /sys/dev/block and /sys/dev/char while
we still have symlinks from there to the actual devices.

Signed-off-by: Eric W. Biederman <ebiederm@aristanetworks.com>
---
 drivers/base/core.c |    3 ---
 1 files changed, 0 insertions(+), 3 deletions(-)

diff --git a/drivers/base/core.c b/drivers/base/core.c
index 8a1569c..49d3142 100644
--- a/drivers/base/core.c
+++ b/drivers/base/core.c
@@ -1653,7 +1653,4 @@ void device_shutdown(void)
 			dev->driver->shutdown(dev);
 		}
 	}
-	kobject_put(sysfs_dev_char_kobj);
-	kobject_put(sysfs_dev_block_kobj);
-	kobject_put(dev_kobj);
 }
-- 
1.6.3.1.54.g99dd.dirty


^ permalink raw reply related	[flat|nested] 200+ messages in thread

* [PATCH 24/20] sysfs: In sysfs_add_one fail if the targe directory has been removed.
  2009-05-20  4:09 [PATCH 0/20] Sysfs cleanups Eric W. Biederman
                   ` (4 preceding siblings ...)
  2009-05-23 20:13 ` [PATCH 23/20] driver core: Don't remove kobjects in device_shutdown Eric W. Biederman
@ 2009-05-23 20:13 ` Eric W. Biederman
  2009-05-23 21:29   ` Kay Sievers
  2009-05-23 20:13 ` [PATCH 25/20] sysfs: Only support removing emtpy sysfs directories Eric W. Biederman
  6 siblings, 1 reply; 200+ messages in thread
From: Eric W. Biederman @ 2009-05-23 20:13 UTC (permalink / raw)
  To: Andrew Morton, Greg Kroah-Hartman
  Cc: linux-kernel, Tejun Heo, Cornelia Huck, linux-fsdevel,
	Eric W. Biederman, Eric W. Biederman

From: Eric W. Biederman <ebiederm@xmission.com>

If a bug in the upper layers results in someone attempting to add
to a sysfs directory that has already been removed, warn about it
and fail.

I don't believe this has ever happened, and it certainly never should
happen, but be strict to avoid errors creeping in.

Signed-off-by: Eric W. Biederman <ebiederm@aristanetworks.com>
---
 fs/sysfs/dir.c |   37 +++++++++++++++++++++++--------------
 1 files changed, 23 insertions(+), 14 deletions(-)

diff --git a/fs/sysfs/dir.c b/fs/sysfs/dir.c
index 4da20a3..b3058f5 100644
--- a/fs/sysfs/dir.c
+++ b/fs/sysfs/dir.c
@@ -394,21 +394,17 @@ static char *sysfs_pathname(struct sysfs_dirent *sd, char *path)
 int sysfs_add_one(struct sysfs_dirent *parent_sd, struct sysfs_dirent *sd)
 {
 	struct iattr *ps_iattr;
+	char *path;
+	int result;
 
 	mutex_lock(&sysfs_mutex);
-	if (sysfs_find_dirent(parent_sd, sd->s_name)) {
-		char *path;
-		mutex_unlock(&sysfs_mutex);
 
-		path = kzalloc(PATH_MAX, GFP_KERNEL);
-		WARN(1, KERN_WARNING
-		     "sysfs: cannot create duplicate filename '%s'\n",
-		     (path == NULL) ? sd->s_name :
-		     strcat(strcat(sysfs_pathname(parent_sd, path), "/"),
-		            sd->s_name));
-		kfree(path);
-		return -EEXIST;
-	}
+	result = -ENOENT;
+	if (parent_sd->s_flags & SYSFS_FLAG_REMOVED)
+		goto out_err;
+	result = -EEXIST;
+	if (sysfs_find_dirent(parent_sd, sd->s_name))
+		goto out_err;
 
 	sd->s_parent = sysfs_get(parent_sd);
 	sysfs_link_sibling(sd);
@@ -417,9 +413,22 @@ int sysfs_add_one(struct sysfs_dirent *parent_sd, struct sysfs_dirent *sd)
 	ps_iattr = parent_sd->s_iattr;
 	if (ps_iattr)
 		ps_iattr->ia_ctime = ps_iattr->ia_mtime = CURRENT_TIME;
-
 	mutex_unlock(&sysfs_mutex);
 	return 0;
+
+out_err:
+	mutex_unlock(&sysfs_mutex);
+
+	path = kzalloc(PATH_MAX, GFP_KERNEL);
+	WARN(1, KERN_WARNING "sysfs: cannot create '%s' %s\n",
+		(path == NULL) ? sd->s_name :
+		strcat(strcat(sysfs_pathname(parent_sd, path), "/"),
+		       sd->s_name),
+		(result == -EEXIST ? "duplicate filename" : "no such directory")
+		);
+	kfree(path);
+
+	return result;
 }
 
 /**
-- 
1.6.3.1.54.g99dd.dirty
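
For readers following along outside the kernel tree, the two failure
cases that sysfs_add_one distinguishes in the patch above can be modelled
in userspace roughly as follows.  This is only a sketch under stated
assumptions: struct model_dirent, MODEL_FLAG_REMOVED, model_find and
model_add_one are hypothetical stand-ins, not the sysfs data structures,
and there is no locking.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for SYSFS_FLAG_REMOVED. */
#define MODEL_FLAG_REMOVED 0x1

/* Hypothetical, much simplified model of a directory entry. */
struct model_dirent {
	const char *name;
	unsigned int flags;
	struct model_dirent *parent;
	struct model_dirent *children;	/* first child */
	struct model_dirent *sibling;	/* next entry in the same directory */
};

/* Look up a child by name; returns NULL if there is none. */
struct model_dirent *model_find(struct model_dirent *parent, const char *name)
{
	struct model_dirent *sd;

	for (sd = parent->children; sd; sd = sd->sibling)
		if (strcmp(sd->name, name) == 0)
			return sd;
	return NULL;
}

/*
 * Returns 0 on success, -ENOENT if the parent has already been removed,
 * and -EEXIST if the name is already taken.
 */
int model_add_one(struct model_dirent *parent, struct model_dirent *sd)
{
	assert(parent != NULL && sd != NULL);

	if (parent->flags & MODEL_FLAG_REMOVED)
		return -ENOENT;
	if (model_find(parent, sd->name))
		return -EEXIST;

	/* Link the new entry at the head of the parent's child list. */
	sd->parent = parent;
	sd->sibling = parent->children;
	parent->children = sd;
	return 0;
}
```

The caller can then pick the warning text from the error code, which is
what the WARN in the patch does with "duplicate filename" versus
"no such directory".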


^ permalink raw reply related	[flat|nested] 200+ messages in thread

* [PATCH 25/20] sysfs: Only support removing empty sysfs directories.
  2009-05-20  4:09 [PATCH 0/20] Sysfs cleanups Eric W. Biederman
                   ` (5 preceding siblings ...)
  2009-05-23 20:13 ` [PATCH 24/20] sysfs: In sysfs_add_one fail if the target directory has been removed Eric W. Biederman
@ 2009-05-23 20:13 ` Eric W. Biederman
  2009-05-23 21:27   ` Kay Sievers
  2009-05-24  3:24   ` Tejun Heo
  6 siblings, 2 replies; 200+ messages in thread
From: Eric W. Biederman @ 2009-05-23 20:13 UTC (permalink / raw)
  To: Andrew Morton, Greg Kroah-Hartman
  Cc: linux-kernel, Tejun Heo, Cornelia Huck, linux-fsdevel,
	Eric W. Biederman, Eric W. Biederman

From: Eric W. Biederman <ebiederm@xmission.com>

I have looked and I have not found a single legitimate case today where
we remove sysfs directories with anything in them.  The only case I have
found to date was a bug.  It was a problem of ownership.  The files in
the directory were not owned by the directory itself, leaving open
the potential for double deletion of the directory contents.

Signed-off-by: Eric W. Biederman <ebiederm@aristanetworks.com>
---
 fs/sysfs/dir.c |   24 ++++--------------------
 1 files changed, 4 insertions(+), 20 deletions(-)

diff --git a/fs/sysfs/dir.c b/fs/sysfs/dir.c
index b3058f5..6b3e038 100644
--- a/fs/sysfs/dir.c
+++ b/fs/sysfs/dir.c
@@ -617,30 +617,14 @@ const struct inode_operations sysfs_dir_inode_operations = {
 	.permission	= sysfs_permission,
 };
 
-static struct sysfs_dirent *sysfs_get_one(struct sysfs_dirent *dir_sd)
-{
-	struct sysfs_dirent *sd = dir_sd;
-	mutex_lock(&sysfs_mutex);
-	while ((sysfs_type(sd) == SYSFS_DIR) && sd->s_dir.children)
-		sd = sd->s_dir.children;
-	if (sd != dir_sd)
-		sysfs_get(sd);
-	else
-		sd = NULL;
-	mutex_unlock(&sysfs_mutex);
-	return sd;
-}
-
 static void remove_dir(struct sysfs_dirent *dir_sd)
 {
-	struct sysfs_dirent *sd;
-
 	pr_debug("sysfs %s: removing dir\n", dir_sd->s_name);
 
-	while ((sd = sysfs_get_one(dir_sd))) {
-		sysfs_remove_one(sd);
-		sysfs_put(sd);
-	}
+	WARN(dir_sd->s_dir.children,
+		KERN_WARNING "sysfs: removing non-empty dir: %s\n",
+		dir_sd->s_name);
+
 	sysfs_remove_one(dir_sd);
 }
 
-- 
1.6.3.1.54.g99dd.dirty
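
The behaviour change in remove_dir above (warn about leftover children
instead of silently deleting them) can be illustrated with a small
userspace model.  This is a sketch only: struct toy_dir and
toy_remove_dir are hypothetical names, and stderr stands in for the
kernel WARN.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical, much simplified model of a sysfs directory. */
struct toy_dir {
	const char *name;
	struct toy_dir *children;	/* first child, NULL if empty */
	bool removed;
};

/*
 * Remove only the directory itself.  If it still has children, complain
 * and leave them alone.  Returns true if a warning was issued.
 */
bool toy_remove_dir(struct toy_dir *dir)
{
	bool nonempty;

	assert(dir != NULL);

	nonempty = dir->children != NULL;
	if (nonempty)
		fprintf(stderr, "toy: removing non-empty dir: %s\n", dir->name);

	dir->removed = true;
	return nonempty;
}
```

The point of the design is that the owner of each child stays
responsible for removing it, so a leftover child shows up as a warning
rather than being cleaned up behind the owner's back.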


^ permalink raw reply related	[flat|nested] 200+ messages in thread

* Re: [PATCH 04/20] sysfs: Handle the general case of removing of directories with subdirectories
  2009-05-23 20:09                           ` Eric W. Biederman
@ 2009-05-23 20:46                             ` Kay Sievers
  0 siblings, 0 replies; 200+ messages in thread
From: Kay Sievers @ 2009-05-23 20:46 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: Tejun Heo, Andrew Morton, Greg Kroah-Hartman, linux-kernel,
	Cornelia Huck, linux-fsdevel, Eric W. Biederman

On Sat, May 23, 2009 at 22:09, Eric W. Biederman <ebiederm@xmission.com> wrote:
> Kay Sievers <kay.sievers@vrfy.org> writes:
>> What's the problem in /sys/dev/? There are just a bunch of symlinks,
>> one for every device with a dev_t, and all in flat directories, and no
>> directory to remove.
>
> device_shutdown called during reboot removes /sys/dev/block and /sys/dev/char.
> The current sysfs_remove_dir (because it empties directories)
> removes all of those symlinks.
>
> The problem is that it is the device objects for each individual
> device that owns those symlinks, and normally removes those symlinks.
>
> Which means that in theory we could have double deletion going on.
>
> In practice today it doesn't matter because this is at reboot.
>
> And as far as that goes it is wrong to remove anything from sysfs during
> device_shutdown so the fix is just to not call kobject_put there.

Yes, that's just a bug. These directories should never be removed.

Thanks,
Kay

^ permalink raw reply	[flat|nested] 200+ messages in thread

* Re: [PATCH 25/20] sysfs: Only support removing empty sysfs directories.
  2009-05-23 20:13 ` [PATCH 25/20] sysfs: Only support removing empty sysfs directories Eric W. Biederman
@ 2009-05-23 21:27   ` Kay Sievers
  2009-05-24 12:59     ` Kay Sievers
  2009-05-24  3:24   ` Tejun Heo
  1 sibling, 1 reply; 200+ messages in thread
From: Kay Sievers @ 2009-05-23 21:27 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: Andrew Morton, Greg Kroah-Hartman, linux-kernel, Tejun Heo,
	Cornelia Huck, linux-fsdevel, Eric W. Biederman

On Sat, May 23, 2009 at 22:13, Eric W. Biederman <ebiederm@xmission.com> wrote:
> From: Eric W. Biederman <ebiederm@xmission.com>
>
> I have looked and I have not found a single legitimate case today where
> we remove sysfs directories with anything in them.  The only case I have
> found to date was a bug.  It was a problem of ownership.  The files in
> the directory were not owned by the directory itself, leaving open
> the potential for double deletion of the directory contents.
>
> Signed-off-by: Eric W. Biederman <ebiederm@aristanetworks.com>

Sounds good to me. We should try that, and see if there was any valid
use case we didn't think of, and if not, it's good to do what this
patch does.

Acked-by: Kay Sievers <kay.sievers@vrfy.org>

Thanks,
Kay

^ permalink raw reply	[flat|nested] 200+ messages in thread

* Re: [PATCH 24/20] sysfs: In sysfs_add_one fail if the target directory has been removed.
  2009-05-23 20:13 ` [PATCH 24/20] sysfs: In sysfs_add_one fail if the target directory has been removed Eric W. Biederman
@ 2009-05-23 21:29   ` Kay Sievers
  0 siblings, 0 replies; 200+ messages in thread
From: Kay Sievers @ 2009-05-23 21:29 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: Andrew Morton, Greg Kroah-Hartman, linux-kernel, Tejun Heo,
	Cornelia Huck, linux-fsdevel, Eric W. Biederman

On Sat, May 23, 2009 at 22:13, Eric W. Biederman <ebiederm@xmission.com> wrote:
> From: Eric W. Biederman <ebiederm@xmission.com>
>
> If a bug in the upper layers results in someone attempting to add
> to a sysfs directory that has already been removed, warn about it
> and fail.
>
> I don't believe this has ever happened, and it certainly never should
> happen, but be strict to avoid errors creeping in.
>
> Signed-off-by: Eric W. Biederman <ebiederm@aristanetworks.com>

Sounds like a good idea.

Acked-by: Kay Sievers <kay.sievers@vrfy.org>

Thanks,
Kay

^ permalink raw reply	[flat|nested] 200+ messages in thread

* Re: [PATCH 22/20] sysfs: Make sysfs_rename_link atomic
  2009-05-23 20:13 ` [PATCH 22/20] sysfs: Make sysfs_rename_link atomic Eric W. Biederman
@ 2009-05-23 21:32   ` Kay Sievers
  2009-05-23 23:21     ` Kay Sievers
  0 siblings, 1 reply; 200+ messages in thread
From: Kay Sievers @ 2009-05-23 21:32 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: Andrew Morton, Greg Kroah-Hartman, linux-kernel, Tejun Heo,
	Cornelia Huck, linux-fsdevel, Eric W. Biederman

On Sat, May 23, 2009 at 22:13, Eric W. Biederman <ebiederm@xmission.com> wrote:
> From: Eric W. Biederman <ebiederm@xmission.com>
>
> Use the existing sysfs_rename to make sysfs_rename_link an atomic
> operation that does less work.  While I am at it, add additional sanity
> checking to ensure it is a symlink I am renaming.
>
> Signed-off-by: Eric W. Biederman <ebiederm@aristanetworks.com>

Looks great, and so much better than the remove and re-create thing.

Acked-by: Kay Sievers <kay.sievers@vrfy.org>

Thanks,
Kay
--
To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 200+ messages in thread

* Re: [PATCH 23/20] driver core: Don't remove kobjects in device_shutdown.
  2009-05-23 20:13 ` [PATCH 23/20] driver core: Don't remove kobjects in device_shutdown Eric W. Biederman
@ 2009-05-23 22:15   ` Kay Sievers
  0 siblings, 0 replies; 200+ messages in thread
From: Kay Sievers @ 2009-05-23 22:15 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: Andrew Morton, Greg Kroah-Hartman, linux-kernel, Tejun Heo,
	Cornelia Huck, linux-fsdevel, Eric W. Biederman

On Sat, May 23, 2009 at 22:13, Eric W. Biederman <ebiederm@xmission.com> wrote:
> From: Eric W. Biederman <ebiederm@xmission.com>
>
> device_shutdown is defined to just shutdown the hardware and to not
> clean up any kernel data structures.  Therefore don't put the kobjects
> for /sys/dev and /sys/dev/block and /sys/dev/char.
>
> This ensures we don't remove /sys/dev/block and /sys/dev/char while
> we still have symlinks from there to the actual devices.
>
> Signed-off-by: Eric W. Biederman <ebiederm@aristanetworks.com>

Looks good.

Acked-by: Kay Sievers <kay.sievers@vrfy.org>

Thanks,
Kay

^ permalink raw reply	[flat|nested] 200+ messages in thread

* Re: [PATCH 22/20] sysfs: Make sysfs_rename_link atomic
  2009-05-23 21:32   ` Kay Sievers
@ 2009-05-23 23:21     ` Kay Sievers
  2009-05-24 13:03       ` Kay Sievers
  0 siblings, 1 reply; 200+ messages in thread
From: Kay Sievers @ 2009-05-23 23:21 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: Andrew Morton, Greg Kroah-Hartman, linux-kernel, Tejun Heo,
	Cornelia Huck, linux-fsdevel, Eric W. Biederman

On Sat, May 23, 2009 at 23:32, Kay Sievers <kay.sievers@vrfy.org> wrote:
> On Sat, May 23, 2009 at 22:13, Eric W. Biederman <ebiederm@xmission.com> wrote:
>> From: Eric W. Biederman <ebiederm@xmission.com>
>>
>> Use the existing sysfs_rename to make sysfs_rename_link an atomic
>> operation that does less work.  While I am at it, add additional sanity
>> checking to ensure it is a symlink I am renaming.
>>
>> Signed-off-by: Eric W. Biederman <ebiederm@aristanetworks.com>
>
> Looks great, and so much better than the remove and re-create thing.

Do you have your git tree public somewhere, or do you mind sending me
an all-in-one patch? I'd like to give it a try here.

Thanks,
Kay

^ permalink raw reply	[flat|nested] 200+ messages in thread

* Re: [PATCH 25/20] sysfs: Only support removing empty sysfs directories.
  2009-05-23 20:13 ` [PATCH 25/20] sysfs: Only support removing empty sysfs directories Eric W. Biederman
  2009-05-23 21:27   ` Kay Sievers
@ 2009-05-24  3:24   ` Tejun Heo
  1 sibling, 0 replies; 200+ messages in thread
From: Tejun Heo @ 2009-05-24  3:24 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: Andrew Morton, Greg Kroah-Hartman, linux-kernel, Cornelia Huck,
	linux-fsdevel, Eric W. Biederman

Eric W. Biederman wrote:
> From: Eric W. Biederman <ebiederm@xmission.com>
> 
> I have looked and I have not found a single legitimate case today where
> we remove sysfs directories with anything in them.  The only case I have
> found to date was a bug.  It was a problem of ownership.  The files in
> the directory were not owned by the directory itself, leaving open
> the potential for double deletion of the directory contents.
> 
> Signed-off-by: Eric W. Biederman <ebiederm@aristanetworks.com>

For 21-24: Acked-by: Tejun Heo <tj@kernel.org>

25 can maybe be folded into an earlier patch in the series?

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 200+ messages in thread

* Re: [PATCH 25/20] sysfs: Only support removing empty sysfs directories.
  2009-05-23 21:27   ` Kay Sievers
@ 2009-05-24 12:59     ` Kay Sievers
  2009-05-24 14:17       ` Eric W. Biederman
  0 siblings, 1 reply; 200+ messages in thread
From: Kay Sievers @ 2009-05-24 12:59 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: Andrew Morton, Greg Kroah-Hartman, linux-kernel, Tejun Heo,
	Cornelia Huck, linux-fsdevel, Eric W. Biederman

On Sat, 2009-05-23 at 23:27 +0200, Kay Sievers wrote:
> On Sat, May 23, 2009 at 22:13, Eric W. Biederman <ebiederm@xmission.com> wrote:
> > From: Eric W. Biederman <ebiederm@xmission.com>
> >
> > I have looked and I have not found a single legitimate case today where
> > we remove sysfs directories with anything in them.  The only case I have
> > found to date was a bug.  It was a problem of ownership.  The files in
> > the directory were not owned by the directory itself, leaving open
> > the potential for double deletion of the directory contents.
> >
> > Signed-off-by: Eric W. Biederman <ebiederm@aristanetworks.com>
> 
> Sounds good to me. We should try that, and see if there was any valid
> use case we didn't think of, and if not, it's good to do what this
> patch does.

I get a bunch of warnings here. The question is if the users should be
fixed, the warning removed, and/or the auto-deletion added back?

I've added:
  -       WARN(dir_sd->s_dir.children,
  -               KERN_WARNING "sysfs: removing non-empty dir: %s\n",
  -               dir_sd->s_name);
  +       if (dir_sd->s_dir.children) {
  +               struct sysfs_dirent *sd;
  +
  +               WARN(dir_sd->s_dir.children,
  +                       KERN_WARNING "sysfs: removing non-empty dir: %s\n",
  +                       dir_sd->s_name);
  +               sd = dir_sd->s_dir.children;
  +               while (sd) {
  +                       printk(KERN_WARNING "%s/%s\n", dir_sd->s_name, sd->s_name);
  +                       sd = sd->s_sibling;
  +               }
  +       }

And get non-empty directories from CPU, SCSI, firmware_class, sound, block:
  sysfs: removing non-empty dir: state0
  state0/name
  state0/desc
  state0/latency
  state0/power
  state0/usage
  state0/time

  sysfs: removing non-empty dir: 0000:03:00.0
  0000:03:00.0/data
  0000:03:00.0/loading

  sysfs: removing non-empty dir: iosched
  iosched/quantum
  iosched/fifo_expire_sync
  iosched/fifo_expire_async
  iosched/back_seek_max
  iosched/back_seek_penalty
  iosched/slice_sync
  iosched/slice_async
  iosched/slice_async_rq
  iosched/slice_idle

  sysfs: removing non-empty dir: queue
  queue/nr_requests
  queue/read_ahead_kb
  queue/max_hw_sectors_kb
  queue/max_sectors_kb
  queue/scheduler
  queue/hw_sector_size
  queue/rotational
  queue/nomerges
  queue/rq_affinity
  queue/iostats

  sysfs: removing non-empty dir: 4:0:0:0
  4:0:0:0/queue_depth
  4:0:0:0/queue_type
  4:0:0:0/max_sectors

  sysfs: removing non-empty dir: host4
  host4/target4:0:0

  sysfs: removing non-empty dir: pcmC1D0c
  pcmC1D0c/pcm_class

  sysfs: removing non-empty dir: card1
  card1/id
  card1/number


Thanks,
Kay


^ permalink raw reply	[flat|nested] 200+ messages in thread

* Re: [PATCH 22/20] sysfs: Make sysfs_rename_link atomic
  2009-05-23 23:21     ` Kay Sievers
@ 2009-05-24 13:03       ` Kay Sievers
  0 siblings, 0 replies; 200+ messages in thread
From: Kay Sievers @ 2009-05-24 13:03 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: Andrew Morton, Greg Kroah-Hartman, linux-kernel, Tejun Heo,
	Cornelia Huck, linux-fsdevel, Eric W. Biederman

On Sun, May 24, 2009 at 01:21, Kay Sievers <kay.sievers@vrfy.org> wrote:
> On Sat, May 23, 2009 at 23:32, Kay Sievers <kay.sievers@vrfy.org> wrote:
>> On Sat, May 23, 2009 at 22:13, Eric W. Biederman <ebiederm@xmission.com> wrote:
>>> From: Eric W. Biederman <ebiederm@xmission.com>
>>>
>>> Use the existing sysfs_rename to make sysfs_rename_link an atomic
>>> operation that does less work.  While I am at it, add additional sanity
>>> checking to ensure it is a symlink I am renaming.
>>>
>>> Signed-off-by: Eric W. Biederman <ebiederm@aristanetworks.com>
>>
>> Looks great, and so much better than the remove and re-create thing.
>
> Do you have your git tree public somewhere, or do you mind sending me
> an all-in-one patch? I'd like to give it a try here.

Looks good. It survives a heavy hotplug setup just fine, renaming
works fine here, and sysfs looks right. There are a bunch of warnings
for some non-empty directories, which I reported in my reply to the
patch that added the warning.

Thanks,
Kay

^ permalink raw reply	[flat|nested] 200+ messages in thread

* Re: [PATCH 25/20] sysfs: Only support removing empty sysfs directories.
  2009-05-24 12:59     ` Kay Sievers
@ 2009-05-24 14:17       ` Eric W. Biederman
  2009-05-24 15:20         ` Kay Sievers
  0 siblings, 1 reply; 200+ messages in thread
From: Eric W. Biederman @ 2009-05-24 14:17 UTC (permalink / raw)
  To: Kay Sievers
  Cc: Andrew Morton, Greg Kroah-Hartman, linux-kernel, Tejun Heo,
	Cornelia Huck, linux-fsdevel, Eric W. Biederman

Kay Sievers <kay.sievers@vrfy.org> writes:

> On Sat, 2009-05-23 at 23:27 +0200, Kay Sievers wrote:
>> On Sat, May 23, 2009 at 22:13, Eric W. Biederman <ebiederm@xmission.com> wrote:
>> > From: Eric W. Biederman <ebiederm@xmission.com>
>> >
>> > I have looked and I have not found a single legitimate case today where
>> > we remove sysfs directories with anything in them.  The only case I have
>> > found to date was a bug.  It was a problem of ownership.  The files in
>> > the directory were not owned by the directory itself, leaving open
>> > the potential for double deletion of the directory contents.
>> >
>> > Signed-off-by: Eric W. Biederman <ebiederm@aristanetworks.com>
>> 
>> Sounds good to me. We should try that, and see if there was any valid
>> use case we didn't think of, and if not, it's good to do what this
>> patch does.
>
> I get a bunch of warnings here. The question is if the users should be
> fixed, the warning removed, and/or the auto-deletion added back?

Thanks for finding these.  I was afraid I hadn't looked far enough
to see if non-empty directories were a problem.

Most of these look like attributes, for which the non-empty directory
removal was correct.  Although I am puzzled by why we missed them.

host4/target4:0:0 worries me.  I don't have my head wrapped around
what that is yet.  But it looks like it is a directory (which we currently
do not handle correctly), and even more, it looks like that is quite
possibly two kobjects in a parent/child situation where the child
was not removed when the parent was.

It definitely warrants more investigation.

....

Let's make the plan to investigate these, and see how hard it would be
to actually remove these with the current device/sysfs infrastructure.

Fixing the users and adding back auto-deletion are the only two real options.

I expect we have uncovered at least one more real bug.  So I am inclined
to make the policy that we fix the users.

> I've added:
>   -       WARN(dir_sd->s_dir.children,
>   -               KERN_WARNING "sysfs: removing non-empty dir: %s\n",
>   -               dir_sd->s_name);
>   +       if (dir_sd->s_dir.children) {
>   +               struct sysfs_dirent *sd;
>   +
>   +               WARN(dir_sd->s_dir.children,
>   +                       KERN_WARNING "sysfs: removing non-empty dir: %s\n",
>   +                       dir_sd->s_name);
>   +               sd = dir_sd->s_dir.children;
>   +               while (sd) {
>   +                       printk(KERN_WARNING "%s/%s\n", dir_sd->s_name, sd->s_name);
>   +                       sd = sd->s_sibling;
>   +               }
>   +       }
>
> And get non-empty directories from CPU, SCSI, firmware_class, sound, block:
>   sysfs: removing non-empty dir: state0
>   state0/name
>   state0/desc
>   state0/latency
>   state0/power
>   state0/usage
>   state0/time
>
>   sysfs: removing non-empty dir: 0000:03:00.0
>   0000:03:00.0/data
>   0000:03:00.0/loading
>
>   sysfs: removing non-empty dir: iosched
>   iosched/quantum
>   iosched/fifo_expire_sync
>   iosched/fifo_expire_async
>   iosched/back_seek_max
>   iosched/back_seek_penalty
>   iosched/slice_sync
>   iosched/slice_async
>   iosched/slice_async_rq
>   iosched/slice_idle
>
>   sysfs: removing non-empty dir: queue
>   queue/nr_requests
>   queue/read_ahead_kb
>   queue/max_hw_sectors_kb
>   queue/max_sectors_kb
>   queue/scheduler
>   queue/hw_sector_size
>   queue/rotational
>   queue/nomerges
>   queue/rq_affinity
>   queue/iostats
>
>   sysfs: removing non-empty dir: 4:0:0:0
>   4:0:0:0/queue_depth
>   4:0:0:0/queue_type
>   4:0:0:0/max_sectors
>
>   sysfs: removing non-empty dir: host4
>   host4/target4:0:0
>
>   sysfs: removing non-empty dir: pcmC1D0c
>   pcmC1D0c/pcm_class
>
>   sysfs: removing non-empty dir: card1
>   card1/id
>   card1/number

Eric

^ permalink raw reply	[flat|nested] 200+ messages in thread

* Re: [PATCH 25/20] sysfs: Only support removing empty sysfs directories.
  2009-05-24 14:17       ` Eric W. Biederman
@ 2009-05-24 15:20         ` Kay Sievers
  2009-05-25  2:06           ` Alan Stern
  2009-05-25  7:44           ` Eric W. Biederman
  0 siblings, 2 replies; 200+ messages in thread
From: Kay Sievers @ 2009-05-24 15:20 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: Andrew Morton, Greg Kroah-Hartman, linux-kernel, Tejun Heo,
	Cornelia Huck, linux-fsdevel, Eric W. Biederman, stern

On Sun, 2009-05-24 at 07:17 -0700, Eric W. Biederman wrote:

> Most of these look like attributes, for which the non-empty directory
> removal was correct.  Although I am puzzled by why we missed them.

Yes, most of them are attributes. I have USB hubs here with lots of
devices connected, setups I use for udev testing, so this might trigger
things that usually don't happen.

> host4/target4:0:0 worries me.  I don't have my head wrapped around
> what that is yet.  But it looks like it is a directory (which we currently
> do not handle correctly), and even more, it looks like that is quite
> possibly two kobjects in a parent/child situation where the child
> was not removed when the parent was.
> 
> It definitely warrants more investigation.

It seems like a real bug. We get:
  dir:   host5/target5:0:0

and we removed a parent from an active child, and the device path misses
all its parents:
  'target7:0:0' (ffff880124eb1158): fill_kobj_path: path = '/host7/target7:0:0'

it gets cleaned up:
  'target7:0:0': free name

but it should still be fixed in the user. Adding Alan Stern, maybe he
has an idea what's going on here.

Note that it is hard to reproduce. It only happens with frequent
connects/disconnects on a hub full of devices. But it still seems like a
real bug in the USB device cleanup logic. At the end of this mail is the
log of all files which did exist at cleanup.

> Let's make the plan to investigate these, and see how hard it would be
> to actually remove these with the current device/sysfs infrastructure.
> 
> Fixing the users and adding back auto-deletion are the only two real options.

It seems we should remove non-directory files, which in most cases belong
to the kobject itself but are not covered by the user's cleanup logic.

But I think we should still warn if we find a sub-directory inside a
directory we are going to remove.
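
The policy suggested above (quietly drop leftover attribute files, but
warn about any leftover sub-directory) could look roughly like this in
a userspace model.  This is a sketch under stated assumptions, not sysfs
code: struct toy_node, enum toy_type and toy_remove_dir_contents are
hypothetical names, and stderr stands in for the kernel WARN.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

enum toy_type { TOY_ATTR, TOY_DIR };

/* Hypothetical, much simplified model of a sysfs entry. */
struct toy_node {
	const char *name;
	enum toy_type type;
	struct toy_node *children;	/* first child, directories only */
	struct toy_node *sibling;	/* next entry in the same directory */
};

/*
 * Unlink every non-directory child of dir.  Sub-directories are left in
 * place and reported.  Returns the number of sub-directories left behind.
 */
int toy_remove_dir_contents(struct toy_node *dir)
{
	struct toy_node **link;
	int leftover_dirs = 0;

	assert(dir != NULL);

	link = &dir->children;
	while (*link) {
		struct toy_node *sd = *link;

		if (sd->type == TOY_DIR) {
			fprintf(stderr, "toy: sub-directory left behind: %s/%s\n",
				dir->name, sd->name);
			leftover_dirs++;
			link = &sd->sibling;	/* keep it and move on */
		} else {
			*link = sd->sibling;	/* unlink the attribute */
			sd->sibling = NULL;
		}
	}
	return leftover_dirs;
}
```

With a directory holding two attributes and one sub-directory, the call
returns 1 and only the sub-directory remains linked, which matches the
host4/target4:0:0 style of report in the logs.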

Thanks,
Kay



attr:  state0/name
attr:  state0/desc
attr:  state0/latency
attr:  state0/power
attr:  state0/usage
attr:  state0/time
attr:  state1/name
attr:  state1/desc
attr:  state1/latency
attr:  state1/power
attr:  state1/usage
attr:  state1/time
attr:  state2/name
attr:  state2/desc
attr:  state2/latency
attr:  state2/power
attr:  state2/usage
attr:  state2/time
attr:  state3/name
attr:  state3/desc
attr:  state3/latency
attr:  state3/power
attr:  state3/usage
attr:  state3/time
attr:  state0/name
attr:  state0/desc
attr:  state0/latency
attr:  state0/power
attr:  state0/usage
attr:  state0/time
attr:  state1/name
attr:  state1/desc
attr:  state1/latency
attr:  state1/power
attr:  state1/usage
attr:  state1/time
attr:  state2/name
attr:  state2/desc
attr:  state2/latency
attr:  state2/power
attr:  state2/usage
attr:  state2/time
attr:  state3/name
attr:  state3/desc
attr:  state3/latency
attr:  state3/power
attr:  state3/usage
attr:  state3/time
battr: 0000:03:00.0/data
attr:  0000:03:00.0/loading
attr:  pcmC1D0c/pcm_class
attr:  card1/id
attr:  card1/number
attr:  iosched/quantum
attr:  iosched/fifo_expire_sync
attr:  iosched/fifo_expire_async
attr:  iosched/back_seek_max
attr:  iosched/back_seek_penalty
attr:  iosched/slice_sync
attr:  iosched/slice_async
attr:  iosched/slice_async_rq
attr:  iosched/slice_idle
attr:  queue/nr_requests
attr:  queue/read_ahead_kb
attr:  queue/max_hw_sectors_kb
attr:  queue/max_sectors_kb
attr:  queue/scheduler
attr:  queue/hw_sector_size
attr:  queue/rotational
attr:  queue/nomerges
attr:  queue/rq_affinity
attr:  queue/iostats
attr:  5:0:0:0/queue_depth
attr:  5:0:0:0/queue_type
attr:  5:0:0:0/max_sectors
attr:  iosched/quantum
attr:  iosched/fifo_expire_sync
attr:  iosched/fifo_expire_async
attr:  iosched/back_seek_max
attr:  iosched/back_seek_penalty
attr:  iosched/slice_sync
attr:  iosched/slice_async
attr:  iosched/slice_async_rq
attr:  iosched/slice_idle
attr:  queue/nr_requests
attr:  queue/read_ahead_kb
attr:  queue/max_hw_sectors_kb
attr:  queue/max_sectors_kb
attr:  queue/scheduler
attr:  queue/hw_sector_size
attr:  queue/rotational
attr:  queue/nomerges
attr:  queue/rq_affinity
attr:  queue/iostats
attr:  5:0:0:1/queue_depth
attr:  5:0:0:1/queue_type
attr:  5:0:0:1/max_sectors
dir:   host5/target5:0:0
attr:  pcmC1D0c/pcm_class
attr:  card1/id
attr:  card1/number
attr:  pcmC1D0c/pcm_class
attr:  card1/id
attr:  card1/number
attr:  iosched/quantum
attr:  iosched/fifo_expire_sync
attr:  iosched/fifo_expire_async
attr:  iosched/back_seek_max
attr:  iosched/back_seek_penalty
attr:  iosched/slice_sync
attr:  iosched/slice_async
attr:  iosched/slice_async_rq
attr:  iosched/slice_idle
attr:  queue/nr_requests
attr:  queue/read_ahead_kb
attr:  queue/max_hw_sectors_kb
attr:  queue/max_sectors_kb
attr:  queue/scheduler
attr:  queue/hw_sector_size
attr:  queue/rotational
attr:  queue/nomerges
attr:  queue/rq_affinity
attr:  queue/iostats
attr:  6:0:0:0/queue_depth
attr:  6:0:0:0/queue_type
attr:  6:0:0:0/max_sectors
attr:  iosched/quantum
attr:  iosched/fifo_expire_sync
attr:  iosched/fifo_expire_async
attr:  iosched/back_seek_max
attr:  iosched/back_seek_penalty
attr:  iosched/slice_sync
attr:  iosched/slice_async
attr:  iosched/slice_async_rq
attr:  iosched/slice_idle
attr:  queue/nr_requests
attr:  queue/read_ahead_kb
attr:  queue/max_hw_sectors_kb
attr:  queue/max_sectors_kb
attr:  queue/scheduler
attr:  queue/hw_sector_size
attr:  queue/rotational
attr:  queue/nomerges
attr:  queue/rq_affinity
attr:  queue/iostats
attr:  6:0:0:1/queue_depth
attr:  6:0:0:1/queue_type
attr:  6:0:0:1/max_sectors
attr:  pcmC1D0c/pcm_class
attr:  card1/id
attr:  card1/number
attr:  pcmC1D0c/pcm_class
attr:  card1/id
attr:  card1/number


^ permalink raw reply	[flat|nested] 200+ messages in thread

* Re: [PATCH 25/20] sysfs: Only support removing empty sysfs directories.
  2009-05-24 15:20         ` Kay Sievers
@ 2009-05-25  2:06           ` Alan Stern
  2009-05-25 11:45             ` Kay Sievers
  2009-05-25  7:44           ` Eric W. Biederman
  1 sibling, 1 reply; 200+ messages in thread
From: Alan Stern @ 2009-05-25  2:06 UTC (permalink / raw)
  To: Kay Sievers
  Cc: Eric W. Biederman, Andrew Morton, Greg Kroah-Hartman,
	linux-kernel, Tejun Heo, Cornelia Huck, linux-fsdevel,
	Eric W. Biederman

On Sun, 24 May 2009, Kay Sievers wrote:

> On Sun, 2009-05-24 at 07:17 -0700, Eric W. Biederman wrote:
> 
> > Most of these look like attributes, for which the non-empty directory
> > removal was correct.  Although I am puzzled by why we missed them.
> 
> Yes, most of them are attributes. I have USB hubs here with lots of
> devices connected, setups I use for udev testing, so this might trigger
> things that usually don't happen.
> 
> > host4/target4:0:0 worries me.  I don't have my head wrapped around
> > what that is yet.  But it looks like it is a directory (which we currently
> > do not handle correctly), and even more, it looks like that is quite
> > possibly two kobjects in a parent/child situation where the child
> > was not removed when the parent was.
> > 
> > It definitely warrants more investigation.
> 
> It seems like a real bug. We get:
>   dir:   host5/target5:0:0
> 
> and we removed a parent from an active child, and the device path misses
> all its parents:
>   'target7:0:0' (ffff880124eb1158): fill_kobj_path: path = '/host7/target7:0:0'
> 
> it gets cleaned up:
>   'target7:0:0': free name
> 
> but it should still be fixed in the user. Adding Alan Stern, maybe he
> has an idea what's going on here.
> 
> Note that it is hard to reproduce. It only happens with frequent
> connects/disconnects on a hub full of devices. But it still seems like a
> real bug in the USB device cleanup logic. At the end of this mail is the
> log of all files which did exist at cleanup.

This looks like a bug I found almost two weeks ago.  The bug was
introduced by Arjan as part of his async conversion (the async routine
runs without acquiring one of the mutexes held by its caller).  The
result is a race in the sd driver -- no connection with USB, by the way
-- so it's a little difficult to trigger.  I posted a patch, but the
reporter never said whether or not the patch fixed the problem.  Hence
the patch hasn't been submitted.

Here it is for you to try out.

Alan Stern



Index: usb-2.6/drivers/scsi/sd.c
===================================================================
--- usb-2.6.orig/drivers/scsi/sd.c
+++ usb-2.6/drivers/scsi/sd.c
@@ -1892,12 +1892,16 @@ static int sd_format_disk_name(char *pre
 static void sd_probe_async(void *data, async_cookie_t cookie)
 {
 	struct scsi_disk *sdkp = data;
-	struct scsi_device *sdp;
+	struct scsi_device *sdp = sdkp->device;
+	struct Scsi_Host *shost = sdp->host;
 	struct gendisk *gd;
 	u32 index;
 	struct device *dev;
 
-	sdp = sdkp->device;
+	mutex_lock(&shost->scan_mutex);
+	if (!scsi_host_scan_allowed(shost))
+		goto out_unlock_host;
+
 	gd = sdkp->disk;
 	index = sdkp->index;
 	dev = &sdp->sdev_gendev;
@@ -1915,8 +1919,10 @@ static void sd_probe_async(void *data, a
 	sdkp->dev.class = &sd_disk_class;
 	dev_set_name(&sdkp->dev, dev_name(&sdp->sdev_gendev));
 
-	if (device_add(&sdkp->dev))
-		goto out_free_index;
+	if (device_add(&sdkp->dev)) {
+		ida_remove(&sd_index_ida, index);
+		goto out_unlock_host;
+	}
 
 	get_device(&sdp->sdev_gendev);
 
@@ -1955,10 +1961,8 @@ static void sd_probe_async(void *data, a
 	sd_printk(KERN_NOTICE, sdkp, "Attached SCSI %sdisk\n",
 		  sdp->removable ? "removable " : "");
 
-	return;
-
- out_free_index:
-	ida_remove(&sd_index_ida, index);
+ out_unlock_host:
+	mutex_unlock(&shost->scan_mutex);
 }
 
 /**


^ permalink raw reply	[flat|nested] 200+ messages in thread

* Re: [PATCH 25/20] sysfs: Only support removing emtpy sysfs  directories.
  2009-05-24 15:20         ` Kay Sievers
  2009-05-25  2:06           ` Alan Stern
@ 2009-05-25  7:44           ` Eric W. Biederman
  2009-05-25  7:53             ` Eric W. Biederman
  1 sibling, 1 reply; 200+ messages in thread
From: Eric W. Biederman @ 2009-05-25  7:44 UTC (permalink / raw)
  To: Kay Sievers
  Cc: Andrew Morton, Greg Kroah-Hartman, linux-kernel, Tejun Heo,
	Cornelia Huck, linux-fsdevel, Eric W. Biederman, stern

Kay Sievers <kay.sievers@vrfy.org> writes:

>> Let's make the plan to investigate these, and see how hard it would be
>> to actually remove these with the current device/sysfs infrastructure.
>> 
>> Fixing the users and adding back auto-deletion are the only two real options.
>
> Seems, we should remove non-directory files, which in most cases belong
> to the kobject itself, but the user's cleanup logic does not cover the
> removal of the created files.
>
> But I think, we should still warn, if we find a sub-directory inside a
> directory we are going to remove.

So far complaining about deleting non-empty directories is finding
real bugs.  It does not appear that too many users delete
non-empty directories.

My plan moving forward is to see what has goofed and how hard it is to
change the callers to clean up after themselves.  If it is not a pain
to fix the callers who forget to delete their attributes that looks
like the right way forward.

It is certainly the principle of least surprise.

Eric

^ permalink raw reply	[flat|nested] 200+ messages in thread

* Re: [PATCH 25/20] sysfs: Only support removing emtpy sysfs  directories.
  2009-05-25  7:44           ` Eric W. Biederman
@ 2009-05-25  7:53             ` Eric W. Biederman
  2009-05-25 10:51               ` Kay Sievers
  0 siblings, 1 reply; 200+ messages in thread
From: Eric W. Biederman @ 2009-05-25  7:53 UTC (permalink / raw)
  To: Kay Sievers
  Cc: Andrew Morton, Greg Kroah-Hartman, linux-kernel, Tejun Heo,
	Cornelia Huck, linux-fsdevel, Eric W. Biederman, stern

ebiederm@xmission.com (Eric W. Biederman) writes:

> Kay Sievers <kay.sievers@vrfy.org> writes:
>
>>> Let's make the plan to investigate these, and see how hard it would be
>>> to actually remove these with the current device/sysfs infrastructure.
>>> 
>>> Fixing the users and adding back auto-deletion are the only two real options.
>>
>> Seems, we should remove non-directory files, which in most cases belong
>> to the kobject itself, but the user's cleanup logic does not cover the
>> removal of the created files.
>>
>> But I think, we should still warn, if we find a sub-directory inside a
>> directory we are going to remove.
>
> So far complaining about deleting non-empty directories is finding
> real bugs.  It does not appear that too many users delete
> non-empty directories.
>
> My plan moving forward is to see what has goofed and how hard it is to
> change the callers to clean up after themselves.  If it is not a pain
> to fix the callers who forget to delete their attributes that looks
> like the right way forward.
>
> It is certainly the principle of least surprise.

Currently I expect that those attributes that are not deleted may
actually be bugs as well.  If a driver manually adds sysfs files after
device_create/device_register the uevent will have already been sent
and you cannot safely use those attributes when processing a hotplug
event.
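
To make the ordering problem concrete, here is a rough userspace sketch,
not kernel code.  Every model_* name is made up for illustration, and
"registering" is reduced to recording how many attributes exist at the
moment the add event would fire.  The only point is that an attribute
added after registration was not there when the event went out.

```c
#include <assert.h>
#include <string.h>

#define MAX_ATTRS 8

/* Hypothetical stand-in for a device; NOT the kernel's struct device. */
struct model_device {
	const char *attrs[MAX_ATTRS];	/* attribute files currently visible */
	int nr_attrs;
	int attrs_at_uevent;		/* how many existed when "add" fired */
};

/* Make one more attribute file visible. */
void model_add_attr(struct model_device *dev, const char *name)
{
	assert(dev->nr_attrs < MAX_ATTRS);
	dev->attrs[dev->nr_attrs++] = name;
}

/* Registering announces the device; a listener sees only what exists now. */
void model_register(struct model_device *dev)
{
	dev->attrs_at_uevent = dev->nr_attrs;
}

/* Would a hotplug handler running at event time have found this file? */
int model_visible_at_uevent(const struct model_device *dev, const char *name)
{
	int i;

	for (i = 0; i < dev->attrs_at_uevent; i++)
		if (strcmp(dev->attrs[i], name) == 0)
			return 1;
	return 0;
}
```

In this model, calling model_add_attr() before model_register() makes the
attribute visible to the event handler; calling it afterwards does not.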

Eric

^ permalink raw reply	[flat|nested] 200+ messages in thread

* Re: [PATCH 25/20] sysfs: Only support removing emtpy sysfs directories.
  2009-05-25  7:53             ` Eric W. Biederman
@ 2009-05-25 10:51               ` Kay Sievers
  0 siblings, 0 replies; 200+ messages in thread
From: Kay Sievers @ 2009-05-25 10:51 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: Andrew Morton, Greg Kroah-Hartman, linux-kernel, Tejun Heo,
	Cornelia Huck, linux-fsdevel, Eric W. Biederman, stern

On Mon, May 25, 2009 at 09:53, Eric W. Biederman <ebiederm@xmission.com> wrote:
> ebiederm@xmission.com (Eric W. Biederman) writes:

> Currently I expect that those attributes that are not deleted may
> actually be bugs as well.  If a driver manually adds sysfs files after
> device_create/device_register the uevent will have already been sent
> and you cannot safely use those attributes when processing a hotplug
> event.

That's true, and would be good to fix. I guess we need to come up with
a way of maintaining binary attributes in the same way as "default"
attributes. Today they need to be created manually.

Maybe we can get your patch series in, with the version that removes
the files but warns on existing directories, then work on fixing the
users, and possibly drop the deletion of files once all the current
users are fixed?
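
Roughly, the removal policy I have in mind could be sketched like this.
It is a userspace toy, not sysfs code: struct node, node_add() and
remove_dir_policy() are hypothetical names, and the warning is reduced
to a counter where the kernel would print a WARN.

```c
#include <assert.h>

enum node_type { NODE_FILE, NODE_DIR };

#define MAX_CHILDREN 8

/* Hypothetical directory entry; NOT struct sysfs_dirent. */
struct node {
	enum node_type type;
	const char *name;
	struct node *children[MAX_CHILDREN];
	int nr_children;
};

/* Link a child into a directory. */
void node_add(struct node *parent, struct node *child)
{
	assert(parent->type == NODE_DIR);
	assert(parent->nr_children < MAX_CHILDREN);
	parent->children[parent->nr_children++] = child;
}

/*
 * Remove a directory: plain files are dropped silently, leftover
 * subdirectories are counted as warnings.  In this toy model every
 * child is unlinked either way.  Returns the number of subdirectories
 * that were still present.
 */
int remove_dir_policy(struct node *dir)
{
	int i, warnings = 0;

	for (i = 0; i < dir->nr_children; i++)
		if (dir->children[i]->type == NODE_DIR)
			warnings++;	/* the kernel would warn here */
	dir->nr_children = 0;
	return warnings;
}
```

A directory holding only attribute files is removed quietly; one that
still holds a subdirectory such as a target under a host is reported.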

Thanks,
Kay
--
To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 200+ messages in thread

* Re: [PATCH 25/20] sysfs: Only support removing emtpy sysfs directories.
  2009-05-25  2:06           ` Alan Stern
@ 2009-05-25 11:45             ` Kay Sievers
  2009-05-25 12:01               ` Kay Sievers
  2009-05-26 16:27               ` Kay Sievers
  0 siblings, 2 replies; 200+ messages in thread
From: Kay Sievers @ 2009-05-25 11:45 UTC (permalink / raw)
  To: Alan Stern
  Cc: Eric W. Biederman, Andrew Morton, Greg Kroah-Hartman,
	linux-kernel, Tejun Heo, Cornelia Huck, linux-fsdevel,
	Eric W. Biederman

On Mon, May 25, 2009 at 04:06, Alan Stern <stern@rowland.harvard.edu> wrote:

> This looks like a bug I found almost two weeks ago.  The bug was
> introduced by Arjan as part of his async conversion (the async routine
> runs without acquiring one of the mutexes held by its caller).  The
> result is a race in the sd driver -- no connection with USB,

Ah, I see.

> by the way -- so it's a little difficult to trigger.

I can trigger it pretty reliably now on plain -rc7, but only with
more hubs in front of the storage device. It usually takes fewer than
10-15 connect/disconnect cycles.

It looks like a serious bug, though: after the bug has triggered, random,
likely unrelated applications crash, and I cannot cleanly shut down
anymore.

> I posted a patch, but the
> reporter never said whether or not the patch fixed the problem.  Hence
> the patch hasn't been submitted.
>
> Here it is for you to try out.

I'll give it a try now.

Kay

^ permalink raw reply	[flat|nested] 200+ messages in thread

* Re: [PATCH 25/20] sysfs: Only support removing emtpy sysfs directories.
  2009-05-25 11:45             ` Kay Sievers
@ 2009-05-25 12:01               ` Kay Sievers
  2009-05-25 15:49                 ` Alan Stern
  2009-05-26 16:27               ` Kay Sievers
  1 sibling, 1 reply; 200+ messages in thread
From: Kay Sievers @ 2009-05-25 12:01 UTC (permalink / raw)
  To: Alan Stern
  Cc: Eric W. Biederman, Andrew Morton, Greg Kroah-Hartman,
	linux-kernel, Tejun Heo, Cornelia Huck, linux-fsdevel,
	Eric W. Biederman

On Mon, 2009-05-25 at 13:45 +0200, Kay Sievers wrote:
> On Mon, May 25, 2009 at 04:06, Alan Stern <stern@rowland.harvard.edu> wrote:

> > by the way -- so it's a little difficult to trigger.
> 
> I can trigger it pretty reliably now on plain -rc7, but only with
> more hubs in front of the storage device. It usually takes fewer than
> 10-15 connect/disconnect cycles.
> 
> It looks like a serious bug, though: after the bug has triggered, random,
> likely unrelated applications crash, and I cannot cleanly shut down
> anymore.
> 
> > I posted a patch, but the
> > reporter never said whether or not the patch fixed the problem.  Hence
> > the patch hasn't been submitted.
> >
> > Here it is for you to try out.
> 
> I'll give it a try now.

It still shows the same issue. Here is the trace with the target
directory left-over, when the host directory goes away: "host5/target5:0:0",
and the devpath with the parents lost "path = '/host5/target5:0:0'":
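
As a rough illustration of why the devpath comes out truncated, here is
a userspace toy, not the kobject code.  The model_* names are made up,
and it assumes that deleting an object clears its parent pointer, which
is what the shortened path in the trace suggests.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a kobject: a name and a parent link. */
struct model_kobj {
	const char *name;
	struct model_kobj *parent;
};

/* Build "/a/b/c" by walking parent pointers up to the root. */
void model_path(const struct model_kobj *kobj, char *buf, size_t len)
{
	assert(len > 0);
	buf[0] = '\0';
	for (; kobj; kobj = kobj->parent) {
		size_t nlen = strlen(kobj->name);
		size_t cur = strlen(buf);

		assert(nlen + 1 + cur < len);	/* room for "/name" and NUL */
		memmove(buf + nlen + 1, buf, cur + 1);	/* shift tail right */
		buf[0] = '/';
		memcpy(buf + 1, kobj->name, nlen);
	}
}

/* Deleting an object unlinks it from its parent (assumption, see above). */
void model_del(struct model_kobj *kobj)
{
	kobj->parent = NULL;
}
```

With interface -> host -> target linked up, model_path() on the target
walks all the way to the top.  Once model_del() has run on the host
while the target is still attached, the walk stops at the host and the
target's path shrinks to '/host5/target5:0:0'.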

Thanks,
Kay


[   58.399021] kobject: 'host5' (ffff88012c52b558): kobject_uevent_env
[   58.399041] kobject: 'host5' (ffff88012c52b558): fill_kobj_path: path = '/devices/pci0000:00/0000:00:1d.7/usb2/2-2/2-2.4/2-2.4:1.0/host5/scsi_host/host5'
[   58.399213] kobject: 'scsi_host' (ffff88012548cdd0): kobject_cleanup
[   58.399217] kobject: 'scsi_host' (ffff88012548cdd0): auto cleanup kobject_del
[   58.399236] kobject: 'scsi_host' (ffff88012548cdd0): calling ktype release
[   58.399239] kobject: (ffff88012548cdd0): dynamic_kobj_release
[   58.399243] kobject: 'scsi_host': free name
[   58.399247] kobject: 'host5' (ffff88012c52b558): kobject_cleanup
[   58.399250] kobject: 'host5' (ffff88012c52b558): calling ktype release
[   58.399255] kobject: 'host5': free name
[   58.399315] kobject: 'host5' (ffff88012c52b388): kobject_uevent_env
[   58.399335] kobject: 'host5' (ffff88012c52b388): fill_kobj_path: path = '/devices/pci0000:00/0000:00:1d.7/usb2/2-2/2-2.4/2-2.4:1.0/host5'
[   58.399484] ------------[ cut here ]------------
[   58.399493] WARNING: at fs/sysfs/dir.c:794 sysfs_remove_dir+0xb2/0xd0()
[   58.399497] Hardware name: 2776LEG
[   58.399499] XXX dir: host5/target5:0:0
...
[   58.399594] Pid: 226, comm: khubd Not tainted 2.6.30-rc7-dirty #40
[   58.399598] Call Trace:
[   58.399605]  [<ffffffff8023c6a8>] warn_slowpath_common+0x78/0xb0
[   58.399610]  [<ffffffff8023c73c>] warn_slowpath_fmt+0x3c/0x40
[   58.399614]  [<ffffffff803229f6>] ? sysfs_addrm_start+0x76/0xd0
[   58.399619]  [<ffffffff80323012>] sysfs_remove_dir+0xb2/0xd0
[   58.399626]  [<ffffffff803e5036>] kobject_del+0x16/0x40
[   58.399632]  [<ffffffff80477645>] device_del+0x165/0x1a0
[   58.399638]  [<ffffffff8048256f>] scsi_remove_host+0xcf/0x120
[   58.399652]  [<ffffffffa02fd3cb>] quiesce_and_remove_host+0x6b/0xb0 [usb_storage]
[   58.399662]  [<ffffffffa02fd4f8>] usb_stor_disconnect+0x18/0x30 [usb_storage]
[   58.399686]  [<ffffffffa0062fae>] usb_unbind_interface+0x6e/0x140 [usbcore]
[   58.399694]  [<ffffffff80479e29>] __device_release_driver+0x59/0xa0
[   58.399699]  [<ffffffff80479f68>] device_release_driver+0x28/0x40
[   58.399704]  [<ffffffff8047927c>] bus_remove_device+0xac/0xe0
[   58.399709]  [<ffffffff80477607>] device_del+0x127/0x1a0
[   58.399726]  [<ffffffffa005fb77>] usb_disable_device+0xa7/0x130 [usbcore]
[   58.399744]  [<ffffffffa005a818>] usb_disconnect+0xc8/0x140 [usbcore]
[   58.399761]  [<ffffffffa005a804>] usb_disconnect+0xb4/0x140 [usbcore]
[   58.399778]  [<ffffffffa005b8db>] hub_thread+0x50b/0x1230 [usbcore]
[   58.399784]  [<ffffffff80565a56>] ? _spin_unlock_irq+0x26/0x30
[   58.399790]  [<ffffffff80237d1e>] ? finish_task_switch+0x7e/0x140
[   58.399795]  [<ffffffff80237cdb>] ? finish_task_switch+0x3b/0x140
[   58.399802]  [<ffffffff802549e0>] ? autoremove_wake_function+0x0/0x40
[   58.399818]  [<ffffffffa005b3d0>] ? hub_thread+0x0/0x1230 [usbcore]
[   58.399823]  [<ffffffff802545b5>] kthread+0x55/0xa0
[   58.399829]  [<ffffffff8020cf3a>] child_rip+0xa/0x20
[   58.399833]  [<ffffffff80254560>] ? kthread+0x0/0xa0
[   58.399838]  [<ffffffff8020cf30>] ? child_rip+0x0/0x20
[   58.399842] ---[ end trace a5fdfdfd6227b73e ]---
...
[   58.853385] kobject: 'target5:0:0' (ffff880129980480): kobject_uevent_env
[   58.853405] kobject: 'target5:0:0' (ffff880129980480): fill_kobj_path: path = '/host5/target5:0:0'
[   58.853643] kobject: 'target5:0:0' (ffff880129980480): kobject_cleanup
[   58.853647] kobject: 'target5:0:0' (ffff880129980480): calling ktype release
[   58.853653] kobject: 'host5' (ffff88012c52b388): kobject_cleanup
[   58.853657] kobject: 'host5' (ffff88012c52b388): calling ktype release
[   58.853701] kobject: '2-2.4:1.0' (ffff8801255319f0): kobject_cleanup
[   58.853705] kobject: '2-2.4:1.0' (ffff8801255319f0): calling ktype release
[   58.853721] kobject: '2-2.4:1.0': free name
[   58.853736] kobject: 'host5': free name
[   58.853742] kobject: 'target5:0:0': free name
[   58.853748] kobject: '5:0:0:0': free name



^ permalink raw reply	[flat|nested] 200+ messages in thread

* Re: [PATCH 25/20] sysfs: Only support removing emtpy sysfs  directories.
  2009-05-25 12:01               ` Kay Sievers
@ 2009-05-25 15:49                 ` Alan Stern
  2009-05-25 18:19                   ` Kay Sievers
  0 siblings, 1 reply; 200+ messages in thread
From: Alan Stern @ 2009-05-25 15:49 UTC (permalink / raw)
  To: Kay Sievers
  Cc: James Bottomley, Boaz Harrosh, SCSI development list,
	Eric W. Biederman, Andrew Morton, Greg Kroah-Hartman,
	Kernel development list, Tejun Heo, Cornelia Huck, linux-fsdevel,
	Eric W. Biederman

Since this appears to be a bug in the SCSI layer, let's add some SCSI
people to the CC: list.

To summarize the problem: The SCSI core tries to unregister a host 
while its sysfs directory is still non-empty because the target hasn't 
been unregistered yet.

On Mon, 25 May 2009, Kay Sievers wrote:

> On Mon, 2009-05-25 at 13:45 +0200, Kay Sievers wrote:
> > On Mon, May 25, 2009 at 04:06, Alan Stern <stern@rowland.harvard.edu> wrote:
> 
> > > by the way -- so it's a little difficult to trigger.
> > 
> > I can trigger it pretty reliably now on plain -rc7, but only with
> > more hubs in front of the storage device. It usually takes fewer than
> > 10-15 connect/disconnect cycles.
> > 
> > It looks like a serious bug, though: after the bug has triggered, random,
> > likely unrelated applications crash, and I cannot cleanly shut down
> > anymore.
> > 
> > > I posted a patch, but the
> > > reporter never said whether or not the patch fixed the problem.  Hence
> > > the patch hasn't been submitted.
> > >
> > > Here it is for you to try out.
> > 
> > I'll give it a try now.
> 
> It still shows the same issue. Here is the trace with the target
> directory left-over, when the host directory goes away: "host5/target5:0:0",
> and the devpath with the parents lost "path = '/host5/target5:0:0'":
> 
> Thanks,
> Kay
> 
> 
> [   58.399021] kobject: 'host5' (ffff88012c52b558): kobject_uevent_env
> [   58.399041] kobject: 'host5' (ffff88012c52b558): fill_kobj_path: path = '/devices/pci0000:00/0000:00:1d.7/usb2/2-2/2-2.4/2-2.4:1.0/host5/scsi_host/host5'
> [   58.399213] kobject: 'scsi_host' (ffff88012548cdd0): kobject_cleanup
> [   58.399217] kobject: 'scsi_host' (ffff88012548cdd0): auto cleanup kobject_del
> [   58.399236] kobject: 'scsi_host' (ffff88012548cdd0): calling ktype release
> [   58.399239] kobject: (ffff88012548cdd0): dynamic_kobj_release
> [   58.399243] kobject: 'scsi_host': free name
> [   58.399247] kobject: 'host5' (ffff88012c52b558): kobject_cleanup
> [   58.399250] kobject: 'host5' (ffff88012c52b558): calling ktype release
> [   58.399255] kobject: 'host5': free name
> [   58.399315] kobject: 'host5' (ffff88012c52b388): kobject_uevent_env
> [   58.399335] kobject: 'host5' (ffff88012c52b388): fill_kobj_path: path = '/devices/pci0000:00/0000:00:1d.7/usb2/2-2/2-2.4/2-2.4:1.0/host5'
> [   58.399484] ------------[ cut here ]------------
> [   58.399493] WARNING: at fs/sysfs/dir.c:794 sysfs_remove_dir+0xb2/0xd0()
> [   58.399497] Hardware name: 2776LEG
> [   58.399499] XXX dir: host5/target5:0:0
> ...
> [   58.399594] Pid: 226, comm: khubd Not tainted 2.6.30-rc7-dirty #40
> [   58.399598] Call Trace:
> [   58.399605]  [<ffffffff8023c6a8>] warn_slowpath_common+0x78/0xb0
> [   58.399610]  [<ffffffff8023c73c>] warn_slowpath_fmt+0x3c/0x40
> [   58.399614]  [<ffffffff803229f6>] ? sysfs_addrm_start+0x76/0xd0
> [   58.399619]  [<ffffffff80323012>] sysfs_remove_dir+0xb2/0xd0
> [   58.399626]  [<ffffffff803e5036>] kobject_del+0x16/0x40
> [   58.399632]  [<ffffffff80477645>] device_del+0x165/0x1a0
> [   58.399638]  [<ffffffff8048256f>] scsi_remove_host+0xcf/0x120
> [   58.399652]  [<ffffffffa02fd3cb>] quiesce_and_remove_host+0x6b/0xb0 [usb_storage]
> [   58.399662]  [<ffffffffa02fd4f8>] usb_stor_disconnect+0x18/0x30 [usb_storage]
> [   58.399686]  [<ffffffffa0062fae>] usb_unbind_interface+0x6e/0x140 [usbcore]
> [   58.399694]  [<ffffffff80479e29>] __device_release_driver+0x59/0xa0
> [   58.399699]  [<ffffffff80479f68>] device_release_driver+0x28/0x40
> [   58.399704]  [<ffffffff8047927c>] bus_remove_device+0xac/0xe0
> [   58.399709]  [<ffffffff80477607>] device_del+0x127/0x1a0
> [   58.399726]  [<ffffffffa005fb77>] usb_disable_device+0xa7/0x130 [usbcore]
> [   58.399744]  [<ffffffffa005a818>] usb_disconnect+0xc8/0x140 [usbcore]
> [   58.399761]  [<ffffffffa005a804>] usb_disconnect+0xb4/0x140 [usbcore]
> [   58.399778]  [<ffffffffa005b8db>] hub_thread+0x50b/0x1230 [usbcore]
> [   58.399784]  [<ffffffff80565a56>] ? _spin_unlock_irq+0x26/0x30
> [   58.399790]  [<ffffffff80237d1e>] ? finish_task_switch+0x7e/0x140
> [   58.399795]  [<ffffffff80237cdb>] ? finish_task_switch+0x3b/0x140
> [   58.399802]  [<ffffffff802549e0>] ? autoremove_wake_function+0x0/0x40
> [   58.399818]  [<ffffffffa005b3d0>] ? hub_thread+0x0/0x1230 [usbcore]
> [   58.399823]  [<ffffffff802545b5>] kthread+0x55/0xa0
> [   58.399829]  [<ffffffff8020cf3a>] child_rip+0xa/0x20
> [   58.399833]  [<ffffffff80254560>] ? kthread+0x0/0xa0
> [   58.399838]  [<ffffffff8020cf30>] ? child_rip+0x0/0x20
> [   58.399842] ---[ end trace a5fdfdfd6227b73e ]---
> ...
> [   58.853385] kobject: 'target5:0:0' (ffff880129980480): kobject_uevent_env
> [   58.853405] kobject: 'target5:0:0' (ffff880129980480): fill_kobj_path: path = '/host5/target5:0:0'
> [   58.853643] kobject: 'target5:0:0' (ffff880129980480): kobject_cleanup
> [   58.853647] kobject: 'target5:0:0' (ffff880129980480): calling ktype release
> [   58.853653] kobject: 'host5' (ffff88012c52b388): kobject_cleanup
> [   58.853657] kobject: 'host5' (ffff88012c52b388): calling ktype release
> [   58.853701] kobject: '2-2.4:1.0' (ffff8801255319f0): kobject_cleanup
> [   58.853705] kobject: '2-2.4:1.0' (ffff8801255319f0): calling ktype release
> [   58.853721] kobject: '2-2.4:1.0': free name
> [   58.853736] kobject: 'host5': free name
> [   58.853742] kobject: 'target5:0:0': free name
> [   58.853748] kobject: '5:0:0:0': free name

Can you provide more of the context?  I'd like to see the log starting 
from when these devices were first registered.

Alan Stern


^ permalink raw reply	[flat|nested] 200+ messages in thread

* Re: [PATCH 25/20] sysfs: Only support removing emtpy sysfs directories.
  2009-05-25 15:49                 ` Alan Stern
@ 2009-05-25 18:19                   ` Kay Sievers
  2009-05-25 20:14                     ` Alan Stern
  0 siblings, 1 reply; 200+ messages in thread
From: Kay Sievers @ 2009-05-25 18:19 UTC (permalink / raw)
  To: Alan Stern
  Cc: James Bottomley, Boaz Harrosh, SCSI development list,
	Eric W. Biederman, Andrew Morton, Greg Kroah-Hartman,
	Kernel development list, Tejun Heo, Cornelia Huck, linux-fsdevel,
	Eric W. Biederman

On Mon, 2009-05-25 at 11:49 -0400, Alan Stern wrote:
> Since this appears to be a bug in the SCSI layer, let's add some SCSI
> people to the CC: list.
> 
> To summarize the problem: The SCSI core tries to unregister a host 
> while its sysfs directory is still non-empty because the target hasn't 
> been unregistered yet.

> Can you provide more of the context?  I'd like to see the log starting 
> from when these devices were first registered.

I was able to trigger it with a USB storage device only, connected to a
hub, and I removed the hub from the host.

Kay


[21415.579166] usb 2-2: new high speed USB device using ehci_hcd and address 26
[21415.693822] kobject: '2-2' (ffff8801390c9128): kobject_add_internal: parent: 'usb2', set: 'devices'
[21415.694151] kobject: '2-2' (ffff8801390c9128): kobject_uevent_env
[21415.694186] kobject: '2-2' (ffff8801390c9128): fill_kobj_path: path = '/devices/pci0000:00/0000:00:1d.7/usb2/2-2'
[21415.694299] usb 2-2: configuration #1 chosen from 1 choice
[21415.695512] kobject: '2-2:1.0' (ffff8800b9716b48): kobject_add_internal: parent: '2-2', set: 'devices'
[21415.695612] kobject: '2-2:1.0' (ffff8800b9716b48): kobject_uevent_env
[21415.695645] kobject: '2-2:1.0' (ffff8800b9716b48): fill_kobj_path: path = '/devices/pci0000:00/0000:00:1d.7/usb2/2-2/2-2:1.0'
[21415.695750] hub 2-2:1.0: USB hub found
[21415.696415] hub 2-2:1.0: 4 ports detected
[21415.701428] kobject: 'usb_endpoint' (ffff8800b9648198): kobject_add_internal: parent: '2-2:1.0', set: '<NULL>'
[21415.701457] kobject: 'usbdev2.26_ep81' (ffff8800b96f56f0): kobject_add_internal: parent: 'usb_endpoint', set: 'devices'
[21415.701656] kobject: 'usbdev2.26_ep81' (ffff8800b96f56f0): kobject_uevent_env
[21415.701694] kobject: 'usbdev2.26_ep81' (ffff8800b96f56f0): fill_kobj_path: path = '/devices/pci0000:00/0000:00:1d.7/usb2/2-2/2-2:1.0/usb_endpoint/usbdev2.26_ep81'
[21415.701826] kobject: 'usb_endpoint' (ffff8800b96486e8): kobject_add_internal: parent: '2-2', set: '<NULL>'
[21415.701847] kobject: 'usbdev2.26_ep00' (ffff8800b96f5dc8): kobject_add_internal: parent: 'usb_endpoint', set: 'devices'
[21415.701972] kobject: 'usbdev2.26_ep00' (ffff8800b96f5dc8): kobject_uevent_env
[21415.702030] kobject: 'usbdev2.26_ep00' (ffff8800b96f5dc8): fill_kobj_path: path = '/devices/pci0000:00/0000:00:1d.7/usb2/2-2/usb_endpoint/usbdev2.26_ep00'
[21415.974325] usb 2-2.4: new high speed USB device using ehci_hcd and address 27
[21416.060815] kobject: '2-2.4' (ffff88013422cb20): kobject_add_internal: parent: '2-2', set: 'devices'
[21416.061184] kobject: '2-2.4' (ffff88013422cb20): kobject_uevent_env
[21416.061218] kobject: '2-2.4' (ffff88013422cb20): fill_kobj_path: path = '/devices/pci0000:00/0000:00:1d.7/usb2/2-2/2-2.4'
[21416.061330] usb 2-2.4: configuration #1 chosen from 1 choice
[21416.063752] kobject: '2-2.4:1.0' (ffff88013a9cc908): kobject_add_internal: parent: '2-2.4', set: 'devices'
[21416.063862] kobject: '2-2.4:1.0' (ffff88013a9cc908): kobject_uevent_env
[21416.063897] kobject: '2-2.4:1.0' (ffff88013a9cc908): fill_kobj_path: path = '/devices/pci0000:00/0000:00:1d.7/usb2/2-2/2-2.4/2-2.4:1.0'
[21416.097456] scsi10 : SCSI emulation for USB Mass Storage devices
[21416.097474] kobject: 'host10' (ffff8801001592f8): kobject_add_internal: parent: '2-2.4:1.0', set: 'devices'
[21416.097520] kobject: 'host10' (ffff8801001592f8): kobject_uevent_env
[21416.097541] kobject: 'host10' (ffff8801001592f8): fill_kobj_path: path = '/devices/pci0000:00/0000:00:1d.7/usb2/2-2/2-2.4/2-2.4:1.0/host10'
[21416.097797] kobject: 'scsi_host' (ffff8801340d3880): kobject_add_internal: parent: 'host10', set: '<NULL>'
[21416.097808] kobject: 'host10' (ffff8801001594c8): kobject_add_internal: parent: 'scsi_host', set: 'devices'
[21416.097888] kobject: 'host10' (ffff8801001594c8): kobject_uevent_env
[21416.097908] kobject: 'host10' (ffff8801001594c8): fill_kobj_path: path = '/devices/pci0000:00/0000:00:1d.7/usb2/2-2/2-2.4/2-2.4:1.0/host10/scsi_host/host10'
[21416.109077] kobject: 'usb_endpoint' (ffff8801340d3f68): kobject_add_internal: parent: '2-2.4:1.0', set: '<NULL>'
[21416.109096] kobject: 'usbdev2.27_ep81' (ffff8800b9668b88): kobject_add_internal: parent: 'usb_endpoint', set: 'devices'
[21416.109238] kobject: 'usbdev2.27_ep81' (ffff8800b9668b88): kobject_uevent_env
[21416.109259] kobject: 'usbdev2.27_ep81' (ffff8800b9668b88): fill_kobj_path: path = '/devices/pci0000:00/0000:00:1d.7/usb2/2-2/2-2.4/2-2.4:1.0/usb_endpoint/usbdev2.27_ep81'
[21416.110897] kobject: 'usbdev2.27_ep02' (ffff8800b9668268): kobject_add_internal: parent: 'usb_endpoint', set: 'devices'
[21416.111007] kobject: 'usbdev2.27_ep02' (ffff8800b9668268): kobject_uevent_env
[21416.111027] kobject: 'usbdev2.27_ep02' (ffff8800b9668268): fill_kobj_path: path = '/devices/pci0000:00/0000:00:1d.7/usb2/2-2/2-2.4/2-2.4:1.0/usb_endpoint/usbdev2.27_ep02'
[21416.112326] kobject: 'usb_endpoint' (ffff880115b54990): kobject_add_internal: parent: '2-2.4', set: '<NULL>'
[21416.112340] kobject: 'usbdev2.27_ep00' (ffff8800b9668dd0): kobject_add_internal: parent: 'usb_endpoint', set: 'devices'
[21416.112439] kobject: 'usbdev2.27_ep00' (ffff8800b9668dd0): kobject_uevent_env
[21416.112459] kobject: 'usbdev2.27_ep00' (ffff8800b9668dd0): fill_kobj_path: path = '/devices/pci0000:00/0000:00:1d.7/usb2/2-2/2-2.4/usb_endpoint/usbdev2.27_ep00'
[21416.112675] usb-storage: device found at 27
[21416.112678] usb-storage: waiting for device to settle before scanning
[21421.113280] scsi 10:0:0:0: Direct-Access     SanDisk  U3 Cruzer Micro  8.02 PQ: 0 ANSI: 0 CCS
[21421.113304] kobject: 'target10:0:0' (ffff88013a9c9158): kobject_add_internal: parent: 'host10', set: 'devices'
[21421.113378] kobject: 'target10:0:0' (ffff88013a9c9158): kobject_uevent_env
[21421.113414] kobject: 'target10:0:0' (ffff88013a9c9158): fill_kobj_path: path = '/devices/pci0000:00/0000:00:1d.7/usb2/2-2/2-2.4/2-2.4:1.0/host10/target10:0:0'
[21421.113836] kobject: '10:0:0:0' (ffff88012bb87588): kobject_add_internal: parent: 'target10:0:0', set: 'devices'
[21421.118125] kobject: '10:0:0:0' (ffff88012bb87588): kobject_uevent_env
[21421.118200] kobject: '10:0:0:0' (ffff88012bb87588): fill_kobj_path: path = '/devices/pci0000:00/0000:00:1d.7/usb2/2-2/2-2.4/2-2.4:1.0/host10/target10:0:0/10:0:0:0'
[21421.120456] kobject: 'scsi_device' (ffff8801078d8220): kobject_add_internal: parent: '10:0:0:0', set: '<NULL>'
[21421.120476] kobject: '10:0:0:0' (ffff88012bb87758): kobject_add_internal: parent: 'scsi_device', set: 'devices'
[21421.120538] kobject: '10:0:0:0' (ffff88012bb87758): kobject_uevent_env
[21421.120575] kobject: '10:0:0:0' (ffff88012bb87758): fill_kobj_path: path = '/devices/pci0000:00/0000:00:1d.7/usb2/2-2/2-2.4/2-2.4:1.0/host10/target10:0:0/10:0:0:0/scsi_device/10:0:0:0'
[21421.121003] kobject: 'scsi_generic' (ffff8801388d55d8): kobject_add_internal: parent: '10:0:0:0', set: '<NULL>'
[21421.121022] kobject: 'sg3' (ffff8800b966a000): kobject_add_internal: parent: 'scsi_generic', set: 'devices'
[21421.121197] kobject: 'sg3' (ffff8800b966a000): kobject_uevent_env
[21421.121233] kobject: 'sg3' (ffff8800b966a000): fill_kobj_path: path = '/devices/pci0000:00/0000:00:1d.7/usb2/2-2/2-2.4/2-2.4:1.0/host10/target10:0:0/10:0:0:0/scsi_generic/sg3'
[21421.127335] sd 10:0:0:0: Attached scsi generic sg3 type 0
[21421.127393] kobject: 'bsg' (ffff8801388d56e8): kobject_add_internal: parent: '10:0:0:0', set: '<NULL>'
[21421.127412] kobject: '10:0:0:0' (ffff8800b966a248): kobject_add_internal: parent: 'bsg', set: 'devices'
[21421.127541] kobject: '10:0:0:0' (ffff8800b966a248): kobject_uevent_env
[21421.127581] kobject: '10:0:0:0' (ffff8800b966a248): fill_kobj_path: path = '/devices/pci0000:00/0000:00:1d.7/usb2/2-2/2-2.4/2-2.4:1.0/host10/target10:0:0/10:0:0:0/bsg/10:0:0:0'
[21421.128699] scsi 10:0:0:1: CD-ROM            SanDisk  U3 Cruzer Micro  8.02 PQ: 0 ANSI: 0
[21421.128720] kobject: '10:0:0:1' (ffff880139003b90): kobject_add_internal: parent: 'target10:0:0', set: 'devices'
[21421.128870] kobject: '10:0:0:1' (ffff880139003b90): kobject_uevent_env
[21421.128904] kobject: '10:0:0:1' (ffff880139003b90): fill_kobj_path: path = '/devices/pci0000:00/0000:00:1d.7/usb2/2-2/2-2.4/2-2.4:1.0/host10/target10:0:0/10:0:0:1'
[21421.132180] sr0: scsi3-mmc drive: 48x/48x tray
[21421.132220] kobject: 'block' (ffff880139009d48): kobject_add_internal: parent: '10:0:0:1', set: '<NULL>'
[21421.132241] kobject: 'sr0' (ffff880138836fa8): kobject_add_internal: parent: 'block', set: 'devices'
[21421.132378] kobject: 'sr0' (ffff880138836fa8): kobject_uevent_env
[21421.132385] kobject: 'sr0' (ffff880138836fa8): kobject_uevent_env: uevent_suppress caused the event to drop!
[21421.132417] kobject: 'holders' (ffff880129f63440): kobject_add_internal: parent: 'sr0', set: '<NULL>'
[21421.132437] kobject: 'slaves' (ffff88013886c6e8): kobject_add_internal: parent: 'sr0', set: '<NULL>'
[21421.132450] kobject: 'sr0' (ffff880138836fa8): kobject_uevent_env
[21421.132485] kobject: 'sr0' (ffff880138836fa8): fill_kobj_path: path = '/devices/pci0000:00/0000:00:1d.7/usb2/2-2/2-2.4/2-2.4:1.0/host10/target10:0:0/10:0:0:1/block/sr0'
[21421.132837] kobject: 'queue' (ffff88013aea0ff0): kobject_add_internal: parent: 'sr0', set: '<NULL>'
[21421.132903] kobject: 'queue' (ffff88013aea0ff0): kobject_uevent_env
[21421.132910] kobject: 'queue' (ffff88013aea0ff0): kobject_uevent_env: filter function caused the event to drop!
[21421.132921] kobject: 'iosched' (ffff8801350a5c40): kobject_add_internal: parent: 'queue', set: '<NULL>'
[21421.136715] kobject: 'iosched' (ffff8801350a5c40): kobject_uevent_env
[21421.136721] kobject: 'iosched' (ffff8801350a5c40): kobject_uevent_env: filter function caused the event to drop!
[21421.136738] kobject: '11:0' (ffff8800b96696e0): kobject_add_internal: parent: 'bdi', set: 'devices'
[21421.136785] kobject: '11:0' (ffff8800b96696e0): kobject_uevent_env
[21421.136804] kobject: '11:0' (ffff8800b96696e0): fill_kobj_path: path = '/devices/virtual/bdi/11:0'
[21421.137067] sr 10:0:0:1: Attached scsi CD-ROM sr0
[21421.137086] kobject: 'scsi_device' (ffff880129c6a550): kobject_add_internal: parent: '10:0:0:1', set: '<NULL>'
[21421.137098] kobject: '10:0:0:1' (ffff880139003d60): kobject_add_internal: parent: 'scsi_device', set: 'devices'
[21421.137148] kobject: '10:0:0:1' (ffff880139003d60): kobject_uevent_env
[21421.137170] kobject: '10:0:0:1' (ffff880139003d60): fill_kobj_path: path = '/devices/pci0000:00/0000:00:1d.7/usb2/2-2/2-2.4/2-2.4:1.0/host10/target10:0:0/10:0:0:1/scsi_device/10:0:0:1'
[21421.137369] kobject: 'scsi_generic' (ffff880129c6a6e8): kobject_add_internal: parent: '10:0:0:1', set: '<NULL>'
[21421.137380] kobject: 'sg4' (ffff8800b9669928): kobject_add_internal: parent: 'scsi_generic', set: 'devices'
[21421.137463] kobject: 'sg4' (ffff8800b9669928): kobject_uevent_env
[21421.137483] kobject: 'sg4' (ffff8800b9669928): fill_kobj_path: path = '/devices/pci0000:00/0000:00:1d.7/usb2/2-2/2-2.4/2-2.4:1.0/host10/target10:0:0/10:0:0:1/scsi_generic/sg4'
[21421.137660] sr 10:0:0:1: Attached scsi generic sg4 type 5
[21421.137683] kobject: 'bsg' (ffff880129c6add0): kobject_add_internal: parent: '10:0:0:1', set: '<NULL>'
[21421.137692] kobject: '10:0:0:1' (ffff8800b96686e8): kobject_add_internal: parent: 'bsg', set: 'devices'
[21421.137746] kobject: '10:0:0:1' (ffff8800b96686e8): kobject_uevent_env
[21421.137766] kobject: '10:0:0:1' (ffff8800b96686e8): fill_kobj_path: path = '/devices/pci0000:00/0000:00:1d.7/usb2/2-2/2-2.4/2-2.4:1.0/host10/target10:0:0/10:0:0:1/bsg/10:0:0:1'
[21421.139671] kobject: '10:0:0:2' (ffff8801158df588): kobject_cleanup
[21421.139676] kobject: '10:0:0:2' (ffff8801158df588): calling ktype release
[21421.139696] kobject: '<NULL>' (ffff88013a9bbc40): kobject_cleanup
[21421.139700] kobject: '<NULL>' (ffff88013a9bbc40): calling ktype release
[21421.139708] kobject: '<NULL>' (ffff88013aea2730): kobject_cleanup
[21421.139712] kobject: '<NULL>' (ffff88013aea2730): calling ktype release
[21421.139737] kobject: '10:0:0:2': free name
[21421.139858] kobject: '10:0:1:0' (ffff8801158df588): kobject_cleanup
[21421.139862] kobject: '10:0:1:0' (ffff8801158df588): calling ktype release
[21421.139876] kobject: '<NULL>' (ffff880139385af8): kobject_cleanup
[21421.139880] kobject: '<NULL>' (ffff880139385af8): calling ktype release
[21421.139887] kobject: '<NULL>' (ffff88013aea2730): kobject_cleanup
[21421.139890] kobject: '<NULL>' (ffff88013aea2730): calling ktype release
[21421.139912] kobject: '10:0:1:0': free name
[21421.139918] kobject: 'target10:0:1' (ffff88013a9c99e8): kobject_cleanup
[21421.139922] kobject: 'target10:0:1' (ffff88013a9c99e8): calling ktype release
[21421.139928] kobject: 'target10:0:1': free name
[21421.140064] kobject: '10:0:2:0' (ffff8801158df588): kobject_cleanup
[21421.140068] kobject: '10:0:2:0' (ffff8801158df588): calling ktype release
[21421.140082] kobject: '<NULL>' (ffff880139385af8): kobject_cleanup
[21421.140086] kobject: '<NULL>' (ffff880139385af8): calling ktype release
[21421.140092] kobject: '<NULL>' (ffff88013aea2730): kobject_cleanup
[21421.140096] kobject: '<NULL>' (ffff88013aea2730): calling ktype release
[21421.140118] kobject: '10:0:2:0': free name
[21421.140124] kobject: 'target10:0:2' (ffff880124839158): kobject_cleanup
[21421.140142] kobject: 'target10:0:2' (ffff880124839158): calling ktype release
[21421.140149] kobject: 'target10:0:2': free name
[21421.140267] kobject: '10:0:3:0' (ffff8801158df588): kobject_cleanup
[21421.140271] kobject: '10:0:3:0' (ffff8801158df588): calling ktype release
[21421.140285] kobject: '<NULL>' (ffff880139385af8): kobject_cleanup
[21421.140289] kobject: '<NULL>' (ffff880139385af8): calling ktype release
[21421.140296] kobject: '<NULL>' (ffff88013aea2730): kobject_cleanup
[21421.140299] kobject: '<NULL>' (ffff88013aea2730): calling ktype release
[21421.140321] kobject: '10:0:3:0': free name
[21421.140329] kobject: 'target10:0:3' (ffff880121091158): kobject_cleanup
[21421.140332] kobject: 'target10:0:3' (ffff880121091158): calling ktype release
[21421.140338] kobject: 'target10:0:3': free name
[21421.140459] kobject: '10:0:4:0' (ffff8801158df588): kobject_cleanup
[21421.140463] kobject: '10:0:4:0' (ffff8801158df588): calling ktype release
[21421.140477] kobject: '<NULL>' (ffff880139385af8): kobject_cleanup
[21421.140481] kobject: '<NULL>' (ffff880139385af8): calling ktype release
[21421.140488] kobject: '<NULL>' (ffff88013aea2730): kobject_cleanup
[21421.140491] kobject: '<NULL>' (ffff88013aea2730): calling ktype release
[21421.140513] kobject: '10:0:4:0': free name
[21421.140520] kobject: 'target10:0:4' (ffff880121091158): kobject_cleanup
[21421.140524] kobject: 'target10:0:4' (ffff880121091158): calling ktype release
[21421.140529] kobject: 'target10:0:4': free name
[21421.140652] kobject: '10:0:5:0' (ffff8801158df588): kobject_cleanup
[21421.140655] kobject: '10:0:5:0' (ffff8801158df588): calling ktype release
[21421.140669] kobject: '<NULL>' (ffff880139385af8): kobject_cleanup
[21421.140673] kobject: '<NULL>' (ffff880139385af8): calling ktype release
[21421.140680] kobject: '<NULL>' (ffff88013aea2730): kobject_cleanup
[21421.140684] kobject: '<NULL>' (ffff88013aea2730): calling ktype release
[21421.140706] kobject: '10:0:5:0': free name
[21421.140713] kobject: 'target10:0:5' (ffff880121091158): kobject_cleanup
[21421.140716] kobject: 'target10:0:5' (ffff880121091158): calling ktype release
[21421.140722] kobject: 'target10:0:5': free name
[21421.140839] kobject: '10:0:6:0' (ffff8801158df588): kobject_cleanup
[21421.140843] kobject: '10:0:6:0' (ffff8801158df588): calling ktype release
[21421.140857] kobject: '<NULL>' (ffff880139385af8): kobject_cleanup
[21421.140861] kobject: '<NULL>' (ffff880139385af8): calling ktype release
[21421.140868] kobject: '<NULL>' (ffff88013aea2730): kobject_cleanup
[21421.140871] kobject: '<NULL>' (ffff88013aea2730): calling ktype release
[21421.140893] kobject: '10:0:6:0': free name
[21421.140900] kobject: 'target10:0:6' (ffff880121091158): kobject_cleanup
[21421.140904] kobject: 'target10:0:6' (ffff880121091158): calling ktype release
[21421.140909] kobject: 'target10:0:6': free name
[21421.141026] kobject: '10:0:7:0' (ffff8801158df588): kobject_cleanup
[21421.141030] kobject: '10:0:7:0' (ffff8801158df588): calling ktype release
[21421.141044] kobject: '<NULL>' (ffff880139385af8): kobject_cleanup
[21421.141048] kobject: '<NULL>' (ffff880139385af8): calling ktype release
[21421.141054] kobject: '<NULL>' (ffff88013aea2730): kobject_cleanup
[21421.141058] kobject: '<NULL>' (ffff88013aea2730): calling ktype release
[21421.141080] kobject: '10:0:7:0': free name
[21421.141086] kobject: 'target10:0:7' (ffff880121091158): kobject_cleanup
[21421.141090] kobject: 'target10:0:7' (ffff880121091158): calling ktype release
[21421.141095] kobject: 'target10:0:7': free name
[21421.141101] usb-storage: device scan complete
[21421.170592] kobject: 'scsi_disk' (ffff880134173770): kobject_add_internal: parent: '10:0:0:0', set: '<NULL>'
[21421.170611] kobject: '10:0:0:0' (ffff8800b9669dc8): kobject_add_internal: parent: 'scsi_disk', set: 'devices'
[21421.170684] kobject: '10:0:0:0' (ffff8800b9669dc8): kobject_uevent_env
[21421.170708] kobject: '10:0:0:0' (ffff8800b9669dc8): fill_kobj_path: path = '/devices/pci0000:00/0000:00:1d.7/usb2/2-2/2-2.4/2-2.4:1.0/host10/target10:0:0/10:0:0:0/scsi_disk/10:0:0:0'
[21421.171628] sd 10:0:0:0: [sdd] 31777279 512-byte hardware sectors: (16.2 GB/15.1 GiB)
[21421.172223] sd 10:0:0:0: [sdd] Write Protect is off
[21421.172228] sd 10:0:0:0: [sdd] Mode Sense: 45 00 00 08
[21421.172231] sd 10:0:0:0: [sdd] Assuming drive cache: write through
[21421.172265] kobject: 'block' (ffff8800b14ae000): kobject_add_internal: parent: '10:0:0:0', set: '<NULL>'
[21421.172285] kobject: 'sdd' (ffff88013af24920): kobject_add_internal: parent: 'block', set: 'devices'
[21421.188631] kobject: 'sdd' (ffff88013af24920): kobject_uevent_env
[21421.188637] kobject: 'sdd' (ffff88013af24920): kobject_uevent_env: uevent_suppress caused the event to drop!
[21421.188658] kobject: 'holders' (ffff8800b9648880): kobject_add_internal: parent: 'sdd', set: '<NULL>'
[21421.188669] kobject: 'slaves' (ffff8800b9648908): kobject_add_internal: parent: 'sdd', set: '<NULL>'
[21421.189054] kobject: '10:0:0:0' (ffff88012bb87588): kobject_uevent_env
[21421.189077] kobject: '10:0:0:0' (ffff88012bb87588): fill_kobj_path: path = '/devices/pci0000:00/0000:00:1d.7/usb2/2-2/2-2.4/2-2.4:1.0/host10/target10:0:0/10:0:0:0'
[21421.190637] sd 10:0:0:0: [sdd] Assuming drive cache: write through
[21421.190671]  sdd: sdd1
[21421.191270] kobject: 'sdd' (ffff88013af24920): kobject_uevent_env
[21421.191275] kobject: 'sdd' (ffff88013af24920): kobject_uevent_env: uevent_suppress caused the event to drop!
[21421.191282] sdd: p1 size 31793076 limited to end of disk
[21421.191305] kobject: 'sdd1' (ffff8800b9711588): kobject_add_internal: parent: 'sdd', set: 'devices'
[21421.191397] kobject: 'sdd1' (ffff8800b9711588): kobject_uevent_env
[21421.191401] kobject: 'sdd1' (ffff8800b9711588): kobject_uevent_env: uevent_suppress caused the event to drop!
[21421.191412] kobject: 'holders' (ffff8800b9648a18): kobject_add_internal: parent: 'sdd1', set: '<NULL>'
[21421.191420] kobject: 'sdd1' (ffff8800b9711588): kobject_uevent_env
[21421.191441] kobject: 'sdd1' (ffff8800b9711588): fill_kobj_path: path = '/devices/pci0000:00/0000:00:1d.7/usb2/2-2/2-2.4/2-2.4:1.0/host10/target10:0:0/10:0:0:0/block/sdd/sdd1'
[21421.192001] kobject: 'sdd' (ffff88013af24920): kobject_uevent_env
[21421.192034] kobject: 'sdd' (ffff88013af24920): fill_kobj_path: path = '/devices/pci0000:00/0000:00:1d.7/usb2/2-2/2-2.4/2-2.4:1.0/host10/target10:0:0/10:0:0:0/block/sdd'
[21421.192209] kobject: 'sdd1' (ffff8800b9711588): kobject_uevent_env
[21421.192230] kobject: 'sdd1' (ffff8800b9711588): fill_kobj_path: path = '/devices/pci0000:00/0000:00:1d.7/usb2/2-2/2-2.4/2-2.4:1.0/host10/target10:0:0/10:0:0:0/block/sdd/sdd1'
[21421.192399] kobject: 'queue' (ffff88013aea1b90): kobject_add_internal: parent: 'sdd', set: '<NULL>'
[21421.192445] kobject: 'queue' (ffff88013aea1b90): kobject_uevent_env
[21421.192450] kobject: 'queue' (ffff88013aea1b90): kobject_uevent_env: filter function caused the event to drop!
[21421.192456] kobject: 'iosched' (ffff8801392a6158): kobject_add_internal: parent: 'queue', set: '<NULL>'
[21421.192488] kobject: 'iosched' (ffff8801392a6158): kobject_uevent_env
[21421.192492] kobject: 'iosched' (ffff8801392a6158): kobject_uevent_env: filter function caused the event to drop!
[21421.192505] kobject: '259:917504' (ffff8800b96f5498): kobject_add_internal: parent: 'bdi', set: 'devices'
[21421.192557] kobject: '259:917504' (ffff8800b96f5498): kobject_uevent_env
[21421.192575] kobject: '259:917504' (ffff8800b96f5498): fill_kobj_path: path = '/devices/virtual/bdi/259:917504'
[21421.192762] sd 10:0:0:0: [sdd] Attached SCSI removable disk
[21421.258331] kobject: 'sdd' (ffff88013af24920): kobject_uevent_env
[21421.258358] kobject: 'sdd' (ffff88013af24920): fill_kobj_path: path = '/devices/pci0000:00/0000:00:1d.7/usb2/2-2/2-2.4/2-2.4:1.0/host10/target10:0:0/10:0:0:0/block/sdd'
[21421.276118] kobject: '65534' (ffff88013ab522e8): kobject_add_internal: parent: 'uids', set: 'uids'
[21421.281107] kobject: '65534' (ffff88013ab522e8): kobject_uevent_env
[21421.281162] kobject: '65534' (ffff88013ab522e8): fill_kobj_path: path = '/kernel/uids/65534'
[21421.333642] kobject: '10:0:0:1' (ffff880139003b90): kobject_uevent_env
[21421.333667] kobject: '10:0:0:1' (ffff880139003b90): fill_kobj_path: path = '/devices/pci0000:00/0000:00:1d.7/usb2/2-2/2-2.4/2-2.4:1.0/host10/target10:0:0/10:0:0:1'
[21421.335399] kobject: '65534' (ffff88013ab522e8): kobject_uevent_env
[21421.335420] kobject: '65534' (ffff88013ab522e8): fill_kobj_path: path = '/kernel/uids/65534'
[21421.335554] kobject: '65534' (ffff88013ab522e8): kobject_cleanup
[21421.335558] kobject: '65534' (ffff88013ab522e8): calling ktype release
[21421.335562] kobject: '65534': free name
[21421.340587] kobject: '65534' (ffff88013ab522e8): kobject_add_internal: parent: 'uids', set: 'uids'
[21421.340610] kobject: '65534' (ffff88013ab522e8): kobject_uevent_env
[21421.340631] kobject: '65534' (ffff88013ab522e8): fill_kobj_path: path = '/kernel/uids/65534'
[21421.361164] kobject: '65534' (ffff88013ab522e8): kobject_uevent_env
[21421.361185] kobject: '65534' (ffff88013ab522e8): fill_kobj_path: path = '/kernel/uids/65534'
[21421.361344] kobject: '65534' (ffff88013ab522e8): kobject_cleanup
[21421.361348] kobject: '65534' (ffff88013ab522e8): calling ktype release
[21421.361351] kobject: '65534': free name
[21421.425012] kobject: '65534' (ffff88013ab522e8): kobject_add_internal: parent: 'uids', set: 'uids'
[21421.425033] kobject: '65534' (ffff88013ab522e8): kobject_uevent_env
[21421.425051] kobject: '65534' (ffff88013ab522e8): fill_kobj_path: path = '/kernel/uids/65534'
[21421.447972] kobject: 'sr0' (ffff880138836fa8): kobject_uevent_env
[21421.448000] kobject: 'sr0' (ffff880138836fa8): fill_kobj_path: path = '/devices/pci0000:00/0000:00:1d.7/usb2/2-2/2-2.4/2-2.4:1.0/host10/target10:0:0/10:0:0:1/block/sr0'
[21421.462036] kobject: '65534' (ffff88013ab522e8): kobject_uevent_env
[21421.462058] kobject: '65534' (ffff88013ab522e8): fill_kobj_path: path = '/kernel/uids/65534'
[21421.462129] kobject: '65534' (ffff88013ab522e8): kobject_cleanup
[21421.462133] kobject: '65534' (ffff88013ab522e8): calling ktype release
[21421.462136] kobject: '65534': free name
[21421.508994] kobject: '65534' (ffff88013ab522e8): kobject_add_internal: parent: 'uids', set: 'uids'
[21421.509015] kobject: '65534' (ffff88013ab522e8): kobject_uevent_env
[21421.509034] kobject: '65534' (ffff88013ab522e8): fill_kobj_path: path = '/kernel/uids/65534'
[21421.538933] kobject: '65534' (ffff88013ab522e8): kobject_uevent_env
[21421.538955] kobject: '65534' (ffff88013ab522e8): fill_kobj_path: path = '/kernel/uids/65534'
[21421.539110] kobject: '65534' (ffff88013ab522e8): kobject_cleanup
[21421.539114] kobject: '65534' (ffff88013ab522e8): calling ktype release
[21421.539118] kobject: '65534': free name
[21421.548145] kobject: '65534' (ffff88013ab522e8): kobject_add_internal: parent: 'uids', set: 'uids'
[21421.548165] kobject: '65534' (ffff88013ab522e8): kobject_uevent_env
[21421.548184] kobject: '65534' (ffff88013ab522e8): fill_kobj_path: path = '/kernel/uids/65534'
[21421.588200] kobject: '65534' (ffff88013ab522e8): kobject_uevent_env
[21421.588221] kobject: '65534' (ffff88013ab522e8): fill_kobj_path: path = '/kernel/uids/65534'
[21421.588365] kobject: '65534' (ffff88013ab522e8): kobject_cleanup
[21421.588370] kobject: '65534' (ffff88013ab522e8): calling ktype release
[21421.588373] kobject: '65534': free name
[21421.722647] kobject: '65534' (ffff8801390f02e8): kobject_add_internal: parent: 'uids', set: 'uids'
[21421.722668] kobject: '65534' (ffff8801390f02e8): kobject_uevent_env
[21421.722687] kobject: '65534' (ffff8801390f02e8): fill_kobj_path: path = '/kernel/uids/65534'
[21421.756036] kobject: '65534' (ffff8801390f02e8): kobject_uevent_env
[21421.756057] kobject: '65534' (ffff8801390f02e8): fill_kobj_path: path = '/kernel/uids/65534'
[21421.756195] kobject: '65534' (ffff8801390f02e8): kobject_cleanup
[21421.756199] kobject: '65534' (ffff8801390f02e8): calling ktype release
[21421.756202] kobject: '65534': free name
[21423.065910] usb 2-2: USB disconnect, address 26
[21423.065915] usb 2-2.4: USB disconnect, address 27
[21423.066144] kobject: 'usbdev2.27_ep81' (ffff8800b9668b88): kobject_uevent_env
[21423.066168] kobject: 'usbdev2.27_ep81' (ffff8800b9668b88): fill_kobj_path: path = '/devices/pci0000:00/0000:00:1d.7/usb2/2-2/2-2.4/2-2.4:1.0/usb_endpoint/usbdev2.27_ep81'
[21423.066227] kobject: 'usbdev2.27_ep81' (ffff8800b9668b88): kobject_cleanup
[21423.066231] kobject: 'usbdev2.27_ep81' (ffff8800b9668b88): calling ktype release
[21423.066245] kobject: 'usbdev2.27_ep81': free name
[21423.066363] kobject: 'usbdev2.27_ep02' (ffff8800b9668268): kobject_uevent_env
[21423.066383] kobject: 'usbdev2.27_ep02' (ffff8800b9668268): fill_kobj_path: path = '/devices/pci0000:00/0000:00:1d.7/usb2/2-2/2-2.4/2-2.4:1.0/usb_endpoint/usbdev2.27_ep02'
[21423.066430] kobject: 'usb_endpoint' (ffff8801340d3f68): kobject_cleanup
[21423.066434] kobject: 'usb_endpoint' (ffff8801340d3f68): auto cleanup kobject_del
[21423.066453] kobject: 'usb_endpoint' (ffff8801340d3f68): calling ktype release
[21423.066457] kobject: (ffff8801340d3f68): dynamic_kobj_release
[21423.066462] kobject: 'usb_endpoint': free name
[21423.066466] kobject: 'usbdev2.27_ep02' (ffff8800b9668268): kobject_cleanup
[21423.066470] kobject: 'usbdev2.27_ep02' (ffff8800b9668268): calling ktype release
[21423.066476] kobject: 'usbdev2.27_ep02': free name
[21423.066708] kobject: '10:0:0:0' (ffff8800b966a248): kobject_uevent_env
[21423.066727] kobject: '10:0:0:0' (ffff8800b966a248): fill_kobj_path: path = '/devices/pci0000:00/0000:00:1d.7/usb2/2-2/2-2.4/2-2.4:1.0/host10/target10:0:0/10:0:0:0/bsg/10:0:0:0'
[21423.066773] kobject: 'bsg' (ffff8801388d56e8): kobject_cleanup
[21423.066776] kobject: 'bsg' (ffff8801388d56e8): auto cleanup kobject_del
[21423.066795] kobject: 'bsg' (ffff8801388d56e8): calling ktype release
[21423.066798] kobject: (ffff8801388d56e8): dynamic_kobj_release
[21423.066802] kobject: 'bsg': free name
[21423.066806] kobject: '10:0:0:0' (ffff8800b966a248): kobject_cleanup
[21423.066808] kobject: '10:0:0:0' (ffff8800b966a248): calling ktype release
[21423.066815] kobject: '10:0:0:0': free name
[21423.066928] kobject: 'sg3' (ffff8800b966a000): kobject_uevent_env
[21423.066945] kobject: 'sg3' (ffff8800b966a000): fill_kobj_path: path = '/devices/pci0000:00/0000:00:1d.7/usb2/2-2/2-2.4/2-2.4:1.0/host10/target10:0:0/10:0:0:0/scsi_generic/sg3'
[21423.066988] kobject: 'scsi_generic' (ffff8801388d55d8): kobject_cleanup
[21423.066991] kobject: 'scsi_generic' (ffff8801388d55d8): auto cleanup kobject_del
[21423.067020] kobject: 'scsi_generic' (ffff8801388d55d8): calling ktype release
[21423.067023] kobject: (ffff8801388d55d8): dynamic_kobj_release
[21423.067026] kobject: 'scsi_generic': free name
[21423.067030] kobject: 'sg3' (ffff8800b966a000): kobject_cleanup
[21423.067033] kobject: 'sg3' (ffff8800b966a000): calling ktype release
[21423.067038] kobject: 'sg3': free name
[21423.067045] kobject: '<NULL>' (ffff8800b154c578): kobject_cleanup
[21423.067048] kobject: '<NULL>' (ffff8800b154c578): calling ktype release
[21423.067056] kobject: '<NULL>' (ffff88013af21178): kobject_cleanup
[21423.067059] kobject: '<NULL>' (ffff88013af21178): calling ktype release
[21423.067077] kobject: '10:0:0:0' (ffff88012bb87758): kobject_uevent_env
[21423.067094] kobject: '10:0:0:0' (ffff88012bb87758): fill_kobj_path: path = '/devices/pci0000:00/0000:00:1d.7/usb2/2-2/2-2.4/2-2.4:1.0/host10/target10:0:0/10:0:0:0/scsi_device/10:0:0:0'
[21423.067144] kobject: 'scsi_device' (ffff8801078d8220): kobject_cleanup
[21423.067148] kobject: 'scsi_device' (ffff8801078d8220): auto cleanup kobject_del
[21423.067165] kobject: 'scsi_device' (ffff8801078d8220): calling ktype release
[21423.067169] kobject: (ffff8801078d8220): dynamic_kobj_release
[21423.067174] kobject: 'scsi_device': free name
[21423.067178] kobject: '10:0:0:0' (ffff88012bb87758): kobject_cleanup
[21423.067181] kobject: '10:0:0:0' (ffff88012bb87758): calling ktype release
[21423.067186] kobject: '10:0:0:0': free name
[21423.067469] kobject: '10:0:0:0' (ffff8800b9669dc8): kobject_uevent_env
[21423.067490] kobject: '10:0:0:0' (ffff8800b9669dc8): fill_kobj_path: path = '/devices/pci0000:00/0000:00:1d.7/usb2/2-2/2-2.4/2-2.4:1.0/host10/target10:0:0/10:0:0:0/scsi_disk/10:0:0:0'
[21423.067543] kobject: 'scsi_disk' (ffff880134173770): kobject_cleanup
[21423.067546] kobject: 'scsi_disk' (ffff880134173770): auto cleanup kobject_del
[21423.067564] kobject: 'scsi_disk' (ffff880134173770): calling ktype release
[21423.067568] kobject: (ffff880134173770): dynamic_kobj_release
[21423.067573] kobject: 'scsi_disk': free name
[21423.106153] kobject: 'holders' (ffff8800b9648a18): kobject_cleanup
[21423.106159] kobject: 'holders' (ffff8800b9648a18): auto cleanup kobject_del
[21423.106178] kobject: 'holders' (ffff8800b9648a18): calling ktype release
[21423.106182] kobject: (ffff8800b9648a18): dynamic_kobj_release
[21423.106187] kobject: 'holders': free name
[21423.106327] kobject: 'sdd1' (ffff8800b9711588): kobject_uevent_env
[21423.106352] kobject: 'sdd1' (ffff8800b9711588): fill_kobj_path: path = '/devices/pci0000:00/0000:00:1d.7/usb2/2-2/2-2.4/2-2.4:1.0/host10/target10:0:0/10:0:0:0/block/sdd/sdd1'
[21423.106507] kobject: '259:917504' (ffff8800b96f5498): kobject_uevent_env
[21423.106526] kobject: '259:917504' (ffff8800b96f5498): fill_kobj_path: path = '/devices/virtual/bdi/259:917504'
[21423.106572] kobject: '259:917504' (ffff8800b96f5498): kobject_cleanup
[21423.106575] kobject: '259:917504' (ffff8800b96f5498): calling ktype release
[21423.106584] kobject: '259:917504': free name
[21423.106590] kobject: 'iosched' (ffff8801392a6158): kobject_uevent_env
[21423.106595] kobject: 'iosched' (ffff8801392a6158): kobject_uevent_env: filter function caused the event to drop!
[21423.106628] kobject: 'queue' (ffff88013aea1b90): kobject_uevent_env
[21423.106632] kobject: 'queue' (ffff88013aea1b90): kobject_uevent_env: filter function caused the event to drop!
[21423.106680] kobject: 'holders' (ffff8800b9648880): kobject_cleanup
[21423.106684] kobject: 'holders' (ffff8800b9648880): auto cleanup kobject_del
[21423.106694] kobject: 'holders' (ffff8800b9648880): calling ktype release
[21423.106698] kobject: (ffff8800b9648880): dynamic_kobj_release
[21423.106702] kobject: 'holders': free name
[21423.106706] kobject: 'slaves' (ffff8800b9648908): kobject_cleanup
[21423.106709] kobject: 'slaves' (ffff8800b9648908): auto cleanup kobject_del
[21423.106720] kobject: 'slaves' (ffff8800b9648908): calling ktype release
[21423.106724] kobject: (ffff8800b9648908): dynamic_kobj_release
[21423.106728] kobject: 'slaves': free name
[21423.106897] kobject: 'sdd' (ffff88013af24920): kobject_uevent_env
[21423.106918] kobject: 'sdd' (ffff88013af24920): fill_kobj_path: path = '/devices/pci0000:00/0000:00:1d.7/usb2/2-2/2-2.4/2-2.4:1.0/host10/target10:0:0/10:0:0:0/block/sdd'
[21423.106970] kobject: 'block' (ffff8800b14ae000): kobject_cleanup
[21423.106974] kobject: 'block' (ffff8800b14ae000): auto cleanup kobject_del
[21423.106995] kobject: 'block' (ffff8800b14ae000): calling ktype release
[21423.106999] kobject: (ffff8800b14ae000): dynamic_kobj_release
[21423.107023] kobject: 'block': free name
[21423.111501] kobject: '10:0:0:0' (ffff88012bb87588): kobject_uevent_env
[21423.111527] kobject: '10:0:0:0' (ffff88012bb87588): fill_kobj_path: path = '/devices/pci0000:00/0000:00:1d.7/usb2/2-2/2-2.4/2-2.4:1.0/host10/target10:0:0/10:0:0:0'
[21423.111721] kobject: '10:0:0:1' (ffff8800b96686e8): kobject_uevent_env
[21423.111743] kobject: '10:0:0:1' (ffff8800b96686e8): fill_kobj_path: path = '/devices/pci0000:00/0000:00:1d.7/usb2/2-2/2-2.4/2-2.4:1.0/host10/target10:0:0/10:0:0:1/bsg/10:0:0:1'
[21423.111796] kobject: 'bsg' (ffff880129c6add0): kobject_cleanup
[21423.111800] kobject: 'bsg' (ffff880129c6add0): auto cleanup kobject_del
[21423.111821] kobject: 'bsg' (ffff880129c6add0): calling ktype release
[21423.111825] kobject: (ffff880129c6add0): dynamic_kobj_release
[21423.111831] kobject: 'bsg': free name
[21423.111835] kobject: '10:0:0:1' (ffff8800b96686e8): kobject_cleanup
[21423.111838] kobject: '10:0:0:1' (ffff8800b96686e8): calling ktype release
[21423.111847] kobject: '10:0:0:1': free name
[21423.112456] kobject: 'sg4' (ffff8800b9669928): kobject_uevent_env
[21423.112477] kobject: 'sg4' (ffff8800b9669928): fill_kobj_path: path = '/devices/pci0000:00/0000:00:1d.7/usb2/2-2/2-2.4/2-2.4:1.0/host10/target10:0:0/10:0:0:1/scsi_generic/sg4'
[21423.112530] kobject: 'scsi_generic' (ffff880129c6a6e8): kobject_cleanup
[21423.112534] kobject: 'scsi_generic' (ffff880129c6a6e8): auto cleanup kobject_del
[21423.112553] kobject: 'scsi_generic' (ffff880129c6a6e8): calling ktype release
[21423.112556] kobject: (ffff880129c6a6e8): dynamic_kobj_release
[21423.112560] kobject: 'scsi_generic': free name
[21423.112565] kobject: 'sg4' (ffff8800b9669928): kobject_cleanup
[21423.112568] kobject: 'sg4' (ffff8800b9669928): calling ktype release
[21423.112574] kobject: 'sg4': free name
[21423.112581] kobject: '<NULL>' (ffff8800b154ced8): kobject_cleanup
[21423.112585] kobject: '<NULL>' (ffff8800b154ced8): calling ktype release
[21423.112594] kobject: '<NULL>' (ffff88013a9f2298): kobject_cleanup
[21423.112597] kobject: '<NULL>' (ffff88013a9f2298): calling ktype release
[21423.112618] kobject: '10:0:0:1' (ffff880139003d60): kobject_uevent_env
[21423.112638] kobject: '10:0:0:1' (ffff880139003d60): fill_kobj_path: path = '/devices/pci0000:00/0000:00:1d.7/usb2/2-2/2-2.4/2-2.4:1.0/host10/target10:0:0/10:0:0:1/scsi_device/10:0:0:1'
[21423.112687] kobject: 'scsi_device' (ffff880129c6a550): kobject_cleanup
[21423.112690] kobject: 'scsi_device' (ffff880129c6a550): auto cleanup kobject_del
[21423.112709] kobject: 'scsi_device' (ffff880129c6a550): calling ktype release
[21423.112713] kobject: (ffff880129c6a550): dynamic_kobj_release
[21423.112717] kobject: 'scsi_device': free name
[21423.112722] kobject: '10:0:0:1' (ffff880139003d60): kobject_cleanup
[21423.112725] kobject: '10:0:0:1' (ffff880139003d60): calling ktype release
[21423.112730] kobject: '10:0:0:1': free name
[21423.113117] kobject: '11:0' (ffff8800b96696e0): kobject_uevent_env
[21423.113137] kobject: '11:0' (ffff8800b96696e0): fill_kobj_path: path = '/devices/virtual/bdi/11:0'
[21423.113187] kobject: '11:0' (ffff8800b96696e0): kobject_cleanup
[21423.113191] kobject: '11:0' (ffff8800b96696e0): calling ktype release
[21423.113197] kobject: '11:0': free name
[21423.113202] kobject: 'iosched' (ffff8801350a5c40): kobject_uevent_env
[21423.113206] kobject: 'iosched' (ffff8801350a5c40): kobject_uevent_env: filter function caused the event to drop!
[21423.113248] kobject: 'queue' (ffff88013aea0ff0): kobject_uevent_env
[21423.113252] kobject: 'queue' (ffff88013aea0ff0): kobject_uevent_env: filter function caused the event to drop!
[21423.113289] kobject: 'holders' (ffff880129f63440): kobject_cleanup
[21423.113292] kobject: 'holders' (ffff880129f63440): auto cleanup kobject_del
[21423.113303] kobject: 'holders' (ffff880129f63440): calling ktype release
[21423.113306] kobject: (ffff880129f63440): dynamic_kobj_release
[21423.113311] kobject: 'holders': free name
[21423.113315] kobject: 'slaves' (ffff88013886c6e8): kobject_cleanup
[21423.113319] kobject: 'slaves' (ffff88013886c6e8): auto cleanup kobject_del
[21423.113329] kobject: 'slaves' (ffff88013886c6e8): calling ktype release
[21423.113332] kobject: (ffff88013886c6e8): dynamic_kobj_release
[21423.113337] kobject: 'slaves': free name
[21423.113500] kobject: 'sr0' (ffff880138836fa8): kobject_uevent_env
[21423.113520] kobject: 'sr0' (ffff880138836fa8): fill_kobj_path: path = '/devices/pci0000:00/0000:00:1d.7/usb2/2-2/2-2.4/2-2.4:1.0/host10/target10:0:0/10:0:0:1/block/sr0'
[21423.113568] kobject: 'block' (ffff880139009d48): kobject_cleanup
[21423.113571] kobject: 'block' (ffff880139009d48): auto cleanup kobject_del
[21423.113606] kobject: 'block' (ffff880139009d48): calling ktype release
[21423.113610] kobject: (ffff880139009d48): dynamic_kobj_release
[21423.113615] kobject: 'block': free name
[21423.113755] kobject: 'sr0' (ffff880138836fa8): kobject_cleanup
[21423.113759] kobject: 'sr0' (ffff880138836fa8): calling ktype release
[21423.113770] kobject: 'sr0': free name
[21423.113780] kobject: '10:0:0:1' (ffff880139003b90): kobject_uevent_env
[21423.113800] kobject: '10:0:0:1' (ffff880139003b90): fill_kobj_path: path = '/devices/pci0000:00/0000:00:1d.7/usb2/2-2/2-2.4/2-2.4:1.0/host10/target10:0:0/10:0:0:1'
[21423.113878] kobject: '10:0:0:1' (ffff880139003b90): kobject_cleanup
[21423.113882] kobject: '10:0:0:1' (ffff880139003b90): calling ktype release
[21423.113905] kobject: 'iosched' (ffff8801350a5c40): kobject_cleanup
[21423.113908] kobject: 'iosched' (ffff8801350a5c40): calling ktype release
[21423.113916] kobject: 'iosched': free name
[21423.113921] kobject: 'queue' (ffff88013aea0ff0): kobject_cleanup
[21423.113925] kobject: 'queue' (ffff88013aea0ff0): calling ktype release
[21423.113949] kobject: 'queue': free name
[21423.113961] kobject: '10:0:0:1': free name
[21423.114098] kobject: 'host10' (ffff8801001594c8): kobject_uevent_env
[21423.114118] kobject: 'host10' (ffff8801001594c8): fill_kobj_path: path = '/devices/pci0000:00/0000:00:1d.7/usb2/2-2/2-2.4/2-2.4:1.0/host10/scsi_host/host10'
[21423.114164] kobject: 'scsi_host' (ffff8801340d3880): kobject_cleanup
[21423.114168] kobject: 'scsi_host' (ffff8801340d3880): auto cleanup kobject_del
[21423.114186] kobject: 'scsi_host' (ffff8801340d3880): calling ktype release
[21423.114190] kobject: (ffff8801340d3880): dynamic_kobj_release
[21423.114194] kobject: 'scsi_host': free name
[21423.114198] kobject: 'host10' (ffff8801001594c8): kobject_cleanup
[21423.114201] kobject: 'host10' (ffff8801001594c8): calling ktype release
[21423.114207] kobject: 'host10': free name
[21423.114258] kobject: 'host10' (ffff8801001592f8): kobject_uevent_env
[21423.114277] kobject: 'host10' (ffff8801001592f8): fill_kobj_path: path = '/devices/pci0000:00/0000:00:1d.7/usb2/2-2/2-2.4/2-2.4:1.0/host10'
[21423.114310] ------------[ cut here ]------------
[21423.114319] WARNING: at fs/sysfs/dir.c:794 sysfs_remove_dir+0xb2/0xd0()
[21423.114322] Hardware name: 2776LEG
[21423.114325] XXX dir: host10/target10:0:0
[21423.114327] Modules linked in: sr_mod cdrom sit tunnel4 ipv6 tun aes_x86_64 aes_generic i915 drm i2c_algo_bit i2c_core acpi_cpufreq snd_pcm_oss snd_mixer_oss snd_seq snd_seq_device edd pl2303 usbserial usblp fuse loop dm_mod usb_storage usbhid hid cdc_ether usbnet cdc_wdm cdc_acm uvcvideo videodev v4l1_compat v4l2_compat_ioctl32 arc4 ecb snd_hda_codec_conexant snd_hda_intel kvm_intel snd_hda_codec snd_pcm iwlagn thinkpad_acpi kvm snd_timer iwlcore hwmon rfkill mac80211 backlight uhci_hcd joydev pcspkr evdev led_class snd sg cfg80211 battery nvram ehci_hcd ac soundcore e1000e usbcore snd_page_alloc thermal button processor intel_agp
[21423.114421] Pid: 213, comm: khubd Tainted: G        W  2.6.30-rc7-dirty #41
[21423.114425] Call Trace:
[21423.114433]  [<ffffffff8023c6a8>] warn_slowpath_common+0x78/0xb0
[21423.114438]  [<ffffffff8023c73c>] warn_slowpath_fmt+0x3c/0x40
[21423.114442]  [<ffffffff80322a16>] ? sysfs_addrm_start+0x76/0xd0
[21423.114447]  [<ffffffff80323032>] sysfs_remove_dir+0xb2/0xd0
[21423.114455]  [<ffffffff803e5056>] kobject_del+0x16/0x40
[21423.114461]  [<ffffffff80477665>] device_del+0x165/0x1a0
[21423.114467]  [<ffffffff8048258f>] scsi_remove_host+0xcf/0x120
[21423.114481]  [<ffffffffa02c93cb>] quiesce_and_remove_host+0x6b/0xb0 [usb_storage]
[21423.114492]  [<ffffffffa02c94f8>] usb_stor_disconnect+0x18/0x30 [usb_storage]
[21423.114514]  [<ffffffffa003ffae>] usb_unbind_interface+0x6e/0x140 [usbcore]
[21423.114523]  [<ffffffff80479e49>] __device_release_driver+0x59/0xa0
[21423.114528]  [<ffffffff80479f88>] device_release_driver+0x28/0x40
[21423.114533]  [<ffffffff8047929c>] bus_remove_device+0xac/0xe0
[21423.114538]  [<ffffffff80477627>] device_del+0x127/0x1a0
[21423.114557]  [<ffffffffa003cb77>] usb_disable_device+0xa7/0x130 [usbcore]
[21423.114575]  [<ffffffffa0037818>] usb_disconnect+0xc8/0x140 [usbcore]
[21423.114593]  [<ffffffffa0037804>] usb_disconnect+0xb4/0x140 [usbcore]
[21423.114611]  [<ffffffffa00388db>] hub_thread+0x50b/0x1230 [usbcore]
[21423.114617]  [<ffffffff80565a76>] ? _spin_unlock_irq+0x26/0x30
[21423.114623]  [<ffffffff80237d1e>] ? finish_task_switch+0x7e/0x140
[21423.114629]  [<ffffffff80237cdb>] ? finish_task_switch+0x3b/0x140
[21423.114635]  [<ffffffff802549e0>] ? autoremove_wake_function+0x0/0x40
[21423.114653]  [<ffffffffa00383d0>] ? hub_thread+0x0/0x1230 [usbcore]
[21423.114658]  [<ffffffff802545b5>] kthread+0x55/0xa0
[21423.114664]  [<ffffffff8020cf3a>] child_rip+0xa/0x20
[21423.114669]  [<ffffffff80254560>] ? kthread+0x0/0xa0
[21423.114674]  [<ffffffff8020cf30>] ? child_rip+0x0/0x20
[21423.114678] ---[ end trace 89cad9e6bbbcf701 ]---
[21423.114755] kobject: '2-2.4:1.0' (ffff88013a9cc908): kobject_uevent_env
[21423.114774] kobject: '2-2.4:1.0' (ffff88013a9cc908): fill_kobj_path: path = '/devices/pci0000:00/0000:00:1d.7/usb2/2-2/2-2.4/2-2.4:1.0'
[21423.114976] kobject: 'usbdev2.27_ep00' (ffff8800b9668dd0): kobject_uevent_env
[21423.114995] kobject: 'usbdev2.27_ep00' (ffff8800b9668dd0): fill_kobj_path: path = '/devices/pci0000:00/0000:00:1d.7/usb2/2-2/2-2.4/usb_endpoint/usbdev2.27_ep00'
[21423.115059] kobject: 'usb_endpoint' (ffff880115b54990): kobject_cleanup
[21423.115063] kobject: 'usb_endpoint' (ffff880115b54990): auto cleanup kobject_del
[21423.115092] kobject: 'usb_endpoint' (ffff880115b54990): calling ktype release
[21423.115096] kobject: (ffff880115b54990): dynamic_kobj_release
[21423.115101] kobject: 'usb_endpoint': free name
[21423.115106] kobject: 'usbdev2.27_ep00' (ffff8800b9668dd0): kobject_cleanup
[21423.115109] kobject: 'usbdev2.27_ep00' (ffff8800b9668dd0): calling ktype release
[21423.115117] kobject: 'usbdev2.27_ep00': free name
[21423.115513] kobject: '2-2.4' (ffff88013422cb20): kobject_uevent_env
[21423.115532] kobject: '2-2.4' (ffff88013422cb20): fill_kobj_path: path = '/devices/pci0000:00/0000:00:1d.7/usb2/2-2/2-2.4'
[21423.115583] kobject: '2-2.4' (ffff88013422cb20): kobject_cleanup
[21423.115586] kobject: '2-2.4' (ffff88013422cb20): calling ktype release
[21423.115607] kobject: '2-2.4': free name
[21423.115808] kobject: 'usbdev2.26_ep81' (ffff8800b96f56f0): kobject_uevent_env
[21423.115828] kobject: 'usbdev2.26_ep81' (ffff8800b96f56f0): fill_kobj_path: path = '/devices/pci0000:00/0000:00:1d.7/usb2/2-2/2-2:1.0/usb_endpoint/usbdev2.26_ep81'
[21423.115875] kobject: 'usb_endpoint' (ffff8800b9648198): kobject_cleanup
[21423.115879] kobject: 'usb_endpoint' (ffff8800b9648198): auto cleanup kobject_del
[21423.115898] kobject: 'usb_endpoint' (ffff8800b9648198): calling ktype release
[21423.115902] kobject: (ffff8800b9648198): dynamic_kobj_release
[21423.115907] kobject: 'usb_endpoint': free name
[21423.115912] kobject: 'usbdev2.26_ep81' (ffff8800b96f56f0): kobject_cleanup
[21423.115915] kobject: 'usbdev2.26_ep81' (ffff8800b96f56f0): calling ktype release
[21423.115924] kobject: 'usbdev2.26_ep81': free name
[21423.116154] kobject: '2-2:1.0' (ffff8800b9716b48): kobject_uevent_env
[21423.116173] kobject: '2-2:1.0' (ffff8800b9716b48): fill_kobj_path: path = '/devices/pci0000:00/0000:00:1d.7/usb2/2-2/2-2:1.0'
[21423.116224] kobject: '2-2:1.0' (ffff8800b9716b48): kobject_cleanup
[21423.116228] kobject: '2-2:1.0' (ffff8800b9716b48): calling ktype release
[21423.116235] kobject: '2-2:1.0': free name
[21423.116364] kobject: 'usbdev2.26_ep00' (ffff8800b96f5dc8): kobject_uevent_env
[21423.116383] kobject: 'usbdev2.26_ep00' (ffff8800b96f5dc8): fill_kobj_path: path = '/devices/pci0000:00/0000:00:1d.7/usb2/2-2/usb_endpoint/usbdev2.26_ep00'
[21423.116432] kobject: 'usb_endpoint' (ffff8800b96486e8): kobject_cleanup
[21423.116435] kobject: 'usb_endpoint' (ffff8800b96486e8): auto cleanup kobject_del
[21423.116455] kobject: 'usb_endpoint' (ffff8800b96486e8): calling ktype release
[21423.116459] kobject: (ffff8800b96486e8): dynamic_kobj_release
[21423.116463] kobject: 'usb_endpoint': free name
[21423.116468] kobject: 'usbdev2.26_ep00' (ffff8800b96f5dc8): kobject_cleanup
[21423.116472] kobject: 'usbdev2.26_ep00' (ffff8800b96f5dc8): calling ktype release
[21423.116478] kobject: 'usbdev2.26_ep00': free name
[21423.116830] kobject: '2-2' (ffff8801390c9128): kobject_uevent_env
[21423.116849] kobject: '2-2' (ffff8801390c9128): fill_kobj_path: path = '/devices/pci0000:00/0000:00:1d.7/usb2/2-2'
[21423.116897] kobject: '2-2' (ffff8801390c9128): kobject_cleanup
[21423.116901] kobject: '2-2' (ffff8801390c9128): calling ktype release
[21423.116917] kobject: '2-2': free name
[21423.269004] kobject: 'sdd1' (ffff8800b9711588): kobject_cleanup
[21423.269009] kobject: 'sdd1' (ffff8800b9711588): calling ktype release
[21423.269021] kobject: 'sdd1': free name
[21423.269040] kobject: '10:0:0:0' (ffff8800b9669dc8): kobject_cleanup
[21423.269044] kobject: '10:0:0:0' (ffff8800b9669dc8): calling ktype release
[21423.269055] kobject: '10:0:0:0': free name
[21423.269062] kobject: '10:0:0:0' (ffff88012bb87588): kobject_cleanup
[21423.269066] kobject: '10:0:0:0' (ffff88012bb87588): calling ktype release
[21423.269103] kobject: 'iosched' (ffff8801392a6158): kobject_cleanup
[21423.269106] kobject: 'iosched' (ffff8801392a6158): calling ktype release
[21423.269114] kobject: 'iosched': free name
[21423.269119] kobject: 'queue' (ffff88013aea1b90): kobject_cleanup
[21423.269123] kobject: 'queue' (ffff88013aea1b90): calling ktype release
[21423.272572] kobject: 'queue': free name
[21423.272654] kobject: 'target10:0:0' (ffff88013a9c9158): kobject_uevent_env
[21423.272674] kobject: 'target10:0:0' (ffff88013a9c9158): fill_kobj_path: path = '/host10/target10:0:0'
[21423.272864] kobject: 'target10:0:0' (ffff88013a9c9158): kobject_cleanup
[21423.272869] kobject: 'target10:0:0' (ffff88013a9c9158): calling ktype release
[21423.272876] kobject: 'host10' (ffff8801001592f8): kobject_cleanup
[21423.272879] kobject: 'host10' (ffff8801001592f8): calling ktype release
[21423.272936] kobject: '2-2.4:1.0' (ffff88013a9cc908): kobject_cleanup
[21423.272939] kobject: '2-2.4:1.0' (ffff88013a9cc908): calling ktype release
[21423.272954] kobject: '2-2.4:1.0': free name
[21423.272969] kobject: 'host10': free name
[21423.272984] kobject: 'target10:0:0': free name
[21423.272990] kobject: '10:0:0:0': free name
[21423.272995] kobject: 'sdd' (ffff88013af24920): kobject_cleanup
[21423.272998] kobject: 'sdd' (ffff88013af24920): calling ktype release
[21423.273011] kobject: 'sdd': free name


^ permalink raw reply	[flat|nested] 200+ messages in thread

* Re: [PATCH 25/20] sysfs: Only support removing emtpy sysfs  directories.
  2009-05-25 18:19                   ` Kay Sievers
@ 2009-05-25 20:14                     ` Alan Stern
  0 siblings, 0 replies; 200+ messages in thread
From: Alan Stern @ 2009-05-25 20:14 UTC (permalink / raw)
  To: Kay Sievers
  Cc: James Bottomley, Boaz Harrosh, SCSI development list,
	Eric W. Biederman, Andrew Morton, Greg Kroah-Hartman,
	Kernel development list, Tejun Heo, Cornelia Huck, linux-fsdevel,
	Eric W. Biederman

On Mon, 25 May 2009, Kay Sievers wrote:

> On Mon, 2009-05-25 at 11:49 -0400, Alan Stern wrote:
> > Since this appears to be a bug in the SCSI layer, let's add some SCSI
> > people to the CC: list.
> > 
> > To summarize the problem: The SCSI core tries to unregister a host 
> > while its sysfs directory is still non-empty because the target hasn't 
> > been unregistered yet.
> 
> > Can you provide more of the context?  I'd like to see the log starting 
> > from when these devices were first registered.
> 
> I was able to trigger it with a USB storage device only, connected to a
> hub, and I removed the hub from the host.

Okay, it's a long log dump, but useful.  Evidently the problem is
caused by the fact that scsi_target_reap() is called by
scsi_device_dev_release_usercontext() instead of by
__scsi_remove_device() (both functions in drivers/scsi/scsi_sysfs.c).

There's probably a reason for this, but I don't know what it is.  Maybe 
James can explain.

Alan Stern


^ permalink raw reply	[flat|nested] 200+ messages in thread

* Re: [PATCH 25/20] sysfs: Only support removing emtpy sysfs directories.
  2009-05-25 11:45             ` Kay Sievers
  2009-05-25 12:01               ` Kay Sievers
@ 2009-05-26 16:27               ` Kay Sievers
  2009-05-26 19:29                 ` Alan Stern
  1 sibling, 1 reply; 200+ messages in thread
From: Kay Sievers @ 2009-05-26 16:27 UTC (permalink / raw)
  To: Alan Stern
  Cc: Eric W. Biederman, Andrew Morton, Greg Kroah-Hartman,
	linux-kernel, Tejun Heo, Cornelia Huck, linux-fsdevel,
	Eric W. Biederman

On Mon, May 25, 2009 at 13:45, Kay Sievers <kay.sievers@vrfy.org> wrote:
> On Mon, May 25, 2009 at 04:06, Alan Stern <stern@rowland.harvard.edu> wrote:

>> by the way -- so it's a little difficult to trigger.
>
> I can trigger it pretty reliably now on plain -rc7, but only with
> more hubs between the host and the storage device. It usually takes
> fewer than 10-15 connect/disconnect cycles.
>
> It looks like a serious bug, though: after the bug triggers, random,
> likely unrelated applications crash, and I cannot cleanly shut down
> anymore.

Just a heads-up for anybody trying to reproduce this: it trashed my
ext3 rootfs, which is not recoverable.

Not sure what exactly caused this, but I haven't had anything like
this for a very long time.

I tried to reproduce the issue a few more times, and it crashed random
processes after the bug triggered, as mentioned above, and the box
never shut down cleanly.

It's entirely possible that this bug causes serious issues.

Thanks,
Kay

^ permalink raw reply	[flat|nested] 200+ messages in thread

* Re: [PATCH 25/20] sysfs: Only support removing emtpy sysfs  directories.
  2009-05-26 16:27               ` Kay Sievers
@ 2009-05-26 19:29                 ` Alan Stern
  2009-05-26 21:09                   ` James Bottomley
  0 siblings, 1 reply; 200+ messages in thread
From: Alan Stern @ 2009-05-26 19:29 UTC (permalink / raw)
  To: Kay Sievers
  Cc: James Bottomley, SCSI development list, Eric W. Biederman,
	Andrew Morton, Greg Kroah-Hartman, Kernel development list,
	Tejun Heo, Cornelia Huck, linux-fsdevel, Eric W. Biederman

On Tue, 26 May 2009, Kay Sievers wrote:

> On Mon, May 25, 2009 at 13:45, Kay Sievers <kay.sievers@vrfy.org> wrote:
> > On Mon, May 25, 2009 at 04:06, Alan Stern <stern@rowland.harvard.edu> wrote:
> 
> >> by the way -- so it's a little difficult to trigger.
> >
> > I can trigger it pretty reliably now on plain -rc7, but only with
> > more hubs between the host and the storage device. It usually takes
> > fewer than 10-15 connect/disconnect cycles.
> >
> > It looks like a serious bug, though: after the bug triggers, random,
> > likely unrelated applications crash, and I cannot cleanly shut down
> > anymore.
> 
> Just a heads-up for anybody trying to reproduce this: it trashed my
> ext3 rootfs, which is not recoverable.
> 
> Not sure what exactly caused this, but I haven't had anything like
> this for a very long time.
> 
> I tried to reproduce the issue a few more times, and it crashed random
> processes after the bug triggered, as mentioned above, and the box
> never shut down cleanly.
> 
> It's entirely possible that this bug causes serious issues.

If you don't mind trashing some more ext3 root filesystems :-) you can
try this patch.  It's almost certainly not quite the right thing to do
and I have probably messed up the target's reference counting, but
maybe it's a step in the right direction.

This strange business of deferring unregistration into a workqueue 
means that the calls might not be executed in the same order that 
they're made.

Alan Stern


Index: usb-2.6/drivers/scsi/scsi_scan.c
===================================================================
--- usb-2.6.orig/drivers/scsi/scsi_scan.c
+++ usb-2.6/drivers/scsi/scsi_scan.c
@@ -956,6 +956,7 @@ static inline void scsi_destroy_sdev(str
 	if (sdev->host->hostt->slave_destroy)
 		sdev->host->hostt->slave_destroy(sdev);
 	transport_destroy_device(&sdev->sdev_gendev);
+	put_device(sdev->sdev_gendev.parent);
 	put_device(&sdev->sdev_gendev);
 }
 
Index: usb-2.6/drivers/scsi/scsi_sysfs.c
===================================================================
--- usb-2.6.orig/drivers/scsi/scsi_sysfs.c
+++ usb-2.6/drivers/scsi/scsi_sysfs.c
@@ -327,8 +327,6 @@ static void scsi_device_dev_release_user
 		sdev->request_queue = NULL;
 	}
 
-	scsi_target_reap(scsi_target(sdev));
-
 	kfree(sdev->inquiry);
 	kfree(sdev);
 
@@ -954,6 +952,7 @@ void __scsi_remove_device(struct scsi_de
 	if (sdev->host->hostt->slave_destroy)
 		sdev->host->hostt->slave_destroy(sdev);
 	transport_destroy_device(dev);
+	scsi_target_reap(scsi_target(sdev));
 	put_device(dev);
 }
 


^ permalink raw reply	[flat|nested] 200+ messages in thread

* Re: [PATCH 25/20] sysfs: Only support removing emtpy sysfs directories.
  2009-05-26 19:29                 ` Alan Stern
@ 2009-05-26 21:09                   ` James Bottomley
  2009-05-26 21:13                     ` Kay Sievers
  2009-05-26 21:39                     ` Alan Stern
  0 siblings, 2 replies; 200+ messages in thread
From: James Bottomley @ 2009-05-26 21:09 UTC (permalink / raw)
  To: Alan Stern
  Cc: Kay Sievers, SCSI development list, Eric W. Biederman,
	Andrew Morton, Greg Kroah-Hartman, Kernel development list,
	Tejun Heo, Cornelia Huck, linux-fsdevel, Eric W. Biederman

On Tue, 2009-05-26 at 15:29 -0400, Alan Stern wrote:
> On Tue, 26 May 2009, Kay Sievers wrote:
> 
> > On Mon, May 25, 2009 at 13:45, Kay Sievers <kay.sievers@vrfy.org> wrote:
> > > On Mon, May 25, 2009 at 04:06, Alan Stern <stern@rowland.harvard.edu> wrote:
> > 
> > >> by the way -- so it's a little difficult to trigger.
> > >
> > > I can trigger it pretty reliably now on plain -rc7, but only with
> > > more hubs between the host and the storage device. It usually takes
> > > fewer than 10-15 connect/disconnect cycles.
> > >
> > > It looks like a serious bug, though: after the bug triggers, random,
> > > likely unrelated applications crash, and I cannot cleanly shut down
> > > anymore.
> > 
> > Just a heads-up for anybody trying to reproduce this: it trashed my
> > ext3 rootfs, which is not recoverable.
> > 
> > Not sure what exactly caused this, but I haven't had anything like
> > this for a very long time.
> > 
> > I tried to reproduce the issue a few more times, and it crashed random
> > processes after the bug triggered, as mentioned above, and the box
> > never shut down cleanly.
> > 
> > It's entirely possible that this bug causes serious issues.
> 
> If you don't mind trashing some more ext3 root filesystems :-) you can
> try this patch.  It's almost certainly not quite the right thing to do
> and I have probably messed up the target's reference counting, but
> maybe it's a step in the right direction.
> 
> This strange business of deferring unregistration into a workqueue 
> means that the calls might not be executed in the same order that 
> they're made.
> 
> Alan Stern
> 
> 
> Index: usb-2.6/drivers/scsi/scsi_scan.c
> ===================================================================
> --- usb-2.6.orig/drivers/scsi/scsi_scan.c
> +++ usb-2.6/drivers/scsi/scsi_scan.c
> @@ -956,6 +956,7 @@ static inline void scsi_destroy_sdev(str
>  	if (sdev->host->hostt->slave_destroy)
>  		sdev->host->hostt->slave_destroy(sdev);
>  	transport_destroy_device(&sdev->sdev_gendev);
> +	put_device(sdev->sdev_gendev.parent);
>  	put_device(&sdev->sdev_gendev);
>  }
>  
> Index: usb-2.6/drivers/scsi/scsi_sysfs.c
> ===================================================================
> --- usb-2.6.orig/drivers/scsi/scsi_sysfs.c
> +++ usb-2.6/drivers/scsi/scsi_sysfs.c
> @@ -327,8 +327,6 @@ static void scsi_device_dev_release_user
>  		sdev->request_queue = NULL;
>  	}
>  
> -	scsi_target_reap(scsi_target(sdev));
> -
>  	kfree(sdev->inquiry);
>  	kfree(sdev);
>  
> @@ -954,6 +952,7 @@ void __scsi_remove_device(struct scsi_de
>  	if (sdev->host->hostt->slave_destroy)
>  		sdev->host->hostt->slave_destroy(sdev);
>  	transport_destroy_device(dev);
> +	scsi_target_reap(scsi_target(sdev));
>  	put_device(dev);
>  }

Um, well, you're right, it's wrong.  The reap needs to be matched with
the reap_ref++.

It's hard to follow the problem without full context, but if I
understand correctly the problem is you want all the target directories
removed before you call device_del() on the host and the thing that gets
in the way is the necessary user context removal of the host.  So a
simple solution, rather than mucking with the way it works, is to wait
for the workqueues to complete.  Does this fix it?

James

---

diff --git a/drivers/scsi/scsi_scan.c b/drivers/scsi/scsi_scan.c
index c447838..5846c26 100644
--- a/drivers/scsi/scsi_scan.c
+++ b/drivers/scsi/scsi_scan.c
@@ -1877,6 +1877,12 @@ void scsi_forget_host(struct Scsi_Host *shost)
 		goto restart;
 	}
 	spin_unlock_irqrestore(shost->host_lock, flags);
+
+	/*
+	 * the sdev removal goes through a workqueue for user context, so
+	 * make sure it's all complete before we return
+	 */
+	flush_scheduled_work();
 }
 
 /*



^ permalink raw reply related	[flat|nested] 200+ messages in thread

* Re: [PATCH 25/20] sysfs: Only support removing emtpy sysfs directories.
  2009-05-26 21:09                   ` James Bottomley
@ 2009-05-26 21:13                     ` Kay Sievers
  2009-05-26 21:56                       ` Alan Stern
  2009-05-26 21:39                     ` Alan Stern
  1 sibling, 1 reply; 200+ messages in thread
From: Kay Sievers @ 2009-05-26 21:13 UTC (permalink / raw)
  To: James Bottomley
  Cc: Alan Stern, SCSI development list, Eric W. Biederman,
	Andrew Morton, Greg Kroah-Hartman, Kernel development list,
	Tejun Heo, Cornelia Huck, linux-fsdevel, Eric W. Biederman

On Tue, May 26, 2009 at 23:09, James Bottomley
<James.Bottomley@hansenpartnership.com> wrote:
> On Tue, 2009-05-26 at 15:29 -0400, Alan Stern wrote:

>> If you don't mind trashing some more ext3 root filesystems :-) you can
>> try this patch.  It's almost certainly not quite the right thing to do
>> and I have probably messed up the target's reference counting, but
>> maybe it's a step in the right direction.

> It's hard to follow the problem without full context, but if I
> understand correctly the problem is you want all the target directories
> removed before you call device_del() on the host and the thing that gets
> in the way is the necessary user context removal of the host.  So a
> simple solution, rather than mucking with the way it works, is to wait
> for the workqueues to complete.  Does this fix it?

Ok, I copied my newly installed system to another disk, to have a
root filesystem to trash again with this bug. :)

Which of your patches should I try?

Thanks,
Kay
--
To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 200+ messages in thread

* Re: [PATCH 25/20] sysfs: Only support removing emtpy sysfs  directories.
  2009-05-26 21:09                   ` James Bottomley
  2009-05-26 21:13                     ` Kay Sievers
@ 2009-05-26 21:39                     ` Alan Stern
  1 sibling, 0 replies; 200+ messages in thread
From: Alan Stern @ 2009-05-26 21:39 UTC (permalink / raw)
  To: James Bottomley
  Cc: Kay Sievers, SCSI development list, Eric W. Biederman,
	Andrew Morton, Greg Kroah-Hartman, Kernel development list,
	Tejun Heo, Cornelia Huck, linux-fsdevel, Eric W. Biederman

On Tue, 26 May 2009, James Bottomley wrote:

> Um, well, you're right, it's wrong.  The reap needs to be matched with
> the reap_ref++.

I was afraid of that.  Not understanding how the reap_ref counts are 
supposed to work makes it hard to get them right...

> It's hard to follow the problem without full context, but if I
> understand correctly the problem is you want all the target directories
> removed before you call device_del() on the host and the thing that gets
> in the way is the necessary user context removal of the host.

Removal of the _target_, not the host.  (Do targets ever get reaped
from non-process context?  I didn't notice any places where that would
happen.)

That's part of the problem.  The other part is that the target isn't
unregistered until all the sdevs have been released, which might not
happen for a long time if the sdevs are pinned for any reason.

That is, the target should be unregistered when all the sdevs are
deleted, in __scsi_remove_device(), not when scsi_device_dev_release()  
or scsi_device_dev_release_usercontext() runs.
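
As a rough illustration of the difference, here is a small user-space
model.  It is only a sketch: every name in it is made up for the
example, and none of it is SCSI core code.  It assumes the target keeps
one count of sdevs that are still visible and one count of references
that pin sdev memory, and it compares reaping the target on the last
release with reaping it on the last removal.

```c
#include <stdbool.h>

/* Hypothetical model; these names do not exist in the SCSI core. */
enum reap_policy { REAP_ON_RELEASE, REAP_ON_REMOVE };

struct model_target {
	enum reap_policy policy;
	int visible_sdevs;	/* sdevs added and not yet removed */
	int sdev_refs;		/* references still pinning sdev memory */
	bool registered;	/* target still has its sysfs directory */
};

static void model_target_init(struct model_target *t, enum reap_policy p)
{
	t->policy = p;
	t->visible_sdevs = 0;
	t->sdev_refs = 0;
	t->registered = true;
}

/* Adding an sdev makes it visible and takes the core's own reference. */
static void model_sdev_add(struct model_target *t)
{
	t->visible_sdevs++;
	t->sdev_refs++;
}

/* Something else (say, an open file) pins the sdev. */
static void model_sdev_get(struct model_target *t)
{
	t->sdev_refs++;
}

/* Dropping the last reference is the "release" step. */
static void model_sdev_put(struct model_target *t)
{
	if (--t->sdev_refs == 0 && t->policy == REAP_ON_RELEASE)
		t->registered = false;
}

/* Removal hides the sdev, then drops the core's reference. */
static void model_sdev_remove(struct model_target *t)
{
	if (--t->visible_sdevs == 0 && t->policy == REAP_ON_REMOVE)
		t->registered = false;
	model_sdev_put(t);
}
```

With REAP_ON_RELEASE, a pinned sdev keeps the target registered after
model_sdev_remove() until the last model_sdev_put().  With
REAP_ON_REMOVE, the target is unregistered as soon as the last sdev is
removed, whether or not anything still holds a reference.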

>  So a
> simple solution, rather than mucking with the way it works, is to wait
> for the workqueues to complete.  Does this fix it?
> 
> James
> 
> ---
> 
> diff --git a/drivers/scsi/scsi_scan.c b/drivers/scsi/scsi_scan.c
> index c447838..5846c26 100644
> --- a/drivers/scsi/scsi_scan.c
> +++ b/drivers/scsi/scsi_scan.c
> @@ -1877,6 +1877,12 @@ void scsi_forget_host(struct Scsi_Host *shost)
>  		goto restart;
>  	}
>  	spin_unlock_irqrestore(shost->host_lock, flags);
> +
> +	/*
> +	 * the sdev removal goes through a workqueue for user context, so
> +	 * make sure it's all complete before we return
> +	 */
> +	flush_scheduled_work();
>  }
>  
>  /*

This may well fix Kay's problem, but I think the other point needs to 
be addressed also.

Alan Stern


^ permalink raw reply	[flat|nested] 200+ messages in thread

* Re: [PATCH 25/20] sysfs: Only support removing emtpy sysfs  directories.
  2009-05-26 21:13                     ` Kay Sievers
@ 2009-05-26 21:56                       ` Alan Stern
  2009-05-26 22:03                         ` Kay Sievers
  0 siblings, 1 reply; 200+ messages in thread
From: Alan Stern @ 2009-05-26 21:56 UTC (permalink / raw)
  To: Kay Sievers
  Cc: James Bottomley, SCSI development list, Eric W. Biederman,
	Andrew Morton, Greg Kroah-Hartman, Kernel development list,
	Tejun Heo, Cornelia Huck, linux-fsdevel, Eric W. Biederman

On Tue, 26 May 2009, Kay Sievers wrote:

> Ok, I copied my newly installed system to another disk, to have a
> root filesystem to trash again with this bug. :)
> 
> Which of your patches should I try?

James's patch, not mine.

Alan Stern


^ permalink raw reply	[flat|nested] 200+ messages in thread

* Re: [PATCH 25/20] sysfs: Only support removing emtpy sysfs directories.
  2009-05-26 21:56                       ` Alan Stern
@ 2009-05-26 22:03                         ` Kay Sievers
  2009-05-26 23:49                           ` James Bottomley
  0 siblings, 1 reply; 200+ messages in thread
From: Kay Sievers @ 2009-05-26 22:03 UTC (permalink / raw)
  To: Alan Stern
  Cc: James Bottomley, SCSI development list, Eric W. Biederman,
	Andrew Morton, Greg Kroah-Hartman, Kernel development list,
	Tejun Heo, Cornelia Huck, linux-fsdevel, Eric W. Biederman

On Tue, May 26, 2009 at 23:56, Alan Stern <stern@rowland.harvard.edu> wrote:
> On Tue, 26 May 2009, Kay Sievers wrote:
>
>> Ok, I copied my newly installed system to another disk, to have a
>> root filesystem to trash again with this bug. :)
>>
>> Which of your patches should I try?
>
> James's patch, not mine.

I tried both; neither fixes the issue. With Alan's patch it *seems*
the target device never gets removed; at least I didn't see anything
in the kobject debug logs.

Kay

^ permalink raw reply	[flat|nested] 200+ messages in thread

* Re: [PATCH 25/20] sysfs: Only support removing emtpy sysfs directories.
  2009-05-26 22:03                         ` Kay Sievers
@ 2009-05-26 23:49                           ` James Bottomley
  2009-05-27  0:02                             ` Kay Sievers
  0 siblings, 1 reply; 200+ messages in thread
From: James Bottomley @ 2009-05-26 23:49 UTC (permalink / raw)
  To: Kay Sievers
  Cc: Alan Stern, SCSI development list, Eric W. Biederman,
	Andrew Morton, Greg Kroah-Hartman, Kernel development list,
	Tejun Heo, Cornelia Huck, linux-fsdevel, Eric W. Biederman

On Wed, 2009-05-27 at 00:03 +0200, Kay Sievers wrote:
> On Tue, May 26, 2009 at 23:56, Alan Stern <stern@rowland.harvard.edu> wrote:
> > On Tue, 26 May 2009, Kay Sievers wrote:
> >
> >> Ok, I copied my newly installed system to another disk, to have a
> >> root filesystem to trash again with this bug. :)
> >>
> >> Which of your patches should I try?
> >
> > James's patch, not mine.
> 
> I tried both; neither fixes the issue. With Alan's patch it *seems*
> the target device never gets removed; at least I didn't see anything
> in the kobject debug logs.

OK ... perhaps we have to wait a little harder: try this; it waits until
all the targets have disappeared from visibility via an event.
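
For readers who want to see the shape of that wait outside the kernel,
here is a user-space analogue using a pthread condition variable.  It is
a sketch under assumptions, not the patch itself: the model_* names are
invented for the example, a plain counter stands in for the host's
target list, and a thread stands in for the deferred removal path.  The
while loop plays the role of wait_event() rechecking its condition after
every wake-up.

```c
#include <pthread.h>

/* Hypothetical stand-in for a host and its list of targets. */
struct model_host {
	pthread_mutex_t lock;
	pthread_cond_t target_removed;
	int ntargets;		/* stands in for the host's target list */
};

static void model_host_init(struct model_host *h, int ntargets)
{
	pthread_mutex_init(&h->lock, NULL);
	pthread_cond_init(&h->target_removed, NULL);
	h->ntargets = ntargets;
}

/* Unlink one target, then wake anyone waiting for the list to drain. */
static int model_target_destroy(struct model_host *h)
{
	int remaining;

	pthread_mutex_lock(&h->lock);
	remaining = --h->ntargets;
	pthread_cond_broadcast(&h->target_removed);
	pthread_mutex_unlock(&h->lock);
	return remaining;
}

/* Block until no targets remain; recheck after every wake-up. */
static void model_wait_for_targets_gone(struct model_host *h)
{
	pthread_mutex_lock(&h->lock);
	while (h->ntargets > 0)
		pthread_cond_wait(&h->target_removed, &h->lock);
	pthread_mutex_unlock(&h->lock);
}

/* Thread body for the deferred removals; assumes at least one target. */
static void *model_remover(void *arg)
{
	struct model_host *h = arg;

	while (model_target_destroy(h) > 0)
		;
	return NULL;
}
```

The waiter returns only after the last target has been unlinked, however
the removals are scheduled relative to the call.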

James

---

diff --git a/drivers/scsi/hosts.c b/drivers/scsi/hosts.c
index 89d41a4..b2946bf 100644
--- a/drivers/scsi/hosts.c
+++ b/drivers/scsi/hosts.c
@@ -173,6 +173,8 @@ void scsi_remove_host(struct Scsi_Host *shost)
 		BUG_ON(scsi_host_set_state(shost, SHOST_DEL_RECOVERY));
 	spin_unlock_irqrestore(shost->host_lock, flags);
 
+	scsi_wait_for_targets_gone(shost);
+
 	transport_unregister_device(&shost->shost_gendev);
 	device_unregister(&shost->shost_dev);
 	device_del(&shost->shost_gendev);
diff --git a/drivers/scsi/scsi_scan.c b/drivers/scsi/scsi_scan.c
index c447838..367216c 100644
--- a/drivers/scsi/scsi_scan.c
+++ b/drivers/scsi/scsi_scan.c
@@ -32,6 +32,7 @@
 #include <linux/delay.h>
 #include <linux/kthread.h>
 #include <linux/spinlock.h>
+#include <linux/wait.h>
 #include <linux/async.h>
 
 #include <scsi/scsi.h>
@@ -324,6 +325,12 @@ out:
 	return NULL;
 }
 
+static  DECLARE_WAIT_QUEUE_HEAD(scsi_target_removed);
+void scsi_wait_for_targets_gone(struct Scsi_Host *shost)
+{
+	wait_event(scsi_target_removed, list_empty(&shost->__targets));
+}
+
 static void scsi_target_destroy(struct scsi_target *starget)
 {
 	struct device *dev = &starget->dev;
@@ -336,6 +343,7 @@ static void scsi_target_destroy(struct scsi_target *starget)
 		shost->hostt->target_destroy(starget);
 	list_del_init(&starget->siblings);
 	spin_unlock_irqrestore(shost->host_lock, flags);
+	wake_up(&scsi_target_removed);
 	put_device(dev);
 }
 
diff --git a/include/scsi/scsi_host.h b/include/scsi/scsi_host.h
index b62a097..b63a901 100644
--- a/include/scsi/scsi_host.h
+++ b/include/scsi/scsi_host.h
@@ -747,6 +747,7 @@ static inline int scsi_host_scan_allowed(struct Scsi_Host *shost)
 
 extern void scsi_unblock_requests(struct Scsi_Host *);
 extern void scsi_block_requests(struct Scsi_Host *);
+extern void scsi_wait_for_targets_gone(struct Scsi_Host *);
 
 struct class_container;
 



^ permalink raw reply related	[flat|nested] 200+ messages in thread

* Re: [PATCH 25/20] sysfs: Only support removing emtpy sysfs directories.
  2009-05-26 23:49                           ` James Bottomley
@ 2009-05-27  0:02                             ` Kay Sievers
  2009-05-27  2:17                               ` Alan Stern
  0 siblings, 1 reply; 200+ messages in thread
From: Kay Sievers @ 2009-05-27  0:02 UTC (permalink / raw)
  To: James Bottomley
  Cc: Alan Stern, SCSI development list, Eric W. Biederman,
	Andrew Morton, Greg Kroah-Hartman, Kernel development list,
	Tejun Heo, Cornelia Huck, linux-fsdevel, Eric W. Biederman

On Wed, May 27, 2009 at 01:49, James Bottomley
<James.Bottomley@hansenpartnership.com> wrote:

> OK ... perhaps we have to wait a little harder: try this; it waits until
> all the targets have disappeared from visibility via an event.

That seems to work fine here.

Thanks,
Kay

^ permalink raw reply	[flat|nested] 200+ messages in thread

* Re: [PATCH 25/20] sysfs: Only support removing emtpy sysfs  directories.
  2009-05-27  0:02                             ` Kay Sievers
@ 2009-05-27  2:17                               ` Alan Stern
  2009-05-27 11:35                                 ` Hannes Reinecke
  2009-05-27 18:00                                 ` Eric W. Biederman
  0 siblings, 2 replies; 200+ messages in thread
From: Alan Stern @ 2009-05-27  2:17 UTC (permalink / raw)
  To: Kay Sievers
  Cc: James Bottomley, SCSI development list, Eric W. Biederman,
	Andrew Morton, Greg Kroah-Hartman, Kernel development list,
	Tejun Heo, Cornelia Huck, linux-fsdevel, Eric W. Biederman

On Wed, 27 May 2009, Kay Sievers wrote:

> On Wed, May 27, 2009 at 01:49, James Bottomley
> <James.Bottomley@hansenpartnership.com> wrote:
> 
> > OK ... perhaps we have to wait a little harder: try this; it waits until
> > all the targets have disappeared from visibility via an event.
> 
> That seems to work fine here.

It's good for a short-term fix.  For the longer term, I still think 
it's a mistake to wait for the sdevs to be released before deleting the 
target.  It gives user programs the ability to block the host-removal 
thread indefinitely.

Alan Stern


^ permalink raw reply	[flat|nested] 200+ messages in thread

* Re: [PATCH 25/20] sysfs: Only support removing emtpy sysfs  directories.
  2009-05-27  2:17                               ` Alan Stern
@ 2009-05-27 11:35                                 ` Hannes Reinecke
  2009-05-27 16:01                                   ` James Bottomley
  2009-05-27 18:00                                 ` Eric W. Biederman
  1 sibling, 1 reply; 200+ messages in thread
From: Hannes Reinecke @ 2009-05-27 11:35 UTC (permalink / raw)
  To: Alan Stern
  Cc: Kay Sievers, James Bottomley, SCSI development list,
	Eric W. Biederman, Andrew Morton, Greg Kroah-Hartman,
	Kernel development list, Tejun Heo, Cornelia Huck, linux-fsdevel,
	Eric W. Biederman

Hi all,

Alan Stern wrote:
> On Wed, 27 May 2009, Kay Sievers wrote:
> 
>> On Wed, May 27, 2009 at 01:49, James Bottomley
>> <James.Bottomley@hansenpartnership.com> wrote:
>>
>>> OK ... perhaps we have to wait a little harder: try this; it waits until
>>> all the targets have disappeared from visibility via an event.
>> That seems to work fine here.
> 
> It's good for a short-term fix.  For the longer term, I still think 
> it's a mistake to wait for the sdevs to be released before deleting the 
> target.  It gives user programs the ability to block the host-removal 
> thread indefinitely.
> 
Quite so. We should rather see to it that the reference counting is
fixed properly; then this would go away automatically.

So I just have to find someone to fix my iSCSI bugs, then I could
revamp my patchset ...

Sigh.

Cheers,

Hannes
-- 
Dr. Hannes Reinecke		      zSeries & Storage
hare@suse.de			      +49 911 74053 688
SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: Markus Rex, HRB 16746 (AG Nürnberg)

^ permalink raw reply	[flat|nested] 200+ messages in thread

* Re: [PATCH 25/20] sysfs: Only support removing emtpy sysfs directories.
  2009-05-27 11:35                                 ` Hannes Reinecke
@ 2009-05-27 16:01                                   ` James Bottomley
  2009-05-27 16:16                                     ` Alan Stern
  0 siblings, 1 reply; 200+ messages in thread
From: James Bottomley @ 2009-05-27 16:01 UTC (permalink / raw)
  To: Hannes Reinecke
  Cc: Alan Stern, Kay Sievers, SCSI development list,
	Eric W. Biederman, Andrew Morton, Greg Kroah-Hartman,
	Kernel development list, Tejun Heo, Cornelia Huck, linux-fsdevel,
	Eric W. Biederman

On Wed, 2009-05-27 at 13:35 +0200, Hannes Reinecke wrote:
> Hi all,
> 
> Alan Stern wrote:
> > On Wed, 27 May 2009, Kay Sievers wrote:
> > 
> >> On Wed, May 27, 2009 at 01:49, James Bottomley
> >> <James.Bottomley@hansenpartnership.com> wrote:
> >>
> >>> OK ... perhaps we have to wait a little harder: try this; it waits until
> >>> all the targets have disappeared from visibility via an event.
> >> That seems to work fine here.
> > 
> > It's good for a short-term fix.  For the longer term, I still think 
> > it's a mistake to wait for the sdevs to be released before deleting the 
> > target.  It gives user programs the ability to block the host-removal 
> > thread indefinitely.
> > 
> Quite so. We should rather see to it that the reference counting is
> fixed properly; then this would go away automatically.

Hardly ... our current refcounting is on destruction (releases).  This
problem is an instance of visibility (the del calls); we need the
visibility teardown to work nicely.  We currently have no refcounting on
the visibility.  Even if we did (and we could add a ref on when the
underlying device del calls are done), what happens if the target needs
to become visible again?  Apparently the generic device infrastructure
can't accept doing an add on a previously del'd device.

The most obvious way of fixing this is to have a special case for
targets of dying hosts ... they could call del early on the
understanding that they're never getting new underlying devices.  That
would allow the wait to trigger on the last target del, which is what is
optimal.

James



^ permalink raw reply	[flat|nested] 200+ messages in thread

* Re: [PATCH 25/20] sysfs: Only support removing emtpy sysfs  directories.
  2009-05-27 16:01                                   ` James Bottomley
@ 2009-05-27 16:16                                     ` Alan Stern
  2009-05-27 16:24                                       ` James Bottomley
  0 siblings, 1 reply; 200+ messages in thread
From: Alan Stern @ 2009-05-27 16:16 UTC (permalink / raw)
  To: James Bottomley
  Cc: Hannes Reinecke, Kay Sievers, SCSI development list,
	Eric W. Biederman, Andrew Morton, Greg Kroah-Hartman,
	Kernel development list, Tejun Heo, Cornelia Huck, linux-fsdevel,
	Eric W. Biederman

On Wed, 27 May 2009, James Bottomley wrote:

> Hardly ... our current refcounting is on destruction (releases).  This
> problem is an instance of visibility (the del calls); we need the
> visibility teardown to work nicely.  We currently have no refcounting on
> the visibility.  Even if we did (and we could add a ref on when the
> underlying device del calls are done), what happens if the target needs
> to become visible again?  Apparently the generic device infrastructure
> can't accept doing an add on a previously del'd device.

Definitely not.

> The most obvious way of fixing this is to have a special case for
> targets of dying hosts ... they could call del early on the
> understanding that they're never getting new underlying devices.  That
> would allow the wait to trigger on the last target del, which is what is
> optimal.

I don't understand all the subtle issues here.  In other contexts, the 
solution would be to initialize a refcount to 1 when the target is 
allocated, increment it when a device is added, and decrement it when a 
device is removed or the host is removed.  When the refcount goes to 0, 
the target is deleted.  Why wouldn't this kind of approach work?
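
To make the scheme concrete, here is a minimal user-space sketch.  The
vis_* names are hypothetical and exist only for this example; the
counter is a plain int with no locking, which a real implementation
would need.

```c
#include <stdbool.h>

/* Hypothetical model of a per-target visibility count. */
struct vis_target {
	int vis_count;	/* 1 for the host, plus 1 per added device */
	bool deleted;	/* set once the target has been deleted */
};

/* Allocation starts the count at 1; that reference belongs to the host. */
static void vis_target_alloc(struct vis_target *t)
{
	t->vis_count = 1;
	t->deleted = false;
}

/* Drop one count; the target is deleted when the count reaches 0. */
static void vis_target_put(struct vis_target *t)
{
	if (--t->vis_count == 0)
		t->deleted = true;
}

static void vis_device_added(struct vis_target *t)
{
	t->vis_count++;
}

static void vis_device_removed(struct vis_target *t)
{
	vis_target_put(t);
}

/* Host removal drops the initial reference taken at allocation. */
static void vis_host_removed(struct vis_target *t)
{
	vis_target_put(t);
}
```

In this model the target is deleted by whichever comes last, the final
device removal or the host removal, and never waits for a release.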

Alan Stern


^ permalink raw reply	[flat|nested] 200+ messages in thread

* Re: [PATCH 25/20] sysfs: Only support removing emtpy sysfs directories.
  2009-05-27 16:16                                     ` Alan Stern
@ 2009-05-27 16:24                                       ` James Bottomley
  2009-05-27 17:01                                         ` Alan Stern
  0 siblings, 1 reply; 200+ messages in thread
From: James Bottomley @ 2009-05-27 16:24 UTC (permalink / raw)
  To: Alan Stern
  Cc: Hannes Reinecke, Kay Sievers, SCSI development list,
	Eric W. Biederman, Andrew Morton, Greg Kroah-Hartman,
	Kernel development list, Tejun Heo, Cornelia Huck, linux-fsdevel,
	Eric W. Biederman

On Wed, 2009-05-27 at 12:16 -0400, Alan Stern wrote:
> On Wed, 27 May 2009, James Bottomley wrote:
> 
> > Hardly ... our current refcounting is on destruction (releases).  This
> > problem is an instance of visibility (the del calls); we need the
> > visibility teardown to work nicely.  We currently have no refcounting on
> > the visibility.  Even if we did (and we could add a ref on when the
> > underlying device del calls are done), what happens if the target needs
> > to become visible again?  Apparently the generic device infrastructure
> > can't accept doing an add on a previously del'd device.
> 
> Definitely not.
> 
> > The most obvious way of fixing this is to have a special case for
> > targets of dying hosts ... they could call del early on the
> > understanding that they're never getting new underlying devices.  That
> > would allow the wait to trigger on the last target del, which is what is
> > optimal.
> 
> I don't understand all the subtle issues here.  In other contexts, the 
> solution would be to initialize a refcount to 1 when the target is 
> allocated, increment it when a device is added, and decrement it when a 
> device is removed or the host is removed.  When the refcount goes to 0, 
> the target is deleted.  Why wouldn't this kind of approach work?

Um, well that's exactly how it works (modulo the fact that there are
parts of the lifecycle where the ref count is zero, like scanning).  The
problem you're complaining about is that the device ref on the target
may take a long time to release, so we can't key the del event on the
refcount going to zero, which is what we do today.

James




* Re: [PATCH 25/20] sysfs: Only support removing emtpy sysfs  directories.
  2009-05-27 16:24                                       ` James Bottomley
@ 2009-05-27 17:01                                         ` Alan Stern
  2009-05-27 17:08                                           ` James Bottomley
  0 siblings, 1 reply; 200+ messages in thread
From: Alan Stern @ 2009-05-27 17:01 UTC (permalink / raw)
  To: James Bottomley
  Cc: Hannes Reinecke, Kay Sievers, SCSI development list,
	Eric W. Biederman, Andrew Morton, Greg Kroah-Hartman,
	Kernel development list, Tejun Heo, Cornelia Huck, linux-fsdevel,
	Eric W. Biederman

On Wed, 27 May 2009, James Bottomley wrote:

> > I don't understand all the subtle issues here.  In other contexts, the 
> > solution would be to initialize a refcount to 1 when the target is 
> > allocated, increment it when a device is added, and decrement it when a 
> > device is removed or the host is removed.  When the refcount goes to 0, 
> > the target is deleted.  Why wouldn't this kind of approach work?
> 
> Um, well that's exactly how it works (modulo the fact that there are
> parts of the lifecycle where the ref count is zero, like scanning).

Why does that happen?  It's reasonable that there should be times 
during scanning when the target doesn't have any children, but the 
refcount should still be positive.

>  The
> problem you're complaining about is that the device ref on the target
> may take a long time to release, so we can't key the del event on the
> refcount going to zero, which is what we do today.

Maybe we should be talking about two separate refcounts: a normal 
get_device/put_device kref counter for the target's lifetime, and a 
visibility counter (one for each child device and one overall) which 
keys the del event and must go to 0 before the host removal finishes.
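
A minimal user-space sketch of the two-counter idea, purely as an
illustration.  The struct and function names are hypothetical and are
not the kernel's scsi_target code; the lifetime counter stands in for
get_device/put_device, and the visibility counter holds one reference
per visible child plus one overall:

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical user-space model; not the kernel's struct scsi_target. */
struct target_model {
	int lifetime_refs;  /* stands in for the get_device/put_device count */
	int visible_refs;   /* one per visible child, plus one overall */
	bool visible;       /* false once the del event has fired */
	bool released;      /* true once the structure would be freed */
};

static void target_init(struct target_model *t)
{
	t->lifetime_refs = 1;  /* initial reference, dropped at del time */
	t->visible_refs = 1;   /* the "one overall" reference */
	t->visible = true;
	t->released = false;
}

static void target_put(struct target_model *t)
{
	assert(t->lifetime_refs > 0);
	if (--t->lifetime_refs == 0)
		t->released = true;
}

static void target_visible_put(struct target_model *t)
{
	assert(t->visible_refs > 0);
	if (--t->visible_refs == 0) {
		t->visible = false;  /* the del event is keyed on this counter */
		target_put(t);       /* drop the initial lifetime reference */
	}
}

/* A child device appears: it pins both visibility and lifetime. */
static void child_add(struct target_model *t)
{
	t->lifetime_refs++;
	t->visible_refs++;
}

/* A child is removed from visibility but may still be held open. */
static void child_del(struct target_model *t) { target_visible_put(t); }

/* The last reference to a child goes away. */
static void child_release(struct target_model *t) { target_put(t); }

/* Host removal drops the overall visibility reference. */
static void host_remove(struct target_model *t) { target_visible_put(t); }
```

In this model the target disappears from visibility as soon as the host
is removed and the last child is del'd, while the structure itself
survives until the last child is released.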

Alan Stern



* Re: [PATCH 25/20] sysfs: Only support removing emtpy sysfs directories.
  2009-05-27 17:01                                         ` Alan Stern
@ 2009-05-27 17:08                                           ` James Bottomley
  2009-05-27 18:07                                             ` Alan Stern
  0 siblings, 1 reply; 200+ messages in thread
From: James Bottomley @ 2009-05-27 17:08 UTC (permalink / raw)
  To: Alan Stern
  Cc: Hannes Reinecke, Kay Sievers, SCSI development list,
	Eric W. Biederman, Andrew Morton, Greg Kroah-Hartman,
	Kernel development list, Tejun Heo, Cornelia Huck, linux-fsdevel,
	Eric W. Biederman

On Wed, 2009-05-27 at 13:01 -0400, Alan Stern wrote:
> On Wed, 27 May 2009, James Bottomley wrote:
> 
> > > I don't understand all the subtle issues here.  In other contexts, the 
> > > solution would be to initialize a refcount to 1 when the target is 
> > > allocated, increment it when a device is added, and decrement it when a 
> > > device is removed or the host is removed.  When the refcount goes to 0, 
> > > the target is deleted.  Why wouldn't this kind of approach work?
> > 
> > Um, well that's exactly how it works (modulo the fact that there are
> > parts of the lifecycle where the ref count is zero, like scanning).
> 
> Why does that happen?  It's reasonable that there should be times 
> during scanning when the target doesn't have any children, but the 
> refcount should still be positive.

By refcount, I mean count of underlying devices.

> >  The
> > problem you're complaining about is that the device ref on the target
> > may take a long time to release, so we can't key the del event on the
> > refcount going to zero, which is what we do today.
> 
> Maybe we should be talking about two separate refcounts: a normal 
> get_device/put_device kref counter for the target's lifetime, and a 
> visibility counter (one for each child device and one overall) which 
> keys the del event and must go to 0 before the host removal finishes.

Um, well, that's roughly how I said we'd have to fix all of this in the
email to hannes ... it would be much easier if we could make a del'd
device visible, but now we have to have different behaviours depending
on whether the host is going away or not.

James




* Re: [PATCH 25/20] sysfs: Only support removing emtpy sysfs  directories.
  2009-05-27  2:17                               ` Alan Stern
  2009-05-27 11:35                                 ` Hannes Reinecke
@ 2009-05-27 18:00                                 ` Eric W. Biederman
  2009-05-27 18:15                                   ` Alan Stern
  1 sibling, 1 reply; 200+ messages in thread
From: Eric W. Biederman @ 2009-05-27 18:00 UTC (permalink / raw)
  To: Alan Stern
  Cc: Kay Sievers, James Bottomley, SCSI development list,
	Andrew Morton, Greg Kroah-Hartman, Kernel development list,
	Tejun Heo, Cornelia Huck, linux-fsdevel, Eric W. Biederman

Alan Stern <stern@rowland.harvard.edu> writes:

> On Wed, 27 May 2009, Kay Sievers wrote:
>
>> On Wed, May 27, 2009 at 01:49, James Bottomley
>> <James.Bottomley@hansenpartnership.com> wrote:
>> 
>> > OK ... perhaps we have to wait a little harder: try this; it waits until
>> > all the targets have disappeared from visibility via an event.
>> 
>> That seems to work fine here.
>
> It's good for a short-term fix.  For the longer term, I still think 
> it's a mistake to wait for the sdevs to be released before deleting the 
> target.  It gives user programs the ability to block the host-removal 
> thread indefinitely.

How can user programs block removal indefinitely today?

Eric


* Re: [PATCH 25/20] sysfs: Only support removing emtpy sysfs  directories.
  2009-05-27 17:08                                           ` James Bottomley
@ 2009-05-27 18:07                                             ` Alan Stern
  2009-05-27 19:44                                               ` James Bottomley
  0 siblings, 1 reply; 200+ messages in thread
From: Alan Stern @ 2009-05-27 18:07 UTC (permalink / raw)
  To: James Bottomley
  Cc: Hannes Reinecke, Kay Sievers, SCSI development list,
	Eric W. Biederman, Andrew Morton, Greg Kroah-Hartman,
	Kernel development list, Tejun Heo, Cornelia Huck, linux-fsdevel,
	Eric W. Biederman

On Wed, 27 May 2009, James Bottomley wrote:

> By refcount, I mean count of underlying devices.

Does that mean only registered devices, or does it include devices 
which are unregistered but not yet released?

> > >  The
> > > problem you're complaining about is that the device ref on the target
> > > may take a long time to release, so we can't key the del event on the
> > > refcount going to zero, which is what we do today.
> > 
> > Maybe we should be talking about two separate refcounts: a normal 
> > get_device/put_device kref counter for the target's lifetime, and a 
> > visibility counter (one for each child device and one overall) which 
> > keys the del event and must go to 0 before the host removal finishes.
> 
> Um, well, that's roughly how I said we'd have to fix all of this in the
> email to hannes ... it would be much easier if we could make a del'd
> device visible,

I don't follow.  Why would you want to delete a target before the host
is removed and then make it visible again later?  Because it doesn't
have any underlying devices at the moment but may gain some later on?

If that's the case, why not delete the target when there are no more
registered devices beneath it and then create a new target structure
when a new device appears?

> but now we have to have different behaviours depending
> on whether the host is going away or not.

Yes, one does get the feeling that we're going around in circles...

Okay.  So now I have made two proposals.  One is to delete targets and
create new ones as needed.  The other is to keep a target hanging
around, even if there are no underlying devices, until the host is
removed.  This can be implemented easily by making the counter
represent the number of registered devices plus one for the host.  

Alan Stern

P.S.: If the counter is made to refer to registered devices, as opposed
to un-released devices, would there ever be a situation where you want
to delete a target in a non-process context?



* Re: [PATCH 25/20] sysfs: Only support removing emtpy sysfs  directories.
  2009-05-27 18:00                                 ` Eric W. Biederman
@ 2009-05-27 18:15                                   ` Alan Stern
  2009-05-27 18:24                                     ` Eric W. Biederman
  0 siblings, 1 reply; 200+ messages in thread
From: Alan Stern @ 2009-05-27 18:15 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: Kay Sievers, James Bottomley, SCSI development list,
	Andrew Morton, Greg Kroah-Hartman, Kernel development list,
	Tejun Heo, Cornelia Huck, linux-fsdevel, Eric W. Biederman

On Wed, 27 May 2009, Eric W. Biederman wrote:

> Alan Stern <stern@rowland.harvard.edu> writes:
> 
> > On Wed, 27 May 2009, Kay Sievers wrote:
> >
> >> On Wed, May 27, 2009 at 01:49, James Bottomley
> >> <James.Bottomley@hansenpartnership.com> wrote:
> >> 
> >> > OK ... perhaps we have to wait a little harder: try this; it waits until
> >> > all the targets have disappeared from visibility via an event.
> >> 
> >> That seems to work fine here.
> >
> > It's good for a short-term fix.  For the longer term, I still think 
> > it's a mistake to wait for the sdevs to be released before deleting the 
> > target.  It gives user programs the ability to block the host-removal 
> > thread indefinitely.
> 
> How can user programs block removal indefinitely today?

As far as I know, they can't.  Instead, they can cause the SCSI layer 
to unregister a sysfs directory containing a child directory.  :-)

Basically, a user program can delay removal of the child (i.e., the
target) directory indefinitely, because currently the target isn't
unregistered when all its children are removed -- it's unregistered
when all its children are _released_.

Alan Stern



* Re: [PATCH 25/20] sysfs: Only support removing emtpy sysfs  directories.
  2009-05-27 18:15                                   ` Alan Stern
@ 2009-05-27 18:24                                     ` Eric W. Biederman
  2009-05-27 21:38                                       ` Alan Stern
  0 siblings, 1 reply; 200+ messages in thread
From: Eric W. Biederman @ 2009-05-27 18:24 UTC (permalink / raw)
  To: Alan Stern
  Cc: Kay Sievers, James Bottomley, SCSI development list,
	Andrew Morton, Greg Kroah-Hartman, Kernel development list,
	Tejun Heo, Cornelia Huck, linux-fsdevel, Eric W. Biederman

Alan Stern <stern@rowland.harvard.edu> writes:

>
> As far as I know, they can't.  Instead, they can cause the SCSI layer 
> to unregister a sysfs directory containing a child directory.  :-)
>
> Basically, a user program can delay removal of the child (i.e., the
> target) directory indefinitely, because currently the target isn't
> unregistered when all its children are removed -- it's unregistered
> when all its children are _released_.

Ok.  Is this opens of /dev/sda1 and the like that are being held open by
userspace that are potentially causing problems?

I think I have the fix to that...

Eric


* Re: [PATCH 25/20] sysfs: Only support removing emtpy sysfs directories.
  2009-05-27 18:07                                             ` Alan Stern
@ 2009-05-27 19:44                                               ` James Bottomley
  2009-05-27 20:40                                                 ` Alan Stern
  0 siblings, 1 reply; 200+ messages in thread
From: James Bottomley @ 2009-05-27 19:44 UTC (permalink / raw)
  To: Alan Stern
  Cc: Hannes Reinecke, Kay Sievers, SCSI development list,
	Eric W. Biederman, Andrew Morton, Greg Kroah-Hartman,
	Kernel development list, Tejun Heo, Cornelia Huck, linux-fsdevel,
	Eric W. Biederman

On Wed, 2009-05-27 at 14:07 -0400, Alan Stern wrote:
> On Wed, 27 May 2009, James Bottomley wrote:
> 
> > By refcount, I mean count of underlying devices.
> 
> Does that mean only registered devices, or does it include devices 
> which are unregistered but not yet released?

All devices ... a scsi_device has to have a target parent before it's
usable.

> > > >  The
> > > > problem you're complaining about is that the device ref on the target
> > > > may take a long time to release, so we can't key the del event on the
> > > > refcount going to zero, which is what we do today.
> > > 
> > > Maybe we should be talking about two separate refcounts: a normal 
> > > get_device/put_device kref counter for the target's lifetime, and a 
> > > visibility counter (one for each child device and one overall) which 
> > > keys the del event and must go to 0 before the host removal finishes.
> > 
> > Um, well, that's roughly how I said we'd have to fix all of this in the
> > email to hannes ... it would be much easier if we could make a del'd
> > device visible,
> 
> I don't follow.  Why would you want to delete a target before the host
> is removed and then make it visible again later?  Because it doesn't
> have any underlying devices at the moment but may gain some later on?

Perhaps I haven't made the problem clear enough.  You only want early
del if the host is going away, otherwise the target might be reused and
it can't be if you've called del on it.  So there needs to be an
integration into the host lifecycle in some form.

James




* Re: [PATCH 25/20] sysfs: Only support removing emtpy sysfs  directories.
  2009-05-27 19:44                                               ` James Bottomley
@ 2009-05-27 20:40                                                 ` Alan Stern
  2009-05-27 20:49                                                   ` James Bottomley
  0 siblings, 1 reply; 200+ messages in thread
From: Alan Stern @ 2009-05-27 20:40 UTC (permalink / raw)
  To: James Bottomley
  Cc: Hannes Reinecke, Kay Sievers, SCSI development list,
	Eric W. Biederman, Andrew Morton, Greg Kroah-Hartman,
	Kernel development list, Tejun Heo, Cornelia Huck, linux-fsdevel,
	Eric W. Biederman

On Wed, 27 May 2009, James Bottomley wrote:

> On Wed, 2009-05-27 at 14:07 -0400, Alan Stern wrote:
> > On Wed, 27 May 2009, James Bottomley wrote:
> > 
> > > By refcount, I mean count of underlying devices.
> > 
> > Does that mean only registered devices, or does it include devices 
> > which are unregistered but not yet released?
> 
> All devices ... a scsi_device has to have a target parent before it's
> usable.

I can't tell whether you understood my point.  After a scsi_device is
unregistered but before it is released -- i.e., when its state is
SDEV_DEL -- it _is_ essentially unusable.  So why wait until it is
released to decrement the target's device counter?  Why not do the
decrement in __scsi_remove_device()?


> > > Um, well, that's roughly how I said we'd have to fix all of this in the
> > > email to hannes ... it would be much easier if we could make a del'd
> > > device visible,
> > 
> > I don't follow.  Why would you want to delete a target before the host
> > is removed and then make it visible again later?  Because it doesn't
> > have any underlying devices at the moment but may gain some later on?
> 
> Perhaps I haven't made the problem clear enough.  You only want early
> del if the host is going away, otherwise the target might be reused and
> it can't be if you've called del on it.  So there needs to be an
> integration into the host lifecycle in some form.

Yes, granted.  That integration doesn't have to be complicated.  
Basically, you just decrement the counters in all the targets when
setting a host's state to SHOST_DEL or SHOST_DEL_RECOVERY.  At that 
point there's no reason to keep an unpopulated target around, right?

Up until that point, the counter's value should be one more than the
number of underlying sdevs.  So if the counter decrements to 0 then
there were no underlying sdevs and the target is deleted immediately;
otherwise it is deleted when the last remaining sdev is deleted.
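
The "one more than the number of sdevs" scheme can be sketched in
user-space C as follows.  This is only an illustration under stated
assumptions: the struct, the field name "count" and the helper names
are hypothetical, and "deleted" here means removed from visibility:

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of a target with a biased visibility counter. */
struct biased_target {
	int count;     /* number of registered sdevs + 1 while the host lives */
	bool deleted;  /* true once the target has been removed from visibility */
};

static void biased_init(struct biased_target *t)
{
	t->count = 1;  /* the extra reference held on behalf of the host */
	t->deleted = false;
}

static void biased_drop(struct biased_target *t)
{
	assert(t->count > 0);
	if (--t->count == 0)
		t->deleted = true;
}

/* An sdev is registered beneath the target. */
static void sdev_added(struct biased_target *t) { t->count++; }

/* An sdev is unregistered (deleted), whether or not it is released yet. */
static void sdev_deleted(struct biased_target *t) { biased_drop(t); }

/* The host enters its removal states: drop the extra reference once. */
static void host_going_away(struct biased_target *t) { biased_drop(t); }
```

While the host is alive the bias keeps an unpopulated target around so
it can be reused; once the host is going away, the target goes as soon
as its last sdev is deleted, or immediately if it has none.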

Alan Stern



* Re: [PATCH 25/20] sysfs: Only support removing emtpy sysfs directories.
  2009-05-27 20:40                                                 ` Alan Stern
@ 2009-05-27 20:49                                                   ` James Bottomley
  2009-05-27 21:31                                                     ` Alan Stern
  0 siblings, 1 reply; 200+ messages in thread
From: James Bottomley @ 2009-05-27 20:49 UTC (permalink / raw)
  To: Alan Stern
  Cc: Hannes Reinecke, Kay Sievers, SCSI development list,
	Eric W. Biederman, Andrew Morton, Greg Kroah-Hartman,
	Kernel development list, Tejun Heo, Cornelia Huck, linux-fsdevel,
	Eric W. Biederman

On Wed, 2009-05-27 at 16:40 -0400, Alan Stern wrote:
> On Wed, 27 May 2009, James Bottomley wrote:
> 
> > On Wed, 2009-05-27 at 14:07 -0400, Alan Stern wrote:
> > > On Wed, 27 May 2009, James Bottomley wrote:
> > > 
> > > > By refcount, I mean count of underlying devices.
> > > 
> > > Does that mean only registered devices, or does it include devices 
> > > which are unregistered but not yet released?
> > 
> > All devices ... a scsi_device has to have a target parent before it's
> > usable.
> 
> I can't tell whether you understood my point.  After a scsi_device is
> unregistered but before it is released -- i.e., when its state is
> SDEV_DEL -- it _is_ essentially unusable.  So why wait until it is
> released to decrement the target's device counter?  Why not do the
> decrement in __scsi_remove_device()?

Because the use model of the device still requires a valid target.  Even
though it gets gated in several places in SDEV_DEL, we still have use of
the target parent.  This is fixable, but only by a long audit of all the
sdev uses plus the enforcement of a no-use-of-target-in-DEL-state rule,
which adds complexity.

> > > > Um, well, that's roughly how I said we'd have to fix all of this in the
> > > > email to hannes ... it would be much easier if we could make a del'd
> > > > device visible,
> > > 
> > > I don't follow.  Why would you want to delete a target before the host
> > > is removed and then make it visible again later?  Because it doesn't
> > > have any underlying devices at the moment but may gain some later on?
> > 
> > Perhaps I haven't made the problem clear enough.  You only want early
> > del if the host is going away, otherwise the target might be reused and
> > it can't be if you've called del on it.  So there needs to be an
> > integration into the host lifecycle in some form.
> 
> Yes, granted.  That integration doesn't have to be complicated.  
> Basically, you just decrement the counters in all the targets when
> setting a host's state to SHOST_DEL or SHOST_DEL_RECOVERY.  At that 

And SHOST_CANCEL and SHOST_CANCEL_RECOVERY.

> point there's no reason to keep an unpopulated target around, right?

If the child list were empty, sure.  However, it's likely not going to
be at this point.

> Up until that point, the counter's value should be one more than the
> number of underlying sdevs.  So if the counter decrements to 0 then
> there were no underlying sdevs and the target is deleted immediately;
> otherwise it is deleted when the last remaining sdev is deleted.

No, that's the problem.  It can be removed from visibility if it has no
visible sdevs, but it can't be deleted until the last sdev is released.

James




* Re: [PATCH 25/20] sysfs: Only support removing emtpy sysfs  directories.
  2009-05-27 20:49                                                   ` James Bottomley
@ 2009-05-27 21:31                                                     ` Alan Stern
  2009-05-27 21:42                                                       ` James Bottomley
  0 siblings, 1 reply; 200+ messages in thread
From: Alan Stern @ 2009-05-27 21:31 UTC (permalink / raw)
  To: James Bottomley
  Cc: Hannes Reinecke, Kay Sievers, SCSI development list,
	Eric W. Biederman, Andrew Morton, Greg Kroah-Hartman,
	Kernel development list, Tejun Heo, Cornelia Huck, linux-fsdevel,
	Eric W. Biederman

On Wed, 27 May 2009, James Bottomley wrote:

> > I can't tell whether you understood my point.  After a scsi_device is
> > unregistered but before it is released -- i.e., when its state is
> > SDEV_DEL -- it _is_ essentially unusable.  So why wait until it is
> > released to decrement the target's device counter?  Why not do the
> > decrement in __scsi_remove_device()?
> 
> Because the use model of the device still requires a valid target.  Even
> though it gets gated in several places in SDEV_DEL, we still have use of
> the target parent.  This is fixable, but only by a long audit of all the
> sdev uses plus the enforcement of a no-use-of-target-in-DEL-state rule,
> which adds complexity.

You're failing to distinguish properly between "delete" and "release".  
A target (or device in general) is deleted when it is removed from
visibility -- i.e., when device_del() is called.  It is released when
the final put_device() call occurs and the data structure is
deallocated.

So, all I'm saying is there's nothing wrong with deleting a target
when all its children are deleted, provided the target isn't released
until all the children are released.  Below you say the same thing.


> > > Perhaps I haven't made the problem clear enough.  You only want early
> > > del if the host is going away, otherwise the target might be reused and
> > > it can't be if you've called del on it.  So there needs to be an
> > > integration into the host lifecycle in some form.
> > 
> > Yes, granted.  That integration doesn't have to be complicated.  
> > Basically, you just decrement the counters in all the targets when
> > setting a host's state to SHOST_DEL or SHOST_DEL_RECOVERY.  At that 
> 
> And SHOST_CANCEL and SHOST_CANCEL_RECOVERY.

If you prefer.  I thought SHOST_DEL would be more appropriate because
it occurs after scsi_forget_host() is called.  All those transitions
occur in scsi_remove_host(), anyway.

> > point there's no reason to keep an unpopulated target around, right?
> 
> If the child list were empty, sure.  However, it's likely not going to
> be at this point.

Regardless, it will work either way.

> > Up until that point, the counter's value should be one more than the
> > number of underlying sdevs.  So if the counter decrements to 0 then
> > there were no underlying sdevs and the target is deleted immediately;
> > otherwise it is deleted when the last remaining sdev is deleted.
> 
> No, that's the problem.  It can be removed from visibility if it has no
> visible sdevs, but it can't be deleted until the last sdev is released.

Allow me to rephrase this: A target can be removed from visibility if 
it has no visible sdevs, but it can't be _released_ until the last sdev 
is released.

That's fine.  You remove a target from visibility when target->reap_ref
becomes 0.  The target isn't released until the target's embedded
struct device's refcount becomes 0.  To make this work, simply have
scsi_alloc_sdev() call

	get_device(&starget->dev);

and have scsi_device_dev_release_usercontext() call

	put_device(&starget->dev);

Doesn't that do exactly what you're asking for?

Alan Stern



* Re: [PATCH 25/20] sysfs: Only support removing emtpy sysfs  directories.
  2009-05-27 18:24                                     ` Eric W. Biederman
@ 2009-05-27 21:38                                       ` Alan Stern
  2009-05-27 22:06                                         ` Eric W. Biederman
  0 siblings, 1 reply; 200+ messages in thread
From: Alan Stern @ 2009-05-27 21:38 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: Kay Sievers, James Bottomley, SCSI development list,
	Andrew Morton, Greg Kroah-Hartman, Kernel development list,
	Tejun Heo, Cornelia Huck, linux-fsdevel, Eric W. Biederman

On Wed, 27 May 2009, Eric W. Biederman wrote:

> Alan Stern <stern@rowland.harvard.edu> writes:
> 
> >
> > As far as I know, they can't.  Instead, they can cause the SCSI layer 
> > to unregister a sysfs directory containing a child directory.  :-)
> >
> > Basically, a user program can delay removal of the child (i.e., the
> > target) directory indefinitely, because currently the target isn't
> > unregistered when all its children are removed -- it's unregistered
> > when all its children are _released_.
> 
> Ok.  Is this opens of /dev/sda1 and the like that are being held open by
> userspace that are potentially causing problems?

Yes, plus any other mechanism for preventing a struct device's refcount 
from going to 0.

> I think I have the fix to that...

The fix is to delete the target when its children are deleted, and not
wait until the children are released.

Alan Stern



* Re: [PATCH 25/20] sysfs: Only support removing emtpy sysfs directories.
  2009-05-27 21:31                                                     ` Alan Stern
@ 2009-05-27 21:42                                                       ` James Bottomley
  2009-05-27 22:15                                                         ` Alan Stern
  0 siblings, 1 reply; 200+ messages in thread
From: James Bottomley @ 2009-05-27 21:42 UTC (permalink / raw)
  To: Alan Stern
  Cc: Hannes Reinecke, Kay Sievers, SCSI development list,
	Eric W. Biederman, Andrew Morton, Greg Kroah-Hartman,
	Kernel development list, Tejun Heo, Cornelia Huck, linux-fsdevel,
	Eric W. Biederman

On Wed, 2009-05-27 at 17:31 -0400, Alan Stern wrote:
> On Wed, 27 May 2009, James Bottomley wrote:
> 
> > > I can't tell whether you understood my point.  After a scsi_device is
> > > unregistered but before it is released -- i.e., when its state is
> > > SDEV_DEL -- it _is_ essentially unusable.  So why wait until it is
> > > released to decrement the target's device counter?  Why not do the
> > > decrement in __scsi_remove_device()?
> > 
> > Because the use model of the device still requires a valid target.  Even
> > though it gets gated in several places in SDEV_DEL, we still have use of
> > the target parent.  This is fixable, but only by a long audit of all the
> > sdev uses plus the enforcement of a no-use-of-target-in-DEL-state rule,
> > which adds complexity.
> 
> You're failing to distinguish properly between "delete" and "release".  
> A target (or device in general) is deleted when it is removed from
> visibility -- i.e., when device_del() is called.  It is released when
> the final put_device() call occurs and the data structure is
> deallocated.

I find the terms delete and release too close for comfort, which is why
I've always been careful to say remove from visibility.

> So, all I'm saying is there's nothing wrong with deleting a target
> when all its children are deleted, provided the target isn't released
> until all the children are released.  Below you say the same thing.
> 
> 
> > > > Perhaps I haven't made the problem clear enough.  You only want early
> > > > del if the host is going away, otherwise the target might be reused and
> > > > it can't be if you've called del on it.  So there needs to be an
> > > > integration into the host lifecycle in some form.
> > > 
> > > Yes, granted.  That integration doesn't have to be complicated.  
> > > Basically, you just decrement the counters in all the targets when
> > > setting a host's state to SHOST_DEL or SHOST_DEL_RECOVERY.  At that 
> > 
> > And SHOST_CANCEL and SHOST_CANCEL_RECOVERY.
> 
> If you prefer.  I thought SHOST_DEL would be more appropriate because
> it occurs after scsi_forget_host() is called.  All those transitions
> occur in scsi_remove_host(), anyway.

I mean in all four states.

> > > point there's no reason to keep an unpopulated target around, right?
> > 
> > If the child list were empty, sure.  However, it's likely not going to
> > be at this point.
> 
> Regardless, it will work either way.
> 
> > > Up until that point, the counter's value should be one more than the
> > > number of underlying sdevs.  So if the counter decrements to 0 then
> > > there were no underlying sdevs and the target is deleted immediately;
> > > otherwise it is deleted when the last remaining sdev is deleted.
> > 
> > No, that's the problem.  It can be removed from visibility if it has no
> > visible sdevs, but it can't be deleted until the last sdev is released.
> 
> Allow me to rephrase this: A target can be removed from visibility if 
> it has no visible sdevs, but it can't be _released_ until the last sdev 
> is released.
> 
> That's fine.  You remove a target from visibility when target->reap_ref
> becomes 0.  The target isn't released until the target's embedded
> struct device's refcount becomes 0.  To make this work, simply have
> scsi_alloc_sdev() call
> 
> 	get_device(&starget->dev);
> 
> and have scsi_device_dev_release_usercontext() call
> 
> 	put_device(&starget->dev);
> 
> Doesn't that do exactly what you're asking for?

That's, um, what we do today ... the addition has to be to the visibility
management.

James




* Re: [PATCH 25/20] sysfs: Only support removing emtpy sysfs  directories.
  2009-05-27 21:38                                       ` Alan Stern
@ 2009-05-27 22:06                                         ` Eric W. Biederman
  2009-05-27 22:18                                           ` Alan Stern
  0 siblings, 1 reply; 200+ messages in thread
From: Eric W. Biederman @ 2009-05-27 22:06 UTC (permalink / raw)
  To: Alan Stern
  Cc: Kay Sievers, James Bottomley, SCSI development list,
	Andrew Morton, Greg Kroah-Hartman, Kernel development list,
	Tejun Heo, Cornelia Huck, linux-fsdevel, Eric W. Biederman

Alan Stern <stern@rowland.harvard.edu> writes:

> On Wed, 27 May 2009, Eric W. Biederman wrote:
>
>> Alan Stern <stern@rowland.harvard.edu> writes:
>> 
>> >
>> > As far as I know, they can't.  Instead, they can cause the SCSI layer 
>> > to unregister a sysfs directory containing a child directory.  :-)
>> >
>> > Basically, a user program can delay removal of the child (i.e., the
>> > target) directory indefinitely, because currently the target isn't
>> > unregistered when all its children are removed -- it's unregistered
>> > when all its children are _released_.
>> 
>> Ok.  Is it opens of /dev/sda1 and the like, held by userspace, that are
>> potentially causing problems?
>
> Yes, plus any other mechanism for preventing a struct device's refcount 
> from going to 0.

Thanks.  The discussion makes sense now.

>> I think I have the fix to that...
>
> The fix is to delete the target when its children are deleted, and not
> wait until the children are released.

I think I can do both.

I am currently working on a patchset which at the VFS layer disconnects a
fd from an underlying device.  It does the necessary use count tracking
and when the disconnect is done it returns.  As part of the disconnect it
calls the release method.
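
A minimal user-space sketch of the use-count tracking described above,
assuming a hypothetical struct handle in place of an open file (none of
these names come from the patchset, and pthread primitives stand in for
kernel locking).  New operations are refused once the handle is
disconnected, the disconnect waits for in-flight operations to drain,
and only then is the release method called:

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical handle standing in for an open file bound to a device. */
struct handle {
	pthread_mutex_t lock;
	pthread_cond_t  idle;         /* signalled when active drops to 0 */
	int             active;       /* operations currently inside the device */
	bool            disconnected; /* set once, never cleared */
	void          (*release)(struct handle *h);
	int             released;     /* how many times release() ran */
};

static void handle_init(struct handle *h, void (*release)(struct handle *))
{
	pthread_mutex_init(&h->lock, NULL);
	pthread_cond_init(&h->idle, NULL);
	h->active = 0;
	h->disconnected = false;
	h->release = release;
	h->released = 0;
}

/* Enter the device; fails once the handle has been disconnected. */
static bool handle_enter(struct handle *h)
{
	bool ok;

	pthread_mutex_lock(&h->lock);
	ok = !h->disconnected;
	if (ok)
		h->active++;
	pthread_mutex_unlock(&h->lock);
	return ok;
}

/* Leave the device; wakes a waiting disconnect when the last user exits. */
static void handle_exit(struct handle *h)
{
	pthread_mutex_lock(&h->lock);
	assert(h->active > 0);
	if (--h->active == 0)
		pthread_cond_broadcast(&h->idle);
	pthread_mutex_unlock(&h->lock);
}

/* Block new users, wait for current ones to drain, then release once. */
static void handle_disconnect(struct handle *h)
{
	bool first;

	pthread_mutex_lock(&h->lock);
	first = !h->disconnected;
	h->disconnected = true;
	while (h->active > 0)
		pthread_cond_wait(&h->idle, &h->lock);
	pthread_mutex_unlock(&h->lock);

	if (first)
		h->release(h);        /* called outside the lock */
}

/* Example release method: just counts invocations. */
static void count_release(struct handle *h)
{
	h->released++;
}
```

In this sketch a second disconnect returns after the users have drained
but does not wait for the first caller's release to finish; a real
implementation would have to decide whether that matters.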

We already do this in at least sysfs, proc, sysctl, and sound.  So I
figure it is time to move this into some generic code so we don't need
to duplicate the bugs and the insanities.

Once merged it would take just a few lines of code to use this functionality.

Eric

^ permalink raw reply	[flat|nested] 200+ messages in thread

* Re: [PATCH 25/20] sysfs: Only support removing emtpy sysfs  directories.
  2009-05-27 21:42                                                       ` James Bottomley
@ 2009-05-27 22:15                                                         ` Alan Stern
  2009-05-27 22:22                                                           ` James Bottomley
  0 siblings, 1 reply; 200+ messages in thread
From: Alan Stern @ 2009-05-27 22:15 UTC (permalink / raw)
  To: James Bottomley
  Cc: Hannes Reinecke, Kay Sievers, SCSI development list,
	Eric W. Biederman, Andrew Morton, Greg Kroah-Hartman,
	Kernel development list, Tejun Heo, Cornelia Huck, linux-fsdevel,
	Eric W. Biederman

On Wed, 27 May 2009, James Bottomley wrote:

> I find the terms delete and release too close for comfort, which is why
> I've always been careful to say remove from visibility.

Okay; I'll use your terms when conversing with you.  :-)

> > That's fine.  You remove a target from visibility when target->reap_ref
> > becomes 0.  The target isn't released until the target's embedded
> > struct device's refcount becomes 0.  To make this work, simply have
> > scsi_alloc_sdev() call
> > 
> > 	get_device(&starget->dev);
> > 
> > and have scsi_device_dev_release_usercontext() call
> > 
> > 	put_device(&starget->dev);
> > 
> > Doesn't that do exactly what you're asking for?
> 
> > That's um what we do today ... the addition has to be to the visibility
> management.

That's what I was trying to accomplish in the patch you said was wrong.  
It moved the call to scsi_target_reap() from
scsi_device_dev_release_usercontext() into __scsi_remove_device().  
That is, the target's count of underlying sdevs was to be decremented
whenever an sdev was removed from visibility, not when the sdev was
released.

That's how the problem should be solved.  But the details need to be 
correct, and I don't understand how they all work (as you noticed when 
reading the patch).
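
As a rough illustration of the two-counter model under discussion (a
user-space sketch with hypothetical struct target and struct sdev, not
the real SCSI structures or the patch itself): reap_ref drops when a
child is removed from visibility, while the lifetime count drops only
when the child is released.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a SCSI target: two independent counters. */
struct target {
	int  reap_ref;   /* children not yet removed from visibility */
	int  refcount;   /* lifetime references (the embedded device's count) */
	bool visible;
	bool released;
};

/* Hypothetical stand-in for a SCSI device hanging off a target. */
struct sdev {
	struct target *parent;
	bool visible;
};

static void target_init(struct target *t)
{
	t->reap_ref = 0;
	t->refcount = 1;      /* creator's reference, dropped when hidden */
	t->visible = true;
	t->released = false;
}

/* Drop one lifetime reference; the target is released at zero. */
static void target_put(struct target *t)
{
	assert(t->refcount > 0);
	if (--t->refcount == 0)
		t->released = true;
}

/* Allocation takes both a visibility count and a lifetime reference. */
static void sdev_alloc(struct sdev *s, struct target *t)
{
	s->parent = t;
	s->visible = true;
	t->reap_ref++;
	t->refcount++;
}

/* Removal from visibility: this is where the reap count drops. */
static void sdev_remove(struct sdev *s)
{
	struct target *t = s->parent;

	if (!s->visible)
		return;
	s->visible = false;
	if (--t->reap_ref == 0) {
		t->visible = false;   /* hide the target ... */
		target_put(t);        /* ... and drop the creator's reference */
	}
}

/* Release: only the lifetime reference is dropped here. */
static void sdev_release(struct sdev *s)
{
	target_put(s->parent);
}
```

With two children, the target becomes invisible as soon as both have
been removed, but it is not released until both children have also been
released.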

Alan Stern


^ permalink raw reply	[flat|nested] 200+ messages in thread

* Re: [PATCH 25/20] sysfs: Only support removing emtpy sysfs  directories.
  2009-05-27 22:06                                         ` Eric W. Biederman
@ 2009-05-27 22:18                                           ` Alan Stern
  0 siblings, 0 replies; 200+ messages in thread
From: Alan Stern @ 2009-05-27 22:18 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: Kay Sievers, James Bottomley, SCSI development list,
	Andrew Morton, Greg Kroah-Hartman, Kernel development list,
	Tejun Heo, Cornelia Huck, linux-fsdevel, Eric W. Biederman

On Wed, 27 May 2009, Eric W. Biederman wrote:

> >> I think I have the fix to that...
> >
> > The fix is to delete the target when its children are deleted, and not
> > wait until the children are released.
> 
> I think I can do both.
> 
> I am currently working on a patchset which at the VFS layer disconnects a
> fd from an underlying device.  It does the necessary use count tracking
> and when the disconnect is done it returns.  As part of the disconnect it
> calls the release method.
> 
> We already do this in at least sysfs, proc, sysctl, and sound.  So I
> figure it is time to move this into some generic code so we don't need
> to duplicate the bugs and the insanities.
> 
> Once merged it would take just a few lines of code to use this functionality.

That's good, but it isn't enough to solve this problem.  Even though
user programs might not be able to pin the device by holding the file
open any more, other mechanisms (internal to the kernel) could have the
same effect.  Only a short delay is needed.

The underlying cause should be fixed properly.

Alan Stern

^ permalink raw reply	[flat|nested] 200+ messages in thread

* Re: [PATCH 25/20] sysfs: Only support removing emtpy sysfs directories.
  2009-05-27 22:15                                                         ` Alan Stern
@ 2009-05-27 22:22                                                           ` James Bottomley
  2009-05-28 15:24                                                             ` Alan Stern
  0 siblings, 1 reply; 200+ messages in thread
From: James Bottomley @ 2009-05-27 22:22 UTC (permalink / raw)
  To: Alan Stern
  Cc: Hannes Reinecke, Kay Sievers, SCSI development list,
	Eric W. Biederman, Andrew Morton, Greg Kroah-Hartman,
	Kernel development list, Tejun Heo, Cornelia Huck, linux-fsdevel,
	Eric W. Biederman

On Wed, 2009-05-27 at 18:15 -0400, Alan Stern wrote:
> On Wed, 27 May 2009, James Bottomley wrote:
> 
> > I find the terms delete and release too close for comfort, which is why
> > I've always been careful to say remove from visibility.
> 
> Okay; I'll use your terms when conversing with you.  :-)
> 
> > > That's fine.  You remove a target from visibility when target->reap_ref
> > > becomes 0.  The target isn't released until the target's embedded
> > > struct device's refcount becomes 0.  To make this work, simply have
> > > scsi_alloc_sdev() call
> > > 
> > > 	get_device(&starget->dev);
> > > 
> > > and have scsi_device_dev_release_usercontext() call
> > > 
> > > 	put_device(&starget->dev);
> > > 
> > > Doesn't that do exactly what you're asking for?
> > 
> > > That's um what we do today ... the addition has to be to the visibility
> > management.
> 
> That's what I was trying to accomplish in the patch you said was wrong.  
> It moved the call to scsi_target_reap() from
> scsi_device_dev_release_usercontext() into __scsi_remove_device().  
> That is, the target's count of underlying sdevs was to be decremented
> whenever an sdev was removed from visibility, not when the sdev was
> released.
> 
> That's how the problem should be solved.  But the details need to be 
> correct, and I don't understand how they all work (as you noticed when 
> reading the patch).

Right, and I think reap_ref can be seconded to count underlying device
visibility.  However, the piece that's missing, is the fact that all of
this has to be tied into the host state.  If the host is running, you
can't remove the target from visibility even if all its children are
invisible because it might get another visible child added.  Once it goes
into the cancel or del states, it can't acquire new children, so then
it's safe to make a target with no visible children invisible.
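
The rule above can be written as a small predicate.  This is only a
sketch: the enum is a simplified, hypothetical stand-in for the host
state machine, and target_may_hide() is an illustrative name, not an
existing function.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, simplified host states; the real state set is larger. */
enum host_state { HOST_RUNNING, HOST_CANCEL, HOST_DEL };

/*
 * A target may be removed from visibility only when it has no visible
 * children AND the host can no longer add new ones.
 */
static bool target_may_hide(enum host_state state, int visible_children)
{
	if (visible_children > 0)
		return false;
	/* A running host could still add a child to this target. */
	return state == HOST_CANCEL || state == HOST_DEL;
}
```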

James



^ permalink raw reply	[flat|nested] 200+ messages in thread

* Re: [PATCH 01/20] sysfs: Implement sysfs_rename_link
  2009-05-21  0:27 ` [PATCH 01/20] sysfs: Implement sysfs_rename_link Eric W. Biederman
                     ` (2 preceding siblings ...)
  2009-05-21  5:35   ` Tejun Heo
@ 2009-05-28  0:14   ` Greg KH
  2009-05-28  0:30     ` Kay Sievers
  3 siblings, 1 reply; 200+ messages in thread
From: Greg KH @ 2009-05-28  0:14 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: Andrew Morton, Greg Kroah-Hartman, linux-kernel, Tejun Heo,
	Cornelia Huck, linux-fsdevel, Benjamin Thery, Daniel Lezcano,
	Eric W. Biederman


So, there's been a lot of talk in this thread.

Eric, do you have an updated set of patches for me to try out?

Or are there still problems, like the "fry the ext3 boot partition" that
Kay found?

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 200+ messages in thread

* Re: [PATCH 01/20] sysfs: Implement sysfs_rename_link
  2009-05-28  0:14   ` Greg KH
@ 2009-05-28  0:30     ` Kay Sievers
  2009-05-28  0:37       ` Greg KH
  2009-05-28  1:51       ` [PATCH 01/20] sysfs: Implement sysfs_rename_link Eric W. Biederman
  0 siblings, 2 replies; 200+ messages in thread
From: Kay Sievers @ 2009-05-28  0:30 UTC (permalink / raw)
  To: Greg KH
  Cc: Eric W. Biederman, Andrew Morton, Greg Kroah-Hartman,
	linux-kernel, Tejun Heo, Cornelia Huck, linux-fsdevel,
	Benjamin Thery, Daniel Lezcano, Eric W. Biederman

On Thu, May 28, 2009 at 02:14, Greg KH <greg@kroah.com> wrote:
>
> So, there's been a lot of talk in this thread.
>
> Eric, do you have an updated set of patches for me to try out?

I think we should get a version of the patch in that removes all files
in a directory on cleanup, but warn if a subdirectory is still there.
James has a patch to fix the one issue we've seen so far with existing
child directories.

After that, we should work on fixing the users that leave files
behind, and can possibly stop cleaning up files, if we want to.

> Or are there still problems, like the "fry the ext3 boot partition" that
> Kay found?

That is unrelated to Eric's patches.  They just added the dump, which I
tried to trigger. It's not entirely clear what caused the filesystem
damage, but I was definitely able to reproduce the unclean shutdown
without any of Eric's sysfs patches.

Thanks,
Kay

^ permalink raw reply	[flat|nested] 200+ messages in thread

* Re: [PATCH 01/20] sysfs: Implement sysfs_rename_link
  2009-05-28  0:30     ` Kay Sievers
@ 2009-05-28  0:37       ` Greg KH
  2009-05-28 22:58         ` [PATCH 0/24] sysfs cleanups Eric W. Biederman
  2009-05-28  1:51       ` [PATCH 01/20] sysfs: Implement sysfs_rename_link Eric W. Biederman
  1 sibling, 1 reply; 200+ messages in thread
From: Greg KH @ 2009-05-28  0:37 UTC (permalink / raw)
  To: Kay Sievers
  Cc: Greg KH, Eric W. Biederman, Andrew Morton, linux-kernel,
	Tejun Heo, Cornelia Huck, linux-fsdevel, Benjamin Thery,
	Daniel Lezcano, Eric W. Biederman

On Thu, May 28, 2009 at 02:30:15AM +0200, Kay Sievers wrote:
> On Thu, May 28, 2009 at 02:14, Greg KH <greg@kroah.com> wrote:
> >
> > So, there's been a lot of talk in this thread.
> >
> > Eric, do you have an updated set of patches for me to try out?
> 
> I think we should get a version of the patch in that removes all files
> in a directory on cleanup, but warn if a subdirectory is still there.
> James has a patch to fix the one issue we've seen so far with existing
> child directories.
> 
> After that, we should work on fixing the users that leave files
> behind, and can possibly stop cleaning up files, if we want to.

That would be good.

But note that we always "allowed" such things to happen, so odds are
there are lots of places in the kernel that took advantage of this.  If
we make it a rule, people will complain.

> > Or are there still problems, like the "fry the ext3 boot partition" that
> > Kay found?
> 
> That is unrelated to Eric's patches.  They just added the dump, which I
> tried to trigger. It's not entirely clear what caused the filesystem
> damage, but I was definitely able to reproduce the unclean shutdown
> without any of Eric's sysfs patches.

Ok.

But I think Eric had some updates to some of the patches along the way,
so a whole new respin would be good to ensure I get the correct ones.

Eric?

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 200+ messages in thread

* Re: [PATCH 01/20] sysfs: Implement sysfs_rename_link
  2009-05-28  0:30     ` Kay Sievers
  2009-05-28  0:37       ` Greg KH
@ 2009-05-28  1:51       ` Eric W. Biederman
  1 sibling, 0 replies; 200+ messages in thread
From: Eric W. Biederman @ 2009-05-28  1:51 UTC (permalink / raw)
  To: Kay Sievers
  Cc: Greg KH, Andrew Morton, Greg Kroah-Hartman, linux-kernel,
	Tejun Heo, Cornelia Huck, linux-fsdevel, Benjamin Thery,
	Daniel Lezcano, Eric W. Biederman

Kay Sievers <kay.sievers@vrfy.org> writes:

> On Thu, May 28, 2009 at 02:14, Greg KH <greg@kroah.com> wrote:
>>
>> So, there's been a lot of talk in this thread.
>>
>> Eric, do you have an updated set of patches for me to try out?
>
> I think we should get a version of the patch in that removes all files
> in a directory on cleanup, but warn if a subdirectory is still there.
> James has a patch to fix the one issue we've seen so far with existing
> child directories.
>
> After that, we should work on fixing the users that leave files
> behind, and can possibly stop cleaning up files, if we want to.

Agreed.  We are not yet ready to only remove empty directories from sysfs.
So fixing that problem with the current sysfs should be done in a
separate patchset after the other fixes.

I will look at putting that together.

Eric

^ permalink raw reply	[flat|nested] 200+ messages in thread

* Re: [PATCH 25/20] sysfs: Only support removing emtpy sysfs  directories.
  2009-05-27 22:22                                                           ` James Bottomley
@ 2009-05-28 15:24                                                             ` Alan Stern
  2009-05-28 15:45                                                               ` Eric W. Biederman
  2009-05-28 18:21                                                               ` James Bottomley
  0 siblings, 2 replies; 200+ messages in thread
From: Alan Stern @ 2009-05-28 15:24 UTC (permalink / raw)
  To: James Bottomley
  Cc: Hannes Reinecke, Kay Sievers, SCSI development list,
	Eric W. Biederman, Andrew Morton, Greg Kroah-Hartman,
	Kernel development list, Tejun Heo, Cornelia Huck, linux-fsdevel,
	Eric W. Biederman

On Wed, 27 May 2009, James Bottomley wrote:

> Right, and I think reap_ref can be seconded to count underlying device
> visibility.

Exactly.  It should count the number of underlying devices that have 
not yet been removed from visibility (this may include some which still 
have to become visible), plus one if we want to keep the target hanging 
around for a while with no visible children (while scanning it, for 
example).

>  However, the piece that's missing, is the fact that all of
> this has to be tied into the host state.  If the host is running, you
> can't remove the target from visibility even if all its children are
> invisible because it might get another visible child added.

Are you sure about that?  It's not obvious at all to me.

For example, suppose during scanning it turns out there are no LUNs at
a particular target address.  Why should the empty target be retained?  
You'd end up with unusable targets at all possible bus addresses.

Besides, if a target is removed from visibility and then another child
is added, the answer is simply to create a new target structure.  
There's already code in scsi_alloc_target() to do this.

>  once it goes
> into the cancel or del states, it can't acquire new children, so then
> it's safe to make a target with no visible children invisible.

If you grant my point above, targets don't need to be tied into the
host state.  They can be removed from visibility whenever the reap_ref
counter goes to 0.  This will happen naturally while the host is in 
the CANCEL state, thanks to scsi_forget_host().

There's another point to consider.  If you do accept my argument that 
empty targets can be removed from visibility regardless of the host's 
state, then this removal races with addition of a new child.  Since 
removal involves calling device_del(), it can't be protected by the 
host lock.  Instead we'd have to use a mutex to protect both target 
addition and target removal.

Since the host's scan_mutex already protects target addition, extending 
its scope to encompass target removal (and perhaps sdev removal too) 
seems natural.  Do you agree?

Alan Stern


^ permalink raw reply	[flat|nested] 200+ messages in thread

* Re: [PATCH 25/20] sysfs: Only support removing emtpy sysfs  directories.
  2009-05-28 15:24                                                             ` Alan Stern
@ 2009-05-28 15:45                                                               ` Eric W. Biederman
  2009-05-28 17:51                                                                 ` Alan Stern
  2009-05-28 18:21                                                               ` James Bottomley
  1 sibling, 1 reply; 200+ messages in thread
From: Eric W. Biederman @ 2009-05-28 15:45 UTC (permalink / raw)
  To: Alan Stern
  Cc: James Bottomley, Hannes Reinecke, Kay Sievers,
	SCSI development list, Andrew Morton, Greg Kroah-Hartman,
	Kernel development list, Tejun Heo, Cornelia Huck, linux-fsdevel,
	Eric W. Biederman

Alan Stern <stern@rowland.harvard.edu> writes:

> There's another point to consider.  If you do accept my argument that 
> empty targets can be removed from visibility regardless of the host's 
> state, then this removal races with addition of a new child.  Since 
> removal involves calling device_del(), it can't be protected by the 
> host lock.  Instead we'd have to use a mutex to protect both target 
> addition and target removal.

Careful.  Holding a lock over device_del is an easy and hidden way
to trigger rare deadlocks.

Eric

^ permalink raw reply	[flat|nested] 200+ messages in thread

* Re: [PATCH 25/20] sysfs: Only support removing emtpy sysfs  directories.
  2009-05-28 15:45                                                               ` Eric W. Biederman
@ 2009-05-28 17:51                                                                 ` Alan Stern
  0 siblings, 0 replies; 200+ messages in thread
From: Alan Stern @ 2009-05-28 17:51 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: James Bottomley, Hannes Reinecke, Kay Sievers,
	SCSI development list, Andrew Morton, Greg Kroah-Hartman,
	Kernel development list, Tejun Heo, Cornelia Huck, linux-fsdevel,
	Eric W. Biederman

On Thu, 28 May 2009, Eric W. Biederman wrote:

> Alan Stern <stern@rowland.harvard.edu> writes:
> 
> > There's another point to consider.  If you do accept my argument that 
> > empty targets can be removed from visibility regardless of the host's 
> > state, then this removal races with addition of a new child.  Since 
> > removal involves calling device_del(), it can't be protected by the 
> > host lock.  Instead we'd have to use a mutex to protect both target 
> > addition and target removal.
> 
> Careful.  Holding a lock over device_del is an easy and hidden way
> to trigger rare deadlocks.

Your point is well taken.  In addition, I don't really like the idea of 
forcing device removal to wait for some other device to be added.

I'll work around the problem somehow...  A short polling loop should do 
the job.

Alan Stern


^ permalink raw reply	[flat|nested] 200+ messages in thread

* Re: [PATCH 25/20] sysfs: Only support removing emtpy sysfs directories.
  2009-05-28 15:24                                                             ` Alan Stern
  2009-05-28 15:45                                                               ` Eric W. Biederman
@ 2009-05-28 18:21                                                               ` James Bottomley
  2009-05-28 20:02                                                                 ` Alan Stern
  1 sibling, 1 reply; 200+ messages in thread
From: James Bottomley @ 2009-05-28 18:21 UTC (permalink / raw)
  To: Alan Stern
  Cc: Hannes Reinecke, Kay Sievers, SCSI development list,
	Eric W. Biederman, Andrew Morton, Greg Kroah-Hartman,
	Kernel development list, Tejun Heo, Cornelia Huck, linux-fsdevel,
	Eric W. Biederman

On Thu, 2009-05-28 at 11:24 -0400, Alan Stern wrote:
> On Wed, 27 May 2009, James Bottomley wrote:
> 
> > Right, and I think reap_ref can be seconded to count underlying device
> > visibility.
> 
> Exactly.  It should count the number of underlying devices that have 
> not yet been removed from visibility (this may include some which still 
> have to become visible), plus one if we want to keep the target hanging 
> around for a while with no visible children (while scanning it, for 
> example).
> 
> >  However, the piece that's missing, is the fact that all of
> > this has to be tied into the host state.  If the host is running, you
> > can't remove the target from visibility even if all its children are
> > invisible because it might get another visible child added.
> 
> Are you sure about that?  It's not obvious at all to me.

Yes ... otherwise you have to elongate the DEL interval from a few ms to
potentially anything.  That would allow locking a target in a dying
state and prevent any new LUNs being added.

> For example, suppose during scanning it turns out there are no LUNs at
> a particular target address.  Why should the empty target be retained?  
> You'd end up with unusable targets at all possible bus addresses.
> 
> Besides, if a target is removed from visibility and then another child
> is added, the answer is simply to create a new target structure.  
> There's already code in scsi_alloc_target() to do this.

As I've said several times, this could be done, but we'd have to audit
the code paths to make sure we allow for multiple same targets in the
list.

> >  once it goes
> > into the cancel or del states, it can't acquire new children, so then
> > it's safe to make a target with no visible children invisible.
> 
> If you grant my point above, targets don't need to be tied into the
> host state.  They can be removed from visibility whenever the reap_ref
> counter goes to 0.  This will happen naturally while the host is in 
> the CANCEL state, thanks to scsi_forget_host().
> 
> There's another point to consider.  If you do accept my argument that 
> empty targets can be removed from visibility regardless of the host's 
> state, then this removal races with addition of a new child.  Since 
> removal involves calling device_del(), it can't be protected by the 
> host lock.  Instead we'd have to use a mutex to protect both target 
> addition and target removal.

No, this is state model 101 ... you alter the state inside the lock and
call del outside of it.  Technically you're lying about the state for
the few us it takes to run out of the lock and del the target, but
there's a papal indulgence for that.
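
A minimal sketch of that pattern, assuming a hypothetical struct target
and using a pthread mutex as a user-space stand-in for the host lock
(target_del() stands in for device_del(); none of these are the real
SCSI functions).  The state change is made under the lock, and the
deletion itself runs after the lock is dropped:

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

enum tstate { T_RUNNING, T_DEL };

/* Hypothetical target with just enough state for the example. */
struct target {
	enum tstate state;
	int visible_children;
	int del_calls;        /* how many times target_del() ran */
};

static pthread_mutex_t host_lock = PTHREAD_MUTEX_INITIALIZER;

/* Stand-in for device_del(): may sleep, so it must run unlocked. */
static void target_del(struct target *t)
{
	t->del_calls++;
}

/* Returns true if this caller performed the deletion. */
static bool target_reap(struct target *t)
{
	bool do_del = false;

	pthread_mutex_lock(&host_lock);
	if (t->state == T_RUNNING && t->visible_children == 0) {
		t->state = T_DEL;     /* claim the deletion under the lock */
		do_del = true;
	}
	pthread_mutex_unlock(&host_lock);

	if (do_del)
		target_del(t);        /* outside the lock */
	return do_del;
}

/* Adding a child fails once the target has been marked as dying. */
static bool target_add_child(struct target *t)
{
	bool ok;

	pthread_mutex_lock(&host_lock);
	ok = (t->state == T_RUNNING);
	if (ok)
		t->visible_children++;
	pthread_mutex_unlock(&host_lock);
	return ok;
}
```

Because the state flips before the lock is released, a racing add sees
T_DEL and backs off, and a second reaper cannot repeat the deletion.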

> Since the host's scan_mutex already protects target addition, extending 
> its scope to encompass target removal (and perhaps sdev removal too) 
> seems natural.  Do you agree?

James



^ permalink raw reply	[flat|nested] 200+ messages in thread

* Re: [PATCH 25/20] sysfs: Only support removing emtpy sysfs  directories.
  2009-05-28 18:21                                                               ` James Bottomley
@ 2009-05-28 20:02                                                                 ` Alan Stern
  2009-05-28 20:10                                                                   ` James Bottomley
  0 siblings, 1 reply; 200+ messages in thread
From: Alan Stern @ 2009-05-28 20:02 UTC (permalink / raw)
  To: James Bottomley
  Cc: Hannes Reinecke, Kay Sievers, SCSI development list,
	Eric W. Biederman, Andrew Morton, Greg Kroah-Hartman,
	Kernel development list, Tejun Heo, Cornelia Huck, linux-fsdevel,
	Eric W. Biederman

On Thu, 28 May 2009, James Bottomley wrote:

> > >  However, the piece that's missing, is the fact that all of
> > > this has to be tied into the host state.  If the host is running, you
> > > can't remove the target from visibility even if all its children are
> > > invisible because it might get another visible child added.
> > 
> > Are you sure about that?  It's not obvious at all to me.
> 
> Yes ... otherwise you have to elongate the DEL interval from a few ms to
> potentially anything.  That would allow locking a target in a dying
> state and prevent any new LUNs being added.

How so?  Why not unlink the target from the host's list when the 
device_del() call returns?  A new target can be created any time after 
that, since the old one is now completely invisible.

> > For example, suppose during scanning it turns out there are no LUNs at
> > a particular target address.  Why should the empty target be retained?  
> > You'd end up with unusable targets at all possible bus addresses.
> > 
> > Besides, if a target is removed from visibility and then another child
> > is added, the answer is simply to create a new target structure.  
> > There's already code in scsi_alloc_target() to do this.
> 
> As I've said several times, this could be done, but we'd have to audit
> the code paths to make sure we allow for multiple same targets in the
> list.

No, not if the old target is removed from the host's list before the
new target is added.

Is there any reason the old target has to remain on the list?  If 
there is, we can introduce a new state: STARGET_CLEANUP.  The old 
target gets put into this state when device_del() returns.  List 
entries in that state are ignored by __scsi_find_target() or whatever 
else looks through the list.
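
A sketch of what such a lookup could look like, assuming a simple
singly linked list and a hypothetical ST_CLEANUP value mirroring the
proposed STARGET_CLEANUP (find_target() here is illustrative, not the
existing __scsi_find_target()):

```c
#include <assert.h>
#include <stddef.h>

/* ST_CLEANUP mirrors the proposed STARGET_CLEANUP state. */
enum starget_state { ST_RUNNING, ST_CLEANUP };

/* Hypothetical list entry; the real structure has many more fields. */
struct starget {
	unsigned int channel;
	unsigned int id;
	enum starget_state state;
	struct starget *next;
};

/* Return the first live target at (channel, id), skipping dying ones. */
static struct starget *find_target(struct starget *head,
				   unsigned int channel, unsigned int id)
{
	struct starget *t;

	for (t = head; t != NULL; t = t->next) {
		if (t->state == ST_CLEANUP)
			continue;     /* still on the list, but ignored */
		if (t->channel == channel && t->id == id)
			return t;
	}
	return NULL;
}
```

An old entry in the cleanup state and a fresh entry at the same address
can then sit on the list together, and lookups only ever see the fresh
one.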

Alan Stern

P.S.: Does scsi_target_reap() really ever get called in non-process
context?  I couldn't find any place where that might happen.


^ permalink raw reply	[flat|nested] 200+ messages in thread

* Re: [PATCH 25/20] sysfs: Only support removing emtpy sysfs directories.
  2009-05-28 20:02                                                                 ` Alan Stern
@ 2009-05-28 20:10                                                                   ` James Bottomley
  2009-05-28 21:04                                                                     ` Alan Stern
  2009-05-29 20:08                                                                     ` Alan Stern
  0 siblings, 2 replies; 200+ messages in thread
From: James Bottomley @ 2009-05-28 20:10 UTC (permalink / raw)
  To: Alan Stern
  Cc: Hannes Reinecke, Kay Sievers, SCSI development list,
	Eric W. Biederman, Andrew Morton, Greg Kroah-Hartman,
	Kernel development list, Tejun Heo, Cornelia Huck, linux-fsdevel,
	Eric W. Biederman

On Thu, 2009-05-28 at 16:02 -0400, Alan Stern wrote:
> On Thu, 28 May 2009, James Bottomley wrote:
> 
> > > >  However, the piece that's missing, is the fact that all of
> > > > this has to be tied into the host state.  If the host is running, you
> > > > can't remove the target from visibility even if all its children are
> > > > invisible because it might get another visible child added.
> > > 
> > > Are you sure about that?  It's not obvious at all to me.
> > 
> > Yes ... otherwise you have to elongate the DEL interval from a few ms to
> > potentially anything.  That would allow locking a target in a dying
> > state and prevent any new LUNs being added.
> 
> How so?  Why not unlink the target from the host's list when the 
> device_del() call returns?  A new target can be created any time after 
> that, since the old one is now completely invisible.

The answer to that one is several emails back: we need the target in the
host list for the lifetime of the devices ... it's alterable, but even
more auditing.

> > > For example, suppose during scanning it turns out there are no LUNs at
> > > a particular target address.  Why should the empty target be retained?  
> > > You'd end up with unusable targets at all possible bus addresses.
> > > 
> > > Besides, if a target is removed from visibility and then another child
> > > is added, the answer is simply to create a new target structure.  
> > > There's already code in scsi_alloc_target() to do this.
> > 
> > As I've said several times, this could be done, but we'd have to audit
> > the code paths to make sure we allow for multiple same targets in the
> > list.
> 
> No, not if the old target is removed from the host's list before the
> new target is added.
> 
> Is there any reason the old target has to remain on the list?  If 
> there is, we can introduce a new state: STARGET_CLEANUP.  The old 
> target gets put into this state when device_del() returns.  List 
> entries in that state are ignored by __scsi_find_target() or whatever 
> else looks through the list.
> 
> Alan Stern
> 
> P.S.: Does scsi_target_reap() really ever get called in non-process
> context?  I couldn't find any place where that might happen.

From the device release, which is done by last put, which could be I/O
context.

James



^ permalink raw reply	[flat|nested] 200+ messages in thread

* Re: [PATCH 25/20] sysfs: Only support removing emtpy sysfs  directories.
  2009-05-28 20:10                                                                   ` James Bottomley
@ 2009-05-28 21:04                                                                     ` Alan Stern
  2009-05-29 12:32                                                                       ` Hannes Reinecke
  2009-05-29 20:08                                                                     ` Alan Stern
  1 sibling, 1 reply; 200+ messages in thread
From: Alan Stern @ 2009-05-28 21:04 UTC (permalink / raw)
  To: James Bottomley
  Cc: Hannes Reinecke, Kay Sievers, SCSI development list,
	Eric W. Biederman, Andrew Morton, Greg Kroah-Hartman,
	Kernel development list, Tejun Heo, Cornelia Huck, linux-fsdevel,
	Eric W. Biederman

On Thu, 28 May 2009, James Bottomley wrote:

> > How so?  Why not unlink the target from the host's list when the 
> > device_del() call returns?  A new target can be created any time after 
> > that, since the old one is now completely invisible.
> 
> The answer to that one is several emails back: we need the target in the
> host list for the lifetime of the devices ... it's alterable, but even
> more auditing.

I don't recall you mentioning that the target had to be linked into the 
host's list for the lifetime of the devices; I thought you said merely 
that the target had to _exist_ for the lifetime of the devices.

Does it really need to be linked, or is existence of the structure 
sufficient?

Likewise, after a device is removed from visibility, does it need to 
remain linked into the host's and target's lists?

> > P.S.: Does scsi_target_reap() really ever get called in non-process
> > context?  I couldn't find any place where that might happen.
> 
> From the device release, which is done by last put, which could be I/O
> context.

But scsi_target_reap() isn't called directly from the device release.  
It's called from scsi_device_dev_release_usercontext().

And besides, in the patch I'm working on it isn't called from either of 
those places -- it's called from __scsi_remove_device().  So I'll go 
ahead and get rid of scsi_target_reap_usercontext().

Alan Stern


^ permalink raw reply	[flat|nested] 200+ messages in thread

* [PATCH 0/24] sysfs cleanups.
  2009-05-28  0:37       ` Greg KH
@ 2009-05-28 22:58         ` Eric W. Biederman
  2009-05-28 23:00           ` [PATCH 01/24] sysfs: Implement sysfs_rename_link Eric W. Biederman
                             ` (24 more replies)
  0 siblings, 25 replies; 200+ messages in thread
From: Eric W. Biederman @ 2009-05-28 22:58 UTC (permalink / raw)
  To: Greg KH
  Cc: Kay Sievers, Greg KH, Andrew Morton, linux-kernel, Tejun Heo,
	Cornelia Huck, linux-fsdevel, Benjamin Thery, Daniel Lezcano,
	Eric W. Biederman

Greg KH <gregkh@suse.de> writes:

>> > Or are there still problems, like the "fry the ext3 boot partition" that
>> > Kay found?
>> 
>> That is unrelated to Eric's patches.  They just added the dump, which I
>> tried to trigger. It's not entirely clear what caused the filesystem
>> damage, but I was definitely able to reproduce the unclean shutdown
>> without any of Eric's sysfs patches.
>
> Ok.
>
> But I think Eric had some updates to some of the patches along the way,
> so a whole new respin would be good to ensure I get the correct ones.


Ok.  Here is my respun patchset.  I have not changed which files
we remove in a directory when we remove a non-empty directory,
leaving that issue to be addressed another time.

To accommodate that I have respun two patches:

patch  4 sysfs: Normalize removing sysfs directories.
patch 15 sysfs: Kill sysfs_addrm_start and sysfs_addrm_finish

Killing addrm is only changed to track the previous differences.

Eric

^ permalink raw reply	[flat|nested] 200+ messages in thread

* [PATCH 01/24] sysfs: Implement sysfs_rename_link
  2009-05-28 22:58         ` [PATCH 0/24] sysfs cleanups Eric W. Biederman
@ 2009-05-28 23:00           ` Eric W. Biederman
  2009-05-28 23:00           ` [PATCH 02/24] driver core: Use sysfs_rename_link in device_rename Eric W. Biederman
                             ` (23 subsequent siblings)
  24 siblings, 0 replies; 200+ messages in thread
From: Eric W. Biederman @ 2009-05-28 23:00 UTC (permalink / raw)
  To: Andrew Morton, Greg Kroah-Hartman
  Cc: linux-kernel, Tejun Heo, Cornelia Huck, linux-fsdevel,
	Kay Sievers, Greg KH, Eric W. Biederman, Benjamin Thery,
	Daniel Lezcano, Eric W. Biederman

From: Eric W. Biederman <ebiederm@xmission.com>

Because of rename ordering problems we occasionally give false
warnings about invalid sysfs operations, so implement a helper
function for this common sysfs idiom.

This is a stripped down version of an earlier patch that
also added sysfs_delete_link.
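
As an illustration only (not part of the patch), the remove-then-create
ordering can be modelled in user space.  The toy_* names and the flat
name table below are hypothetical stand-ins for a sysfs directory; the
sketch assumes the false warnings come from creating the new name while
the old one is still present, which is one plausible reading of the
description above.

```c
#include <assert.h>
#include <errno.h>
#include <string.h>

#define MAX_LINKS 8
#define NAME_LEN  32

/* Toy stand-in for a sysfs directory: a flat table of symlink names. */
struct toy_dir {
	char names[MAX_LINKS][NAME_LEN];
	int  used[MAX_LINKS];
};

/* Return the slot holding @name, or -1 if it is not present. */
static int toy_find(const struct toy_dir *d, const char *name)
{
	assert(d && name);
	for (int i = 0; i < MAX_LINKS; i++)
		if (d->used[i] && strcmp(d->names[i], name) == 0)
			return i;
	return -1;
}

/* Fails with -EEXIST on a duplicate name, as a real create would. */
static int toy_create_link(struct toy_dir *d, const char *name)
{
	if (toy_find(d, name) >= 0)
		return -EEXIST;
	for (int i = 0; i < MAX_LINKS; i++) {
		if (!d->used[i]) {
			strncpy(d->names[i], name, NAME_LEN - 1);
			d->names[i][NAME_LEN - 1] = '\0';
			d->used[i] = 1;
			return 0;
		}
	}
	return -ENOSPC;
}

/* Removing a name that is not present is silently ignored. */
static void toy_remove_link(struct toy_dir *d, const char *name)
{
	int i = toy_find(d, name);

	if (i >= 0)
		d->used[i] = 0;
}

/* Remove first, then create: the new name cannot collide with the old. */
static int toy_rename_link(struct toy_dir *d, const char *old_name,
			   const char *new_name)
{
	toy_remove_link(d, old_name);
	return toy_create_link(d, new_name);
}
```

With this ordering, renaming a link to its own name succeeds instead of
reporting a duplicate.  The trade-off is that a failed create leaves
neither name in place.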

Cc: Benjamin Thery <benjamin.thery@bull.net>
Cc: Daniel Lezcano <dlezcano@fr.ibm.com>
Cc: Tejun Heo <tj@kernel.org>
Signed-off-by: Eric W. Biederman <ebiederm@aristanetworks.com>
---
 fs/sysfs/symlink.c    |   16 ++++++++++++++++
 include/linux/sysfs.h |    9 +++++++++
 2 files changed, 25 insertions(+), 0 deletions(-)

diff --git a/fs/sysfs/symlink.c b/fs/sysfs/symlink.c
index a3ba217..11c4da5 100644
--- a/fs/sysfs/symlink.c
+++ b/fs/sysfs/symlink.c
@@ -122,6 +122,22 @@ void sysfs_remove_link(struct kobject * kobj, const char * name)
 	sysfs_hash_and_remove(parent_sd, name);
 }
 
+/**
+ *	sysfs_rename_link - rename symlink in object's directory.
+ *	@kobj:	object we're acting for.
+ *	@targ:	object we're pointing to.
+ *	@old:	previous name of the symlink.
+ *	@new:	new name of the symlink.
+ *
+ *	A helper function for the common rename symlink idiom.
+ */
+int sysfs_rename_link(struct kobject *kobj, struct kobject *targ,
+			const char *old, const char *new)
+{
+	sysfs_remove_link(kobj, old);
+	return sysfs_create_link(kobj, targ, new);
+}
+
 static int sysfs_get_target_path(struct sysfs_dirent *parent_sd,
 				 struct sysfs_dirent *target_sd, char *path)
 {
diff --git a/include/linux/sysfs.h b/include/linux/sysfs.h
index 9d68fed..18c8e70 100644
--- a/include/linux/sysfs.h
+++ b/include/linux/sysfs.h
@@ -109,6 +109,9 @@ int __must_check sysfs_create_link_nowarn(struct kobject *kobj,
 					  const char *name);
 void sysfs_remove_link(struct kobject *kobj, const char *name);
 
+int sysfs_rename_link(struct kobject *kobj, struct kobject *target,
+			const char *old_name, const char *new_name);
+
 int __must_check sysfs_create_group(struct kobject *kobj,
 				    const struct attribute_group *grp);
 int sysfs_update_group(struct kobject *kobj,
@@ -202,6 +205,12 @@ static inline void sysfs_remove_link(struct kobject *kobj, const char *name)
 {
 }
 
+static inline int sysfs_rename_link(struct kobject *k, struct kobject *t,
+				    const char *old_name, const char *new_name)
+{
+	return 0;
+}
+
 static inline int sysfs_create_group(struct kobject *kobj,
 				     const struct attribute_group *grp)
 {
-- 
1.6.3.1.54.g99dd.dirty


^ permalink raw reply related	[flat|nested] 200+ messages in thread

* [PATCH 02/24] driver core: Use sysfs_rename_link in device_rename
  2009-05-28 22:58         ` [PATCH 0/24] sysfs cleanups Eric W. Biederman
  2009-05-28 23:00           ` [PATCH 01/24] sysfs: Implement sysfs_rename_link Eric W. Biederman
@ 2009-05-28 23:00           ` Eric W. Biederman
  2009-05-28 23:00           ` [PATCH 03/24] sysfs: Remove now unnecessary error reporting suppression Eric W. Biederman
                             ` (22 subsequent siblings)
  24 siblings, 0 replies; 200+ messages in thread
From: Eric W. Biederman @ 2009-05-28 23:00 UTC (permalink / raw)
  To: Andrew Morton, Greg Kroah-Hartman
  Cc: linux-kernel, Tejun Heo, Cornelia Huck, linux-fsdevel,
	Kay Sievers, Greg KH, Eric W. Biederman, Eric W. Biederman

From: Eric W. Biederman <ebiederm@xmission.com>

Don't open code the renaming of symlinks in sysfs;
instead use the new helper function sysfs_rename_link.

Signed-off-by: Eric W. Biederman <ebiederm@aristanetworks.com>
---
 drivers/base/core.c |   18 ++++++------------
 1 files changed, 6 insertions(+), 12 deletions(-)

diff --git a/drivers/base/core.c b/drivers/base/core.c
index 4aa527b..8a1569c 100644
--- a/drivers/base/core.c
+++ b/drivers/base/core.c
@@ -1490,22 +1490,16 @@ int device_rename(struct device *dev, char *new_name)
 	if (old_class_name) {
 		new_class_name = make_class_name(dev->class->name, &dev->kobj);
 		if (new_class_name) {
-			error = sysfs_create_link_nowarn(&dev->parent->kobj,
-							 &dev->kobj,
-							 new_class_name);
-			if (error)
-				goto out;
-			sysfs_remove_link(&dev->parent->kobj, old_class_name);
+			error = sysfs_rename_link(&dev->parent->kobj,
+						  &dev->kobj,
+						  old_class_name,
+						  new_class_name);
 		}
 	}
 #else
 	if (dev->class) {
-		error = sysfs_create_link_nowarn(&dev->class->p->class_subsys.kobj,
-						 &dev->kobj, dev_name(dev));
-		if (error)
-			goto out;
-		sysfs_remove_link(&dev->class->p->class_subsys.kobj,
-				  old_device_name);
+		error = sysfs_rename_link(&dev->class->p->class_subsys.kobj,
+					  &dev->kobj, old_device_name, new_name);
 	}
 #endif
 
-- 
1.6.3.1.54.g99dd.dirty


^ permalink raw reply related	[flat|nested] 200+ messages in thread

* [PATCH 03/24] sysfs: Remove now unnecessary error reporting suppression.
  2009-05-28 22:58         ` [PATCH 0/24] sysfs cleanups Eric W. Biederman
  2009-05-28 23:00           ` [PATCH 01/24] sysfs: Implement sysfs_rename_link Eric W. Biederman
  2009-05-28 23:00           ` [PATCH 02/24] driver core: Use sysfs_rename_link in device_rename Eric W. Biederman
@ 2009-05-28 23:00           ` Eric W. Biederman
  2009-05-28 23:00           ` [PATCH 04/24] sysfs: Normalize removing sysfs directories Eric W. Biederman
                             ` (21 subsequent siblings)
  24 siblings, 0 replies; 200+ messages in thread
From: Eric W. Biederman @ 2009-05-28 23:00 UTC (permalink / raw)
  To: Andrew Morton, Greg Kroah-Hartman
  Cc: linux-kernel, Tejun Heo, Cornelia Huck, linux-fsdevel,
	Kay Sievers, Greg KH, Eric W. Biederman, Eric W. Biederman

From: Eric W. Biederman <ebiederm@xmission.com>

Now that we use sysfs_rename_link in the places we previously
used sysfs_create_link_nowarn we can remove sysfs_create_link_nowarn
and all its supporting infrastructure as it has no callers.

Signed-off-by: Eric W. Biederman <ebiederm@aristanetworks.com>
---
 fs/sysfs/dir.c     |   54 +++++++++++----------------------------------------
 fs/sysfs/symlink.c |   42 ++++++++-------------------------------
 fs/sysfs/sysfs.h   |    1 -
 3 files changed, 21 insertions(+), 76 deletions(-)

diff --git a/fs/sysfs/dir.c b/fs/sysfs/dir.c
index d88d0fa..b95cc07 100644
--- a/fs/sysfs/dir.c
+++ b/fs/sysfs/dir.c
@@ -397,43 +397,6 @@ void sysfs_addrm_start(struct sysfs_addrm_cxt *acxt,
 }
 
 /**
- *	__sysfs_add_one - add sysfs_dirent to parent without warning
- *	@acxt: addrm context to use
- *	@sd: sysfs_dirent to be added
- *
- *	Get @acxt->parent_sd and set sd->s_parent to it and increment
- *	nlink of parent inode if @sd is a directory and link into the
- *	children list of the parent.
- *
- *	This function should be called between calls to
- *	sysfs_addrm_start() and sysfs_addrm_finish() and should be
- *	passed the same @acxt as passed to sysfs_addrm_start().
- *
- *	LOCKING:
- *	Determined by sysfs_addrm_start().
- *
- *	RETURNS:
- *	0 on success, -EEXIST if entry with the given name already
- *	exists.
- */
-int __sysfs_add_one(struct sysfs_addrm_cxt *acxt, struct sysfs_dirent *sd)
-{
-	if (sysfs_find_dirent(acxt->parent_sd, sd->s_name))
-		return -EEXIST;
-
-	sd->s_parent = sysfs_get(acxt->parent_sd);
-
-	if (sysfs_type(sd) == SYSFS_DIR && acxt->parent_inode)
-		inc_nlink(acxt->parent_inode);
-
-	acxt->cnt++;
-
-	sysfs_link_sibling(sd);
-
-	return 0;
-}
-
-/**
  *	sysfs_pathname - return full path to sysfs dirent
  *	@sd: sysfs_dirent whose path we want
  *	@path: caller allocated buffer
@@ -475,10 +438,7 @@ static char *sysfs_pathname(struct sysfs_dirent *sd, char *path)
  */
 int sysfs_add_one(struct sysfs_addrm_cxt *acxt, struct sysfs_dirent *sd)
 {
-	int ret;
-
-	ret = __sysfs_add_one(acxt, sd);
-	if (ret == -EEXIST) {
+	if (sysfs_find_dirent(acxt->parent_sd, sd->s_name)) {
 		char *path = kzalloc(PATH_MAX, GFP_KERNEL);
 		WARN(1, KERN_WARNING
 		     "sysfs: cannot create duplicate filename '%s'\n",
@@ -486,9 +446,19 @@ int sysfs_add_one(struct sysfs_addrm_cxt *acxt, struct sysfs_dirent *sd)
 		     strcat(strcat(sysfs_pathname(acxt->parent_sd, path), "/"),
 		            sd->s_name));
 		kfree(path);
+		return -EEXIST;
 	}
 
-	return ret;
+	sd->s_parent = sysfs_get(acxt->parent_sd);
+
+	if (sysfs_type(sd) == SYSFS_DIR && acxt->parent_inode)
+		inc_nlink(acxt->parent_inode);
+
+	acxt->cnt++;
+
+	sysfs_link_sibling(sd);
+
+	return 0;
 }
 
 /**
diff --git a/fs/sysfs/symlink.c b/fs/sysfs/symlink.c
index 11c4da5..ac13e61 100644
--- a/fs/sysfs/symlink.c
+++ b/fs/sysfs/symlink.c
@@ -19,8 +19,14 @@
 
 #include "sysfs.h"
 
-static int sysfs_do_create_link(struct kobject *kobj, struct kobject *target,
-				const char *name, int warn)
+/**
+ *	sysfs_create_link - create symlink between two objects.
+ *	@kobj:	object whose directory we're creating the link in.
+ *	@target:	object we're pointing to.
+ *	@name:		name of the symlink.
+ */
+int sysfs_create_link(struct kobject *kobj, struct kobject *target,
+			const char *name)
 {
 	struct sysfs_dirent *parent_sd = NULL;
 	struct sysfs_dirent *target_sd = NULL;
@@ -60,10 +66,7 @@ static int sysfs_do_create_link(struct kobject *kobj, struct kobject *target,
 	target_sd = NULL;	/* reference is now owned by the symlink */
 
 	sysfs_addrm_start(&acxt, parent_sd);
-	if (warn)
-		error = sysfs_add_one(&acxt, sd);
-	else
-		error = __sysfs_add_one(&acxt, sd);
+	error = sysfs_add_one(&acxt, sd);
 	sysfs_addrm_finish(&acxt);
 
 	if (error)
@@ -78,33 +81,6 @@ static int sysfs_do_create_link(struct kobject *kobj, struct kobject *target,
 }
 
 /**
- *	sysfs_create_link - create symlink between two objects.
- *	@kobj:	object whose directory we're creating the link in.
- *	@target:	object we're pointing to.
- *	@name:		name of the symlink.
- */
-int sysfs_create_link(struct kobject *kobj, struct kobject *target,
-		      const char *name)
-{
-	return sysfs_do_create_link(kobj, target, name, 1);
-}
-
-/**
- *	sysfs_create_link_nowarn - create symlink between two objects.
- *	@kobj:	object whose directory we're creating the link in.
- *	@target:	object we're pointing to.
- *	@name:		name of the symlink.
- *
- *	This function does the same as sysf_create_link(), but it
- *	doesn't warn if the link already exists.
- */
-int sysfs_create_link_nowarn(struct kobject *kobj, struct kobject *target,
-			     const char *name)
-{
-	return sysfs_do_create_link(kobj, target, name, 0);
-}
-
-/**
  *	sysfs_remove_link - remove symlink in object's directory.
  *	@kobj:	object we're acting for.
  *	@name:	name of the symlink to remove.
diff --git a/fs/sysfs/sysfs.h b/fs/sysfs/sysfs.h
index 3fa0d98..abf05f4 100644
--- a/fs/sysfs/sysfs.h
+++ b/fs/sysfs/sysfs.h
@@ -108,7 +108,6 @@ struct sysfs_dirent *sysfs_get_active_two(struct sysfs_dirent *sd);
 void sysfs_put_active_two(struct sysfs_dirent *sd);
 void sysfs_addrm_start(struct sysfs_addrm_cxt *acxt,
 		       struct sysfs_dirent *parent_sd);
-int __sysfs_add_one(struct sysfs_addrm_cxt *acxt, struct sysfs_dirent *sd);
 int sysfs_add_one(struct sysfs_addrm_cxt *acxt, struct sysfs_dirent *sd);
 void sysfs_remove_one(struct sysfs_addrm_cxt *acxt, struct sysfs_dirent *sd);
 void sysfs_addrm_finish(struct sysfs_addrm_cxt *acxt);
-- 
1.6.3.1.54.g99dd.dirty


^ permalink raw reply related	[flat|nested] 200+ messages in thread

* [PATCH 04/24] sysfs: Normalize removing sysfs directories.
  2009-05-28 22:58         ` [PATCH 0/24] sysfs cleanups Eric W. Biederman
                             ` (2 preceding siblings ...)
  2009-05-28 23:00           ` [PATCH 03/24] sysfs: Remove now unnecessary error reporting suppression Eric W. Biederman
@ 2009-05-28 23:00           ` Eric W. Biederman
  2009-05-29  9:14             ` Tejun Heo
  2009-05-28 23:00           ` [PATCH 05/24] sysfs: Rename sysfs_d_iput to sysfs_dentry_iput Eric W. Biederman
                             ` (20 subsequent siblings)
  24 siblings, 1 reply; 200+ messages in thread
From: Eric W. Biederman @ 2009-05-28 23:00 UTC (permalink / raw)
  To: Andrew Morton, Greg Kroah-Hartman
  Cc: linux-kernel, Tejun Heo, Cornelia Huck, linux-fsdevel,
	Kay Sievers, Greg KH, Eric W. Biederman, Eric W. Biederman

From: Eric W. Biederman <ebiederm@xmission.com>

Individually wrap each call to sysfs_remove_one with sysfs_addrm_start
and sysfs_addrm_finish.  This prepares for these functions to be removed.

I don't change the logic of which dirents we remove.  I do document
what we are doing and why we are doing it.

A warning is added if we attempt to remove a non-empty directory
so that users that would previously have had leaks and other problems,
like the scsi layer, get a clear warning that there is a problem.

I don't hold sysfs_mutex across the entire operation as that is unneeded
for coherence at the sysfs level and some level of coordination is expected
at the upper layers.
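
As an illustration only (not part of the patch), the "pick one child,
pin it, remove it, repeat" loop can be sketched in user space.  The
toy_* types are hypothetical; locking is omitted, whereas the patch
does the list walk under sysfs_mutex and drops the mutex between
iterations.

```c
#include <assert.h>
#include <stddef.h>

enum toy_type { TOY_FILE, TOY_DIR };

struct toy_node {
	enum toy_type    type;
	int              refcount;
	struct toy_node *sibling;
};

struct toy_parent {
	struct toy_node *children;
};

/* Pick the first non-directory child and take a reference on it.
 * Directories are skipped because someone else may own them.
 */
static struct toy_node *toy_get_removable(struct toy_parent *p)
{
	struct toy_node *n;

	assert(p);
	for (n = p->children; n; n = n->sibling)
		if (n->type != TOY_DIR)
			break;
	if (n)
		n->refcount++;
	return n;
}

/* Unlink one child from the parent's sibling list. */
static void toy_unlink(struct toy_parent *p, struct toy_node *victim)
{
	struct toy_node **pos;

	for (pos = &p->children; *pos; pos = &(*pos)->sibling) {
		if (*pos == victim) {
			*pos = victim->sibling;
			victim->sibling = NULL;
			return;
		}
	}
}

/* Remove non-directory children one at a time; return how many went. */
static int toy_remove_files(struct toy_parent *p)
{
	struct toy_node *n;
	int removed = 0;

	while ((n = toy_get_removable(p)) != NULL) {
		toy_unlink(p, n);
		n->refcount--;	/* drop the reference taken above */
		removed++;
	}
	return removed;
}
```

Because each iteration restarts the search from the head of the list,
the loop stays correct even if the list changes between iterations,
which is what allows the lock to be dropped in between.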

Signed-off-by: Eric W. Biederman <ebiederm@aristanetworks.com>
---
 fs/sysfs/dir.c |   62 +++++++++++++++++++++++++++++++++++++++----------------
 1 files changed, 44 insertions(+), 18 deletions(-)

diff --git a/fs/sysfs/dir.c b/fs/sysfs/dir.c
index b95cc07..3e3a87f 100644
--- a/fs/sysfs/dir.c
+++ b/fs/sysfs/dir.c
@@ -732,12 +732,28 @@ const struct inode_operations sysfs_dir_inode_operations = {
 	.setattr	= sysfs_setattr,
 };
 
-static void remove_dir(struct sysfs_dirent *sd)
+static void remove_dir(struct sysfs_dirent *dir_sd)
 {
 	struct sysfs_addrm_cxt acxt;
 
-	sysfs_addrm_start(&acxt, sd->s_parent);
-	sysfs_remove_one(&acxt, sd);
+	pr_debug("sysfs %s: removing dir\n", dir_sd->s_name);
+
+	/* Removing non-empty directories is not valid complain! */
+	if (unlikely(dir_sd->s_dir.children)) {
+		struct sysfs_dirent *sd;
+
+		WARN(1, KERN_WARNING "sysfs: removing non-empty dir: %s\n",
+			dir_sd->s_name);
+
+		mutex_lock(&sysfs_mutex);
+		for (sd = dir_sd->s_dir.children; sd; sd  = sd->s_sibling)
+			printk(KERN_WARNING "%s/%s\n",
+				dir_sd->s_name, sd->s_name);
+		mutex_unlock(&sysfs_mutex);
+	}
+
+	sysfs_addrm_start(&acxt, dir_sd->s_parent);
+	sysfs_remove_one(&acxt, dir_sd);
 	sysfs_addrm_finish(&acxt);
 }
 
@@ -746,27 +762,37 @@ void sysfs_remove_subdir(struct sysfs_dirent *sd)
 	remove_dir(sd);
 }
 
+static struct sysfs_dirent *get_dirent_to_remove(struct sysfs_dirent *dir_sd)
+{
+	struct sysfs_dirent *sd;
+
+	mutex_lock(&sysfs_mutex);
+	for (sd = dir_sd->s_dir.children; sd; sd = sd->s_sibling) {
+		/* Directories might be owned by someone else
+		 * making recursive directory removal unsafe.
+		 */
+		if (sysfs_type(sd) == SYSFS_DIR)
+			continue;
+		break;
+	}
+	sysfs_get(sd);
+	mutex_unlock(&sysfs_mutex);
+
+	return sd;
+}
 
 static void __sysfs_remove_dir(struct sysfs_dirent *dir_sd)
 {
 	struct sysfs_addrm_cxt acxt;
-	struct sysfs_dirent **pos;
-
-	if (!dir_sd)
-		return;
-
-	pr_debug("sysfs %s: removing dir\n", dir_sd->s_name);
-	sysfs_addrm_start(&acxt, dir_sd);
-	pos = &dir_sd->s_dir.children;
-	while (*pos) {
-		struct sysfs_dirent *sd = *pos;
+	struct sysfs_dirent *sd;
 
-		if (sysfs_type(sd) != SYSFS_DIR)
-			sysfs_remove_one(&acxt, sd);
-		else
-			pos = &(*pos)->s_sibling;
+	/* Remove children that we think are safe */
+	while ((sd = get_dirent_to_remove(dir_sd))) {
+		sysfs_addrm_start(&acxt, sd->s_parent);
+		sysfs_remove_one(&acxt, sd);
+		sysfs_addrm_finish(&acxt);
+		sysfs_put(sd);
 	}
-	sysfs_addrm_finish(&acxt);
 
 	remove_dir(dir_sd);
 }
-- 
1.6.3.1.54.g99dd.dirty


^ permalink raw reply related	[flat|nested] 200+ messages in thread

* [PATCH 05/24] sysfs: Rename sysfs_d_iput to sysfs_dentry_iput
  2009-05-28 22:58         ` [PATCH 0/24] sysfs cleanups Eric W. Biederman
                             ` (3 preceding siblings ...)
  2009-05-28 23:00           ` [PATCH 04/24] sysfs: Normalize removing sysfs directories Eric W. Biederman
@ 2009-05-28 23:00           ` Eric W. Biederman
  2009-05-28 23:00           ` [PATCH 06/24] sysfs: Use dentry_ops instead of directly playing with the dcache Eric W. Biederman
                             ` (19 subsequent siblings)
  24 siblings, 0 replies; 200+ messages in thread
From: Eric W. Biederman @ 2009-05-28 23:00 UTC (permalink / raw)
  To: Andrew Morton, Greg Kroah-Hartman
  Cc: linux-kernel, Tejun Heo, Cornelia Huck, linux-fsdevel,
	Kay Sievers, Greg KH, Eric W. Biederman, Eric W. Biederman

From: Eric W. Biederman <ebiederm@xmission.com>

Using dentry instead of d in the function name is what
several other filesystems are doing and it seems to be
a more readable convention.

Acked-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Eric W. Biederman <ebiederm@aristanetworks.com>
---
 fs/sysfs/dir.c |    4 ++--
 1 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/fs/sysfs/dir.c b/fs/sysfs/dir.c
index 3e3a87f..f9f32b8 100644
--- a/fs/sysfs/dir.c
+++ b/fs/sysfs/dir.c
@@ -294,7 +294,7 @@ void release_sysfs_dirent(struct sysfs_dirent * sd)
 		goto repeat;
 }
 
-static void sysfs_d_iput(struct dentry * dentry, struct inode * inode)
+static void sysfs_dentry_iput(struct dentry * dentry, struct inode * inode)
 {
 	struct sysfs_dirent * sd = dentry->d_fsdata;
 
@@ -303,7 +303,7 @@ static void sysfs_d_iput(struct dentry * dentry, struct inode * inode)
 }
 
 static const struct dentry_operations sysfs_dentry_ops = {
-	.d_iput		= sysfs_d_iput,
+	.d_iput		= sysfs_dentry_iput,
 };
 
 struct sysfs_dirent *sysfs_new_dirent(const char *name, umode_t mode, int type)
-- 
1.6.3.1.54.g99dd.dirty


^ permalink raw reply related	[flat|nested] 200+ messages in thread

* [PATCH 06/24] sysfs: Use dentry_ops instead of directly playing with the dcache
  2009-05-28 22:58         ` [PATCH 0/24] sysfs cleanups Eric W. Biederman
                             ` (4 preceding siblings ...)
  2009-05-28 23:00           ` [PATCH 05/24] sysfs: Rename sysfs_d_iput to sysfs_dentry_iput Eric W. Biederman
@ 2009-05-28 23:00           ` Eric W. Biederman
  2009-05-28 23:00           ` [PATCH 07/24] sysfs: Simplify sysfs_chmod_file semantics Eric W. Biederman
                             ` (18 subsequent siblings)
  24 siblings, 0 replies; 200+ messages in thread
From: Eric W. Biederman @ 2009-05-28 23:00 UTC (permalink / raw)
  To: Andrew Morton, Greg Kroah-Hartman
  Cc: linux-kernel, Tejun Heo, Cornelia Huck, linux-fsdevel,
	Kay Sievers, Greg KH, Eric W. Biederman, Eric W. Biederman

From: Eric W. Biederman <ebiederm@xmission.com>

Calling d_drop unconditionally when a sysfs_dirent is deleted has
the potential to leak mounts, so instead implement dentry delete
and revalidate operations that cause sysfs dentries to be removed
at the appropriate time.
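
As an illustration only (not part of the patch), lazy invalidation can
be modelled in user space: removal just sets a flag on the backing
object, and the cached entry is dropped the next time it is
revalidated.  The toy_* names are hypothetical, and the model applies
the submount check to every entry, whereas the patch only does so for
directories.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_FLAG_REMOVED 0x1u

/* Stand-in for the filesystem's own object (sysfs_dirent in the patch). */
struct toy_backing {
	unsigned int flags;
};

/* Stand-in for a cached name (a dentry in the patch). */
struct toy_cached {
	struct toy_backing *backing;
	int hashed;		/* 1 while lookups can still find this entry */
	int has_submounts;	/* 1 if something is mounted below it */
};

/* Removal only marks the backing object; the cache is not touched. */
static void toy_remove(struct toy_backing *b)
{
	assert(b);
	b->flags |= TOY_FLAG_REMOVED;
}

/* Return 1 if the cached entry may still be used, 0 if it was dropped. */
static int toy_revalidate(struct toy_cached *c)
{
	assert(c && c->backing);

	if (!(c->backing->flags & TOY_FLAG_REMOVED))
		return 1;

	/* Keep a stale entry alive rather than orphan a mount under it. */
	if (c->has_submounts)
		return 1;

	c->hashed = 0;
	return 0;
}
```

The cost of the removal is deferred to the next lookup, and an entry
with something mounted on it is never dropped behind the mount's back.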

Acked-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Eric W. Biederman <ebiederm@aristanetworks.com>
---
 fs/sysfs/dir.c |   73 +++++++++++++++++++++++++++++++++++--------------------
 1 files changed, 46 insertions(+), 27 deletions(-)

diff --git a/fs/sysfs/dir.c b/fs/sysfs/dir.c
index f9f32b8..e0bf3a5 100644
--- a/fs/sysfs/dir.c
+++ b/fs/sysfs/dir.c
@@ -294,6 +294,46 @@ void release_sysfs_dirent(struct sysfs_dirent * sd)
 		goto repeat;
 }
 
+static int sysfs_dentry_delete(struct dentry *dentry)
+{
+	struct sysfs_dirent *sd = dentry->d_fsdata;
+	return !!(sd->s_flags & SYSFS_FLAG_REMOVED);
+}
+
+static int sysfs_dentry_revalidate(struct dentry *dentry, struct nameidata *nd)
+{
+	struct sysfs_dirent *sd = dentry->d_fsdata;
+	int is_dir;
+
+	mutex_lock(&sysfs_mutex);
+
+	/* The sysfs dirent has been deleted */
+	if (sd->s_flags & SYSFS_FLAG_REMOVED)
+		goto out_bad;
+
+	mutex_unlock(&sysfs_mutex);
+out_valid:
+	return 1;
+out_bad:
+	/* Remove the dentry from the dcache hashes.
+	 * If this is a deleted dentry we use d_drop instead of d_delete
+	 * so sysfs doesn't need to cope with negative dentries.
+	 */
+	is_dir = (sysfs_type(sd) == SYSFS_DIR);
+	mutex_unlock(&sysfs_mutex);
+	if (is_dir) {
+		/* If we have submounts we must allow the vfs caches
+		 * to lie about the state of the filesystem to prevent
+		 * leaks and other nasty things.
+		 */
+		if (have_submounts(dentry))
+			goto out_valid;
+		shrink_dcache_parent(dentry);
+	}
+	d_drop(dentry);
+	return 0;
+}
+
 static void sysfs_dentry_iput(struct dentry * dentry, struct inode * inode)
 {
 	struct sysfs_dirent * sd = dentry->d_fsdata;
@@ -303,6 +343,8 @@ static void sysfs_dentry_iput(struct dentry * dentry, struct inode * inode)
 }
 
 static const struct dentry_operations sysfs_dentry_ops = {
+	.d_revalidate	= sysfs_dentry_revalidate,
+	.d_delete	= sysfs_dentry_delete,
 	.d_iput		= sysfs_dentry_iput,
 };
 
@@ -493,44 +535,21 @@ void sysfs_remove_one(struct sysfs_addrm_cxt *acxt, struct sysfs_dirent *sd)
 }
 
 /**
- *	sysfs_drop_dentry - drop dentry for the specified sysfs_dirent
+ *	sysfs_dec_nlink - Decrement link count for the specified sysfs_dirent
  *	@sd: target sysfs_dirent
  *
- *	Drop dentry for @sd.  @sd must have been unlinked from its
+ *	Decrement nlink for @sd.  @sd must have been unlinked from its
  *	parent on entry to this function such that it can't be looked
  *	up anymore.
  */
-static void sysfs_drop_dentry(struct sysfs_dirent *sd)
+static void sysfs_dec_nlink(struct sysfs_dirent *sd)
 {
 	struct inode *inode;
-	struct dentry *dentry;
 
 	inode = ilookup(sysfs_sb, sd->s_ino);
 	if (!inode)
 		return;
 
-	/* Drop any existing dentries associated with sd.
-	 *
-	 * For the dentry to be properly freed we need to grab a
-	 * reference to the dentry under the dcache lock,  unhash it,
-	 * and then put it.  The playing with the dentry count allows
-	 * dput to immediately free the dentry  if it is not in use.
-	 */
-repeat:
-	spin_lock(&dcache_lock);
-	list_for_each_entry(dentry, &inode->i_dentry, d_alias) {
-		if (d_unhashed(dentry))
-			continue;
-		dget_locked(dentry);
-		spin_lock(&dentry->d_lock);
-		__d_drop(dentry);
-		spin_unlock(&dentry->d_lock);
-		spin_unlock(&dcache_lock);
-		dput(dentry);
-		goto repeat;
-	}
-	spin_unlock(&dcache_lock);
-
 	/* adjust nlink and update timestamp */
 	mutex_lock(&inode->i_mutex);
 
@@ -577,7 +596,7 @@ void sysfs_addrm_finish(struct sysfs_addrm_cxt *acxt)
 		acxt->removed = sd->s_sibling;
 		sd->s_sibling = NULL;
 
-		sysfs_drop_dentry(sd);
+		sysfs_dec_nlink(sd);
 		sysfs_deactivate(sd);
 		unmap_bin_file(sd);
 		sysfs_put(sd);
-- 
1.6.3.1.54.g99dd.dirty


^ permalink raw reply related	[flat|nested] 200+ messages in thread

* [PATCH 07/24] sysfs: Simplify sysfs_chmod_file semantics
  2009-05-28 22:58         ` [PATCH 0/24] sysfs cleanups Eric W. Biederman
                             ` (5 preceding siblings ...)
  2009-05-28 23:00           ` [PATCH 06/24] sysfs: Use dentry_ops instead of directly playing with the dcache Eric W. Biederman
@ 2009-05-28 23:00           ` Eric W. Biederman
  2009-05-28 23:00           ` [PATCH 08/24] sysfs: Optimize just changing the sysfs file mode Eric W. Biederman
                             ` (17 subsequent siblings)
  24 siblings, 0 replies; 200+ messages in thread
From: Eric W. Biederman @ 2009-05-28 23:00 UTC (permalink / raw)
  To: Andrew Morton, Greg Kroah-Hartman
  Cc: linux-kernel, Tejun Heo, Cornelia Huck, linux-fsdevel,
	Kay Sievers, Greg KH, Eric W. Biederman, Eric W. Biederman

From: Eric W. Biederman <ebiederm@xmission.com>

Currently every caller of sysfs_chmod_file happens at either
file creation time to set a non-default mode or in response
to a specific user space requested change in policy, making
timestamps of when the chmod happens and notification of
a file changing mode uninteresting.

Remove the unnecessary time stamp and filesystem change
notification, and remove the last of the explicit inotify
and dnotify support from sysfs.

Acked-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Eric W. Biederman <ebiederm@aristanetworks.com>
---
 fs/sysfs/file.c |   10 +---------
 1 files changed, 1 insertions(+), 9 deletions(-)

diff --git a/fs/sysfs/file.c b/fs/sysfs/file.c
index b1606e0..0786b41 100644
--- a/fs/sysfs/file.c
+++ b/fs/sysfs/file.c
@@ -602,17 +602,9 @@ int sysfs_chmod_file(struct kobject *kobj, struct attribute *attr, mode_t mode)
 	mutex_lock(&inode->i_mutex);
 
 	newattrs.ia_mode = (mode & S_IALLUGO) | (inode->i_mode & ~S_IALLUGO);
-	newattrs.ia_valid = ATTR_MODE | ATTR_CTIME;
-	newattrs.ia_ctime = current_fs_time(inode->i_sb);
+	newattrs.ia_valid = ATTR_MODE;
 	rc = sysfs_setattr(victim, &newattrs);
 
-	if (rc == 0) {
-		fsnotify_change(victim, newattrs.ia_valid);
-		mutex_lock(&sysfs_mutex);
-		victim_sd->s_mode = newattrs.ia_mode;
-		mutex_unlock(&sysfs_mutex);
-	}
-
 	mutex_unlock(&inode->i_mutex);
  out:
 	dput(victim);
-- 
1.6.3.1.54.g99dd.dirty


^ permalink raw reply related	[flat|nested] 200+ messages in thread

* [PATCH 08/24] sysfs: Optimize just changing the sysfs file mode.
  2009-05-28 22:58         ` [PATCH 0/24] sysfs cleanups Eric W. Biederman
                             ` (6 preceding siblings ...)
  2009-05-28 23:00           ` [PATCH 07/24] sysfs: Simplify sysfs_chmod_file semantics Eric W. Biederman
@ 2009-05-28 23:00           ` Eric W. Biederman
  2009-05-28 23:00           ` [PATCH 09/24] sysfs: Simplify iattr assignments Eric W. Biederman
                             ` (16 subsequent siblings)
  24 siblings, 0 replies; 200+ messages in thread
From: Eric W. Biederman @ 2009-05-28 23:00 UTC (permalink / raw)
  To: Andrew Morton, Greg Kroah-Hartman
  Cc: linux-kernel, Tejun Heo, Cornelia Huck, linux-fsdevel,
	Kay Sievers, Greg KH, Eric W. Biederman, Eric W. Biederman

From: Eric W. Biederman <ebiederm@xmission.com>

Don't allocate a struct iattr for the sysfs dentry if just
the mode changes because we have a field for that on the
sysfs_dirent, and we can trigger that case with sysfs_chmod_file.
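
As an illustration only (not part of the patch), the "allocate the
extra attributes only when something other than the mode changes" idea
can be sketched in user space.  The toy_* names are hypothetical, and
seeding the new block with the current mode is a choice made for this
sketch, not something taken from the patch.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

#define TOY_ATTR_MODE 0x1u
#define TOY_ATTR_UID  0x2u

/* A requested change: @valid says which fields are meaningful. */
struct toy_attrs {
	unsigned int valid;
	unsigned int mode;
	unsigned int uid;
};

/* Optional side block, only allocated when it is really needed. */
struct toy_extra {
	unsigned int mode;
	unsigned int uid;
};

/* The entry always carries a mode; everything else lives in @extra. */
struct toy_entry {
	unsigned int      mode;
	struct toy_extra *extra;
};

static int toy_setattr(struct toy_entry *e, const struct toy_attrs *a)
{
	assert(e && a);

	/* Allocate only if a field other than the mode is being set. */
	if (!e->extra && (a->valid & ~TOY_ATTR_MODE)) {
		e->extra = calloc(1, sizeof(*e->extra));
		if (!e->extra)
			return -ENOMEM;
		e->extra->mode = e->mode;	/* sketch-only default */
	}

	if (a->valid & TOY_ATTR_MODE)
		e->mode = a->mode;

	/* Mode-only change on an entry without a side block: done. */
	if (!e->extra)
		return 0;

	if (a->valid & TOY_ATTR_UID)
		e->extra->uid = a->uid;
	if (a->valid & TOY_ATTR_MODE)
		e->extra->mode = a->mode;
	return 0;
}
```

A mode-only update therefore never allocates, which matters when many
entries get a non-default mode at creation time and nothing else.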

Signed-off-by: Eric W. Biederman <ebiederm@aristanetworks.com>
---
 fs/sysfs/inode.c |   22 ++++++++++++++--------
 1 files changed, 14 insertions(+), 8 deletions(-)

diff --git a/fs/sysfs/inode.c b/fs/sysfs/inode.c
index 555f0ff..70ff2a2 100644
--- a/fs/sysfs/inode.c
+++ b/fs/sysfs/inode.c
@@ -60,12 +60,16 @@ int sysfs_setattr(struct dentry * dentry, struct iattr * iattr)
 		return error;
 
 	iattr->ia_valid &= ~ATTR_SIZE; /* ignore size changes */
+	if (iattr->ia_valid & ATTR_MODE) {
+		if (!in_group_p(inode->i_gid) && !capable(CAP_FSETID))
+			iattr->ia_mode &= ~S_ISGID;
+	}
 
 	error = inode_setattr(inode, iattr);
 	if (error)
 		return error;
 
-	if (!sd_iattr) {
+	if (!sd_iattr && (ia_valid & ~ATTR_MODE)) {
 		/* setting attributes for the first time, allocate now */
 		sd_iattr = kzalloc(sizeof(struct iattr), GFP_KERNEL);
 		if (!sd_iattr)
@@ -78,6 +82,13 @@ int sysfs_setattr(struct dentry * dentry, struct iattr * iattr)
 		sd->s_iattr = sd_iattr;
 	}
 
+	if (ia_valid & ATTR_MODE)
+		sd->s_mode = iattr->ia_mode;
+
+	/* If we don't need the extra attributes leave */
+	if (!sd_iattr)
+		return 0;
+
 	/* attributes were changed atleast once in past */
 
 	if (ia_valid & ATTR_UID)
@@ -93,13 +104,8 @@ int sysfs_setattr(struct dentry * dentry, struct iattr * iattr)
 	if (ia_valid & ATTR_CTIME)
 		sd_iattr->ia_ctime = timespec_trunc(iattr->ia_ctime,
 						inode->i_sb->s_time_gran);
-	if (ia_valid & ATTR_MODE) {
-		umode_t mode = iattr->ia_mode;
-
-		if (!in_group_p(inode->i_gid) && !capable(CAP_FSETID))
-			mode &= ~S_ISGID;
-		sd_iattr->ia_mode = sd->s_mode = mode;
-	}
+	if (ia_valid & ATTR_MODE)
+		sd_iattr->ia_mode = iattr->ia_mode;
 
 	return error;
 }
-- 
1.6.3.1.54.g99dd.dirty


^ permalink raw reply related	[flat|nested] 200+ messages in thread

* [PATCH 09/24] sysfs: Simplify iattr assignments
  2009-05-28 22:58         ` [PATCH 0/24] sysfs cleanups Eric W. Biederman
                             ` (7 preceding siblings ...)
  2009-05-28 23:00           ` [PATCH 08/24] sysfs: Optimize just changing the sysfs file mode Eric W. Biederman
@ 2009-05-28 23:00           ` Eric W. Biederman
  2009-05-28 23:00           ` [PATCH 10/24] sysfs: Fix locking and factor out sysfs_sd_setattr Eric W. Biederman
                             ` (15 subsequent siblings)
  24 siblings, 0 replies; 200+ messages in thread
From: Eric W. Biederman @ 2009-05-28 23:00 UTC (permalink / raw)
  To: Andrew Morton, Greg Kroah-Hartman
  Cc: linux-kernel, Tejun Heo, Cornelia Huck, linux-fsdevel,
	Kay Sievers, Greg KH, Eric W. Biederman, Eric W. Biederman

From: Eric W. Biederman <ebiederm@xmission.com>

The granularity of sysfs time when we keep it is 1 ns, which
when passed to timespec_trunc results in a nop.  So remove
the unnecessary function call making sysfs_setattr slightly
easier to read.
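
As an illustration only (not part of the patch), a user-space model of
truncating a timestamp to a granularity shows why 1 ns is a no-op.
toy_trunc and struct toy_time are hypothetical and only mirror the
general idea of the kernel helper, not its exact code.

```c
#include <assert.h>

#define TOY_NSEC_PER_SEC 1000000000L

/* Minimal timestamp: whole seconds plus nanoseconds in [0, 1e9). */
struct toy_time {
	long sec;
	long nsec;
};

/* Round the nanosecond part down to a multiple of @gran nanoseconds. */
static struct toy_time toy_trunc(struct toy_time t, long gran)
{
	assert(gran >= 1 && gran <= TOY_NSEC_PER_SEC);
	assert(t.nsec >= 0 && t.nsec < TOY_NSEC_PER_SEC);

	/* For gran == 1 the remainder is always 0, so nothing changes. */
	t.nsec -= t.nsec % gran;
	return t;
}
```

Any value modulo 1 is 0, so with a 1 ns granularity the subtraction
never alters the timestamp and the call can be replaced by a plain
assignment.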

Acked-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Eric W. Biederman <ebiederm@aristanetworks.com>
---
 fs/sysfs/inode.c |    9 +++------
 1 files changed, 3 insertions(+), 6 deletions(-)

diff --git a/fs/sysfs/inode.c b/fs/sysfs/inode.c
index 70ff2a2..5020a1d 100644
--- a/fs/sysfs/inode.c
+++ b/fs/sysfs/inode.c
@@ -96,14 +96,11 @@ int sysfs_setattr(struct dentry * dentry, struct iattr * iattr)
 	if (ia_valid & ATTR_GID)
 		sd_iattr->ia_gid = iattr->ia_gid;
 	if (ia_valid & ATTR_ATIME)
-		sd_iattr->ia_atime = timespec_trunc(iattr->ia_atime,
-						inode->i_sb->s_time_gran);
+		sd_iattr->ia_atime = iattr->ia_atime;
 	if (ia_valid & ATTR_MTIME)
-		sd_iattr->ia_mtime = timespec_trunc(iattr->ia_mtime,
-						inode->i_sb->s_time_gran);
+		sd_iattr->ia_mtime = iattr->ia_mtime;
 	if (ia_valid & ATTR_CTIME)
-		sd_iattr->ia_ctime = timespec_trunc(iattr->ia_ctime,
-						inode->i_sb->s_time_gran);
+		sd_iattr->ia_ctime = iattr->ia_ctime;
 	if (ia_valid & ATTR_MODE)
 		sd_iattr->ia_mode = iattr->ia_mode;
 
-- 
1.6.3.1.54.g99dd.dirty


^ permalink raw reply related	[flat|nested] 200+ messages in thread

* [PATCH 10/24] sysfs: Fix locking and factor out sysfs_sd_setattr
  2009-05-28 22:58         ` [PATCH 0/24] sysfs cleanups Eric W. Biederman
                             ` (8 preceding siblings ...)
  2009-05-28 23:00           ` [PATCH 09/24] sysfs: Simplify iattr assignments Eric W. Biederman
@ 2009-05-28 23:00           ` Eric W. Biederman
  2009-05-28 23:00           ` [PATCH 11/24] sysfs: Update s_iattr on link and unlink Eric W. Biederman
                             ` (14 subsequent siblings)
  24 siblings, 0 replies; 200+ messages in thread
From: Eric W. Biederman @ 2009-05-28 23:00 UTC (permalink / raw)
  To: Andrew Morton, Greg Kroah-Hartman
  Cc: linux-kernel, Tejun Heo, Cornelia Huck, linux-fsdevel,
	Kay Sievers, Greg KH, Eric W. Biederman, Eric W. Biederman

From: Eric W. Biederman <ebiederm@xmission.com>

Cleanly separate the work that is specific to setting the
attributes of a sysfs_dirent from what is needed to update
the attributes of a vfs inode.

Additionally grab the sysfs_mutex to keep any nasties from
surprising us when updating the sysfs_dirent.

Acked-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Eric W. Biederman <ebiederm@aristanetworks.com>
---
 fs/sysfs/inode.c |   52 ++++++++++++++++++++++++++++++----------------------
 fs/sysfs/sysfs.h |    1 +
 2 files changed, 31 insertions(+), 22 deletions(-)

diff --git a/fs/sysfs/inode.c b/fs/sysfs/inode.c
index 5020a1d..dd154cb 100644
--- a/fs/sysfs/inode.c
+++ b/fs/sysfs/inode.c
@@ -42,33 +42,12 @@ int __init sysfs_inode_init(void)
 	return bdi_init(&sysfs_backing_dev_info);
 }
 
-int sysfs_setattr(struct dentry * dentry, struct iattr * iattr)
+int sysfs_sd_setattr(struct sysfs_dirent *sd, struct iattr * iattr)
 {
-	struct inode * inode = dentry->d_inode;
-	struct sysfs_dirent * sd = dentry->d_fsdata;
 	struct iattr * sd_iattr;
 	unsigned int ia_valid = iattr->ia_valid;
-	int error;
-
-	if (!sd)
-		return -EINVAL;
 
 	sd_iattr = sd->s_iattr;
-
-	error = inode_change_ok(inode, iattr);
-	if (error)
-		return error;
-
-	iattr->ia_valid &= ~ATTR_SIZE; /* ignore size changes */
-	if (iattr->ia_valid & ATTR_MODE) {
-		if (!in_group_p(inode->i_gid) && !capable(CAP_FSETID))
-			iattr->ia_mode &= ~S_ISGID;
-	}
-
-	error = inode_setattr(inode, iattr);
-	if (error)
-		return error;
-
 	if (!sd_iattr && (ia_valid & ~ATTR_MODE)) {
 		/* setting attributes for the first time, allocate now */
 		sd_iattr = kzalloc(sizeof(struct iattr), GFP_KERNEL);
@@ -103,6 +82,35 @@ int sysfs_setattr(struct dentry * dentry, struct iattr * iattr)
 		sd_iattr->ia_ctime = iattr->ia_ctime;
 	if (ia_valid & ATTR_MODE)
 		sd_iattr->ia_mode = iattr->ia_mode;
+	return 0;
+}
+
+int sysfs_setattr(struct dentry * dentry, struct iattr * iattr)
+{
+	struct inode * inode = dentry->d_inode;
+	struct sysfs_dirent * sd = dentry->d_fsdata;
+	int error;
+
+	if (!sd)
+		return -EINVAL;
+
+	error = inode_change_ok(inode, iattr);
+	if (error)
+		return error;
+
+	iattr->ia_valid &= ~ATTR_SIZE; /* ignore size changes */
+	if (iattr->ia_valid & ATTR_MODE) {
+		if (!in_group_p(inode->i_gid) && !capable(CAP_FSETID))
+			iattr->ia_mode &= ~S_ISGID;
+	}
+
+	error = inode_setattr(inode, iattr);
+	if (error)
+		return error;
+
+	mutex_lock(&sysfs_mutex);
+	error = sysfs_sd_setattr(sd, iattr);
+	mutex_unlock(&sysfs_mutex);
 
 	return error;
 }
diff --git a/fs/sysfs/sysfs.h b/fs/sysfs/sysfs.h
index abf05f4..043bb13 100644
--- a/fs/sysfs/sysfs.h
+++ b/fs/sysfs/sysfs.h
@@ -146,6 +146,7 @@ static inline void __sysfs_put(struct sysfs_dirent *sd)
  */
 struct inode *sysfs_get_inode(struct sysfs_dirent *sd);
 void sysfs_delete_inode(struct inode *inode);
+int sysfs_sd_setattr(struct sysfs_dirent *sd, struct iattr *iattr);
 int sysfs_setattr(struct dentry *dentry, struct iattr *iattr);
 int sysfs_hash_and_remove(struct sysfs_dirent *dir_sd, const char *name);
 int sysfs_inode_init(void);
-- 
1.6.3.1.54.g99dd.dirty


^ permalink raw reply related	[flat|nested] 200+ messages in thread
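
The split made by patch 10 above can be sketched in userspace. This is only a minimal model, not the kernel code: every `demo_*` name, the `DEMO_ATTR_*` flags, and the choice to keep the mode on the dirent itself are assumptions made for illustration, and a pthread mutex stands in for sysfs_mutex.

```c
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical stand-ins; none of these are the kernel types. */
#define DEMO_ATTR_MODE 0x1u
#define DEMO_ATTR_UID  0x2u

struct demo_iattr {
	unsigned int valid;
	unsigned int mode;
	unsigned int uid;
};

struct demo_dirent {
	unsigned int mode;
	struct demo_iattr *iattr;	/* allocated on first non-mode change */
};

static pthread_mutex_t demo_mutex = PTHREAD_MUTEX_INITIALIZER;

/* Dirent-only half: caller holds demo_mutex, no inode is touched. */
int demo_sd_setattr(struct demo_dirent *sd, const struct demo_iattr *in)
{
	if (!sd->iattr && (in->valid & ~DEMO_ATTR_MODE)) {
		/* setting non-mode attributes for the first time */
		sd->iattr = calloc(1, sizeof(*sd->iattr));
		if (!sd->iattr)
			return -1;
	}
	if (in->valid & DEMO_ATTR_UID)
		sd->iattr->uid = in->uid;
	if (in->valid & DEMO_ATTR_MODE)
		sd->mode = in->mode;
	return 0;
}

/* Outer half: the real one would validate and update the inode first. */
int demo_setattr(struct demo_dirent *sd, const struct demo_iattr *in)
{
	int error;

	pthread_mutex_lock(&demo_mutex);
	error = demo_sd_setattr(sd, in);
	pthread_mutex_unlock(&demo_mutex);
	return error;
}
```

Keeping the locked, dirent-only helper separate is what lets a later caller that already holds the lock reuse it without going through a dentry.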

* [PATCH 11/24] sysfs: Update s_iattr on link and unlink.
  2009-05-28 22:58         ` [PATCH 0/24] sysfs cleanups Eric W. Biederman
                             ` (9 preceding siblings ...)
  2009-05-28 23:00           ` [PATCH 10/24] sysfs: Fix locking and factor out sysfs_sd_setattr Eric W. Biederman
@ 2009-05-28 23:00           ` Eric W. Biederman
  2009-05-28 23:00           ` [PATCH 12/24] sysfs: Nicely indent sysfs_symlink_inode_operations Eric W. Biederman
                             ` (13 subsequent siblings)
  24 siblings, 0 replies; 200+ messages in thread
From: Eric W. Biederman @ 2009-05-28 23:00 UTC (permalink / raw)
  To: Andrew Morton, Greg Kroah-Hartman
  Cc: linux-kernel, Tejun Heo, Cornelia Huck, linux-fsdevel,
	Kay Sievers, Greg KH, Eric W. Biederman, Eric W. Biederman

From: Eric W. Biederman <ebiederm@xmission.com>

Currently sysfs updates the timestamps on the vfs directory
inode when we create or remove a directory entry but doesn't
update the cached copy on the sysfs_dirent.  Fix that oversight.

Acked-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Eric W. Biederman <ebiederm@aristanetworks.com>
---
 fs/sysfs/dir.c |   14 ++++++++++++++
 1 files changed, 14 insertions(+), 0 deletions(-)

diff --git a/fs/sysfs/dir.c b/fs/sysfs/dir.c
index e0bf3a5..c6472d8 100644
--- a/fs/sysfs/dir.c
+++ b/fs/sysfs/dir.c
@@ -480,6 +480,8 @@ static char *sysfs_pathname(struct sysfs_dirent *sd, char *path)
  */
 int sysfs_add_one(struct sysfs_addrm_cxt *acxt, struct sysfs_dirent *sd)
 {
+	struct iattr *ps_iattr;
+
 	if (sysfs_find_dirent(acxt->parent_sd, sd->s_name)) {
 		char *path = kzalloc(PATH_MAX, GFP_KERNEL);
 		WARN(1, KERN_WARNING
@@ -500,6 +502,11 @@ int sysfs_add_one(struct sysfs_addrm_cxt *acxt, struct sysfs_dirent *sd)
 
 	sysfs_link_sibling(sd);
 
+	/* Update timestamps on the parent */
+	ps_iattr = acxt->parent_sd->s_iattr;
+	if (ps_iattr)
+		ps_iattr->ia_ctime = ps_iattr->ia_mtime = CURRENT_TIME;
+
 	return 0;
 }
 
@@ -520,10 +527,17 @@ int sysfs_add_one(struct sysfs_addrm_cxt *acxt, struct sysfs_dirent *sd)
  */
 void sysfs_remove_one(struct sysfs_addrm_cxt *acxt, struct sysfs_dirent *sd)
 {
+	struct iattr *ps_iattr;
+
 	BUG_ON(sd->s_flags & SYSFS_FLAG_REMOVED);
 
 	sysfs_unlink_sibling(sd);
 
+	/* Update timestamps on the parent */
+	ps_iattr = acxt->parent_sd->s_iattr;
+	if (ps_iattr)
+		ps_iattr->ia_ctime = ps_iattr->ia_mtime = CURRENT_TIME;
+
 	sd->s_flags |= SYSFS_FLAG_REMOVED;
 	sd->s_sibling = acxt->removed;
 	acxt->removed = sd;
-- 
1.6.3.1.54.g99dd.dirty


^ permalink raw reply related	[flat|nested] 200+ messages in thread
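
The rule patch 11 above adds is small: touch the parent's cached timestamps only if the parent already has a cached attribute block. A minimal userspace sketch follows, with hypothetical `demo_*` types standing in for sysfs_dirent and struct iattr, and the current time passed in explicitly rather than read from CURRENT_TIME.

```c
#include <stddef.h>
#include <time.h>

/* Hypothetical stand-ins for struct iattr and struct sysfs_dirent. */
struct demo_iattr {
	time_t ctime;
	time_t mtime;
};

struct demo_dirent {
	struct demo_dirent *parent;
	struct demo_iattr *iattr;	/* NULL until attributes are customized */
};

/* Called after linking or unlinking sd under its parent. */
void demo_touch_parent(struct demo_dirent *sd, time_t now)
{
	struct demo_iattr *ps_iattr = sd->parent ? sd->parent->iattr : NULL;

	/* Nothing is cached for a parent with default attributes. */
	if (ps_iattr)
		ps_iattr->ctime = ps_iattr->mtime = now;
}
```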

* [PATCH 12/24] sysfs: Nicely indent sysfs_symlink_inode_operations
  2009-05-28 22:58         ` [PATCH 0/24] sysfs cleanups Eric W. Biederman
                             ` (10 preceding siblings ...)
  2009-05-28 23:00           ` [PATCH 11/24] sysfs: Update s_iattr on link and unlink Eric W. Biederman
@ 2009-05-28 23:00           ` Eric W. Biederman
  2009-05-28 23:00           ` [PATCH 13/24] sysfs: Implement sysfs_getattr & sysfs_permission Eric W. Biederman
                             ` (12 subsequent siblings)
  24 siblings, 0 replies; 200+ messages in thread
From: Eric W. Biederman @ 2009-05-28 23:00 UTC (permalink / raw)
  To: Andrew Morton, Greg Kroah-Hartman
  Cc: linux-kernel, Tejun Heo, Cornelia Huck, linux-fsdevel,
	Kay Sievers, Greg KH, Eric W. Biederman, Eric W. Biederman

From: Eric W. Biederman <ebiederm@xmission.com>

Lining up the functions in sysfs_symlink_inode_operations
follows the pattern in the rest of sysfs and makes things
slightly more readable.

Acked-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Eric W. Biederman <ebiederm@aristanetworks.com>
---
 fs/sysfs/symlink.c |    6 +++---
 1 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/fs/sysfs/symlink.c b/fs/sysfs/symlink.c
index ac13e61..0367ed1 100644
--- a/fs/sysfs/symlink.c
+++ b/fs/sysfs/symlink.c
@@ -198,9 +198,9 @@ static void sysfs_put_link(struct dentry *dentry, struct nameidata *nd, void *co
 }
 
 const struct inode_operations sysfs_symlink_inode_operations = {
-	.readlink = generic_readlink,
-	.follow_link = sysfs_follow_link,
-	.put_link = sysfs_put_link,
+	.readlink	= generic_readlink,
+	.follow_link	= sysfs_follow_link,
+	.put_link	= sysfs_put_link,
 };
 
 
-- 
1.6.3.1.54.g99dd.dirty


^ permalink raw reply related	[flat|nested] 200+ messages in thread

* [PATCH 13/24] sysfs: Implement sysfs_getattr & sysfs_permission
  2009-05-28 22:58         ` [PATCH 0/24] sysfs cleanups Eric W. Biederman
                             ` (11 preceding siblings ...)
  2009-05-28 23:00           ` [PATCH 12/24] sysfs: Nicely indent sysfs_symlink_inode_operations Eric W. Biederman
@ 2009-05-28 23:00           ` Eric W. Biederman
  2009-05-28 23:00           ` [PATCH 14/24] sysfs: In sysfs_chmod_file lazily propagate the mode change Eric W. Biederman
                             ` (11 subsequent siblings)
  24 siblings, 0 replies; 200+ messages in thread
From: Eric W. Biederman @ 2009-05-28 23:00 UTC (permalink / raw)
  To: Andrew Morton, Greg Kroah-Hartman
  Cc: linux-kernel, Tejun Heo, Cornelia Huck, linux-fsdevel,
	Kay Sievers, Greg KH, Eric W. Biederman, Eric W. Biederman

From: Eric W. Biederman <ebiederm@xmission.com>

With the implementation of sysfs_getattr and sysfs_permission
sysfs becomes able to lazily propagate inode attribute changes
from the sysfs_dirents to the vfs inodes.  This paves the way
for deleting significant chunks of now unnecessary code.

Acked-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Eric W. Biederman <ebiederm@aristanetworks.com>
---
 fs/sysfs/dir.c     |    2 +
 fs/sysfs/inode.c   |   54 ++++++++++++++++++++++++++++++++++++++++-----------
 fs/sysfs/symlink.c |    3 ++
 fs/sysfs/sysfs.h   |    2 +
 4 files changed, 49 insertions(+), 12 deletions(-)

diff --git a/fs/sysfs/dir.c b/fs/sysfs/dir.c
index c6472d8..b75c938 100644
--- a/fs/sysfs/dir.c
+++ b/fs/sysfs/dir.c
@@ -763,6 +763,8 @@ static struct dentry * sysfs_lookup(struct inode *dir, struct dentry *dentry,
 const struct inode_operations sysfs_dir_inode_operations = {
 	.lookup		= sysfs_lookup,
 	.setattr	= sysfs_setattr,
+	.getattr	= sysfs_getattr,
+	.permission	= sysfs_permission,
 };
 
 static void remove_dir(struct sysfs_dirent *dir_sd)
diff --git a/fs/sysfs/inode.c b/fs/sysfs/inode.c
index dd154cb..1b7ed3c 100644
--- a/fs/sysfs/inode.c
+++ b/fs/sysfs/inode.c
@@ -35,6 +35,8 @@ static struct backing_dev_info sysfs_backing_dev_info = {
 
 static const struct inode_operations sysfs_inode_operations ={
 	.setattr	= sysfs_setattr,
+	.getattr	= sysfs_getattr,
+	.permission	= sysfs_permission,
 };
 
 int __init sysfs_inode_init(void)
@@ -123,7 +125,6 @@ static inline void set_default_inode_attr(struct inode * inode, mode_t mode)
 
 static inline void set_inode_attr(struct inode * inode, struct iattr * iattr)
 {
-	inode->i_mode = iattr->ia_mode;
 	inode->i_uid = iattr->ia_uid;
 	inode->i_gid = iattr->ia_gid;
 	inode->i_atime = iattr->ia_atime;
@@ -154,6 +155,33 @@ static int sysfs_count_nlink(struct sysfs_dirent *sd)
 	return nr + 2;
 }
 
+static void sysfs_refresh_inode(struct sysfs_dirent *sd, struct inode *inode)
+{
+	inode->i_mode = sd->s_mode;
+	if (sd->s_iattr) {
+		/* sysfs_dirent has non-default attributes
+		 * get them from persistent copy in sysfs_dirent
+		 */
+		set_inode_attr(inode, sd->s_iattr);
+	}
+
+	if (sysfs_type(sd) == SYSFS_DIR)
+		inode->i_nlink = sysfs_count_nlink(sd);
+}
+
+int sysfs_getattr(struct vfsmount *mnt, struct dentry *dentry, struct kstat *stat)
+{
+	struct sysfs_dirent *sd = dentry->d_fsdata;
+	struct inode *inode = dentry->d_inode;
+
+	mutex_lock(&sysfs_mutex);
+	sysfs_refresh_inode(sd, inode);
+	mutex_unlock(&sysfs_mutex);
+
+	generic_fillattr(inode, stat);
+	return 0;
+}
+
 static void sysfs_init_inode(struct sysfs_dirent *sd, struct inode *inode)
 {
 	struct bin_attribute *bin_attr;
@@ -162,25 +190,16 @@ static void sysfs_init_inode(struct sysfs_dirent *sd, struct inode *inode)
 	inode->i_mapping->a_ops = &sysfs_aops;
 	inode->i_mapping->backing_dev_info = &sysfs_backing_dev_info;
 	inode->i_op = &sysfs_inode_operations;
-	inode->i_ino = sd->s_ino;
 	lockdep_set_class(&inode->i_mutex, &sysfs_inode_imutex_key);
 
-	if (sd->s_iattr) {
-		/* sysfs_dirent has non-default attributes
-		 * get them for the new inode from persistent copy
-		 * in sysfs_dirent
-		 */
-		set_inode_attr(inode, sd->s_iattr);
-	} else
-		set_default_inode_attr(inode, sd->s_mode);
-
+	set_default_inode_attr(inode, sd->s_mode);
+	sysfs_refresh_inode(sd, inode);
 
 	/* initialize inode according to type */
 	switch (sysfs_type(sd)) {
 	case SYSFS_DIR:
 		inode->i_op = &sysfs_dir_inode_operations;
 		inode->i_fop = &sysfs_dir_operations;
-		inode->i_nlink = sysfs_count_nlink(sd);
 		break;
 	case SYSFS_KOBJ_ATTR:
 		inode->i_size = PAGE_SIZE;
@@ -263,3 +282,14 @@ int sysfs_hash_and_remove(struct sysfs_dirent *dir_sd, const char *name)
 	else
 		return -ENOENT;
 }
+
+int sysfs_permission(struct inode *inode, int mask)
+{
+	struct sysfs_dirent *sd = inode->i_private;
+
+	mutex_lock(&sysfs_mutex);
+	sysfs_refresh_inode(sd, inode);
+	mutex_unlock(&sysfs_mutex);
+
+	return generic_permission(inode, mask, NULL);
+}
diff --git a/fs/sysfs/symlink.c b/fs/sysfs/symlink.c
index 0367ed1..05e4984 100644
--- a/fs/sysfs/symlink.c
+++ b/fs/sysfs/symlink.c
@@ -201,6 +201,9 @@ const struct inode_operations sysfs_symlink_inode_operations = {
 	.readlink	= generic_readlink,
 	.follow_link	= sysfs_follow_link,
 	.put_link	= sysfs_put_link,
+	.setattr	= sysfs_setattr,
+	.getattr	= sysfs_getattr,
+	.permission	= sysfs_permission,
 };
 
 
diff --git a/fs/sysfs/sysfs.h b/fs/sysfs/sysfs.h
index 043bb13..f5b53cf 100644
--- a/fs/sysfs/sysfs.h
+++ b/fs/sysfs/sysfs.h
@@ -148,6 +148,8 @@ struct inode *sysfs_get_inode(struct sysfs_dirent *sd);
 void sysfs_delete_inode(struct inode *inode);
 int sysfs_sd_setattr(struct sysfs_dirent *sd, struct iattr *iattr);
 int sysfs_setattr(struct dentry *dentry, struct iattr *iattr);
+int sysfs_getattr(struct vfsmount *mnt, struct dentry *dentry, struct kstat *stat);
+int sysfs_permission(struct inode *inode, int mask);
 int sysfs_hash_and_remove(struct sysfs_dirent *dir_sd, const char *name);
 int sysfs_inode_init(void);
 
-- 
1.6.3.1.54.g99dd.dirty


^ permalink raw reply related	[flat|nested] 200+ messages in thread
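
The lazy propagation in patch 13 above amounts to "copy from the authoritative record each time somebody asks". Here is a minimal userspace sketch of that idea. The `demo_*` structures are hypothetical and far smaller than the kernel ones, and the sysfs_mutex that the patch takes around the refresh is left out.

```c
#include <stddef.h>

/* Hypothetical, cut-down stand-ins for the kernel structures. */
struct demo_iattr {
	unsigned int uid;
	unsigned int gid;
};

struct demo_dirent {			/* authoritative copy */
	unsigned int mode;
	struct demo_iattr *iattr;	/* NULL means default attributes */
	int is_dir;
	int nr_subdirs;
};

struct demo_inode {			/* cached copy */
	unsigned int mode, uid, gid, nlink;
};

struct demo_stat {
	unsigned int mode, uid, gid, nlink;
};

/* Bring the cached inode up to date with the dirent. */
void demo_refresh_inode(const struct demo_dirent *sd, struct demo_inode *inode)
{
	inode->mode = sd->mode;
	if (sd->iattr) {
		inode->uid = sd->iattr->uid;
		inode->gid = sd->iattr->gid;
	}
	if (sd->is_dir)
		inode->nlink = (unsigned int)sd->nr_subdirs + 2;
}

/* Refresh first, then report what the inode now says. */
void demo_getattr(const struct demo_dirent *sd, struct demo_inode *inode,
		  struct demo_stat *st)
{
	demo_refresh_inode(sd, inode);
	st->mode = inode->mode;
	st->uid = inode->uid;
	st->gid = inode->gid;
	st->nlink = inode->nlink;
}
```

Because every reader refreshes before looking, a writer only has to update the dirent and never needs to find or lock the inode.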

* [PATCH 14/24] sysfs: In sysfs_chmod_file lazily propagate the mode change.
  2009-05-28 22:58         ` [PATCH 0/24] sysfs cleanups Eric W. Biederman
                             ` (12 preceding siblings ...)
  2009-05-28 23:00           ` [PATCH 13/24] sysfs: Implement sysfs_getattr & sysfs_permission Eric W. Biederman
@ 2009-05-28 23:00           ` Eric W. Biederman
  2009-05-28 23:00           ` [PATCH 15/24] sysfs: Kill sysfs_addrm_start and sysfs_addrm_finish Eric W. Biederman
                             ` (10 subsequent siblings)
  24 siblings, 0 replies; 200+ messages in thread
From: Eric W. Biederman @ 2009-05-28 23:00 UTC (permalink / raw)
  To: Andrew Morton, Greg Kroah-Hartman
  Cc: linux-kernel, Tejun Heo, Cornelia Huck, linux-fsdevel,
	Kay Sievers, Greg KH, Eric W. Biederman, Eric W. Biederman

From: Eric W. Biederman <ebiederm@xmission.com>

Now that sysfs_getattr and sysfs_permission refresh the vfs
inode there is no need to immediately push the mode change
into the vfs cache.  This reduces the amount of work needed
and simplifies the locking.

Acked-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Eric W. Biederman <ebiederm@aristanetworks.com>
---
 fs/sysfs/file.c |   31 ++++++++-----------------------
 1 files changed, 8 insertions(+), 23 deletions(-)

diff --git a/fs/sysfs/file.c b/fs/sysfs/file.c
index 0786b41..31cfe1d 100644
--- a/fs/sysfs/file.c
+++ b/fs/sysfs/file.c
@@ -577,38 +577,23 @@ EXPORT_SYMBOL_GPL(sysfs_add_file_to_group);
  */
 int sysfs_chmod_file(struct kobject *kobj, struct attribute *attr, mode_t mode)
 {
-	struct sysfs_dirent *victim_sd = NULL;
-	struct dentry *victim = NULL;
-	struct inode * inode;
+	struct sysfs_dirent *sd;
 	struct iattr newattrs;
 	int rc;
 
-	rc = -ENOENT;
-	victim_sd = sysfs_get_dirent(kobj->sd, attr->name);
-	if (!victim_sd)
-		goto out;
+	mutex_lock(&sysfs_mutex);
 
-	mutex_lock(&sysfs_rename_mutex);
-	victim = sysfs_get_dentry(victim_sd);
-	mutex_unlock(&sysfs_rename_mutex);
-	if (IS_ERR(victim)) {
-		rc = PTR_ERR(victim);
-		victim = NULL;
+	rc = -ENOENT;
+	sd = sysfs_find_dirent(kobj->sd, attr->name);
+	if (!sd)
 		goto out;
-	}
-
-	inode = victim->d_inode;
 
-	mutex_lock(&inode->i_mutex);
-
-	newattrs.ia_mode = (mode & S_IALLUGO) | (inode->i_mode & ~S_IALLUGO);
+	newattrs.ia_mode = (mode & S_IALLUGO) | (sd->s_mode & ~S_IALLUGO);
 	newattrs.ia_valid = ATTR_MODE;
-	rc = sysfs_setattr(victim, &newattrs);
+	rc = sysfs_sd_setattr(sd, &newattrs);
 
-	mutex_unlock(&inode->i_mutex);
  out:
-	dput(victim);
-	sysfs_put(victim_sd);
+	mutex_unlock(&sysfs_mutex);
 	return rc;
 }
 EXPORT_SYMBOL_GPL(sysfs_chmod_file);
-- 
1.6.3.1.54.g99dd.dirty


^ permalink raw reply related	[flat|nested] 200+ messages in thread
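
The mode computation that patch 14 above moves from inode->i_mode to sd->s_mode can be checked in isolation. In this minimal sketch the `DEMO_*` constants are assumptions that mirror the conventional octal values of S_IALLUGO, S_IFREG and S_IFDIR, and `demo_chmod_mode` is a hypothetical helper, not a kernel function.

```c
/* Assumed values, mirroring the usual octal encoding of mode bits. */
#define DEMO_IALLUGO 07777u	/* setuid, setgid, sticky and rwx bits */
#define DEMO_IFREG   0100000u	/* regular file type bit */
#define DEMO_IFDIR   0040000u	/* directory type bit */

/*
 * Take the permission bits from the request and everything else
 * (the file type) from the existing mode.
 */
unsigned int demo_chmod_mode(unsigned int old_mode, unsigned int requested)
{
	return (requested & DEMO_IALLUGO) | (old_mode & ~DEMO_IALLUGO);
}
```

A caller therefore cannot change the file type through chmod: any type bits in the requested mode are masked off.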

* [PATCH 15/24] sysfs: Kill sysfs_addrm_start and sysfs_addrm_finish
  2009-05-28 22:58         ` [PATCH 0/24] sysfs cleanups Eric W. Biederman
                             ` (13 preceding siblings ...)
  2009-05-28 23:00           ` [PATCH 14/24] sysfs: In sysfs_chmod_file lazily propagate the mode change Eric W. Biederman
@ 2009-05-28 23:00           ` Eric W. Biederman
  2009-05-28 23:00           ` [PATCH 16/24] sysfs: Propagate renames to the vfs on demand Eric W. Biederman
                             ` (9 subsequent siblings)
  24 siblings, 0 replies; 200+ messages in thread
From: Eric W. Biederman @ 2009-05-28 23:00 UTC (permalink / raw)
  To: Andrew Morton, Greg Kroah-Hartman
  Cc: linux-kernel, Tejun Heo, Cornelia Huck, linux-fsdevel,
	Kay Sievers, Greg KH, Eric W. Biederman, Eric W. Biederman

From: Eric W. Biederman <ebiederm@xmission.com>

With lazy inode updates and dentry operations bringing everything
into sync on demand there is no longer any need to immediately
update the vfs or grab i_mutex to protect those updates as we
make changes to sysfs.

So stop updating the vfs inodes and move what remains of
sysfs_addrm_start and sysfs_addrm_finish (just barely more than taking
the sysfs_mutex) into sysfs_add_one and sysfs_remove_one.

Acked-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Eric W. Biederman <ebiederm@aristanetworks.com>
---
 fs/sysfs/dir.c     |  188 +++++++---------------------------------------------
 fs/sysfs/file.c    |    6 +--
 fs/sysfs/inode.c   |   16 ++---
 fs/sysfs/symlink.c |    6 +--
 fs/sysfs/sysfs.h   |   17 +----
 5 files changed, 32 insertions(+), 201 deletions(-)

diff --git a/fs/sysfs/dir.c b/fs/sysfs/dir.c
index b75c938..0cf3fad 100644
--- a/fs/sysfs/dir.c
+++ b/fs/sysfs/dir.c
@@ -382,62 +382,6 @@ struct sysfs_dirent *sysfs_new_dirent(const char *name, umode_t mode, int type)
 	return NULL;
 }
 
-static int sysfs_ilookup_test(struct inode *inode, void *arg)
-{
-	struct sysfs_dirent *sd = arg;
-	return inode->i_ino == sd->s_ino;
-}
-
-/**
- *	sysfs_addrm_start - prepare for sysfs_dirent add/remove
- *	@acxt: pointer to sysfs_addrm_cxt to be used
- *	@parent_sd: parent sysfs_dirent
- *
- *	This function is called when the caller is about to add or
- *	remove sysfs_dirent under @parent_sd.  This function acquires
- *	sysfs_mutex, grabs inode for @parent_sd if available and lock
- *	i_mutex of it.  @acxt is used to keep and pass context to
- *	other addrm functions.
- *
- *	LOCKING:
- *	Kernel thread context (may sleep).  sysfs_mutex is locked on
- *	return.  i_mutex of parent inode is locked on return if
- *	available.
- */
-void sysfs_addrm_start(struct sysfs_addrm_cxt *acxt,
-		       struct sysfs_dirent *parent_sd)
-{
-	struct inode *inode;
-
-	memset(acxt, 0, sizeof(*acxt));
-	acxt->parent_sd = parent_sd;
-
-	/* Lookup parent inode.  inode initialization is protected by
-	 * sysfs_mutex, so inode existence can be determined by
-	 * looking up inode while holding sysfs_mutex.
-	 */
-	mutex_lock(&sysfs_mutex);
-
-	inode = ilookup5(sysfs_sb, parent_sd->s_ino, sysfs_ilookup_test,
-			 parent_sd);
-	if (inode) {
-		WARN_ON(inode->i_state & I_NEW);
-
-		/* parent inode available */
-		acxt->parent_inode = inode;
-
-		/* sysfs_mutex is below i_mutex in lock hierarchy.
-		 * First, trylock i_mutex.  If fails, unlock
-		 * sysfs_mutex and lock them in order.
-		 */
-		if (!mutex_trylock(&inode->i_mutex)) {
-			mutex_unlock(&sysfs_mutex);
-			mutex_lock(&inode->i_mutex);
-			mutex_lock(&sysfs_mutex);
-		}
-	}
-}
-
 /**
  *	sysfs_pathname - return full path to sysfs dirent
  *	@sd: sysfs_dirent whose path we want
@@ -460,161 +404,83 @@ static char *sysfs_pathname(struct sysfs_dirent *sd, char *path)
 
 /**
  *	sysfs_add_one - add sysfs_dirent to parent
- *	@acxt: addrm context to use
+ *	@parent_sd: directory to add @sd into
  *	@sd: sysfs_dirent to be added
  *
- *	Get @acxt->parent_sd and set sd->s_parent to it and increment
+ *	Get @parent_sd and set sd->s_parent to it and increment
  *	nlink of parent inode if @sd is a directory and link into the
  *	children list of the parent.
  *
- *	This function should be called between calls to
- *	sysfs_addrm_start() and sysfs_addrm_finish() and should be
- *	passed the same @acxt as passed to sysfs_addrm_start().
- *
  *	LOCKING:
- *	Determined by sysfs_addrm_start().
+ *	Kernel thread context (may sleep).  Grabs sysfs_mutex.
  *
  *	RETURNS:
  *	0 on success, -EEXIST if entry with the given name already
  *	exists.
  */
-int sysfs_add_one(struct sysfs_addrm_cxt *acxt, struct sysfs_dirent *sd)
+int sysfs_add_one(struct sysfs_dirent *parent_sd, struct sysfs_dirent *sd)
 {
 	struct iattr *ps_iattr;
 
-	if (sysfs_find_dirent(acxt->parent_sd, sd->s_name)) {
-		char *path = kzalloc(PATH_MAX, GFP_KERNEL);
+	mutex_lock(&sysfs_mutex);
+	if (sysfs_find_dirent(parent_sd, sd->s_name)) {
+		char *path;
+		mutex_unlock(&sysfs_mutex);
+
+		path = kzalloc(PATH_MAX, GFP_KERNEL);
 		WARN(1, KERN_WARNING
 		     "sysfs: cannot create duplicate filename '%s'\n",
 		     (path == NULL) ? sd->s_name :
-		     strcat(strcat(sysfs_pathname(acxt->parent_sd, path), "/"),
+		     strcat(strcat(sysfs_pathname(parent_sd, path), "/"),
 		            sd->s_name));
 		kfree(path);
 		return -EEXIST;
 	}
 
-	sd->s_parent = sysfs_get(acxt->parent_sd);
-
-	if (sysfs_type(sd) == SYSFS_DIR && acxt->parent_inode)
-		inc_nlink(acxt->parent_inode);
-
-	acxt->cnt++;
-
+	sd->s_parent = sysfs_get(parent_sd);
 	sysfs_link_sibling(sd);
 
 	/* Update timestamps on the parent */
-	ps_iattr = acxt->parent_sd->s_iattr;
+	ps_iattr = parent_sd->s_iattr;
 	if (ps_iattr)
 		ps_iattr->ia_ctime = ps_iattr->ia_mtime = CURRENT_TIME;
 
+	mutex_unlock(&sysfs_mutex);
 	return 0;
 }
 
 /**
  *	sysfs_remove_one - remove sysfs_dirent from parent
- *	@acxt: addrm context to use
  *	@sd: sysfs_dirent to be removed
  *
  *	Mark @sd removed and drop nlink of parent inode if @sd is a
  *	directory.  @sd is unlinked from the children list.
  *
- *	This function should be called between calls to
- *	sysfs_addrm_start() and sysfs_addrm_finish() and should be
- *	passed the same @acxt as passed to sysfs_addrm_start().
- *
  *	LOCKING:
- *	Determined by sysfs_addrm_start().
+ *	Kernel thread context (may sleep).  Grabs sysfs_mutex.
  */
-void sysfs_remove_one(struct sysfs_addrm_cxt *acxt, struct sysfs_dirent *sd)
+void sysfs_remove_one(struct sysfs_dirent *sd)
 {
 	struct iattr *ps_iattr;
 
 	BUG_ON(sd->s_flags & SYSFS_FLAG_REMOVED);
 
+	mutex_lock(&sysfs_mutex);
+
 	sysfs_unlink_sibling(sd);
 
 	/* Update timestamps on the parent */
-	ps_iattr = acxt->parent_sd->s_iattr;
+	ps_iattr = sd->s_parent->s_iattr;
 	if (ps_iattr)
 		ps_iattr->ia_ctime = ps_iattr->ia_mtime = CURRENT_TIME;
 
 	sd->s_flags |= SYSFS_FLAG_REMOVED;
-	sd->s_sibling = acxt->removed;
-	acxt->removed = sd;
-
-	if (sysfs_type(sd) == SYSFS_DIR && acxt->parent_inode)
-		drop_nlink(acxt->parent_inode);
-
-	acxt->cnt++;
-}
-
-/**
- *	sysfs_dec_nlink - Decrement link count for the specified sysfs_dirent
- *	@sd: target sysfs_dirent
- *
- *	Decrement nlink for @sd.  @sd must have been unlinked from its
- *	parent on entry to this function such that it can't be looked
- *	up anymore.
- */
-static void sysfs_dec_nlink(struct sysfs_dirent *sd)
-{
-	struct inode *inode;
-
-	inode = ilookup(sysfs_sb, sd->s_ino);
-	if (!inode)
-		return;
-
-	/* adjust nlink and update timestamp */
-	mutex_lock(&inode->i_mutex);
-
-	inode->i_ctime = CURRENT_TIME;
-	drop_nlink(inode);
-	if (sysfs_type(sd) == SYSFS_DIR)
-		drop_nlink(inode);
-
-	mutex_unlock(&inode->i_mutex);
-
-	iput(inode);
-}
 
-/**
- *	sysfs_addrm_finish - finish up sysfs_dirent add/remove
- *	@acxt: addrm context to finish up
- *
- *	Finish up sysfs_dirent add/remove.  Resources acquired by
- *	sysfs_addrm_start() are released and removed sysfs_dirents are
- *	cleaned up.  Timestamps on the parent inode are updated.
- *
- *	LOCKING:
- *	All mutexes acquired by sysfs_addrm_start() are released.
- */
-void sysfs_addrm_finish(struct sysfs_addrm_cxt *acxt)
-{
-	/* release resources acquired by sysfs_addrm_start() */
 	mutex_unlock(&sysfs_mutex);
-	if (acxt->parent_inode) {
-		struct inode *inode = acxt->parent_inode;
 
-		/* if added/removed, update timestamps on the parent */
-		if (acxt->cnt)
-			inode->i_ctime = inode->i_mtime = CURRENT_TIME;
-
-		mutex_unlock(&inode->i_mutex);
-		iput(inode);
-	}
-
-	/* kill removed sysfs_dirents */
-	while (acxt->removed) {
-		struct sysfs_dirent *sd = acxt->removed;
-
-		acxt->removed = sd->s_sibling;
-		sd->s_sibling = NULL;
-
-		sysfs_dec_nlink(sd);
-		sysfs_deactivate(sd);
-		unmap_bin_file(sd);
-		sysfs_put(sd);
-	}
+	sysfs_deactivate(sd);
+	unmap_bin_file(sd);
+	sysfs_put(sd);
 }
 
 /**
@@ -673,7 +539,6 @@ static int create_dir(struct kobject *kobj, struct sysfs_dirent *parent_sd,
 		      const char *name, struct sysfs_dirent **p_sd)
 {
 	umode_t mode = S_IFDIR| S_IRWXU | S_IRUGO | S_IXUGO;
-	struct sysfs_addrm_cxt acxt;
 	struct sysfs_dirent *sd;
 	int rc;
 
@@ -684,10 +549,8 @@ static int create_dir(struct kobject *kobj, struct sysfs_dirent *parent_sd,
 	sd->s_dir.kobj = kobj;
 
 	/* link in */
-	sysfs_addrm_start(&acxt, parent_sd);
-	rc = sysfs_add_one(&acxt, sd);
-	sysfs_addrm_finish(&acxt);
 
+	rc = sysfs_add_one(parent_sd, sd);
 	if (rc == 0)
 		*p_sd = sd;
 	else
@@ -787,9 +650,7 @@ static void remove_dir(struct sysfs_dirent *dir_sd)
 		mutex_unlock(&sysfs_mutex);
 	}
 
-	sysfs_addrm_start(&acxt, dir_sd->s_parent);
 	sysfs_remove_one(&acxt, dir_sd);
-	sysfs_addrm_finish(&acxt);
 }
 
 void sysfs_remove_subdir(struct sysfs_dirent *sd)
@@ -818,14 +679,11 @@ static struct sysfs_dirent *get_dirent_to_remove(struct sysfs_dirent *dir_sd)
 
 static void __sysfs_remove_dir(struct sysfs_dirent *dir_sd)
 {
-	struct sysfs_addrm_cxt acxt;
 	struct sysfs_dirent *sd;
 
 	/* Remove children that we think are safe */
 	while ((sd = get_dirent_to_remove(dir_sd))) {
-		sysfs_addrm_start(&acxt, sd->s_parent);
 		sysfs_remove_one(&acxt, sd);
-		sysfs_addrm_finish(&acxt);
 		sysfs_put(sd);
 	}
 
diff --git a/fs/sysfs/file.c b/fs/sysfs/file.c
index 31cfe1d..b512ce6 100644
--- a/fs/sysfs/file.c
+++ b/fs/sysfs/file.c
@@ -499,7 +499,6 @@ int sysfs_add_file_mode(struct sysfs_dirent *dir_sd,
 			const struct attribute *attr, int type, mode_t amode)
 {
 	umode_t mode = (amode & S_IALLUGO) | S_IFREG;
-	struct sysfs_addrm_cxt acxt;
 	struct sysfs_dirent *sd;
 	int rc;
 
@@ -508,10 +507,7 @@ int sysfs_add_file_mode(struct sysfs_dirent *dir_sd,
 		return -ENOMEM;
 	sd->s_attr.attr = (void *)attr;
 
-	sysfs_addrm_start(&acxt, dir_sd);
-	rc = sysfs_add_one(&acxt, sd);
-	sysfs_addrm_finish(&acxt);
-
+	rc = sysfs_add_one(dir_sd, sd);
 	if (rc)
 		sysfs_put(sd);
 
diff --git a/fs/sysfs/inode.c b/fs/sysfs/inode.c
index 1b7ed3c..ad9a30d 100644
--- a/fs/sysfs/inode.c
+++ b/fs/sysfs/inode.c
@@ -263,23 +263,17 @@ void sysfs_delete_inode(struct inode *inode)
 
 int sysfs_hash_and_remove(struct sysfs_dirent *dir_sd, const char *name)
 {
-	struct sysfs_addrm_cxt acxt;
 	struct sysfs_dirent *sd;
 
 	if (!dir_sd)
 		return -ENOENT;
 
-	sysfs_addrm_start(&acxt, dir_sd);
-
-	sd = sysfs_find_dirent(dir_sd, name);
-	if (sd)
-		sysfs_remove_one(&acxt, sd);
-
-	sysfs_addrm_finish(&acxt);
-
-	if (sd)
+	sd = sysfs_get_dirent(dir_sd, name);
+	if (sd) {
+		sysfs_remove_one(sd);
+		sysfs_put(sd);
 		return 0;
-	else
+	} else
 		return -ENOENT;
 }
 
diff --git a/fs/sysfs/symlink.c b/fs/sysfs/symlink.c
index 05e4984..fc5fc86 100644
--- a/fs/sysfs/symlink.c
+++ b/fs/sysfs/symlink.c
@@ -31,7 +31,6 @@ int sysfs_create_link(struct kobject *kobj, struct kobject *target,
 	struct sysfs_dirent *parent_sd = NULL;
 	struct sysfs_dirent *target_sd = NULL;
 	struct sysfs_dirent *sd = NULL;
-	struct sysfs_addrm_cxt acxt;
 	int error;
 
 	BUG_ON(!name);
@@ -65,10 +64,7 @@ int sysfs_create_link(struct kobject *kobj, struct kobject *target,
 	sd->s_symlink.target_sd = target_sd;
 	target_sd = NULL;	/* reference is now owned by the symlink */
 
-	sysfs_addrm_start(&acxt, parent_sd);
-	error = sysfs_add_one(&acxt, sd);
-	sysfs_addrm_finish(&acxt);
-
+	error = sysfs_add_one(parent_sd, sd);
 	if (error)
 		goto out_put;
 
diff --git a/fs/sysfs/sysfs.h b/fs/sysfs/sysfs.h
index f5b53cf..f17ebb8 100644
--- a/fs/sysfs/sysfs.h
+++ b/fs/sysfs/sysfs.h
@@ -77,16 +77,6 @@ static inline unsigned int sysfs_type(struct sysfs_dirent *sd)
 }
 
 /*
- * Context structure to be used while adding/removing nodes.
- */
-struct sysfs_addrm_cxt {
-	struct sysfs_dirent	*parent_sd;
-	struct inode		*parent_inode;
-	struct sysfs_dirent	*removed;
-	int			cnt;
-};
-
-/*
  * mount.c
  */
 extern struct sysfs_dirent sysfs_root;
@@ -106,11 +96,8 @@ extern const struct inode_operations sysfs_dir_inode_operations;
 struct dentry *sysfs_get_dentry(struct sysfs_dirent *sd);
 struct sysfs_dirent *sysfs_get_active_two(struct sysfs_dirent *sd);
 void sysfs_put_active_two(struct sysfs_dirent *sd);
-void sysfs_addrm_start(struct sysfs_addrm_cxt *acxt,
-		       struct sysfs_dirent *parent_sd);
-int sysfs_add_one(struct sysfs_addrm_cxt *acxt, struct sysfs_dirent *sd);
-void sysfs_remove_one(struct sysfs_addrm_cxt *acxt, struct sysfs_dirent *sd);
-void sysfs_addrm_finish(struct sysfs_addrm_cxt *acxt);
+int sysfs_add_one(struct sysfs_dirent *parent_sd, struct sysfs_dirent *sd);
+void sysfs_remove_one(struct sysfs_dirent *sd);
 
 struct sysfs_dirent *sysfs_find_dirent(struct sysfs_dirent *parent_sd,
 				       const unsigned char *name);
-- 
1.6.3.1.54.g99dd.dirty


^ permalink raw reply related	[flat|nested] 200+ messages in thread
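
After patch 15 above, adding and removing an entry each take a single lock internally instead of relying on a start/finish pair around them. The following is a minimal userspace sketch of that shape, not the kernel code: the `demo_*` names and the singly linked child list are assumptions, a pthread mutex stands in for sysfs_mutex, and reference counting, deactivation and timestamps are omitted.

```c
#include <errno.h>
#include <pthread.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for struct sysfs_dirent. */
struct demo_dirent {
	const char *name;
	struct demo_dirent *parent;
	struct demo_dirent *children;
	struct demo_dirent *sibling;
	int removed;
};

static pthread_mutex_t demo_mutex = PTHREAD_MUTEX_INITIALIZER;

/* Caller holds demo_mutex. */
static struct demo_dirent *demo_find(struct demo_dirent *parent,
				     const char *name)
{
	struct demo_dirent *sd;

	for (sd = parent->children; sd; sd = sd->sibling)
		if (strcmp(sd->name, name) == 0)
			return sd;
	return NULL;
}

/* Returns 0 on success, -EEXIST if the name is already present. */
int demo_add_one(struct demo_dirent *parent, struct demo_dirent *sd)
{
	pthread_mutex_lock(&demo_mutex);
	if (demo_find(parent, sd->name)) {
		/* Drop the lock before any slow error reporting. */
		pthread_mutex_unlock(&demo_mutex);
		return -EEXIST;
	}
	sd->parent = parent;
	sd->sibling = parent->children;
	parent->children = sd;
	pthread_mutex_unlock(&demo_mutex);
	return 0;
}

/* Unlink sd from its parent and mark it removed. */
void demo_remove_one(struct demo_dirent *sd)
{
	struct demo_dirent **pos;

	pthread_mutex_lock(&demo_mutex);
	for (pos = &sd->parent->children; *pos; pos = &(*pos)->sibling) {
		if (*pos == sd) {
			*pos = sd->sibling;
			sd->sibling = NULL;
			break;
		}
	}
	sd->removed = 1;
	pthread_mutex_unlock(&demo_mutex);
}
```

Note that the removal path finds the parent through the entry itself, which is why the new sysfs_remove_one needs only @sd.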

* [PATCH 16/24] sysfs: Propagate renames to the vfs on demand
  2009-05-28 22:58         ` [PATCH 0/24] sysfs cleanups Eric W. Biederman
                             ` (14 preceding siblings ...)
  2009-05-28 23:00           ` [PATCH 15/24] sysfs: Kill sysfs_addrm_start and sysfs_addrm_finish Eric W. Biederman
@ 2009-05-28 23:00           ` Eric W. Biederman
  2009-05-28 23:00           ` [PATCH 17/24] sysfs: Merge sysfs_rename_dir and sysfs_move_dir Eric W. Biederman
                             ` (8 subsequent siblings)
  24 siblings, 0 replies; 200+ messages in thread
From: Eric W. Biederman @ 2009-05-28 23:00 UTC (permalink / raw)
  To: Andrew Morton, Greg Kroah-Hartman
  Cc: linux-kernel, Tejun Heo, Cornelia Huck, linux-fsdevel,
	Kay Sievers, Greg KH, Eric W. Biederman, Eric W. Biederman

From: Eric W. Biederman <ebiederm@xmission.com>

By teaching sysfs_revalidate to hide a dentry for
a sysfs_dirent if the sysfs_dirent has been renamed,
and by teaching sysfs_lookup to return the original
dentry if the sysfs_dirent has been renamed, I can
show the results of renames correctly without having to
update the dcache during the directory rename.

This massively simplifies the rename logic, allowing a lot
of weird sysfs special cases to be removed along with
a lot of now unnecessary helper code.

Acked-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Eric W. Biederman <ebiederm@aristanetworks.com>
---
 fs/namei.c            |   22 -------
 fs/sysfs/dir.c        |  156 ++++++++++---------------------------------------
 fs/sysfs/inode.c      |   12 ----
 fs/sysfs/sysfs.h      |    1 -
 include/linux/namei.h |    1 -
 5 files changed, 32 insertions(+), 160 deletions(-)

diff --git a/fs/namei.c b/fs/namei.c
index 78f253c..69f559a 100644
--- a/fs/namei.c
+++ b/fs/namei.c
@@ -1260,28 +1260,6 @@ struct dentry *lookup_one_len(const char *name, struct dentry *base, int len)
 	return __lookup_hash(&this, base, NULL);
 }
 
-/**
- * lookup_one_noperm - bad hack for sysfs
- * @name:	pathname component to lookup
- * @base:	base directory to lookup from
- *
- * This is a variant of lookup_one_len that doesn't perform any permission
- * checks.   It's a horrible hack to work around the braindead sysfs
- * architecture and should not be used anywhere else.
- *
- * DON'T USE THIS FUNCTION EVER, thanks.
- */
-struct dentry *lookup_one_noperm(const char *name, struct dentry *base)
-{
-	int err;
-	struct qstr this;
-
-	err = __lookup_one_len(name, &this, base, strlen(name));
-	if (err)
-		return ERR_PTR(err);
-	return __lookup_hash(&this, base, NULL);
-}
-
 int user_path_at(int dfd, const char __user *name, unsigned flags,
 		 struct path *path)
 {
diff --git a/fs/sysfs/dir.c b/fs/sysfs/dir.c
index 0cf3fad..efe8a01 100644
--- a/fs/sysfs/dir.c
+++ b/fs/sysfs/dir.c
@@ -24,7 +24,6 @@
 #include "sysfs.h"
 
 DEFINE_MUTEX(sysfs_mutex);
-DEFINE_MUTEX(sysfs_rename_mutex);
 DEFINE_SPINLOCK(sysfs_assoc_lock);
 
 static DEFINE_SPINLOCK(sysfs_ino_lock);
@@ -84,46 +83,6 @@ static void sysfs_unlink_sibling(struct sysfs_dirent *sd)
 }
 
 /**
- *	sysfs_get_dentry - get dentry for the given sysfs_dirent
- *	@sd: sysfs_dirent of interest
- *
- *	Get dentry for @sd.  Dentry is looked up if currently not
- *	present.  This function descends from the root looking up
- *	dentry for each step.
- *
- *	LOCKING:
- *	mutex_lock(sysfs_rename_mutex)
- *
- *	RETURNS:
- *	Pointer to found dentry on success, ERR_PTR() value on error.
- */
-struct dentry *sysfs_get_dentry(struct sysfs_dirent *sd)
-{
-	struct dentry *dentry = dget(sysfs_sb->s_root);
-
-	while (dentry->d_fsdata != sd) {
-		struct sysfs_dirent *cur;
-		struct dentry *parent;
-
-		/* find the first ancestor which hasn't been looked up */
-		cur = sd;
-		while (cur->s_parent != dentry->d_fsdata)
-			cur = cur->s_parent;
-
-		/* look it up */
-		parent = dentry;
-		mutex_lock(&parent->d_inode->i_mutex);
-		dentry = lookup_one_noperm(cur->s_name, parent);
-		mutex_unlock(&parent->d_inode->i_mutex);
-		dput(parent);
-
-		if (IS_ERR(dentry))
-			break;
-	}
-	return dentry;
-}
-
-/**
  *	sysfs_get_active - get an active reference to sysfs_dirent
  *	@sd: sysfs_dirent to get an active reference to
  *
@@ -311,6 +270,14 @@ static int sysfs_dentry_revalidate(struct dentry *dentry, struct nameidata *nd)
 	if (sd->s_flags & SYSFS_FLAG_REMOVED)
 		goto out_bad;
 
+	/* The sysfs dirent has been moved? */
+	if (dentry->d_parent->d_fsdata != sd->s_parent)
+		goto out_bad;
+
+	/* The sysfs dirent has been renamed */
+	if (strcmp(dentry->d_name.name, sd->s_name) != 0)
+		goto out_bad;
+
 	mutex_unlock(&sysfs_mutex);
 out_valid:
 	return 1;
@@ -318,6 +285,12 @@ out_bad:
 	/* Remove the dentry from the dcache hashes.
 	 * If this is a deleted dentry we use d_drop instead of d_delete
 	 * so sysfs doesn't need to cope with negative dentries.
+	 *
+	 * If this is a dentry that has simply been renamed we
+	 * use d_drop to remove it from the dcache lookup on its
+	 * old parent.  If this dentry persists later when a lookup
+	 * is performed at its new name the dentry will be readded
+	 * to the dcache hashes.
 	 */
 	is_dir = (sysfs_type(sd) == SYSFS_DIR);
 	mutex_unlock(&sysfs_mutex);
@@ -613,10 +586,15 @@ static struct dentry * sysfs_lookup(struct inode *dir, struct dentry *dentry,
 	}
 
 	/* instantiate and hash dentry */
-	dentry->d_op = &sysfs_dentry_ops;
-	dentry->d_fsdata = sysfs_get(sd);
-	d_instantiate(dentry, inode);
-	d_rehash(dentry);
+	ret = d_find_alias(inode);
+	if (!ret) {
+		dentry->d_op = &sysfs_dentry_ops;
+		dentry->d_fsdata = sysfs_get(sd);
+		d_add(dentry, inode);
+	} else {
+		d_move(ret, dentry);
+		iput(inode);
+	}
 
  out_unlock:
 	mutex_unlock(&sysfs_mutex);
@@ -713,62 +691,32 @@ void sysfs_remove_dir(struct kobject * kobj)
 int sysfs_rename_dir(struct kobject * kobj, const char *new_name)
 {
 	struct sysfs_dirent *sd = kobj->sd;
-	struct dentry *parent = NULL;
-	struct dentry *old_dentry = NULL, *new_dentry = NULL;
 	const char *dup_name = NULL;
 	int error;
 
-	mutex_lock(&sysfs_rename_mutex);
+	mutex_lock(&sysfs_mutex);
 
 	error = 0;
 	if (strcmp(sd->s_name, new_name) == 0)
 		goto out;	/* nothing to rename */
 
-	/* get the original dentry */
-	old_dentry = sysfs_get_dentry(sd);
-	if (IS_ERR(old_dentry)) {
-		error = PTR_ERR(old_dentry);
-		old_dentry = NULL;
-		goto out;
-	}
-
-	parent = old_dentry->d_parent;
-
-	/* lock parent and get dentry for new name */
-	mutex_lock(&parent->d_inode->i_mutex);
-	mutex_lock(&sysfs_mutex);
-
 	error = -EEXIST;
 	if (sysfs_find_dirent(sd->s_parent, new_name))
-		goto out_unlock;
-
-	error = -ENOMEM;
-	new_dentry = d_alloc_name(parent, new_name);
-	if (!new_dentry)
-		goto out_unlock;
+		goto out;
 
 	/* rename sysfs_dirent */
 	error = -ENOMEM;
 	new_name = dup_name = kstrdup(new_name, GFP_KERNEL);
 	if (!new_name)
-		goto out_unlock;
+		goto out;
 
 	dup_name = sd->s_name;
 	sd->s_name = new_name;
 
-	/* rename */
-	d_add(new_dentry, NULL);
-	d_move(old_dentry, new_dentry);
-
 	error = 0;
- out_unlock:
+ out:
 	mutex_unlock(&sysfs_mutex);
-	mutex_unlock(&parent->d_inode->i_mutex);
 	kfree(dup_name);
-	dput(old_dentry);
-	dput(new_dentry);
- out:
-	mutex_unlock(&sysfs_rename_mutex);
 	return error;
 }
 
@@ -776,54 +724,20 @@ int sysfs_move_dir(struct kobject *kobj, struct kobject *new_parent_kobj)
 {
 	struct sysfs_dirent *sd = kobj->sd;
 	struct sysfs_dirent *new_parent_sd;
-	struct dentry *old_parent, *new_parent = NULL;
-	struct dentry *old_dentry = NULL, *new_dentry = NULL;
 	int error;
 
-	mutex_lock(&sysfs_rename_mutex);
 	BUG_ON(!sd->s_parent);
+
+	mutex_lock(&sysfs_mutex);
 	new_parent_sd = new_parent_kobj->sd ? new_parent_kobj->sd : &sysfs_root;
 
 	error = 0;
 	if (sd->s_parent == new_parent_sd)
 		goto out;	/* nothing to move */
 
-	/* get dentries */
-	old_dentry = sysfs_get_dentry(sd);
-	if (IS_ERR(old_dentry)) {
-		error = PTR_ERR(old_dentry);
-		old_dentry = NULL;
-		goto out;
-	}
-	old_parent = old_dentry->d_parent;
-
-	new_parent = sysfs_get_dentry(new_parent_sd);
-	if (IS_ERR(new_parent)) {
-		error = PTR_ERR(new_parent);
-		new_parent = NULL;
-		goto out;
-	}
-
-again:
-	mutex_lock(&old_parent->d_inode->i_mutex);
-	if (!mutex_trylock(&new_parent->d_inode->i_mutex)) {
-		mutex_unlock(&old_parent->d_inode->i_mutex);
-		goto again;
-	}
-	mutex_lock(&sysfs_mutex);
-
 	error = -EEXIST;
 	if (sysfs_find_dirent(new_parent_sd, sd->s_name))
-		goto out_unlock;
-
-	error = -ENOMEM;
-	new_dentry = d_alloc_name(new_parent, sd->s_name);
-	if (!new_dentry)
-		goto out_unlock;
-
-	error = 0;
-	d_add(new_dentry, NULL);
-	d_move(old_dentry, new_dentry);
+		goto out;
 
 	/* Remove from old parent's list and insert into new parent's list. */
 	sysfs_unlink_sibling(sd);
@@ -832,15 +746,9 @@ again:
 	sd->s_parent = new_parent_sd;
 	sysfs_link_sibling(sd);
 
- out_unlock:
+	error = 0;
+out:
 	mutex_unlock(&sysfs_mutex);
-	mutex_unlock(&new_parent->d_inode->i_mutex);
-	mutex_unlock(&old_parent->d_inode->i_mutex);
- out:
-	dput(new_parent);
-	dput(old_dentry);
-	dput(new_dentry);
-	mutex_unlock(&sysfs_rename_mutex);
 	return error;
 }
 
diff --git a/fs/sysfs/inode.c b/fs/sysfs/inode.c
index ad9a30d..a1917b5 100644
--- a/fs/sysfs/inode.c
+++ b/fs/sysfs/inode.c
@@ -132,17 +132,6 @@ static inline void set_inode_attr(struct inode * inode, struct iattr * iattr)
 	inode->i_ctime = iattr->ia_ctime;
 }
 
-
-/*
- * sysfs has a different i_mutex lock order behavior for i_mutex than other
- * filesystems; sysfs i_mutex is called in many places with subsystem locks
- * held. At the same time, many of the VFS locking rules do not apply to
- * sysfs at all (cross directory rename for example). To untangle this mess
- * (which gives false positives in lockdep), we're giving sysfs inodes their
- * own class for i_mutex.
- */
-static struct lock_class_key sysfs_inode_imutex_key;
-
 static int sysfs_count_nlink(struct sysfs_dirent *sd)
 {
 	struct sysfs_dirent *child;
@@ -190,7 +179,6 @@ static void sysfs_init_inode(struct sysfs_dirent *sd, struct inode *inode)
 	inode->i_mapping->a_ops = &sysfs_aops;
 	inode->i_mapping->backing_dev_info = &sysfs_backing_dev_info;
 	inode->i_op = &sysfs_inode_operations;
-	lockdep_set_class(&inode->i_mutex, &sysfs_inode_imutex_key);
 
 	set_default_inode_attr(inode, sd->s_mode);
 	sysfs_refresh_inode(sd, inode);
diff --git a/fs/sysfs/sysfs.h b/fs/sysfs/sysfs.h
index f17ebb8..2db952c 100644
--- a/fs/sysfs/sysfs.h
+++ b/fs/sysfs/sysfs.h
@@ -87,7 +87,6 @@ extern struct kmem_cache *sysfs_dir_cachep;
  * dir.c
  */
 extern struct mutex sysfs_mutex;
-extern struct mutex sysfs_rename_mutex;
 extern spinlock_t sysfs_assoc_lock;
 
 extern const struct file_operations sysfs_dir_operations;
diff --git a/include/linux/namei.h b/include/linux/namei.h
index fc2e035..758ecfb 100644
--- a/include/linux/namei.h
+++ b/include/linux/namei.h
@@ -76,7 +76,6 @@ extern struct file *nameidata_to_filp(struct nameidata *nd, int flags);
 extern void release_open_intent(struct nameidata *);
 
 extern struct dentry *lookup_one_len(const char *, struct dentry *, int);
-extern struct dentry *lookup_one_noperm(const char *, struct dentry *);
 
 extern int follow_down(struct vfsmount **, struct dentry **);
 extern int follow_up(struct vfsmount **, struct dentry **);
-- 
1.6.3.1.54.g99dd.dirty



* [PATCH 17/24] sysfs: Merge sysfs_rename_dir and sysfs_move_dir
  2009-05-28 22:58         ` [PATCH 0/24] sysfs cleanups Eric W. Biederman
                             ` (15 preceding siblings ...)
  2009-05-28 23:00           ` [PATCH 16/24] sysfs: Propagate renames to the vfs on demand Eric W. Biederman
@ 2009-05-28 23:00           ` Eric W. Biederman
  2009-05-28 23:00           ` [PATCH 18/24] sysfs: Pass super_block to sysfs_get_inode Eric W. Biederman
                             ` (7 subsequent siblings)
  24 siblings, 0 replies; 200+ messages in thread
From: Eric W. Biederman @ 2009-05-28 23:00 UTC (permalink / raw)
  To: Andrew Morton, Greg Kroah-Hartman
  Cc: linux-kernel, Tejun Heo, Cornelia Huck, linux-fsdevel,
	Kay Sievers, Greg KH, Eric W. Biederman, Eric W. Biederman

From: Eric W. Biederman <ebiederm@xmission.com>

These two functions do 90% of the same work, and allowing both the
parent dir and the name to change at the same time doesn't significantly
obfuscate the function.  So merge them together to simplify maintenance
and increase testing.
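
As a rough illustration of why one function can serve both callers, here is a user-space sketch of a combined rename/move. It is not the sysfs implementation: struct node, node_new, find_child and node_rename are hypothetical names, and the sysfs_mutex locking and reference counting of the real code are left out.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for struct sysfs_dirent. */
struct node {
	struct node *parent;
	struct node *children;	/* head of this directory's child list */
	struct node *sibling;	/* next child of the same parent */
	char *name;		/* heap allocated */
};

static char *dup_string(const char *s)
{
	char *copy = malloc(strlen(s) + 1);

	if (copy)
		strcpy(copy, s);
	return copy;
}

static struct node *find_child(struct node *dir, const char *name)
{
	struct node *n;

	for (n = dir->children; n; n = n->sibling)
		if (strcmp(n->name, name) == 0)
			return n;
	return NULL;
}

static void link_child(struct node *n)
{
	n->sibling = n->parent->children;
	n->parent->children = n;
}

static void unlink_child(struct node *n)
{
	struct node **pos;

	for (pos = &n->parent->children; *pos; pos = &(*pos)->sibling) {
		if (*pos == n) {
			*pos = n->sibling;
			n->sibling = NULL;
			break;
		}
	}
}

static struct node *node_new(struct node *parent, const char *name)
{
	struct node *n = calloc(1, sizeof(*n));

	if (!n)
		return NULL;
	n->name = dup_string(name);
	if (!n->name) {
		free(n);
		return NULL;
	}
	n->parent = parent;
	link_child(n);
	return n;
}

/* Change the name, the parent, or both, in one step. */
static int node_rename(struct node *n, struct node *new_parent,
		       const char *new_name)
{
	char *dup = NULL;

	if (n->parent == new_parent && strcmp(n->name, new_name) == 0)
		return 0;		/* nothing to do */
	if (find_child(new_parent, new_name))
		return -EEXIST;

	/* Do the only step that can fail before modifying anything. */
	if (strcmp(n->name, new_name) != 0) {
		dup = dup_string(new_name);
		if (!dup)
			return -ENOMEM;
		free(n->name);
		n->name = dup;
	}
	if (n->parent != new_parent) {
		unlink_child(n);
		n->parent = new_parent;
		link_child(n);
	}
	return 0;
}
```

A pure rename then passes the current parent, and a pure move passes the current name, which is the shape the two remaining wrappers take in the patch.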

Acked-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Eric W. Biederman <ebiederm@aristanetworks.com>
---
 fs/sysfs/dir.c |   66 +++++++++++++++++++++++--------------------------------
 1 files changed, 28 insertions(+), 38 deletions(-)

diff --git a/fs/sysfs/dir.c b/fs/sysfs/dir.c
index efe8a01..1f8fb9c 100644
--- a/fs/sysfs/dir.c
+++ b/fs/sysfs/dir.c
@@ -688,30 +688,42 @@ void sysfs_remove_dir(struct kobject * kobj)
 	__sysfs_remove_dir(sd);
 }
 
-int sysfs_rename_dir(struct kobject * kobj, const char *new_name)
+static int sysfs_mv_dir(struct sysfs_dirent *sd,
+	struct sysfs_dirent *new_parent_sd, const char *new_name)
 {
-	struct sysfs_dirent *sd = kobj->sd;
 	const char *dup_name = NULL;
 	int error;
 
 	mutex_lock(&sysfs_mutex);
 
 	error = 0;
-	if (strcmp(sd->s_name, new_name) == 0)
+	if ((sd->s_parent == new_parent_sd) &&
+	    (strcmp(sd->s_name, new_name) == 0))
 		goto out;	/* nothing to rename */
 
 	error = -EEXIST;
-	if (sysfs_find_dirent(sd->s_parent, new_name))
+	if (sysfs_find_dirent(new_parent_sd, new_name))
 		goto out;
 
 	/* rename sysfs_dirent */
-	error = -ENOMEM;
-	new_name = dup_name = kstrdup(new_name, GFP_KERNEL);
-	if (!new_name)
-		goto out;
+	if (strcmp(sd->s_name, new_name) != 0) {
+		error = -ENOMEM;
+		new_name = dup_name = kstrdup(new_name, GFP_KERNEL);
+		if (!new_name)
+			goto out;
+
+		dup_name = sd->s_name;
+		sd->s_name = new_name;
+	}
 
-	dup_name = sd->s_name;
-	sd->s_name = new_name;
+	/* Remove from old parent's list and insert into new parent's list. */
+	if (sd->s_parent != new_parent_sd) {
+		sysfs_unlink_sibling(sd);
+		sysfs_get(new_parent_sd);
+		sysfs_put(sd->s_parent);
+		sd->s_parent = new_parent_sd;
+		sysfs_link_sibling(sd);
+	}
 
 	error = 0;
  out:
@@ -720,36 +732,14 @@ int sysfs_rename_dir(struct kobject * kobj, const char *new_name)
 	return error;
 }
 
-int sysfs_move_dir(struct kobject *kobj, struct kobject *new_parent_kobj)
+int sysfs_rename_dir(struct kobject * kobj, const char *new_name)
 {
-	struct sysfs_dirent *sd = kobj->sd;
-	struct sysfs_dirent *new_parent_sd;
-	int error;
-
-	BUG_ON(!sd->s_parent);
-
-	mutex_lock(&sysfs_mutex);
-	new_parent_sd = new_parent_kobj->sd ? new_parent_kobj->sd : &sysfs_root;
-
-	error = 0;
-	if (sd->s_parent == new_parent_sd)
-		goto out;	/* nothing to move */
-
-	error = -EEXIST;
-	if (sysfs_find_dirent(new_parent_sd, sd->s_name))
-		goto out;
-
-	/* Remove from old parent's list and insert into new parent's list. */
-	sysfs_unlink_sibling(sd);
-	sysfs_get(new_parent_sd);
-	sysfs_put(sd->s_parent);
-	sd->s_parent = new_parent_sd;
-	sysfs_link_sibling(sd);
+	return sysfs_mv_dir(kobj->sd, kobj->sd->s_parent, new_name);
+}
 
-	error = 0;
-out:
-	mutex_unlock(&sysfs_mutex);
-	return error;
+int sysfs_move_dir(struct kobject *kobj, struct kobject *new_parent_kobj)
+{
+	return sysfs_mv_dir(kobj->sd, new_parent_kobj->sd, kobj->sd->s_name);
 }
 
 /* Relationship between s_mode and the DT_xxx types */
-- 
1.6.3.1.54.g99dd.dirty



* [PATCH 18/24] sysfs: Pass super_block to sysfs_get_inode
  2009-05-28 22:58         ` [PATCH 0/24] sysfs cleanups Eric W. Biederman
                             ` (16 preceding siblings ...)
  2009-05-28 23:00           ` [PATCH 17/24] sysfs: Merge sysfs_rename_dir and sysfs_move_dir Eric W. Biederman
@ 2009-05-28 23:00           ` Eric W. Biederman
  2009-05-28 23:01           ` [PATCH 19/24] sysfs: Kill unused sysfs_sb variable Eric W. Biederman
                             ` (6 subsequent siblings)
  24 siblings, 0 replies; 200+ messages in thread
From: Eric W. Biederman @ 2009-05-28 23:00 UTC (permalink / raw)
  To: Andrew Morton, Greg Kroah-Hartman
  Cc: linux-kernel, Tejun Heo, Cornelia Huck, linux-fsdevel,
	Kay Sievers, Greg KH, Eric W. Biederman, Eric W. Biederman

From: Eric W. Biederman <ebiederm@xmission.com>

Currently sysfs_get_inode magically returns an inode on
sysfs_sb.  Make the super_block parameter explicit and
the code becomes clearer.

Acked-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Eric W. Biederman <ebiederm@aristanetworks.com>
---
 fs/sysfs/dir.c   |    2 +-
 fs/sysfs/inode.c |    5 +++--
 fs/sysfs/mount.c |    2 +-
 fs/sysfs/sysfs.h |    2 +-
 4 files changed, 6 insertions(+), 5 deletions(-)

diff --git a/fs/sysfs/dir.c b/fs/sysfs/dir.c
index 1f8fb9c..39c6944 100644
--- a/fs/sysfs/dir.c
+++ b/fs/sysfs/dir.c
@@ -579,7 +579,7 @@ static struct dentry * sysfs_lookup(struct inode *dir, struct dentry *dentry,
 	}
 
 	/* attach dentry and inode */
-	inode = sysfs_get_inode(sd);
+	inode = sysfs_get_inode(dir->i_sb, sd);
 	if (!inode) {
 		ret = ERR_PTR(-ENOMEM);
 		goto out_unlock;
diff --git a/fs/sysfs/inode.c b/fs/sysfs/inode.c
index a1917b5..c725aeb 100644
--- a/fs/sysfs/inode.c
+++ b/fs/sysfs/inode.c
@@ -210,6 +210,7 @@ static void sysfs_init_inode(struct sysfs_dirent *sd, struct inode *inode)
 
 /**
  *	sysfs_get_inode - get inode for sysfs_dirent
+ *	@sb: super block
  *	@sd: sysfs_dirent to allocate inode for
  *
  *	Get inode for @sd.  If such inode doesn't exist, a new inode
@@ -222,11 +223,11 @@ static void sysfs_init_inode(struct sysfs_dirent *sd, struct inode *inode)
  *	RETURNS:
  *	Pointer to allocated inode on success, NULL on failure.
  */
-struct inode * sysfs_get_inode(struct sysfs_dirent *sd)
+struct inode * sysfs_get_inode(struct super_block *sb, struct sysfs_dirent *sd)
 {
 	struct inode *inode;
 
-	inode = iget_locked(sysfs_sb, sd->s_ino);
+	inode = iget_locked(sb, sd->s_ino);
 	if (inode && (inode->i_state & I_NEW))
 		sysfs_init_inode(sd, inode);
 
diff --git a/fs/sysfs/mount.c b/fs/sysfs/mount.c
index 4974995..89db07e 100644
--- a/fs/sysfs/mount.c
+++ b/fs/sysfs/mount.c
@@ -54,7 +54,7 @@ static int sysfs_fill_super(struct super_block *sb, void *data, int silent)
 
 	/* get root inode, initialize and unlock it */
 	mutex_lock(&sysfs_mutex);
-	inode = sysfs_get_inode(&sysfs_root);
+	inode = sysfs_get_inode(sb, &sysfs_root);
 	mutex_unlock(&sysfs_mutex);
 	if (!inode) {
 		pr_debug("sysfs: could not get root inode\n");
diff --git a/fs/sysfs/sysfs.h b/fs/sysfs/sysfs.h
index 2db952c..cf21b06 100644
--- a/fs/sysfs/sysfs.h
+++ b/fs/sysfs/sysfs.h
@@ -130,7 +130,7 @@ static inline void __sysfs_put(struct sysfs_dirent *sd)
 /*
  * inode.c
  */
-struct inode *sysfs_get_inode(struct sysfs_dirent *sd);
+struct inode *sysfs_get_inode(struct super_block *sb, struct sysfs_dirent *sd);
 void sysfs_delete_inode(struct inode *inode);
 int sysfs_sd_setattr(struct sysfs_dirent *sd, struct iattr *iattr);
 int sysfs_setattr(struct dentry *dentry, struct iattr *iattr);
-- 
1.6.3.1.54.g99dd.dirty



* [PATCH 19/24] sysfs: Kill unused sysfs_sb variable.
  2009-05-28 22:58         ` [PATCH 0/24] sysfs cleanups Eric W. Biederman
                             ` (17 preceding siblings ...)
  2009-05-28 23:00           ` [PATCH 18/24] sysfs: Pass super_block to sysfs_get_inode Eric W. Biederman
@ 2009-05-28 23:01           ` Eric W. Biederman
  2009-05-28 23:01           ` [PATCH 20/24] sysfs: Normalize error handling in sysfs_fill_inode Eric W. Biederman
                             ` (5 subsequent siblings)
  24 siblings, 0 replies; 200+ messages in thread
From: Eric W. Biederman @ 2009-05-28 23:01 UTC (permalink / raw)
  To: Andrew Morton, Greg Kroah-Hartman
  Cc: linux-kernel, Tejun Heo, Cornelia Huck, linux-fsdevel,
	Kay Sievers, Greg KH, Eric W. Biederman, Eric W. Biederman

From: Eric W. Biederman <ebiederm@xmission.com>

Now that there are no more users we can remove
the sysfs_sb variable.

Acked-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Eric W. Biederman <ebiederm@aristanetworks.com>
---
 fs/sysfs/mount.c |    2 --
 fs/sysfs/sysfs.h |    1 -
 2 files changed, 0 insertions(+), 3 deletions(-)

diff --git a/fs/sysfs/mount.c b/fs/sysfs/mount.c
index 89db07e..0cb1088 100644
--- a/fs/sysfs/mount.c
+++ b/fs/sysfs/mount.c
@@ -23,7 +23,6 @@
 
 
 static struct vfsmount *sysfs_mount;
-struct super_block * sysfs_sb = NULL;
 struct kmem_cache *sysfs_dir_cachep;
 
 static const struct super_operations sysfs_ops = {
@@ -50,7 +49,6 @@ static int sysfs_fill_super(struct super_block *sb, void *data, int silent)
 	sb->s_magic = SYSFS_MAGIC;
 	sb->s_op = &sysfs_ops;
 	sb->s_time_gran = 1;
-	sysfs_sb = sb;
 
 	/* get root inode, initialize and unlock it */
 	mutex_lock(&sysfs_mutex);
diff --git a/fs/sysfs/sysfs.h b/fs/sysfs/sysfs.h
index cf21b06..5dd8168 100644
--- a/fs/sysfs/sysfs.h
+++ b/fs/sysfs/sysfs.h
@@ -80,7 +80,6 @@ static inline unsigned int sysfs_type(struct sysfs_dirent *sd)
  * mount.c
  */
 extern struct sysfs_dirent sysfs_root;
-extern struct super_block *sysfs_sb;
 extern struct kmem_cache *sysfs_dir_cachep;
 
 /*
-- 
1.6.3.1.54.g99dd.dirty



* [PATCH 20/24] sysfs: Normalize error handling in sysfs_fill_inode
  2009-05-28 22:58         ` [PATCH 0/24] sysfs cleanups Eric W. Biederman
                             ` (18 preceding siblings ...)
  2009-05-28 23:01           ` [PATCH 19/24] sysfs: Kill unused sysfs_sb variable Eric W. Biederman
@ 2009-05-28 23:01           ` Eric W. Biederman
  2009-05-28 23:01           ` [PATCH 21/24] sysfs: Rename sysfs_mv_dir sysfs_rename Eric W. Biederman
                             ` (4 subsequent siblings)
  24 siblings, 0 replies; 200+ messages in thread
From: Eric W. Biederman @ 2009-05-28 23:01 UTC (permalink / raw)
  To: Andrew Morton, Greg Kroah-Hartman
  Cc: linux-kernel, Tejun Heo, Cornelia Huck, linux-fsdevel,
	Kay Sievers, Greg KH, Eric W. Biederman, Eric W. Biederman

From: Eric W. Biederman <ebiederm@xmission.com>

Use a single error exit path instead of doing whatever cleanup
is required at each point where we find an error.
Ultimately this should make the code more maintainable.
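
The single exit path pattern can be shown in a small user-space sketch. This is not the sysfs code: struct ctx, fill_ctx and the fail_at parameter are hypothetical and exist only to simulate allocation failures. The sketch relies on free(NULL) being a no-op, in the same way the patch relies on dput() and iput() accepting NULL.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/* Hypothetical container for the two resources being set up. */
struct ctx {
	int *inode;
	int *root;
};

/* fail_at simulates an allocation failure: 1 fails the first
 * allocation, 2 fails the second, anything else lets both succeed. */
static int fill_ctx(struct ctx *c, int fail_at)
{
	int *inode = NULL;
	int *root = NULL;
	int error;

	error = -ENOMEM;
	inode = (fail_at == 1) ? NULL : malloc(sizeof(*inode));
	if (!inode)
		goto err_out;

	error = -ENOMEM;
	root = (fail_at == 2) ? NULL : malloc(sizeof(*root));
	if (!root)
		goto err_out;

	c->inode = inode;
	c->root = root;
	return 0;

err_out:
	/* Pointers start out NULL, so one cleanup sequence is
	 * correct no matter which step failed. */
	free(root);
	free(inode);
	return error;
}
```

Adding a third resource to such a function means one more allocation and one more line under err_out, rather than revisiting every earlier failure branch.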

Acked-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Eric W. Biederman <ebiederm@aristanetworks.com>
---
 fs/sysfs/mount.c |   16 +++++++++++-----
 1 files changed, 11 insertions(+), 5 deletions(-)

diff --git a/fs/sysfs/mount.c b/fs/sysfs/mount.c
index 0cb1088..1dd023a 100644
--- a/fs/sysfs/mount.c
+++ b/fs/sysfs/mount.c
@@ -41,8 +41,9 @@ struct sysfs_dirent sysfs_root = {
 
 static int sysfs_fill_super(struct super_block *sb, void *data, int silent)
 {
-	struct inode *inode;
-	struct dentry *root;
+	struct inode *inode = NULL;
+	struct dentry *root = NULL;
+	int error;
 
 	sb->s_blocksize = PAGE_CACHE_SIZE;
 	sb->s_blocksize_bits = PAGE_CACHE_SHIFT;
@@ -51,24 +52,29 @@ static int sysfs_fill_super(struct super_block *sb, void *data, int silent)
 	sb->s_time_gran = 1;
 
 	/* get root inode, initialize and unlock it */
+	error = -ENOMEM;
 	mutex_lock(&sysfs_mutex);
 	inode = sysfs_get_inode(sb, &sysfs_root);
 	mutex_unlock(&sysfs_mutex);
 	if (!inode) {
 		pr_debug("sysfs: could not get root inode\n");
-		return -ENOMEM;
+		goto err_out;
 	}
 
 	/* instantiate and link root dentry */
+	error = -ENOMEM;
 	root = d_alloc_root(inode);
 	if (!root) {
 		pr_debug("%s: could not get root dentry!\n",__func__);
-		iput(inode);
-		return -ENOMEM;
+		goto err_out;
 	}
 	root->d_fsdata = &sysfs_root;
 	sb->s_root = root;
 	return 0;
+err_out:
+	dput(root);
+	iput(inode);
+	return error;
 }
 
 static int sysfs_get_sb(struct file_system_type *fs_type,
-- 
1.6.3.1.54.g99dd.dirty



* [PATCH 21/24] sysfs: Rename sysfs_mv_dir sysfs_rename
  2009-05-28 22:58         ` [PATCH 0/24] sysfs cleanups Eric W. Biederman
                             ` (19 preceding siblings ...)
  2009-05-28 23:01           ` [PATCH 20/24] sysfs: Normalize error handling in sysfs_fill_inode Eric W. Biederman
@ 2009-05-28 23:01           ` Eric W. Biederman
  2009-05-28 23:01           ` [PATCH 22/24] sysfs: Make sysfs_rename_link atomic Eric W. Biederman
                             ` (3 subsequent siblings)
  24 siblings, 0 replies; 200+ messages in thread
From: Eric W. Biederman @ 2009-05-28 23:01 UTC (permalink / raw)
  To: Andrew Morton, Greg Kroah-Hartman
  Cc: linux-kernel, Tejun Heo, Cornelia Huck, linux-fsdevel,
	Kay Sievers, Greg KH, Eric W. Biederman, Eric W. Biederman

From: Eric W. Biederman <ebiederm@xmission.com>

It turns out that sysfs_mv_dir actually makes no assumption that what
is being renamed is a directory.  So rename sysfs_mv_dir to sysfs_rename to
reflect the function's general utility.  Later we will use it to rename
symlinks in sysfs.

Signed-off-by: Eric W. Biederman <ebiederm@aristanetworks.com>
---
 fs/sysfs/dir.c   |    6 +++---
 fs/sysfs/sysfs.h |    3 +++
 2 files changed, 6 insertions(+), 3 deletions(-)

diff --git a/fs/sysfs/dir.c b/fs/sysfs/dir.c
index 39c6944..96f95f5 100644
--- a/fs/sysfs/dir.c
+++ b/fs/sysfs/dir.c
@@ -688,7 +688,7 @@ void sysfs_remove_dir(struct kobject * kobj)
 	__sysfs_remove_dir(sd);
 }
 
-static int sysfs_mv_dir(struct sysfs_dirent *sd,
+int sysfs_rename(struct sysfs_dirent *sd,
 	struct sysfs_dirent *new_parent_sd, const char *new_name)
 {
 	const char *dup_name = NULL;
@@ -734,12 +734,12 @@ static int sysfs_mv_dir(struct sysfs_dirent *sd,
 
 int sysfs_rename_dir(struct kobject * kobj, const char *new_name)
 {
-	return sysfs_mv_dir(kobj->sd, kobj->sd->s_parent, new_name);
+	return sysfs_rename(kobj->sd, kobj->sd->s_parent, new_name);
 }
 
 int sysfs_move_dir(struct kobject *kobj, struct kobject *new_parent_kobj)
 {
-	return sysfs_mv_dir(kobj->sd, new_parent_kobj->sd, kobj->sd->s_name);
+	return sysfs_rename(kobj->sd, new_parent_kobj->sd, kobj->sd->s_name);
 }
 
 /* Relationship between s_mode and the DT_xxx types */
diff --git a/fs/sysfs/sysfs.h b/fs/sysfs/sysfs.h
index 5dd8168..be1d932 100644
--- a/fs/sysfs/sysfs.h
+++ b/fs/sysfs/sysfs.h
@@ -109,6 +109,9 @@ int sysfs_create_subdir(struct kobject *kobj, const char *name,
 			struct sysfs_dirent **p_sd);
 void sysfs_remove_subdir(struct sysfs_dirent *sd);
 
+int sysfs_rename(struct sysfs_dirent *sd,
+	struct sysfs_dirent *new_parent_sd, const char *new_name);
+
 static inline struct sysfs_dirent *__sysfs_get(struct sysfs_dirent *sd)
 {
 	if (sd) {
-- 
1.6.3.1.54.g99dd.dirty



* [PATCH 22/24] sysfs: Make sysfs_rename_link atomic
  2009-05-28 22:58         ` [PATCH 0/24] sysfs cleanups Eric W. Biederman
                             ` (20 preceding siblings ...)
  2009-05-28 23:01           ` [PATCH 21/24] sysfs: Rename sysfs_mv_dir sysfs_rename Eric W. Biederman
@ 2009-05-28 23:01           ` Eric W. Biederman
  2009-05-29  9:16             ` Tejun Heo
  2009-05-28 23:01           ` [PATCH 23/24] driver core: Don't remove kobjects in device_shutdown Eric W. Biederman
                             ` (2 subsequent siblings)
  24 siblings, 1 reply; 200+ messages in thread
From: Eric W. Biederman @ 2009-05-28 23:01 UTC (permalink / raw)
  To: Andrew Morton, Greg Kroah-Hartman
  Cc: linux-kernel, Tejun Heo, Cornelia Huck, linux-fsdevel,
	Kay Sievers, Greg KH, Eric W. Biederman, Eric W. Biederman

From: Eric W. Biederman <ebiederm@xmission.com>

Use the existing sysfs_rename to make sysfs_rename_link an atomic
operation that does less work.  While I am at it, add additional sanity
checking to ensure it is a symlink I am renaming.
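
One visible difference between the old remove-then-create sequence and an in-place rename is what happens on failure. The user-space sketch below models that; it is not the sysfs code, and struct linkdir, create_link, rename_two_step and rename_in_place are hypothetical names. In the two-step version a failed create leaves the old link already gone, while the in-place version validates everything first and leaves the old link untouched on error.

```c
#include <assert.h>
#include <errno.h>
#include <string.h>

#define MAX_LINKS 8
#define NAME_LEN 32

/* Hypothetical model of a directory holding named links. */
struct link {
	char name[NAME_LEN];
	const void *target;
	int used;
};

struct linkdir {
	struct link links[MAX_LINKS];
};

static struct link *find_link(struct linkdir *d, const char *name)
{
	int i;

	for (i = 0; i < MAX_LINKS; i++)
		if (d->links[i].used && strcmp(d->links[i].name, name) == 0)
			return &d->links[i];
	return NULL;
}

static int create_link(struct linkdir *d, const void *target, const char *name)
{
	int i;

	if (strlen(name) >= NAME_LEN)
		return -ENAMETOOLONG;
	if (find_link(d, name))
		return -EEXIST;
	for (i = 0; i < MAX_LINKS; i++) {
		if (!d->links[i].used) {
			strcpy(d->links[i].name, name);
			d->links[i].target = target;
			d->links[i].used = 1;
			return 0;
		}
	}
	return -ENOSPC;
}

static void remove_link(struct linkdir *d, const char *name)
{
	struct link *l = find_link(d, name);

	if (l)
		l->used = 0;
}

/* Old shape: the link is gone before we know the create will succeed. */
static int rename_two_step(struct linkdir *d, const void *target,
			   const char *old, const char *new)
{
	remove_link(d, old);
	return create_link(d, target, new);
}

/* New shape: validate first, then change the name in a single step. */
static int rename_in_place(struct linkdir *d, const void *target,
			   const char *old, const char *new)
{
	struct link *l = find_link(d, old);

	if (!l)
		return -ENOENT;
	if (l->target != target)
		return -EINVAL;		/* not the link the caller meant */
	if (strcmp(old, new) == 0)
		return 0;
	if (strlen(new) >= NAME_LEN)
		return -ENAMETOOLONG;
	if (find_link(d, new))
		return -EEXIST;
	strcpy(l->name, new);
	return 0;
}
```

The target comparison in rename_in_place plays the role of the sanity checks added by the patch; the sketch has only one kind of entry, so there is no separate type check.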

Acked-by: Kay Sievers <kay.sievers@vrfy.org>
Signed-off-by: Eric W. Biederman <ebiederm@aristanetworks.com>
---
 fs/sysfs/symlink.c |   26 ++++++++++++++++++++++++--
 1 files changed, 24 insertions(+), 2 deletions(-)

diff --git a/fs/sysfs/symlink.c b/fs/sysfs/symlink.c
index fc5fc86..39d050b 100644
--- a/fs/sysfs/symlink.c
+++ b/fs/sysfs/symlink.c
@@ -106,8 +106,30 @@ void sysfs_remove_link(struct kobject * kobj, const char * name)
 int sysfs_rename_link(struct kobject *kobj, struct kobject *targ,
 			const char *old, const char *new)
 {
-	sysfs_remove_link(kobj, old);
-	return sysfs_create_link(kobj, targ, new);
+	struct sysfs_dirent *parent_sd, *sd = NULL;
+	int result;
+
+	if (!kobj)
+		parent_sd = &sysfs_root;
+	else
+		parent_sd = kobj->sd;
+
+	result = -ENOENT;
+	sd = sysfs_get_dirent(parent_sd, old);
+	if (!sd)
+		goto out;
+
+	result = -EINVAL;
+	if (sysfs_type(sd) != SYSFS_KOBJ_LINK)
+		goto out;
+	if (sd->s_symlink.target_sd->s_dir.kobj != targ)
+		goto out;
+
+	result = sysfs_rename(sd, parent_sd, new);
+
+out:
+	sysfs_put(sd);
+	return result;
 }
 
 static int sysfs_get_target_path(struct sysfs_dirent *parent_sd,
-- 
1.6.3.1.54.g99dd.dirty



* [PATCH 23/24] driver core: Don't remove kobjects in device_shutdown.
  2009-05-28 22:58         ` [PATCH 0/24] sysfs cleanups Eric W. Biederman
                             ` (21 preceding siblings ...)
  2009-05-28 23:01           ` [PATCH 22/24] sysfs: Make sysfs_rename_link atomic Eric W. Biederman
@ 2009-05-28 23:01           ` Eric W. Biederman
  2009-05-28 23:01           ` [PATCH 24/24] sysfs: In sysfs_add_one fail if the targe directory has been removed Eric W. Biederman
  2009-05-29 20:18           ` [PATCH 0/26] sysfs cleanups v3 Eric W. Biederman
  24 siblings, 0 replies; 200+ messages in thread
From: Eric W. Biederman @ 2009-05-28 23:01 UTC (permalink / raw)
  To: Andrew Morton, Greg Kroah-Hartman
  Cc: linux-kernel, Tejun Heo, Cornelia Huck, linux-fsdevel,
	Kay Sievers, Greg KH, Eric W. Biederman, Eric W. Biederman

From: Eric W. Biederman <ebiederm@xmission.com>

device_shutdown is defined to just shut down the hardware and to not
clean up any kernel data structures.  Therefore don't put the kobjects
for /sys/dev, /sys/dev/block and /sys/dev/char.

This ensures we don't remove /sys/dev/block and /sys/dev/char while
we still have symlinks from there to the actual devices.

Acked-by: Kay Sievers <kay.sievers@vrfy.org>
Signed-off-by: Eric W. Biederman <ebiederm@aristanetworks.com>
---
 drivers/base/core.c |    3 ---
 1 files changed, 0 insertions(+), 3 deletions(-)

diff --git a/drivers/base/core.c b/drivers/base/core.c
index 8a1569c..49d3142 100644
--- a/drivers/base/core.c
+++ b/drivers/base/core.c
@@ -1653,7 +1653,4 @@ void device_shutdown(void)
 			dev->driver->shutdown(dev);
 		}
 	}
-	kobject_put(sysfs_dev_char_kobj);
-	kobject_put(sysfs_dev_block_kobj);
-	kobject_put(dev_kobj);
 }
-- 
1.6.3.1.54.g99dd.dirty



* [PATCH 24/24] sysfs: In sysfs_add_one fail if the targe directory has been removed.
  2009-05-28 22:58         ` [PATCH 0/24] sysfs cleanups Eric W. Biederman
                             ` (22 preceding siblings ...)
  2009-05-28 23:01           ` [PATCH 23/24] driver core: Don't remove kobjects in device_shutdown Eric W. Biederman
@ 2009-05-28 23:01           ` Eric W. Biederman
  2009-05-29  9:18             ` Tejun Heo
  2009-05-29 20:18           ` [PATCH 0/26] sysfs cleanups v3 Eric W. Biederman
  24 siblings, 1 reply; 200+ messages in thread
From: Eric W. Biederman @ 2009-05-28 23:01 UTC (permalink / raw)
  To: Andrew Morton, Greg Kroah-Hartman
  Cc: linux-kernel, Tejun Heo, Cornelia Huck, linux-fsdevel,
	Kay Sievers, Greg KH, Eric W. Biederman, Eric W. Biederman

From: Eric W. Biederman <ebiederm@xmission.com>

If a bug in the upper layers results in someone attempting to add
to a sysfs directory that has already been removed, warn about it
and fail.

I don't believe this has ever happened, and it certainly never should
happen, but be strict to avoid errors creeping in.
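
The stricter add path can be sketched in user space as follows. This is not the sysfs code: struct dir, DIR_REMOVED, dir_has and dir_add are hypothetical names. The sketch assumes that a removed parent should yield -ENOENT and a duplicate name -EEXIST; that split is inferred from the warning text in the patch, which tests result == -EEXIST to choose its message. The -ENOSPC case exists only because the sketch uses a fixed-size array.

```c
#include <assert.h>
#include <errno.h>
#include <string.h>

#define DIR_REMOVED 0x1
#define MAX_ENTRIES 8

/* Hypothetical stand-in for a directory sysfs_dirent. */
struct dir {
	unsigned int flags;
	const char *names[MAX_ENTRIES];
	int nr;
};

static int dir_has(const struct dir *d, const char *name)
{
	int i;

	for (i = 0; i < d->nr; i++)
		if (strcmp(d->names[i], name) == 0)
			return 1;
	return 0;
}

/* Add @name to @d, refusing removed directories and duplicates. */
static int dir_add(struct dir *d, const char *name)
{
	int result;

	result = -ENOENT;
	if (d->flags & DIR_REMOVED)
		goto out_err;	/* parent is already gone */

	result = -EEXIST;	/* assumed intent, see the note above */
	if (dir_has(d, name))
		goto out_err;

	result = -ENOSPC;	/* artifact of the fixed-size array */
	if (d->nr == MAX_ENTRIES)
		goto out_err;

	d->names[d->nr++] = name;
	return 0;

out_err:
	/* A real implementation would emit its warning here, after
	 * dropping any lock it holds. */
	return result;
}
```

Setting the error code immediately before each check keeps the shared error path able to tell the failure cases apart.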

Acked-by: Kay Sievers <kay.sievers@vrfy.org>
Signed-off-by: Eric W. Biederman <ebiederm@aristanetworks.com>
---
 fs/sysfs/dir.c |   37 +++++++++++++++++++++++--------------
 1 files changed, 23 insertions(+), 14 deletions(-)

diff --git a/fs/sysfs/dir.c b/fs/sysfs/dir.c
index 96f95f5..b75cb25 100644
--- a/fs/sysfs/dir.c
+++ b/fs/sysfs/dir.c
@@ -394,21 +394,17 @@ static char *sysfs_pathname(struct sysfs_dirent *sd, char *path)
 int sysfs_add_one(struct sysfs_dirent *parent_sd, struct sysfs_dirent *sd)
 {
 	struct iattr *ps_iattr;
+	char *path;
+	int result;
 
 	mutex_lock(&sysfs_mutex);
-	if (sysfs_find_dirent(parent_sd, sd->s_name)) {
-		char *path;
-		mutex_unlock(&sysfs_mutex);
 
-		path = kzalloc(PATH_MAX, GFP_KERNEL);
-		WARN(1, KERN_WARNING
-		     "sysfs: cannot create duplicate filename '%s'\n",
-		     (path == NULL) ? sd->s_name :
-		     strcat(strcat(sysfs_pathname(parent_sd, path), "/"),
-		            sd->s_name));
-		kfree(path);
-		return -EEXIST;
-	}
+	result = -ENOENT;
+	if (parent_sd->s_flags & SYSFS_FLAG_REMOVED)
+		goto out_err;
+
+	if (sysfs_find_dirent(parent_sd, sd->s_name))
+		goto out_err;
 
 	sd->s_parent = sysfs_get(parent_sd);
 	sysfs_link_sibling(sd);
@@ -417,9 +413,22 @@ int sysfs_add_one(struct sysfs_dirent *parent_sd, struct sysfs_dirent *sd)
 	ps_iattr = parent_sd->s_iattr;
 	if (ps_iattr)
 		ps_iattr->ia_ctime = ps_iattr->ia_mtime = CURRENT_TIME;
-
 	mutex_unlock(&sysfs_mutex);
 	return 0;
+
+out_err:
+	mutex_unlock(&sysfs_mutex);
+
+	path = kzalloc(PATH_MAX, GFP_KERNEL);
+	WARN(1, KERN_WARNING "sysfs: cannot create '%s' %s\n",
+		(path == NULL) ? sd->s_name :
+		strcat(strcat(sysfs_pathname(parent_sd, path), "/"),
+		       sd->s_name),
+		(result == -EEXIST ? "duplicate filename" : "no such directory")
+		);
+	kfree(path);
+
+	return result;
 }
 
 /**
-- 
1.6.3.1.54.g99dd.dirty



* Re: [PATCH 04/24] sysfs: Normalize removing sysfs directories.
  2009-05-28 23:00           ` [PATCH 04/24] sysfs: Normalize removing sysfs directories Eric W. Biederman
@ 2009-05-29  9:14             ` Tejun Heo
  2009-05-29 16:52               ` Eric W. Biederman
  0 siblings, 1 reply; 200+ messages in thread
From: Tejun Heo @ 2009-05-29  9:14 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: Andrew Morton, Greg Kroah-Hartman, linux-kernel, Cornelia Huck,
	linux-fsdevel, Kay Sievers, Greg KH, Eric W. Biederman

Hello,

Eric W. Biederman wrote:
> @@ -732,12 +732,28 @@ const struct inode_operations sysfs_dir_inode_operations = {
>  	.setattr	= sysfs_setattr,
>  };
>  
> -static void remove_dir(struct sysfs_dirent *sd)
> +static void remove_dir(struct sysfs_dirent *dir_sd)
>  {
>  	struct sysfs_addrm_cxt acxt;
>  
> -	sysfs_addrm_start(&acxt, sd->s_parent);
> -	sysfs_remove_one(&acxt, sd);
> +	pr_debug("sysfs %s: removing dir\n", dir_sd->s_name);
> +
> +	/* Removing non-empty directories is not valid complain! */
						     ^^^
						  missing . or ,

> +static struct sysfs_dirent *get_dirent_to_remove(struct sysfs_dirent *dir_sd)
> +{
> +	struct sysfs_dirent *sd;
> +
> +	mutex_lock(&sysfs_mutex);
> +	for (sd = dir_sd->s_dir.children; sd; sd = sd->s_sibling) {
> +		/* Directories might be owned by someone else
> +		 * making recursive directory removal unsafe.
> +		 */
> +		if (sysfs_type(sd) == SYSFS_DIR)
> +			continue;
> +		break;
> +	}
> +	sysfs_get(sd);
> +	mutex_unlock(&sysfs_mutex);
> +
> +	return sd;
> +}
>  
>  static void __sysfs_remove_dir(struct sysfs_dirent *dir_sd)
>  {
>  	struct sysfs_addrm_cxt acxt;
> -	struct sysfs_dirent **pos;
> -
> -	if (!dir_sd)
> -		return;
> -
> -	pr_debug("sysfs %s: removing dir\n", dir_sd->s_name);
> -	sysfs_addrm_start(&acxt, dir_sd);
> -	pos = &dir_sd->s_dir.children;
> -	while (*pos) {
> -		struct sysfs_dirent *sd = *pos;
> +	struct sysfs_dirent *sd;
>  
> -		if (sysfs_type(sd) != SYSFS_DIR)
> -			sysfs_remove_one(&acxt, sd);
> -		else
> -			pos = &(*pos)->s_sibling;
> +	/* Remove children that we think are safe */
> +	while ((sd = get_dirent_to_remove(dir_sd))) {
> +		sysfs_addrm_start(&acxt, sd->s_parent);
> +		sysfs_remove_one(&acxt, sd);
> +		sysfs_addrm_finish(&acxt);
> +		sysfs_put(sd);
>  	}
> -	sysfs_addrm_finish(&acxt);

Ummm... Null @dir_sd handling is being removed, which could be fine
but please do it in a separate patch or at least mention it in the
patch description.  Also, I'm quite uncomfortable with these things
being done in a non-atomic manner.  It can be made to work but things
like this can lead to subtle race conditions and with the kind of
layering we put on top of sysfs (kobject, driver model, driver
midlayers and so on), it isn't all that easy to verify what's going
on, so NACK for this one.
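
For readers following the atomicity argument, here is a rough userspace
sketch of the two shapes being compared.  It is not the sysfs code: the
toy_* names are hypothetical, a pthread mutex stands in for sysfs_mutex,
and the stepwise variant is simplified (the quoted patch picks the entry
in one critical section and removes it in a second one).

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-ins; none of these names exist in fs/sysfs. */
enum toy_type { TOY_FILE, TOY_DIR };

struct toy_dirent {
	enum toy_type type;
	struct toy_dirent *sibling;
};

struct toy_dir {
	pthread_mutex_t lock;		/* plays the role of sysfs_mutex */
	struct toy_dirent *children;
};

/* Unlink the first non-directory child.  Caller holds dir->lock. */
static struct toy_dirent *unlink_first_file(struct toy_dir *dir)
{
	struct toy_dirent **pos = &dir->children;

	while (*pos) {
		struct toy_dirent *sd = *pos;

		if (sd->type != TOY_DIR) {
			*pos = sd->sibling;
			sd->sibling = NULL;
			return sd;
		}
		pos = &sd->sibling;	/* skip subdirectories */
	}
	return NULL;
}

/* Old shape: one critical section covers the whole sweep. */
static int remove_files_atomic(struct toy_dir *dir)
{
	int removed = 0;

	assert(dir != NULL);
	pthread_mutex_lock(&dir->lock);
	while (unlink_first_file(dir))
		removed++;
	pthread_mutex_unlock(&dir->lock);
	return removed;
}

/* New shape: the lock is dropped between children. */
static int remove_files_stepwise(struct toy_dir *dir)
{
	int removed = 0;

	assert(dir != NULL);
	for (;;) {
		struct toy_dirent *sd;

		pthread_mutex_lock(&dir->lock);
		sd = unlink_first_file(dir);
		pthread_mutex_unlock(&dir->lock);
		if (!sd)
			break;
		/* Window: another thread could add or remove entries here. */
		removed++;
	}
	return removed;
}
```

With a single thread both variants remove the same entries.  They only
differ when some other thread touches the directory between iterations,
which is the window the objection above is about.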

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 200+ messages in thread

* Re: [PATCH 22/24] sysfs: Make sysfs_rename_link atomic
  2009-05-28 23:01           ` [PATCH 22/24] sysfs: Make sysfs_rename_link atomic Eric W. Biederman
@ 2009-05-29  9:16             ` Tejun Heo
  2009-05-29 17:17               ` Eric W. Biederman
  0 siblings, 1 reply; 200+ messages in thread
From: Tejun Heo @ 2009-05-29  9:16 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: Andrew Morton, Greg Kroah-Hartman, linux-kernel, Cornelia Huck,
	linux-fsdevel, Kay Sievers, Greg KH, Eric W. Biederman

Eric W. Biederman wrote:
> From: Eric W. Biederman <ebiederm@xmission.com>
> 
> Use the existing sysfs_rename to make sysfs_rename_link an atomic
> operation that does less work.  While I am at it, add additional sanity
> checking to ensure it is a symlink I am renaming.
> 
> Acked-by: Kay Sievers <kay.sievers@vrfy.org>
> Signed-off-by: Eric W. Biederman <ebiederm@aristanetworks.com>

It would be really nice to merge or group this together with the first
three patches.  Other than that,

Acked-by: Tejun Heo <tj@kernel.org>

-- 
tejun

^ permalink raw reply	[flat|nested] 200+ messages in thread

* Re: [PATCH 24/24] sysfs: In sysfs_add_one fail if the targe directory has been removed.
  2009-05-28 23:01           ` [PATCH 24/24] sysfs: In sysfs_add_one fail if the targe directory has been removed Eric W. Biederman
@ 2009-05-29  9:18             ` Tejun Heo
  0 siblings, 0 replies; 200+ messages in thread
From: Tejun Heo @ 2009-05-29  9:18 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: Andrew Morton, Greg Kroah-Hartman, linux-kernel, Cornelia Huck,
	linux-fsdevel, Kay Sievers, Greg KH, Eric W. Biederman

Eric W. Biederman wrote:
> From: Eric W. Biederman <ebiederm@xmission.com>
> 
> If a bug in the upper layers results in someone attempting to add
> to a sysfs directory that has already been removed, warn about it
> and fail.
> 
> I don't believe this has ever happened, and it certainly never should
> happen, but be strict to avoid errors creeping in.
> 
> Acked-by: Kay Sievers <kay.sievers@vrfy.org>
> Signed-off-by: Eric W. Biederman <ebiederm@aristanetworks.com>

Acked-by: Tejun Heo <tj@kernel.org>

-- 
tejun

^ permalink raw reply	[flat|nested] 200+ messages in thread

* Re: [PATCH 25/20] sysfs: Only support removing emtpy sysfs  directories.
  2009-05-28 21:04                                                                     ` Alan Stern
@ 2009-05-29 12:32                                                                       ` Hannes Reinecke
  0 siblings, 0 replies; 200+ messages in thread
From: Hannes Reinecke @ 2009-05-29 12:32 UTC (permalink / raw)
  To: Alan Stern
  Cc: James Bottomley, Kay Sievers, SCSI development list,
	Eric W. Biederman, Andrew Morton, Greg Kroah-Hartman,
	Kernel development list, Tejun Heo, Cornelia Huck, linux-fsdevel,
	Eric W. Biederman

[-- Attachment #1: Type: text/plain, Size: 844 bytes --]

Hi Alan,

> And besides, in the patch I'm working on it isn't called from either of 
> those places -- it's called from __scsi_remove_device().  So I'll go 
> ahead and get rid of scsi_target_reap_usercontext().
> 
And just for reference, here is a patchset I created some time ago
which streamlines the sdev and starget lifetime. I think I tried
to send it upstream at one point but never got far with it.
Be aware that it's relative to a rather old git tree (2.6.22?) so
it might not apply properly. But it's mainly to give you an idea
of what I've done so far.

I would have continued pushing it if real life hadn't interfered ...

HTH.

Cheers,

Hannes
-- 
Dr. Hannes Reinecke		      zSeries & Storage
hare@suse.de			      +49 911 74053 688
SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: Markus Rex, HRB 16746 (AG Nürnberg)

[-- Attachment #2: 0001-Fix-refcounting-for-attribute_container.patch --]
[-- Type: text/x-patch, Size: 1621 bytes --]

>From 404fc549f858cd0cf3a84865442fe85fedb920de Mon Sep 17 00:00:00 2001
From: Hannes Reinecke <hare@acerbis.suse.de>
Date: Sat, 8 Mar 2008 10:30:58 +0100
Subject: [PATCH] Fix refcounting for attribute_container

attribute_container_add_device() took an explicit reference on the
parent device, making it impossible to remove the parent by doing
a simple put. So we'd rather _not_ take a reference here as
attribute_container will be handled explicitly by calls to
attribute_container_remove_device()/_destroy_device() anyway.

Signed-off-by: Hannes Reinecke <hare@suse.de>
---
 drivers/base/attribute_container.c |    8 +++++---
 1 files changed, 5 insertions(+), 3 deletions(-)

diff --git a/drivers/base/attribute_container.c b/drivers/base/attribute_container.c
index f57652d..6c7e633 100644
--- a/drivers/base/attribute_container.c
+++ b/drivers/base/attribute_container.c
@@ -114,10 +114,8 @@ static void attribute_container_release(struct device *classdev)
 {
 	struct internal_container *ic 
 		= container_of(classdev, struct internal_container, classdev);
-	struct device *dev = classdev->parent;
 
 	kfree(ic);
-	put_device(dev);
 }
 
 /**
@@ -164,7 +162,11 @@ attribute_container_add_device(struct device *dev,
 
 		ic->cont = cont;
 		device_initialize(&ic->classdev);
-		ic->classdev.parent = get_device(dev);
+		/*
+		 * Don't increase refcount here, device will be
+		 * removed explicitly by a call to _destroy().
+		 */
+		ic->classdev.parent = dev;
 		ic->classdev.class = cont->class;
 		cont->class->dev_release = attribute_container_release;
 		strcpy(ic->classdev.bus_id, dev->bus_id);
-- 
1.5.3.2


[-- Attachment #3: 0002-Remove-reap_ref.patch --]
[-- Type: text/x-patch, Size: 3927 bytes --]

>From 9e96c5dd094d3822093656e87b71cd433e818cd2 Mon Sep 17 00:00:00 2001
From: Hannes Reinecke <hare@acerbis.suse.de>
Date: Sat, 8 Mar 2008 12:28:17 +0100
Subject: [PATCH] Remove reap_ref

struct scsi_target contains a 'reap_ref' counter, which is
basically a reference counter for the target.
As we now have proper reference counting we can remove this
and clear out the calling sequence for scsi_target_reap().

Signed-off-by: Hannes Reinecke <hare@suse.de>
---
 drivers/scsi/scsi_scan.c   |   22 ++++++++++++++--------
 drivers/scsi/scsi_sysfs.c  |    3 ---
 include/scsi/scsi_device.h |    1 -
 3 files changed, 14 insertions(+), 12 deletions(-)

diff --git a/drivers/scsi/scsi_scan.c b/drivers/scsi/scsi_scan.c
index d61a8e8..2feab2a 100644
--- a/drivers/scsi/scsi_scan.c
+++ b/drivers/scsi/scsi_scan.c
@@ -448,7 +448,6 @@ static struct scsi_target *scsi_alloc_target(struct device *parent,
 	return starget;
 
  found:
-	found_target->reap_ref++;
 	spin_unlock_irqrestore(shost->host_lock, flags);
 	if (found_target->state != STARGET_DEL) {
 		put_device(parent);
@@ -505,7 +504,7 @@ void scsi_target_reap(struct scsi_target *starget)
 
 	spin_lock_irqsave(shost->host_lock, flags);
 
-	if (--starget->reap_ref == 0 && list_empty(&starget->devices)) {
+	if (list_empty(&starget->devices)) {
 		if (starget->state == STARGET_CREATED) {
 			spin_unlock_irqrestore(shost->host_lock, flags);
 			starget->state = STARGET_DEL;
@@ -1516,8 +1515,13 @@ struct scsi_device *__scsi_add_device(struct Scsi_Host *shost, uint channel,
 	if (scsi_host_scan_allowed(shost))
 		scsi_probe_and_add_lun(starget, lun, NULL, &sdev, 1, hostdata);
 	mutex_unlock(&shost->scan_mutex);
-	transport_configure_device(&starget->dev);
-	scsi_target_reap(starget);
+	/*
+	 * scsi_target_reap is called from the release function
+	 * of each sdev.
+	 */
+	if (starget->state != STARGET_DEL)
+		transport_configure_device(&starget->dev);
+
 	put_device(&starget->dev);
 
 	return sdev;
@@ -1595,10 +1599,12 @@ static void __scsi_scan_target(struct device *parent, unsigned int channel,
 	}
 
  out_reap:
-	/* now determine if the target has any children at all
-	 * and if not, nuke it */
-	transport_configure_device(&starget->dev);
-	scsi_target_reap(starget);
+	/*
+	 * scsi_target_reap is called from the release function
+	 * of each sdev.
+	 */
+	if (starget->state != STARGET_DEL)
+		transport_configure_device(&starget->dev);
 
 	put_device(&starget->dev);
 }
diff --git a/drivers/scsi/scsi_sysfs.c b/drivers/scsi/scsi_sysfs.c
index bd49d4e..4db0fed 100644
--- a/drivers/scsi/scsi_sysfs.c
+++ b/drivers/scsi/scsi_sysfs.c
@@ -297,7 +297,6 @@ static void scsi_device_dev_release_usercontext(struct work_struct *work)
 	starget = to_scsi_target(parent);
 
 	spin_lock_irqsave(sdev->host->host_lock, flags);
-	starget->reap_ref++;
 	list_del(&sdev->siblings);
 	list_del(&sdev->same_target_siblings);
 	list_del(&sdev->starved_entry);
@@ -937,7 +936,6 @@ static void __scsi_remove_target(struct scsi_target *starget)
 	struct scsi_device *sdev;
 
 	spin_lock_irqsave(shost->host_lock, flags);
-	starget->reap_ref++;
  restart:
 	list_for_each_entry(sdev, &shost->__devices, siblings) {
 		if (sdev->channel != starget->channel ||
@@ -950,7 +948,6 @@ static void __scsi_remove_target(struct scsi_target *starget)
 		goto restart;
 	}
 	spin_unlock_irqrestore(shost->host_lock, flags);
-	scsi_target_reap(starget);
 }
 
 static int __remove_child (struct device * dev, void * data)
diff --git a/include/scsi/scsi_device.h b/include/scsi/scsi_device.h
index f6a9fe0..ccc437b 100644
--- a/include/scsi/scsi_device.h
+++ b/include/scsi/scsi_device.h
@@ -196,7 +196,6 @@ struct scsi_target {
 	struct list_head	siblings;
 	struct list_head	devices;
 	struct device		dev;
-	unsigned int		reap_ref; /* protected by the host lock */
 	unsigned int		channel;
 	unsigned int		id; /* target id ... replace
 				     * scsi_device.id eventually */
-- 
1.5.3.2


[-- Attachment #4: 0003-Improve-error-messages-in-scsi_sysfs_add_sdev.patch --]
[-- Type: text/x-patch, Size: 1185 bytes --]

>From b15634110e53a8418378c952bd1b6488f2746f86 Mon Sep 17 00:00:00 2001
From: Hannes Reinecke <hare@acerbis.suse.de>
Date: Mon, 25 Feb 2008 03:39:28 +0100
Subject: [PATCH] Improve error messages in scsi_sysfs_add_sdev()

When we fail to add a device to the driver core, only the very
helpful message 'error X' is displayed.
Print out some more meaningful messages.

Signed-off-by: Hannes Reinecke <hare@suse.de>
---
 drivers/scsi/scsi_sysfs.c |    6 ++++--
 1 files changed, 4 insertions(+), 2 deletions(-)

diff --git a/drivers/scsi/scsi_sysfs.c b/drivers/scsi/scsi_sysfs.c
index 4db0fed..8f674ac 100644
--- a/drivers/scsi/scsi_sysfs.c
+++ b/drivers/scsi/scsi_sysfs.c
@@ -830,12 +830,14 @@ int scsi_sysfs_add_sdev(struct scsi_device *sdev)
 	error = device_add(&sdev->sdev_gendev);
 	if (error) {
 		put_device(sdev->sdev_gendev.parent);
-		printk(KERN_INFO "error 1\n");
+		sdev_printk(KERN_INFO, sdev,
+			    "failed to add device: %d\n", error);
 		return error;
 	}
 	error = device_add(&sdev->sdev_dev);
 	if (error) {
-		printk(KERN_INFO "error 2\n");
+		sdev_printk(KERN_INFO, sdev,
+			    "failed to add class device: %d\n", error);
 		goto clean_device;
 	}
 
-- 
1.5.3.2


[-- Attachment #5: 0004-Remove-stale-put_device-from-scsi_sysfs_add_sdev.patch --]
[-- Type: text/x-patch, Size: 972 bytes --]

>From f1717cf66290b81f9b376ddeba65426c91fb7fe4 Mon Sep 17 00:00:00 2001
From: Hannes Reinecke <hare@acerbis.suse.de>
Date: Sat, 8 Mar 2008 18:01:40 +0100
Subject: [PATCH] Remove stale put_device() from scsi_sysfs_add_sdev()

In one obscure error path someone decided to do a put_device()
on the sdev parent.
This doesn't make much sense as we didn't take the reference
previously. So remove it.

Signed-off-by: Hannes Reinecke <hare@suse.de>
---
 drivers/scsi/scsi_sysfs.c |    1 -
 1 files changed, 0 insertions(+), 1 deletions(-)

diff --git a/drivers/scsi/scsi_sysfs.c b/drivers/scsi/scsi_sysfs.c
index 8f674ac..7dc3015 100644
--- a/drivers/scsi/scsi_sysfs.c
+++ b/drivers/scsi/scsi_sysfs.c
@@ -829,7 +829,6 @@ int scsi_sysfs_add_sdev(struct scsi_device *sdev)
 
 	error = device_add(&sdev->sdev_gendev);
 	if (error) {
-		put_device(sdev->sdev_gendev.parent);
 		sdev_printk(KERN_INFO, sdev,
 			    "failed to add device: %d\n", error);
 		return error;
-- 
1.5.3.2


[-- Attachment #6: 0005-Implement-SDEV_NEW-state.patch --]
[-- Type: text/x-patch, Size: 3769 bytes --]

>From 57109790e4cb53561f36d73a8efc7b9abd2736a9 Mon Sep 17 00:00:00 2001
From: Hannes Reinecke <hare@acerbis.suse.de>
Date: Sat, 8 Mar 2008 18:04:13 +0100
Subject: [PATCH] Implement SDEV_NEW state

When a scsi_device is allocated its state is set to SDEV_CREATED.
However, we don't have any chance to detect if slave_alloc() has
run successfully or not.
This patch introduces a state SDEV_NEW which is used instead of
SDEV_CREATED upon initial sdev creation. After slave_alloc() has
run successfully the state is changed to SDEV_CREATED.
This allows us to detect later on if we might call slave_destroy()
or not.

Signed-off-by: Hannes Reinecke <hare@suse.de>
---
 drivers/scsi/scsi_lib.c    |   17 +++++++++++++----
 drivers/scsi/scsi_scan.c   |    9 ++++++++-
 drivers/scsi/scsi_sysfs.c  |    1 +
 include/scsi/scsi_device.h |    3 ++-
 4 files changed, 24 insertions(+), 6 deletions(-)

diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
index ba21d97..c398767 100644
--- a/drivers/scsi/scsi_lib.c
+++ b/drivers/scsi/scsi_lib.c
@@ -1999,12 +1999,21 @@ scsi_device_set_state(struct scsi_device *sdev, enum scsi_device_state state)
 		return 0;
 
 	switch (state) {
-	case SDEV_CREATED:
+	case SDEV_NEW:
 		/* There are no legal states that come back to
-		 * created.  This is the manually initialised start
+		 * new.  This is the manually initialised start
 		 * state */
 		goto illegal;
-			
+
+	case SDEV_CREATED:
+		switch (oldstate) {
+		case SDEV_NEW:
+			break;
+		default:
+			goto illegal;
+		}
+		break;
+
 	case SDEV_RUNNING:
 		switch (oldstate) {
 		case SDEV_CREATED:
@@ -2064,7 +2073,7 @@ scsi_device_set_state(struct scsi_device *sdev, enum scsi_device_state state)
 
 	case SDEV_DEL:
 		switch (oldstate) {
-		case SDEV_CREATED:
+		case SDEV_NEW:
 		case SDEV_RUNNING:
 		case SDEV_OFFLINE:
 		case SDEV_CANCEL:
diff --git a/drivers/scsi/scsi_scan.c b/drivers/scsi/scsi_scan.c
index 2feab2a..aa632f9 100644
--- a/drivers/scsi/scsi_scan.c
+++ b/drivers/scsi/scsi_scan.c
@@ -253,7 +253,7 @@ static struct scsi_device *scsi_alloc_sdev(struct scsi_target *starget,
 	sdev->id = starget->id;
 	sdev->lun = lun;
 	sdev->channel = starget->channel;
-	sdev->sdev_state = SDEV_CREATED;
+	sdev->sdev_state = SDEV_NEW;
 	INIT_LIST_HEAD(&sdev->siblings);
 	INIT_LIST_HEAD(&sdev->same_target_siblings);
 	INIT_LIST_HEAD(&sdev->cmd_list);
@@ -307,9 +307,16 @@ static struct scsi_device *scsi_alloc_sdev(struct scsi_target *starget,
 			 */
 			if (ret == -ENXIO)
 				display_failure_msg = 0;
+			/*
+			 * sdev remains in SDEV_NEW as the release
+			 * function has to know whether slave_alloc()
+			 * failed or not.
+			 */
 			goto out_device_destroy;
 		}
 	}
+	/* Device is created properly */
+	scsi_device_set_state(sdev, SDEV_CREATED);
 
 	return sdev;
 
diff --git a/drivers/scsi/scsi_sysfs.c b/drivers/scsi/scsi_sysfs.c
index 7dc3015..3ec76dd 100644
--- a/drivers/scsi/scsi_sysfs.c
+++ b/drivers/scsi/scsi_sysfs.c
@@ -27,6 +27,7 @@ static const struct {
 	enum scsi_device_state	value;
 	char			*name;
 } sdev_states[] = {
+	{ SDEV_NEW, "new" },
 	{ SDEV_CREATED, "created" },
 	{ SDEV_RUNNING, "running" },
 	{ SDEV_CANCEL, "cancel" },
diff --git a/include/scsi/scsi_device.h b/include/scsi/scsi_device.h
index ccc437b..1616b26 100644
--- a/include/scsi/scsi_device.h
+++ b/include/scsi/scsi_device.h
@@ -28,7 +28,8 @@ struct scsi_mode_data {
  * scsi_lib:scsi_device_set_state().
  */
 enum scsi_device_state {
-	SDEV_CREATED = 1,	/* device created but not added to sysfs
+	SDEV_NEW = 1,		/* device created, slave_alloc has not run */
+	SDEV_CREATED,		/* device created but not added to sysfs
 				 * Only internal commands allowed (for inq) */
 	SDEV_RUNNING,		/* device properly configured
 				 * All commands allowed */
-- 
1.5.3.2


[-- Attachment #7: 0006-Rename-__scsi_remove_device-into-scsi_sysfs_remove.patch --]
[-- Type: text/x-patch, Size: 3669 bytes --]

>From a2d6ea3b183edc7a71f33e66fce63088c0f3e8a1 Mon Sep 17 00:00:00 2001
From: Hannes Reinecke <hare@acerbis.suse.de>
Date: Mon, 25 Feb 2008 04:17:51 +0100
Subject: [PATCH] Rename __scsi_remove_device() into scsi_sysfs_remove_sdev()

__scsi_remove_device() is actually the counterpart to
scsi_sysfs_add_sdev(). So we'd better rename it to avoid
confusion.

Signed-off-by: Hannes Reinecke <hare@suse.de>
---
 drivers/scsi/scsi_priv.h  |    2 +-
 drivers/scsi/scsi_scan.c  |    4 ++--
 drivers/scsi/scsi_sysfs.c |   10 +++++-----
 3 files changed, 8 insertions(+), 8 deletions(-)

diff --git a/drivers/scsi/scsi_priv.h b/drivers/scsi/scsi_priv.h
index b33e725..4ff548c 100644
--- a/drivers/scsi/scsi_priv.h
+++ b/drivers/scsi/scsi_priv.h
@@ -112,13 +112,13 @@ extern void scsi_exit_sysctl(void);
 
 /* scsi_sysfs.c */
 extern int scsi_sysfs_add_sdev(struct scsi_device *);
+extern void scsi_sysfs_remove_sdev(struct scsi_device *);
 extern int scsi_sysfs_add_host(struct Scsi_Host *);
 extern int scsi_sysfs_register(void);
 extern void scsi_sysfs_unregister(void);
 extern void scsi_sysfs_device_initialize(struct scsi_device *);
 extern int scsi_sysfs_target_initialize(struct scsi_device *);
 extern struct scsi_transport_template blank_transport_template;
-extern void __scsi_remove_device(struct scsi_device *);
 
 extern struct bus_type scsi_bus_type;
 extern struct attribute_group *scsi_sysfs_shost_attr_groups[];
diff --git a/drivers/scsi/scsi_scan.c b/drivers/scsi/scsi_scan.c
index aa632f9..971ac9e 100644
--- a/drivers/scsi/scsi_scan.c
+++ b/drivers/scsi/scsi_scan.c
@@ -1131,7 +1131,7 @@ static int scsi_probe_and_add_lun(struct scsi_target *starget,
 			if (scsi_device_get(sdev) == 0) {
 				*sdevp = sdev;
 			} else {
-				__scsi_remove_device(sdev);
+				scsi_sysfs_remove_sdev(sdev);
 				res = SCSI_SCAN_NO_RESPONSE;
 			}
 		}
@@ -1881,7 +1881,7 @@ void scsi_forget_host(struct Scsi_Host *shost)
 		if (sdev->sdev_state == SDEV_DEL)
 			continue;
 		spin_unlock_irqrestore(shost->host_lock, flags);
-		__scsi_remove_device(sdev);
+		scsi_sysfs_remove_sdev(sdev);
 		goto restart;
 	}
 	spin_unlock_irqrestore(shost->host_lock, flags);
diff --git a/drivers/scsi/scsi_sysfs.c b/drivers/scsi/scsi_sysfs.c
index 3ec76dd..0f33b99 100644
--- a/drivers/scsi/scsi_sysfs.c
+++ b/drivers/scsi/scsi_sysfs.c
@@ -851,7 +851,7 @@ int scsi_sysfs_add_sdev(struct scsi_device *sdev)
 	else
 		error = device_create_file(&sdev->sdev_gendev, &dev_attr_queue_depth);
 	if (error) {
-		__scsi_remove_device(sdev);
+		scsi_sysfs_remove_sdev(sdev);
 		goto out;
 	}
 	if (sdev->host->hostt->change_queue_type)
@@ -859,7 +859,7 @@ int scsi_sysfs_add_sdev(struct scsi_device *sdev)
 	else
 		error = device_create_file(&sdev->sdev_gendev, &dev_attr_queue_type);
 	if (error) {
-		__scsi_remove_device(sdev);
+		scsi_sysfs_remove_sdev(sdev);
 		goto out;
 	}
 
@@ -879,7 +879,7 @@ int scsi_sysfs_add_sdev(struct scsi_device *sdev)
 			error = device_create_file(&sdev->sdev_gendev,
 					sdev->host->hostt->sdev_attrs[i]);
 			if (error) {
-				__scsi_remove_device(sdev);
+				scsi_sysfs_remove_sdev(sdev);
 				goto out;
 			}
 		}
@@ -899,7 +899,7 @@ int scsi_sysfs_add_sdev(struct scsi_device *sdev)
 	return error;
 }
 
-void __scsi_remove_device(struct scsi_device *sdev)
+void scsi_sysfs_remove_sdev(struct scsi_device *sdev)
 {
 	struct device *dev = &sdev->sdev_gendev;
 
@@ -926,7 +926,7 @@ void scsi_remove_device(struct scsi_device *sdev)
 	struct Scsi_Host *shost = sdev->host;
 
 	mutex_lock(&shost->scan_mutex);
-	__scsi_remove_device(sdev);
+	scsi_sysfs_remove_sdev(sdev);
 	mutex_unlock(&shost->scan_mutex);
 }
 EXPORT_SYMBOL(scsi_remove_device);
-- 
1.5.3.2


[-- Attachment #8: 0007-Remove-stale-reap_ref-reference.patch --]
[-- Type: text/x-patch, Size: 786 bytes --]

>From 96841a56d0127229a74d5555830759838ff06454 Mon Sep 17 00:00:00 2001
From: Hannes Reinecke <hare@acerbis.suse.de>
Date: Sat, 8 Mar 2008 18:37:22 +0100
Subject: [PATCH] Remove stale reap_ref reference

Signed-off-by: Hannes Reinecke <hare@suse.de>
---
 drivers/scsi/scsi_scan.c |    1 -
 1 files changed, 0 insertions(+), 1 deletions(-)

diff --git a/drivers/scsi/scsi_scan.c b/drivers/scsi/scsi_scan.c
index 971ac9e..6130495 100644
--- a/drivers/scsi/scsi_scan.c
+++ b/drivers/scsi/scsi_scan.c
@@ -409,7 +409,6 @@ static struct scsi_target *scsi_alloc_target(struct device *parent,
 	}
 	dev = &starget->dev;
 	device_initialize(dev);
-	starget->reap_ref = 1;
 	dev->parent = get_device(parent);
 	sprintf(dev->bus_id, "target%d:%d:%d",
 		shost->host_no, channel, id);
-- 
1.5.3.2


^ permalink raw reply related	[flat|nested] 200+ messages in thread

* Re: [PATCH 04/24] sysfs: Normalize removing sysfs directories.
  2009-05-29  9:14             ` Tejun Heo
@ 2009-05-29 16:52               ` Eric W. Biederman
  2009-05-30 10:43                 ` Tejun Heo
  0 siblings, 1 reply; 200+ messages in thread
From: Eric W. Biederman @ 2009-05-29 16:52 UTC (permalink / raw)
  To: Tejun Heo
  Cc: Andrew Morton, Greg Kroah-Hartman, linux-kernel, Cornelia Huck,
	linux-fsdevel, Kay Sievers, Greg KH, Eric W. Biederman

Tejun Heo <tj@kernel.org> writes:

> Hello,
>
> Eric W. Biederman wrote:
>> @@ -732,12 +732,28 @@ const struct inode_operations sysfs_dir_inode_operations = {
>>  	.setattr	= sysfs_setattr,
>>  };
>>  
>> -static void remove_dir(struct sysfs_dirent *sd)
>> +static void remove_dir(struct sysfs_dirent *dir_sd)
>>  {
>>  	struct sysfs_addrm_cxt acxt;
>>  
>> -	sysfs_addrm_start(&acxt, sd->s_parent);
>> -	sysfs_remove_one(&acxt, sd);
>> +	pr_debug("sysfs %s: removing dir\n", dir_sd->s_name);
>> +
>> +	/* Removing non-empty directories is not valid complain! */
> 						     ^^^
> 						  missing . or ,
>
>> +static struct sysfs_dirent *get_dirent_to_remove(struct sysfs_dirent *dir_sd)
>> +{
>> +	struct sysfs_dirent *sd;
>> +
>> +	mutex_lock(&sysfs_mutex);
>> +	for (sd = dir_sd->s_dir.children; sd; sd = sd->s_sibling) {
>> +		/* Directories might be owned by someone else
>> +		 * making recursive directory removal unsafe.
>> +		 */
>> +		if (sysfs_type(sd) == SYSFS_DIR)
>> +			continue;
>> +		break;
>> +	}
>> +	sysfs_get(sd);
>> +	mutex_unlock(&sysfs_mutex);
>> +
>> +	return sd;
>> +}
>>  
>>  static void __sysfs_remove_dir(struct sysfs_dirent *dir_sd)
>>  {
>>  	struct sysfs_addrm_cxt acxt;
>> -	struct sysfs_dirent **pos;
>> -
>> -	if (!dir_sd)
>> -		return;
>> -
>> -	pr_debug("sysfs %s: removing dir\n", dir_sd->s_name);
>> -	sysfs_addrm_start(&acxt, dir_sd);
>> -	pos = &dir_sd->s_dir.children;
>> -	while (*pos) {
>> -		struct sysfs_dirent *sd = *pos;
>> +	struct sysfs_dirent *sd;
>>  
>> -		if (sysfs_type(sd) != SYSFS_DIR)
>> -			sysfs_remove_one(&acxt, sd);
>> -		else
>> -			pos = &(*pos)->s_sibling;
>> +	/* Remove children that we think are safe */
>> +	while ((sd = get_dirent_to_remove(dir_sd))) {
>> +		sysfs_addrm_start(&acxt, sd->s_parent);
>> +		sysfs_remove_one(&acxt, sd);
>> +		sysfs_addrm_finish(&acxt);
>> +		sysfs_put(sd);
>>  	}
>> -	sysfs_addrm_finish(&acxt);
>
> Ummm... Null @dir_sd handling is being removed, which could be fine
> but please do it in a separate patch or at least mention it in the
> patch description.

Agreed.  That should be documented.  I took a look and it appears we
are completely protected by the kobj->state_in_sysfs flag.
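
As an illustration of that kind of protection, here is a small userspace
sketch.  It assumes nothing about the real kobject code beyond the flag
name mentioned above: toy_obj, toy_sd and the toy_* functions are
hypothetical, and in_tree merely plays the role of state_in_sysfs.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical toy types; not the kernel's kobject or sysfs_dirent. */
struct toy_sd {
	int removed;
};

struct toy_obj {
	struct toy_sd *sd;	/* set when the object is added */
	bool in_tree;		/* plays the role of state_in_sysfs */
};

/* No NULL check here: callers must guarantee sd is valid. */
static void toy_remove_dir(struct toy_sd *sd)
{
	sd->removed = 1;
}

static void toy_add(struct toy_obj *obj, struct toy_sd *sd)
{
	obj->sd = sd;
	obj->in_tree = true;
}

static void toy_del(struct toy_obj *obj)
{
	/* The flag filters out objects that were never added or already gone. */
	if (!obj->in_tree)
		return;

	assert(obj->sd != NULL);	/* invariant: in_tree implies sd is set */
	toy_remove_dir(obj->sd);
	obj->sd = NULL;
	obj->in_tree = false;
}
```

If every caller goes through a guard like toy_del(), the callee never
sees a NULL pointer, so a NULL check inside it is dead code.  Whether
every real caller is covered that way is exactly what has to be audited.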

> Also, I'm quite uncomfortable with these things
> being done in a non-atomic manner.  It can be made to work but things
> like this can lead to subtle race conditions and with the kind of
> layering we put on top of sysfs (kobject, driver model, driver
> midlayers and so on), it isn't all that easy to verify what's going
> on, so NACK for this one.

Total nonsense.

Mucking about with sysfs after we start deleting a directory is a bug.
At worst my change makes a buggy race slightly less deterministic.

I am not ready to consider keeping the current unnecessary atomic
removal step.  That unnecessary atomicity makes the following patches
more difficult, and requires a lot of unnecessary retesting.

What do you think the extra unnecessary atomicity helps protect?

Eric

^ permalink raw reply	[flat|nested] 200+ messages in thread

* Re: [PATCH 22/24] sysfs: Make sysfs_rename_link atomic
  2009-05-29  9:16             ` Tejun Heo
@ 2009-05-29 17:17               ` Eric W. Biederman
  2009-05-30 10:48                 ` Tejun Heo
  0 siblings, 1 reply; 200+ messages in thread
From: Eric W. Biederman @ 2009-05-29 17:17 UTC (permalink / raw)
  To: Tejun Heo
  Cc: Andrew Morton, Greg Kroah-Hartman, linux-kernel, Cornelia Huck,
	linux-fsdevel, Kay Sievers, Greg KH, Eric W. Biederman

Tejun Heo <tj@kernel.org> writes:

> Eric W. Biederman wrote:
>> From: Eric W. Biederman <ebiederm@xmission.com>
>> 
>> Use the existing sysfs_rename to make sysfs_rename_link an atomic
>> operation that does less work.  While I am at it, add additional sanity
>> checking to ensure it is a symlink I am renaming.
>> 
>> Acked-by: Kay Sievers <kay.sievers@vrfy.org>
>> Signed-off-by: Eric W. Biederman <ebiederm@aristanetworks.com>
>
> It would be really nice to merge or group this together with the first
> three patches.  Other than that,

Perfection is the enemy of the good on that one.  That just convolutes
things unnecessarily, makes the patches harder to review, and requires
additional testing.

I much prefer to work in a tree without rewinding.

> Acked-by: Tejun Heo <tj@kernel.org>
>
> -- 
> tejun

^ permalink raw reply	[flat|nested] 200+ messages in thread

* Re: [PATCH 25/20] sysfs: Only support removing emtpy sysfs  directories.
  2009-05-28 20:10                                                                   ` James Bottomley
  2009-05-28 21:04                                                                     ` Alan Stern
@ 2009-05-29 20:08                                                                     ` Alan Stern
  1 sibling, 0 replies; 200+ messages in thread
From: Alan Stern @ 2009-05-29 20:08 UTC (permalink / raw)
  To: James Bottomley
  Cc: Hannes Reinecke, Kay Sievers, SCSI development list,
	Eric W. Biederman, Andrew Morton, Greg Kroah-Hartman,
	Kernel development list, Tejun Heo, Cornelia Huck, linux-fsdevel,
	Eric W. Biederman

To all interested parties:

I have just sent out a series of six patches addressing this problem.  
They are CC'ed to the linux-scsi list.

Alan Stern


^ permalink raw reply	[flat|nested] 200+ messages in thread

* [PATCH 0/26] sysfs cleanups v3.
  2009-05-28 22:58         ` [PATCH 0/24] sysfs cleanups Eric W. Biederman
                             ` (23 preceding siblings ...)
  2009-05-28 23:01           ` [PATCH 24/24] sysfs: In sysfs_add_one fail if the targe directory has been removed Eric W. Biederman
@ 2009-05-29 20:18           ` Eric W. Biederman
  2009-05-29 20:19             ` [PATCH 01/26] sysfs: Implement sysfs_rename_link Eric W. Biederman
                               ` (25 more replies)
  24 siblings, 26 replies; 200+ messages in thread
From: Eric W. Biederman @ 2009-05-29 20:18 UTC (permalink / raw)
  To: Greg KH
  Cc: Kay Sievers, Greg KH, Andrew Morton, linux-kernel, Tejun Heo,
	Cornelia Huck, linux-fsdevel, Eric W. Biederman


Here is the third respin of my sysfs patches.

Tejun had a good point: I was doing too much when I touched
__sysfs_remove_dir.  So that work has been split into 3 patches.

This made a good excuse for me to review the if (!dir_sd) check and be
certain it was not needed.  I have not found any actual problems so
the resulting code is the same as the last patchset.

Now the change to delete each dirent in __sysfs_remove_dir is its own
separate patch so Tejun can happily beat me up on just that one idea.
I honestly can't see how holding sysfs_mutex over the entire directory
deletion helps anything.

The net change is less buggy than the current sysfs and a fair bit simpler.

 git-diff-tree --stat v2.6.30-rc5..HEAD
 drivers/base/core.c   |   21 +--
 fs/namei.c            |   22 --
 fs/sysfs/dir.c        |  596 ++++++++++++++++---------------------------------
 fs/sysfs/file.c       |   47 +---
 fs/sysfs/inode.c      |  154 ++++++++------
 fs/sysfs/mount.c      |   20 +-
 fs/sysfs/symlink.c    |   93 +++++----
 fs/sysfs/sysfs.h      |   28 +--
 include/linux/namei.h |    1 -
 include/linux/sysfs.h |    9 +
 10 files changed, 383 insertions(+), 608 deletions(-)


Eric

^ permalink raw reply	[flat|nested] 200+ messages in thread

* [PATCH 01/26] sysfs: Implement sysfs_rename_link
  2009-05-29 20:18           ` [PATCH 0/26] sysfs cleanups v3 Eric W. Biederman
@ 2009-05-29 20:19             ` Eric W. Biederman
  2009-06-02 22:57               ` patch sysfs-implement-sysfs_rename_link.patch added to gregkh-2.6 tree gregkh
  2009-05-29 20:19             ` [PATCH 02/26] driver core: Use sysfs_rename_link in device_rename Eric W. Biederman
                               ` (24 subsequent siblings)
  25 siblings, 1 reply; 200+ messages in thread
From: Eric W. Biederman @ 2009-05-29 20:19 UTC (permalink / raw)
  To: Andrew Morton, Greg Kroah-Hartman
  Cc: linux-kernel, Tejun Heo, Cornelia Huck, linux-fsdevel,
	Kay Sievers, Greg KH, Eric W. Biederman, Benjamin Thery,
	Daniel Lezcano, Eric W. Biederman

From: Eric W. Biederman <ebiederm@xmission.com>

Because of rename ordering problems we occasionally give false
warnings about invalid sysfs operations, so implement a helper
function for this common sysfs idiom.

This is a stripped down version of an earlier patch that
also added sysfs_delete_link.

Cc: Benjamin Thery <benjamin.thery@bull.net>
Cc: Daniel Lezcano <dlezcano@fr.ibm.com>
Cc: Tejun Heo <tj@kernel.org>
Signed-off-by: Eric W. Biederman <ebiederm@aristanetworks.com>
---
 fs/sysfs/symlink.c    |   16 ++++++++++++++++
 include/linux/sysfs.h |    9 +++++++++
 2 files changed, 25 insertions(+), 0 deletions(-)

diff --git a/fs/sysfs/symlink.c b/fs/sysfs/symlink.c
index a3ba217..11c4da5 100644
--- a/fs/sysfs/symlink.c
+++ b/fs/sysfs/symlink.c
@@ -122,6 +122,22 @@ void sysfs_remove_link(struct kobject * kobj, const char * name)
 	sysfs_hash_and_remove(parent_sd, name);
 }
 
+/**
+ *	sysfs_rename_link - rename symlink in object's directory.
+ *	@kobj:	object we're acting for.
+ *	@targ:	object we're pointing to.
+ *	@old:	previous name of the symlink.
+ *	@new:	new name of the symlink.
+ *
+ *	A helper function for the common rename symlink idiom.
+ */
+int sysfs_rename_link(struct kobject *kobj, struct kobject *targ,
+			const char *old, const char *new)
+{
+	sysfs_remove_link(kobj, old);
+	return sysfs_create_link(kobj, targ, new);
+}
+
 static int sysfs_get_target_path(struct sysfs_dirent *parent_sd,
 				 struct sysfs_dirent *target_sd, char *path)
 {
diff --git a/include/linux/sysfs.h b/include/linux/sysfs.h
index 9d68fed..18c8e70 100644
--- a/include/linux/sysfs.h
+++ b/include/linux/sysfs.h
@@ -109,6 +109,9 @@ int __must_check sysfs_create_link_nowarn(struct kobject *kobj,
 					  const char *name);
 void sysfs_remove_link(struct kobject *kobj, const char *name);
 
+int sysfs_rename_link(struct kobject *kobj, struct kobject *target,
+			const char *old_name, const char *new_name);
+
 int __must_check sysfs_create_group(struct kobject *kobj,
 				    const struct attribute_group *grp);
 int sysfs_update_group(struct kobject *kobj,
@@ -202,6 +205,12 @@ static inline void sysfs_remove_link(struct kobject *kobj, const char *name)
 {
 }
 
+static inline int sysfs_rename_link(struct kobject *k, struct kobject *t,
+				    const char *old_name, const char *new_name)
+{
+	return 0;
+}
+
 static inline int sysfs_create_group(struct kobject *kobj,
 				     const struct attribute_group *grp)
 {
-- 
1.6.3.1.54.g99dd.dirty


^ permalink raw reply related	[flat|nested] 200+ messages in thread

* [PATCH 02/26] driver core: Use sysfs_rename_link in device_rename
  2009-05-29 20:18           ` [PATCH 0/26] sysfs cleanups v3 Eric W. Biederman
  2009-05-29 20:19             ` [PATCH 01/26] sysfs: Implement sysfs_rename_link Eric W. Biederman
@ 2009-05-29 20:19             ` Eric W. Biederman
  2009-06-02 22:57               ` patch driver-core-use-sysfs_rename_link-in-device_rename.patch added to gregkh-2.6 tree gregkh
  2009-05-29 20:19             ` [PATCH 03/26] sysfs: Remove now unnecessary error reporting suppression Eric W. Biederman
                               ` (23 subsequent siblings)
  25 siblings, 1 reply; 200+ messages in thread
From: Eric W. Biederman @ 2009-05-29 20:19 UTC (permalink / raw)
  To: Andrew Morton, Greg Kroah-Hartman
  Cc: linux-kernel, Tejun Heo, Cornelia Huck, linux-fsdevel,
	Kay Sievers, Greg KH, Eric W. Biederman, Eric W. Biederman

From: Eric W. Biederman <ebiederm@xmission.com>

Don't open code the renaming of symlinks in sysfs;
instead use the new helper function sysfs_rename_link.

Signed-off-by: Eric W. Biederman <ebiederm@aristanetworks.com>
---
 drivers/base/core.c |   18 ++++++------------
 1 files changed, 6 insertions(+), 12 deletions(-)

diff --git a/drivers/base/core.c b/drivers/base/core.c
index 4aa527b..8a1569c 100644
--- a/drivers/base/core.c
+++ b/drivers/base/core.c
@@ -1490,22 +1490,16 @@ int device_rename(struct device *dev, char *new_name)
 	if (old_class_name) {
 		new_class_name = make_class_name(dev->class->name, &dev->kobj);
 		if (new_class_name) {
-			error = sysfs_create_link_nowarn(&dev->parent->kobj,
-							 &dev->kobj,
-							 new_class_name);
-			if (error)
-				goto out;
-			sysfs_remove_link(&dev->parent->kobj, old_class_name);
+			error = sysfs_rename_link(&dev->parent->kobj,
+						  &dev->kobj,
+						  old_class_name,
+						  new_class_name);
 		}
 	}
 #else
 	if (dev->class) {
-		error = sysfs_create_link_nowarn(&dev->class->p->class_subsys.kobj,
-						 &dev->kobj, dev_name(dev));
-		if (error)
-			goto out;
-		sysfs_remove_link(&dev->class->p->class_subsys.kobj,
-				  old_device_name);
+		error = sysfs_rename_link(&dev->class->p->class_subsys.kobj,
+					  &dev->kobj, old_device_name, new_name);
 	}
 #endif
 
-- 
1.6.3.1.54.g99dd.dirty


^ permalink raw reply related	[flat|nested] 200+ messages in thread

* [PATCH 03/26] sysfs: Remove now unnecessary error reporting suppression.
  2009-05-29 20:18           ` [PATCH 0/26] sysfs cleanups v3 Eric W. Biederman
  2009-05-29 20:19             ` [PATCH 01/26] sysfs: Implement sysfs_rename_link Eric W. Biederman
  2009-05-29 20:19             ` [PATCH 02/26] driver core: Use sysfs_rename_link in device_rename Eric W. Biederman
@ 2009-05-29 20:19             ` Eric W. Biederman
  2009-06-02 22:57               ` patch sysfs-remove-now-unnecessary-error-reporting-suppression.patch added to gregkh-2.6 tree gregkh
  2009-05-29 20:19             ` [PATCH 04/26] sysfs: sysfs_remove_dir stop checking for bogus cases Eric W. Biederman
                               ` (22 subsequent siblings)
  25 siblings, 1 reply; 200+ messages in thread
From: Eric W. Biederman @ 2009-05-29 20:19 UTC (permalink / raw)
  To: Andrew Morton, Greg Kroah-Hartman
  Cc: linux-kernel, Tejun Heo, Cornelia Huck, linux-fsdevel,
	Kay Sievers, Greg KH, Eric W. Biederman, Eric W. Biederman

From: Eric W. Biederman <ebiederm@xmission.com>

Now that we use sysfs_rename_link in the places we previously
used sysfs_create_link_nowarn we can remove sysfs_create_link_nowarn
and all its supporting infrastructure as it has no callers.

Signed-off-by: Eric W. Biederman <ebiederm@aristanetworks.com>
---
 fs/sysfs/dir.c     |   54 +++++++++++----------------------------------------
 fs/sysfs/symlink.c |   42 ++++++++-------------------------------
 fs/sysfs/sysfs.h   |    1 -
 3 files changed, 21 insertions(+), 76 deletions(-)

diff --git a/fs/sysfs/dir.c b/fs/sysfs/dir.c
index d88d0fa..b95cc07 100644
--- a/fs/sysfs/dir.c
+++ b/fs/sysfs/dir.c
@@ -397,43 +397,6 @@ void sysfs_addrm_start(struct sysfs_addrm_cxt *acxt,
 }
 
 /**
- *	__sysfs_add_one - add sysfs_dirent to parent without warning
- *	@acxt: addrm context to use
- *	@sd: sysfs_dirent to be added
- *
- *	Get @acxt->parent_sd and set sd->s_parent to it and increment
- *	nlink of parent inode if @sd is a directory and link into the
- *	children list of the parent.
- *
- *	This function should be called between calls to
- *	sysfs_addrm_start() and sysfs_addrm_finish() and should be
- *	passed the same @acxt as passed to sysfs_addrm_start().
- *
- *	LOCKING:
- *	Determined by sysfs_addrm_start().
- *
- *	RETURNS:
- *	0 on success, -EEXIST if entry with the given name already
- *	exists.
- */
-int __sysfs_add_one(struct sysfs_addrm_cxt *acxt, struct sysfs_dirent *sd)
-{
-	if (sysfs_find_dirent(acxt->parent_sd, sd->s_name))
-		return -EEXIST;
-
-	sd->s_parent = sysfs_get(acxt->parent_sd);
-
-	if (sysfs_type(sd) == SYSFS_DIR && acxt->parent_inode)
-		inc_nlink(acxt->parent_inode);
-
-	acxt->cnt++;
-
-	sysfs_link_sibling(sd);
-
-	return 0;
-}
-
-/**
  *	sysfs_pathname - return full path to sysfs dirent
  *	@sd: sysfs_dirent whose path we want
  *	@path: caller allocated buffer
@@ -475,10 +438,7 @@ static char *sysfs_pathname(struct sysfs_dirent *sd, char *path)
  */
 int sysfs_add_one(struct sysfs_addrm_cxt *acxt, struct sysfs_dirent *sd)
 {
-	int ret;
-
-	ret = __sysfs_add_one(acxt, sd);
-	if (ret == -EEXIST) {
+	if (sysfs_find_dirent(acxt->parent_sd, sd->s_name)) {
 		char *path = kzalloc(PATH_MAX, GFP_KERNEL);
 		WARN(1, KERN_WARNING
 		     "sysfs: cannot create duplicate filename '%s'\n",
@@ -486,9 +446,19 @@ int sysfs_add_one(struct sysfs_addrm_cxt *acxt, struct sysfs_dirent *sd)
 		     strcat(strcat(sysfs_pathname(acxt->parent_sd, path), "/"),
 		            sd->s_name));
 		kfree(path);
+		return -EEXIST;
 	}
 
-	return ret;
+	sd->s_parent = sysfs_get(acxt->parent_sd);
+
+	if (sysfs_type(sd) == SYSFS_DIR && acxt->parent_inode)
+		inc_nlink(acxt->parent_inode);
+
+	acxt->cnt++;
+
+	sysfs_link_sibling(sd);
+
+	return 0;
 }
 
 /**
diff --git a/fs/sysfs/symlink.c b/fs/sysfs/symlink.c
index 11c4da5..ac13e61 100644
--- a/fs/sysfs/symlink.c
+++ b/fs/sysfs/symlink.c
@@ -19,8 +19,14 @@
 
 #include "sysfs.h"
 
-static int sysfs_do_create_link(struct kobject *kobj, struct kobject *target,
-				const char *name, int warn)
+/**
+ *	sysfs_create_link - create symlink between two objects.
+ *	@kobj:	object whose directory we're creating the link in.
+ *	@target:	object we're pointing to.
+ *	@name:		name of the symlink.
+ */
+int sysfs_create_link(struct kobject *kobj, struct kobject *target,
+			const char *name)
 {
 	struct sysfs_dirent *parent_sd = NULL;
 	struct sysfs_dirent *target_sd = NULL;
@@ -60,10 +66,7 @@ static int sysfs_do_create_link(struct kobject *kobj, struct kobject *target,
 	target_sd = NULL;	/* reference is now owned by the symlink */
 
 	sysfs_addrm_start(&acxt, parent_sd);
-	if (warn)
-		error = sysfs_add_one(&acxt, sd);
-	else
-		error = __sysfs_add_one(&acxt, sd);
+	error = sysfs_add_one(&acxt, sd);
 	sysfs_addrm_finish(&acxt);
 
 	if (error)
@@ -78,33 +81,6 @@ static int sysfs_do_create_link(struct kobject *kobj, struct kobject *target,
 }
 
 /**
- *	sysfs_create_link - create symlink between two objects.
- *	@kobj:	object whose directory we're creating the link in.
- *	@target:	object we're pointing to.
- *	@name:		name of the symlink.
- */
-int sysfs_create_link(struct kobject *kobj, struct kobject *target,
-		      const char *name)
-{
-	return sysfs_do_create_link(kobj, target, name, 1);
-}
-
-/**
- *	sysfs_create_link_nowarn - create symlink between two objects.
- *	@kobj:	object whose directory we're creating the link in.
- *	@target:	object we're pointing to.
- *	@name:		name of the symlink.
- *
- *	This function does the same as sysf_create_link(), but it
- *	doesn't warn if the link already exists.
- */
-int sysfs_create_link_nowarn(struct kobject *kobj, struct kobject *target,
-			     const char *name)
-{
-	return sysfs_do_create_link(kobj, target, name, 0);
-}
-
-/**
  *	sysfs_remove_link - remove symlink in object's directory.
  *	@kobj:	object we're acting for.
  *	@name:	name of the symlink to remove.
diff --git a/fs/sysfs/sysfs.h b/fs/sysfs/sysfs.h
index 3fa0d98..abf05f4 100644
--- a/fs/sysfs/sysfs.h
+++ b/fs/sysfs/sysfs.h
@@ -108,7 +108,6 @@ struct sysfs_dirent *sysfs_get_active_two(struct sysfs_dirent *sd);
 void sysfs_put_active_two(struct sysfs_dirent *sd);
 void sysfs_addrm_start(struct sysfs_addrm_cxt *acxt,
 		       struct sysfs_dirent *parent_sd);
-int __sysfs_add_one(struct sysfs_addrm_cxt *acxt, struct sysfs_dirent *sd);
 int sysfs_add_one(struct sysfs_addrm_cxt *acxt, struct sysfs_dirent *sd);
 void sysfs_remove_one(struct sysfs_addrm_cxt *acxt, struct sysfs_dirent *sd);
 void sysfs_addrm_finish(struct sysfs_addrm_cxt *acxt);
-- 
1.6.3.1.54.g99dd.dirty


^ permalink raw reply related	[flat|nested] 200+ messages in thread

* [PATCH 04/26] sysfs: sysfs_remove_dir stop checking for bogus cases.
  2009-05-29 20:18           ` [PATCH 0/26] sysfs cleanups v3 Eric W. Biederman
                               ` (2 preceding siblings ...)
  2009-05-29 20:19             ` [PATCH 03/26] sysfs: Remove now unnecessary error reporting suppression Eric W. Biederman
@ 2009-05-29 20:19             ` Eric W. Biederman
  2009-06-03 23:53               ` Greg KH
  2009-05-29 20:19             ` [PATCH 05/26] sysfs: Improve sysfs directory deletion debugging Eric W. Biederman
                               ` (21 subsequent siblings)
  25 siblings, 1 reply; 200+ messages in thread
From: Eric W. Biederman @ 2009-05-29 20:19 UTC (permalink / raw)
  To: Andrew Morton, Greg Kroah-Hartman
  Cc: linux-kernel, Tejun Heo, Cornelia Huck, linux-fsdevel,
	Kay Sievers, Greg KH, Eric W. Biederman, Eric W. Biederman

From: Eric W. Biederman <ebiederm@xmission.com>

kobj->sd can not be NULL in sysfs_remove_dir.

sysfs_remove_dir is only called from kobject_add (to clean up after failure)
and from kobject_del at the end of a kobject's life.  In both cases kobject_add
has already called sysfs_create_dir successfully.  The only writers of
kobj->sd are sysfs_create_dir on success and sysfs_remove_dir when it clears
the kobj just before deleting the directory.

Which means at the time sysfs_remove_dir is called kobj->sd will be
valid.

Signed-off-by: Eric W. Biederman <ebiederm@aristanetworks.com>
---
 fs/sysfs/dir.c |    3 ---
 1 files changed, 0 insertions(+), 3 deletions(-)

diff --git a/fs/sysfs/dir.c b/fs/sysfs/dir.c
index b95cc07..a55e1d4 100644
--- a/fs/sysfs/dir.c
+++ b/fs/sysfs/dir.c
@@ -752,9 +752,6 @@ static void __sysfs_remove_dir(struct sysfs_dirent *dir_sd)
 	struct sysfs_addrm_cxt acxt;
 	struct sysfs_dirent **pos;
 
-	if (!dir_sd)
-		return;
-
 	pr_debug("sysfs %s: removing dir\n", dir_sd->s_name);
 	sysfs_addrm_start(&acxt, dir_sd);
 	pos = &dir_sd->s_dir.children;
-- 
1.6.3.1.54.g99dd.dirty


^ permalink raw reply related	[flat|nested] 200+ messages in thread

* [PATCH 05/26] sysfs: Improve sysfs directory deletion debugging.
  2009-05-29 20:18           ` [PATCH 0/26] sysfs cleanups v3 Eric W. Biederman
                               ` (3 preceding siblings ...)
  2009-05-29 20:19             ` [PATCH 04/26] sysfs: sysfs_remove_dir stop checking for bogus cases Eric W. Biederman
@ 2009-05-29 20:19             ` Eric W. Biederman
  2009-05-29 20:19             ` [PATCH 06/26] sysfs: Don't hold addrm_start/addrm_finish over multiple removals Eric W. Biederman
                               ` (20 subsequent siblings)
  25 siblings, 0 replies; 200+ messages in thread
From: Eric W. Biederman @ 2009-05-29 20:19 UTC (permalink / raw)
  To: Andrew Morton, Greg Kroah-Hartman
  Cc: linux-kernel, Tejun Heo, Cornelia Huck, linux-fsdevel,
	Kay Sievers, Greg KH, Eric W. Biederman, Eric W. Biederman

From: Eric W. Biederman <ebiederm@xmission.com>

We have found several cases where directories are deleted without
removing all of their subdirectories.  That case isn't valid so
warn anyone who makes that mistake, and continue to leak dirents
to keep the system as operational as possible.

Move the debug message when a directory is deleted into remove_dir
so we are told when subdirectories are deleted as well as
full-fledged kobject directories.

Signed-off-by: Eric W. Biederman <ebiederm@aristanetworks.com>
---
 fs/sysfs/dir.c |   23 +++++++++++++++++++----
 1 files changed, 19 insertions(+), 4 deletions(-)

diff --git a/fs/sysfs/dir.c b/fs/sysfs/dir.c
index a55e1d4..60482be 100644
--- a/fs/sysfs/dir.c
+++ b/fs/sysfs/dir.c
@@ -732,12 +732,28 @@ const struct inode_operations sysfs_dir_inode_operations = {
 	.setattr	= sysfs_setattr,
 };
 
-static void remove_dir(struct sysfs_dirent *sd)
+static void remove_dir(struct sysfs_dirent *dir_sd)
 {
 	struct sysfs_addrm_cxt acxt;
 
-	sysfs_addrm_start(&acxt, sd->s_parent);
-	sysfs_remove_one(&acxt, sd);
+	pr_debug("sysfs %s: removing dir\n", dir_sd->s_name);
+
+	/* Removing non-empty directories is not valid complain! */
+	if (unlikely(dir_sd->s_dir.children)) {
+		struct sysfs_dirent *sd;
+
+		WARN(1, KERN_WARNING "sysfs: removing non-empty dir: %s\n",
+			dir_sd->s_name);
+
+		mutex_lock(&sysfs_mutex);
+		for (sd = dir_sd->s_dir.children; sd; sd  = sd->s_sibling)
+			printk(KERN_WARNING "%s/%s\n",
+				dir_sd->s_name, sd->s_name);
+		mutex_unlock(&sysfs_mutex);
+	}
+
+	sysfs_addrm_start(&acxt, dir_sd->s_parent);
+	sysfs_remove_one(&acxt, dir_sd);
 	sysfs_addrm_finish(&acxt);
 }
 
@@ -752,7 +768,6 @@ static void __sysfs_remove_dir(struct sysfs_dirent *dir_sd)
 	struct sysfs_addrm_cxt acxt;
 	struct sysfs_dirent **pos;
 
-	pr_debug("sysfs %s: removing dir\n", dir_sd->s_name);
 	sysfs_addrm_start(&acxt, dir_sd);
 	pos = &dir_sd->s_dir.children;
 	while (*pos) {
-- 
1.6.3.1.54.g99dd.dirty


^ permalink raw reply related	[flat|nested] 200+ messages in thread

* [PATCH 06/26] sysfs: Don't hold addrm_start/addrm_finish over multiple removals.
  2009-05-29 20:18           ` [PATCH 0/26] sysfs cleanups v3 Eric W. Biederman
                               ` (4 preceding siblings ...)
  2009-05-29 20:19             ` [PATCH 05/26] sysfs: Improve sysfs directory deletion debugging Eric W. Biederman
@ 2009-05-29 20:19             ` Eric W. Biederman
  2009-05-29 20:19             ` [PATCH 07/26] sysfs: Rename sysfs_d_iput to sysfs_dentry_iput Eric W. Biederman
                               ` (19 subsequent siblings)
  25 siblings, 0 replies; 200+ messages in thread
From: Eric W. Biederman @ 2009-05-29 20:19 UTC (permalink / raw)
  To: Andrew Morton, Greg Kroah-Hartman
  Cc: linux-kernel, Tejun Heo, Cornelia Huck, linux-fsdevel,
	Kay Sievers, Greg KH, Eric W. Biederman, Eric W. Biederman

From: Eric W. Biederman <ebiederm@xmission.com>

With respect to the basic integrity of the sysfs data structures, holding
the locks across the deletion of multiple sysfs_dirents is unnecessary.

Upper layers are required to coordinate their activity so that they
do not add or delete entries in sysfs directories as they are being
removed, and I have seen nothing to indicate they don't. The upper layers
can not rely on sysfs doing anything for them as it is a compile option
and may not be there.  So the previous atomic delete of the directory
entries and the directory serves no useful purpose.

By removing the only case where addrm_start/addrm_finish are held
over multiple dirent removals I simplify the requirements and
pave the way for removing sysfs_addrm_start and sysfs_addrm_finish
completely.

Additionally add some comments explaining some of the thinking behind
sysfs_dirent removal in __sysfs_remove_dir.
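
The removal loop described above can be sketched in user space as
follows.  This is a simplified model, not the kernel code: the toy_*
names are hypothetical, an integer stands in for sysfs_mutex, and the
reference counting done by sysfs_get/sysfs_put is left out.

```c
#include <assert.h>
#include <stddef.h>

enum toy_type { TOY_FILE, TOY_DIR };

struct toy_dirent {
	enum toy_type type;
	struct toy_dirent *sibling;
};

struct toy_parent {
	struct toy_dirent *children;
	int lock_depth;   /* stand-in for sysfs_mutex: 1 while "held" */
	int lock_cycles;  /* how many times the lock was taken */
};

static void toy_lock(struct toy_parent *p)
{
	assert(p->lock_depth == 0);
	p->lock_depth = 1;
	p->lock_cycles++;
}

static void toy_unlock(struct toy_parent *p)
{
	assert(p->lock_depth == 1);
	p->lock_depth = 0;
}

/* Return the first child that is not a directory, or NULL if none is left. */
static struct toy_dirent *toy_get_dirent_to_remove(struct toy_parent *p)
{
	struct toy_dirent *sd;

	toy_lock(p);
	for (sd = p->children; sd; sd = sd->sibling)
		if (sd->type != TOY_DIR)
			break;
	toy_unlock(p);
	return sd;
}

/* Unlink a single child under its own lock/unlock pair. */
static void toy_remove_one(struct toy_parent *p, struct toy_dirent *victim)
{
	struct toy_dirent **pos;

	toy_lock(p);
	for (pos = &p->children; *pos; pos = &(*pos)->sibling) {
		if (*pos == victim) {
			*pos = victim->sibling;
			victim->sibling = NULL;
			break;
		}
	}
	toy_unlock(p);
}

/* Remove every non-directory child; return how many were removed. */
static int toy_remove_files(struct toy_parent *p)
{
	struct toy_dirent *sd;
	int removed = 0;

	while ((sd = toy_get_dirent_to_remove(p)) != NULL) {
		toy_remove_one(p, sd);
		removed++;
	}
	return removed;
}
```

The lock is never held across two removals in this model, and child
directories are skipped rather than removed recursively.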

Signed-off-by: Eric W. Biederman <ebiederm@aristanetworks.com>
---
 fs/sysfs/dir.c |   36 +++++++++++++++++++++++++-----------
 1 files changed, 25 insertions(+), 11 deletions(-)

diff --git a/fs/sysfs/dir.c b/fs/sysfs/dir.c
index 60482be..3e3a87f 100644
--- a/fs/sysfs/dir.c
+++ b/fs/sysfs/dir.c
@@ -762,23 +762,37 @@ void sysfs_remove_subdir(struct sysfs_dirent *sd)
 	remove_dir(sd);
 }
 
+static struct sysfs_dirent *get_dirent_to_remove(struct sysfs_dirent *dir_sd)
+{
+	struct sysfs_dirent *sd;
+
+	mutex_lock(&sysfs_mutex);
+	for (sd = dir_sd->s_dir.children; sd; sd = sd->s_sibling) {
+		/* Directories might be owned by someone else
+		 * making recursive directory removal unsafe.
+		 */
+		if (sysfs_type(sd) == SYSFS_DIR)
+			continue;
+		break;
+	}
+	sysfs_get(sd);
+	mutex_unlock(&sysfs_mutex);
+
+	return sd;
+}
 
 static void __sysfs_remove_dir(struct sysfs_dirent *dir_sd)
 {
 	struct sysfs_addrm_cxt acxt;
-	struct sysfs_dirent **pos;
-
-	sysfs_addrm_start(&acxt, dir_sd);
-	pos = &dir_sd->s_dir.children;
-	while (*pos) {
-		struct sysfs_dirent *sd = *pos;
+	struct sysfs_dirent *sd;
 
-		if (sysfs_type(sd) != SYSFS_DIR)
-			sysfs_remove_one(&acxt, sd);
-		else
-			pos = &(*pos)->s_sibling;
+	/* Remove children that we think are safe */
+	while ((sd = get_dirent_to_remove(dir_sd))) {
+		sysfs_addrm_start(&acxt, sd->s_parent);
+		sysfs_remove_one(&acxt, sd);
+		sysfs_addrm_finish(&acxt);
+		sysfs_put(sd);
 	}
-	sysfs_addrm_finish(&acxt);
 
 	remove_dir(dir_sd);
 }
-- 
1.6.3.1.54.g99dd.dirty


^ permalink raw reply related	[flat|nested] 200+ messages in thread

* [PATCH 07/26] sysfs: Rename sysfs_d_iput to sysfs_dentry_iput
  2009-05-29 20:18           ` [PATCH 0/26] sysfs cleanups v3 Eric W. Biederman
                               ` (5 preceding siblings ...)
  2009-05-29 20:19             ` [PATCH 06/26] sysfs: Don't hold addrm_start/addrm_finish over multiple removals Eric W. Biederman
@ 2009-05-29 20:19             ` Eric W. Biederman
  2009-05-29 20:19             ` [PATCH 08/26] sysfs: Use dentry_ops instead of directly playing with the dcache Eric W. Biederman
                               ` (18 subsequent siblings)
  25 siblings, 0 replies; 200+ messages in thread
From: Eric W. Biederman @ 2009-05-29 20:19 UTC (permalink / raw)
  To: Andrew Morton, Greg Kroah-Hartman
  Cc: linux-kernel, Tejun Heo, Cornelia Huck, linux-fsdevel,
	Kay Sievers, Greg KH, Eric W. Biederman, Eric W. Biederman

From: Eric W. Biederman <ebiederm@xmission.com>

Using dentry instead of d in the function name is what
several other filesystems are doing and it seems to be
a more readable convention.

Acked-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Eric W. Biederman <ebiederm@aristanetworks.com>
---
 fs/sysfs/dir.c |    4 ++--
 1 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/fs/sysfs/dir.c b/fs/sysfs/dir.c
index 3e3a87f..f9f32b8 100644
--- a/fs/sysfs/dir.c
+++ b/fs/sysfs/dir.c
@@ -294,7 +294,7 @@ void release_sysfs_dirent(struct sysfs_dirent * sd)
 		goto repeat;
 }
 
-static void sysfs_d_iput(struct dentry * dentry, struct inode * inode)
+static void sysfs_dentry_iput(struct dentry * dentry, struct inode * inode)
 {
 	struct sysfs_dirent * sd = dentry->d_fsdata;
 
@@ -303,7 +303,7 @@ static void sysfs_d_iput(struct dentry * dentry, struct inode * inode)
 }
 
 static const struct dentry_operations sysfs_dentry_ops = {
-	.d_iput		= sysfs_d_iput,
+	.d_iput		= sysfs_dentry_iput,
 };
 
 struct sysfs_dirent *sysfs_new_dirent(const char *name, umode_t mode, int type)
-- 
1.6.3.1.54.g99dd.dirty


^ permalink raw reply related	[flat|nested] 200+ messages in thread

* [PATCH 08/26] sysfs: Use dentry_ops instead of directly playing with the dcache
  2009-05-29 20:18           ` [PATCH 0/26] sysfs cleanups v3 Eric W. Biederman
                               ` (6 preceding siblings ...)
  2009-05-29 20:19             ` [PATCH 07/26] sysfs: Rename sysfs_d_iput to sysfs_dentry_iput Eric W. Biederman
@ 2009-05-29 20:19             ` Eric W. Biederman
  2009-05-29 20:19             ` [PATCH 09/26] sysfs: Simplify sysfs_chmod_file semantics Eric W. Biederman
                               ` (17 subsequent siblings)
  25 siblings, 0 replies; 200+ messages in thread
From: Eric W. Biederman @ 2009-05-29 20:19 UTC (permalink / raw)
  To: Andrew Morton, Greg Kroah-Hartman
  Cc: linux-kernel, Tejun Heo, Cornelia Huck, linux-fsdevel,
	Kay Sievers, Greg KH, Eric W. Biederman, Eric W. Biederman

From: Eric W. Biederman <ebiederm@xmission.com>

Calling d_drop unconditionally when a sysfs_dirent is deleted has
the potential to leak mounts, so instead implement dentry delete
and revalidate operations that cause sysfs dentries to be removed
at the appropriate time.
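
A rough user-space model of the decision the new dentry operations make
is given below.  The toy_* names and the boolean fields are hypothetical
stand-ins (for SYSFS_FLAG_REMOVED, have_submounts() and the dcache hash
state); locking and shrink_dcache_parent() are not modeled.

```c
#include <assert.h>
#include <stdbool.h>

struct toy_node {
	bool removed;        /* models SYSFS_FLAG_REMOVED being set */
	bool is_dir;         /* models sysfs_type(sd) == SYSFS_DIR */
	bool has_submounts;  /* models have_submounts() returning true */
	bool hashed;         /* still reachable through the cache */
};

/*
 * Return 1 if the cached entry may still be used, 0 if it was dropped.
 * A removed directory with something mounted beneath it is kept, so the
 * mount is not leaked.
 */
static int toy_revalidate(struct toy_node *n)
{
	if (!n->removed)
		return 1;
	if (n->is_dir && n->has_submounts)
		return 1;
	n->hashed = false;	/* the d_drop() step */
	return 0;
}

/* Return 1 if the entry should be discarded when its last user goes away. */
static int toy_delete(const struct toy_node *n)
{
	return n->removed ? 1 : 0;
}
```

The point of the model is that dropping is decided lazily, at lookup or
last-put time, instead of eagerly when the sysfs_dirent is deleted.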

Acked-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Eric W. Biederman <ebiederm@aristanetworks.com>
---
 fs/sysfs/dir.c |   73 +++++++++++++++++++++++++++++++++++--------------------
 1 files changed, 46 insertions(+), 27 deletions(-)

diff --git a/fs/sysfs/dir.c b/fs/sysfs/dir.c
index f9f32b8..e0bf3a5 100644
--- a/fs/sysfs/dir.c
+++ b/fs/sysfs/dir.c
@@ -294,6 +294,46 @@ void release_sysfs_dirent(struct sysfs_dirent * sd)
 		goto repeat;
 }
 
+static int sysfs_dentry_delete(struct dentry *dentry)
+{
+	struct sysfs_dirent *sd = dentry->d_fsdata;
+	return !!(sd->s_flags & SYSFS_FLAG_REMOVED);
+}
+
+static int sysfs_dentry_revalidate(struct dentry *dentry, struct nameidata *nd)
+{
+	struct sysfs_dirent *sd = dentry->d_fsdata;
+	int is_dir;
+
+	mutex_lock(&sysfs_mutex);
+
+	/* The sysfs dirent has been deleted */
+	if (sd->s_flags & SYSFS_FLAG_REMOVED)
+		goto out_bad;
+
+	mutex_unlock(&sysfs_mutex);
+out_valid:
+	return 1;
+out_bad:
+	/* Remove the dentry from the dcache hashes.
+	 * If this is a deleted dentry we use d_drop instead of d_delete
+	 * so sysfs doesn't need to cope with negative dentries.
+	 */
+	is_dir = (sysfs_type(sd) == SYSFS_DIR);
+	mutex_unlock(&sysfs_mutex);
+	if (is_dir) {
+		/* If we have submounts we must allow the vfs caches
+		 * to lie about the state of the filesystem to prevent
+		 * leaks and other nasty things.
+		 */
+		if (have_submounts(dentry))
+			goto out_valid;
+		shrink_dcache_parent(dentry);
+	}
+	d_drop(dentry);
+	return 0;
+}
+
 static void sysfs_dentry_iput(struct dentry * dentry, struct inode * inode)
 {
 	struct sysfs_dirent * sd = dentry->d_fsdata;
@@ -303,6 +343,8 @@ static void sysfs_dentry_iput(struct dentry * dentry, struct inode * inode)
 }
 
 static const struct dentry_operations sysfs_dentry_ops = {
+	.d_revalidate	= sysfs_dentry_revalidate,
+	.d_delete	= sysfs_dentry_delete,
 	.d_iput		= sysfs_dentry_iput,
 };
 
@@ -493,44 +535,21 @@ void sysfs_remove_one(struct sysfs_addrm_cxt *acxt, struct sysfs_dirent *sd)
 }
 
 /**
- *	sysfs_drop_dentry - drop dentry for the specified sysfs_dirent
+ *	sysfs_dec_nlink - Decrement link count for the specified sysfs_dirent
  *	@sd: target sysfs_dirent
  *
- *	Drop dentry for @sd.  @sd must have been unlinked from its
+ *	Decrement nlink for @sd.  @sd must have been unlinked from its
  *	parent on entry to this function such that it can't be looked
  *	up anymore.
  */
-static void sysfs_drop_dentry(struct sysfs_dirent *sd)
+static void sysfs_dec_nlink(struct sysfs_dirent *sd)
 {
 	struct inode *inode;
-	struct dentry *dentry;
 
 	inode = ilookup(sysfs_sb, sd->s_ino);
 	if (!inode)
 		return;
 
-	/* Drop any existing dentries associated with sd.
-	 *
-	 * For the dentry to be properly freed we need to grab a
-	 * reference to the dentry under the dcache lock,  unhash it,
-	 * and then put it.  The playing with the dentry count allows
-	 * dput to immediately free the dentry  if it is not in use.
-	 */
-repeat:
-	spin_lock(&dcache_lock);
-	list_for_each_entry(dentry, &inode->i_dentry, d_alias) {
-		if (d_unhashed(dentry))
-			continue;
-		dget_locked(dentry);
-		spin_lock(&dentry->d_lock);
-		__d_drop(dentry);
-		spin_unlock(&dentry->d_lock);
-		spin_unlock(&dcache_lock);
-		dput(dentry);
-		goto repeat;
-	}
-	spin_unlock(&dcache_lock);
-
 	/* adjust nlink and update timestamp */
 	mutex_lock(&inode->i_mutex);
 
@@ -577,7 +596,7 @@ void sysfs_addrm_finish(struct sysfs_addrm_cxt *acxt)
 		acxt->removed = sd->s_sibling;
 		sd->s_sibling = NULL;
 
-		sysfs_drop_dentry(sd);
+		sysfs_dec_nlink(sd);
 		sysfs_deactivate(sd);
 		unmap_bin_file(sd);
 		sysfs_put(sd);
-- 
1.6.3.1.54.g99dd.dirty


^ permalink raw reply related	[flat|nested] 200+ messages in thread

* [PATCH 09/26] sysfs: Simplify sysfs_chmod_file semantics
  2009-05-29 20:18           ` [PATCH 0/26] sysfs cleanups v3 Eric W. Biederman
                               ` (7 preceding siblings ...)
  2009-05-29 20:19             ` [PATCH 08/26] sysfs: Use dentry_ops instead of directly playing with the dcache Eric W. Biederman
@ 2009-05-29 20:19             ` Eric W. Biederman
  2009-05-29 20:19             ` [PATCH 10/26] sysfs: Optimize just changing the sysfs file mode Eric W. Biederman
                               ` (16 subsequent siblings)
  25 siblings, 0 replies; 200+ messages in thread
From: Eric W. Biederman @ 2009-05-29 20:19 UTC (permalink / raw)
  To: Andrew Morton, Greg Kroah-Hartman
  Cc: linux-kernel, Tejun Heo, Cornelia Huck, linux-fsdevel,
	Kay Sievers, Greg KH, Eric W. Biederman, Eric W. Biederman

From: Eric W. Biederman <ebiederm@xmission.com>

Currently every call to sysfs_chmod_file happens either at
file creation time to set a non-default mode or in response
to a specific user space requested change in policy.  This makes
timestamps of when the chmod happens and notification of
a file changing mode uninteresting.

Remove the unnecessary time stamp and filesystem change
notification, and remove the last of the explicit inotify
and dnotify support from sysfs.

Acked-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Eric W. Biederman <ebiederm@aristanetworks.com>
---
 fs/sysfs/file.c |   10 +---------
 1 files changed, 1 insertions(+), 9 deletions(-)

diff --git a/fs/sysfs/file.c b/fs/sysfs/file.c
index b1606e0..0786b41 100644
--- a/fs/sysfs/file.c
+++ b/fs/sysfs/file.c
@@ -602,17 +602,9 @@ int sysfs_chmod_file(struct kobject *kobj, struct attribute *attr, mode_t mode)
 	mutex_lock(&inode->i_mutex);
 
 	newattrs.ia_mode = (mode & S_IALLUGO) | (inode->i_mode & ~S_IALLUGO);
-	newattrs.ia_valid = ATTR_MODE | ATTR_CTIME;
-	newattrs.ia_ctime = current_fs_time(inode->i_sb);
+	newattrs.ia_valid = ATTR_MODE;
 	rc = sysfs_setattr(victim, &newattrs);
 
-	if (rc == 0) {
-		fsnotify_change(victim, newattrs.ia_valid);
-		mutex_lock(&sysfs_mutex);
-		victim_sd->s_mode = newattrs.ia_mode;
-		mutex_unlock(&sysfs_mutex);
-	}
-
 	mutex_unlock(&inode->i_mutex);
  out:
 	dput(victim);
-- 
1.6.3.1.54.g99dd.dirty


^ permalink raw reply related	[flat|nested] 200+ messages in thread

* [PATCH 10/26] sysfs: Optimize just changing the sysfs file mode.
  2009-05-29 20:18           ` [PATCH 0/26] sysfs cleanups v3 Eric W. Biederman
                               ` (8 preceding siblings ...)
  2009-05-29 20:19             ` [PATCH 09/26] sysfs: Simplify sysfs_chmod_file semantics Eric W. Biederman
@ 2009-05-29 20:19             ` Eric W. Biederman
  2009-05-29 20:19             ` [PATCH 11/26] sysfs: Simplify iattr assignments Eric W. Biederman
                               ` (15 subsequent siblings)
  25 siblings, 0 replies; 200+ messages in thread
From: Eric W. Biederman @ 2009-05-29 20:19 UTC (permalink / raw)
  To: Andrew Morton, Greg Kroah-Hartman
  Cc: linux-kernel, Tejun Heo, Cornelia Huck, linux-fsdevel,
	Kay Sievers, Greg KH, Eric W. Biederman, Eric W. Biederman

From: Eric W. Biederman <ebiederm@xmission.com>

Don't allocate a struct iattr for the sysfs dentry if just
the mode changes because we have a field for that on the
sysfs_dirent, and we can trigger that case with sysfs_chmod_file.
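
The idea can be sketched in user space as below.  The toy_* types and
flag values are hypothetical; only the control flow (allocate the extra
attribute block lazily, and never for a mode-only change) mirrors the
patch.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

#define TOY_ATTR_MODE 0x1u
#define TOY_ATTR_UID  0x2u

/* Extra attributes, allocated only when something besides mode changes. */
struct toy_iattr {
	unsigned int mode;
	unsigned int uid;
};

struct toy_dirent {
	unsigned int mode;        /* always present, like sd->s_mode */
	struct toy_iattr *iattr;  /* NULL until needed, like sd->s_iattr */
};

static int toy_setattr(struct toy_dirent *sd, unsigned int valid,
		       unsigned int mode, unsigned int uid)
{
	/* Allocate only if an attribute other than the mode is being set. */
	if (!sd->iattr && (valid & ~TOY_ATTR_MODE)) {
		sd->iattr = calloc(1, sizeof(*sd->iattr));
		if (!sd->iattr)
			return -ENOMEM;
		sd->iattr->mode = sd->mode;
	}

	if (valid & TOY_ATTR_MODE)
		sd->mode = mode;

	/* If we don't need the extra attributes, leave. */
	if (!sd->iattr)
		return 0;

	if (valid & TOY_ATTR_UID)
		sd->iattr->uid = uid;
	if (valid & TOY_ATTR_MODE)
		sd->iattr->mode = mode;
	return 0;
}
```

In this model a chmod-style call leaves iattr NULL, while a later uid
change allocates it and picks up the current mode.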

Signed-off-by: Eric W. Biederman <ebiederm@aristanetworks.com>
---
 fs/sysfs/inode.c |   22 ++++++++++++++--------
 1 files changed, 14 insertions(+), 8 deletions(-)

diff --git a/fs/sysfs/inode.c b/fs/sysfs/inode.c
index 555f0ff..70ff2a2 100644
--- a/fs/sysfs/inode.c
+++ b/fs/sysfs/inode.c
@@ -60,12 +60,16 @@ int sysfs_setattr(struct dentry * dentry, struct iattr * iattr)
 		return error;
 
 	iattr->ia_valid &= ~ATTR_SIZE; /* ignore size changes */
+	if (iattr->ia_valid & ATTR_MODE) {
+		if (!in_group_p(inode->i_gid) && !capable(CAP_FSETID))
+			iattr->ia_mode &= ~S_ISGID;
+	}
 
 	error = inode_setattr(inode, iattr);
 	if (error)
 		return error;
 
-	if (!sd_iattr) {
+	if (!sd_iattr && (ia_valid & ~ATTR_MODE)) {
 		/* setting attributes for the first time, allocate now */
 		sd_iattr = kzalloc(sizeof(struct iattr), GFP_KERNEL);
 		if (!sd_iattr)
@@ -78,6 +82,13 @@ int sysfs_setattr(struct dentry * dentry, struct iattr * iattr)
 		sd->s_iattr = sd_iattr;
 	}
 
+	if (ia_valid & ATTR_MODE)
+		sd->s_mode = iattr->ia_mode;
+
+	/* If we don't need the extra attributes leave */
+	if (!sd_iattr)
+		return 0;
+
 	/* attributes were changed atleast once in past */
 
 	if (ia_valid & ATTR_UID)
@@ -93,13 +104,8 @@ int sysfs_setattr(struct dentry * dentry, struct iattr * iattr)
 	if (ia_valid & ATTR_CTIME)
 		sd_iattr->ia_ctime = timespec_trunc(iattr->ia_ctime,
 						inode->i_sb->s_time_gran);
-	if (ia_valid & ATTR_MODE) {
-		umode_t mode = iattr->ia_mode;
-
-		if (!in_group_p(inode->i_gid) && !capable(CAP_FSETID))
-			mode &= ~S_ISGID;
-		sd_iattr->ia_mode = sd->s_mode = mode;
-	}
+	if (ia_valid & ATTR_MODE)
+		sd_iattr->ia_mode = iattr->ia_mode;
 
 	return error;
 }
-- 
1.6.3.1.54.g99dd.dirty


^ permalink raw reply related	[flat|nested] 200+ messages in thread

* [PATCH 11/26] sysfs: Simplify iattr assignments
  2009-05-29 20:18           ` [PATCH 0/26] sysfs cleanups v3 Eric W. Biederman
                               ` (9 preceding siblings ...)
  2009-05-29 20:19             ` [PATCH 10/26] sysfs: Optimize just changing the sysfs file mode Eric W. Biederman
@ 2009-05-29 20:19             ` Eric W. Biederman
  2009-05-29 20:19             ` [PATCH 12/26] sysfs: Fix locking and factor out sysfs_sd_setattr Eric W. Biederman
                               ` (14 subsequent siblings)
  25 siblings, 0 replies; 200+ messages in thread
From: Eric W. Biederman @ 2009-05-29 20:19 UTC (permalink / raw)
  To: Andrew Morton, Greg Kroah-Hartman
  Cc: linux-kernel, Tejun Heo, Cornelia Huck, linux-fsdevel,
	Kay Sievers, Greg KH, Eric W. Biederman, Eric W. Biederman

From: Eric W. Biederman <ebiederm@xmission.com>

The granularity of sysfs time when we keep it is 1 ns, which
when passed to timespec_trunc results in a nop.  So remove
the unnecessary function call, making sysfs_setattr slightly
easier to read.
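
To see why truncation at 1 ns granularity changes nothing, consider this
user-space sketch.  toy_timespec_trunc is a hypothetical simplification
and not the kernel's timespec_trunc; it only rounds tv_nsec down to a
multiple of the granularity.

```c
#include <assert.h>
#include <time.h>

/* Round tv_nsec down to a multiple of gran_ns (gran_ns must be positive). */
static struct timespec toy_timespec_trunc(struct timespec t, long gran_ns)
{
	assert(gran_ns > 0);
	t.tv_nsec -= t.tv_nsec % gran_ns;
	return t;
}
```

Every nanosecond count is a multiple of 1, so with gran_ns == 1 the
value comes back unchanged; a coarser granularity such as 1000 would
clear the low digits.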

Acked-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Eric W. Biederman <ebiederm@aristanetworks.com>
---
 fs/sysfs/inode.c |    9 +++------
 1 files changed, 3 insertions(+), 6 deletions(-)

diff --git a/fs/sysfs/inode.c b/fs/sysfs/inode.c
index 70ff2a2..5020a1d 100644
--- a/fs/sysfs/inode.c
+++ b/fs/sysfs/inode.c
@@ -96,14 +96,11 @@ int sysfs_setattr(struct dentry * dentry, struct iattr * iattr)
 	if (ia_valid & ATTR_GID)
 		sd_iattr->ia_gid = iattr->ia_gid;
 	if (ia_valid & ATTR_ATIME)
-		sd_iattr->ia_atime = timespec_trunc(iattr->ia_atime,
-						inode->i_sb->s_time_gran);
+		sd_iattr->ia_atime = iattr->ia_atime;
 	if (ia_valid & ATTR_MTIME)
-		sd_iattr->ia_mtime = timespec_trunc(iattr->ia_mtime,
-						inode->i_sb->s_time_gran);
+		sd_iattr->ia_mtime = iattr->ia_mtime;
 	if (ia_valid & ATTR_CTIME)
-		sd_iattr->ia_ctime = timespec_trunc(iattr->ia_ctime,
-						inode->i_sb->s_time_gran);
+		sd_iattr->ia_ctime = iattr->ia_ctime;
 	if (ia_valid & ATTR_MODE)
 		sd_iattr->ia_mode = iattr->ia_mode;
 
-- 
1.6.3.1.54.g99dd.dirty


^ permalink raw reply related	[flat|nested] 200+ messages in thread

* [PATCH 12/26] sysfs: Fix locking and factor out sysfs_sd_setattr
  2009-05-29 20:18           ` [PATCH 0/26] sysfs cleanups v3 Eric W. Biederman
                               ` (10 preceding siblings ...)
  2009-05-29 20:19             ` [PATCH 11/26] sysfs: Simplify iattr assignments Eric W. Biederman
@ 2009-05-29 20:19             ` Eric W. Biederman
  2009-05-29 20:19             ` [PATCH 13/26] sysfs: Update s_iattr on link and unlink Eric W. Biederman
                               ` (13 subsequent siblings)
  25 siblings, 0 replies; 200+ messages in thread
From: Eric W. Biederman @ 2009-05-29 20:19 UTC (permalink / raw)
  To: Andrew Morton, Greg Kroah-Hartman
  Cc: linux-kernel, Tejun Heo, Cornelia Huck, linux-fsdevel,
	Kay Sievers, Greg KH, Eric W. Biederman, Eric W. Biederman

From: Eric W. Biederman <ebiederm@xmission.com>

Cleanly separate the work that is specific to setting the
attributes of a sysfs_dirent from what is needed to update
the attributes of a vfs inode.

Additionally grab the sysfs_mutex to keep any nasties from
surprising us when updating the sysfs_dirent.
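
The split described above might be sketched in userspace roughly as
follows.  All names here (toy_dirent, toy_mutex, toy_sd_setattr,
toy_setattr) are hypothetical, and a pthread mutex stands in for
sysfs_mutex; the inode-side checks are only hinted at in a comment.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical, heavily simplified stand-in for a sysfs_dirent. */
struct toy_dirent {
	unsigned int mode;	/* attribute cached on the dirent */
};

/* Stands in for sysfs_mutex. */
static pthread_mutex_t toy_mutex = PTHREAD_MUTEX_INITIALIZER;

/* Helper: touches only the dirent.  The caller must hold toy_mutex. */
static int toy_sd_setattr(struct toy_dirent *sd, unsigned int mode)
{
	assert(sd != NULL);
	sd->mode = mode;
	return 0;
}

/*
 * Wrapper: the inode-specific work (permission checks, updating the
 * vfs inode) would go first; the dirent update then runs under the lock.
 */
static int toy_setattr(struct toy_dirent *sd, unsigned int mode)
{
	int error;

	if (!sd)
		return -EINVAL;

	pthread_mutex_lock(&toy_mutex);
	error = toy_sd_setattr(sd, mode);
	pthread_mutex_unlock(&toy_mutex);

	return error;
}
```

Keeping the helper lock-free lets a caller that already holds the
mutex reuse it directly, while ordinary callers go through the
wrapper.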

Acked-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Eric W. Biederman <ebiederm@aristanetworks.com>
---
 fs/sysfs/inode.c |   52 ++++++++++++++++++++++++++++++----------------------
 fs/sysfs/sysfs.h |    1 +
 2 files changed, 31 insertions(+), 22 deletions(-)

diff --git a/fs/sysfs/inode.c b/fs/sysfs/inode.c
index 5020a1d..dd154cb 100644
--- a/fs/sysfs/inode.c
+++ b/fs/sysfs/inode.c
@@ -42,33 +42,12 @@ int __init sysfs_inode_init(void)
 	return bdi_init(&sysfs_backing_dev_info);
 }
 
-int sysfs_setattr(struct dentry * dentry, struct iattr * iattr)
+int sysfs_sd_setattr(struct sysfs_dirent *sd, struct iattr * iattr)
 {
-	struct inode * inode = dentry->d_inode;
-	struct sysfs_dirent * sd = dentry->d_fsdata;
 	struct iattr * sd_iattr;
 	unsigned int ia_valid = iattr->ia_valid;
-	int error;
-
-	if (!sd)
-		return -EINVAL;
 
 	sd_iattr = sd->s_iattr;
-
-	error = inode_change_ok(inode, iattr);
-	if (error)
-		return error;
-
-	iattr->ia_valid &= ~ATTR_SIZE; /* ignore size changes */
-	if (iattr->ia_valid & ATTR_MODE) {
-		if (!in_group_p(inode->i_gid) && !capable(CAP_FSETID))
-			iattr->ia_mode &= ~S_ISGID;
-	}
-
-	error = inode_setattr(inode, iattr);
-	if (error)
-		return error;
-
 	if (!sd_iattr && (ia_valid & ~ATTR_MODE)) {
 		/* setting attributes for the first time, allocate now */
 		sd_iattr = kzalloc(sizeof(struct iattr), GFP_KERNEL);
@@ -103,6 +82,35 @@ int sysfs_setattr(struct dentry * dentry, struct iattr * iattr)
 		sd_iattr->ia_ctime = iattr->ia_ctime;
 	if (ia_valid & ATTR_MODE)
 		sd_iattr->ia_mode = iattr->ia_mode;
+	return 0;
+}
+
+int sysfs_setattr(struct dentry * dentry, struct iattr * iattr)
+{
+	struct inode * inode = dentry->d_inode;
+	struct sysfs_dirent * sd = dentry->d_fsdata;
+	int error;
+
+	if (!sd)
+		return -EINVAL;
+
+	error = inode_change_ok(inode, iattr);
+	if (error)
+		return error;
+
+	iattr->ia_valid &= ~ATTR_SIZE; /* ignore size changes */
+	if (iattr->ia_valid & ATTR_MODE) {
+		if (!in_group_p(inode->i_gid) && !capable(CAP_FSETID))
+			iattr->ia_mode &= ~S_ISGID;
+	}
+
+	error = inode_setattr(inode, iattr);
+	if (error)
+		return error;
+
+	mutex_lock(&sysfs_mutex);
+	error = sysfs_sd_setattr(sd, iattr);
+	mutex_unlock(&sysfs_mutex);
 
 	return error;
 }
diff --git a/fs/sysfs/sysfs.h b/fs/sysfs/sysfs.h
index abf05f4..043bb13 100644
--- a/fs/sysfs/sysfs.h
+++ b/fs/sysfs/sysfs.h
@@ -146,6 +146,7 @@ static inline void __sysfs_put(struct sysfs_dirent *sd)
  */
 struct inode *sysfs_get_inode(struct sysfs_dirent *sd);
 void sysfs_delete_inode(struct inode *inode);
+int sysfs_sd_setattr(struct sysfs_dirent *sd, struct iattr *iattr);
 int sysfs_setattr(struct dentry *dentry, struct iattr *iattr);
 int sysfs_hash_and_remove(struct sysfs_dirent *dir_sd, const char *name);
 int sysfs_inode_init(void);
-- 
1.6.3.1.54.g99dd.dirty


^ permalink raw reply related	[flat|nested] 200+ messages in thread

* [PATCH 13/26] sysfs: Update s_iattr on link and unlink.
  2009-05-29 20:18           ` [PATCH 0/26] sysfs cleanups v3 Eric W. Biederman
                               ` (11 preceding siblings ...)
  2009-05-29 20:19             ` [PATCH 12/26] sysfs: Fix locking and factor out sysfs_sd_setattr Eric W. Biederman
@ 2009-05-29 20:19             ` Eric W. Biederman
  2009-05-29 20:19             ` [PATCH 14/26] sysfs: Nicely indent sysfs_symlink_inode_operations Eric W. Biederman
                               ` (12 subsequent siblings)
  25 siblings, 0 replies; 200+ messages in thread
From: Eric W. Biederman @ 2009-05-29 20:19 UTC (permalink / raw)
  To: Andrew Morton, Greg Kroah-Hartman
  Cc: linux-kernel, Tejun Heo, Cornelia Huck, linux-fsdevel,
	Kay Sievers, Greg KH, Eric W. Biederman, Eric W. Biederman

From: Eric W. Biederman <ebiederm@xmission.com>

Currently sysfs updates the timestamps on the vfs directory
inode when we create or remove a directory entry but doesn't
update the cached copy on the sysfs_dirent.  Fix that oversight.
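
A minimal sketch of the update, under the assumption that the cached
attributes are allocated lazily and may therefore be absent.  The
types toy_iattr and toy_dirent and the helper toy_touch_parent are
hypothetical names used only for this illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <time.h>

/* Hypothetical cached attributes kept alongside a directory entry. */
struct toy_iattr {
	time_t ctime;
	time_t mtime;
};

struct toy_dirent {
	struct toy_iattr *iattr;	/* NULL until attributes are first set */
};

/*
 * Record a link or unlink in the parent's cached timestamps.
 * If nothing is cached there is nothing to update.
 */
static void toy_touch_parent(struct toy_dirent *parent, time_t now)
{
	assert(parent != NULL);

	if (parent->iattr)
		parent->iattr->ctime = parent->iattr->mtime = now;
}
```

The same call would serve both the add and the remove path, since
each changes the contents of the parent directory.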

Acked-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Eric W. Biederman <ebiederm@aristanetworks.com>
---
 fs/sysfs/dir.c |   14 ++++++++++++++
 1 files changed, 14 insertions(+), 0 deletions(-)

diff --git a/fs/sysfs/dir.c b/fs/sysfs/dir.c
index e0bf3a5..c6472d8 100644
--- a/fs/sysfs/dir.c
+++ b/fs/sysfs/dir.c
@@ -480,6 +480,8 @@ static char *sysfs_pathname(struct sysfs_dirent *sd, char *path)
  */
 int sysfs_add_one(struct sysfs_addrm_cxt *acxt, struct sysfs_dirent *sd)
 {
+	struct iattr *ps_iattr;
+
 	if (sysfs_find_dirent(acxt->parent_sd, sd->s_name)) {
 		char *path = kzalloc(PATH_MAX, GFP_KERNEL);
 		WARN(1, KERN_WARNING
@@ -500,6 +502,11 @@ int sysfs_add_one(struct sysfs_addrm_cxt *acxt, struct sysfs_dirent *sd)
 
 	sysfs_link_sibling(sd);
 
+	/* Update timestamps on the parent */
+	ps_iattr = acxt->parent_sd->s_iattr;
+	if (ps_iattr)
+		ps_iattr->ia_ctime = ps_iattr->ia_mtime = CURRENT_TIME;
+
 	return 0;
 }
 
@@ -520,10 +527,17 @@ int sysfs_add_one(struct sysfs_addrm_cxt *acxt, struct sysfs_dirent *sd)
  */
 void sysfs_remove_one(struct sysfs_addrm_cxt *acxt, struct sysfs_dirent *sd)
 {
+	struct iattr *ps_iattr;
+
 	BUG_ON(sd->s_flags & SYSFS_FLAG_REMOVED);
 
 	sysfs_unlink_sibling(sd);
 
+	/* Update timestamps on the parent */
+	ps_iattr = acxt->parent_sd->s_iattr;
+	if (ps_iattr)
+		ps_iattr->ia_ctime = ps_iattr->ia_mtime = CURRENT_TIME;
+
 	sd->s_flags |= SYSFS_FLAG_REMOVED;
 	sd->s_sibling = acxt->removed;
 	acxt->removed = sd;
-- 
1.6.3.1.54.g99dd.dirty


^ permalink raw reply related	[flat|nested] 200+ messages in thread

* [PATCH 14/26] sysfs: Nicely indent sysfs_symlink_inode_operations
  2009-05-29 20:18           ` [PATCH 0/26] sysfs cleanups v3 Eric W. Biederman
                               ` (12 preceding siblings ...)
  2009-05-29 20:19             ` [PATCH 13/26] sysfs: Update s_iattr on link and unlink Eric W. Biederman
@ 2009-05-29 20:19             ` Eric W. Biederman
  2009-05-29 20:19             ` [PATCH 15/26] sysfs: Implement sysfs_getattr & sysfs_permission Eric W. Biederman
                               ` (11 subsequent siblings)
  25 siblings, 0 replies; 200+ messages in thread
From: Eric W. Biederman @ 2009-05-29 20:19 UTC (permalink / raw)
  To: Andrew Morton, Greg Kroah-Hartman
  Cc: linux-kernel, Tejun Heo, Cornelia Huck, linux-fsdevel,
	Kay Sievers, Greg KH, Eric W. Biederman, Eric W. Biederman

From: Eric W. Biederman <ebiederm@xmission.com>

Lining up the functions in sysfs_symlink_inode_operations
follows the pattern in the rest of sysfs and makes things
slightly more readable.

Acked-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Eric W. Biederman <ebiederm@aristanetworks.com>
---
 fs/sysfs/symlink.c |    6 +++---
 1 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/fs/sysfs/symlink.c b/fs/sysfs/symlink.c
index ac13e61..0367ed1 100644
--- a/fs/sysfs/symlink.c
+++ b/fs/sysfs/symlink.c
@@ -198,9 +198,9 @@ static void sysfs_put_link(struct dentry *dentry, struct nameidata *nd, void *co
 }
 
 const struct inode_operations sysfs_symlink_inode_operations = {
-	.readlink = generic_readlink,
-	.follow_link = sysfs_follow_link,
-	.put_link = sysfs_put_link,
+	.readlink	= generic_readlink,
+	.follow_link	= sysfs_follow_link,
+	.put_link	= sysfs_put_link,
 };
 
 
-- 
1.6.3.1.54.g99dd.dirty


^ permalink raw reply related	[flat|nested] 200+ messages in thread

* [PATCH 15/26] sysfs: Implement sysfs_getattr & sysfs_permission
  2009-05-29 20:18           ` [PATCH 0/26] sysfs cleanups v3 Eric W. Biederman
                               ` (13 preceding siblings ...)
  2009-05-29 20:19             ` [PATCH 14/26] sysfs: Nicely indent sysfs_symlink_inode_operations Eric W. Biederman
@ 2009-05-29 20:19             ` Eric W. Biederman
  2009-05-29 20:19             ` [PATCH 16/26] sysfs: In sysfs_chmod_file lazily propagate the mode change Eric W. Biederman
                               ` (10 subsequent siblings)
  25 siblings, 0 replies; 200+ messages in thread
From: Eric W. Biederman @ 2009-05-29 20:19 UTC (permalink / raw)
  To: Andrew Morton, Greg Kroah-Hartman
  Cc: linux-kernel, Tejun Heo, Cornelia Huck, linux-fsdevel,
	Kay Sievers, Greg KH, Eric W. Biederman, Eric W. Biederman

From: Eric W. Biederman <ebiederm@xmission.com>

With the implementation of sysfs_getattr and sysfs_permission,
sysfs becomes able to lazily propagate inode attribute changes
from the sysfs_dirents to the vfs inodes.  This paves the way
for deleting significant chunks of now unnecessary code.
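
The lazy scheme can be pictured with a small userspace model.  This
is only a sketch: toy_dirent, toy_inode, toy_refresh_inode and
toy_getattr are hypothetical names, locking is omitted, and the
"+ 2" link count assumes nr_subdirs counts child directories.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical authoritative copy of the attributes. */
struct toy_dirent {
	unsigned int mode;
	int is_dir;
	unsigned int nr_subdirs;
};

/* Hypothetical cached copy, as the vfs would see it. */
struct toy_inode {
	unsigned int mode;
	unsigned int nlink;
};

/* Pull the current values from the dirent into the cached inode. */
static void toy_refresh_inode(const struct toy_dirent *sd,
			      struct toy_inode *inode)
{
	assert(sd != NULL && inode != NULL);

	inode->mode = sd->mode;
	if (sd->is_dir)
		inode->nlink = sd->nr_subdirs + 2;
}

/* Refresh on demand, then report whatever the inode now holds. */
static void toy_getattr(const struct toy_dirent *sd,
			struct toy_inode *inode, struct toy_inode *stat_out)
{
	toy_refresh_inode(sd, inode);
	*stat_out = *inode;
}
```

A writer only has to change the dirent; the stale inode is corrected
the next time somebody asks for its attributes.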

Acked-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Eric W. Biederman <ebiederm@aristanetworks.com>
---
 fs/sysfs/dir.c     |    2 +
 fs/sysfs/inode.c   |   54 ++++++++++++++++++++++++++++++++++++++++-----------
 fs/sysfs/symlink.c |    3 ++
 fs/sysfs/sysfs.h   |    2 +
 4 files changed, 49 insertions(+), 12 deletions(-)

diff --git a/fs/sysfs/dir.c b/fs/sysfs/dir.c
index c6472d8..b75c938 100644
--- a/fs/sysfs/dir.c
+++ b/fs/sysfs/dir.c
@@ -763,6 +763,8 @@ static struct dentry * sysfs_lookup(struct inode *dir, struct dentry *dentry,
 const struct inode_operations sysfs_dir_inode_operations = {
 	.lookup		= sysfs_lookup,
 	.setattr	= sysfs_setattr,
+	.getattr	= sysfs_getattr,
+	.permission	= sysfs_permission,
 };
 
 static void remove_dir(struct sysfs_dirent *dir_sd)
diff --git a/fs/sysfs/inode.c b/fs/sysfs/inode.c
index dd154cb..1b7ed3c 100644
--- a/fs/sysfs/inode.c
+++ b/fs/sysfs/inode.c
@@ -35,6 +35,8 @@ static struct backing_dev_info sysfs_backing_dev_info = {
 
 static const struct inode_operations sysfs_inode_operations ={
 	.setattr	= sysfs_setattr,
+	.getattr	= sysfs_getattr,
+	.permission	= sysfs_permission,
 };
 
 int __init sysfs_inode_init(void)
@@ -123,7 +125,6 @@ static inline void set_default_inode_attr(struct inode * inode, mode_t mode)
 
 static inline void set_inode_attr(struct inode * inode, struct iattr * iattr)
 {
-	inode->i_mode = iattr->ia_mode;
 	inode->i_uid = iattr->ia_uid;
 	inode->i_gid = iattr->ia_gid;
 	inode->i_atime = iattr->ia_atime;
@@ -154,6 +155,33 @@ static int sysfs_count_nlink(struct sysfs_dirent *sd)
 	return nr + 2;
 }
 
+static void sysfs_refresh_inode(struct sysfs_dirent *sd, struct inode *inode)
+{
+	inode->i_mode = sd->s_mode;
+	if (sd->s_iattr) {
+		/* sysfs_dirent has non-default attributes
+		 * get them from persistent copy in sysfs_dirent
+		 */
+		set_inode_attr(inode, sd->s_iattr);
+	}
+
+	if (sysfs_type(sd) == SYSFS_DIR)
+		inode->i_nlink = sysfs_count_nlink(sd);
+}
+
+int sysfs_getattr(struct vfsmount *mnt, struct dentry *dentry, struct kstat *stat)
+{
+	struct sysfs_dirent *sd = dentry->d_fsdata;
+	struct inode *inode = dentry->d_inode;
+
+	mutex_lock(&sysfs_mutex);
+	sysfs_refresh_inode(sd, inode);
+	mutex_unlock(&sysfs_mutex);
+
+	generic_fillattr(inode, stat);
+	return 0;
+}
+
 static void sysfs_init_inode(struct sysfs_dirent *sd, struct inode *inode)
 {
 	struct bin_attribute *bin_attr;
@@ -162,25 +190,16 @@ static void sysfs_init_inode(struct sysfs_dirent *sd, struct inode *inode)
 	inode->i_mapping->a_ops = &sysfs_aops;
 	inode->i_mapping->backing_dev_info = &sysfs_backing_dev_info;
 	inode->i_op = &sysfs_inode_operations;
-	inode->i_ino = sd->s_ino;
 	lockdep_set_class(&inode->i_mutex, &sysfs_inode_imutex_key);
 
-	if (sd->s_iattr) {
-		/* sysfs_dirent has non-default attributes
-		 * get them for the new inode from persistent copy
-		 * in sysfs_dirent
-		 */
-		set_inode_attr(inode, sd->s_iattr);
-	} else
-		set_default_inode_attr(inode, sd->s_mode);
-
+	set_default_inode_attr(inode, sd->s_mode);
+	sysfs_refresh_inode(sd, inode);
 
 	/* initialize inode according to type */
 	switch (sysfs_type(sd)) {
 	case SYSFS_DIR:
 		inode->i_op = &sysfs_dir_inode_operations;
 		inode->i_fop = &sysfs_dir_operations;
-		inode->i_nlink = sysfs_count_nlink(sd);
 		break;
 	case SYSFS_KOBJ_ATTR:
 		inode->i_size = PAGE_SIZE;
@@ -263,3 +282,14 @@ int sysfs_hash_and_remove(struct sysfs_dirent *dir_sd, const char *name)
 	else
 		return -ENOENT;
 }
+
+int sysfs_permission(struct inode *inode, int mask)
+{
+	struct sysfs_dirent *sd = inode->i_private;
+
+	mutex_lock(&sysfs_mutex);
+	sysfs_refresh_inode(sd, inode);
+	mutex_unlock(&sysfs_mutex);
+
+	return generic_permission(inode, mask, NULL);
+}
diff --git a/fs/sysfs/symlink.c b/fs/sysfs/symlink.c
index 0367ed1..05e4984 100644
--- a/fs/sysfs/symlink.c
+++ b/fs/sysfs/symlink.c
@@ -201,6 +201,9 @@ const struct inode_operations sysfs_symlink_inode_operations = {
 	.readlink	= generic_readlink,
 	.follow_link	= sysfs_follow_link,
 	.put_link	= sysfs_put_link,
+	.setattr	= sysfs_setattr,
+	.getattr	= sysfs_getattr,
+	.permission	= sysfs_permission,
 };
 
 
diff --git a/fs/sysfs/sysfs.h b/fs/sysfs/sysfs.h
index 043bb13..f5b53cf 100644
--- a/fs/sysfs/sysfs.h
+++ b/fs/sysfs/sysfs.h
@@ -148,6 +148,8 @@ struct inode *sysfs_get_inode(struct sysfs_dirent *sd);
 void sysfs_delete_inode(struct inode *inode);
 int sysfs_sd_setattr(struct sysfs_dirent *sd, struct iattr *iattr);
 int sysfs_setattr(struct dentry *dentry, struct iattr *iattr);
+int sysfs_getattr(struct vfsmount *mnt, struct dentry *dentry, struct kstat *stat);
+int sysfs_permission(struct inode *inode, int mask);
 int sysfs_hash_and_remove(struct sysfs_dirent *dir_sd, const char *name);
 int sysfs_inode_init(void);
 
-- 
1.6.3.1.54.g99dd.dirty


^ permalink raw reply related	[flat|nested] 200+ messages in thread

* [PATCH 16/26] sysfs: In sysfs_chmod_file lazily propagate the mode change.
  2009-05-29 20:18           ` [PATCH 0/26] sysfs cleanups v3 Eric W. Biederman
                               ` (14 preceding siblings ...)
  2009-05-29 20:19             ` [PATCH 15/26] sysfs: Implement sysfs_getattr & sysfs_permission Eric W. Biederman
@ 2009-05-29 20:19             ` Eric W. Biederman
  2009-05-29 20:19             ` [PATCH 17/26] sysfs: Kill sysfs_addrm_start and sysfs_addrm_finish Eric W. Biederman
                               ` (9 subsequent siblings)
  25 siblings, 0 replies; 200+ messages in thread
From: Eric W. Biederman @ 2009-05-29 20:19 UTC (permalink / raw)
  To: Andrew Morton, Greg Kroah-Hartman
  Cc: linux-kernel, Tejun Heo, Cornelia Huck, linux-fsdevel,
	Kay Sievers, Greg KH, Eric W. Biederman, Eric W. Biederman

From: Eric W. Biederman <ebiederm@xmission.com>

Now that sysfs_getattr and sysfs_permission refresh the vfs
inode there is no need to immediately push the mode change
into the vfs cache.  This reduces the amount of work needed and
simplifies the locking.
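
The mode computation itself is plain bit arithmetic and can be tried
in userspace.  In this sketch TOY_S_IALLUGO is a local definition
assumed to match the kernel's S_IALLUGO (07777), and toy_chmod_mode
is a hypothetical helper name.

```c
#include <assert.h>

/* Assumed to mirror the kernel's S_IALLUGO: setuid, setgid, sticky, rwx. */
#define TOY_S_IALLUGO 07777u

/*
 * Take the permission bits from the request and keep everything else
 * (notably the file type bits) from the existing mode.
 */
static unsigned int toy_chmod_mode(unsigned int old_mode,
				   unsigned int requested)
{
	assert((TOY_S_IALLUGO & ~07777u) == 0);

	return (requested & TOY_S_IALLUGO) | (old_mode & ~TOY_S_IALLUGO);
}
```

For a regular file with mode 0100644, a request of 0600 yields
0100600: the type bits survive and only the permission bits change.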

Acked-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Eric W. Biederman <ebiederm@aristanetworks.com>
---
 fs/sysfs/file.c |   31 ++++++++-----------------------
 1 files changed, 8 insertions(+), 23 deletions(-)

diff --git a/fs/sysfs/file.c b/fs/sysfs/file.c
index 0786b41..31cfe1d 100644
--- a/fs/sysfs/file.c
+++ b/fs/sysfs/file.c
@@ -577,38 +577,23 @@ EXPORT_SYMBOL_GPL(sysfs_add_file_to_group);
  */
 int sysfs_chmod_file(struct kobject *kobj, struct attribute *attr, mode_t mode)
 {
-	struct sysfs_dirent *victim_sd = NULL;
-	struct dentry *victim = NULL;
-	struct inode * inode;
+	struct sysfs_dirent *sd;
 	struct iattr newattrs;
 	int rc;
 
-	rc = -ENOENT;
-	victim_sd = sysfs_get_dirent(kobj->sd, attr->name);
-	if (!victim_sd)
-		goto out;
+	mutex_lock(&sysfs_mutex);
 
-	mutex_lock(&sysfs_rename_mutex);
-	victim = sysfs_get_dentry(victim_sd);
-	mutex_unlock(&sysfs_rename_mutex);
-	if (IS_ERR(victim)) {
-		rc = PTR_ERR(victim);
-		victim = NULL;
+	rc = -ENOENT;
+	sd = sysfs_find_dirent(kobj->sd, attr->name);
+	if (!sd)
 		goto out;
-	}
-
-	inode = victim->d_inode;
 
-	mutex_lock(&inode->i_mutex);
-
-	newattrs.ia_mode = (mode & S_IALLUGO) | (inode->i_mode & ~S_IALLUGO);
+	newattrs.ia_mode = (mode & S_IALLUGO) | (sd->s_mode & ~S_IALLUGO);
 	newattrs.ia_valid = ATTR_MODE;
-	rc = sysfs_setattr(victim, &newattrs);
+	rc = sysfs_sd_setattr(sd, &newattrs);
 
-	mutex_unlock(&inode->i_mutex);
  out:
-	dput(victim);
-	sysfs_put(victim_sd);
+	mutex_unlock(&sysfs_mutex);
 	return rc;
 }
 EXPORT_SYMBOL_GPL(sysfs_chmod_file);
-- 
1.6.3.1.54.g99dd.dirty


^ permalink raw reply related	[flat|nested] 200+ messages in thread

* [PATCH 17/26] sysfs: Kill sysfs_addrm_start and sysfs_addrm_finish
  2009-05-29 20:18           ` [PATCH 0/26] sysfs cleanups v3 Eric W. Biederman
                               ` (15 preceding siblings ...)
  2009-05-29 20:19             ` [PATCH 16/26] sysfs: In sysfs_chmod_file lazily propagate the mode change Eric W. Biederman
@ 2009-05-29 20:19             ` Eric W. Biederman
  2009-05-29 20:19             ` [PATCH 18/26] sysfs: Propagate renames to the vfs on demand Eric W. Biederman
                               ` (8 subsequent siblings)
  25 siblings, 0 replies; 200+ messages in thread
From: Eric W. Biederman @ 2009-05-29 20:19 UTC (permalink / raw)
  To: Andrew Morton, Greg Kroah-Hartman
  Cc: linux-kernel, Tejun Heo, Cornelia Huck, linux-fsdevel,
	Kay Sievers, Greg KH, Eric W. Biederman, Eric W. Biederman

From: Eric W. Biederman <ebiederm@xmission.com>

With lazy inode updates and dentry operations bringing everything
into sync on demand there is no longer any need to immediately
update the vfs or grab i_mutex to protect those updates as we
make changes to sysfs.

So stop updating the vfs inodes and move what remains of
sysfs_addrm_start and sysfs_addrm_finish (just barely more than taking
the sysfs_mutex) into sysfs_add_one and sysfs_remove_one.
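
Once the inode handling is gone, add and remove reduce to short
critical sections.  The following userspace sketch shows that shape
under simplifying assumptions: the toy_* names are hypothetical, a
pthread mutex stands in for sysfs_mutex, and reference counting and
deactivation are left out.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical directory entry with a singly linked child list. */
struct toy_dirent {
	const char *name;
	struct toy_dirent *parent;
	struct toy_dirent *children;	/* head of the child list */
	struct toy_dirent *sibling;	/* next child of the same parent */
};

static pthread_mutex_t toy_mutex = PTHREAD_MUTEX_INITIALIZER;

/* Caller must hold toy_mutex. */
static struct toy_dirent *toy_find(struct toy_dirent *parent, const char *name)
{
	struct toy_dirent *sd;

	for (sd = parent->children; sd; sd = sd->sibling)
		if (strcmp(sd->name, name) == 0)
			return sd;
	return NULL;
}

/* Link sd under parent, refusing duplicate names. */
static int toy_add_one(struct toy_dirent *parent, struct toy_dirent *sd)
{
	assert(parent != NULL && sd != NULL);

	pthread_mutex_lock(&toy_mutex);
	if (toy_find(parent, sd->name)) {
		pthread_mutex_unlock(&toy_mutex);
		return -EEXIST;
	}
	sd->parent = parent;
	sd->sibling = parent->children;
	parent->children = sd;
	pthread_mutex_unlock(&toy_mutex);
	return 0;
}

/* Unlink sd from its parent; the parent is found through sd itself. */
static void toy_remove_one(struct toy_dirent *sd)
{
	struct toy_dirent **pos;

	assert(sd != NULL && sd->parent != NULL);

	pthread_mutex_lock(&toy_mutex);
	for (pos = &sd->parent->children; *pos; pos = &(*pos)->sibling) {
		if (*pos == sd) {
			*pos = sd->sibling;
			sd->sibling = NULL;
			break;
		}
	}
	pthread_mutex_unlock(&toy_mutex);
}
```

Because each function takes and drops the lock itself, callers no
longer need a context structure or a start/finish pair around them.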

Acked-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Eric W. Biederman <ebiederm@aristanetworks.com>
---
 fs/sysfs/dir.c     |  194 +++++++---------------------------------------------
 fs/sysfs/file.c    |    6 +--
 fs/sysfs/inode.c   |   16 ++---
 fs/sysfs/symlink.c |    6 +--
 fs/sysfs/sysfs.h   |   17 +----
 5 files changed, 34 insertions(+), 205 deletions(-)

diff --git a/fs/sysfs/dir.c b/fs/sysfs/dir.c
index b75c938..3caf9c6 100644
--- a/fs/sysfs/dir.c
+++ b/fs/sysfs/dir.c
@@ -382,62 +382,6 @@ struct sysfs_dirent *sysfs_new_dirent(const char *name, umode_t mode, int type)
 	return NULL;
 }
 
-static int sysfs_ilookup_test(struct inode *inode, void *arg)
-{
-	struct sysfs_dirent *sd = arg;
-	return inode->i_ino == sd->s_ino;
-}
-
-/**
- *	sysfs_addrm_start - prepare for sysfs_dirent add/remove
- *	@acxt: pointer to sysfs_addrm_cxt to be used
- *	@parent_sd: parent sysfs_dirent
- *
- *	This function is called when the caller is about to add or
- *	remove sysfs_dirent under @parent_sd.  This function acquires
- *	sysfs_mutex, grabs inode for @parent_sd if available and lock
- *	i_mutex of it.  @acxt is used to keep and pass context to
- *	other addrm functions.
- *
- *	LOCKING:
- *	Kernel thread context (may sleep).  sysfs_mutex is locked on
- *	return.  i_mutex of parent inode is locked on return if
- *	available.
- */
-void sysfs_addrm_start(struct sysfs_addrm_cxt *acxt,
-		       struct sysfs_dirent *parent_sd)
-{
-	struct inode *inode;
-
-	memset(acxt, 0, sizeof(*acxt));
-	acxt->parent_sd = parent_sd;
-
-	/* Lookup parent inode.  inode initialization is protected by
-	 * sysfs_mutex, so inode existence can be determined by
-	 * looking up inode while holding sysfs_mutex.
-	 */
-	mutex_lock(&sysfs_mutex);
-
-	inode = ilookup5(sysfs_sb, parent_sd->s_ino, sysfs_ilookup_test,
-			 parent_sd);
-	if (inode) {
-		WARN_ON(inode->i_state & I_NEW);
-
-		/* parent inode available */
-		acxt->parent_inode = inode;
-
-		/* sysfs_mutex is below i_mutex in lock hierarchy.
-		 * First, trylock i_mutex.  If fails, unlock
-		 * sysfs_mutex and lock them in order.
-		 */
-		if (!mutex_trylock(&inode->i_mutex)) {
-			mutex_unlock(&sysfs_mutex);
-			mutex_lock(&inode->i_mutex);
-			mutex_lock(&sysfs_mutex);
-		}
-	}
-}
-
 /**
  *	sysfs_pathname - return full path to sysfs dirent
  *	@sd: sysfs_dirent whose path we want
@@ -460,161 +404,83 @@ static char *sysfs_pathname(struct sysfs_dirent *sd, char *path)
 
 /**
  *	sysfs_add_one - add sysfs_dirent to parent
- *	@acxt: addrm context to use
+ *	@parent_sd: directory to add @sd into
  *	@sd: sysfs_dirent to be added
  *
- *	Get @acxt->parent_sd and set sd->s_parent to it and increment
+ *	Get @parent_sd and set sd->s_parent to it and increment
  *	nlink of parent inode if @sd is a directory and link into the
  *	children list of the parent.
  *
- *	This function should be called between calls to
- *	sysfs_addrm_start() and sysfs_addrm_finish() and should be
- *	passed the same @acxt as passed to sysfs_addrm_start().
- *
  *	LOCKING:
- *	Determined by sysfs_addrm_start().
+ *	Kernel thread context (may sleep).  Grabs sysfs_mutex.
  *
  *	RETURNS:
  *	0 on success, -EEXIST if entry with the given name already
  *	exists.
  */
-int sysfs_add_one(struct sysfs_addrm_cxt *acxt, struct sysfs_dirent *sd)
+int sysfs_add_one(struct sysfs_dirent *parent_sd, struct sysfs_dirent *sd)
 {
 	struct iattr *ps_iattr;
 
-	if (sysfs_find_dirent(acxt->parent_sd, sd->s_name)) {
-		char *path = kzalloc(PATH_MAX, GFP_KERNEL);
+	mutex_lock(&sysfs_mutex);
+	if (sysfs_find_dirent(parent_sd, sd->s_name)) {
+		char *path;
+		mutex_unlock(&sysfs_mutex);
+
+		path = kzalloc(PATH_MAX, GFP_KERNEL);
 		WARN(1, KERN_WARNING
 		     "sysfs: cannot create duplicate filename '%s'\n",
 		     (path == NULL) ? sd->s_name :
-		     strcat(strcat(sysfs_pathname(acxt->parent_sd, path), "/"),
+		     strcat(strcat(sysfs_pathname(parent_sd, path), "/"),
 		            sd->s_name));
 		kfree(path);
 		return -EEXIST;
 	}
 
-	sd->s_parent = sysfs_get(acxt->parent_sd);
-
-	if (sysfs_type(sd) == SYSFS_DIR && acxt->parent_inode)
-		inc_nlink(acxt->parent_inode);
-
-	acxt->cnt++;
-
+	sd->s_parent = sysfs_get(parent_sd);
 	sysfs_link_sibling(sd);
 
 	/* Update timestamps on the parent */
-	ps_iattr = acxt->parent_sd->s_iattr;
+	ps_iattr = parent_sd->s_iattr;
 	if (ps_iattr)
 		ps_iattr->ia_ctime = ps_iattr->ia_mtime = CURRENT_TIME;
 
+	mutex_unlock(&sysfs_mutex);
 	return 0;
 }
 
 /**
  *	sysfs_remove_one - remove sysfs_dirent from parent
- *	@acxt: addrm context to use
  *	@sd: sysfs_dirent to be removed
  *
  *	Mark @sd removed and drop nlink of parent inode if @sd is a
  *	directory.  @sd is unlinked from the children list.
  *
- *	This function should be called between calls to
- *	sysfs_addrm_start() and sysfs_addrm_finish() and should be
- *	passed the same @acxt as passed to sysfs_addrm_start().
- *
  *	LOCKING:
- *	Determined by sysfs_addrm_start().
+ *	Kernel thread context (may sleep).  Grabs sysfs_mutex.
  */
-void sysfs_remove_one(struct sysfs_addrm_cxt *acxt, struct sysfs_dirent *sd)
+void sysfs_remove_one(struct sysfs_dirent *sd)
 {
 	struct iattr *ps_iattr;
 
 	BUG_ON(sd->s_flags & SYSFS_FLAG_REMOVED);
 
+	mutex_lock(&sysfs_mutex);
+
 	sysfs_unlink_sibling(sd);
 
 	/* Update timestamps on the parent */
-	ps_iattr = acxt->parent_sd->s_iattr;
+	ps_iattr = sd->s_parent->s_iattr;
 	if (ps_iattr)
 		ps_iattr->ia_ctime = ps_iattr->ia_mtime = CURRENT_TIME;
 
 	sd->s_flags |= SYSFS_FLAG_REMOVED;
-	sd->s_sibling = acxt->removed;
-	acxt->removed = sd;
-
-	if (sysfs_type(sd) == SYSFS_DIR && acxt->parent_inode)
-		drop_nlink(acxt->parent_inode);
 
-	acxt->cnt++;
-}
-
-/**
- *	sysfs_dec_nlink - Decrement link count for the specified sysfs_dirent
- *	@sd: target sysfs_dirent
- *
- *	Decrement nlink for @sd.  @sd must have been unlinked from its
- *	parent on entry to this function such that it can't be looked
- *	up anymore.
- */
-static void sysfs_dec_nlink(struct sysfs_dirent *sd)
-{
-	struct inode *inode;
-
-	inode = ilookup(sysfs_sb, sd->s_ino);
-	if (!inode)
-		return;
-
-	/* adjust nlink and update timestamp */
-	mutex_lock(&inode->i_mutex);
-
-	inode->i_ctime = CURRENT_TIME;
-	drop_nlink(inode);
-	if (sysfs_type(sd) == SYSFS_DIR)
-		drop_nlink(inode);
-
-	mutex_unlock(&inode->i_mutex);
-
-	iput(inode);
-}
-
-/**
- *	sysfs_addrm_finish - finish up sysfs_dirent add/remove
- *	@acxt: addrm context to finish up
- *
- *	Finish up sysfs_dirent add/remove.  Resources acquired by
- *	sysfs_addrm_start() are released and removed sysfs_dirents are
- *	cleaned up.  Timestamps on the parent inode are updated.
- *
- *	LOCKING:
- *	All mutexes acquired by sysfs_addrm_start() are released.
- */
-void sysfs_addrm_finish(struct sysfs_addrm_cxt *acxt)
-{
-	/* release resources acquired by sysfs_addrm_start() */
 	mutex_unlock(&sysfs_mutex);
-	if (acxt->parent_inode) {
-		struct inode *inode = acxt->parent_inode;
-
-		/* if added/removed, update timestamps on the parent */
-		if (acxt->cnt)
-			inode->i_ctime = inode->i_mtime = CURRENT_TIME;
 
-		mutex_unlock(&inode->i_mutex);
-		iput(inode);
-	}
-
-	/* kill removed sysfs_dirents */
-	while (acxt->removed) {
-		struct sysfs_dirent *sd = acxt->removed;
-
-		acxt->removed = sd->s_sibling;
-		sd->s_sibling = NULL;
-
-		sysfs_dec_nlink(sd);
-		sysfs_deactivate(sd);
-		unmap_bin_file(sd);
-		sysfs_put(sd);
-	}
+	sysfs_deactivate(sd);
+	unmap_bin_file(sd);
+	sysfs_put(sd);
 }
 
 /**
@@ -673,7 +539,6 @@ static int create_dir(struct kobject *kobj, struct sysfs_dirent *parent_sd,
 		      const char *name, struct sysfs_dirent **p_sd)
 {
 	umode_t mode = S_IFDIR| S_IRWXU | S_IRUGO | S_IXUGO;
-	struct sysfs_addrm_cxt acxt;
 	struct sysfs_dirent *sd;
 	int rc;
 
@@ -684,10 +549,8 @@ static int create_dir(struct kobject *kobj, struct sysfs_dirent *parent_sd,
 	sd->s_dir.kobj = kobj;
 
 	/* link in */
-	sysfs_addrm_start(&acxt, parent_sd);
-	rc = sysfs_add_one(&acxt, sd);
-	sysfs_addrm_finish(&acxt);
 
+	rc = sysfs_add_one(parent_sd, sd);
 	if (rc == 0)
 		*p_sd = sd;
 	else
@@ -769,8 +632,6 @@ const struct inode_operations sysfs_dir_inode_operations = {
 
 static void remove_dir(struct sysfs_dirent *dir_sd)
 {
-	struct sysfs_addrm_cxt acxt;
-
 	pr_debug("sysfs %s: removing dir\n", dir_sd->s_name);
 
 	/* Removing non-empty directories is not valid complain! */
@@ -787,9 +648,7 @@ static void remove_dir(struct sysfs_dirent *dir_sd)
 		mutex_unlock(&sysfs_mutex);
 	}
 
-	sysfs_addrm_start(&acxt, dir_sd->s_parent);
-	sysfs_remove_one(&acxt, dir_sd);
-	sysfs_addrm_finish(&acxt);
+	sysfs_remove_one(dir_sd);
 }
 
 void sysfs_remove_subdir(struct sysfs_dirent *sd)
@@ -818,14 +677,11 @@ static struct sysfs_dirent *get_dirent_to_remove(struct sysfs_dirent *dir_sd)
 
 static void __sysfs_remove_dir(struct sysfs_dirent *dir_sd)
 {
-	struct sysfs_addrm_cxt acxt;
 	struct sysfs_dirent *sd;
 
 	/* Remove children that we think are safe */
 	while ((sd = get_dirent_to_remove(dir_sd))) {
-		sysfs_addrm_start(&acxt, sd->s_parent);
-		sysfs_remove_one(&acxt, sd);
-		sysfs_addrm_finish(&acxt);
+		sysfs_remove_one(sd);
 		sysfs_put(sd);
 	}
 
diff --git a/fs/sysfs/file.c b/fs/sysfs/file.c
index 31cfe1d..b512ce6 100644
--- a/fs/sysfs/file.c
+++ b/fs/sysfs/file.c
@@ -499,7 +499,6 @@ int sysfs_add_file_mode(struct sysfs_dirent *dir_sd,
 			const struct attribute *attr, int type, mode_t amode)
 {
 	umode_t mode = (amode & S_IALLUGO) | S_IFREG;
-	struct sysfs_addrm_cxt acxt;
 	struct sysfs_dirent *sd;
 	int rc;
 
@@ -508,10 +507,7 @@ int sysfs_add_file_mode(struct sysfs_dirent *dir_sd,
 		return -ENOMEM;
 	sd->s_attr.attr = (void *)attr;
 
-	sysfs_addrm_start(&acxt, dir_sd);
-	rc = sysfs_add_one(&acxt, sd);
-	sysfs_addrm_finish(&acxt);
-
+	rc = sysfs_add_one(dir_sd, sd);
 	if (rc)
 		sysfs_put(sd);
 
diff --git a/fs/sysfs/inode.c b/fs/sysfs/inode.c
index 1b7ed3c..ad9a30d 100644
--- a/fs/sysfs/inode.c
+++ b/fs/sysfs/inode.c
@@ -263,23 +263,17 @@ void sysfs_delete_inode(struct inode *inode)
 
 int sysfs_hash_and_remove(struct sysfs_dirent *dir_sd, const char *name)
 {
-	struct sysfs_addrm_cxt acxt;
 	struct sysfs_dirent *sd;
 
 	if (!dir_sd)
 		return -ENOENT;
 
-	sysfs_addrm_start(&acxt, dir_sd);
-
-	sd = sysfs_find_dirent(dir_sd, name);
-	if (sd)
-		sysfs_remove_one(&acxt, sd);
-
-	sysfs_addrm_finish(&acxt);
-
-	if (sd)
+	sd = sysfs_get_dirent(dir_sd, name);
+	if (sd) {
+		sysfs_remove_one(sd);
+		sysfs_put(sd);
 		return 0;
-	else
+	} else
 		return -ENOENT;
 }
 
diff --git a/fs/sysfs/symlink.c b/fs/sysfs/symlink.c
index 05e4984..fc5fc86 100644
--- a/fs/sysfs/symlink.c
+++ b/fs/sysfs/symlink.c
@@ -31,7 +31,6 @@ int sysfs_create_link(struct kobject *kobj, struct kobject *target,
 	struct sysfs_dirent *parent_sd = NULL;
 	struct sysfs_dirent *target_sd = NULL;
 	struct sysfs_dirent *sd = NULL;
-	struct sysfs_addrm_cxt acxt;
 	int error;
 
 	BUG_ON(!name);
@@ -65,10 +64,7 @@ int sysfs_create_link(struct kobject *kobj, struct kobject *target,
 	sd->s_symlink.target_sd = target_sd;
 	target_sd = NULL;	/* reference is now owned by the symlink */
 
-	sysfs_addrm_start(&acxt, parent_sd);
-	error = sysfs_add_one(&acxt, sd);
-	sysfs_addrm_finish(&acxt);
-
+	error = sysfs_add_one(parent_sd, sd);
 	if (error)
 		goto out_put;
 
diff --git a/fs/sysfs/sysfs.h b/fs/sysfs/sysfs.h
index f5b53cf..f17ebb8 100644
--- a/fs/sysfs/sysfs.h
+++ b/fs/sysfs/sysfs.h
@@ -77,16 +77,6 @@ static inline unsigned int sysfs_type(struct sysfs_dirent *sd)
 }
 
 /*
- * Context structure to be used while adding/removing nodes.
- */
-struct sysfs_addrm_cxt {
-	struct sysfs_dirent	*parent_sd;
-	struct inode		*parent_inode;
-	struct sysfs_dirent	*removed;
-	int			cnt;
-};
-
-/*
  * mount.c
  */
 extern struct sysfs_dirent sysfs_root;
@@ -106,11 +96,8 @@ extern const struct inode_operations sysfs_dir_inode_operations;
 struct dentry *sysfs_get_dentry(struct sysfs_dirent *sd);
 struct sysfs_dirent *sysfs_get_active_two(struct sysfs_dirent *sd);
 void sysfs_put_active_two(struct sysfs_dirent *sd);
-void sysfs_addrm_start(struct sysfs_addrm_cxt *acxt,
-		       struct sysfs_dirent *parent_sd);
-int sysfs_add_one(struct sysfs_addrm_cxt *acxt, struct sysfs_dirent *sd);
-void sysfs_remove_one(struct sysfs_addrm_cxt *acxt, struct sysfs_dirent *sd);
-void sysfs_addrm_finish(struct sysfs_addrm_cxt *acxt);
+int sysfs_add_one(struct sysfs_dirent *parent_sd, struct sysfs_dirent *sd);
+void sysfs_remove_one(struct sysfs_dirent *sd);
 
 struct sysfs_dirent *sysfs_find_dirent(struct sysfs_dirent *parent_sd,
 				       const unsigned char *name);
-- 
1.6.3.1.54.g99dd.dirty


^ permalink raw reply related	[flat|nested] 200+ messages in thread

* [PATCH 18/26] sysfs: Propagate renames to the vfs on demand
  2009-05-29 20:18           ` [PATCH 0/26] sysfs cleanups v3 Eric W. Biederman
                               ` (16 preceding siblings ...)
  2009-05-29 20:19             ` [PATCH 17/26] sysfs: Kill sysfs_addrm_start and sysfs_addrm_finish Eric W. Biederman
@ 2009-05-29 20:19             ` Eric W. Biederman
  2009-05-29 20:19             ` [PATCH 19/26] sysfs: Merge sysfs_rename_dir and sysfs_move_dir Eric W. Biederman
                               ` (7 subsequent siblings)
  25 siblings, 0 replies; 200+ messages in thread
From: Eric W. Biederman @ 2009-05-29 20:19 UTC (permalink / raw)
  To: Andrew Morton, Greg Kroah-Hartman
  Cc: linux-kernel, Tejun Heo, Cornelia Huck, linux-fsdevel,
	Kay Sievers, Greg KH, Eric W. Biederman, Eric W. Biederman

From: Eric W. Biederman <ebiederm@xmission.com>

By teaching sysfs_revalidate to hide a dentry for
a sysfs_dirent if the sysfs_dirent has been renamed,
and by teaching sysfs_lookup to return the original
dentry if the sysfs_dirent has been renamed, I can
show the results of renames correctly without having to
update the dcache during the directory rename.

This massively simplifies the rename logic, allowing a lot
of weird sysfs special cases to be removed along with
a lot of now unnecessary helper code.

Acked-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Eric W. Biederman <ebiederm@aristanetworks.com>
---
 fs/namei.c            |   22 -------
 fs/sysfs/dir.c        |  156 ++++++++++---------------------------------------
 fs/sysfs/inode.c      |   12 ----
 fs/sysfs/sysfs.h      |    1 -
 include/linux/namei.h |    1 -
 5 files changed, 32 insertions(+), 160 deletions(-)
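
For readers following along, the revalidate side of this change can be
modelled outside the kernel.  This is only a sketch under stated
assumptions: struct node, struct cached_entry and entry_is_valid() are
hypothetical userspace stand-ins for sysfs_dirent, dentry and the
revalidate hook, and all locking is left out.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a sysfs_dirent: the authoritative tree. */
struct node {
	struct node *parent;
	const char *name;
	int removed;
};

/* Hypothetical stand-in for a dentry: a cached view of one node. */
struct cached_entry {
	struct cached_entry *parent;	/* cached parent, NULL at the root */
	struct node *node;		/* plays the role of d_fsdata */
	const char *name;		/* name the entry was cached under */
};

/*
 * Return 1 if the cached entry still describes its node, 0 if it is
 * stale and should be dropped so that the next lookup starts over.
 */
static int entry_is_valid(const struct cached_entry *e)
{
	const struct node *n = e->node;
	const struct node *cached_parent = e->parent ? e->parent->node : NULL;

	if (n->removed)
		return 0;
	/* The node has been moved to another directory. */
	if (cached_parent != n->parent)
		return 0;
	/* The node has been renamed in place. */
	if (strcmp(e->name, n->name) != 0)
		return 0;
	return 1;
}
```

With a check like this the rename itself only has to update the node;
a cached entry under the old name or old parent fails validation the
next time it is used.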

diff --git a/fs/namei.c b/fs/namei.c
index 78f253c..69f559a 100644
--- a/fs/namei.c
+++ b/fs/namei.c
@@ -1260,28 +1260,6 @@ struct dentry *lookup_one_len(const char *name, struct dentry *base, int len)
 	return __lookup_hash(&this, base, NULL);
 }
 
-/**
- * lookup_one_noperm - bad hack for sysfs
- * @name:	pathname component to lookup
- * @base:	base directory to lookup from
- *
- * This is a variant of lookup_one_len that doesn't perform any permission
- * checks.   It's a horrible hack to work around the braindead sysfs
- * architecture and should not be used anywhere else.
- *
- * DON'T USE THIS FUNCTION EVER, thanks.
- */
-struct dentry *lookup_one_noperm(const char *name, struct dentry *base)
-{
-	int err;
-	struct qstr this;
-
-	err = __lookup_one_len(name, &this, base, strlen(name));
-	if (err)
-		return ERR_PTR(err);
-	return __lookup_hash(&this, base, NULL);
-}
-
 int user_path_at(int dfd, const char __user *name, unsigned flags,
 		 struct path *path)
 {
diff --git a/fs/sysfs/dir.c b/fs/sysfs/dir.c
index 3caf9c6..73dbc34 100644
--- a/fs/sysfs/dir.c
+++ b/fs/sysfs/dir.c
@@ -24,7 +24,6 @@
 #include "sysfs.h"
 
 DEFINE_MUTEX(sysfs_mutex);
-DEFINE_MUTEX(sysfs_rename_mutex);
 DEFINE_SPINLOCK(sysfs_assoc_lock);
 
 static DEFINE_SPINLOCK(sysfs_ino_lock);
@@ -84,46 +83,6 @@ static void sysfs_unlink_sibling(struct sysfs_dirent *sd)
 }
 
 /**
- *	sysfs_get_dentry - get dentry for the given sysfs_dirent
- *	@sd: sysfs_dirent of interest
- *
- *	Get dentry for @sd.  Dentry is looked up if currently not
- *	present.  This function descends from the root looking up
- *	dentry for each step.
- *
- *	LOCKING:
- *	mutex_lock(sysfs_rename_mutex)
- *
- *	RETURNS:
- *	Pointer to found dentry on success, ERR_PTR() value on error.
- */
-struct dentry *sysfs_get_dentry(struct sysfs_dirent *sd)
-{
-	struct dentry *dentry = dget(sysfs_sb->s_root);
-
-	while (dentry->d_fsdata != sd) {
-		struct sysfs_dirent *cur;
-		struct dentry *parent;
-
-		/* find the first ancestor which hasn't been looked up */
-		cur = sd;
-		while (cur->s_parent != dentry->d_fsdata)
-			cur = cur->s_parent;
-
-		/* look it up */
-		parent = dentry;
-		mutex_lock(&parent->d_inode->i_mutex);
-		dentry = lookup_one_noperm(cur->s_name, parent);
-		mutex_unlock(&parent->d_inode->i_mutex);
-		dput(parent);
-
-		if (IS_ERR(dentry))
-			break;
-	}
-	return dentry;
-}
-
-/**
  *	sysfs_get_active - get an active reference to sysfs_dirent
  *	@sd: sysfs_dirent to get an active reference to
  *
@@ -311,6 +270,14 @@ static int sysfs_dentry_revalidate(struct dentry *dentry, struct nameidata *nd)
 	if (sd->s_flags & SYSFS_FLAG_REMOVED)
 		goto out_bad;
 
+	/* The sysfs dirent has been moved? */
+	if (dentry->d_parent->d_fsdata != sd->s_parent)
+		goto out_bad;
+
+	/* The sysfs dirent has been renamed */
+	if (strcmp(dentry->d_name.name, sd->s_name) != 0)
+		goto out_bad;
+
 	mutex_unlock(&sysfs_mutex);
 out_valid:
 	return 1;
@@ -318,6 +285,12 @@ out_bad:
 	/* Remove the dentry from the dcache hashes.
 	 * If this is a deleted dentry we use d_drop instead of d_delete
 	 * so sysfs doesn't need to cope with negative dentries.
+	 *
+	 * If this is a dentry that has simply been renamed we
+	 * use d_drop to remove it from the dcache lookup on its
+	 * old parent.  If this dentry persists later when a lookup
+	 * is performed at its new name the dentry will be readded
+	 * to the dcache hashes.
 	 */
 	is_dir = (sysfs_type(sd) == SYSFS_DIR);
 	mutex_unlock(&sysfs_mutex);
@@ -613,10 +586,15 @@ static struct dentry * sysfs_lookup(struct inode *dir, struct dentry *dentry,
 	}
 
 	/* instantiate and hash dentry */
-	dentry->d_op = &sysfs_dentry_ops;
-	dentry->d_fsdata = sysfs_get(sd);
-	d_instantiate(dentry, inode);
-	d_rehash(dentry);
+	ret = d_find_alias(inode);
+	if (!ret) {
+		dentry->d_op = &sysfs_dentry_ops;
+		dentry->d_fsdata = sysfs_get(sd);
+		d_add(dentry, inode);
+	} else {
+		d_move(ret, dentry);
+		iput(inode);
+	}
 
  out_unlock:
 	mutex_unlock(&sysfs_mutex);
@@ -711,62 +689,32 @@ void sysfs_remove_dir(struct kobject * kobj)
 int sysfs_rename_dir(struct kobject * kobj, const char *new_name)
 {
 	struct sysfs_dirent *sd = kobj->sd;
-	struct dentry *parent = NULL;
-	struct dentry *old_dentry = NULL, *new_dentry = NULL;
 	const char *dup_name = NULL;
 	int error;
 
-	mutex_lock(&sysfs_rename_mutex);
+	mutex_lock(&sysfs_mutex);
 
 	error = 0;
 	if (strcmp(sd->s_name, new_name) == 0)
 		goto out;	/* nothing to rename */
 
-	/* get the original dentry */
-	old_dentry = sysfs_get_dentry(sd);
-	if (IS_ERR(old_dentry)) {
-		error = PTR_ERR(old_dentry);
-		old_dentry = NULL;
-		goto out;
-	}
-
-	parent = old_dentry->d_parent;
-
-	/* lock parent and get dentry for new name */
-	mutex_lock(&parent->d_inode->i_mutex);
-	mutex_lock(&sysfs_mutex);
-
 	error = -EEXIST;
 	if (sysfs_find_dirent(sd->s_parent, new_name))
-		goto out_unlock;
-
-	error = -ENOMEM;
-	new_dentry = d_alloc_name(parent, new_name);
-	if (!new_dentry)
-		goto out_unlock;
+		goto out;
 
 	/* rename sysfs_dirent */
 	error = -ENOMEM;
 	new_name = dup_name = kstrdup(new_name, GFP_KERNEL);
 	if (!new_name)
-		goto out_unlock;
+		goto out;
 
 	dup_name = sd->s_name;
 	sd->s_name = new_name;
 
-	/* rename */
-	d_add(new_dentry, NULL);
-	d_move(old_dentry, new_dentry);
-
 	error = 0;
- out_unlock:
+ out:
 	mutex_unlock(&sysfs_mutex);
-	mutex_unlock(&parent->d_inode->i_mutex);
 	kfree(dup_name);
-	dput(old_dentry);
-	dput(new_dentry);
- out:
-	mutex_unlock(&sysfs_rename_mutex);
 	return error;
 }
 
@@ -774,54 +722,20 @@ int sysfs_move_dir(struct kobject *kobj, struct kobject *new_parent_kobj)
 {
 	struct sysfs_dirent *sd = kobj->sd;
 	struct sysfs_dirent *new_parent_sd;
-	struct dentry *old_parent, *new_parent = NULL;
-	struct dentry *old_dentry = NULL, *new_dentry = NULL;
 	int error;
 
-	mutex_lock(&sysfs_rename_mutex);
 	BUG_ON(!sd->s_parent);
+
+	mutex_lock(&sysfs_mutex);
 	new_parent_sd = new_parent_kobj->sd ? new_parent_kobj->sd : &sysfs_root;
 
 	error = 0;
 	if (sd->s_parent == new_parent_sd)
 		goto out;	/* nothing to move */
 
-	/* get dentries */
-	old_dentry = sysfs_get_dentry(sd);
-	if (IS_ERR(old_dentry)) {
-		error = PTR_ERR(old_dentry);
-		old_dentry = NULL;
-		goto out;
-	}
-	old_parent = old_dentry->d_parent;
-
-	new_parent = sysfs_get_dentry(new_parent_sd);
-	if (IS_ERR(new_parent)) {
-		error = PTR_ERR(new_parent);
-		new_parent = NULL;
-		goto out;
-	}
-
-again:
-	mutex_lock(&old_parent->d_inode->i_mutex);
-	if (!mutex_trylock(&new_parent->d_inode->i_mutex)) {
-		mutex_unlock(&old_parent->d_inode->i_mutex);
-		goto again;
-	}
-	mutex_lock(&sysfs_mutex);
-
 	error = -EEXIST;
 	if (sysfs_find_dirent(new_parent_sd, sd->s_name))
-		goto out_unlock;
-
-	error = -ENOMEM;
-	new_dentry = d_alloc_name(new_parent, sd->s_name);
-	if (!new_dentry)
-		goto out_unlock;
-
-	error = 0;
-	d_add(new_dentry, NULL);
-	d_move(old_dentry, new_dentry);
+		goto out;
 
 	/* Remove from old parent's list and insert into new parent's list. */
 	sysfs_unlink_sibling(sd);
@@ -830,15 +744,9 @@ again:
 	sd->s_parent = new_parent_sd;
 	sysfs_link_sibling(sd);
 
- out_unlock:
+	error = 0;
+out:
 	mutex_unlock(&sysfs_mutex);
-	mutex_unlock(&new_parent->d_inode->i_mutex);
-	mutex_unlock(&old_parent->d_inode->i_mutex);
- out:
-	dput(new_parent);
-	dput(old_dentry);
-	dput(new_dentry);
-	mutex_unlock(&sysfs_rename_mutex);
 	return error;
 }
 
diff --git a/fs/sysfs/inode.c b/fs/sysfs/inode.c
index ad9a30d..a1917b5 100644
--- a/fs/sysfs/inode.c
+++ b/fs/sysfs/inode.c
@@ -132,17 +132,6 @@ static inline void set_inode_attr(struct inode * inode, struct iattr * iattr)
 	inode->i_ctime = iattr->ia_ctime;
 }
 
-
-/*
- * sysfs has a different i_mutex lock order behavior for i_mutex than other
- * filesystems; sysfs i_mutex is called in many places with subsystem locks
- * held. At the same time, many of the VFS locking rules do not apply to
- * sysfs at all (cross directory rename for example). To untangle this mess
- * (which gives false positives in lockdep), we're giving sysfs inodes their
- * own class for i_mutex.
- */
-static struct lock_class_key sysfs_inode_imutex_key;
-
 static int sysfs_count_nlink(struct sysfs_dirent *sd)
 {
 	struct sysfs_dirent *child;
@@ -190,7 +179,6 @@ static void sysfs_init_inode(struct sysfs_dirent *sd, struct inode *inode)
 	inode->i_mapping->a_ops = &sysfs_aops;
 	inode->i_mapping->backing_dev_info = &sysfs_backing_dev_info;
 	inode->i_op = &sysfs_inode_operations;
-	lockdep_set_class(&inode->i_mutex, &sysfs_inode_imutex_key);
 
 	set_default_inode_attr(inode, sd->s_mode);
 	sysfs_refresh_inode(sd, inode);
diff --git a/fs/sysfs/sysfs.h b/fs/sysfs/sysfs.h
index f17ebb8..2db952c 100644
--- a/fs/sysfs/sysfs.h
+++ b/fs/sysfs/sysfs.h
@@ -87,7 +87,6 @@ extern struct kmem_cache *sysfs_dir_cachep;
  * dir.c
  */
 extern struct mutex sysfs_mutex;
-extern struct mutex sysfs_rename_mutex;
 extern spinlock_t sysfs_assoc_lock;
 
 extern const struct file_operations sysfs_dir_operations;
diff --git a/include/linux/namei.h b/include/linux/namei.h
index fc2e035..758ecfb 100644
--- a/include/linux/namei.h
+++ b/include/linux/namei.h
@@ -76,7 +76,6 @@ extern struct file *nameidata_to_filp(struct nameidata *nd, int flags);
 extern void release_open_intent(struct nameidata *);
 
 extern struct dentry *lookup_one_len(const char *, struct dentry *, int);
-extern struct dentry *lookup_one_noperm(const char *, struct dentry *);
 
 extern int follow_down(struct vfsmount **, struct dentry **);
 extern int follow_up(struct vfsmount **, struct dentry **);
-- 
1.6.3.1.54.g99dd.dirty



* [PATCH 19/26] sysfs: Merge sysfs_rename_dir and sysfs_move_dir
  2009-05-29 20:18           ` [PATCH 0/26] sysfs cleanups v3 Eric W. Biederman
                               ` (17 preceding siblings ...)
  2009-05-29 20:19             ` [PATCH 18/26] sysfs: Propagate renames to the vfs on demand Eric W. Biederman
@ 2009-05-29 20:19             ` Eric W. Biederman
  2009-05-29 20:19             ` [PATCH 20/26] sysfs: Pass super_block to sysfs_get_inode Eric W. Biederman
                               ` (6 subsequent siblings)
  25 siblings, 0 replies; 200+ messages in thread
From: Eric W. Biederman @ 2009-05-29 20:19 UTC (permalink / raw)
  To: Andrew Morton, Greg Kroah-Hartman
  Cc: linux-kernel, Tejun Heo, Cornelia Huck, linux-fsdevel,
	Kay Sievers, Greg KH, Eric W. Biederman, Eric W. Biederman

From: Eric W. Biederman <ebiederm@xmission.com>

These two functions do 90% of the same work and it doesn't significantly
obfuscate the function to allow both the parent dir and the name to change
at the same time.  So merge them together to simplify maintenance, and
increase testing.

Acked-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Eric W. Biederman <ebiederm@aristanetworks.com>
---
 fs/sysfs/dir.c |   66 +++++++++++++++++++++++--------------------------------
 1 files changed, 28 insertions(+), 38 deletions(-)
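
As an illustration of why one helper can serve both callers, here is a
small userspace model.  It is a sketch, not the kernel code: struct
node, find_child(), link_child(), unlink_child() and node_mv() are
hypothetical names, and the mutex and reference counting are omitted.

```c
#include <errno.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a sysfs_dirent. */
struct node {
	struct node *parent;
	struct node *children;	/* head of this directory's child list */
	struct node *sibling;	/* next child of the same parent */
	char *name;		/* heap allocated, owned by the node */
};

static struct node *find_child(struct node *dir, const char *name)
{
	for (struct node *c = dir->children; c; c = c->sibling)
		if (strcmp(c->name, name) == 0)
			return c;
	return NULL;
}

static void link_child(struct node *dir, struct node *n)
{
	n->parent = dir;
	n->sibling = dir->children;
	dir->children = n;
}

static void unlink_child(struct node *n)
{
	struct node **pos;

	if (!n->parent)
		return;
	for (pos = &n->parent->children; *pos; pos = &(*pos)->sibling) {
		if (*pos == n) {
			*pos = n->sibling;
			n->sibling = NULL;
			return;
		}
	}
}

/* Change the parent, the name, or both, in a single step. */
static int node_mv(struct node *n, struct node *new_parent,
		   const char *new_name)
{
	int name_changes = strcmp(n->name, new_name) != 0;
	int parent_changes = n->parent != new_parent;
	char *dup = NULL;

	if (!name_changes && !parent_changes)
		return 0;		/* nothing to do */
	if (find_child(new_parent, new_name))
		return -EEXIST;
	if (name_changes) {
		dup = malloc(strlen(new_name) + 1);
		if (!dup)
			return -ENOMEM;
		strcpy(dup, new_name);
	}
	if (parent_changes) {
		unlink_child(n);
		link_child(new_parent, n);
	}
	if (dup) {
		free(n->name);
		n->name = dup;
	}
	return 0;
}
```

A pure rename passes the current parent and a pure move passes the
current name, which mirrors how the two wrappers in the diff call the
merged function.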

diff --git a/fs/sysfs/dir.c b/fs/sysfs/dir.c
index 73dbc34..64c94ed 100644
--- a/fs/sysfs/dir.c
+++ b/fs/sysfs/dir.c
@@ -686,30 +686,42 @@ void sysfs_remove_dir(struct kobject * kobj)
 	__sysfs_remove_dir(sd);
 }
 
-int sysfs_rename_dir(struct kobject * kobj, const char *new_name)
+static int sysfs_mv_dir(struct sysfs_dirent *sd,
+	struct sysfs_dirent *new_parent_sd, const char *new_name)
 {
-	struct sysfs_dirent *sd = kobj->sd;
 	const char *dup_name = NULL;
 	int error;
 
 	mutex_lock(&sysfs_mutex);
 
 	error = 0;
-	if (strcmp(sd->s_name, new_name) == 0)
+	if ((sd->s_parent == new_parent_sd) &&
+	    (strcmp(sd->s_name, new_name) == 0))
 		goto out;	/* nothing to rename */
 
 	error = -EEXIST;
-	if (sysfs_find_dirent(sd->s_parent, new_name))
+	if (sysfs_find_dirent(new_parent_sd, new_name))
 		goto out;
 
 	/* rename sysfs_dirent */
-	error = -ENOMEM;
-	new_name = dup_name = kstrdup(new_name, GFP_KERNEL);
-	if (!new_name)
-		goto out;
+	if (strcmp(sd->s_name, new_name) != 0) {
+		error = -ENOMEM;
+		new_name = dup_name = kstrdup(new_name, GFP_KERNEL);
+		if (!new_name)
+			goto out;
+
+		dup_name = sd->s_name;
+		sd->s_name = new_name;
+	}
 
-	dup_name = sd->s_name;
-	sd->s_name = new_name;
+	/* Remove from old parent's list and insert into new parent's list. */
+	if (sd->s_parent != new_parent_sd) {
+		sysfs_unlink_sibling(sd);
+		sysfs_get(new_parent_sd);
+		sysfs_put(sd->s_parent);
+		sd->s_parent = new_parent_sd;
+		sysfs_link_sibling(sd);
+	}
 
 	error = 0;
  out:
@@ -718,36 +730,14 @@ int sysfs_rename_dir(struct kobject * kobj, const char *new_name)
 	return error;
 }
 
-int sysfs_move_dir(struct kobject *kobj, struct kobject *new_parent_kobj)
+int sysfs_rename_dir(struct kobject * kobj, const char *new_name)
 {
-	struct sysfs_dirent *sd = kobj->sd;
-	struct sysfs_dirent *new_parent_sd;
-	int error;
-
-	BUG_ON(!sd->s_parent);
-
-	mutex_lock(&sysfs_mutex);
-	new_parent_sd = new_parent_kobj->sd ? new_parent_kobj->sd : &sysfs_root;
-
-	error = 0;
-	if (sd->s_parent == new_parent_sd)
-		goto out;	/* nothing to move */
-
-	error = -EEXIST;
-	if (sysfs_find_dirent(new_parent_sd, sd->s_name))
-		goto out;
-
-	/* Remove from old parent's list and insert into new parent's list. */
-	sysfs_unlink_sibling(sd);
-	sysfs_get(new_parent_sd);
-	sysfs_put(sd->s_parent);
-	sd->s_parent = new_parent_sd;
-	sysfs_link_sibling(sd);
+	return sysfs_mv_dir(kobj->sd, kobj->sd->s_parent, new_name);
+}
 
-	error = 0;
-out:
-	mutex_unlock(&sysfs_mutex);
-	return error;
+int sysfs_move_dir(struct kobject *kobj, struct kobject *new_parent_kobj)
+{
+	return sysfs_mv_dir(kobj->sd, new_parent_kobj->sd, kobj->sd->s_name);
 }
 
 /* Relationship between s_mode and the DT_xxx types */
-- 
1.6.3.1.54.g99dd.dirty



* [PATCH 20/26] sysfs: Pass super_block to sysfs_get_inode
  2009-05-29 20:18           ` [PATCH 0/26] sysfs cleanups v3 Eric W. Biederman
                               ` (18 preceding siblings ...)
  2009-05-29 20:19             ` [PATCH 19/26] sysfs: Merge sysfs_rename_dir and sysfs_move_dir Eric W. Biederman
@ 2009-05-29 20:19             ` Eric W. Biederman
  2009-05-29 20:19             ` [PATCH 21/26] sysfs: Kill unused sysfs_sb variable Eric W. Biederman
                               ` (5 subsequent siblings)
  25 siblings, 0 replies; 200+ messages in thread
From: Eric W. Biederman @ 2009-05-29 20:19 UTC (permalink / raw)
  To: Andrew Morton, Greg Kroah-Hartman
  Cc: linux-kernel, Tejun Heo, Cornelia Huck, linux-fsdevel,
	Kay Sievers, Greg KH, Eric W. Biederman, Eric W. Biederman

From: Eric W. Biederman <ebiederm@xmission.com>

Currently sysfs_get_inode magically returns an inode on
sysfs_sb.  Make the super_block parameter explicit and
the code becomes clearer.

Acked-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Eric W. Biederman <ebiederm@aristanetworks.com>
---
 fs/sysfs/dir.c   |    2 +-
 fs/sysfs/inode.c |    5 +++--
 fs/sysfs/mount.c |    2 +-
 fs/sysfs/sysfs.h |    2 +-
 4 files changed, 6 insertions(+), 5 deletions(-)

diff --git a/fs/sysfs/dir.c b/fs/sysfs/dir.c
index 64c94ed..97ca5bf 100644
--- a/fs/sysfs/dir.c
+++ b/fs/sysfs/dir.c
@@ -579,7 +579,7 @@ static struct dentry * sysfs_lookup(struct inode *dir, struct dentry *dentry,
 	}
 
 	/* attach dentry and inode */
-	inode = sysfs_get_inode(sd);
+	inode = sysfs_get_inode(dir->i_sb, sd);
 	if (!inode) {
 		ret = ERR_PTR(-ENOMEM);
 		goto out_unlock;
diff --git a/fs/sysfs/inode.c b/fs/sysfs/inode.c
index a1917b5..c725aeb 100644
--- a/fs/sysfs/inode.c
+++ b/fs/sysfs/inode.c
@@ -210,6 +210,7 @@ static void sysfs_init_inode(struct sysfs_dirent *sd, struct inode *inode)
 
 /**
  *	sysfs_get_inode - get inode for sysfs_dirent
+ *	@sb: super block
  *	@sd: sysfs_dirent to allocate inode for
  *
  *	Get inode for @sd.  If such inode doesn't exist, a new inode
@@ -222,11 +223,11 @@ static void sysfs_init_inode(struct sysfs_dirent *sd, struct inode *inode)
  *	RETURNS:
  *	Pointer to allocated inode on success, NULL on failure.
  */
-struct inode * sysfs_get_inode(struct sysfs_dirent *sd)
+struct inode * sysfs_get_inode(struct super_block *sb, struct sysfs_dirent *sd)
 {
 	struct inode *inode;
 
-	inode = iget_locked(sysfs_sb, sd->s_ino);
+	inode = iget_locked(sb, sd->s_ino);
 	if (inode && (inode->i_state & I_NEW))
 		sysfs_init_inode(sd, inode);
 
diff --git a/fs/sysfs/mount.c b/fs/sysfs/mount.c
index 4974995..89db07e 100644
--- a/fs/sysfs/mount.c
+++ b/fs/sysfs/mount.c
@@ -54,7 +54,7 @@ static int sysfs_fill_super(struct super_block *sb, void *data, int silent)
 
 	/* get root inode, initialize and unlock it */
 	mutex_lock(&sysfs_mutex);
-	inode = sysfs_get_inode(&sysfs_root);
+	inode = sysfs_get_inode(sb, &sysfs_root);
 	mutex_unlock(&sysfs_mutex);
 	if (!inode) {
 		pr_debug("sysfs: could not get root inode\n");
diff --git a/fs/sysfs/sysfs.h b/fs/sysfs/sysfs.h
index 2db952c..cf21b06 100644
--- a/fs/sysfs/sysfs.h
+++ b/fs/sysfs/sysfs.h
@@ -130,7 +130,7 @@ static inline void __sysfs_put(struct sysfs_dirent *sd)
 /*
  * inode.c
  */
-struct inode *sysfs_get_inode(struct sysfs_dirent *sd);
+struct inode *sysfs_get_inode(struct super_block *sb, struct sysfs_dirent *sd);
 void sysfs_delete_inode(struct inode *inode);
 int sysfs_sd_setattr(struct sysfs_dirent *sd, struct iattr *iattr);
 int sysfs_setattr(struct dentry *dentry, struct iattr *iattr);
-- 
1.6.3.1.54.g99dd.dirty



* [PATCH 21/26] sysfs: Kill unused sysfs_sb variable.
  2009-05-29 20:18           ` [PATCH 0/26] sysfs cleanups v3 Eric W. Biederman
                               ` (19 preceding siblings ...)
  2009-05-29 20:19             ` [PATCH 20/26] sysfs: Pass super_block to sysfs_get_inode Eric W. Biederman
@ 2009-05-29 20:19             ` Eric W. Biederman
  2009-05-29 20:19             ` [PATCH 22/26] sysfs: Normalize error handling in sysfs_fill_inode Eric W. Biederman
                               ` (4 subsequent siblings)
  25 siblings, 0 replies; 200+ messages in thread
From: Eric W. Biederman @ 2009-05-29 20:19 UTC (permalink / raw)
  To: Andrew Morton, Greg Kroah-Hartman
  Cc: linux-kernel, Tejun Heo, Cornelia Huck, linux-fsdevel,
	Kay Sievers, Greg KH, Eric W. Biederman, Eric W. Biederman

From: Eric W. Biederman <ebiederm@xmission.com>

Now that there are no more users we can remove
the sysfs_sb variable.

Acked-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Eric W. Biederman <ebiederm@aristanetworks.com>
---
 fs/sysfs/mount.c |    2 --
 fs/sysfs/sysfs.h |    1 -
 2 files changed, 0 insertions(+), 3 deletions(-)

diff --git a/fs/sysfs/mount.c b/fs/sysfs/mount.c
index 89db07e..0cb1088 100644
--- a/fs/sysfs/mount.c
+++ b/fs/sysfs/mount.c
@@ -23,7 +23,6 @@
 
 
 static struct vfsmount *sysfs_mount;
-struct super_block * sysfs_sb = NULL;
 struct kmem_cache *sysfs_dir_cachep;
 
 static const struct super_operations sysfs_ops = {
@@ -50,7 +49,6 @@ static int sysfs_fill_super(struct super_block *sb, void *data, int silent)
 	sb->s_magic = SYSFS_MAGIC;
 	sb->s_op = &sysfs_ops;
 	sb->s_time_gran = 1;
-	sysfs_sb = sb;
 
 	/* get root inode, initialize and unlock it */
 	mutex_lock(&sysfs_mutex);
diff --git a/fs/sysfs/sysfs.h b/fs/sysfs/sysfs.h
index cf21b06..5dd8168 100644
--- a/fs/sysfs/sysfs.h
+++ b/fs/sysfs/sysfs.h
@@ -80,7 +80,6 @@ static inline unsigned int sysfs_type(struct sysfs_dirent *sd)
  * mount.c
  */
 extern struct sysfs_dirent sysfs_root;
-extern struct super_block *sysfs_sb;
 extern struct kmem_cache *sysfs_dir_cachep;
 
 /*
-- 
1.6.3.1.54.g99dd.dirty



* [PATCH 22/26] sysfs: Normalize error handling in sysfs_fill_inode
  2009-05-29 20:18           ` [PATCH 0/26] sysfs cleanups v3 Eric W. Biederman
                               ` (20 preceding siblings ...)
  2009-05-29 20:19             ` [PATCH 21/26] sysfs: Kill unused sysfs_sb variable Eric W. Biederman
@ 2009-05-29 20:19             ` Eric W. Biederman
  2009-05-29 20:19             ` [PATCH 23/26] sysfs: Rename sysfs_mv_dir sysfs_rename Eric W. Biederman
                               ` (3 subsequent siblings)
  25 siblings, 0 replies; 200+ messages in thread
From: Eric W. Biederman @ 2009-05-29 20:19 UTC (permalink / raw)
  To: Andrew Morton, Greg Kroah-Hartman
  Cc: linux-kernel, Tejun Heo, Cornelia Huck, linux-fsdevel,
	Kay Sievers, Greg KH, Eric W. Biederman, Eric W. Biederman

From: Eric W. Biederman <ebiederm@xmission.com>

Use a single error exit path instead of doing the required
cleanup separately at each point where we find an error.
Ultimately this should make the code more maintainable.

Acked-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Eric W. Biederman <ebiederm@aristanetworks.com>
---
 fs/sysfs/mount.c |   16 +++++++++++-----
 1 files changed, 11 insertions(+), 5 deletions(-)
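
The single exit path pattern can be shown in isolation.  This is a
userspace sketch with hypothetical names (struct resources, fill() and
its fail_at test hook); malloc() and free() stand in for the inode and
dentry calls.  It relies on the cleanup calls being safe on NULL, as
free() is, which the hunk below appears to assume for dput() and
iput() as well.

```c
#include <errno.h>
#include <stdlib.h>

/* Hypothetical pair of resources acquired in order. */
struct resources {
	void *inode;
	void *root;
};

/*
 * fail_at is a test hook: 0 means no forced failure, 1 forces the
 * first allocation to fail, 2 forces the second one to fail.
 */
static int fill(struct resources *out, int fail_at)
{
	void *inode = NULL;
	void *root = NULL;
	int error;

	error = -ENOMEM;
	inode = (fail_at == 1) ? NULL : malloc(16);
	if (!inode)
		goto err_out;

	error = -ENOMEM;
	root = (fail_at == 2) ? NULL : malloc(16);
	if (!root)
		goto err_out;

	out->inode = inode;
	out->root = root;
	return 0;

err_out:
	/* One place releases everything; free(NULL) is a no-op. */
	free(root);
	free(inode);
	return error;
}
```

Starting both pointers at NULL is what lets every failure point jump to
the same label without tracking how far the function got.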

diff --git a/fs/sysfs/mount.c b/fs/sysfs/mount.c
index 0cb1088..1dd023a 100644
--- a/fs/sysfs/mount.c
+++ b/fs/sysfs/mount.c
@@ -41,8 +41,9 @@ struct sysfs_dirent sysfs_root = {
 
 static int sysfs_fill_super(struct super_block *sb, void *data, int silent)
 {
-	struct inode *inode;
-	struct dentry *root;
+	struct inode *inode = NULL;
+	struct dentry *root = NULL;
+	int error;
 
 	sb->s_blocksize = PAGE_CACHE_SIZE;
 	sb->s_blocksize_bits = PAGE_CACHE_SHIFT;
@@ -51,24 +52,29 @@ static int sysfs_fill_super(struct super_block *sb, void *data, int silent)
 	sb->s_time_gran = 1;
 
 	/* get root inode, initialize and unlock it */
+	error = -ENOMEM;
 	mutex_lock(&sysfs_mutex);
 	inode = sysfs_get_inode(sb, &sysfs_root);
 	mutex_unlock(&sysfs_mutex);
 	if (!inode) {
 		pr_debug("sysfs: could not get root inode\n");
-		return -ENOMEM;
+		goto err_out;
 	}
 
 	/* instantiate and link root dentry */
+	error = -ENOMEM;
 	root = d_alloc_root(inode);
 	if (!root) {
 		pr_debug("%s: could not get root dentry!\n",__func__);
-		iput(inode);
-		return -ENOMEM;
+		goto err_out;
 	}
 	root->d_fsdata = &sysfs_root;
 	sb->s_root = root;
 	return 0;
+err_out:
+	dput(root);
+	iput(inode);
+	return error;
 }
 
 static int sysfs_get_sb(struct file_system_type *fs_type,
-- 
1.6.3.1.54.g99dd.dirty



* [PATCH 23/26] sysfs: Rename sysfs_mv_dir sysfs_rename
  2009-05-29 20:18           ` [PATCH 0/26] sysfs cleanups v3 Eric W. Biederman
                               ` (21 preceding siblings ...)
  2009-05-29 20:19             ` [PATCH 22/26] sysfs: Normalize error handling in sysfs_fill_inode Eric W. Biederman
@ 2009-05-29 20:19             ` Eric W. Biederman
  2009-05-29 20:19             ` [PATCH 24/26] sysfs: Make sysfs_rename_link atomic Eric W. Biederman
                               ` (2 subsequent siblings)
  25 siblings, 0 replies; 200+ messages in thread
From: Eric W. Biederman @ 2009-05-29 20:19 UTC (permalink / raw)
  To: Andrew Morton, Greg Kroah-Hartman
  Cc: linux-kernel, Tejun Heo, Cornelia Huck, linux-fsdevel,
	Kay Sievers, Greg KH, Eric W. Biederman, Eric W. Biederman

From: Eric W. Biederman <ebiederm@xmission.com>

It turns out that sysfs_mv_dir actually makes no assumptions that what
is being renamed is a directory.  So rename sysfs_mv_dir to sysfs_rename to
reflect the function's general utility.  Later we will use it to rename
symlinks in sysfs.

Signed-off-by: Eric W. Biederman <ebiederm@aristanetworks.com>
---
 fs/sysfs/dir.c   |    6 +++---
 fs/sysfs/sysfs.h |    3 +++
 2 files changed, 6 insertions(+), 3 deletions(-)

diff --git a/fs/sysfs/dir.c b/fs/sysfs/dir.c
index 97ca5bf..7da42fb 100644
--- a/fs/sysfs/dir.c
+++ b/fs/sysfs/dir.c
@@ -686,7 +686,7 @@ void sysfs_remove_dir(struct kobject * kobj)
 	__sysfs_remove_dir(sd);
 }
 
-static int sysfs_mv_dir(struct sysfs_dirent *sd,
+int sysfs_rename(struct sysfs_dirent *sd,
 	struct sysfs_dirent *new_parent_sd, const char *new_name)
 {
 	const char *dup_name = NULL;
@@ -732,12 +732,12 @@ static int sysfs_mv_dir(struct sysfs_dirent *sd,
 
 int sysfs_rename_dir(struct kobject * kobj, const char *new_name)
 {
-	return sysfs_mv_dir(kobj->sd, kobj->sd->s_parent, new_name);
+	return sysfs_rename(kobj->sd, kobj->sd->s_parent, new_name);
 }
 
 int sysfs_move_dir(struct kobject *kobj, struct kobject *new_parent_kobj)
 {
-	return sysfs_mv_dir(kobj->sd, new_parent_kobj->sd, kobj->sd->s_name);
+	return sysfs_rename(kobj->sd, new_parent_kobj->sd, kobj->sd->s_name);
 }
 
 /* Relationship between s_mode and the DT_xxx types */
diff --git a/fs/sysfs/sysfs.h b/fs/sysfs/sysfs.h
index 5dd8168..be1d932 100644
--- a/fs/sysfs/sysfs.h
+++ b/fs/sysfs/sysfs.h
@@ -109,6 +109,9 @@ int sysfs_create_subdir(struct kobject *kobj, const char *name,
 			struct sysfs_dirent **p_sd);
 void sysfs_remove_subdir(struct sysfs_dirent *sd);
 
+int sysfs_rename(struct sysfs_dirent *sd,
+	struct sysfs_dirent *new_parent_sd, const char *new_name);
+
 static inline struct sysfs_dirent *__sysfs_get(struct sysfs_dirent *sd)
 {
 	if (sd) {
-- 
1.6.3.1.54.g99dd.dirty



* [PATCH 24/26] sysfs: Make sysfs_rename_link atomic
  2009-05-29 20:18           ` [PATCH 0/26] sysfs cleanups v3 Eric W. Biederman
                               ` (22 preceding siblings ...)
  2009-05-29 20:19             ` [PATCH 23/26] sysfs: Rename sysfs_mv_dir sysfs_rename Eric W. Biederman
@ 2009-05-29 20:19             ` Eric W. Biederman
  2009-05-29 20:19             ` [PATCH 25/26] driver core: Don't remove kobjects in device_shutdown Eric W. Biederman
  2009-05-29 20:19             ` [PATCH 26/26] sysfs: In sysfs_add_one fail if the target directory has been removed Eric W. Biederman
  25 siblings, 0 replies; 200+ messages in thread
From: Eric W. Biederman @ 2009-05-29 20:19 UTC (permalink / raw)
  To: Andrew Morton, Greg Kroah-Hartman
  Cc: linux-kernel, Tejun Heo, Cornelia Huck, linux-fsdevel,
	Kay Sievers, Greg KH, Eric W. Biederman, Eric W. Biederman

From: Eric W. Biederman <ebiederm@xmission.com>

Use the existing sysfs_rename to make sysfs_rename_link an atomic
operation that does less work.  While I am at it, add additional sanity
checking to ensure it is a symlink I am renaming.

Acked-by: Kay Sievers <kay.sievers@vrfy.org>
Acked-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Eric W. Biederman <ebiederm@aristanetworks.com>
---
 fs/sysfs/symlink.c |   26 ++++++++++++++++++++++++--
 1 files changed, 24 insertions(+), 2 deletions(-)
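
To make the difference from remove-then-create concrete, here is a
userspace sketch of a validate-then-rename-in-place operation.  All
names (struct entry, struct dir, lookup(), rename_link()) are
hypothetical and the fixed-size table is only for brevity; in the real
code the rename runs under sysfs_mutex.

```c
#include <errno.h>
#include <string.h>

enum kind { KIND_DIR, KIND_LINK };

/* Hypothetical directory entry; target is only used by links. */
struct entry {
	int in_use;
	enum kind kind;
	const void *target;
	char name[32];
};

#define NR_ENTRIES 4

struct dir {
	struct entry e[NR_ENTRIES];
};

static struct entry *lookup(struct dir *d, const char *name)
{
	for (int i = 0; i < NR_ENTRIES; i++)
		if (d->e[i].in_use && strcmp(d->e[i].name, name) == 0)
			return &d->e[i];
	return NULL;
}

/* Rename a link in place; on any error the old entry is untouched. */
static int rename_link(struct dir *d, const void *target,
		       const char *old, const char *new_name)
{
	struct entry *e = lookup(d, old);

	if (!e)
		return -ENOENT;
	/* Sanity checks: it must be a link, and to the expected target. */
	if (e->kind != KIND_LINK || e->target != target)
		return -EINVAL;
	if (strcmp(old, new_name) == 0)
		return 0;
	if (lookup(d, new_name))
		return -EEXIST;
	if (strlen(new_name) >= sizeof(e->name))
		return -ENAMETOOLONG;
	strcpy(e->name, new_name);
	return 0;
}
```

Because the entry is never removed, there is no window in which neither
name exists, and a failed rename leaves the old link in place.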

diff --git a/fs/sysfs/symlink.c b/fs/sysfs/symlink.c
index fc5fc86..39d050b 100644
--- a/fs/sysfs/symlink.c
+++ b/fs/sysfs/symlink.c
@@ -106,8 +106,30 @@ void sysfs_remove_link(struct kobject * kobj, const char * name)
 int sysfs_rename_link(struct kobject *kobj, struct kobject *targ,
 			const char *old, const char *new)
 {
-	sysfs_remove_link(kobj, old);
-	return sysfs_create_link(kobj, targ, new);
+	struct sysfs_dirent *parent_sd, *sd = NULL;
+	int result;
+
+	if (!kobj)
+		parent_sd = &sysfs_root;
+	else
+		parent_sd = kobj->sd;
+
+	result = -ENOENT;
+	sd = sysfs_get_dirent(parent_sd, old);
+	if (!sd)
+		goto out;
+
+	result = -EINVAL;
+	if (sysfs_type(sd) != SYSFS_KOBJ_LINK)
+		goto out;
+	if (sd->s_symlink.target_sd->s_dir.kobj != targ)
+		goto out;
+
+	result = sysfs_rename(sd, parent_sd, new);
+
+out:
+	sysfs_put(sd);
+	return result;
 }
 
 static int sysfs_get_target_path(struct sysfs_dirent *parent_sd,
-- 
1.6.3.1.54.g99dd.dirty



* [PATCH 25/26] driver core: Don't remove kobjects in device_shutdown.
  2009-05-29 20:18           ` [PATCH 0/26] sysfs cleanups v3 Eric W. Biederman
                               ` (23 preceding siblings ...)
  2009-05-29 20:19             ` [PATCH 24/26] sysfs: Make sysfs_rename_link atomic Eric W. Biederman
@ 2009-05-29 20:19             ` Eric W. Biederman
  2009-05-29 20:19             ` [PATCH 26/26] sysfs: In sysfs_add_one fail if the target directory has been removed Eric W. Biederman
  25 siblings, 0 replies; 200+ messages in thread
From: Eric W. Biederman @ 2009-05-29 20:19 UTC (permalink / raw)
  To: Andrew Morton, Greg Kroah-Hartman
  Cc: linux-kernel, Tejun Heo, Cornelia Huck, linux-fsdevel,
	Kay Sievers, Greg KH, Eric W. Biederman, Eric W. Biederman

From: Eric W. Biederman <ebiederm@xmission.com>

device_shutdown is defined to just shut down the hardware and to not
clean up any kernel data structures.  Therefore don't put the kobjects
for /sys/dev, /sys/dev/block, and /sys/dev/char.

This ensures we don't remove /sys/dev/block and /sys/dev/char while
we still have symlinks from there to the actual devices.

Acked-by: Kay Sievers <kay.sievers@vrfy.org>
Signed-off-by: Eric W. Biederman <ebiederm@aristanetworks.com>
---
 drivers/base/core.c |    3 ---
 1 files changed, 0 insertions(+), 3 deletions(-)

diff --git a/drivers/base/core.c b/drivers/base/core.c
index 8a1569c..49d3142 100644
--- a/drivers/base/core.c
+++ b/drivers/base/core.c
@@ -1653,7 +1653,4 @@ void device_shutdown(void)
 			dev->driver->shutdown(dev);
 		}
 	}
-	kobject_put(sysfs_dev_char_kobj);
-	kobject_put(sysfs_dev_block_kobj);
-	kobject_put(dev_kobj);
 }
-- 
1.6.3.1.54.g99dd.dirty



* [PATCH 26/26] sysfs: In sysfs_add_one fail if the target directory has been removed.
  2009-05-29 20:18           ` [PATCH 0/26] sysfs cleanups v3 Eric W. Biederman
                               ` (24 preceding siblings ...)
  2009-05-29 20:19             ` [PATCH 25/26] driver core: Don't remove kobjects in device_shutdown Eric W. Biederman
@ 2009-05-29 20:19             ` Eric W. Biederman
  25 siblings, 0 replies; 200+ messages in thread
From: Eric W. Biederman @ 2009-05-29 20:19 UTC (permalink / raw)
  To: Andrew Morton, Greg Kroah-Hartman
  Cc: linux-kernel, Tejun Heo, Cornelia Huck, linux-fsdevel,
	Kay Sievers, Greg KH, Eric W. Biederman, Eric W. Biederman

From: Eric W. Biederman <ebiederm@xmission.com>

If a bug in the upper layers results in someone attempting to add
to a sysfs directory that has already been removed, warn about it
and fail.

I don't believe this has ever happened, and it certainly never should
happen, but be strict to avoid errors creeping in.

Acked-by: Kay Sievers <kay.sievers@vrfy.org>
Acked-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Eric W. Biederman <ebiederm@aristanetworks.com>
---
 fs/sysfs/dir.c |   37 +++++++++++++++++++++++--------------
 1 files changed, 23 insertions(+), 14 deletions(-)
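
The intended behaviour can be sketched in userspace as follows.  The
names (struct node, FLAG_REMOVED, find_child(), add_one()) are
hypothetical and locking is omitted.  Note that the sketch assigns
-EEXIST before the duplicate check; in the hunk below result is set to
-ENOENT once ahead of both checks, so as posted the duplicate case
seems to reach out_err with -ENOENT as well.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>

#define FLAG_REMOVED 0x1u

/* Hypothetical stand-in for a sysfs_dirent. */
struct node {
	const char *name;
	unsigned int flags;
	struct node *parent;
	struct node *children;	/* head of this directory's child list */
	struct node *sibling;	/* next child of the same parent */
};

static struct node *find_child(struct node *dir, const char *name)
{
	for (struct node *c = dir->children; c; c = c->sibling)
		if (strcmp(c->name, name) == 0)
			return c;
	return NULL;
}

/* Link n under parent, refusing removed parents and duplicate names. */
static int add_one(struct node *parent, struct node *n)
{
	int result;

	result = -ENOENT;
	if (parent->flags & FLAG_REMOVED)
		goto out_err;

	result = -EEXIST;
	if (find_child(parent, n->name))
		goto out_err;

	n->parent = parent;
	n->sibling = parent->children;
	parent->children = n;
	return 0;

out_err:
	/* One shared warning; the reason is derived from the error code. */
	fprintf(stderr, "cannot create '%s/%s': %s\n",
		parent->name, n->name,
		result == -EEXIST ? "duplicate filename" : "no such directory");
	return result;
}
```

Each check sets its own error code immediately before it runs, so the
shared exit can report the right reason.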

diff --git a/fs/sysfs/dir.c b/fs/sysfs/dir.c
index 7da42fb..fda141d 100644
--- a/fs/sysfs/dir.c
+++ b/fs/sysfs/dir.c
@@ -394,21 +394,18 @@ static char *sysfs_pathname(struct sysfs_dirent *sd, char *path)
 int sysfs_add_one(struct sysfs_dirent *parent_sd, struct sysfs_dirent *sd)
 {
 	struct iattr *ps_iattr;
+	char *path;
+	int result;
 
 	mutex_lock(&sysfs_mutex);
-	if (sysfs_find_dirent(parent_sd, sd->s_name)) {
-		char *path;
-		mutex_unlock(&sysfs_mutex);
 
-		path = kzalloc(PATH_MAX, GFP_KERNEL);
-		WARN(1, KERN_WARNING
-		     "sysfs: cannot create duplicate filename '%s'\n",
-		     (path == NULL) ? sd->s_name :
-		     strcat(strcat(sysfs_pathname(parent_sd, path), "/"),
-		            sd->s_name));
-		kfree(path);
-		return -EEXIST;
-	}
+	result = -ENOENT;
+	if (parent_sd->s_flags & SYSFS_FLAG_REMOVED)
+		goto out_err;
+
+	result = -EEXIST;
+	if (sysfs_find_dirent(parent_sd, sd->s_name))
+		goto out_err;
 
 	sd->s_parent = sysfs_get(parent_sd);
 	sysfs_link_sibling(sd);
@@ -417,9 +414,22 @@ int sysfs_add_one(struct sysfs_dirent *parent_sd, struct sysfs_dirent *sd)
 	ps_iattr = parent_sd->s_iattr;
 	if (ps_iattr)
 		ps_iattr->ia_ctime = ps_iattr->ia_mtime = CURRENT_TIME;
-
 	mutex_unlock(&sysfs_mutex);
 	return 0;
+
+out_err:
+	mutex_unlock(&sysfs_mutex);
+
+	path = kzalloc(PATH_MAX, GFP_KERNEL);
+	WARN(1, KERN_WARNING "sysfs: cannot create '%s' %s\n",
+		(path == NULL) ? sd->s_name :
+		strcat(strcat(sysfs_pathname(parent_sd, path), "/"),
+		       sd->s_name),
+		(result == -EEXIST ? "duplicate filename" : "no such directory")
+		);
+	kfree(path);
+
+	return result;
 }
 
 /**
-- 
1.6.3.1.54.g99dd.dirty


^ permalink raw reply related	[flat|nested] 200+ messages in thread

* Re: [PATCH 04/24] sysfs: Normalize removing sysfs directories.
  2009-05-29 16:52               ` Eric W. Biederman
@ 2009-05-30 10:43                 ` Tejun Heo
  2009-05-30 13:07                   ` Eric W. Biederman
  0 siblings, 1 reply; 200+ messages in thread
From: Tejun Heo @ 2009-05-30 10:43 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: Andrew Morton, Greg Kroah-Hartman, linux-kernel, Cornelia Huck,
	linux-fsdevel, Kay Sievers, Greg KH, Eric W. Biederman

Eric W. Biederman wrote:
>> Also, I'm quite uncomfortable with these things
>> being done in non-atomic manner.  It can be made to work but things
>> like this can lead to subtle race conditions and with the kind of
>> layering we put on top of sysfs (kobject, driver model, driver
>> midlayers and so on), it isn't all that easy to verify what's going
>> on, so NACK for this one.
> 
> Total nonsense.
> 
> Mucking about with sysfs after we start deleting a directory is a bug.
> At worst my change makes a buggy race slightly less deterministic.
> 
> I am not ready to consider keeping the current unnecessary atomic
> removal step.  That unnecessary atomicity makes the following patches
> more difficult, and requires a lot of unnecessary retesting.
> 
> What do you think the extra unnecessary atomicity helps protect?

It's just not a clean API.  When people are trying to code things way
up in the stack, they aren't likely to look up the code to see what
assumptions are being made especially when the stack is deep and
complex and sysfs is near the bottom of the tall stack.  IMHO
implementing the usually expected semantics at this depth is worth
every effort.  It's just good implementation style which might look
like wasted effort but will harden the stack in the long run.  Plus,
it's not like making it atomic is difficult or anything.

So, still NACK.

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 200+ messages in thread

* Re: [PATCH 22/24] sysfs: Make sysfs_rename_link atomic
  2009-05-29 17:17               ` Eric W. Biederman
@ 2009-05-30 10:48                 ` Tejun Heo
  0 siblings, 0 replies; 200+ messages in thread
From: Tejun Heo @ 2009-05-30 10:48 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: Andrew Morton, Greg Kroah-Hartman, linux-kernel, Cornelia Huck,
	linux-fsdevel, Kay Sievers, Greg KH, Eric W. Biederman

Eric W. Biederman wrote:
> Tejun Heo <tj@kernel.org> writes:
> 
>> Eric W. Biederman wrote:
>>> From: Eric W. Biederman <ebiederm@xmission.com>
>>>
>>> Use the existing sysfs_rename to make sysfs_rename_link an atomic
>>> operation that does less work.  While I am at it, add additional sanity
>>> checking to ensure it is a symlink I am renaming.
>>>
>>> Acked-by: Kay Sievers <kay.sievers@vrfy.org>
>>> Signed-off-by: Eric W. Biederman <ebiederm@aristanetworks.com>
>> It would be really nice to merge or group this together with the first
>> three patches.  Other than that,
> 
> Perfection is the enemy of the good on that one.  That just convolutes
> things unnecessarily, makes the patches harder to review, and requires
> additional testing.
>
> I much prefer to work in a tree without rewinding.

Everything is a matter of degree and I don't think my bar here was too
high.  Unless the patches are already in some exported tree, when the
content of a patchset changes (you're changing the behavior of the same
code twice), reordering patches is what's usually done.  When the end
results stay the same, reshuffling isn't that much work, is it?

Well, I'm not the maintainer and things like this are mostly up to the
maintainer, so let's leave it to be decided by Greg.

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 200+ messages in thread

* Re: [PATCH 04/24] sysfs: Normalize removing sysfs directories.
  2009-05-30 10:43                 ` Tejun Heo
@ 2009-05-30 13:07                   ` Eric W. Biederman
  2009-05-30 13:20                     ` Tejun Heo
                                       ` (2 more replies)
  0 siblings, 3 replies; 200+ messages in thread
From: Eric W. Biederman @ 2009-05-30 13:07 UTC (permalink / raw)
  To: Tejun Heo
  Cc: Andrew Morton, Greg Kroah-Hartman, linux-kernel, Cornelia Huck,
	linux-fsdevel, Kay Sievers, Greg KH, Eric W. Biederman

Tejun Heo <tj@kernel.org> writes:

> Eric W. Biederman wrote:
>>> Also, I'm quite uncomfortable with these things
>>> being done in non-atomic manner.  It can be made to work but things
>>> like this can lead to subtle race conditions and with the kind of
>>> layering we put on top of sysfs (kobject, driver model, driver
>>> midlayers and so on), it isn't all that easy to verify what's going
>>> on, so NACK for this one.
>> 
>> Total nonsense.
>> 
>> Mucking about with sysfs after we start deleting a directory is a bug.
>> At worst my change makes a buggy race slightly less deterministic.
>> 
>> I am not ready to consider keeping the current unnecessary atomic
>> removal step.  That unnecessary atomicity makes the following patches
>> more difficult, and requires a lot of unnecessary retesting.
>> 
>> What do you think the extra unnecessary atomicity helps protect?
>
> It's just not a clean API.  When people are trying to code things way
> up in the stack, they aren't likely to look up the code to see what
> assumptions are being made especially when the stack is deep and
> complex and sysfs is near the bottom of the tall stack.  IMHO
> implementing the usually expected semantics at this depth is worth
> every effort.  It's just good implementation style which might look
> like wasted effort but will harden the stack in the long run.  Plus,
> it's not like making it atomic is difficult or anything.

I guess we are going to have to disagree on this one.

My take is simply that a correct user has to wait until no one else
can find the kobject before calling kobject_del.  At which point
races are impossible, and it doesn't matter if sysfs_mutex is held
across the entire operation.


For the long term I still intend to kill __sysfs_remove_dir.  Just
not in this patch series.

Eric

^ permalink raw reply	[flat|nested] 200+ messages in thread

* Re: [PATCH 04/24] sysfs: Normalize removing sysfs directories.
  2009-05-30 13:07                   ` Eric W. Biederman
@ 2009-05-30 13:20                     ` Tejun Heo
  2009-05-30 14:29                       ` Eric W. Biederman
  2009-05-30 13:59                     ` Kay Sievers
  2009-05-30 14:19                     ` James Bottomley
  2 siblings, 1 reply; 200+ messages in thread
From: Tejun Heo @ 2009-05-30 13:20 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: Andrew Morton, Greg Kroah-Hartman, linux-kernel, Cornelia Huck,
	linux-fsdevel, Kay Sievers, Greg KH, Eric W. Biederman

Hello,

Eric W. Biederman wrote:
> I guess we are going to have to disagree on this one.

Yeap, seems like it.

> My take is simply that a correct user has to wait until no one else
> can find the kobject before calling kobject_del.  At which point
> races are impossible, and it doesn't matter if sysfs_mutex is held
> across the entire operation.

This one also is a matter of degree.  Way back when users could crash
sysfs reliably from userland, the sysfs code had a lot of assumptions
about object lifetime and synchronizaion which even the sysfs code
itself didn't really follow leading to fragility.  My focus while
restructuring the code was to make the code behave as expected by the
usual conventions.  It could be that I'm a bit paranoid about this,
but in general I really don't like when low level code doesn't do its
due diligence to save several hours of effort to implement clean
semantics, but again there's nothing wrong with your due and my due
being different.

I guess I'll have to pass the buck to Greg again with my rather strong
NACK.

> For the long term I still intend to kill __sysfs_remove_dir.  Just
> not in this patch series.

Yeah, sysfs code is in the middle of two sane ways.  sysfs either has
to deal with children creation/removal including atomicity of the
operations or it should force its users to do so.  I prefer the former
but any would be better than the current situation.

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 200+ messages in thread

* Re: [PATCH 04/24] sysfs: Normalize removing sysfs directories.
  2009-05-30 13:07                   ` Eric W. Biederman
  2009-05-30 13:20                     ` Tejun Heo
@ 2009-05-30 13:59                     ` Kay Sievers
  2009-05-30 14:19                     ` James Bottomley
  2 siblings, 0 replies; 200+ messages in thread
From: Kay Sievers @ 2009-05-30 13:59 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: Tejun Heo, Andrew Morton, Greg Kroah-Hartman, linux-kernel,
	Cornelia Huck, linux-fsdevel, Greg KH, Eric W. Biederman

On Sat, May 30, 2009 at 15:07, Eric W. Biederman <ebiederm@xmission.com> wrote:

> My take is simply that a correct user has to wait until no one else
> can find the kobject before calling kobject_del.  At which point
> races are impossible, and it doesn't matter if sysfs_mutex is held
> across the entire operation.

We have circular references between kobjects in some "advanced" users.
We can not break the circle without calling _del() on one of the
objects involved, but at that time all of the objects are still
referenced by another object. Only a _del() will break the circle, and
result in a release of the objects involved. Not sure how that would
behave with this change?

Thanks,
Kay

^ permalink raw reply	[flat|nested] 200+ messages in thread

* Re: [PATCH 04/24] sysfs: Normalize removing sysfs directories.
  2009-05-30 13:07                   ` Eric W. Biederman
  2009-05-30 13:20                     ` Tejun Heo
  2009-05-30 13:59                     ` Kay Sievers
@ 2009-05-30 14:19                     ` James Bottomley
  2009-05-30 15:15                       ` Eric W. Biederman
  2 siblings, 1 reply; 200+ messages in thread
From: James Bottomley @ 2009-05-30 14:19 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: Tejun Heo, Andrew Morton, Greg Kroah-Hartman, linux-kernel,
	Cornelia Huck, linux-fsdevel, Kay Sievers, Greg KH,
	Eric W. Biederman

On Sat, 2009-05-30 at 06:07 -0700, Eric W. Biederman wrote:
> Tejun Heo <tj@kernel.org> writes:
> 
> > Eric W. Biederman wrote:
> >>> Also, I'm quite uncomfortable with these things
> >>> being done in non-atomic manner.  It can be made to work but things
> >>> like this can lead to subtle race conditions and with the kind of
> >>> layering we put on top of sysfs (kobject, driver model, driver
> >>> midlayers and so on), it isn't all that easy to verify what's going
> >>> on, so NACK for this one.
> >> 
> >> Total nonsense.
> >> 
> >> Mucking about with sysfs after we start deleting a directory is a bug.
> >> At worst my change makes a buggy race slightly less deterministic.
> >> 
> >> I am not ready to consider keeping the current unnecessary atomic
> >> removal step.  That unnecessary atomicity makes the following patches
> >> more difficult, and requires a lot of unnecessary retesting.
> >> 
> >> What do you think the extra unnecessary atomicity helps protect?
> >
> > It's just not a clean API.  When people are trying to code things way
> > up in the stack, they aren't likely to look up the code to see what
> > assumptions are being made especially when the stack is deep and
> > complex and sysfs is near the bottom of the tall stack.  IMHO
> > implementing the usually expected semantics at this depth is worth
> > every effort.  It's just good implementation style which might look
> > like wasted effort but will harden the stack in the long run.  Plus,
> > it's not like making it atomic is difficult or anything.
> 
> I guess we are going to have to disagree on this one.
> 
> My take is simply that a correct user has to wait until no one else
> can find the kobject before calling kobject_del.  At which point
> races are impossible, and it doesn't matter if sysfs_mutex is held
> across the entire operation.

I'm afraid this one isn't a valid assumption.  If you look in SCSI,
you'll see we do get objects after they've been removed from visibility.
We use it as part of the state model for how our objects work (objects
removed from visibility are dying, but we still need them to be findable
and gettable).

Now, this could be altered as part of an object lifetime rewrite of SCSI
(and I suspect a few other subsystems) but it's certainly an open
question of whether the pain is worth the gain.

James



^ permalink raw reply	[flat|nested] 200+ messages in thread

* Re: [PATCH 04/24] sysfs: Normalize removing sysfs directories.
  2009-05-30 13:20                     ` Tejun Heo
@ 2009-05-30 14:29                       ` Eric W. Biederman
  0 siblings, 0 replies; 200+ messages in thread
From: Eric W. Biederman @ 2009-05-30 14:29 UTC (permalink / raw)
  To: Tejun Heo
  Cc: Andrew Morton, Greg Kroah-Hartman, linux-kernel, Cornelia Huck,
	linux-fsdevel, Kay Sievers, Greg KH, Eric W. Biederman

Tejun Heo <tj@kernel.org> writes:

> Hello,
>
> Eric W. Biederman wrote:
>> I guess we are going to have to disagree on this one.
>
> Yeap, seems like it.
>
>> My take is simply that a correct user has to wait until no one else
>> can find the kobject before calling kobject_del.  At which point
>> races are impossible, and it doesn't matter if sysfs_mutex is held
>> across the entire operation.
>
> This one also is a matter of degree.  Way back when users could crash
> sysfs reliably from userland, the sysfs code had a lot of assumptions
> about object lifetime and synchronization which even the sysfs code
> itself didn't really follow leading to fragility.  My focus while
> restructuring the code was to make the code behave as expected by the
> usual conventions.  It could be that I'm a bit paranoid about this,
> but in general I really don't like when low level code doesn't do its
> due diligence to save several hours of effort to implement clean
> semantics, but again there's nothing wrong with your due and my due
> being different.

Given a history of user space crashing sysfs, I can see why you
would be nervous.

I come from the perspective of sysfs still having bugs (although not
easily triggered), but the code is so complicated that people can't
see them because there is just so much code.  So I value every little
bit of extra simplicity I can.

>> For the long term I still intend to kill __sysfs_remove_dir.  Just
>> not in this patch series.
>
> Yeah, sysfs code is in the middle of two sane ways.  sysfs either has
> to deal with children creation/removal including atomicity of the
> operations or it should force its users to do so.  I prefer the former
> but any would be better than the current situation.

What user space cares about is that we perform the additions in
the following order:

kobject_add()
sysfs_add_file()
kobject_uevent()

So that when user space receives the event it can consult sysfs and
be safe in assuming all of the attributes for the device are present.

Since device_add calls kobject_uevent, we must have all of the
attributes set up so that device_add can add them to sysfs
before it calls kobject_uevent.


So for most of the kernel today, the right solution is to set up attribute
groups and set them on the device before device_add is called.  For
most of the code that really isn't too hard.

That also gets all of the attributes removed in device_del
before sysfs_remove_dir is called.

So for the common case all of the complexity is handled at the device
layer today.
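
As a rough illustration of that ordering, here is a user-space model
(not kernel code).  Everything in it is an assumption made for the
sketch: struct model_device, model_device_add(), model_device_del(),
model_uevent() and model_attr_visible() are hypothetical names, and the
"groups" field is just a NULL-terminated list of attribute names
standing in for attribute groups set on the device before it is added.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define MODEL_MAX_ATTRS 8

/* Hypothetical stand-in for a device and its sysfs directory. */
struct model_device {
	const char *const *groups;		/* NULL-terminated names, set before add */
	const char *visible[MODEL_MAX_ATTRS];	/* attributes currently visible */
	size_t nr_visible;
	bool dir_visible;
	bool event_sent;
	bool event_saw_all_attrs;
};

/* True if the named attribute is currently visible in the model. */
bool model_attr_visible(const struct model_device *dev, const char *name)
{
	for (size_t i = 0; i < dev->nr_visible; i++)
		if (strcmp(dev->visible[i], name) == 0)
			return true;
	return false;
}

/*
 * Models the event plus a user-space listener that, on receiving it,
 * looks up every attribute it expects to find.
 */
static void model_uevent(struct model_device *dev)
{
	bool all = dev->dir_visible;

	for (size_t i = 0; dev->groups && dev->groups[i]; i++)
		all = all && model_attr_visible(dev, dev->groups[i]);
	dev->event_sent = true;
	dev->event_saw_all_attrs = all;
}

/* Directory first, then the attributes, and the event last. */
void model_device_add(struct model_device *dev)
{
	dev->dir_visible = true;
	for (size_t i = 0; dev->groups && dev->groups[i] &&
			   dev->nr_visible < MODEL_MAX_ATTRS; i++)
		dev->visible[dev->nr_visible++] = dev->groups[i];
	model_uevent(dev);
}

/* Teardown in reverse: attributes go away before the directory does. */
void model_device_del(struct model_device *dev)
{
	dev->nr_visible = 0;
	dev->dir_visible = false;
}
```

Because the attribute list is attached before model_device_add() runs,
the listener in this model always finds every attribute when the event
arrives; adding attributes after the event would leave a window in
which it does not.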

There are a few more cases but that just takes time.


When I add that to the fact that there are cases where sysfs simply
gets it wrong by deleting things like /sys/dev/char and
/sys/dev/block, I can't see keeping __sysfs_remove_dir indefinitely.


Just adding my warning has turned up an issue in the scsi target vs
host lifetime rules that does the wrong thing in sysfs, and that isn't
something I can see sysfs making any easier by trying to be helpful.

Eric

^ permalink raw reply	[flat|nested] 200+ messages in thread

* Re: [PATCH 04/24] sysfs: Normalize removing sysfs directories.
  2009-05-30 14:19                     ` James Bottomley
@ 2009-05-30 15:15                       ` Eric W. Biederman
  2009-05-30 15:51                         ` James Bottomley
  0 siblings, 1 reply; 200+ messages in thread
From: Eric W. Biederman @ 2009-05-30 15:15 UTC (permalink / raw)
  To: James Bottomley
  Cc: Tejun Heo, Andrew Morton, Greg Kroah-Hartman, linux-kernel,
	Cornelia Huck, linux-fsdevel, Kay Sievers, Greg KH,
	Eric W. Biederman

James Bottomley <James.Bottomley@HansenPartnership.com> writes:

> On Sat, 2009-05-30 at 06:07 -0700, Eric W. Biederman wrote:
>> Tejun Heo <tj@kernel.org> writes:
>> 
>> > Eric W. Biederman wrote:
>> >>> Also, I'm quite uncomfortable with these things
>> >>> being done in non-atomic manner.  It can be made to work but things
>> >>> like this can lead to subtle race conditions and with the kind of
>> >>> layering we put on top of sysfs (kobject, driver model, driver
>> >>> midlayers and so on), it isn't all that easy to verify what's going
>> >>> on, so NACK for this one.
>> >> 
>> >> Total nonsense.
>> >> 
>> >> Mucking about with sysfs after we start deleting a directory is a bug.
>> >> At worst my change makes a buggy race slightly less deterministic.
>> >> 
>> >> I am not ready to consider keeping the current unnecessary atomic
>> >> removal step.  That unnecessary atomicity makes the following patches
>> >> more difficult, and requires a lot of unnecessary retesting.
>> >> 
>> >> What do you think the extra unnecessary atomicity helps protect?
>> >
>> > It's just not a clean API.  When people are trying to code things way
>> > up in the stack, they aren't likely to look up the code to see what
>> > assumptions are being made especially when the stack is deep and
>> > complex and sysfs is near the bottom of the tall stack.  IMHO
>> > implementing the usually expected semantics at this depth is worth
>> > every effort.  It's just good implementation style which might look
>> > like wasted effort but will harden the stack in the long run.  Plus,
>> > it's not like making it atomic is difficult or anything.
>> 
>> I guess we are going to have to disagree on this one.
>> 
>> My take is simply that a correct user has to wait until no one else
>> can find the kobject before calling kobject_del.  At which point
>> races are impossible, and it doesn't matter if sysfs_mutex is held
>> across the entire operation.
>
> I'm afraid this one isn't a valid assumption.  If you look in SCSI,
> you'll see we do get objects after they've been removed from visibility.
> We use it as part of the state model for how our objects work (objects
> removed from visibility are dying, but we still need them to be findable
> and gettable).

I was not precise enough.  It appears I overlooked the fact that
kobject_del is not always called from kobject_put by way of
kobject_release.

Strictly the requirement is that after kobject_del we don't add,
remove or otherwise manipulate sysfs attributes.  That is we don't
call any of:

sysfs_add_file
sysfs_create_file
sysfs_create_bin_file
sysfs_remove_file
sysfs_remove_bin_file
sysfs_create_link
sysfs_remove_link
sysfs_create_group
sysfs_remove_group
sysfs_create_subdir
sysfs_remove_subdir


Those all either oops or BUG today if you try it.  So I can't see how
a subsystem could depend on those working.

Also there is sysfs_remove_dir (on a subdirectory) aka kobject_del on
a child object after kobject_del on the parent object.

As best I can tell that only works by fluke today.

> Now, this could be altered as part of an object lifetime rewrite of SCSI
> (and I suspect a few other subsystems) but it's certainly an open
> question of whether the pain is worth the gain.

I won't tell you that sysfs, the kobject layer, or the device layer
are the best thing since sliced bread.  I'm just trying to simplify
the code and get the bugs out.

Eric

^ permalink raw reply	[flat|nested] 200+ messages in thread

* Re: [PATCH 04/24] sysfs: Normalize removing sysfs directories.
  2009-05-30 15:15                       ` Eric W. Biederman
@ 2009-05-30 15:51                         ` James Bottomley
  2009-05-30 21:20                           ` Eric W. Biederman
  0 siblings, 1 reply; 200+ messages in thread
From: James Bottomley @ 2009-05-30 15:51 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: Tejun Heo, Andrew Morton, Greg Kroah-Hartman, linux-kernel,
	Cornelia Huck, linux-fsdevel, Kay Sievers, Greg KH,
	Eric W. Biederman

On Sat, 2009-05-30 at 08:15 -0700, Eric W. Biederman wrote:
> James Bottomley <James.Bottomley@HansenPartnership.com> writes:
> 
> > On Sat, 2009-05-30 at 06:07 -0700, Eric W. Biederman wrote:
> >> Tejun Heo <tj@kernel.org> writes:
> >> 
> >> > Eric W. Biederman wrote:
> >> >>> Also, I'm quite uncomfortable with these things
> >> >>> being done in non-atomic manner.  It can be made to work but things
> >> >>> like this can lead to subtle race conditions and with the kind of
> >> >>> layering we put on top of sysfs (kobject, driver model, driver
> >> >>> midlayers and so on), it isn't all that easy to verify what's going
> >> >>> on, so NACK for this one.
> >> >> 
> >> >> Total nonsense.
> >> >> 
> >> >> Mucking about with sysfs after we start deleting a directory is a bug.
> >> >> At worst my change makes a buggy race slightly less deterministic.
> >> >> 
> >> >> I am not ready to consider keeping the current unnecessary atomic
> >> >> removal step.  That unnecessary atomicity makes the following patches
> >> >> more difficult, and requires a lot of unnecessary retesting.
> >> >> 
> >> >> What do you think the extra unnecessary atomicity helps protect?
> >> >
> >> > It's just not a clean API.  When people are trying to code things way
> >> > up in the stack, they aren't likely to look up the code to see what
> >> > assumptions are being made especially when the stack is deep and
> >> > complex and sysfs is near the bottom of the tall stack.  IMHO
> >> > implementing the usually expected semantics at this depth is worth
> >> > every effort.  It's just good implementation style which might look
> >> > like wasted effort but will harden the stack in the long run.  Plus,
> >> > it's not like making it atomic is difficult or anything.
> >> 
> >> I guess we are going to have to disagree on this one.
> >> 
> >> My take is simply that a correct user has to wait until no one else
> >> can find the kobject before calling kobject_del.  At which point
> >> races are impossible, and it doesn't matter if sysfs_mutex is held
> >> across the entire operation.
> >
> > I'm afraid this one isn't a valid assumption.  If you look in SCSI,
> > you'll see we do get objects after they've been removed from visibility.
> > We use it as part of the state model for how our objects work (objects
> > removed from visibility are dying, but we still need them to be findable
> > and gettable).
> 
> I was not precise enough.  It appears I overlooked the fact that
> kobject_del is not always called from kobject_put by way of
> kobject_release.

OK ... just so you understand, I'm thinking about the device model
rather than kobjects.  device_del() can't be called from release methods
because they're often called from interrupt context and the mutex
requirements in device_del() mean it needs user context.

> Strictly the requirement is that after kobject_del we don't add,
> remove or otherwise manipulate sysfs attributes.  That is we don't
> call any of:
> 
> sysfs_add_file
> sysfs_create_file
> sysfs_create_bin_file
> sysfs_remove_file
> sysfs_remove_bin_file
> sysfs_create_link
> sysfs_remove_link
> sysfs_create_group
> sysfs_remove_group
> sysfs_create_subdir
> sysfs_remove_subdir
> 
> 
> Those all either oops or BUG today if you try it.  So I can't see how
> a subsystem could depend on those working.

It doesn't; you've altered your requirement.  We can fully buy into this
new relaxed one.

> Also there is sysfs_remove_dir (on a subdirectory) aka kobject_del on
> a child object after kobject_del on the parent object.
> 
> As best I can tell that only works by fluke today.

Yes, that's an artifact of the fact that the reference counted lifecycle
is on release ... del just happens at a certain point in it.  We don't
hold any counters that tell us what the visibility of our children is,
so it's possible to make a parent invisible by calling del simply
because you don't know.

> > Now, this could be altered as part of an object lifetime rewrite of SCSI
> > (and I suspect a few other subsystems) but it's certainly an open
> > question of whether the pain is worth the gain.
> 
> I won't tell you that sysfs, the kobject layer, or the device layer
> are the best thing since sliced bread.  I'm just trying to simplify
> the code and get the bugs out.

James



^ permalink raw reply	[flat|nested] 200+ messages in thread

* Re: [PATCH 04/24] sysfs: Normalize removing sysfs directories.
  2009-05-30 15:51                         ` James Bottomley
@ 2009-05-30 21:20                           ` Eric W. Biederman
  0 siblings, 0 replies; 200+ messages in thread
From: Eric W. Biederman @ 2009-05-30 21:20 UTC (permalink / raw)
  To: James Bottomley
  Cc: Tejun Heo, Andrew Morton, Greg Kroah-Hartman, linux-kernel,
	Cornelia Huck, linux-fsdevel, Kay Sievers, Greg KH,
	Eric W. Biederman

James Bottomley <James.Bottomley@HansenPartnership.com> writes:

>> >> My take is simply that a correct user has to wait until no one else
>> >> can find the kobject before calling kobject_del.  At which point
>> >> races are impossible, and it doesn't matter if sysfs_mutex is held
>> >> across the entire operation.
>> >
>> > I'm afraid this one isn't a valid assumption.  If you look in SCSI,
>> > you'll see we do get objects after they've been removed from visibility.
>> > We use it as part of the state model for how our objects work (objects
>> > removed from visibility are dying, but we still need them to be findable
>> > and gettable).
>> 
>> I was not precise enough.  It appears I overlooked the fact that
>> kobject_del is not always called from kobject_put by way of
>> kobject_release.
>
> OK ... just so you understand, I'm thinking about the device model
> rather than kobjects.  device_del() can't be called from release methods
> because they're often called from interrupt context and the mutex
> requirements in device_del() mean it needs user context.

Makes sense.

>> Strictly the requirement is that after kobject_del we don't add,
>> remove or otherwise manipulate sysfs attributes.  That is we don't
>> call any of:
>> 
>> sysfs_add_file
>> sysfs_create_file
>> sysfs_create_bin_file
>> sysfs_remove_file
>> sysfs_remove_bin_file
>> sysfs_create_link
>> sysfs_remove_link
>> sysfs_create_group
>> sysfs_remove_group
>> sysfs_create_subdir
>> sysfs_remove_subdir
>> 
>> 
>> Those all either oops or BUG today if you try it.  So I can't see how
>> a subsystem could depend on those working.
>
> It doesn't; you've altered your requirement.  We can fully buy into this
> new relaxed one.

My apologies for misstating it earlier.  Sometimes translating what
is happening in sysfs up to the device model can be a bit of a challenge.

At the sysfs layer the requirement is all the same.  Don't mess with a
directory as or after you have deleted it.


To recap, my change that Tejun has a problem with is simply that I have
refactored sysfs_remove_dir so that, if there are directory entries
present, a very fast observer in the kernel or in user space can see
each directory entry being deleted individually before I delete the
directory itself.

This is because I now drop and reacquire the sysfs_mutex in between
each delete.

As the upper layers must already avoid messing with the attributes
of a sysfs directory from the time we call kobject_del I don't
see that this makes any difference to them.
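
The difference between the two removal styles can be sketched as a
user-space model.  This is an illustration only, not the sysfs code:
struct model_dir, model_remove_dir_atomic() and
model_remove_dir_stepwise() are hypothetical names, a pthread mutex
stands in for sysfs_mutex, and "deleting" an entry just shrinks a
counter.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

#define MODEL_MAX_CHILDREN 8

/* Hypothetical stand-in for a sysfs directory. */
struct model_dir {
	pthread_mutex_t lock;		/* stands in for sysfs_mutex */
	const char *children[MODEL_MAX_CHILDREN];
	size_t nr_children;
	bool removed;
	unsigned int lock_rounds;	/* times removal took the lock */
};

/* Atomic style: children and directory go away in one critical section. */
void model_remove_dir_atomic(struct model_dir *dir)
{
	pthread_mutex_lock(&dir->lock);
	dir->lock_rounds++;
	dir->nr_children = 0;
	dir->removed = true;
	pthread_mutex_unlock(&dir->lock);
}

/*
 * Stepwise style: the lock is dropped and retaken between deletes, so
 * another thread taking the lock in between can see a partly emptied
 * directory that is not yet marked removed.
 */
void model_remove_dir_stepwise(struct model_dir *dir)
{
	bool more = true;

	while (more) {
		pthread_mutex_lock(&dir->lock);
		dir->lock_rounds++;
		if (dir->nr_children > 0) {
			dir->nr_children--;	/* drop the last entry */
		} else {
			dir->removed = true;	/* empty: remove the directory */
			more = false;
		}
		pthread_mutex_unlock(&dir->lock);
	}
}
```

With three entries the stepwise variant takes the lock four times
(once per entry, once for the directory), against a single time for
the atomic variant; the end state is the same in both.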

>> Also there is sysfs_remove_dir (on a subdirectory) aka kobject_del on
>> a child object after kobject_del on the parent object.
>> 
>> As best I can tell that only works by fluke today.
>
> Yes, that's an artifact of the fact that the reference counted lifecycle
> is on release ... del just happens at a certain point in it.  We don't
> hold any counters that tell us what the visibility of our children is,
> so it's possible to make a parent invisible by calling del simply
> because you don't know.

Strictly speaking my changes don't affect this part either except to
issue a warning that something unexpected is going on.

Eric

^ permalink raw reply	[flat|nested] 200+ messages in thread

* patch driver-core-use-sysfs_rename_link-in-device_rename.patch added to gregkh-2.6 tree
  2009-05-29 20:19             ` [PATCH 02/26] driver core: Use sysfs_rename_link in device_rename Eric W. Biederman
@ 2009-06-02 22:57               ` gregkh
  0 siblings, 0 replies; 200+ messages in thread
From: gregkh @ 2009-06-02 22:57 UTC (permalink / raw)
  To: ebiederm, akpm, cornelia.huck, ebiederm, gregkh, greg,
	kay.sievers, linux-fsdevel


This is a note to let you know that I've just added the patch titled

    Subject: driver core: Use sysfs_rename_link in device_rename

to my gregkh-2.6 tree.  Its filename is

    driver-core-use-sysfs_rename_link-in-device_rename.patch

This tree can be found at 
    http://www.kernel.org/pub/linux/kernel/people/gregkh/gregkh-2.6/patches/


From ebiederm@xmission.com  Tue Jun  2 15:34:29 2009
From: "Eric W. Biederman" <ebiederm@xmission.com>
Date: Fri, 29 May 2009 13:19:12 -0700
Subject: driver core: Use sysfs_rename_link in device_rename
To: Andrew Morton <akpm@linux-foundation.org>, Greg Kroah-Hartman <gregkh@suse.de>
Cc: Tejun Heo <tj@kernel.org>, Cornelia Huck <cornelia.huck@de.ibm.com>, <linux-fsdevel@vger.kernel.org>, Kay Sievers <kay.sievers@vrfy.org>, Greg KH <greg@kroah.com>, "Eric W. Biederman" <ebiederm@xmission.com>, "Eric W. Biederman" <ebiederm@aristanetworks.com>
Message-ID: <1243628376-22905-2-git-send-email-ebiederm@xmission.com>


From: Eric W. Biederman <ebiederm@xmission.com>

Don't open code the renaming of symlinks in sysfs;
instead use the new helper function sysfs_rename_link.

Signed-off-by: Eric W. Biederman <ebiederm@aristanetworks.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

---
 drivers/base/core.c |   18 ++++++------------
 1 file changed, 6 insertions(+), 12 deletions(-)

--- a/drivers/base/core.c
+++ b/drivers/base/core.c
@@ -1546,22 +1546,16 @@ int device_rename(struct device *dev, ch
 	if (old_class_name) {
 		new_class_name = make_class_name(dev->class->name, &dev->kobj);
 		if (new_class_name) {
-			error = sysfs_create_link_nowarn(&dev->parent->kobj,
-							 &dev->kobj,
-							 new_class_name);
-			if (error)
-				goto out;
-			sysfs_remove_link(&dev->parent->kobj, old_class_name);
+			error = sysfs_rename_link(&dev->parent->kobj,
+						  &dev->kobj,
+						  old_class_name,
+						  new_class_name);
 		}
 	}
 #else
 	if (dev->class) {
-		error = sysfs_create_link_nowarn(&dev->class->p->class_subsys.kobj,
-						 &dev->kobj, dev_name(dev));
-		if (error)
-			goto out;
-		sysfs_remove_link(&dev->class->p->class_subsys.kobj,
-				  old_device_name);
+		error = sysfs_rename_link(&dev->class->p->class_subsys.kobj,
+					  &dev->kobj, old_device_name, new_name);
 	}
 #endif
 


^ permalink raw reply	[flat|nested] 200+ messages in thread

* patch sysfs-implement-sysfs_rename_link.patch added to gregkh-2.6 tree
  2009-05-29 20:19             ` [PATCH 01/26] sysfs: Implement sysfs_rename_link Eric W. Biederman
@ 2009-06-02 22:57               ` gregkh
  0 siblings, 0 replies; 200+ messages in thread
From: gregkh @ 2009-06-02 22:57 UTC (permalink / raw)
  To: ebiederm, akpm, benjamin.thery, cornelia.huck, dlezcano,
	ebiederm, gregkh, greg


This is a note to let you know that I've just added the patch titled

    Subject: sysfs: Implement sysfs_rename_link

to my gregkh-2.6 tree.  Its filename is

    sysfs-implement-sysfs_rename_link.patch

This tree can be found at 
    http://www.kernel.org/pub/linux/kernel/people/gregkh/gregkh-2.6/patches/


>From ebiederm@xmission.com  Tue Jun  2 15:34:06 2009
From: "Eric W. Biederman" <ebiederm@xmission.com>
Date: Fri, 29 May 2009 13:19:11 -0700
Subject: sysfs: Implement sysfs_rename_link
To: Andrew Morton <akpm@linux-foundation.org>, Greg Kroah-Hartman <gregkh@suse.de>
Cc: Tejun Heo <tj@kernel.org>, Cornelia Huck <cornelia.huck@de.ibm.com>, <linux-fsdevel@vger.kernel.org>, Kay Sievers <kay.sievers@vrfy.org>, Greg KH <greg@kroah.com>, "Eric W. Biederman" <ebiederm@xmission.com>, Benjamin Thery <benjamin.thery@bull.net>, Daniel Lezcano <dlezcano@fr.ibm.com>, "Eric W. Biederman" <ebiederm@aristanetworks.com>
Message-ID: <1243628376-22905-1-git-send-email-ebiederm@xmission.com>


From: Eric W. Biederman <ebiederm@xmission.com>

Because of rename ordering problems we occasionally give false
warnings about invalid sysfs operations, so implement a helper
function for this common sysfs idiom.

This is a stripped down version of an earlier patch that
also added sysfs_delete_link.

Cc: Benjamin Thery <benjamin.thery@bull.net>
Cc: Daniel Lezcano <dlezcano@fr.ibm.com>
Cc: Tejun Heo <tj@kernel.org>
Signed-off-by: Eric W. Biederman <ebiederm@aristanetworks.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

---
 fs/sysfs/symlink.c    |   16 ++++++++++++++++
 include/linux/sysfs.h |    9 +++++++++
 2 files changed, 25 insertions(+)

--- a/fs/sysfs/symlink.c
+++ b/fs/sysfs/symlink.c
@@ -122,6 +122,22 @@ void sysfs_remove_link(struct kobject * 
 	sysfs_hash_and_remove(parent_sd, name);
 }
 
+/**
+ *	sysfs_rename_link - rename symlink in object's directory.
+ *	@kobj:	object we're acting for.
+ *	@targ:	object we're pointing to.
+ *	@old:	previous name of the symlink.
+ *	@new:	new name of the symlink.
+ *
+ *	A helper function for the common rename symlink idiom.
+ */
+int sysfs_rename_link(struct kobject *kobj, struct kobject *targ,
+			const char *old, const char *new)
+{
+	sysfs_remove_link(kobj, old);
+	return sysfs_create_link(kobj, targ, new);
+}
+
 static int sysfs_get_target_path(struct sysfs_dirent *parent_sd,
 				 struct sysfs_dirent *target_sd, char *path)
 {
--- a/include/linux/sysfs.h
+++ b/include/linux/sysfs.h
@@ -109,6 +109,9 @@ int __must_check sysfs_create_link_nowar
 					  const char *name);
 void sysfs_remove_link(struct kobject *kobj, const char *name);
 
+int sysfs_rename_link(struct kobject *kobj, struct kobject *target,
+			const char *old_name, const char *new_name);
+
 int __must_check sysfs_create_group(struct kobject *kobj,
 				    const struct attribute_group *grp);
 int sysfs_update_group(struct kobject *kobj,
@@ -202,6 +205,12 @@ static inline void sysfs_remove_link(str
 {
 }
 
+static inline int sysfs_rename_link(struct kobject *k, struct kobject *t,
+				    const char *old_name, const char *new_name)
+{
+	return 0;
+}
+
 static inline int sysfs_create_group(struct kobject *kobj,
 				     const struct attribute_group *grp)
 {


^ permalink raw reply	[flat|nested] 200+ messages in thread
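
The helper in the patch above removes the old link first and only then
creates the new one.  A minimal userspace sketch of why that ordering avoids
a spurious duplicate-name warning follows.  Everything named toy_* is
hypothetical and is not kernel code, and the same-name rename used to
trigger the collision is an assumed scenario: the commit message only says
"rename ordering problems" without describing a specific case.

```c
#include <assert.h>
#include <errno.h>
#include <string.h>

/* Hypothetical stand-in for one sysfs directory: a small table of link names. */
#define TOY_MAX_LINKS 8
#define TOY_NAME_LEN  32

struct toy_dir {
	char names[TOY_MAX_LINKS][TOY_NAME_LEN];
	int  used[TOY_MAX_LINKS];
	int  warnings;		/* counts "duplicate filename" warnings */
};

/* Return the slot holding @name, or -1 if no such link exists. */
static int toy_find(const struct toy_dir *d, const char *name)
{
	for (int i = 0; i < TOY_MAX_LINKS; i++)
		if (d->used[i] && strcmp(d->names[i], name) == 0)
			return i;
	return -1;
}

/* Create a link; a duplicate name warns and fails with -EEXIST. */
static int toy_create_link(struct toy_dir *d, const char *name)
{
	if (strlen(name) >= TOY_NAME_LEN)
		return -ENAMETOOLONG;
	if (toy_find(d, name) >= 0) {
		d->warnings++;
		return -EEXIST;
	}
	for (int i = 0; i < TOY_MAX_LINKS; i++) {
		if (!d->used[i]) {
			strcpy(d->names[i], name);
			d->used[i] = 1;
			return 0;
		}
	}
	return -ENOSPC;
}

/* Remove a link; removing a missing name is a no-op. */
static void toy_remove_link(struct toy_dir *d, const char *name)
{
	int i = toy_find(d, name);

	if (i >= 0)
		d->used[i] = 0;
}

/* Open-coded ordering: create the new name, then drop the old one. */
static int toy_rename_create_first(struct toy_dir *d, const char *old_name,
				   const char *new_name)
{
	int error = toy_create_link(d, new_name);

	if (error)
		return error;
	toy_remove_link(d, old_name);
	return 0;
}

/* Helper ordering: drop the old name first, then create the new one. */
static int toy_rename_link(struct toy_dir *d, const char *old_name,
			   const char *new_name)
{
	toy_remove_link(d, old_name);
	return toy_create_link(d, new_name);
}
```

In this model a create-first rename of "eth0" to "eth0" trips the duplicate
warning, while the remove-first helper does not.  The remove-first ordering
is not atomic: between the two steps neither name exists.  The thread
overview lists a later patch in this series, "sysfs: Make sysfs_rename_link
atomic", that addresses this.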

* patch sysfs-remove-now-unnecessary-error-reporting-suppression.patch added to gregkh-2.6 tree
  2009-05-29 20:19             ` [PATCH 03/26] sysfs: Remove now unnecessary error reporting suppression Eric W. Biederman
@ 2009-06-02 22:57               ` gregkh
  0 siblings, 0 replies; 200+ messages in thread
From: gregkh @ 2009-06-02 22:57 UTC (permalink / raw)
  To: ebiederm, akpm, cornelia.huck, ebiederm, gregkh, greg,
	kay.sievers, linux-fsdevel


This is a note to let you know that I've just added the patch titled

    Subject: sysfs: Remove now unnecessary error reporting suppression.

to my gregkh-2.6 tree.  Its filename is

    sysfs-remove-now-unnecessary-error-reporting-suppression.patch

This tree can be found at 
    http://www.kernel.org/pub/linux/kernel/people/gregkh/gregkh-2.6/patches/


>From ebiederm@xmission.com  Tue Jun  2 15:35:03 2009
From: "Eric W. Biederman" <ebiederm@xmission.com>
Date: Fri, 29 May 2009 13:19:13 -0700
Subject: sysfs: Remove now unnecessary error reporting suppression.
To: Andrew Morton <akpm@linux-foundation.org>, Greg Kroah-Hartman <gregkh@suse.de>
Cc: Tejun Heo <tj@kernel.org>, Cornelia Huck <cornelia.huck@de.ibm.com>, <linux-fsdevel@vger.kernel.org>, Kay Sievers <kay.sievers@vrfy.org>, Greg KH <greg@kroah.com>, "Eric W. Biederman" <ebiederm@xmission.com>, "Eric W. Biederman" <ebiederm@aristanetworks.com>
Message-ID: <1243628376-22905-3-git-send-email-ebiederm@xmission.com>


From: Eric W. Biederman <ebiederm@xmission.com>

Now that we use sysfs_rename_link in the places we previously
used sysfs_create_link_nowarn, we can remove sysfs_create_link_nowarn
and all its supporting infrastructure, as it has no callers.

Signed-off-by: Eric W. Biederman <ebiederm@aristanetworks.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

---
 fs/sysfs/dir.c     |   54 +++++++++++------------------------------------------
 fs/sysfs/symlink.c |   42 ++++++++---------------------------------
 fs/sysfs/sysfs.h   |    1 
 3 files changed, 21 insertions(+), 76 deletions(-)

--- a/fs/sysfs/dir.c
+++ b/fs/sysfs/dir.c
@@ -397,43 +397,6 @@ void sysfs_addrm_start(struct sysfs_addr
 }
 
 /**
- *	__sysfs_add_one - add sysfs_dirent to parent without warning
- *	@acxt: addrm context to use
- *	@sd: sysfs_dirent to be added
- *
- *	Get @acxt->parent_sd and set sd->s_parent to it and increment
- *	nlink of parent inode if @sd is a directory and link into the
- *	children list of the parent.
- *
- *	This function should be called between calls to
- *	sysfs_addrm_start() and sysfs_addrm_finish() and should be
- *	passed the same @acxt as passed to sysfs_addrm_start().
- *
- *	LOCKING:
- *	Determined by sysfs_addrm_start().
- *
- *	RETURNS:
- *	0 on success, -EEXIST if entry with the given name already
- *	exists.
- */
-int __sysfs_add_one(struct sysfs_addrm_cxt *acxt, struct sysfs_dirent *sd)
-{
-	if (sysfs_find_dirent(acxt->parent_sd, sd->s_name))
-		return -EEXIST;
-
-	sd->s_parent = sysfs_get(acxt->parent_sd);
-
-	if (sysfs_type(sd) == SYSFS_DIR && acxt->parent_inode)
-		inc_nlink(acxt->parent_inode);
-
-	acxt->cnt++;
-
-	sysfs_link_sibling(sd);
-
-	return 0;
-}
-
-/**
  *	sysfs_pathname - return full path to sysfs dirent
  *	@sd: sysfs_dirent whose path we want
  *	@path: caller allocated buffer
@@ -475,10 +438,7 @@ static char *sysfs_pathname(struct sysfs
  */
 int sysfs_add_one(struct sysfs_addrm_cxt *acxt, struct sysfs_dirent *sd)
 {
-	int ret;
-
-	ret = __sysfs_add_one(acxt, sd);
-	if (ret == -EEXIST) {
+	if (sysfs_find_dirent(acxt->parent_sd, sd->s_name)) {
 		char *path = kzalloc(PATH_MAX, GFP_KERNEL);
 		WARN(1, KERN_WARNING
 		     "sysfs: cannot create duplicate filename '%s'\n",
@@ -486,9 +446,19 @@ int sysfs_add_one(struct sysfs_addrm_cxt
 		     strcat(strcat(sysfs_pathname(acxt->parent_sd, path), "/"),
 		            sd->s_name));
 		kfree(path);
+		return -EEXIST;
 	}
 
-	return ret;
+	sd->s_parent = sysfs_get(acxt->parent_sd);
+
+	if (sysfs_type(sd) == SYSFS_DIR && acxt->parent_inode)
+		inc_nlink(acxt->parent_inode);
+
+	acxt->cnt++;
+
+	sysfs_link_sibling(sd);
+
+	return 0;
 }
 
 /**
--- a/fs/sysfs/symlink.c
+++ b/fs/sysfs/symlink.c
@@ -19,8 +19,14 @@
 
 #include "sysfs.h"
 
-static int sysfs_do_create_link(struct kobject *kobj, struct kobject *target,
-				const char *name, int warn)
+/**
+ *	sysfs_create_link - create symlink between two objects.
+ *	@kobj:	object whose directory we're creating the link in.
+ *	@target:	object we're pointing to.
+ *	@name:		name of the symlink.
+ */
+int sysfs_create_link(struct kobject *kobj, struct kobject *target,
+			const char *name)
 {
 	struct sysfs_dirent *parent_sd = NULL;
 	struct sysfs_dirent *target_sd = NULL;
@@ -60,10 +66,7 @@ static int sysfs_do_create_link(struct k
 	target_sd = NULL;	/* reference is now owned by the symlink */
 
 	sysfs_addrm_start(&acxt, parent_sd);
-	if (warn)
-		error = sysfs_add_one(&acxt, sd);
-	else
-		error = __sysfs_add_one(&acxt, sd);
+	error = sysfs_add_one(&acxt, sd);
 	sysfs_addrm_finish(&acxt);
 
 	if (error)
@@ -78,33 +81,6 @@ static int sysfs_do_create_link(struct k
 }
 
 /**
- *	sysfs_create_link - create symlink between two objects.
- *	@kobj:	object whose directory we're creating the link in.
- *	@target:	object we're pointing to.
- *	@name:		name of the symlink.
- */
-int sysfs_create_link(struct kobject *kobj, struct kobject *target,
-		      const char *name)
-{
-	return sysfs_do_create_link(kobj, target, name, 1);
-}
-
-/**
- *	sysfs_create_link_nowarn - create symlink between two objects.
- *	@kobj:	object whose directory we're creating the link in.
- *	@target:	object we're pointing to.
- *	@name:		name of the symlink.
- *
- *	This function does the same as sysf_create_link(), but it
- *	doesn't warn if the link already exists.
- */
-int sysfs_create_link_nowarn(struct kobject *kobj, struct kobject *target,
-			     const char *name)
-{
-	return sysfs_do_create_link(kobj, target, name, 0);
-}
-
-/**
  *	sysfs_remove_link - remove symlink in object's directory.
  *	@kobj:	object we're acting for.
  *	@name:	name of the symlink to remove.
--- a/fs/sysfs/sysfs.h
+++ b/fs/sysfs/sysfs.h
@@ -108,7 +108,6 @@ struct sysfs_dirent *sysfs_get_active_tw
 void sysfs_put_active_two(struct sysfs_dirent *sd);
 void sysfs_addrm_start(struct sysfs_addrm_cxt *acxt,
 		       struct sysfs_dirent *parent_sd);
-int __sysfs_add_one(struct sysfs_addrm_cxt *acxt, struct sysfs_dirent *sd);
 int sysfs_add_one(struct sysfs_addrm_cxt *acxt, struct sysfs_dirent *sd);
 void sysfs_remove_one(struct sysfs_addrm_cxt *acxt, struct sysfs_dirent *sd);
 void sysfs_addrm_finish(struct sysfs_addrm_cxt *acxt);


^ permalink raw reply	[flat|nested] 200+ messages in thread

* Re: [PATCH 04/26] sysfs: sysfs_remove_dir stop checking for bogus cases.
  2009-05-29 20:19             ` [PATCH 04/26] sysfs: sysfs_remove_dir stop checking for bogus cases Eric W. Biederman
@ 2009-06-03 23:53               ` Greg KH
  2009-06-04  0:41                 ` Eric W. Biederman
  0 siblings, 1 reply; 200+ messages in thread
From: Greg KH @ 2009-06-03 23:53 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: Andrew Morton, Greg Kroah-Hartman, linux-kernel, Tejun Heo,
	Cornelia Huck, linux-fsdevel, Kay Sievers, Eric W. Biederman

On Fri, May 29, 2009 at 01:19:14PM -0700, Eric W. Biederman wrote:
> From: Eric W. Biederman <ebiederm@xmission.com>
> 
> kobj->sd can not be NULL in sysfs_remove_dir.
> 
> sysfs_remove_dir is only called from kobject_add (to clean up after failure)
> and from kobject_del at the end of a kobject's life.  In both cases kobject_add
> has already called sysfs_create_dir successfully.  The only writers of
> kobj->sd are sysfs_create_dir on success and sysfs_remove_dir when it clears
> the kobj just before deleting the directory.
> 
> Which means at the time sysfs_remove_dir is called kobj->sd will be
> valid.

Yeah, we would hope so.

But as we have been forced to add many checks like this into the driver
core to handle those "no one could ever call this" type problems that
have been springing up over time, I am hesitant to remove this check.

Why do you want to remove it, what is the problem here you are trying to
solve?

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 200+ messages in thread

* Re: [PATCH 04/26] sysfs: sysfs_remove_dir stop checking for bogus cases.
  2009-06-03 23:53               ` Greg KH
@ 2009-06-04  0:41                 ` Eric W. Biederman
  0 siblings, 0 replies; 200+ messages in thread
From: Eric W. Biederman @ 2009-06-04  0:41 UTC (permalink / raw)
  To: Greg KH
  Cc: Andrew Morton, Greg Kroah-Hartman, linux-kernel, Tejun Heo,
	Cornelia Huck, linux-fsdevel, Kay Sievers, Eric W. Biederman

Greg KH <greg@kroah.com> writes:

> On Fri, May 29, 2009 at 01:19:14PM -0700, Eric W. Biederman wrote:
>> From: Eric W. Biederman <ebiederm@xmission.com>
>> 
>> kobj->sd can not be NULL in sysfs_remove_dir.
>> 
>> sysfs_remove_dir is only called from kobject_add (to clean up after failure)
>> and from kobject_del at the end of a kobject's life.  In both cases kobject_add
>> has already called sysfs_create_dir successfully.  The only writers of
>> kobj->sd are sysfs_create_dir on success and sysfs_remove_dir when it clears
>> the kobj just before deleting the directory.
>> 
>> Which means at the time sysfs_remove_dir is called kobj->sd will be
>> valid.
>
> Yeah, we would hope so.
>
> But as we have been forced to add many checks like this into the driver
> core to handle those "no one could ever call this" type problems that
> have been springing up over time, I am hesitant to remove this check.
>
> Why do you want to remove it, what is the problem here you are trying to
> solve?

I'm cleaning up and simplifying the code.

In this particular instance the check is actually wrong.  There is no
valid code path that depends on that behavior, which means the test
is both useless and actively obscures the code.

Making it a BUG_ON would be valid.

The other side of it is that I realized I had been running for months
without that check; I had deleted it as an oversight.  Then, once the code
churn was small enough and Tejun asked me about it, I split this hunk out
into its own separate patch so the change could stand out and be reviewed.
Since my review of the code path showed no valid use case for the check,
I kept the removal.

Eric

^ permalink raw reply	[flat|nested] 200+ messages in thread
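
The exchange above contrasts a silent NULL check with a BUG_ON-style
assertion of the same invariant.  A minimal userspace sketch of that
difference follows.  All toy_* names are hypothetical, the structures are
not the kernel's, and TOY_BUG_ON only counts violations so the behaviour can
be observed, whereas the kernel's BUG_ON stops execution.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for sysfs_dirent and kobject. */
struct toy_dirent {
	int removed;
};

struct toy_kobject {
	struct toy_dirent *sd;		/* set by create, cleared by remove */
	struct toy_dirent storage;	/* backing object, avoids malloc here */
};

/* Counts invariant violations instead of halting like BUG_ON would. */
static int toy_bug_count;
#define TOY_BUG_ON(cond) do { if (cond) toy_bug_count++; } while (0)

/* On success the kobject points at its directory entry. */
static int toy_create_dir(struct toy_kobject *kobj)
{
	kobj->storage.removed = 0;
	kobj->sd = &kobj->storage;
	return 0;
}

/* Silent variant: a NULL sd is quietly ignored, hiding a caller bug. */
static void toy_remove_dir_silent(struct toy_kobject *kobj)
{
	struct toy_dirent *sd = kobj->sd;

	kobj->sd = NULL;
	if (!sd)
		return;
	sd->removed = 1;
}

/* Checked variant: a NULL sd is reported as a broken invariant. */
static void toy_remove_dir_checked(struct toy_kobject *kobj)
{
	struct toy_dirent *sd = kobj->sd;

	kobj->sd = NULL;
	TOY_BUG_ON(!sd);
	if (sd)
		sd->removed = 1;
}
```

In this model a second removal passes unnoticed through the silent variant
but is recorded by the checked one, which is the trade-off discussed in the
two messages above.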

end of thread, other threads:[~2009-06-04  0:41 UTC | newest]

Thread overview: 200+ messages (download: mbox.gz / follow: Atom feed)
2009-05-20  4:09 [PATCH 0/20] Sysfs cleanups Eric W. Biederman
2009-05-20 15:37 ` Greg KH
2009-05-20 23:04   ` Eric W. Biederman
2009-05-21  0:27 ` [PATCH 01/20] sysfs: Implement sysfs_rename_link Eric W. Biederman
2009-05-21  0:27   ` [PATCH 02/20] driver core: Use sysfs_rename_link in device_rename Eric W. Biederman
2009-05-21  0:27     ` [PATCH 03/20] sysfs: Remove now unnecessary error reporting suppression Eric W. Biederman
2009-05-21  0:27       ` [PATCH 04/20] sysfs: Handle the general case of removing of directories with subdirectories Eric W. Biederman
2009-05-21  0:27         ` [PATCH 05/20] sysfs: Rename sysfs_d_iput to sysfs_dentry_iput Eric W. Biederman
2009-05-21  0:28           ` [PATCH 06/20] sysfs: Use dentry_ops instead of directly playing with the dcache Eric W. Biederman
2009-05-21  0:28             ` [PATCH 07/20] sysfs: Simplify sysfs_chmod_file semantics Eric W. Biederman
2009-05-21  0:28               ` [PATCH 08/20] sysfs: Optimize just changing the sysfs file mode Eric W. Biederman
2009-05-21  0:28                 ` [PATCH 09/20] sysfs: Simplify iattr assignments Eric W. Biederman
2009-05-21  0:28                   ` [PATCH 10/20] sysfs: Fix locking and factor out sysfs_sd_setattr Eric W. Biederman
2009-05-21  0:28                     ` [PATCH 11/20] sysfs: Update s_iattr on link and unlink Eric W. Biederman
2009-05-21  0:28                       ` [PATCH 12/20] sysfs: Nicely indent sysfs_symlink_inode_operations Eric W. Biederman
2009-05-21  0:28                         ` [PATCH 13/20] sysfs: Implement sysfs_getattr & sysfs_permission Eric W. Biederman
2009-05-21  0:28                           ` [PATCH 14/20] sysfs: In sysfs_chmod_file lazily propagate the mode change Eric W. Biederman
2009-05-21  0:28                             ` [PATCH 15/20] sysfs: Kill sysfs_addrm_start and sysfs_addrm_finish Eric W. Biederman
2009-05-21  0:28                               ` [PATCH 16/20] sysfs: Propagate renames to the vfs on demand Eric W. Biederman
2009-05-21  0:28                                 ` [PATCH 17/20] sysfs: Merge sysfs_rename_dir and sysfs_move_dir Eric W. Biederman
2009-05-21  0:28                                   ` [PATCH 18/20] sysfs: Pass super_block to sysfs_get_inode Eric W. Biederman
2009-05-21  0:28                                     ` [PATCH 19/20] sysfs: Kill unused sysfs_sb variable Eric W. Biederman
2009-05-21  0:28                                       ` [PATCH 20/20] sysfs: Normalize error handling in sysfs_fill_inode Eric W. Biederman
2009-05-21  9:43                                         ` Tejun Heo
2009-05-21 10:29                                           ` Eric W. Biederman
2009-05-21  9:42                                   ` [PATCH 17/20] sysfs: Merge sysfs_rename_dir and sysfs_move_dir Tejun Heo
2009-05-21  9:41                                 ` [PATCH 16/20] sysfs: Propagate renames to the vfs on demand Tejun Heo
2009-05-21  9:31                               ` [PATCH 15/20] sysfs: Kill sysfs_addrm_start and sysfs_addrm_finish Tejun Heo
2009-05-21  9:16                             ` [PATCH 14/20] sysfs: In sysfs_chmod_file lazily propagate the mode change Tejun Heo
2009-05-21  9:14                           ` [PATCH 13/20] sysfs: Implement sysfs_getattr & sysfs_permission Tejun Heo
2009-05-21  7:42                         ` [PATCH 12/20] sysfs: Nicely indent sysfs_symlink_inode_operations Tejun Heo
2009-05-21  8:42                       ` [PATCH 11/20] sysfs: Update s_iattr on link and unlink Tejun Heo
2009-05-21  7:42                     ` [PATCH 10/20] sysfs: Fix locking and factor out sysfs_sd_setattr Tejun Heo
2009-05-21  7:31                   ` [PATCH 09/20] sysfs: Simplify iattr assignments Tejun Heo
2009-05-21  7:29                 ` [PATCH 08/20] sysfs: Optimize just changing the sysfs file mode Tejun Heo
2009-05-21  7:54                   ` Eric W. Biederman
2009-05-21  8:41                     ` Tejun Heo
2009-05-21  6:42               ` [PATCH 07/20] sysfs: Simplify sysfs_chmod_file semantics Tejun Heo
2009-05-21  6:41             ` [PATCH 06/20] sysfs: Use dentry_ops instead of directly playing with the dcache Tejun Heo
2009-05-21  7:37               ` Eric W. Biederman
2009-05-21  7:40                 ` Tejun Heo
2009-05-21  6:24           ` [PATCH 05/20] sysfs: Rename sysfs_d_iput to sysfs_dentry_iput Tejun Heo
2009-05-21  6:23         ` [PATCH 04/20] sysfs: Handle the general case of removing of directories with subdirectories Tejun Heo
2009-05-21  7:29           ` Eric W. Biederman
2009-05-21  7:36             ` Tejun Heo
2009-05-21  8:04               ` Eric W. Biederman
2009-05-21  8:37                 ` Tejun Heo
2009-05-21  9:18                   ` Eric W. Biederman
2009-05-21  9:28                     ` Tejun Heo
2009-05-23  6:33                       ` Eric W. Biederman
2009-05-23 11:35                         ` Kay Sievers
2009-05-23 20:09                           ` Eric W. Biederman
2009-05-23 20:46                             ` Kay Sievers
2009-05-21  5:37       ` [PATCH 03/20] sysfs: Remove now unnecessary error reporting suppression Tejun Heo
2009-05-21  6:12         ` Eric W. Biederman
2009-05-21  6:20           ` Tejun Heo
2009-05-21  1:49   ` [PATCH 01/20] sysfs: Implement sysfs_rename_link Tejun Heo
2009-05-21  5:35   ` Tejun Heo
2009-05-21 10:06     ` Kay Sievers
2009-05-21 10:29       ` Eric W. Biederman
2009-05-21 11:40         ` Kay Sievers
2009-05-28  0:14   ` Greg KH
2009-05-28  0:30     ` Kay Sievers
2009-05-28  0:37       ` Greg KH
2009-05-28 22:58         ` [PATCH 0/24] sysfs cleanups Eric W. Biederman
2009-05-28 23:00           ` [PATCH 01/24] sysfs: Implement sysfs_rename_link Eric W. Biederman
2009-05-28 23:00           ` [PATCH 02/24] driver core: Use sysfs_rename_link in device_rename Eric W. Biederman
2009-05-28 23:00           ` [PATCH 03/24] sysfs: Remove now unnecessary error reporting suppression Eric W. Biederman
2009-05-28 23:00           ` [PATCH 04/24] sysfs: Normalize removing sysfs directories Eric W. Biederman
2009-05-29  9:14             ` Tejun Heo
2009-05-29 16:52               ` Eric W. Biederman
2009-05-30 10:43                 ` Tejun Heo
2009-05-30 13:07                   ` Eric W. Biederman
2009-05-30 13:20                     ` Tejun Heo
2009-05-30 14:29                       ` Eric W. Biederman
2009-05-30 13:59                     ` Kay Sievers
2009-05-30 14:19                     ` James Bottomley
2009-05-30 15:15                       ` Eric W. Biederman
2009-05-30 15:51                         ` James Bottomley
2009-05-30 21:20                           ` Eric W. Biederman
2009-05-28 23:00           ` [PATCH 05/24] sysfs: Rename sysfs_d_iput to sysfs_dentry_iput Eric W. Biederman
2009-05-28 23:00           ` [PATCH 06/24] sysfs: Use dentry_ops instead of directly playing with the dcache Eric W. Biederman
2009-05-28 23:00           ` [PATCH 07/24] sysfs: Simplify sysfs_chmod_file semantics Eric W. Biederman
2009-05-28 23:00           ` [PATCH 08/24] sysfs: Optimize just changing the sysfs file mode Eric W. Biederman
2009-05-28 23:00           ` [PATCH 09/24] sysfs: Simplify iattr assignments Eric W. Biederman
2009-05-28 23:00           ` [PATCH 10/24] sysfs: Fix locking and factor out sysfs_sd_setattr Eric W. Biederman
2009-05-28 23:00           ` [PATCH 11/24] sysfs: Update s_iattr on link and unlink Eric W. Biederman
2009-05-28 23:00           ` [PATCH 12/24] sysfs: Nicely indent sysfs_symlink_inode_operations Eric W. Biederman
2009-05-28 23:00           ` [PATCH 13/24] sysfs: Implement sysfs_getattr & sysfs_permission Eric W. Biederman
2009-05-28 23:00           ` [PATCH 14/24] sysfs: In sysfs_chmod_file lazily propagate the mode change Eric W. Biederman
2009-05-28 23:00           ` [PATCH 15/24] sysfs: Kill sysfs_addrm_start and sysfs_addrm_finish Eric W. Biederman
2009-05-28 23:00           ` [PATCH 16/24] sysfs: Propagate renames to the vfs on demand Eric W. Biederman
2009-05-28 23:00           ` [PATCH 17/24] sysfs: Merge sysfs_rename_dir and sysfs_move_dir Eric W. Biederman
2009-05-28 23:00           ` [PATCH 18/24] sysfs: Pass super_block to sysfs_get_inode Eric W. Biederman
2009-05-28 23:01           ` [PATCH 19/24] sysfs: Kill unused sysfs_sb variable Eric W. Biederman
2009-05-28 23:01           ` [PATCH 20/24] sysfs: Normalize error handling in sysfs_fill_inode Eric W. Biederman
2009-05-28 23:01           ` [PATCH 21/24] sysfs: Rename sysfs_mv_dir sysfs_rename Eric W. Biederman
2009-05-28 23:01           ` [PATCH 22/24] sysfs: Make sysfs_rename_link atomic Eric W. Biederman
2009-05-29  9:16             ` Tejun Heo
2009-05-29 17:17               ` Eric W. Biederman
2009-05-30 10:48                 ` Tejun Heo
2009-05-28 23:01           ` [PATCH 23/24] driver core: Don't remove kobjects in device_shutdown Eric W. Biederman
2009-05-28 23:01           ` [PATCH 24/24] sysfs: In sysfs_add_one fail if the targe directory has been removed Eric W. Biederman
2009-05-29  9:18             ` Tejun Heo
2009-05-29 20:18           ` [PATCH 0/26] sysfs cleanups v3 Eric W. Biederman
2009-05-29 20:19             ` [PATCH 01/26] sysfs: Implement sysfs_rename_link Eric W. Biederman
2009-06-02 22:57               ` patch sysfs-implement-sysfs_rename_link.patch added to gregkh-2.6 tree gregkh
2009-05-29 20:19             ` [PATCH 02/26] driver core: Use sysfs_rename_link in device_rename Eric W. Biederman
2009-06-02 22:57               ` patch driver-core-use-sysfs_rename_link-in-device_rename.patch added to gregkh-2.6 tree gregkh
2009-05-29 20:19             ` [PATCH 03/26] sysfs: Remove now unnecessary error reporting suppression Eric W. Biederman
2009-06-02 22:57               ` patch sysfs-remove-now-unnecessary-error-reporting-suppression.patch added to gregkh-2.6 tree gregkh
2009-05-29 20:19             ` [PATCH 04/26] sysfs: sysfs_remove_dir stop checking for bogus cases Eric W. Biederman
2009-06-03 23:53               ` Greg KH
2009-06-04  0:41                 ` Eric W. Biederman
2009-05-29 20:19             ` [PATCH 05/26] sysfs: Improve sysfs directory deletion debugging Eric W. Biederman
2009-05-29 20:19             ` [PATCH 06/26] sysfs: Don't hold addrm_start/addrm_finish over multiple removals Eric W. Biederman
2009-05-29 20:19             ` [PATCH 07/26] sysfs: Rename sysfs_d_iput to sysfs_dentry_iput Eric W. Biederman
2009-05-29 20:19             ` [PATCH 08/26] sysfs: Use dentry_ops instead of directly playing with the dcache Eric W. Biederman
2009-05-29 20:19             ` [PATCH 09/26] sysfs: Simplify sysfs_chmod_file semantics Eric W. Biederman
2009-05-29 20:19             ` [PATCH 10/26] sysfs: Optimize just changing the sysfs file mode Eric W. Biederman
2009-05-29 20:19             ` [PATCH 11/26] sysfs: Simplify iattr assignments Eric W. Biederman
2009-05-29 20:19             ` [PATCH 12/26] sysfs: Fix locking and factor out sysfs_sd_setattr Eric W. Biederman
2009-05-29 20:19             ` [PATCH 13/26] sysfs: Update s_iattr on link and unlink Eric W. Biederman
2009-05-29 20:19             ` [PATCH 14/26] sysfs: Nicely indent sysfs_symlink_inode_operations Eric W. Biederman
2009-05-29 20:19             ` [PATCH 15/26] sysfs: Implement sysfs_getattr & sysfs_permission Eric W. Biederman
2009-05-29 20:19             ` [PATCH 16/26] sysfs: In sysfs_chmod_file lazily propagate the mode change Eric W. Biederman
2009-05-29 20:19             ` [PATCH 17/26] sysfs: Kill sysfs_addrm_start and sysfs_addrm_finish Eric W. Biederman
2009-05-29 20:19             ` [PATCH 18/26] sysfs: Propagate renames to the vfs on demand Eric W. Biederman
2009-05-29 20:19             ` [PATCH 19/26] sysfs: Merge sysfs_rename_dir and sysfs_move_dir Eric W. Biederman
2009-05-29 20:19             ` [PATCH 20/26] sysfs: Pass super_block to sysfs_get_inode Eric W. Biederman
2009-05-29 20:19             ` [PATCH 21/26] sysfs: Kill unused sysfs_sb variable Eric W. Biederman
2009-05-29 20:19             ` [PATCH 22/26] sysfs: Normalize error handling in sysfs_fill_inode Eric W. Biederman
2009-05-29 20:19             ` [PATCH 23/26] sysfs: Rename sysfs_mv_dir sysfs_rename Eric W. Biederman
2009-05-29 20:19             ` [PATCH 24/26] sysfs: Make sysfs_rename_link atomic Eric W. Biederman
2009-05-29 20:19             ` [PATCH 25/26] driver core: Don't remove kobjects in device_shutdown Eric W. Biederman
2009-05-29 20:19             ` [PATCH 26/26] sysfs: In sysfs_add_one fail if the targe directory has been removed Eric W. Biederman
2009-05-28  1:51       ` [PATCH 01/20] sysfs: Implement sysfs_rename_link Eric W. Biederman
2009-05-23 20:13 ` [PATCH 21/20] sysfs: Rename sysfs_mv_dir sysfs_rename Eric W. Biederman
2009-05-23 20:13 ` [PATCH 22/20] sysfs: Make sysfs_rename_link atomic Eric W. Biederman
2009-05-23 21:32   ` Kay Sievers
2009-05-23 23:21     ` Kay Sievers
2009-05-24 13:03       ` Kay Sievers
2009-05-23 20:13 ` [PATCH 23/20] driver core: Don't remove kobjects in device_shutdown Eric W. Biederman
2009-05-23 22:15   ` Kay Sievers
2009-05-23 20:13 ` [PATCH 24/20] sysfs: In sysfs_add_one fail if the targe directory has been removed Eric W. Biederman
2009-05-23 21:29   ` Kay Sievers
2009-05-23 20:13 ` [PATCH 25/20] sysfs: Only support removing emtpy sysfs directories Eric W. Biederman
2009-05-23 21:27   ` Kay Sievers
2009-05-24 12:59     ` Kay Sievers
2009-05-24 14:17       ` Eric W. Biederman
2009-05-24 15:20         ` Kay Sievers
2009-05-25  2:06           ` Alan Stern
2009-05-25 11:45             ` Kay Sievers
2009-05-25 12:01               ` Kay Sievers
2009-05-25 15:49                 ` Alan Stern
2009-05-25 18:19                   ` Kay Sievers
2009-05-25 20:14                     ` Alan Stern
2009-05-26 16:27               ` Kay Sievers
2009-05-26 19:29                 ` Alan Stern
2009-05-26 21:09                   ` James Bottomley
2009-05-26 21:13                     ` Kay Sievers
2009-05-26 21:56                       ` Alan Stern
2009-05-26 22:03                         ` Kay Sievers
2009-05-26 23:49                           ` James Bottomley
2009-05-27  0:02                             ` Kay Sievers
2009-05-27  2:17                               ` Alan Stern
2009-05-27 11:35                                 ` Hannes Reinecke
2009-05-27 16:01                                   ` James Bottomley
2009-05-27 16:16                                     ` Alan Stern
2009-05-27 16:24                                       ` James Bottomley
2009-05-27 17:01                                         ` Alan Stern
2009-05-27 17:08                                           ` James Bottomley
2009-05-27 18:07                                             ` Alan Stern
2009-05-27 19:44                                               ` James Bottomley
2009-05-27 20:40                                                 ` Alan Stern
2009-05-27 20:49                                                   ` James Bottomley
2009-05-27 21:31                                                     ` Alan Stern
2009-05-27 21:42                                                       ` James Bottomley
2009-05-27 22:15                                                         ` Alan Stern
2009-05-27 22:22                                                           ` James Bottomley
2009-05-28 15:24                                                             ` Alan Stern
2009-05-28 15:45                                                               ` Eric W. Biederman
2009-05-28 17:51                                                                 ` Alan Stern
2009-05-28 18:21                                                               ` James Bottomley
2009-05-28 20:02                                                                 ` Alan Stern
2009-05-28 20:10                                                                   ` James Bottomley
2009-05-28 21:04                                                                     ` Alan Stern
2009-05-29 12:32                                                                       ` Hannes Reinecke
2009-05-29 20:08                                                                     ` Alan Stern
2009-05-27 18:00                                 ` Eric W. Biederman
2009-05-27 18:15                                   ` Alan Stern
2009-05-27 18:24                                     ` Eric W. Biederman
2009-05-27 21:38                                       ` Alan Stern
2009-05-27 22:06                                         ` Eric W. Biederman
2009-05-27 22:18                                           ` Alan Stern
2009-05-26 21:39                     ` Alan Stern
2009-05-25  7:44           ` Eric W. Biederman
2009-05-25  7:53             ` Eric W. Biederman
2009-05-25 10:51               ` Kay Sievers
2009-05-24  3:24   ` Tejun Heo
