linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH]: Fw: [Bugme-new] [Bug 1242] New: devfs oops with SMP kernel (not in UP mode)
       [not found] <20030915212755.5017acc7.akpm@osdl.org>
@ 2004-01-04 16:32 ` Andrey Borzenkov
  2004-01-04 16:49   ` Andrey Borzenkov
                     ` (2 more replies)
  0 siblings, 3 replies; 4+ messages in thread
From: Andrey Borzenkov @ 2004-01-04 16:32 UTC (permalink / raw)
  To: Andrew Morton; +Cc: linux-kernel, Pavel Roskin, kpfleming

On Tuesday 16 September 2003 08:27, Andrew Morton wrote:
> Andrey,
>
> didn't we fix this?
>
>

Sorry for delay. Oops is due to concurrent d_instantiate on the same dentry; 
the bug was unfortunately quite ugly to fix inside devfs itself.

The attached patch makes sure d_revalidate is always called under parent i_sem 
allowing it to drop and reacquire semaphore before going to wait. It provides 
both mutual exclusion with devfs_lookup and between d_revalidate, fixing

- this bug; unfortunately I do not know how to reproduce it on purpose. It 
apparently needs at least true SMP that I do not have. We need two 
d_revalidate's against the same dentry running concurrently

- old devfs_lookup/d_revalidate deadlock (which has been fixed a bit 
differently before). This I can test using old tests.

- theoretically possible problem when dentry->d_op is changed after 
d_op->d_revalidate has been tested resulting in NULL pointer dereferencing 
(if (dentry->d_op->d_revalidate) dentry->d_op->d_revalidate). I am not even 
sure if it is really possible.

Pavel, you have been lucky in cathing devfs bugs, could you please test this 
if it works for you?

I appreciate comments about fs/namei.c changes. I tried to make them as 
non-intrusive as possible. Believe me - it is the most simple way to close 
devfs races.

with this resolved we can start cleaning devfs; my final goal is to autoremove 
unneeded path components and get rid of devfs_name alltogether. Now when 
every driver has kernel name it is enough to register using this one letting 
devfsd to do the rest.

regards

-andrey

> Begin forwarded message:
>
> Date: Mon, 15 Sep 2003 21:03:04 -0700
> From: bugme-daemon@osdl.org
> To: bugme-new@lists.osdl.org
> Subject: [Bugme-new] [Bug 1242] New: devfs oops with SMP kernel (not in UP
> mode)
>
>
> http://bugme.osdl.org/show_bug.cgi?id=1242
>
>            Summary: devfs oops with SMP kernel (not in UP mode)
>     Kernel Version: 2.6.0-test5
>             Status: NEW
>           Severity: normal
>              Owner: bugme-janitors@lists.osdl.org
>          Submitter: kpfleming@cox.net
>
>
> Distribution: homegrown
> Hardware Environment: Intel L440GX+ with 2 Pentium III 750 CPUs, 512MB RAM
> Software Environment: kernel and LFS-built system
> Problem Description: During system startup, rapid activity (filesystem
> mounting and other) causes this oops. It can appear in various processes,
> but the call trace is always as attached. In this case, the process that
> got oops'ed was a simple bash script to mount the sysfs filesystem on /sys
> (the oops occurred during the mount operation itself as best I can tell).
> The oops does not occur on the same system with kernel recompiled with SMP
> turned off.
>
> Steps to reproduce: Boot my system with an SMP kernel :-)
>
> ------- You are receiving this mail because: -------
> You are on the CC list for the bug, or are watching someone who is.


^ permalink raw reply	[flat|nested] 4+ messages in thread

* [PATCH]: Fw: [Bugme-new] [Bug 1242] New: devfs oops with SMP kernel (not in UP mode)
  2004-01-04 16:32 ` [PATCH]: Fw: [Bugme-new] [Bug 1242] New: devfs oops with SMP kernel (not in UP mode) Andrey Borzenkov
@ 2004-01-04 16:49   ` Andrey Borzenkov
  2004-01-04 21:48   ` Andrew Morton
  2004-01-13  8:33   ` Pavel Roskin
  2 siblings, 0 replies; 4+ messages in thread
From: Andrey Borzenkov @ 2004-01-04 16:49 UTC (permalink / raw)
  To: Andrew Morton; +Cc: linux-kernel, Pavel Roskin, kpfleming

[-- Attachment #1: Type: text/plain, Size: 97 bytes --]

On Sunday 04 January 2004 19:32, Andrey Borzenkov wrote:
>
> The attached patch

really attached

[-- Attachment #2: 2.6.0-devfs-d_revalidate.patch --]
[-- Type: text/x-diff, Size: 7635 bytes --]

--- linux-2.6.0/fs/devfs/base.c.devfs_revalidate	2003-12-18 05:58:16.000000000 +0300
+++ linux-2.6.0/fs/devfs/base.c	2004-01-04 14:04:22.000000000 +0300
@@ -2185,17 +2185,9 @@ static void devfs_d_iput (struct dentry 
 }   /*  End Function devfs_d_iput  */
 
 static int devfs_d_delete (struct dentry *dentry);
-
-static struct dentry_operations devfs_dops =
-{
-    .d_delete     = devfs_d_delete,
-    .d_release    = devfs_d_release,
-    .d_iput       = devfs_d_iput,
-};
-
 static int devfs_d_revalidate_wait (struct dentry *dentry, struct nameidata *);
 
-static struct dentry_operations devfs_wait_dops =
+static struct dentry_operations devfs_dops =
 {
     .d_delete     = devfs_d_delete,
     .d_release    = devfs_d_release,
@@ -2212,8 +2204,8 @@ static int devfs_d_delete (struct dentry
 {
     struct inode *inode = dentry->d_inode;
 
-    if (dentry->d_op == &devfs_wait_dops) dentry->d_op = &devfs_dops;
     /*  Unhash dentry if negative (has no inode)  */
+    /* FIXME should we check for d_fsdata? */
     if (inode == NULL)
     {
 	DPRINTK (DEBUG_D_DELETE, "(%p): dropping negative dentry\n", dentry);
@@ -2230,6 +2222,11 @@ struct devfs_lookup_struct
 
 /* XXX: this doesn't handle the case where we got a negative dentry
         but a devfs entry has been registered in the meanwhile */
+/*
+ * This relies on the fact that d_revalidate is called under parent i_sem
+ * providing mutual exclusion with devfs_lookup and protection for
+ * dentry->d_fsdata
+ */
 static int devfs_d_revalidate_wait (struct dentry *dentry, struct nameidata *nd)
 {
     struct inode *dir = dentry->d_parent->d_inode;
@@ -2238,6 +2235,10 @@ static int devfs_d_revalidate_wait (stru
     struct devfs_lookup_struct *lookup_info = dentry->d_fsdata;
     DECLARE_WAITQUEUE (wait, current);
 
+    /* Ordinary case - nothing to revalidate */
+    if (lookup_info == NULL) return 1;  /*  Early termination  */
+
+    /* Called from devfsd action - cannot block. */
     if ( is_devfsd_or_child (fs_info) )
     {
 	devfs_handle_t de = lookup_info->de;
@@ -2266,25 +2267,22 @@ static int devfs_d_revalidate_wait (stru
 	d_instantiate (dentry, inode);
 	return 1;
     }
-    if (lookup_info == NULL) return 1;  /*  Early termination  */
-    read_lock (&parent->u.dir.lock);
-    if (dentry->d_fsdata)
-    {
-	set_current_state (TASK_UNINTERRUPTIBLE);
-	add_wait_queue (&lookup_info->wait_queue, &wait);
-	read_unlock (&parent->u.dir.lock);
-	schedule ();
-	/*
-	 * This does not need nor should remove wait from wait_queue.
-	 * Wait queue head is never reused - nothing is ever added to it
-	 * after all waiters have been waked up and head itself disappears
-	 * very soon after it. Moreover it is local variable on stack that
-	 * is likely to have already disappeared so any reference to it
-	 * at this point is buggy.
-	 */
 
-    }
-    else read_unlock (&parent->u.dir.lock);
+    /* Not devfsd - must wait for devfsd to return */
+    set_current_state (TASK_UNINTERRUPTIBLE);
+    add_wait_queue (&lookup_info->wait_queue, &wait);
+    up(&dir->i_sem);
+    schedule ();
+    down(&dir->i_sem);
+    /*
+     * This does not need nor should remove wait from wait_queue.
+     * Wait queue head is never reused - nothing is ever added to it
+     * after all waiters have been waked up and head itself disappears
+     * very soon after it. Moreover it is local variable on stack that
+     * is likely to have already disappeared so any reference to it
+     * at this point is buggy.
+     */
+
     return 1;
 }   /*  End Function devfs_d_revalidate_wait  */
 
@@ -2323,17 +2321,18 @@ static struct dentry *devfs_lookup (stru
 	if (try_modload (parent, fs_info,
 			 dentry->d_name.name, dentry->d_name.len, &tmp) < 0)
 	{   /*  Lookup event was not queued to devfsd  */
+	    dentry->d_fsdata = NULL;
 	    d_add (dentry, NULL);
 	    return NULL;
 	}
     }
-    dentry->d_op = &devfs_wait_dops;
     d_add (dentry, NULL);  /*  Open the floodgates  */
     /*  Unlock directory semaphore, which will release any waiters. They
 	will get the hashed dentry, and may be forced to wait for
 	revalidation  */
     up (&dir->i_sem);
     wait_for_devfsd_finished (fs_info);  /*  If I'm not devfsd, must wait  */
+    down (&dir->i_sem);      /*  Grab it again because them's the rules  */
     de = lookup_info.de;
     /*  If someone else has been so kind as to make the inode, we go home
 	early  */
@@ -2358,12 +2357,8 @@ static struct dentry *devfs_lookup (stru
 	     de->name, de->inode.ino, inode, de, current->comm);
     d_instantiate (dentry, inode);
 out:
-    write_lock (&parent->u.dir.lock);
-    dentry->d_op = &devfs_dops;
     dentry->d_fsdata = NULL;
     wake_up (&lookup_info.wait_queue);
-    write_unlock (&parent->u.dir.lock);
-    down (&dir->i_sem);      /*  Grab it again because them's the rules  */
     devfs_put (de);
     return retval;
 }   /*  End Function devfs_lookup  */
@@ -2606,6 +2601,7 @@ static struct file_system_type devfs_fs_
     .name	= DEVFS_NAME,
     .get_sb	= devfs_get_sb,
     .kill_sb	= kill_anon_super,
+    .fs_flags	= FS_ODD_REVALIDATE,
 };
 
 /*  File operations for devfsd follow  */
--- linux-2.6.0/fs/namei.c.devfs_revalidate	2003-12-18 05:58:40.000000000 +0300
+++ linux-2.6.0/fs/namei.c	2004-01-04 14:06:15.000000000 +0300
@@ -274,6 +274,9 @@ void path_release(struct nameidata *nd)
  * Internal lookup() using the new generic dcache.
  * SMP-safe
  */
+/*
+ * This seems to always be called under parent->i_sem locked
+ */
 static struct dentry * cached_lookup(struct dentry * parent, struct qstr * name, struct nameidata *nd)
 {
 	struct dentry * dentry = __d_lookup(parent, name);
@@ -342,6 +345,7 @@ static struct dentry * real_lookup(struc
 {
 	struct dentry * result;
 	struct inode *dir = parent->d_inode;
+	int needlock = dir->i_sb->s_type->fs_flags & FS_ODD_REVALIDATE;
 
 	down(&dir->i_sem);
 	/*
@@ -377,13 +381,16 @@ static struct dentry * real_lookup(struc
 	 * Uhhuh! Nasty case: the cache was re-populated while
 	 * we waited on the semaphore. Need to revalidate.
 	 */
-	up(&dir->i_sem);
+	if (!needlock)
+		up(&dir->i_sem);
 	if (result->d_op && result->d_op->d_revalidate) {
 		if (!result->d_op->d_revalidate(result, nd) && !d_invalidate(result)) {
 			dput(result);
 			result = ERR_PTR(-ENOENT);
 		}
 	}
+	if (needlock)
+		up(&dir->i_sem);
 	return result;
 }
 
@@ -528,11 +535,17 @@ static int do_lookup(struct nameidata *n
 {
 	struct vfsmount *mnt = nd->mnt;
 	struct dentry *dentry = __d_lookup(nd->dentry, name);
+	int needlock = mnt->mnt_sb->s_type->fs_flags & FS_ODD_REVALIDATE;
 
 	if (!dentry)
 		goto need_lookup;
+	if (needlock)
+		down(&nd->dentry->d_inode->i_sem);
 	if (dentry->d_op && dentry->d_op->d_revalidate)
 		goto need_revalidate;
+unlock:
+	if (needlock)
+		up(&nd->dentry->d_inode->i_sem);
 done:
 	path->mnt = mnt;
 	path->dentry = dentry;
@@ -546,9 +559,11 @@ need_lookup:
 
 need_revalidate:
 	if (dentry->d_op->d_revalidate(dentry, nd))
-		goto done;
+		goto unlock;
 	if (d_invalidate(dentry))
-		goto done;
+		goto unlock;
+	if (needlock)
+		up(&nd->dentry->d_inode->i_sem);
 	dput(dentry);
 	goto need_lookup;
 
--- linux-2.6.0/include/linux/fs.h.devfs_revalidate	2003-12-28 15:57:45.000000000 +0300
+++ linux-2.6.0/include/linux/fs.h	2004-01-04 18:47:11.000000000 +0300
@@ -94,6 +94,7 @@ extern int leases_enable, dir_notify_ena
 #define FS_ODD_RENAME	32768	/* Temporary stuff; will go away as soon
 				  * as nfs_rename() will be cleaned up
 				  */
+#define FS_ODD_REVALIDATE	(1<<16) /* d_revalidate needs lock on i_sem */
 /*
  * These are the fs-independent mount-flags: up to 32 flags are supported
  */

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [PATCH]: Fw: [Bugme-new] [Bug 1242] New: devfs oops with SMP kernel (not in UP mode)
  2004-01-04 16:32 ` [PATCH]: Fw: [Bugme-new] [Bug 1242] New: devfs oops with SMP kernel (not in UP mode) Andrey Borzenkov
  2004-01-04 16:49   ` Andrey Borzenkov
@ 2004-01-04 21:48   ` Andrew Morton
  2004-01-13  8:33   ` Pavel Roskin
  2 siblings, 0 replies; 4+ messages in thread
From: Andrew Morton @ 2004-01-04 21:48 UTC (permalink / raw)
  To: Andrey Borzenkov; +Cc: linux-kernel, proski, kpfleming

Andrey Borzenkov <arvidjaar@mail.ru> wrote:
>
> On Tuesday 16 September 2003 08:27, Andrew Morton wrote:
>  > Andrey,
>  >
>  > didn't we fix this?
>  >
>  >
> 
>  Sorry for delay. Oops is due to concurrent d_instantiate on the same dentry; 
>  the bug was unfortunately quite ugly to fix inside devfs itself.

Andrey, thanks for working on this.  It is good to have a go-to guy for
devfs problems.  They all seem to be nasty nowadays.

>  The attached patch makes sure d_revalidate is always called under parent i_sem 
>  allowing it to drop and reacquire semaphore before going to wait. It provides 
>  both mutual exclusion with devfs_lookup and between d_revalidate, fixing

Alas, people will have a heart-attack over the fs/namei.c changes.  Is it
not possible to add an additional semaphore into devfs to provide this
exclusion?  Are there any alternatives you can think of?




^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [PATCH]: Fw: [Bugme-new] [Bug 1242] New: devfs oops with SMP kernel (not in UP mode)
  2004-01-04 16:32 ` [PATCH]: Fw: [Bugme-new] [Bug 1242] New: devfs oops with SMP kernel (not in UP mode) Andrey Borzenkov
  2004-01-04 16:49   ` Andrey Borzenkov
  2004-01-04 21:48   ` Andrew Morton
@ 2004-01-13  8:33   ` Pavel Roskin
  2 siblings, 0 replies; 4+ messages in thread
From: Pavel Roskin @ 2004-01-13  8:33 UTC (permalink / raw)
  To: Andrey Borzenkov; +Cc: Andrew Morton, linux-kernel, kpfleming

On Sun, 4 Jan 2004, Andrey Borzenkov wrote:

> Pavel, you have been lucky in cathing devfs bugs, could you please test this
> if it works for you?

Yes, it does.  No apparent problems.  However, I only had lookups on Red
Hat 8 and 9, and I'm running Debian unstable now.  The kernel is
2.6.1-bk1.  devfs is enabled, devfsd is running.  SMP is not enabled.

-- 
Regards,
Pavel Roskin

^ permalink raw reply	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2004-01-13  8:33 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
     [not found] <20030915212755.5017acc7.akpm@osdl.org>
2004-01-04 16:32 ` [PATCH]: Fw: [Bugme-new] [Bug 1242] New: devfs oops with SMP kernel (not in UP mode) Andrey Borzenkov
2004-01-04 16:49   ` Andrey Borzenkov
2004-01-04 21:48   ` Andrew Morton
2004-01-13  8:33   ` Pavel Roskin

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).