* [PATCH]: Fw: [Bugme-new] [Bug 1242] New: devfs oops with SMP kernel (not in UP mode)
[not found] <20030915212755.5017acc7.akpm@osdl.org>
@ 2004-01-04 16:32 ` Andrey Borzenkov
2004-01-04 16:49 ` Andrey Borzenkov
` (2 more replies)
0 siblings, 3 replies; 4+ messages in thread
From: Andrey Borzenkov @ 2004-01-04 16:32 UTC (permalink / raw)
To: Andrew Morton; +Cc: linux-kernel, Pavel Roskin, kpfleming
On Tuesday 16 September 2003 08:27, Andrew Morton wrote:
> Andrey,
>
> didn't we fix this?
>
>
Sorry for delay. Oops is due to concurrent d_instantiate on the same dentry;
the bug was unfortunately quite ugly to fix inside devfs itself.
The attached patch makes sure d_revalidate is always called under parent i_sem
allowing it to drop and reacquire semaphore before going to wait. It provides
both mutual exclusion with devfs_lookup and between d_revalidate, fixing
- this bug; unfortunately I do not know how to reproduce it on purpose. It
apparently needs at least true SMP that I do not have. We need two
d_revalidate's against the same dentry running concurrently
- old devfs_lookup/d_revalidate deadlock (which has been fixed a bit
differently before). This I can test using old tests.
- theoretically possible problem when dentry->d_op is changed after
d_op->d_revalidate has been tested resulting in NULL pointer dereferencing
(if (dentry->d_op->d_revalidate) dentry->d_op->d_revalidate). I am not even
sure if it is really possible.
Pavel, you have been lucky in cathing devfs bugs, could you please test this
if it works for you?
I appreciate comments about fs/namei.c changes. I tried to make them as
non-intrusive as possible. Believe me - it is the most simple way to close
devfs races.
with this resolved we can start cleaning devfs; my final goal is to autoremove
unneeded path components and get rid of devfs_name alltogether. Now when
every driver has kernel name it is enough to register using this one letting
devfsd to do the rest.
regards
-andrey
> Begin forwarded message:
>
> Date: Mon, 15 Sep 2003 21:03:04 -0700
> From: bugme-daemon@osdl.org
> To: bugme-new@lists.osdl.org
> Subject: [Bugme-new] [Bug 1242] New: devfs oops with SMP kernel (not in UP
> mode)
>
>
> http://bugme.osdl.org/show_bug.cgi?id=1242
>
> Summary: devfs oops with SMP kernel (not in UP mode)
> Kernel Version: 2.6.0-test5
> Status: NEW
> Severity: normal
> Owner: bugme-janitors@lists.osdl.org
> Submitter: kpfleming@cox.net
>
>
> Distribution: homegrown
> Hardware Environment: Intel L440GX+ with 2 Pentium III 750 CPUs, 512MB RAM
> Software Environment: kernel and LFS-built system
> Problem Description: During system startup, rapid activity (filesystem
> mounting and other) causes this oops. It can appear in various processes,
> but the call trace is always as attached. In this case, the process that
> got oops'ed was a simple bash script to mount the sysfs filesystem on /sys
> (the oops occurred during the mount operation itself as best I can tell).
> The oops does not occur on the same system with kernel recompiled with SMP
> turned off.
>
> Steps to reproduce: Boot my system with an SMP kernel :-)
>
> ------- You are receiving this mail because: -------
> You are on the CC list for the bug, or are watching someone who is.
^ permalink raw reply [flat|nested] 4+ messages in thread
* [PATCH]: Fw: [Bugme-new] [Bug 1242] New: devfs oops with SMP kernel (not in UP mode)
2004-01-04 16:32 ` [PATCH]: Fw: [Bugme-new] [Bug 1242] New: devfs oops with SMP kernel (not in UP mode) Andrey Borzenkov
@ 2004-01-04 16:49 ` Andrey Borzenkov
2004-01-04 21:48 ` Andrew Morton
2004-01-13 8:33 ` Pavel Roskin
2 siblings, 0 replies; 4+ messages in thread
From: Andrey Borzenkov @ 2004-01-04 16:49 UTC (permalink / raw)
To: Andrew Morton; +Cc: linux-kernel, Pavel Roskin, kpfleming
[-- Attachment #1: Type: text/plain, Size: 97 bytes --]
On Sunday 04 January 2004 19:32, Andrey Borzenkov wrote:
>
> The attached patch
really attached
[-- Attachment #2: 2.6.0-devfs-d_revalidate.patch --]
[-- Type: text/x-diff, Size: 7635 bytes --]
--- linux-2.6.0/fs/devfs/base.c.devfs_revalidate 2003-12-18 05:58:16.000000000 +0300
+++ linux-2.6.0/fs/devfs/base.c 2004-01-04 14:04:22.000000000 +0300
@@ -2185,17 +2185,9 @@ static void devfs_d_iput (struct dentry
} /* End Function devfs_d_iput */
static int devfs_d_delete (struct dentry *dentry);
-
-static struct dentry_operations devfs_dops =
-{
- .d_delete = devfs_d_delete,
- .d_release = devfs_d_release,
- .d_iput = devfs_d_iput,
-};
-
static int devfs_d_revalidate_wait (struct dentry *dentry, struct nameidata *);
-static struct dentry_operations devfs_wait_dops =
+static struct dentry_operations devfs_dops =
{
.d_delete = devfs_d_delete,
.d_release = devfs_d_release,
@@ -2212,8 +2204,8 @@ static int devfs_d_delete (struct dentry
{
struct inode *inode = dentry->d_inode;
- if (dentry->d_op == &devfs_wait_dops) dentry->d_op = &devfs_dops;
/* Unhash dentry if negative (has no inode) */
+ /* FIXME should we check for d_fsdata? */
if (inode == NULL)
{
DPRINTK (DEBUG_D_DELETE, "(%p): dropping negative dentry\n", dentry);
@@ -2230,6 +2222,11 @@ struct devfs_lookup_struct
/* XXX: this doesn't handle the case where we got a negative dentry
but a devfs entry has been registered in the meanwhile */
+/*
+ * This relies on the fact that d_revalidate is called under parent i_sem
+ * providing mutual exclusion with devfs_lookup and protection for
+ * dentry->d_fsdata
+ */
static int devfs_d_revalidate_wait (struct dentry *dentry, struct nameidata *nd)
{
struct inode *dir = dentry->d_parent->d_inode;
@@ -2238,6 +2235,10 @@ static int devfs_d_revalidate_wait (stru
struct devfs_lookup_struct *lookup_info = dentry->d_fsdata;
DECLARE_WAITQUEUE (wait, current);
+ /* Ordinary case - nothing to revalidate */
+ if (lookup_info == NULL) return 1; /* Early termination */
+
+ /* Called from devfsd action - cannot block. */
if ( is_devfsd_or_child (fs_info) )
{
devfs_handle_t de = lookup_info->de;
@@ -2266,25 +2267,22 @@ static int devfs_d_revalidate_wait (stru
d_instantiate (dentry, inode);
return 1;
}
- if (lookup_info == NULL) return 1; /* Early termination */
- read_lock (&parent->u.dir.lock);
- if (dentry->d_fsdata)
- {
- set_current_state (TASK_UNINTERRUPTIBLE);
- add_wait_queue (&lookup_info->wait_queue, &wait);
- read_unlock (&parent->u.dir.lock);
- schedule ();
- /*
- * This does not need nor should remove wait from wait_queue.
- * Wait queue head is never reused - nothing is ever added to it
- * after all waiters have been waked up and head itself disappears
- * very soon after it. Moreover it is local variable on stack that
- * is likely to have already disappeared so any reference to it
- * at this point is buggy.
- */
- }
- else read_unlock (&parent->u.dir.lock);
+ /* Not devfsd - must wait for devfsd to return */
+ set_current_state (TASK_UNINTERRUPTIBLE);
+ add_wait_queue (&lookup_info->wait_queue, &wait);
+ up(&dir->i_sem);
+ schedule ();
+ down(&dir->i_sem);
+ /*
+ * This does not need nor should remove wait from wait_queue.
+ * Wait queue head is never reused - nothing is ever added to it
+ * after all waiters have been waked up and head itself disappears
+ * very soon after it. Moreover it is local variable on stack that
+ * is likely to have already disappeared so any reference to it
+ * at this point is buggy.
+ */
+
return 1;
} /* End Function devfs_d_revalidate_wait */
@@ -2323,17 +2321,18 @@ static struct dentry *devfs_lookup (stru
if (try_modload (parent, fs_info,
dentry->d_name.name, dentry->d_name.len, &tmp) < 0)
{ /* Lookup event was not queued to devfsd */
+ dentry->d_fsdata = NULL;
d_add (dentry, NULL);
return NULL;
}
}
- dentry->d_op = &devfs_wait_dops;
d_add (dentry, NULL); /* Open the floodgates */
/* Unlock directory semaphore, which will release any waiters. They
will get the hashed dentry, and may be forced to wait for
revalidation */
up (&dir->i_sem);
wait_for_devfsd_finished (fs_info); /* If I'm not devfsd, must wait */
+ down (&dir->i_sem); /* Grab it again because them's the rules */
de = lookup_info.de;
/* If someone else has been so kind as to make the inode, we go home
early */
@@ -2358,12 +2357,8 @@ static struct dentry *devfs_lookup (stru
de->name, de->inode.ino, inode, de, current->comm);
d_instantiate (dentry, inode);
out:
- write_lock (&parent->u.dir.lock);
- dentry->d_op = &devfs_dops;
dentry->d_fsdata = NULL;
wake_up (&lookup_info.wait_queue);
- write_unlock (&parent->u.dir.lock);
- down (&dir->i_sem); /* Grab it again because them's the rules */
devfs_put (de);
return retval;
} /* End Function devfs_lookup */
@@ -2606,6 +2601,7 @@ static struct file_system_type devfs_fs_
.name = DEVFS_NAME,
.get_sb = devfs_get_sb,
.kill_sb = kill_anon_super,
+ .fs_flags = FS_ODD_REVALIDATE,
};
/* File operations for devfsd follow */
--- linux-2.6.0/fs/namei.c.devfs_revalidate 2003-12-18 05:58:40.000000000 +0300
+++ linux-2.6.0/fs/namei.c 2004-01-04 14:06:15.000000000 +0300
@@ -274,6 +274,9 @@ void path_release(struct nameidata *nd)
* Internal lookup() using the new generic dcache.
* SMP-safe
*/
+/*
+ * This seems to always be called under parent->i_sem locked
+ */
static struct dentry * cached_lookup(struct dentry * parent, struct qstr * name, struct nameidata *nd)
{
struct dentry * dentry = __d_lookup(parent, name);
@@ -342,6 +345,7 @@ static struct dentry * real_lookup(struc
{
struct dentry * result;
struct inode *dir = parent->d_inode;
+ int needlock = dir->i_sb->s_type->fs_flags & FS_ODD_REVALIDATE;
down(&dir->i_sem);
/*
@@ -377,13 +381,16 @@ static struct dentry * real_lookup(struc
* Uhhuh! Nasty case: the cache was re-populated while
* we waited on the semaphore. Need to revalidate.
*/
- up(&dir->i_sem);
+ if (!needlock)
+ up(&dir->i_sem);
if (result->d_op && result->d_op->d_revalidate) {
if (!result->d_op->d_revalidate(result, nd) && !d_invalidate(result)) {
dput(result);
result = ERR_PTR(-ENOENT);
}
}
+ if (needlock)
+ up(&dir->i_sem);
return result;
}
@@ -528,11 +535,17 @@ static int do_lookup(struct nameidata *n
{
struct vfsmount *mnt = nd->mnt;
struct dentry *dentry = __d_lookup(nd->dentry, name);
+ int needlock = mnt->mnt_sb->s_type->fs_flags & FS_ODD_REVALIDATE;
if (!dentry)
goto need_lookup;
+ if (needlock)
+ down(&nd->dentry->d_inode->i_sem);
if (dentry->d_op && dentry->d_op->d_revalidate)
goto need_revalidate;
+unlock:
+ if (needlock)
+ up(&nd->dentry->d_inode->i_sem);
done:
path->mnt = mnt;
path->dentry = dentry;
@@ -546,9 +559,11 @@ need_lookup:
need_revalidate:
if (dentry->d_op->d_revalidate(dentry, nd))
- goto done;
+ goto unlock;
if (d_invalidate(dentry))
- goto done;
+ goto unlock;
+ if (needlock)
+ up(&nd->dentry->d_inode->i_sem);
dput(dentry);
goto need_lookup;
--- linux-2.6.0/include/linux/fs.h.devfs_revalidate 2003-12-28 15:57:45.000000000 +0300
+++ linux-2.6.0/include/linux/fs.h 2004-01-04 18:47:11.000000000 +0300
@@ -94,6 +94,7 @@ extern int leases_enable, dir_notify_ena
#define FS_ODD_RENAME 32768 /* Temporary stuff; will go away as soon
* as nfs_rename() will be cleaned up
*/
+#define FS_ODD_REVALIDATE (1<<16) /* d_revalidate needs lock on i_sem */
/*
* These are the fs-independent mount-flags: up to 32 flags are supported
*/
^ permalink raw reply [flat|nested] 4+ messages in thread
* Re: [PATCH]: Fw: [Bugme-new] [Bug 1242] New: devfs oops with SMP kernel (not in UP mode)
2004-01-04 16:32 ` [PATCH]: Fw: [Bugme-new] [Bug 1242] New: devfs oops with SMP kernel (not in UP mode) Andrey Borzenkov
2004-01-04 16:49 ` Andrey Borzenkov
@ 2004-01-04 21:48 ` Andrew Morton
2004-01-13 8:33 ` Pavel Roskin
2 siblings, 0 replies; 4+ messages in thread
From: Andrew Morton @ 2004-01-04 21:48 UTC (permalink / raw)
To: Andrey Borzenkov; +Cc: linux-kernel, proski, kpfleming
Andrey Borzenkov <arvidjaar@mail.ru> wrote:
>
> On Tuesday 16 September 2003 08:27, Andrew Morton wrote:
> > Andrey,
> >
> > didn't we fix this?
> >
> >
>
> Sorry for delay. Oops is due to concurrent d_instantiate on the same dentry;
> the bug was unfortunately quite ugly to fix inside devfs itself.
Andrey, thanks for working on this. It is good to have a go-to guy for
devfs problems. They all seem to be nasty nowadays.
> The attached patch makes sure d_revalidate is always called under parent i_sem
> allowing it to drop and reacquire semaphore before going to wait. It provides
> both mutual exclusion with devfs_lookup and between d_revalidate, fixing
Alas, people will have a heart-attack over the fs/namei.c changes. Is it
not possible to add an additional semaphore into devfs to provide this
exclusion? Are there any alternatives you can think of?
^ permalink raw reply [flat|nested] 4+ messages in thread
* Re: [PATCH]: Fw: [Bugme-new] [Bug 1242] New: devfs oops with SMP kernel (not in UP mode)
2004-01-04 16:32 ` [PATCH]: Fw: [Bugme-new] [Bug 1242] New: devfs oops with SMP kernel (not in UP mode) Andrey Borzenkov
2004-01-04 16:49 ` Andrey Borzenkov
2004-01-04 21:48 ` Andrew Morton
@ 2004-01-13 8:33 ` Pavel Roskin
2 siblings, 0 replies; 4+ messages in thread
From: Pavel Roskin @ 2004-01-13 8:33 UTC (permalink / raw)
To: Andrey Borzenkov; +Cc: Andrew Morton, linux-kernel, kpfleming
On Sun, 4 Jan 2004, Andrey Borzenkov wrote:
> Pavel, you have been lucky in cathing devfs bugs, could you please test this
> if it works for you?
Yes, it does. No apparent problems. However, I only had lookups on Red
Hat 8 and 9, and I'm running Debian unstable now. The kernel is
2.6.1-bk1. devfs is enabled, devfsd is running. SMP is not enabled.
--
Regards,
Pavel Roskin
^ permalink raw reply [flat|nested] 4+ messages in thread
end of thread, other threads:[~2004-01-13 8:33 UTC | newest]
Thread overview: 4+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
[not found] <20030915212755.5017acc7.akpm@osdl.org>
2004-01-04 16:32 ` [PATCH]: Fw: [Bugme-new] [Bug 1242] New: devfs oops with SMP kernel (not in UP mode) Andrey Borzenkov
2004-01-04 16:49 ` Andrey Borzenkov
2004-01-04 21:48 ` Andrew Morton
2004-01-13 8:33 ` Pavel Roskin
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).