* [Question] race condition with remove_proc_entry
@ 2005-12-30 20:04 Steven Rostedt
2005-12-30 21:28 ` [PATCH] protect remove_proc_entry Steven Rostedt
0 siblings, 1 reply; 25+ messages in thread
From: Steven Rostedt @ 2005-12-30 20:04 UTC (permalink / raw)
To: LKML; +Cc: Andrew Morton, Ingo Molnar
I'm just curious if it is know that remove_proc_entry has an inherit
race condition? I have a modified kernel that would add and remove
stuff from the proc system and it would every so often crash. I traced
the bug to remove_proc_entry.
for (p = &parent->subdir; *p; p=&(*p)->next ) {
if (!proc_match(len, fn, *p))
continue;
Looking at proc_match
int proc_match(int len, const char *name, struct proc_dir_entry *de)
{
if (de->namelen != len)
return 0;
return !memcmp(name, de->name, len);
}
The bug would happen either at de->namelen in proc_match or in the loop
of p=&(*p)->next.
The race is if two threads remove two entries that are siblings. Since
p = &(*p)->next, and this is then dereferenced, the race is with *p
becoming NULL.
The way I'm fixing this is to put a lock around the call to
remove_proc_entry. But is this race already known and the solution is
to have the callers perform their own locking? Or is this an actual
bug? If it is not a bug, where's the documentation on having callers
protect it?
Thanks,
-- Steve
^ permalink raw reply [flat|nested] 25+ messages in thread
* [PATCH] protect remove_proc_entry
2005-12-30 20:04 [Question] race condition with remove_proc_entry Steven Rostedt
@ 2005-12-30 21:28 ` Steven Rostedt
2005-12-30 21:34 ` Daniel Walker
` (3 more replies)
0 siblings, 4 replies; 25+ messages in thread
From: Steven Rostedt @ 2005-12-30 21:28 UTC (permalink / raw)
To: LKML; +Cc: Ingo Molnar, Andrew Morton
Working on a custom kernel that adds and removes proc entries quite a
bit, I discovered that remove_proc_entry is not protected against
multiple threads removing entries belonging to the same parent. At
first I thought that this is only a problem with my changes, but after
inspecting the vanilla kernel, I see that there's several places that
calls remove_proc_entry with the same parent (most noticeably
/proc/drivers).
I've added a global remove_proc_lock to protect this section of code. I
was going to add a lock to proc_dir_entry so that the locking is only
cut down to the same parent, but since this function is called so
infrequently, why waste more memory then is needed. One global lock
should not cause too much of a headache here.
I'm not sure if remove_proc_entry is called from interrupt context, so I
did a irqsave just in case.
-- Steve
Index: linux-2.6.15-rc7/fs/proc/generic.c
===================================================================
--- linux-2.6.15-rc7.orig/fs/proc/generic.c 2005-12-30 14:19:39.000000000 -0500
+++ linux-2.6.15-rc7/fs/proc/generic.c 2005-12-30 16:18:42.000000000 -0500
@@ -19,6 +19,7 @@
#include <linux/idr.h>
#include <linux/namei.h>
#include <linux/bitops.h>
+#include <linux/spinlock.h>
#include <asm/uaccess.h>
static ssize_t proc_file_read(struct file *file, char __user *buf,
@@ -27,6 +28,8 @@
size_t count, loff_t *ppos);
static loff_t proc_file_lseek(struct file *, loff_t, int);
+static DEFINE_SPINLOCK(remove_proc_lock);
+
int proc_match(int len, const char *name, struct proc_dir_entry *de)
{
if (de->namelen != len)
@@ -689,10 +692,13 @@
struct proc_dir_entry *de;
const char *fn = name;
int len;
+ unsigned long flags;
if (!parent && xlate_proc_name(name, &parent, &fn) != 0)
goto out;
len = strlen(fn);
+
+ spin_lock_irqsave(&remove_proc_lock, flags);
for (p = &parent->subdir; *p; p=&(*p)->next ) {
if (!proc_match(len, fn, *p))
continue;
@@ -713,6 +719,7 @@
}
break;
}
+ spin_unlock_irqrestore(&remove_proc_lock, flags);
out:
return;
}
^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH] protect remove_proc_entry
2005-12-30 21:28 ` [PATCH] protect remove_proc_entry Steven Rostedt
@ 2005-12-30 21:34 ` Daniel Walker
2005-12-30 21:55 ` Steven Rostedt
2005-12-30 21:55 ` Mitchell Blank Jr
` (2 subsequent siblings)
3 siblings, 1 reply; 25+ messages in thread
From: Daniel Walker @ 2005-12-30 21:34 UTC (permalink / raw)
To: Steven Rostedt; +Cc: LKML, Ingo Molnar, Andrew Morton
On Fri, 2005-12-30 at 16:28 -0500, Steven Rostedt wrote:
> + spin_lock_irqsave(&remove_proc_lock, flags);
...
> + spin_unlock_irqrestore(&remove_proc_lock, flags);
Why do an irqsave here? Are you not sure what context it gets called
from?
Daniel
^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH] protect remove_proc_entry
2005-12-30 21:28 ` [PATCH] protect remove_proc_entry Steven Rostedt
2005-12-30 21:34 ` Daniel Walker
@ 2005-12-30 21:55 ` Mitchell Blank Jr
2005-12-30 22:09 ` Steven Rostedt
2005-12-30 22:11 ` Steven Rostedt
2005-12-30 23:46 ` Andrew Morton
2006-01-07 11:36 ` Andrew Morton
3 siblings, 2 replies; 25+ messages in thread
From: Mitchell Blank Jr @ 2005-12-30 21:55 UTC (permalink / raw)
To: Steven Rostedt; +Cc: LKML, Ingo Molnar, Andrew Morton
Steven Rostedt wrote:
> I've added a global remove_proc_lock to protect this section of code. I
> was going to add a lock to proc_dir_entry so that the locking is only
> cut down to the same parent, but since this function is called so
> infrequently, why waste more memory then is needed. One global lock
> should not cause too much of a headache here.
Are you sure that it's the only place where we need guard ->subdir? It
looks like proc_lookup() and proc_readdir() use the BLK when walking that
list, so probably the best fix would be to use that lock everywhere else
->subdir is touched
-Mitch
^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH] protect remove_proc_entry
2005-12-30 21:34 ` Daniel Walker
@ 2005-12-30 21:55 ` Steven Rostedt
0 siblings, 0 replies; 25+ messages in thread
From: Steven Rostedt @ 2005-12-30 21:55 UTC (permalink / raw)
To: Daniel Walker; +Cc: LKML, Ingo Molnar, Andrew Morton
On Fri, 30 Dec 2005, Daniel Walker wrote:
> On Fri, 2005-12-30 at 16:28 -0500, Steven Rostedt wrote:
>
> > + spin_lock_irqsave(&remove_proc_lock, flags);
> ...
> > + spin_unlock_irqrestore(&remove_proc_lock, flags);
>
> Why do an irqsave here? Are you not sure what context it gets called
> from?
>
[snipped from original message]
"I'm not sure if remove_proc_entry is called from interrupt context, so I
did a irqsave just in case."
;-)
-- Steve
^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH] protect remove_proc_entry
2005-12-30 21:55 ` Mitchell Blank Jr
@ 2005-12-30 22:09 ` Steven Rostedt
2005-12-30 22:18 ` Steven Rostedt
2006-01-07 11:25 ` Andrew Morton
2005-12-30 22:11 ` Steven Rostedt
1 sibling, 2 replies; 25+ messages in thread
From: Steven Rostedt @ 2005-12-30 22:09 UTC (permalink / raw)
To: Mitchell Blank Jr; +Cc: LKML, Ingo Molnar, Andrew Morton
On Fri, 2005-12-30 at 13:55 -0800, Mitchell Blank Jr wrote:
> Steven Rostedt wrote:
> > I've added a global remove_proc_lock to protect this section of code. I
> > was going to add a lock to proc_dir_entry so that the locking is only
> > cut down to the same parent, but since this function is called so
> > infrequently, why waste more memory then is needed. One global lock
> > should not cause too much of a headache here.
>
> Are you sure that it's the only place where we need guard ->subdir? It
> looks like proc_lookup() and proc_readdir() use the BLK when walking that
> list, so probably the best fix would be to use that lock everywhere else
> ->subdir is touched
Good point.
God, we should be getting rid of the stupid BKL, not add more. But
seeing that this is what is used to protect that list, I guess I'll add
it.
I'm also assuming that interrupt context wont use this.
-- Steve
Index: linux-2.6.15-rc7/fs/proc/generic.c
===================================================================
--- linux-2.6.15-rc7.orig/fs/proc/generic.c 2005-12-30 14:19:39.000000000 -0500
+++ linux-2.6.15-rc7/fs/proc/generic.c 2005-12-30 17:05:56.000000000 -0500
@@ -693,6 +693,8 @@
if (!parent && xlate_proc_name(name, &parent, &fn) != 0)
goto out;
len = strlen(fn);
+
+ lock_kernel();
for (p = &parent->subdir; *p; p=&(*p)->next ) {
if (!proc_match(len, fn, *p))
continue;
@@ -713,6 +715,7 @@
}
break;
}
+ unlock_kernel();
out:
return;
}
^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH] protect remove_proc_entry
2005-12-30 21:55 ` Mitchell Blank Jr
2005-12-30 22:09 ` Steven Rostedt
@ 2005-12-30 22:11 ` Steven Rostedt
1 sibling, 0 replies; 25+ messages in thread
From: Steven Rostedt @ 2005-12-30 22:11 UTC (permalink / raw)
To: Mitchell Blank Jr; +Cc: LKML, Ingo Molnar, Andrew Morton
On Fri, 2005-12-30 at 13:55 -0800, Mitchell Blank Jr wrote:
> Steven Rostedt wrote:
> > I've added a global remove_proc_lock to protect this section of code. I
> > was going to add a lock to proc_dir_entry so that the locking is only
> > cut down to the same parent, but since this function is called so
> > infrequently, why waste more memory then is needed. One global lock
> > should not cause too much of a headache here.
>
> Are you sure that it's the only place where we need guard ->subdir? It
> looks like proc_lookup() and proc_readdir() use the BLK when walking that
> list, so probably the best fix would be to use that lock everywhere else
> ->subdir is touched
Perhaps this is a good candidate to have the BKL removed from this
protection and replaced with a spin lock or something else. If I
remember, I'll look into that further.
-- Steve
^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH] protect remove_proc_entry
2005-12-30 22:09 ` Steven Rostedt
@ 2005-12-30 22:18 ` Steven Rostedt
2006-01-04 9:21 ` Andrew Morton
2006-01-07 11:25 ` Andrew Morton
1 sibling, 1 reply; 25+ messages in thread
From: Steven Rostedt @ 2005-12-30 22:18 UTC (permalink / raw)
To: Mitchell Blank Jr; +Cc: Andrew Morton, Ingo Molnar, LKML
On Fri, 2005-12-30 at 17:09 -0500, Steven Rostedt wrote:
> Index: linux-2.6.15-rc7/fs/proc/generic.c
> ===================================================================
> --- linux-2.6.15-rc7.orig/fs/proc/generic.c 2005-12-30 14:19:39.000000000 -0500
> +++ linux-2.6.15-rc7/fs/proc/generic.c 2005-12-30 17:05:56.000000000 -0500
> @@ -693,6 +693,8 @@
> if (!parent && xlate_proc_name(name, &parent, &fn) != 0)
> goto out;
> len = strlen(fn);
> +
> + lock_kernel();
> for (p = &parent->subdir; *p; p=&(*p)->next ) {
> if (!proc_match(len, fn, *p))
> continue;
> @@ -713,6 +715,7 @@
> }
> break;
> }
> + unlock_kernel();
> out:
> return;
> }
>
FYI, to make sure that this solves the problem, I'm removing my locking
in my kernel and using this instead. It usually crashes in a day or
two, so I can say this works if it makes it three days.
-- Steve
^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH] protect remove_proc_entry
2005-12-30 21:28 ` [PATCH] protect remove_proc_entry Steven Rostedt
2005-12-30 21:34 ` Daniel Walker
2005-12-30 21:55 ` Mitchell Blank Jr
@ 2005-12-30 23:46 ` Andrew Morton
2005-12-31 6:58 ` Steven Rostedt
` (2 more replies)
2006-01-07 11:36 ` Andrew Morton
3 siblings, 3 replies; 25+ messages in thread
From: Andrew Morton @ 2005-12-30 23:46 UTC (permalink / raw)
To: Steven Rostedt; +Cc: linux-kernel, mingo
Steven Rostedt <rostedt@goodmis.org> wrote:
>
> +static DEFINE_SPINLOCK(remove_proc_lock);
>
I'll take a closer look at this next week.
The official way of protecting the contents of a directory from concurrent
lookup or modification is to take its i_sem. But procfs is totally weird
and that approach may well not be practical here. We'd certainly prefer
not to rely upon lock_kernel().
^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH] protect remove_proc_entry
2005-12-30 23:46 ` Andrew Morton
@ 2005-12-31 6:58 ` Steven Rostedt
2005-12-31 8:34 ` Arjan van de Ven
2005-12-31 8:53 ` Kirill Korotaev
2006-01-02 13:02 ` Steven Rostedt
2 siblings, 1 reply; 25+ messages in thread
From: Steven Rostedt @ 2005-12-31 6:58 UTC (permalink / raw)
To: Andrew Morton; +Cc: linux-kernel, mingo
On Fri, 30 Dec 2005, Andrew Morton wrote:
> Steven Rostedt <rostedt@goodmis.org> wrote:
> >
> > +static DEFINE_SPINLOCK(remove_proc_lock);
> >
>
> I'll take a closer look at this next week.
>
> The official way of protecting the contents of a directory from concurrent
> lookup or modification is to take its i_sem. But procfs is totally weird
> and that approach may well not be practical here. We'd certainly prefer
> not to rely upon lock_kernel().
>
Yeah, I thought about using the sem (or mutex ;) but remove_proc_entry is
used so darn much around the kernel, I wasn't sure it's not used in irq
context. So I'm not sure I like the idea of making a non-scheduling
function schedule. But it may not be a problem and it could very well be
ok to schedule here.
Also as Mitchell Blank pointed out, this list should probably be protected
everywhere by the same protection used, and an analysis should be done to
see what replacing the BKL affects.
-- Steve
^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH] protect remove_proc_entry
2005-12-31 6:58 ` Steven Rostedt
@ 2005-12-31 8:34 ` Arjan van de Ven
0 siblings, 0 replies; 25+ messages in thread
From: Arjan van de Ven @ 2005-12-31 8:34 UTC (permalink / raw)
To: Steven Rostedt; +Cc: Andrew Morton, linux-kernel, mingo
On Sat, 2005-12-31 at 01:58 -0500, Steven Rostedt wrote:
> On Fri, 30 Dec 2005, Andrew Morton wrote:
>
> > Steven Rostedt <rostedt@goodmis.org> wrote:
> > >
> > > +static DEFINE_SPINLOCK(remove_proc_lock);
> > >
> >
> > I'll take a closer look at this next week.
> >
> > The official way of protecting the contents of a directory from concurrent
> > lookup or modification is to take its i_sem. But procfs is totally weird
> > and that approach may well not be practical here. We'd certainly prefer
> > not to rely upon lock_kernel().
> >
>
> Yeah, I thought about using the sem (or mutex ;) but remove_proc_entry is
> used so darn much around the kernel, I wasn't sure it's not used in irq
> context.
it can't be; "anything-VFS" like this can sleep if the file is busy etc
etc.
^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH] protect remove_proc_entry
2005-12-30 23:46 ` Andrew Morton
2005-12-31 6:58 ` Steven Rostedt
@ 2005-12-31 8:53 ` Kirill Korotaev
2006-01-04 9:36 ` Andrew Morton
2006-01-02 13:02 ` Steven Rostedt
2 siblings, 1 reply; 25+ messages in thread
From: Kirill Korotaev @ 2005-12-31 8:53 UTC (permalink / raw)
To: Andrew Morton; +Cc: Steven Rostedt, linux-kernel, mingo
[-- Attachment #1: Type: text/plain, Size: 1407 bytes --]
Hi Andrew,
I have a full patch for this.
I don't remember the details yet, but lock was not god here, we used
semaphore. I pointed to this problem long ago when fixed error path in
proc with moduleget.
This patch protects proc_dir_entry tree with a proc_tree_sem semaphore.
I suppose lock_kernel() can be removed later after checking that no proc
handlers require it.
Also this patch remakes de refcounters a bit making it more clear and
more similar to dentry scheme - this is required to make sure that
everything works correctly.
Patch is against 2.6.15-rcX and was tested for about a week. Also works
half a year on 2.6.8 :)
Signed-Off-By: Pavel Emelianov <xemul@sw.ru>
Signed-Off-By: Kirill Korotaev <dev@openvz.org>
Kirill
> Steven Rostedt <rostedt@goodmis.org> wrote:
>> +static DEFINE_SPINLOCK(remove_proc_lock);
>>
>
> I'll take a closer look at this next week.
>
> The official way of protecting the contents of a directory from concurrent
> lookup or modification is to take its i_sem. But procfs is totally weird
> and that approach may well not be practical here. We'd certainly prefer
> not to rely upon lock_kernel().
> -
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at http://www.tux.org/lkml/
>
[-- Attachment #2: diff-ms-proc-locks-20051231 --]
[-- Type: text/plain, Size: 7112 bytes --]
--- ./fs/proc/generic.c.proclk 2005-12-26 13:43:14.000000000 +0300
+++ ./fs/proc/generic.c 2005-12-31 11:48:16.000000000 +0300
@@ -27,6 +27,8 @@ static ssize_t proc_file_write(struct fi
size_t count, loff_t *ppos);
static loff_t proc_file_lseek(struct file *, loff_t, int);
+static DECLARE_RWSEM(proc_tree_sem);
+
int proc_match(int len, const char *name, struct proc_dir_entry *de)
{
if (de->namelen != len)
@@ -381,6 +383,7 @@ struct dentry *proc_lookup(struct inode
lock_kernel();
de = PDE(dir);
if (de) {
+ down_read(&proc_tree_sem);
for (de = de->subdir; de ; de = de->next) {
if (de->namelen != dentry->d_name.len)
continue;
@@ -392,6 +395,7 @@ struct dentry *proc_lookup(struct inode
break;
}
}
+ up_read(&proc_tree_sem);
}
unlock_kernel();
@@ -446,12 +450,13 @@ int proc_readdir(struct file * filp,
filp->f_pos++;
/* fall through */
default:
+ down_read(&proc_tree_sem);
de = de->subdir;
i -= 2;
for (;;) {
if (!de) {
ret = 1;
- goto out;
+ goto out_up;
}
if (!i)
break;
@@ -462,14 +467,18 @@ int proc_readdir(struct file * filp,
do {
if (filldir(dirent, de->name, de->namelen, filp->f_pos,
de->low_ino, de->mode >> 12) < 0)
- goto out;
+ goto out_up;
filp->f_pos++;
de = de->next;
} while (de);
+ up_read(&proc_tree_sem);
}
ret = 1;
out: unlock_kernel();
return ret;
+out_up:
+ up_read(&proc_tree_sem);
+ goto out;
}
/*
@@ -517,6 +526,7 @@ static int proc_register(struct proc_dir
if (dp->proc_iops == NULL)
dp->proc_iops = &proc_file_inode_operations;
}
+ de_get(dir);
return 0;
}
@@ -576,6 +586,7 @@ static struct proc_dir_entry *proc_creat
memset(ent, 0, sizeof(struct proc_dir_entry));
memcpy(((char *) ent) + sizeof(struct proc_dir_entry), fn, len + 1);
+ atomic_set(&ent->count, 1);
ent->name = ((char *) ent) + sizeof(*ent);
ent->namelen = len;
ent->mode = mode;
@@ -589,6 +600,7 @@ struct proc_dir_entry *proc_symlink(cons
{
struct proc_dir_entry *ent;
+ down_write(&proc_tree_sem);
ent = proc_create(&parent,name,
(S_IFLNK | S_IRUGO | S_IWUGO | S_IXUGO),1);
@@ -606,6 +618,7 @@ struct proc_dir_entry *proc_symlink(cons
ent = NULL;
}
}
+ up_write(&proc_tree_sem);
return ent;
}
@@ -614,6 +627,7 @@ struct proc_dir_entry *proc_mkdir_mode(c
{
struct proc_dir_entry *ent;
+ down_write(&proc_tree_sem);
ent = proc_create(&parent, name, S_IFDIR | mode, 2);
if (ent) {
ent->proc_fops = &proc_dir_operations;
@@ -624,6 +638,7 @@ struct proc_dir_entry *proc_mkdir_mode(c
ent = NULL;
}
}
+ up_write(&proc_tree_sem);
return ent;
}
@@ -633,7 +648,7 @@ struct proc_dir_entry *proc_mkdir(const
return proc_mkdir_mode(name, S_IRUGO | S_IXUGO, parent);
}
-struct proc_dir_entry *create_proc_entry(const char *name, mode_t mode,
+static struct proc_dir_entry *__create_proc_entry(const char *name, mode_t mode,
struct proc_dir_entry *parent)
{
struct proc_dir_entry *ent;
@@ -665,6 +680,17 @@ struct proc_dir_entry *create_proc_entry
return ent;
}
+struct proc_dir_entry *create_proc_entry(const char *name, mode_t mode,
+ struct proc_dir_entry *parent)
+{
+ struct proc_dir_entry *ent;
+
+ down_write(&proc_tree_sem);
+ ent = __create_proc_entry(name, mode, parent);
+ up_write(&proc_tree_sem);
+ return ent;
+}
+
void free_proc_entry(struct proc_dir_entry *de)
{
unsigned int ino = de->low_ino;
@@ -683,15 +709,13 @@ void free_proc_entry(struct proc_dir_ent
* Remove a /proc entry and free it if it's not currently in use.
* If it is in use, we set the 'deleted' flag.
*/
-void remove_proc_entry(const char *name, struct proc_dir_entry *parent)
+static void __remove_proc_entry(const char *name, struct proc_dir_entry *parent)
{
struct proc_dir_entry **p;
struct proc_dir_entry *de;
const char *fn = name;
int len;
- if (!parent && xlate_proc_name(name, &parent, &fn) != 0)
- goto out;
len = strlen(fn);
for (p = &parent->subdir; *p; p=&(*p)->next ) {
if (!proc_match(len, fn, *p))
@@ -699,20 +723,24 @@ void remove_proc_entry(const char *name,
de = *p;
*p = de->next;
de->next = NULL;
+ de_put(parent);
if (S_ISDIR(de->mode))
parent->nlink--;
proc_kill_inodes(de);
de->nlink = 0;
WARN_ON(de->subdir);
- if (!atomic_read(&de->count))
- free_proc_entry(de);
- else {
- de->deleted = 1;
- printk("remove_proc_entry: %s/%s busy, count=%d\n",
- parent->name, de->name, atomic_read(&de->count));
- }
+ de->deleted = 1;
+ de_put(de);
break;
}
-out:
- return;
+}
+
+void remove_proc_entry(const char *name, struct proc_dir_entry *parent)
+{
+ const char *fn = name;
+
+ down_write(&proc_tree_sem);
+ if (parent || xlate_proc_name(name, &parent, &fn) == 0)
+ __remove_proc_entry(name, parent);
+ up_write(&proc_tree_sem);
}
--- ./fs/proc/inode.c.proclk 2005-12-26 13:43:14.000000000 +0300
+++ ./fs/proc/inode.c 2005-12-31 11:48:16.000000000 +0300
@@ -21,34 +21,25 @@
extern void free_proc_entry(struct proc_dir_entry *);
-static inline struct proc_dir_entry * de_get(struct proc_dir_entry *de)
-{
- if (de)
- atomic_inc(&de->count);
- return de;
-}
-
/*
* Decrements the use count and checks for deferred deletion.
*/
-static void de_put(struct proc_dir_entry *de)
+void de_put(struct proc_dir_entry *de)
{
if (de) {
- lock_kernel();
if (!atomic_read(&de->count)) {
printk("de_put: entry %s already free!\n", de->name);
- unlock_kernel();
return;
}
if (atomic_dec_and_test(&de->count)) {
- if (de->deleted) {
- printk("de_put: deferred delete of %s\n",
- de->name);
- free_proc_entry(de);
+ if (!de->deleted) {
+ printk("de_put: entry %s is not removed yet\n",
+ de->name);
+ return;
}
- }
- unlock_kernel();
+ free_proc_entry(de);
+ }
}
}
--- ./include/linux/proc_fs.h.proclk 2005-12-26 13:43:16.000000000 +0300
+++ ./include/linux/proc_fs.h 2005-12-31 11:48:16.000000000 +0300
@@ -69,6 +69,14 @@ struct proc_dir_entry {
void *set;
};
+extern void de_put(struct proc_dir_entry *);
+static inline struct proc_dir_entry *de_get(struct proc_dir_entry *de)
+{
+ if (de)
+ atomic_inc(&de->count);
+ return de;
+}
+
struct kcore_list {
struct kcore_list *next;
unsigned long addr;
--- ./kernel/sysctl.c.proclk 2005-12-26 13:43:16.000000000 +0300
+++ ./kernel/sysctl.c 2005-12-31 11:48:37.000000000 +0300
@@ -1396,19 +1396,15 @@ static void unregister_proc_table(ctl_ta
continue;
}
- /*
- * In any case, mark the entry as goner; we'll keep it
- * around if it's busy, but we'll know to do nothing with
- * its fields. We are under sysctl_lock here.
- */
de->data = NULL;
-
- /* Don't unregister proc entries that are still being used.. */
- if (atomic_read(&de->count))
- continue;
-
table->de = NULL;
+ /*
+ * sys_sysctl can't find us, since we are removed from list.
+ * proc won't touch either, since de->data is NULL.
+ */
+ spin_unlock(&sysctl_lock);
remove_proc_entry(table->procname, root);
+ spin_lock(&sysctl_lock);
}
}
^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH] protect remove_proc_entry
2005-12-30 23:46 ` Andrew Morton
2005-12-31 6:58 ` Steven Rostedt
2005-12-31 8:53 ` Kirill Korotaev
@ 2006-01-02 13:02 ` Steven Rostedt
2 siblings, 0 replies; 25+ messages in thread
From: Steven Rostedt @ 2006-01-02 13:02 UTC (permalink / raw)
To: Andrew Morton; +Cc: linux-kernel, mingo
On Fri, 2005-12-30 at 15:46 -0800, Andrew Morton wrote:
> Steven Rostedt <rostedt@goodmis.org> wrote:
> >
> > +static DEFINE_SPINLOCK(remove_proc_lock);
> >
>
> I'll take a closer look at this next week.
>
> The official way of protecting the contents of a directory from concurrent
> lookup or modification is to take its i_sem. But procfs is totally weird
> and that approach may well not be practical here. We'd certainly prefer
> not to rely upon lock_kernel().
FWIW,
My test that would crash within two days has been running for three days
now with the lock_kernel patch. So, at least this fixes the problem,
whether we use another locking or not, it's good to know what to fix.
-- Steve
^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH] protect remove_proc_entry
2005-12-30 22:18 ` Steven Rostedt
@ 2006-01-04 9:21 ` Andrew Morton
2006-01-04 12:18 ` Steven Rostedt
0 siblings, 1 reply; 25+ messages in thread
From: Andrew Morton @ 2006-01-04 9:21 UTC (permalink / raw)
To: Steven Rostedt; +Cc: mitch, mingo, linux-kernel
Steven Rostedt <rostedt@goodmis.org> wrote:
>
> On Fri, 2005-12-30 at 17:09 -0500, Steven Rostedt wrote:
>
> > Index: linux-2.6.15-rc7/fs/proc/generic.c
> > ===================================================================
> > --- linux-2.6.15-rc7.orig/fs/proc/generic.c 2005-12-30 14:19:39.000000000 -0500
> > +++ linux-2.6.15-rc7/fs/proc/generic.c 2005-12-30 17:05:56.000000000 -0500
> > @@ -693,6 +693,8 @@
> > if (!parent && xlate_proc_name(name, &parent, &fn) != 0)
> > goto out;
> > len = strlen(fn);
> > +
> > + lock_kernel();
> > for (p = &parent->subdir; *p; p=&(*p)->next ) {
> > if (!proc_match(len, fn, *p))
> > continue;
> > @@ -713,6 +715,7 @@
> > }
> > break;
> > }
> > + unlock_kernel();
> > out:
> > return;
> > }
> >
>
> FYI, to make sure that this solves the problem, I'm removing my locking
> in my kernel and using this instead. It usually crashes in a day or
> two, so I can say this works if it makes it three days.
>
I guess the lock_kernel() approach is the way to go. Fixing a race and
de-BKLing procfs are separate exercises...
Did the patch work?
^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH] protect remove_proc_entry
2005-12-31 8:53 ` Kirill Korotaev
@ 2006-01-04 9:36 ` Andrew Morton
2006-01-04 11:27 ` Kirill Korotaev
0 siblings, 1 reply; 25+ messages in thread
From: Andrew Morton @ 2006-01-04 9:36 UTC (permalink / raw)
To: Kirill Korotaev; +Cc: rostedt, linux-kernel, mingo
Kirill Korotaev <dev@sw.ru> wrote:
>
> Hi Andrew,
>
> I have a full patch for this.
Please don't top-post. It makes things hard...
> I don't remember the details yet, but lock was not god here, we used
> semaphore. I pointed to this problem long ago when fixed error path in
> proc with moduleget.
>
> This patch protects proc_dir_entry tree with a proc_tree_sem semaphore.
> I suppose lock_kernel() can be removed later after checking that no proc
> handlers require it.
> Also this patch remakes de refcounters a bit making it more clear and
> more similar to dentry scheme - this is required to make sure that
> everything works correctly.
>
> Patch is against 2.6.15-rcX and was tested for about a week. Also works
> half a year on 2.6.8 :)
>
> [ patch which uses an rwsem for procfs and somewhat removes lock_kernel() ]
>
I worry about replacing a spinlock with a sleeping lock. In some
circumstances it can cause a complete scalability collapse and I suspect
this could happen with /proc. Although I guess the only fastpath here is
proc_readdir(), and as the lock is taken there for reading, we'll be OK..
The patch does leave some lock_kernel() calls behind. If we're going to do
this, I think they should all be removed?
Races in /proc have been plentiful and hard to find. The patch worries me,
frankly. I'd like to see quite a bit more description of the locking
schema and some demonstration that it's actually complete before taking the
plunge.
^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH] protect remove_proc_entry
2006-01-04 9:36 ` Andrew Morton
@ 2006-01-04 11:27 ` Kirill Korotaev
0 siblings, 0 replies; 25+ messages in thread
From: Kirill Korotaev @ 2006-01-04 11:27 UTC (permalink / raw)
To: Andrew Morton; +Cc: rostedt, linux-kernel, mingo
>>Hi Andrew,
>>
>>I have a full patch for this.
>
>
> Please don't top-post. It makes things hard...
do you prefer separate mails with patch and with reference to original
report? will do so.
>>I don't remember the details yet, but lock was not god here, we used
>>semaphore. I pointed to this problem long ago when fixed error path in
>>proc with moduleget.
>>
>>This patch protects proc_dir_entry tree with a proc_tree_sem semaphore.
>>I suppose lock_kernel() can be removed later after checking that no proc
>>handlers require it.
>>Also this patch remakes de refcounters a bit making it more clear and
>>more similar to dentry scheme - this is required to make sure that
>>everything works correctly.
>>
>>Patch is against 2.6.15-rcX and was tested for about a week. Also works
>>half a year on 2.6.8 :)
>>
>>[ patch which uses an rwsem for procfs and somewhat removes lock_kernel() ]
>>
>
>
> I worry about replacing a spinlock with a sleeping lock. In some
> circumstances it can cause a complete scalability collapse and I suspect
> this could happen with /proc. Although I guess the only fastpath here is
> proc_readdir(), and as the lock is taken there for reading, we'll be OK..
>
> The patch does leave some lock_kernel() calls behind. If we're going to do
> this, I think they should all be removed?
>
> Races in /proc have been plentiful and hard to find. The patch worries me,
> frankly. I'd like to see quite a bit more description of the locking
> schema and some demonstration that it's actually complete before taking the
> plunge.
ok, here goes some more descriptions:
1.
proc_readdir is a sleeping ops:
sys_getdents
`- vfs_readdir
`- proc_readdir
`- filldir
`- put_user/copy_to_user
that's why we must use semaphore. of course spinlock can be used too,
but this will case another problem: it must be dropped to call filldir, but
beofre it will be retaken the dentry being filldir-ed may be removed and
we won't see it's siblings in output.
2.
lock_kernel() is needed to handle at least simultaneous sys_read vs
create_proc_entry with sequental de->proc_fops = XXX. this can be
handled by passing fops directly to create_proc_entry.
i.e. there is a 3rd problem I pointed to you before:
create_proc_entry() and setting of de->owner/de->proc_fops is inatomic.
lock_kernel() is not a best way to protect against this, sure...
I would prefer to fix it with a separate patch somehow...
3.
refcounting:
each proc_dir_entry's refcounter is the reference from inode plus
references from children. once this count reaches zero - dentry is freed.
so on each proc_register() parent is get-ed, on each remove_proc_entry
parent is put-ed.
Kirill
^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH] protect remove_proc_entry
2006-01-04 9:21 ` Andrew Morton
@ 2006-01-04 12:18 ` Steven Rostedt
2006-01-05 1:48 ` Mitchell Blank Jr
0 siblings, 1 reply; 25+ messages in thread
From: Steven Rostedt @ 2006-01-04 12:18 UTC (permalink / raw)
To: Andrew Morton; +Cc: mitch, mingo, linux-kernel
On Wed, 4 Jan 2006, Andrew Morton wrote:
> Steven Rostedt <rostedt@goodmis.org> wrote:
> >
> > FYI, to make sure that this solves the problem, I'm removing my locking
> > in my kernel and using this instead. It usually crashes in a day or
> > two, so I can say this works if it makes it three days.
> >
>
> I guess the lock_kernel() approach is the way to go. Fixing a race and
> de-BKLing procfs are separate exercises...
>
> Did the patch work?
>
Sorry, I forgot to respond, because the test is still running ;)
So yes, it not only ran for three days, it ran for six. I just killed it.
-- Steve
^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH] protect remove_proc_entry
2006-01-04 12:18 ` Steven Rostedt
@ 2006-01-05 1:48 ` Mitchell Blank Jr
0 siblings, 0 replies; 25+ messages in thread
From: Mitchell Blank Jr @ 2006-01-05 1:48 UTC (permalink / raw)
To: Steven Rostedt; +Cc: Andrew Morton, mingo, linux-kernel
Steven Rostedt wrote:
> > I guess the lock_kernel() approach is the way to go. Fixing a race and
> > de-BKLing procfs are separate exercises...
> >
> > Did the patch work?
>
> Sorry, I forgot to respond, because the test is still running ;)
>
> So yes, it not only ran for three days, it ran for six. I just killed it.
Have you looked at the other paths that touch ->subdir? Namely:
proc_devtree.c:
proc_device_tree_add_node() -- plays games with ->subdir directly
generic.c:
proc_create() -- calls xlate_proc_name() which touches ->subdir and
should therefore probably be called under BKL
proc_register() -- both the call to xlate_proc_name() and the
following accesses to ->subdir should be under BKL
-Mitch
^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH] protect remove_proc_entry
2005-12-30 22:09 ` Steven Rostedt
2005-12-30 22:18 ` Steven Rostedt
@ 2006-01-07 11:25 ` Andrew Morton
1 sibling, 0 replies; 25+ messages in thread
From: Andrew Morton @ 2006-01-07 11:25 UTC (permalink / raw)
To: Steven Rostedt; +Cc: mitch, linux-kernel, mingo
Steven Rostedt <rostedt@goodmis.org> wrote:
>
> God, we should be getting rid of the stupid BKL, not add more. But
> seeing that this is what is used to protect that list, I guess I'll add
> it.
>
> I'm also assuming that interrupt context wont use this.
>
> -- Steve
>
> Index: linux-2.6.15-rc7/fs/proc/generic.c
> ===================================================================
> --- linux-2.6.15-rc7.orig/fs/proc/generic.c 2005-12-30 14:19:39.000000000 -0500
> +++ linux-2.6.15-rc7/fs/proc/generic.c 2005-12-30 17:05:56.000000000 -0500
> @@ -693,6 +693,8 @@
> if (!parent && xlate_proc_name(name, &parent, &fn) != 0)
> goto out;
> len = strlen(fn);
> +
> + lock_kernel();
> for (p = &parent->subdir; *p; p=&(*p)->next ) {
> if (!proc_match(len, fn, *p))
> continue;
> @@ -713,6 +715,7 @@
> }
> break;
> }
> + unlock_kernel();
> out:
> return;
> }
OK, we're kind of screwed here.
Debug: sleeping function called from invalid context at include/asm/semaphore.h:105
in_atomic():1, irqs_disabled():0
Call Trace:<ffffffff8012689b>{__might_sleep+190} <ffffffff803e83ce>{lock_kernel+53}
<ffffffff801a6db2>{remove_proc_entry+74} <ffffffff8016b295>{poison_obj+58}
<ffffffff80134ee5>{unregister_proc_table+121} <ffffffff80134eb6>{unregister_proc_table+74}
<ffffffff80134eb6>{unregister_proc_table+74} <ffffffff80134eb6>{unregister_proc_table+74}
<ffffffff80134eb6>{unregister_proc_table+74} <ffffffff80134fdb>{unregister_sysctl_table+232}
<ffffffff803a2a0e>{ip_mc_dec_group+181} <ffffffff803e802e>{_write_lock_irqsave+32}
<ffffffff80133dc2>{local_bh_enable+114} <ffffffff803e82b3>{_write_unlock_bh+24}
<ffffffff8039ebfe>{devinet_sysctl_unregister+31} <ffffffff8039ecc1>{inetdev_destroy+171}
<ffffffff8039f1c1>{inet_del_ifa+509} <ffffffff8039f2dc>{inet_rtm_deladdr+268}
<ffffffff8036efea>{rtnetlink_rcv_msg+437} <ffffffff803761c3>{netlink_run_queue+140}
<ffffffff8036ee35>{rtnetlink_rcv_msg+0} <ffffffff8036f032>{rtnetlink_rcv+41}
<ffffffff803758af>{netlink_data_ready+23} <ffffffff80374d77>{netlink_sendskb+41}
<ffffffff80374ff4>{netlink_unicast+539} <ffffffff80375881>{netlink_sendmsg+667}
<ffffffff8035bedf>{sock_sendmsg+232} <ffffffff803e8203>{_read_unlock_irq+20}
<ffffffff80142e90>{autoremove_wake_function+0} <ffffffff80171a1e>{fget+157}
<ffffffff80142e90>{autoremove_wake_function+0} <ffffffff80171a1e>{fget+157}
<ffffffff8035bbea>{sockfd_lookup+18} <ffffffff8035d408>{sys_sendto+246}
<ffffffff803e80cc>{_spin_unlock_irqrestore+27} <ffffffff80238031>{__up_write+371}
<ffffffff8010db46>{system_call+126}
Because CONFIG_PREEMPT_BKL makes lock_kernel do down() and some callers of
remove_proc_entry() do it from inside spinlock.
And the spinlocking variant of lock_kernel() is also pretty much illegal,
because (in this case) whatever lock networking has taken may be taken
elsewhere inside lock_kernel(), so we have an ab/ba deadlock.
Best I can think of is that we need a private spinlock for this list.
^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH] protect remove_proc_entry
2005-12-30 21:28 ` [PATCH] protect remove_proc_entry Steven Rostedt
` (2 preceding siblings ...)
2005-12-30 23:46 ` Andrew Morton
@ 2006-01-07 11:36 ` Andrew Morton
2006-01-07 12:04 ` Steven Rostedt
2006-01-09 19:16 ` Steven Rostedt
3 siblings, 2 replies; 25+ messages in thread
From: Andrew Morton @ 2006-01-07 11:36 UTC (permalink / raw)
To: Steven Rostedt; +Cc: linux-kernel, mingo
Steven Rostedt <rostedt@goodmis.org> wrote:
>
> Working on a custom kernel that adds and removes proc entries quite a
> bit, I discovered that remove_proc_entry is not protected against
> multiple threads removing entries belonging to the same parent. At
> first I thought that this is only a problem with my changes, but after
> inspecting the vanilla kernel, I see that there's several places that
> calls remove_proc_entry with the same parent (most noticeably
> /proc/drivers).
>
> I've added a global remove_proc_lock to protect this section of code. I
> was going to add a lock to proc_dir_entry so that the locking is only
> cut down to the same parent, but since this function is called so
> infrequently, why waste more memory then is needed. One global lock
> should not cause too much of a headache here.
>
> I'm not sure if remove_proc_entry is called from interrupt context, so I
> did a irqsave just in case.
>
> -- Steve
>
>
> Index: linux-2.6.15-rc7/fs/proc/generic.c
> ===================================================================
> --- linux-2.6.15-rc7.orig/fs/proc/generic.c 2005-12-30 14:19:39.000000000 -0500
> +++ linux-2.6.15-rc7/fs/proc/generic.c 2005-12-30 16:18:42.000000000 -0500
> @@ -19,6 +19,7 @@
> #include <linux/idr.h>
> #include <linux/namei.h>
> #include <linux/bitops.h>
> +#include <linux/spinlock.h>
> #include <asm/uaccess.h>
>
> static ssize_t proc_file_read(struct file *file, char __user *buf,
> @@ -27,6 +28,8 @@
> size_t count, loff_t *ppos);
> static loff_t proc_file_lseek(struct file *, loff_t, int);
>
> +static DEFINE_SPINLOCK(remove_proc_lock);
> +
> int proc_match(int len, const char *name, struct proc_dir_entry *de)
> {
> if (de->namelen != len)
> @@ -689,10 +692,13 @@
> struct proc_dir_entry *de;
> const char *fn = name;
> int len;
> + unsigned long flags;
>
> if (!parent && xlate_proc_name(name, &parent, &fn) != 0)
> goto out;
> len = strlen(fn);
> +
> + spin_lock_irqsave(&remove_proc_lock, flags);
> for (p = &parent->subdir; *p; p=&(*p)->next ) {
> if (!proc_match(len, fn, *p))
> continue;
> @@ -713,6 +719,7 @@
> }
> break;
> }
> + spin_unlock_irqrestore(&remove_proc_lock, flags);
> out:
> return;
> }
Aren't there other places where we need to take this lock? Code which
traverses that list, code which adds things to it?
^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH] protect remove_proc_entry
2006-01-07 11:36 ` Andrew Morton
@ 2006-01-07 12:04 ` Steven Rostedt
2006-01-09 19:16 ` Steven Rostedt
1 sibling, 0 replies; 25+ messages in thread
From: Steven Rostedt @ 2006-01-07 12:04 UTC (permalink / raw)
To: Andrew Morton; +Cc: linux-kernel, mingo
On Sat, 7 Jan 2006, Andrew Morton wrote:
>
> Aren't there other places where we need to take this lock? Code which
> traverses that list, code which adds things to it?
>
Yeah, that patch was just a quick fix. I'll look more into that on
Monday. (My wife has too many chores for me this weekend ;)
-- Steve
^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH] protect remove_proc_entry
2006-01-07 11:36 ` Andrew Morton
2006-01-07 12:04 ` Steven Rostedt
@ 2006-01-09 19:16 ` Steven Rostedt
2006-01-10 0:59 ` Steven Rostedt
2006-01-10 13:26 ` Steven Rostedt
1 sibling, 2 replies; 25+ messages in thread
From: Steven Rostedt @ 2006-01-09 19:16 UTC (permalink / raw)
To: Andrew Morton; +Cc: linux-kernel, mingo
On Sat, 2006-01-07 at 03:36 -0800, Andrew Morton wrote:
>
> Aren't there other places where we need to take this lock? Code which
> traverses that list, code which adds things to it?
>
Andrew,
How's this patch look? I tested this with the following module:
http://www.kihontech.com/tests/proc/proc_stress.c
Without the patch, I could hang the processes (the processes would
either crash, or just get stuck spinning inside the list). With the
patch, the module ran to completion each time.
I don't believe any of the calls are called from interrupt context, so I
held off from using the _irqsave versions of spin_lock.
-- Steve
Index: linux-2.6.15/fs/proc/generic.c
===================================================================
--- linux-2.6.15.orig/fs/proc/generic.c 2006-01-09 13:58:23.000000000 -0500
+++ linux-2.6.15/fs/proc/generic.c 2006-01-09 13:58:34.000000000 -0500
@@ -19,6 +19,7 @@
#include <linux/idr.h>
#include <linux/namei.h>
#include <linux/bitops.h>
+#include <linux/spinlock.h>
#include <asm/uaccess.h>
static ssize_t proc_file_read(struct file *file, char __user *buf,
@@ -27,6 +28,8 @@
size_t count, loff_t *ppos);
static loff_t proc_file_lseek(struct file *, loff_t, int);
+DEFINE_SPINLOCK(proc_subdir_lock);
+
int proc_match(int len, const char *name, struct proc_dir_entry *de)
{
if (de->namelen != len)
@@ -275,7 +278,9 @@
const char *cp = name, *next;
struct proc_dir_entry *de;
int len;
+ int rtn = 0;
+ spin_lock(&proc_subdir_lock);
de = &proc_root;
while (1) {
next = strchr(cp, '/');
@@ -287,13 +292,17 @@
if (proc_match(len, cp, de))
break;
}
- if (!de)
- return -ENOENT;
+ if (!de) {
+ rtn = -ENOENT;
+ goto out;
+ }
cp += len + 1;
}
*residual = cp;
*ret = de;
- return 0;
+out:
+ spin_unlock(&proc_subdir_lock);
+ return rtn;
}
static DEFINE_IDR(proc_inum_idr);
@@ -378,6 +387,7 @@
int error = -ENOENT;
lock_kernel();
+ spin_lock(&proc_subdir_lock);
de = PDE(dir);
if (de) {
for (de = de->subdir; de ; de = de->next) {
@@ -386,12 +396,15 @@
if (!memcmp(dentry->d_name.name, de->name, de->namelen)) {
unsigned int ino = de->low_ino;
+ spin_unlock(&proc_subdir_lock);
error = -EINVAL;
inode = proc_get_inode(dir->i_sb, ino, de);
+ spin_lock(&proc_subdir_lock);
break;
}
}
}
+ spin_unlock(&proc_subdir_lock);
unlock_kernel();
if (inode) {
@@ -445,11 +458,13 @@
filp->f_pos++;
/* fall through */
default:
+ spin_lock(&proc_subdir_lock);
de = de->subdir;
i -= 2;
for (;;) {
if (!de) {
ret = 1;
+ spin_unlock(&proc_subdir_lock);
goto out;
}
if (!i)
@@ -459,12 +474,16 @@
}
do {
+ /* filldir passes info to user space */
+ spin_unlock(&proc_subdir_lock);
if (filldir(dirent, de->name, de->namelen, filp->f_pos,
de->low_ino, de->mode >> 12) < 0)
goto out;
+ spin_lock(&proc_subdir_lock);
filp->f_pos++;
de = de->next;
} while (de);
+ spin_unlock(&proc_subdir_lock);
}
ret = 1;
out: unlock_kernel();
@@ -498,9 +517,13 @@
if (i == 0)
return -EAGAIN;
dp->low_ino = i;
+
+ spin_lock(&proc_subdir_lock);
dp->next = dir->subdir;
dp->parent = dir;
dir->subdir = dp;
+ spin_unlock(&proc_subdir_lock);
+
if (S_ISDIR(dp->mode)) {
if (dp->proc_iops == NULL) {
dp->proc_fops = &proc_dir_operations;
@@ -692,6 +715,8 @@
if (!parent && xlate_proc_name(name, &parent, &fn) != 0)
goto out;
len = strlen(fn);
+
+ spin_lock(&proc_subdir_lock);
for (p = &parent->subdir; *p; p=&(*p)->next ) {
if (!proc_match(len, fn, *p))
continue;
@@ -712,6 +737,7 @@
}
break;
}
+ spin_unlock(&proc_subdir_lock);
out:
return;
}
Index: linux-2.6.15/fs/proc/proc_devtree.c
===================================================================
--- linux-2.6.15.orig/fs/proc/proc_devtree.c 2006-01-09 13:58:23.000000000 -0500
+++ linux-2.6.15/fs/proc/proc_devtree.c 2006-01-09 14:05:10.000000000 -0500
@@ -112,9 +112,11 @@
* properties are quite unimportant for us though, thus we
* simply "skip" them here, but we do have to check.
*/
+ spin_lock(&proc_subdir_lock);
for (ent = de->subdir; ent != NULL; ent = ent->next)
if (!strcmp(ent->name, pp->name))
break;
+ spin_unlock(&proc_subdir_lock);
if (ent != NULL) {
printk(KERN_WARNING "device-tree: property \"%s\" name"
" conflicts with node in %s\n", pp->name,
Index: linux-2.6.15/include/linux/proc_fs.h
===================================================================
--- linux-2.6.15.orig/include/linux/proc_fs.h 2006-01-09 08:59:13.000000000 -0500
+++ linux-2.6.15/include/linux/proc_fs.h 2006-01-09 14:07:09.000000000 -0500
@@ -4,6 +4,7 @@
#include <linux/config.h>
#include <linux/slab.h>
#include <linux/fs.h>
+#include <linux/spinlock.h>
#include <asm/atomic.h>
/*
@@ -92,6 +93,8 @@
extern struct proc_dir_entry *proc_root_driver;
extern struct proc_dir_entry *proc_root_kcore;
+extern spinlock_t proc_subdir_lock;
+
extern void proc_root_init(void);
extern void proc_misc_init(void);
^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH] protect remove_proc_entry
2006-01-09 19:16 ` Steven Rostedt
@ 2006-01-10 0:59 ` Steven Rostedt
2006-01-10 1:05 ` Ingo Molnar
2006-01-10 13:26 ` Steven Rostedt
1 sibling, 1 reply; 25+ messages in thread
From: Steven Rostedt @ 2006-01-10 0:59 UTC (permalink / raw)
To: Ingo Molnar; +Cc: LKML
Ingo,
FYI
I just uploaded my 2.6.15-rt2-sr3 which includes the latest patch to fix
the bug in remove_proc_entry.
http://home.stny.rr.com/rostedt/patches/patch-2.6.15-rt2-sr3
Again, the module to test this is here:
http://www.kihontech.com/tests/proc/proc_stress.c
I tested it like the following:
# insmod proc_stress.ko & for i in `seq 1 10000`; do ls /proc/proc_tests; done
-- Steve
^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH] protect remove_proc_entry
2006-01-10 0:59 ` Steven Rostedt
@ 2006-01-10 1:05 ` Ingo Molnar
0 siblings, 0 replies; 25+ messages in thread
From: Ingo Molnar @ 2006-01-10 1:05 UTC (permalink / raw)
To: Steven Rostedt; +Cc: LKML
* Steven Rostedt <rostedt@goodmis.org> wrote:
> Ingo,
>
> FYI
>
> I just uploaded my 2.6.15-rt2-sr3 which includes the latest patch to
> fix the bug in remove_proc_entry.
>
> http://home.stny.rr.com/rostedt/patches/patch-2.6.15-rt2-sr3
thanks Steve - i've applied your fixes and have uploaded 2.6.15-rt3 to
the usual place:
http://redhat.com/~mingo/realtime-preempt/
(other than the version string it is the same as -rt2-sr3.)
Ingo
^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH] protect remove_proc_entry
2006-01-09 19:16 ` Steven Rostedt
2006-01-10 0:59 ` Steven Rostedt
@ 2006-01-10 13:26 ` Steven Rostedt
1 sibling, 0 replies; 25+ messages in thread
From: Steven Rostedt @ 2006-01-10 13:26 UTC (permalink / raw)
To: Andrew Morton; +Cc: mingo, linux-kernel
Andrew, here's a better explanation of the patch as well as my
Signed-off.
Description:
It has been discovered that the remove_proc_entry has a race in the
removing of entries in the proc file system that are siblings. There's
no protection around the traversing and removing of elements that belong
in the same subdirectory.
This subdirectory list is protected in other areas by the BKL. So the
BKL was at first used to protect this area too, but unfortunately,
remove_proc_entry may be called with spinlocks held. The BKL may
schedule, so this was not a solution.
The final solution was to add a new global spin lock to protect this
list, called proc_subdir_lock. This lock now protects the list in
remove_proc_entry, and I also went around looking for other areas that
this list is modified and added this protection there too. Care must be
taken since these locations call several functions that may also
schedule.
Since I don't see any location that these functions that modify the
subdirectory list are called by interrupts, the irqsave/restore versions
of the spin lock was _not_ used.
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
-- Steve
^ permalink raw reply [flat|nested] 25+ messages in thread
end of thread, other threads:[~2006-01-10 13:27 UTC | newest]
Thread overview: 25+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2005-12-30 20:04 [Question] race condition with remove_proc_entry Steven Rostedt
2005-12-30 21:28 ` [PATCH] protect remove_proc_entry Steven Rostedt
2005-12-30 21:34 ` Daniel Walker
2005-12-30 21:55 ` Steven Rostedt
2005-12-30 21:55 ` Mitchell Blank Jr
2005-12-30 22:09 ` Steven Rostedt
2005-12-30 22:18 ` Steven Rostedt
2006-01-04 9:21 ` Andrew Morton
2006-01-04 12:18 ` Steven Rostedt
2006-01-05 1:48 ` Mitchell Blank Jr
2006-01-07 11:25 ` Andrew Morton
2005-12-30 22:11 ` Steven Rostedt
2005-12-30 23:46 ` Andrew Morton
2005-12-31 6:58 ` Steven Rostedt
2005-12-31 8:34 ` Arjan van de Ven
2005-12-31 8:53 ` Kirill Korotaev
2006-01-04 9:36 ` Andrew Morton
2006-01-04 11:27 ` Kirill Korotaev
2006-01-02 13:02 ` Steven Rostedt
2006-01-07 11:36 ` Andrew Morton
2006-01-07 12:04 ` Steven Rostedt
2006-01-09 19:16 ` Steven Rostedt
2006-01-10 0:59 ` Steven Rostedt
2006-01-10 1:05 ` Ingo Molnar
2006-01-10 13:26 ` Steven Rostedt
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).