* PROBLEM: /proc (procfs) task exit race condition causes a kernel crash
@ 2006-05-26 0:43 Tony Griffiths
2006-05-28 15:37 ` Eric W. Biederman
0 siblings, 1 reply; 4+ messages in thread
From: Tony Griffiths @ 2006-05-26 0:43 UTC (permalink / raw)
To: linux-kernel
[-- Attachment #1: Type: text/plain, Size: 5076 bytes --]
Summary:
A condition exists that crashes the kernel when one or more tasks are
exiting while, at the same time, another task is reading their /proc
entries. The crash manifests either as a bad VA dereference (NULL,
LIST_POISON1, or LIST_POISON2) in prune_dcache() or as a BUG_ON()
sanity check firing in include/linux/list.h!
Detailed Description:
If there is a great deal of modification activity in /proc caused by
task creation [fork()] and task exit, while at the same time other
task(s) are reading /proc/<pid>/... files, the dentry_unused list
becomes corrupted and the kernel crashes, usually in prune_dcache()
in fs/dcache.c! A simple program that forks itself, run in a
continuous loop, combined with a 'find /proc ... cat {} \;' to read
the /proc task entries, is all that is needed to induce the
condition. A couple of sample crash outputs look like:
(a) BUG_ON() --
------------[ cut here ]------------
kernel BUG at include/linux/list.h:167!
invalid opcode: 0000 [#1]
SMP
last sysfs file: /class/vc/vcs1/dev
Modules linked in: parport_pc lp parport autofs4 i2c_dev i2c_core
microcode binfmt_misc video thermal sony_acpi processor fan button
battery ac ehci_hcd usbcore ide_cd cdrom sg ext3 jbd dm_mod mptspi
scsi_transport_spi mptscsih mptbase sd_mod scsi_mod
CPU: 1
EIP: 0060:[<c017ec60>] Not tainted VLI
EFLAGS: 00010203 (2.6.16-mm2 #1)
EIP is at prune_dcache+0x3c6/0x3d3
eax: 00000010 ebx: f7326b08 ecx: f7326b10 edx: c017e280
esi: f7326ae0 edi: f7e81e5c ebp: 00000001 esp: f7e81e4c
ds: 007b es: 007b ss: 0068
Process init (pid: 1, threadinfo=f7e80000 task=c352eaa0)
Stack: <0>c0401c00 f7e81e5c f7e81ead c017ef28 f7e81e5c f7e81e5c f7326df8 f620c000
f7326df8 f7e81ea8 c017efe6 00000006 f7ec0e00 f620c000 c019c71a f7326df8
f7e81e98 c036a63a 000077a4 089c21d9 00000005 f7e81ea8 c0117643 32363033
Call Trace:
<c017ef28> select_parent+0x17/0xbc
<c017efe6> shrink_dcache_parent+0x19/0x2c
<c019c71a> proc_flush_task+0x5f/0x1f5
<c0117643> sched_exit+0xb1/0xc8
<c0120552> release_task+0x84/0x101
<c01028c3> handle_signal+0x108/0x143
<c0122094> wait_task_zombie+0x2de/0x3cf
<c0102984> do_signal+0x86/0x11c
<c012291d> do_wait+0x36f/0x40f
<c0119153> default_wake_function+0x0/0x12
<c01f08ec> copy_to_user+0x3c/0x50
<c0119153> default_wake_function+0x0/0x12
<c0122a8c> sys_wait4+0x3f/0x43
<c0122ab7> sys_waitpid+0x27/0x2b
<c0102b5f> syscall_call+0x7/0xb
Code: 31 ff ff ff 0f 0b a7 00 9f 69 35 c0 e9 8c fd ff ff 0f 0b a8 00 9f
69 35 c0 e9 8b fd ff ff 0f 0b a8 00 9f 69 35 c0 e9 9e fe ff ff <0f> 0b
a7 00 9f 69 35 c0 e9 85 fe ff ff 55 b8 00 1c 40 c0 57 56
(b) LIST_POISON1/LIST_POISON2 --
# Unable to handle kernel paging request at virtual address 00100104
printing eip:
c0179d12
*pde = 3780b001
Oops: 0002 [#1]
SMP
Modules linked in: parport_pc lp parport autofs4 i2c_dev i2c_core
microcode binfmt_misc video thermal processor fan button battery ac
ehci_hcd usbcore ide_cd cdrom sg ext3 jbd dm_mod mptspi mptscsih mptbase
sd_mod scsi_mod
CPU: 7
EIP: 0060:[<c0179d12>] Not tainted VLI
EFLAGS: 00010202 (2.6.16.18 #1)
EIP is at prune_dcache+0x231/0x327
eax: 00100100 ebx: f553d55c ecx: f553d564 edx: 00200200
esi: f553d534 edi: f7e81e94 ebp: 00000002 esp: f7e81e84
ds: 007b es: 007b ss: 0068
Process init (pid: 1, threadinfo=f7e80000 task=c352ea90)
Stack: <0>c03f2c00 f7e81e94 f65fcac0 c017a03d f7e81e94 f7e81e94 f576a954 f59c2a90
f576a954 00000000 c017a0fa 0000000a f7ecf000 f576a954 c0196c8a f576a954
f59c2a90 c01212b2 f576a954 c0102993 f59c2a90 00000000 00003b88 00000000
Call Trace:
[<c017a03d>] select_parent+0x17/0xbb
[<c017a0fa>] shrink_dcache_parent+0x19/0x2c
[<c0196c8a>] proc_pid_flush+0x14/0x26
[<c01212b2>] release_task+0xa3/0x12e
[<c0102993>] handle_signal+0x108/0x143
[<c0122dac>] wait_task_zombie+0x2de/0x3c9
[<c0102a5e>] do_signal+0x90/0x127
[<c01235c9>] do_wait+0x34d/0x3de
[<c011a913>] default_wake_function+0x0/0x12
[<c01e9b5e>] copy_to_user+0x3c/0x50
[<c011a913>] default_wake_function+0x0/0x12
[<c0123722>] sys_wait4+0x3f/0x43
[<c012374d>] sys_waitpid+0x27/0x2b
[<c0102c2d>] syscall_call+0x7/0xb
Code: fe ff ff 8b 4e 50 e9 bd fe ff ff 8d 7c 24 10 89 7c 24 10 89 7c 24
14 8b 46 04 a8 10 0f 84 a8 00 00 00 8d 4e 30 8b 46 30 8b 51 04 <89> 50
04 89 02 c7 41 04 00 02 20 00 c7 46 30 00 01 10 00 83 2d
<0>Kernel panic - not syncing: Attempted to kill init!
All kernels from 2.6.15 through 2.6.17, with any of the applicable
patch-sets (-git or -mm), are affected!!! RedHat FC<n> kernels are
affected as well.
Environment:
The environment is any SMP hardware, with the kernel built with or
without PREEMPT enabled. Any P4 hyperthreaded chip, or Xeon
multi-processor system [DELL 1425 & 1850 dual-Xeon, and also dual-core
dual-Xeon in my case], will exhibit the crash.
The attached forkalot.c program combined with the simple shell scripts
does the job. Running the forkalot shell script while at the same time
running any of the proc-*.sh scripts in a 'while true; do ... ; done'
loop crashes my systems within a couple of minutes.
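The proc-*.sh attachment bodies are not inlined in this archive view; as a
rough sketch of the kind of /proc reader loop involved (a hypothetical
reconstruction — the file name, options, and pass counter below are mine,
not the attachment's):

```shell
#!/bin/sh
# Sketch of a /proc reader loop in the spirit of the attached
# proc-torture.sh (hypothetical reconstruction; the real script
# is not shown here). Walks /proc/<pid> entries and cats them.
#
# Usage: ./proc-read.sh [passes]   (defaults to 1 pass; wrap in
# 'while true; do ...; done' for the actual crash testing)
passes=${1:-1}
i=0
while [ "$i" -lt "$passes" ]; do
    find /proc -maxdepth 2 \( -name status -o -name cmdline \) \
        -exec cat {} \; > /dev/null 2>&1
    i=$((i + 1))
done
echo "completed $i pass(es)"
```

Run alongside forkalot so the readers race against a constant stream of
/proc/<pid> directories appearing and disappearing.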
[-- Attachment #2: forkalot.c --]
[-- Type: text/x-csrc, Size: 885 bytes --]
// Program: forkalot.c
//
// Compile: cc forkalot.c -o forkalot
// Run: ./forkalot [100] [1]
//
// Args: arg1 = # of copies of program to run simultaneously [100]
// arg2 = Sleep time before exiting [1]
#include <stdio.h>
#include <stdlib.h>
#include <errno.h>
#include <unistd.h>
#include <fcntl.h>
#define CHILDREN 100
#define SLEEP_FOR 1
int main(int argc, char *argv[])
{
    int count, this_long;
    int pid;

    if (argc > 1)
        count = atoi(argv[1]);
    else
        count = CHILDREN;

    if (argc > 2)
        this_long = atoi(argv[2]);
    else
        this_long = SLEEP_FOR;

    /* fork count-1 children */
    while (count-- > 1) {
        pid = fork();
        if (pid == 0) {
            /* child */
            break;
        } else if (pid < 0) {
            perror("fork");
            exit(1);
        }
    }

    /* Sleepy... sleepy... */
    sleep(this_long);

    /* All done... return success! */
    return 0;
}
[-- Attachment #3: forkalot-test.sh --]
[-- Type: application/x-shellscript, Size: 222 bytes --]
[-- Attachment #4: proc-cmdline.sh --]
[-- Type: application/x-shellscript, Size: 78 bytes --]
[-- Attachment #5: proc-status.sh --]
[-- Type: application/x-shellscript, Size: 77 bytes --]
[-- Attachment #6: proc-torture.sh --]
[-- Type: application/x-shellscript, Size: 178 bytes --]
^ permalink raw reply [flat|nested] 4+ messages in thread
* Re: PROBLEM: /proc (procfs) task exit race condition causes a kernel crash
2006-05-26 0:43 PROBLEM: /proc (procfs) task exit race condition causes a kernel crash Tony Griffiths
@ 2006-05-28 15:37 ` Eric W. Biederman
2006-05-29 0:28 ` Tony Griffiths
2006-05-31 6:50 ` Tony Griffiths
0 siblings, 2 replies; 4+ messages in thread
From: Eric W. Biederman @ 2006-05-28 15:37 UTC (permalink / raw)
To: Tony Griffiths; +Cc: linux-kernel, Andrew Morton
I have tried to reproduce this. The circumstances weren't the most
controlled but they did overlap with what you described and I haven't seen
anything.
So I am guessing that you are having memory corruption from some source.
Either bad ram or a bad module.
I'm off on vacation for a week, so I won't be able to follow up.
Eric
* Re: PROBLEM: /proc (procfs) task exit race condition causes a kernel crash
2006-05-28 15:37 ` Eric W. Biederman
@ 2006-05-29 0:28 ` Tony Griffiths
2006-05-31 6:50 ` Tony Griffiths
1 sibling, 0 replies; 4+ messages in thread
From: Tony Griffiths @ 2006-05-29 0:28 UTC (permalink / raw)
To: Eric W. Biederman; +Cc: linux-kernel, Andrew Morton
Eric W. Biederman wrote:
>I have tried to reproduce this. The circumstances weren't the most
>controlled but they did overlap with what you described and I haven't seen
>anything.
>
>
What version of the kernel and patch-set did you test against?
Over the last month I've tried *EVERYTHING* I can lay my hands on and
can still cause a crash VERY easily!
>So I am guessing that you are having memory corruption from some source.
>Either bad ram or a bad module.
>
>
On my DELL 1850 dual-core dual-Xeon system I did have a flaky DIMM which
caused a few correctable ECC errors, but that has been replaced and the
crashes continue just the same. My other test systems are (multiple) DELL
1425 dual-Xeon machines [2.8 or 3.0 GHz chips].
>I'm off on vacation for a week, so I won't be able to follow up.
>
>
Have a good one...
>
>Eric
>
>
* Re: PROBLEM: /proc (procfs) task exit race condition causes a kernel crash
2006-05-28 15:37 ` Eric W. Biederman
2006-05-29 0:28 ` Tony Griffiths
@ 2006-05-31 6:50 ` Tony Griffiths
1 sibling, 0 replies; 4+ messages in thread
From: Tony Griffiths @ 2006-05-31 6:50 UTC (permalink / raw)
To: Eric W. Biederman; +Cc: linux-kernel, Andrew Morton
[-- Attachment #1: Type: text/plain, Size: 1307 bytes --]
Eric,
I've attached a patch file which rolls up a set of patches to dcache.c
along with a few changes I made to locking semantics. So far a 2.6.16
(+ -mm2) kernel with the patch applied survives the harshest testing I
can throw at it!
Note that I've also made some minor changes to exit.c [preempt
disable/enable during task exit] and truncate.c [BUG_ON check of
'private' page]. The testing I've done is with a kernel built without
preemption, and with one built with voluntary preemption. A kernel built
with forced preemption and with spinlock debugging enabled did NOT work
very well!!! Also, the performance of a kernel built with voluntary
preemption and with the cond_resched_lock() calls in dcache.c was
surprisingly *BAD*, with the system going into a wheel-spin for a long
period [high system CPU of 87%+] when I fired up a number of large
cpu+memory-hungry tasks. This might need to be looked at more closely?!
Eric W. Biederman wrote:
>I have tried to reproduce this. The circumstances weren't the most
>controlled but they did overlap with what you described and I haven't seen
>anything.
>
>So I am guessing that you are having memory corruption from some source.
>Either bad ram or a bad module.
>
>I'm off on vacation for a week, so I won't be able to follow up.
>
>
>Eric
>
>
[-- Attachment #2: post-2.6.16-mm2-dcache.patch --]
[-- Type: text/x-patch, Size: 15151 bytes --]
diff -urpN linux-2.6.16-mm2/fs/dcache.c linux-2.6.16/fs/dcache.c
--- linux-2.6.16-mm2/fs/dcache.c 2006-05-31 16:30:13.000000000 +1000
+++ linux-2.6.16/fs/dcache.c 2006-05-31 16:25:53.000000000 +1000
@@ -36,12 +36,10 @@
int sysctl_vfs_cache_pressure __read_mostly = 100;
-EXPORT_SYMBOL_GPL(sysctl_vfs_cache_pressure);
__cacheline_aligned_in_smp DEFINE_SPINLOCK(dcache_lock);
static seqlock_t rename_lock __cacheline_aligned_in_smp = SEQLOCK_UNLOCKED;
-EXPORT_SYMBOL(dcache_lock);
static kmem_cache_t *dentry_cache __read_mostly;
@@ -142,21 +140,18 @@ static void dentry_iput(struct dentry *
* no dcache lock, please.
*/
-void dput(struct dentry *dentry)
+static void dput_locked(struct dentry *dentry, struct list_head *list)
{
if (!dentry)
return;
-repeat:
- if (atomic_read(&dentry->d_count) == 1)
- might_sleep();
- if (!atomic_dec_and_lock(&dentry->d_count, &dcache_lock))
+ if (!atomic_dec_and_test(&dentry->d_count))
return;
+repeat:
spin_lock(&dentry->d_lock);
if (atomic_read(&dentry->d_count)) {
spin_unlock(&dentry->d_lock);
- spin_unlock(&dcache_lock);
return;
}
@@ -176,33 +171,59 @@ repeat:
dentry_stat.nr_unused++;
}
spin_unlock(&dentry->d_lock);
- spin_unlock(&dcache_lock);
return;
unhash_it:
__d_drop(dentry);
kill_it: {
- struct dentry *parent;
-
/* If dentry was on d_lru list
* delete it from there
*/
if (!list_empty(&dentry->d_lru)) {
- list_del(&dentry->d_lru);
+ list_del_init(&dentry->d_lru);
dentry_stat.nr_unused--;
}
list_del(&dentry->d_u.d_child);
dentry_stat.nr_dentry--; /* For d_free, below */
- /*drops the locks, at that point nobody can reach this dentry */
- dentry_iput(dentry);
- parent = dentry->d_parent;
- d_free(dentry);
- if (dentry == parent)
+ /* at this point nobody can reach this dentry */
+ list_add(&dentry->d_lru, list);
+ spin_unlock(&dentry->d_lock);
+ if (dentry == dentry->d_parent)
return;
- dentry = parent;
- goto repeat;
+ dentry = dentry->d_parent;
+ if (atomic_dec_and_test(&dentry->d_count))
+ goto repeat;
+ /* out */
+ }
+}
+
+void dput(struct dentry *dentry)
+{
+ LIST_HEAD(free_list);
+
+ if (!dentry)
+ goto do_return;
+
+ if (atomic_add_unless(&dentry->d_count, -1, 1))
+ goto do_return;
+
+ spin_lock(&dcache_lock); /* While we hold the dcache_lock */
+ dput_locked(dentry, &free_list); /* Put ALL free-able dentry's onto 'free_list' */
+
+ if (!list_empty(&free_list)) { /* Then process as a single batch! */
+ struct dentry *dentry, *p;
+ list_for_each_entry_safe(dentry, p, &free_list, d_lru) {
+ spin_lock(&dentry->d_lock); /* Lock dentry while also holding dcache_lock! */
+ list_del(&dentry->d_lru);
+ dentry_iput(dentry); /* Enter with locks held; Exit with no locks! */
+ d_free(dentry);
+ spin_lock(&dcache_lock); /* Assume we will iterate again so ... */
+ }
}
+ spin_unlock(&dcache_lock); /* There *MUST* be a better way of doing this?! */
+do_return:
+ return;
}
/**
@@ -219,13 +240,15 @@ kill_it: {
int d_invalidate(struct dentry * dentry)
{
+ int ret = 0;
+
/*
* If it's already been dropped, return OK.
*/
spin_lock(&dcache_lock);
if (d_unhashed(dentry)) {
spin_unlock(&dcache_lock);
- return 0;
+ goto do_return;
}
/*
* Check whether to do a partial shrink_dcache
@@ -252,14 +275,16 @@ int d_invalidate(struct dentry * dentry)
if (dentry->d_inode && S_ISDIR(dentry->d_inode->i_mode)) {
spin_unlock(&dentry->d_lock);
spin_unlock(&dcache_lock);
- return -EBUSY;
+ ret = -EBUSY;
+ goto do_return;
}
}
__d_drop(dentry);
spin_unlock(&dentry->d_lock);
spin_unlock(&dcache_lock);
- return 0;
+do_return:
+ return ret;
}
/* This should be called _only_ with dcache_lock held */
@@ -276,7 +301,10 @@ static inline struct dentry * __dget_loc
struct dentry * dget_locked(struct dentry *dentry)
{
- return __dget_locked(dentry);
+ struct dentry* ret;
+
+ ret = __dget_locked(dentry);
+ return ret;
}
/**
@@ -366,22 +394,39 @@ restart:
*/
static inline void prune_one_dentry(struct dentry * dentry)
{
- struct dentry * parent;
+ LIST_HEAD(free_list);
__d_drop(dentry);
list_del(&dentry->d_u.d_child);
dentry_stat.nr_dentry--; /* For d_free, below */
- dentry_iput(dentry);
- parent = dentry->d_parent;
+
+ /* dput the parent here before we release dcache_lock */
+ if (dentry != dentry->d_parent)
+ dput_locked(dentry->d_parent, &free_list);
+
+ dentry_iput(dentry); /* drop locks */
d_free(dentry);
- if (parent != dentry)
- dput(parent);
+
+ if (!list_empty(&free_list)) {
+ struct dentry *tmp, *p;
+
+ list_for_each_entry_safe(tmp, p, &free_list, d_lru) {
+ spin_lock(&dcache_lock); /* All of this locking/unlocking */
+ spin_lock(&tmp->d_lock); /* is so incredibly UGLY!!! */
+ list_del(&tmp->d_lru);
+ dentry_iput(tmp);
+ d_free(tmp);
+ }
+ }
+
spin_lock(&dcache_lock);
}
/**
* prune_dcache - shrink the dcache
* @count: number of entries to try and free
+ * @sb: if given, ignore dentries for other superblocks
+ * which are being unmounted.
*
* Shrink the dcache. This is done when we need
* more memory, or simply when we need to unmount
@@ -392,16 +437,30 @@ static inline void prune_one_dentry(stru
* all the dentries are in use.
*/
-static void prune_dcache(int count)
+static void prune_dcache(int count, struct super_block *sb)
{
spin_lock(&dcache_lock);
for (; count ; count--) {
struct dentry *dentry;
struct list_head *tmp;
+ struct rw_semaphore *s_umount;
- cond_resched_lock(&dcache_lock);
+ /*cond_resched_lock(&dcache_lock); ** ?BAD PERFORMANCE? ** */
tmp = dentry_unused.prev;
+ if (unlikely(sb)) {
+ /* Try to find a dentry for this sb, but don't try
+ * too hard, if they aren't near the tail they will
+ * be moved down again soon
+ */
+ int skip = count;
+ while (skip &&
+ tmp != &dentry_unused &&
+ list_entry(tmp, struct dentry, d_lru)->d_sb != sb) {
+ skip--;
+ tmp = tmp->prev;
+ }
+ }
if (tmp == &dentry_unused)
break;
list_del_init(tmp);
@@ -427,7 +486,45 @@ static void prune_dcache(int count)
spin_unlock(&dentry->d_lock);
continue;
}
- prune_one_dentry(dentry);
+ /*
+ * If the dentry is not DCACHED_REFERENCED, it is time
+ * to remove it from the dcache, provided the super block is
+ * NULL (which means we are trying to reclaim memory)
+ * or this dentry belongs to the same super block that
+ * we want to shrink.
+ */
+ /*
+ * If this dentry is for "my" filesystem, then I can prune it
+ * without taking the s_umount lock (I already hold it).
+ */
+ if (sb && dentry->d_sb == sb) {
+ prune_one_dentry(dentry);
+ continue;
+ }
+ /*
+ * ...otherwise we need to be sure this filesystem isn't being
+ * unmounted, otherwise we could race with
+ * generic_shutdown_super(), and end up holding a reference to
+ * an inode while the filesystem is unmounted.
+ * So we try to get s_umount, and make sure s_root isn't NULL.
+ * (Take a local copy of s_umount to avoid a use-after-free of
+ * `dentry').
+ */
+ s_umount = &dentry->d_sb->s_umount;
+ if (down_read_trylock(s_umount)) {
+ if (dentry->d_sb->s_root != NULL) {
+ prune_one_dentry(dentry);
+ up_read(s_umount);
+ continue;
+ }
+ up_read(s_umount);
+ }
+ spin_unlock(&dentry->d_lock);
+ /* Cannot remove the first dentry, and it isn't appropriate
+ * to move it to the head of the list, so give up, and try
+ * later
+ */
+ break;
}
spin_unlock(&dcache_lock);
}
@@ -481,14 +578,14 @@ repeat:
if (dentry->d_sb != sb)
continue;
dentry_stat.nr_unused--;
- list_del_init(tmp);
spin_lock(&dentry->d_lock);
+ list_del_init(tmp);
if (atomic_read(&dentry->d_count)) {
spin_unlock(&dentry->d_lock);
continue;
}
prune_one_dentry(dentry);
- cond_resched_lock(&dcache_lock);
+ /*cond_resched_lock(&dcache_lock); ** ?BAD PERFORMANCE? ** */
goto repeat;
}
spin_unlock(&dcache_lock);
@@ -512,6 +609,7 @@ int have_submounts(struct dentry *parent
{
struct dentry *this_parent = parent;
struct list_head *next;
+ int ret = 1;
spin_lock(&dcache_lock);
if (d_mountpoint(parent))
@@ -539,11 +637,10 @@ resume:
this_parent = this_parent->d_parent;
goto resume;
}
- spin_unlock(&dcache_lock);
- return 0; /* No mount points found in tree */
+ ret = 0; /* No mount points found in tree */
positive:
spin_unlock(&dcache_lock);
- return 1;
+ return ret;
}
/*
@@ -630,7 +727,7 @@ void shrink_dcache_parent(struct dentry
int found;
while ((found = select_parent(parent)) != 0)
- prune_dcache(found);
+ prune_dcache(found, parent->d_sb);
}
/**
@@ -643,9 +740,10 @@ void shrink_dcache_parent(struct dentry
* done under dcache_lock.
*
*/
-void shrink_dcache_anon(struct hlist_head *head)
+void shrink_dcache_anon(struct super_block *sb)
{
struct hlist_node *lp;
+ struct hlist_head *head = &sb->s_anon;
int found;
do {
found = 0;
@@ -668,7 +766,7 @@ void shrink_dcache_anon(struct hlist_hea
}
}
spin_unlock(&dcache_lock);
- prune_dcache(found);
+ prune_dcache(found, sb);
} while(found);
}
@@ -686,12 +784,16 @@ void shrink_dcache_anon(struct hlist_hea
*/
static int shrink_dcache_memory(int nr, gfp_t gfp_mask)
{
+ int ret = -1;
+
if (nr) {
if (!(gfp_mask & __GFP_FS))
- return -1;
- prune_dcache(nr);
+ goto do_return;
+ prune_dcache(nr, NULL);
}
- return (dentry_stat.nr_unused / 100) * sysctl_vfs_cache_pressure;
+ ret = (dentry_stat.nr_unused / 100) * sysctl_vfs_cache_pressure;
+do_return:
+ return ret;
}
/**
@@ -711,13 +813,14 @@ struct dentry *d_alloc(struct dentry * p
dentry = kmem_cache_alloc(dentry_cache, GFP_KERNEL);
if (!dentry)
- return NULL;
+ goto do_return;
if (name->len > DNAME_INLINE_LEN-1) {
dname = kmalloc(name->len + 1, GFP_KERNEL);
if (!dname) {
kmem_cache_free(dentry_cache, dentry);
- return NULL;
+ dentry = NULL;
+ goto do_return;
}
} else {
dname = dentry->d_iname;
@@ -758,18 +861,20 @@ struct dentry *d_alloc(struct dentry * p
list_add(&dentry->d_u.d_child, &parent->d_subdirs);
dentry_stat.nr_dentry++;
spin_unlock(&dcache_lock);
-
+do_return:
return dentry;
}
struct dentry *d_alloc_name(struct dentry *parent, const char *name)
{
struct qstr q;
+ struct dentry * ret;
q.name = name;
q.len = strlen(name);
q.hash = full_name_hash(q.name, q.len);
- return d_alloc(parent, &q);
+ ret = d_alloc(parent, &q);
+ return ret;
}
/**
@@ -817,7 +922,7 @@ void d_instantiate(struct dentry *entry,
*/
struct dentry *d_instantiate_unique(struct dentry *entry, struct inode *inode)
{
- struct dentry *alias;
+ struct dentry *alias = NULL;
int len = entry->d_name.len;
const char *name = entry->d_name.name;
unsigned int hash = entry->d_name.hash;
@@ -841,7 +946,7 @@ struct dentry *d_instantiate_unique(stru
spin_unlock(&dcache_lock);
BUG_ON(!d_unhashed(alias));
iput(inode);
- return alias;
+ goto do_return;
}
list_add(&entry->d_alias, &inode->i_dentry);
do_negative:
@@ -849,9 +954,10 @@ do_negative:
fsnotify_d_instantiate(entry, inode);
spin_unlock(&dcache_lock);
security_d_instantiate(entry, inode);
- return NULL;
+ alias = NULL;
+do_return:
+ return alias;
}
-EXPORT_SYMBOL(d_instantiate_unique);
/**
* d_alloc_root - allocate root dentry
@@ -915,12 +1021,14 @@ struct dentry * d_alloc_anon(struct inod
if ((res = d_find_alias(inode))) {
iput(inode);
- return res;
+ goto do_return;
}
tmp = d_alloc(NULL, &anonstring);
- if (!tmp)
- return NULL;
+ if (!tmp) {
+ res = NULL;
+ goto do_return;
+ }
tmp->d_parent = tmp; /* make sure dput doesn't croak */
@@ -948,6 +1056,7 @@ struct dentry * d_alloc_anon(struct inod
iput(inode);
if (tmp)
dput(tmp);
+do_return:
return res;
}
@@ -1142,6 +1251,7 @@ int d_validate(struct dentry *dentry, st
{
struct hlist_head *base;
struct hlist_node *lhp;
+ int ret = 0;
/* Check whether the ptr might be valid at all.. */
if (!kmem_ptr_validate(dentry_cache, dentry))
@@ -1159,12 +1269,13 @@ int d_validate(struct dentry *dentry, st
if (dentry == hlist_entry(lhp, struct dentry, d_hash)) {
__dget_locked(dentry);
spin_unlock(&dcache_lock);
- return 1;
+ ret = 1;
+ goto out;
}
}
spin_unlock(&dcache_lock);
out:
- return 0;
+ return ret;
}
/*
@@ -1191,6 +1302,7 @@ out:
void d_delete(struct dentry * dentry)
{
int isdir = 0;
+
/*
* Are we the only user?
*/
@@ -1203,7 +1315,7 @@ void d_delete(struct dentry * dentry)
dentry_iput(dentry);
fsnotify_nameremove(dentry, isdir);
- return;
+ goto do_return;
}
if (!d_unhashed(dentry))
@@ -1213,6 +1325,8 @@ void d_delete(struct dentry * dentry)
spin_unlock(&dcache_lock);
fsnotify_nameremove(dentry, isdir);
+do_return:
+ return;
}
static void __d_rehash(struct dentry * entry, struct hlist_head *list)
@@ -1497,13 +1611,13 @@ char * d_path(struct dentry *dentry, str
*/
asmlinkage long sys_getcwd(char __user *buf, unsigned long size)
{
- int error;
+ int error = -ENOMEM;
struct vfsmount *pwdmnt, *rootmnt;
struct dentry *pwd, *root;
char *page = (char *) __get_free_page(GFP_USER);
if (!page)
- return -ENOMEM;
+ goto do_return;
read_lock(¤t->fs->lock);
pwdmnt = mntget(current->fs->pwdmnt);
@@ -1542,6 +1656,7 @@ out:
dput(root);
mntput(rootmnt);
free_page((unsigned long) page);
+do_return:
return error;
}
@@ -1729,7 +1844,6 @@ kmem_cache_t *names_cachep __read_mostly
/* SLAB cache for file structures */
kmem_cache_t *filp_cachep __read_mostly;
-EXPORT_SYMBOL(d_genocide);
extern void bdev_cache_init(void);
extern void chrdev_init(void);
@@ -1764,6 +1878,10 @@ void __init vfs_caches_init(unsigned lon
chrdev_init();
}
+EXPORT_SYMBOL_GPL(sysctl_vfs_cache_pressure);
+EXPORT_SYMBOL(dcache_lock);
+EXPORT_SYMBOL(d_instantiate_unique);
+EXPORT_SYMBOL(d_genocide);
EXPORT_SYMBOL(d_alloc);
EXPORT_SYMBOL(d_alloc_anon);
EXPORT_SYMBOL(d_alloc_root);
diff -urpN linux-2.6.16-mm2/kernel/exit.c linux-2.6.16/kernel/exit.c
--- linux-2.6.16-mm2/kernel/exit.c 2006-05-31 16:30:14.000000000 +1000
+++ linux-2.6.16/kernel/exit.c 2006-05-30 14:49:35.000000000 +1000
@@ -136,6 +136,7 @@ void release_task(struct task_struct * p
{
int zap_leader;
task_t *leader;
+ preempt_disable(); // ** Cleanup as fast as we can! **
repeat:
atomic_dec(&p->user->processes);
write_lock_irq(&tasklist_lock);
@@ -173,6 +174,7 @@ repeat:
p = leader;
if (unlikely(zap_leader))
goto repeat;
+ preempt_enable(); // ** OK to give other tasks some cycles now **
}
/*
diff -urpN linux-2.6.16-mm2/mm/truncate.c linux-2.6.16/mm/truncate.c
--- linux-2.6.16-mm2/mm/truncate.c 2006-05-31 16:30:14.000000000 +1000
+++ linux-2.6.16/mm/truncate.c 2006-05-28 17:22:46.000000000 +1000
@@ -80,7 +80,7 @@ invalidate_complete_page(struct address_
return 0;
}
- BUG_ON(PagePrivate(page));
+ BUG_ON(PagePrivate(page) && (page_private(page) != 0));
__remove_from_page_cache(page);
write_unlock_irq(&mapping->tree_lock);
ClearPageUptodate(page);