linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* kernel 2.6.9-rc1-mm4 oops
@ 2004-09-12 18:48 Roel van der Made
  2004-09-13  8:06 ` [PATCH]: " Kirill Korotaev
  0 siblings, 1 reply; 11+ messages in thread
From: Roel van der Made @ 2004-09-12 18:48 UTC (permalink / raw)
  To: linux-kernel

Hi there,

This morning one of our (MySQL-)database serves crashed with the
following kernel trace. Anyone has an idea what could've caused it?
The machine is an SMP Xeon 2.8Ghz with 4G internal Reg. ECC ram running
4 scsi disks in sw raid 5 on a Debian (almost sid-)distribution.

The trace:

------------[ cut here ]------------
kernel BUG at kernel/exit.c:852!
invalid operand: 0000 [#1]
SMP
Modules linked in: ip_vs_wlc af_packet ipt_MARK iptable_mangle ip_tables ip_vs tg3 e1000 e100 eepro100 mii
nfsd exportfs nfs lockd sunrpc unix
CPU:    0
EIP:    0060:[<c011df03>]    Not tainted VLI
EFLAGS: 00010246   (2.6.9-rc1-mm4-fw-xeon.1)
EIP is at next_thread+0xc/0x41
eax: 00000000   ebx: 00000001   ecx: 00000001   edx: e93c3aa0
esi: 00000000   edi: e93c3aa0   ebp: 00000000   esp: f3893dd8
ds: 007b   es: 007b   ss: 0068
Process snmpd (pid: 1182, threadinfo=f3892000 task=f3fa1550)
Stack: c0182368 f3893f14 e93c3aa0 c016cecb c30c8a00 c011542b c03bfbe0 c30c8a00
       c017fcf6 e18d6eb0 e93c3aa0 0000000d c017fdad e93c3aa0 4143bbb4 247966f0
       c016c653 c03bfbe0 e18d6eb0 c03a4bc5 c01802a0 f3e56c20 e18d6eb0 0000000d
Call Trace:
 [<c0182368>] do_task_stat+0x279/0x752
 [<c016cecb>] alloc_inode+0x1b/0x146

 [<c011542b>] do_page_fault+0x19d/0x5c7
 [<c017fcf6>] task_dumpable+0x39/0x4a
 [<c017fdad>] proc_pid_make_inode+0xa6/0xe5
 [<c016c653>] d_rehash+0x55/0x79
 [<c01802a0>] proc_pident_lookup+0x100/0x26c
 [<c0161586>] real_lookup+0xcd/0xf0
 [<c016b468>] dput+0x24/0x209
 [<c0162247>] link_path_walk+0xa3e/0xd89
 [<c0182883>] proc_tgid_stat+0x1f/0x23
 [<c017f3ed>] proc_info_read+0x6a/0x9f
 [<c015417f>] vfs_read+0xbc/0x127
 [<c015444d>] sys_read+0x51/0x80
 [<c0105cdf>] syscall_call+0x7/0xb
Code: 8b 44 24 0c 89 04 24 e8 1d fc ff ff 83 ec 04 0f b6 44 24 08 c1 e0 08 89 04 24 e8 0a fc ff ff 89 c2
8b 80 d0 04 00 00 85 c0 75 08 <0f> 0b 54 03 e5

Thanks,

-- 
Roel van der Made                             .--.
GNU/Linux Systems/Network Engineer           |o_o |
Telegraaf Media ICT - Internet Services      |:_/ |
Tel.: 020 585 2229                          //   \ \
GnuPG Key available at: http://roel.net/gpgkey.asc 

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH]: Re: kernel 2.6.9-rc1-mm4 oops
  2004-09-13  8:06 ` [PATCH]: " Kirill Korotaev
@ 2004-09-13  8:05   ` William Lee Irwin III
  2004-09-13  8:31   ` Ingo Molnar
  1 sibling, 0 replies; 11+ messages in thread
From: William Lee Irwin III @ 2004-09-13  8:05 UTC (permalink / raw)
  To: Kirill Korotaev; +Cc: Roel van der Made, linux-kernel, akpm, torvalds

On Mon, Sep 13, 2004 at 12:06:39PM +0400, Kirill Korotaev wrote:
> --- ./kernel/exit.c.nt	2004-09-13 11:18:26.000000000 +0400
> +++ ./kernel/exit.c	2004-09-13 11:53:23.611075360 +0400
> @@ -848,10 +848,7 @@ asmlinkage long sys_exit(int error_code)
>  task_t fastcall *next_thread(const task_t *p)
>  {
>  #ifdef CONFIG_SMP
> -	if (!p->sighand)
> -		BUG();
> -	if (!spin_is_locked(&p->sighand->siglock) &&
> -				!rwlock_is_locked(&tasklist_lock))
> +	if (!rwlock_is_locked(&tasklist_lock))
>  		BUG();
>  #endif
>  	return pid_task(p->pids[PIDTYPE_TGID].pid_list.next, PIDTYPE_TGID);

Hmm, BUG_ON() for whatever users (embedded?) that define BUG()/BUG_ON()
checks to nops may be useful here since no side effects are expected
from rwlock_is_locked().

FWIW I got this once while running initscripts(!) on a 4x logical x86-64
on virgin 2.6.9-rc1-mm4 but couldn't reproduce it.


-- wli

^ permalink raw reply	[flat|nested] 11+ messages in thread

* [PATCH]: Re: kernel 2.6.9-rc1-mm4 oops
  2004-09-12 18:48 kernel 2.6.9-rc1-mm4 oops Roel van der Made
@ 2004-09-13  8:06 ` Kirill Korotaev
  2004-09-13  8:05   ` William Lee Irwin III
  2004-09-13  8:31   ` Ingo Molnar
  0 siblings, 2 replies; 11+ messages in thread
From: Kirill Korotaev @ 2004-09-13  8:06 UTC (permalink / raw)
  To: Roel van der Made; +Cc: linux-kernel, akpm, torvalds, wli

[-- Attachment #1: Type: text/plain, Size: 3242 bytes --]

Roel van der Made wrote:
> Hi there,
> 
> This morning one of our (MySQL-)database serves crashed with the
> following kernel trace. Anyone has an idea what could've caused it?
> The machine is an SMP Xeon 2.8Ghz with 4G internal Reg. ECC ram running
> 4 scsi disks in sw raid 5 on a Debian (almost sid-)distribution.

> The trace:
> 
> ------------[ cut here ]------------
> kernel BUG at kernel/exit.c:852!
> invalid operand: 0000 [#1]
> SMP
> Modules linked in: ip_vs_wlc af_packet ipt_MARK iptable_mangle ip_tables ip_vs tg3 e1000 e100 eepro100 mii
> nfsd exportfs nfs lockd sunrpc unix
> CPU:    0
> EIP:    0060:[<c011df03>]    Not tainted VLI
> EFLAGS: 00010246   (2.6.9-rc1-mm4-fw-xeon.1)
> EIP is at next_thread+0xc/0x41
> eax: 00000000   ebx: 00000001   ecx: 00000001   edx: e93c3aa0
> esi: 00000000   edi: e93c3aa0   ebp: 00000000   esp: f3893dd8
> ds: 007b   es: 007b   ss: 0068
> Process snmpd (pid: 1182, threadinfo=f3892000 task=f3fa1550)
> Stack: c0182368 f3893f14 e93c3aa0 c016cecb c30c8a00 c011542b c03bfbe0 c30c8a00
>        c017fcf6 e18d6eb0 e93c3aa0 0000000d c017fdad e93c3aa0 4143bbb4 247966f0
>        c016c653 c03bfbe0 e18d6eb0 c03a4bc5 c01802a0 f3e56c20 e18d6eb0 0000000d
> Call Trace:
>  [<c0182368>] do_task_stat+0x279/0x752
>  [<c016cecb>] alloc_inode+0x1b/0x146
>  [<c011542b>] do_page_fault+0x19d/0x5c7
>  [<c017fcf6>] task_dumpable+0x39/0x4a
>  [<c017fdad>] proc_pid_make_inode+0xa6/0xe5
>  [<c016c653>] d_rehash+0x55/0x79
>  [<c01802a0>] proc_pident_lookup+0x100/0x26c
>  [<c0161586>] real_lookup+0xcd/0xf0
>  [<c016b468>] dput+0x24/0x209
>  [<c0162247>] link_path_walk+0xa3e/0xd89
>  [<c0182883>] proc_tgid_stat+0x1f/0x23
>  [<c017f3ed>] proc_info_read+0x6a/0x9f
>  [<c015417f>] vfs_read+0xbc/0x127
>  [<c015444d>] sys_read+0x51/0x80
>  [<c0105cdf>] syscall_call+0x7/0xb
> Code: 8b 44 24 0c 89 04 24 e8 1d fc ff ff 83 ec 04 0f b6 44 24 08 c1 e0 08 89 04 24 e8 0a fc ff ff 89 c2
> 8b 80 d0 04 00 00 85 c0 75 08 <0f> 0b 54 03 e5

It looks like an incorrect BUG() in next_thread().

Description
~~~~~~~~~~~

Note, that during exit process there can be a thread in the system with 
tsk->sighand == NULL, since the following call trace:

release_task()
{
	....
	__exit_sighand()	<<< makes tsk->sighand == NULL;
	__unhash_process()	<<< unhashes thread
	....
}

next, we see that next_thread checks for tsk->sighand != NULL:

task_t fastcall *next_thread(const task_t *p)
{
  #ifdef CONFIG_SMP
        if (!p->sighand)
                BUG();		<<< BUG happened here!!!
        if (!spin_is_locked(&p->sighand->siglock) &&
                                !rwlock_is_locked(&tasklist_lock))
	....
}

So the question is why next_thread() should check for
(p->sighand != NULL) && spin_is_locked(&p->sighand->siglock)?

I think these checks are invalid. For example do_task_stat() (which 
called next_thread() in this BUG) checks for tsk->sighand != NULL 
explicitly.
And moreover, next_thread() DOES always works correctly, whether there 
are threads or none.

This patch removes sighand checks from the next_thread(), since they are 
incorrect and has nothing to do with the next_thread() function. So they 
could trigger BUG() when there were no actually bug at all.

Signed-Off-By: Kirill Korotaev <dev@sw.ru>

Kirill

[-- Attachment #2: diff-next_thread --]
[-- Type: text/plain, Size: 491 bytes --]

--- ./kernel/exit.c.nt	2004-09-13 11:18:26.000000000 +0400
+++ ./kernel/exit.c	2004-09-13 11:53:23.611075360 +0400
@@ -848,10 +848,7 @@ asmlinkage long sys_exit(int error_code)
 task_t fastcall *next_thread(const task_t *p)
 {
 #ifdef CONFIG_SMP
-	if (!p->sighand)
-		BUG();
-	if (!spin_is_locked(&p->sighand->siglock) &&
-				!rwlock_is_locked(&tasklist_lock))
+	if (!rwlock_is_locked(&tasklist_lock))
 		BUG();
 #endif
 	return pid_task(p->pids[PIDTYPE_TGID].pid_list.next, PIDTYPE_TGID);

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH]: Re: kernel 2.6.9-rc1-mm4 oops
  2004-09-13  8:06 ` [PATCH]: " Kirill Korotaev
  2004-09-13  8:05   ` William Lee Irwin III
@ 2004-09-13  8:31   ` Ingo Molnar
  2004-09-13  9:15     ` Kirill Korotaev
  1 sibling, 1 reply; 11+ messages in thread
From: Ingo Molnar @ 2004-09-13  8:31 UTC (permalink / raw)
  To: Kirill Korotaev; +Cc: Roel van der Made, linux-kernel, akpm, torvalds, wli


* Kirill Korotaev <dev@sw.ru> wrote:

> This patch removes sighand checks from the next_thread(), since they
> are incorrect and has nothing to do with the next_thread() function.
> So they could trigger BUG() when there were no actually bug at all.

the problem is, generally it is not valid to have a thread on the thread
list that has no ->sighand structure. This is what happens when we exit
a task:

        write_lock_irq(&tasklist_lock);
	...
        __exit_sighand(p);
        __unhash_process(p);

the BUG() is useful for all the code that uses next_thread() - you can
only do a safe next_thread() iteration if you've locked ->sighand.

there's one exception: in the procfs code we can get a reference to
almost-dead tasks as well that are not even in the tasklist. (This is a
relatively new thing introduced by me that can happen due to the
preemptability of some of the exit path.)

so i believe your fix papers over the real bug which is the use of an
almost-dead task for thread iterations. Since we've already done
__unhash_process() not doing the BUG introduces a more subtle bug: the
use of the stale PID pointers! So i believe the right fix is the one
below, which (under the safety of read_lock(tasklock)) checks for the
availability of task->sighand - and skips the thread iterations if so.

	Ingo

--- linux/fs/proc/array.c.orig
+++ linux/fs/proc/array.c
@@ -356,7 +356,7 @@ static int do_task_stat(struct task_stru
 			stime = task->signal->stime;
 		}
 	}
-	if (whole) {
+	if (whole && task->sighand) {
 		t = task;
 		do {
 			min_flt += t->min_flt;

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH]: Re: kernel 2.6.9-rc1-mm4 oops
  2004-09-13  8:31   ` Ingo Molnar
@ 2004-09-13  9:15     ` Kirill Korotaev
  2004-09-13  9:24       ` Ingo Molnar
  0 siblings, 1 reply; 11+ messages in thread
From: Kirill Korotaev @ 2004-09-13  9:15 UTC (permalink / raw)
  To: Ingo Molnar; +Cc: Roel van der Made, linux-kernel, akpm, torvalds, wli

Ingo Molnar wrote:
> * Kirill Korotaev <dev@sw.ru> wrote:

>>This patch removes sighand checks from the next_thread(), since they
>>are incorrect and has nothing to do with the next_thread() function.
>>So they could trigger BUG() when there were no actually bug at all.
> 
> the problem is, generally it is not valid to have a thread on the thread
> list that has no ->sighand structure. This is what happens when we exit
> a task:
> 
>         write_lock_irq(&tasklist_lock);
> 	...
>         __exit_sighand(p);
>         __unhash_process(p);

> the BUG() is useful for all the code that uses next_thread() - you can
> only do a safe next_thread() iteration if you've locked ->sighand.

1. I don't see spin_lock() on p->sighand->siglock in do_task_stat() 
before calling next_thread(). And the check inside next_thread() permits 
only one of the locks to be taken:

         if (!spin_is_locked(&p->sighand->siglock) &&
                                 !rwlock_is_locked(&tasklist_lock))

which is probably wrong, since tasklist_lock is always required!

2. I think the idea of checking sighand is quite obscure.
Probably it would be better to call pid_alive() for check at such places 
in proc, isn't it?

3. And yes, now I agree that this check and BUG() prevented 
next_thread() from walking through the deleted list_head in tsk->pid_list.

But I would propose to reorganize these checks in next_thread() to 
something like this:

if (!rwlock_is_locked(&tasklist_lock) || p->pids[PIDTYPE_TGID].nr == 0)
	BUG();

the last check ensures that we are still hashed and this check is more 
straithforward for understanding, agree?

4. If we do checks this way, then we can found strange proc numeric 
results. Suppose, you have read the do_task_stat() and it iterated 
through the threads and summed the times in this loop with 
next_thread(). Next, the thread dies and you can read the results w/o 
this loop and threads times, only this thread stats. Looks a bit 
invalid. Don't you think so?
Maybe we should return an error?

> so i believe your fix papers over the real bug which is the use of an
> almost-dead task for thread iterations. Since we've already done
> __unhash_process() not doing the BUG introduces a more subtle bug: the
> use of the stale PID pointers! So i believe the right fix is the one
> below, which (under the safety of read_lock(tasklock)) checks for the
> availability of task->sighand - and skips the thread iterations if so.
You patch is correct. Though I would like to hear on my previous notes 
about pid_alive().

Kirill


^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH]: Re: kernel 2.6.9-rc1-mm4 oops
  2004-09-13  9:15     ` Kirill Korotaev
@ 2004-09-13  9:24       ` Ingo Molnar
  2004-09-13 13:34         ` Roel van der Made
  2004-09-13 14:39         ` Kirill Korotaev
  0 siblings, 2 replies; 11+ messages in thread
From: Ingo Molnar @ 2004-09-13  9:24 UTC (permalink / raw)
  To: Kirill Korotaev; +Cc: Roel van der Made, linux-kernel, akpm, torvalds, wli


* Kirill Korotaev <dev@sw.ru> wrote:

> >the BUG() is useful for all the code that uses next_thread() - you can
> >only do a safe next_thread() iteration if you've locked ->sighand.

> 1. I don't see spin_lock() on p->sighand->siglock in do_task_stat() 
> before calling next_thread(). And the check inside next_thread() permits 
> only one of the locks to be taken:
> 
>         if (!spin_is_locked(&p->sighand->siglock) &&
>                                 !rwlock_is_locked(&tasklist_lock))
> 
> which is probably wrong, since tasklist_lock is always required!

It's not 'wrong' in terms of correctness it's simply too restrictive for
no reason. I agree that we should check for the tasklist lock only.

> 2. I think the idea of checking sighand is quite obscure. Probably it
> would be better to call pid_alive() for check at such places in proc,
> isn't it?

yeah, it's just as good of a check.

> 3. And yes, now I agree that this check and BUG() prevented
> next_thread() from walking through the deleted list_head in
> tsk->pid_list.

good.

> But I would propose to reorganize these checks in next_thread() to
> something like this:
> 
> if (!rwlock_is_locked(&tasklist_lock) || p->pids[PIDTYPE_TGID].nr == 0)
> 	BUG();
> 
> the last check ensures that we are still hashed and this check is more 
> straithforward for understanding, agree?

yep - please send a new patch to Andrew.

> 4. If we do checks this way, then we can found strange proc numeric
> results. Suppose, you have read the do_task_stat() and it iterated
> through the threads and summed the times in this loop with
> next_thread(). Next, the thread dies and you can read the results w/o
> this loop and threads times, only this thread stats. Looks a bit
> invalid. Don't you think so? Maybe we should return an error?

i'd just skip filling out that statistics field - like my minimal patch
does.

	Ingo

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH]: Re: kernel 2.6.9-rc1-mm4 oops
  2004-09-13  9:24       ` Ingo Molnar
@ 2004-09-13 13:34         ` Roel van der Made
  2004-09-13 13:38           ` Ingo Molnar
  2004-09-13 14:39         ` Kirill Korotaev
  1 sibling, 1 reply; 11+ messages in thread
From: Roel van der Made @ 2004-09-13 13:34 UTC (permalink / raw)
  To: Ingo Molnar; +Cc: Kirill Korotaev, linux-kernel, akpm, torvalds, wli

[-- Attachment #1: Type: text/plain, Size: 1119 bytes --]

Hi,

On Mon, Sep 13, 2004 at 11:24:43AM +0200, Ingo Molnar wrote:
> * Kirill Korotaev <dev@sw.ru> wrote:
> > >the BUG() is useful for all the code that uses next_thread() - you can
> > >only do a safe next_thread() iteration if you've locked ->sighand.
> > 1. I don't see spin_lock() on p->sighand->siglock in do_task_stat() 
> > before calling next_thread(). And the check inside next_thread() permits 
> > only one of the locks to be taken:
> >         if (!spin_is_locked(&p->sighand->siglock) &&
> >                                 !rwlock_is_locked(&tasklist_lock))

<snip>

> > the last check ensures that we are still hashed and this check is more 
> > straithforward for understanding, agree?
> 
> yep - please send a new patch to Andrew.

I'll be awaiting the patch and give it a shot.

Thanks all for the feedback.

-- 
Roel van der Made                             .--.
GNU/Linux Systems/Network Engineer           |o_o |
Telegraaf Media ICT - Internet Services      |:_/ |
Tel.: 020 585 2229                          //   \ \
GnuPG Key available at: http://roel.net/gpgkey.asc 

[-- Attachment #2: Type: application/pgp-signature, Size: 189 bytes --]

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH]: Re: kernel 2.6.9-rc1-mm4 oops
  2004-09-13 13:34         ` Roel van der Made
@ 2004-09-13 13:38           ` Ingo Molnar
  2004-09-13 13:42             ` Roel van der Made
  0 siblings, 1 reply; 11+ messages in thread
From: Ingo Molnar @ 2004-09-13 13:38 UTC (permalink / raw)
  To: Roel van der Made; +Cc: Kirill Korotaev, linux-kernel, akpm, torvalds, wli


* Roel van der Made <roel@telegraafnet.nl> wrote:

> > > the last check ensures that we are still hashed and this check is more 
> > > straithforward for understanding, agree?
> > 
> > yep - please send a new patch to Andrew.
> 
> I'll be awaiting the patch and give it a shot.
> 
> Thanks all for the feedback.

you can try my patch too, it will do the job of fixing the bug. The
other changes we talked about are only improvements to the debugging
infrastructure.

	Ingo

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH]: Re: kernel 2.6.9-rc1-mm4 oops
  2004-09-13 13:38           ` Ingo Molnar
@ 2004-09-13 13:42             ` Roel van der Made
  2004-09-13 15:03               ` Kirill Korotaev
  0 siblings, 1 reply; 11+ messages in thread
From: Roel van der Made @ 2004-09-13 13:42 UTC (permalink / raw)
  To: Ingo Molnar; +Cc: Kirill Korotaev, linux-kernel, akpm, torvalds, wli

[-- Attachment #1: Type: text/plain, Size: 906 bytes --]

On Mon, Sep 13, 2004 at 03:38:47PM +0200, Ingo Molnar wrote:
> * Roel van der Made <roel@telegraafnet.nl> wrote:
> > > > the last check ensures that we are still hashed and this check is more 
> > > > straithforward for understanding, agree?
> > > yep - please send a new patch to Andrew.
> > I'll be awaiting the patch and give it a shot.
> > Thanks all for the feedback.
> you can try my patch too, it will do the job of fixing the bug. The
> other changes we talked about are only improvements to the debugging
> infrastructure.

Saw there's an mm5 release already, is your patch included in this one
also?

> 	Ingo

-- 
Roel van der Made                             .--.
GNU/Linux Systems/Network Engineer           |o_o |
Telegraaf Media ICT - Internet Services      |:_/ |
Tel.: 020 585 2229                          //   \ \
GnuPG Key available at: http://roel.net/gpgkey.asc 

[-- Attachment #2: Type: application/pgp-signature, Size: 189 bytes --]

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH]: Re: kernel 2.6.9-rc1-mm4 oops
  2004-09-13  9:24       ` Ingo Molnar
  2004-09-13 13:34         ` Roel van der Made
@ 2004-09-13 14:39         ` Kirill Korotaev
  1 sibling, 0 replies; 11+ messages in thread
From: Kirill Korotaev @ 2004-09-13 14:39 UTC (permalink / raw)
  To: Ingo Molnar; +Cc: Roel van der Made, linux-kernel, akpm, torvalds

[-- Attachment #1: Type: text/plain, Size: 2185 bytes --]

Ingo Molnar wrote:
> * Kirill Korotaev <dev@sw.ru> wrote:
>>>the BUG() is useful for all the code that uses next_thread() - you can
>>>only do a safe next_thread() iteration if you've locked ->sighand.
> 
>>1. I don't see spin_lock() on p->sighand->siglock in do_task_stat() 
>>before calling next_thread(). And the check inside next_thread() permits 
>>only one of the locks to be taken:
>>
>>        if (!spin_is_locked(&p->sighand->siglock) &&
>>                                !rwlock_is_locked(&tasklist_lock))
>>
>>which is probably wrong, since tasklist_lock is always required!
> 
> It's not 'wrong' in terms of correctness it's simply too restrictive for
> no reason. I agree that we should check for the tasklist lock only.
that is what I wanted to say :)
I removed check for siglock being locked and changed check for sighand 
!= NULL to pid.nr check as we discussed below.

>>2. I think the idea of checking sighand is quite obscure. Probably it
>>would be better to call pid_alive() for check at such places in proc,
>>isn't it?
> yeah, it's just as good of a check.
So I replaced the check in your patch with pid_alive() one, ok?

>>But I would propose to reorganize these checks in next_thread() to
>>something like this:
>>
>>if (!rwlock_is_locked(&tasklist_lock) || p->pids[PIDTYPE_TGID].nr == 0)
>>	BUG();
>>
>>the last check ensures that we are still hashed and this check is more 
>>straithforward for understanding, agree?
> yep - please send a new patch to Andrew.
here it is, please review it as well.

There are 2 patches here:

diff-next_thread (for both linus and 2.6.9-rc1-mm4 trees)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
This patch changes obscure BUG() checks in next_thread() with pid checks 
meaning exactly the same (It checks for task being hashed).

Signed-Off-By: Kirill Korotaev <dev@sw.ru>

diff-task_stat (for 2.6.9-rc1-mm4 tree)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
This patch fixes BUG() happening in do_task_stat()->next_thread(), since 
tsk->sighand can be NULL there. It adds check for pid_alive() in 
do_task_stat() to prevent thread loop for already unhashed task.

Signed-Off-By: Kirill Korotaev <dev@sw.ru>

Kirill

[-- Attachment #2: diff-task_stat --]
[-- Type: text/plain, Size: 1368 bytes --]

--- ./include/linux/sched.h.nt	2004-09-13 18:00:12.000000000 +0400
+++ ./include/linux/sched.h	2004-09-13 18:06:03.680828072 +0400
@@ -699,6 +699,12 @@ extern struct task_struct *find_task_by_
 extern void set_special_pids(pid_t session, pid_t pgrp);
 extern void __set_special_pids(pid_t session, pid_t pgrp);
 
+/* checks whether task is still hashed and can be accessed safely */
+static inline int pid_alive(struct task_struct *p)
+{
+	return p->pids[PIDTYPE_PID].nr != 0;
+}
+
 /* per-UID process charging. */
 extern struct user_struct * alloc_uid(uid_t);
 static inline struct user_struct *get_uid(struct user_struct *u)
--- ./fs/proc/array.c.nt	2004-09-13 18:00:09.178720584 +0400
+++ ./fs/proc/array.c	2004-09-13 18:00:51.861231856 +0400
@@ -356,7 +356,7 @@ static int do_task_stat(struct task_stru
 			stime = task->signal->stime;
 		}
 	}
-	if (whole) {
+	if (whole && pid_alive(task)) {
 		t = task;
 		do {
 			min_flt += t->min_flt;
--- ./fs/proc/base.c.nt	2004-09-13 18:00:09.181720128 +0400
+++ ./fs/proc/base.c	2004-09-13 18:00:51.862231704 +0400
@@ -793,11 +793,6 @@ static struct inode_operations proc_pid_
 	.follow_link	= proc_pid_follow_link
 };
 
-static inline int pid_alive(struct task_struct *p)
-{
-	return p->pids[PIDTYPE_PID].nr != 0;
-}
-
 #define NUMBUF 10
 
 static int proc_readfd(struct file * filp, void * dirent, filldir_t filldir)

[-- Attachment #3: diff-next_thread --]
[-- Type: text/plain, Size: 524 bytes --]

--- ./kernel/exit.c.nt	2004-09-13 18:00:12.727181136 +0400
+++ ./kernel/exit.c	2004-09-13 18:00:51.864231400 +0400
@@ -848,10 +848,7 @@ asmlinkage long sys_exit(int error_code)
 task_t fastcall *next_thread(const task_t *p)
 {
 #ifdef CONFIG_SMP
-	if (!p->sighand)
-		BUG();
-	if (!spin_is_locked(&p->sighand->siglock) &&
-				!rwlock_is_locked(&tasklist_lock))
+	if (!rwlock_is_locked(&tasklist_lock) || p->pids[PIDTYPE_TGID].nr == 0)
 		BUG();
 #endif
 	return pid_task(p->pids[PIDTYPE_TGID].pid_list.next, PIDTYPE_TGID);

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH]: Re: kernel 2.6.9-rc1-mm4 oops
  2004-09-13 13:42             ` Roel van der Made
@ 2004-09-13 15:03               ` Kirill Korotaev
  0 siblings, 0 replies; 11+ messages in thread
From: Kirill Korotaev @ 2004-09-13 15:03 UTC (permalink / raw)
  To: Roel van der Made; +Cc: linux-kernel

Roel van der Made wrote:
> On Mon, Sep 13, 2004 at 03:38:47PM +0200, Ingo Molnar wrote:
> 
>>* Roel van der Made <roel@telegraafnet.nl> wrote:
>>
>>>>>the last check ensures that we are still hashed and this check is more 
>>>>>straithforward for understanding, agree?
>>>>
>>>>yep - please send a new patch to Andrew.
>>>
>>>I'll be awaiting the patch and give it a shot.
>>>Thanks all for the feedback.
>>
>>you can try my patch too, it will do the job of fixing the bug. The
>>other changes we talked about are only improvements to the debugging
>>infrastructure.
> 

> Saw there's an mm5 release already, is your patch included in this one
> also?
I've checked that -mm5 contains fix to the BUG() you reported.
It's slight different (looks almost like Ingo Molnar patch), but it's there.

Kirill


^ permalink raw reply	[flat|nested] 11+ messages in thread

end of thread, other threads:[~2004-09-13 15:02 UTC | newest]

Thread overview: 11+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2004-09-12 18:48 kernel 2.6.9-rc1-mm4 oops Roel van der Made
2004-09-13  8:06 ` [PATCH]: " Kirill Korotaev
2004-09-13  8:05   ` William Lee Irwin III
2004-09-13  8:31   ` Ingo Molnar
2004-09-13  9:15     ` Kirill Korotaev
2004-09-13  9:24       ` Ingo Molnar
2004-09-13 13:34         ` Roel van der Made
2004-09-13 13:38           ` Ingo Molnar
2004-09-13 13:42             ` Roel van der Made
2004-09-13 15:03               ` Kirill Korotaev
2004-09-13 14:39         ` Kirill Korotaev

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).