* BUG: wait_task_zombie NULL dereference
From: Bill Huey (hui) @ 2012-12-04 13:48 UTC
To: Linux Kernel Mailing List; +Cc: ebiederm
I'm hitting this under a heavy scheduler test load with SCHED_RR tasks
exiting normally after completion and the parent exiting with some of
the pthreads still running:
(gdb) bt
#0 no_context (regs=0xffff880018c55d58, error_code=0, address=4,
signal=signal@entry=11,
si_code=si_code@entry=196609) at arch/x86/mm/fault.c:630
#1 0xffffffff816a02fe in __bad_area_nosemaphore
(regs=regs@entry=0xffff880018c55d58,
error_code=error_code@entry=0, address=address@entry=4,
si_code=si_code@entry=196609)
at arch/x86/mm/fault.c:767
#2 0xffffffff816a0565 in __bad_area (si_code=196609, address=4,
error_code=0, regs=0xffff880018c55d58)
at arch/x86/mm/fault.c:789
#3 bad_area (regs=regs@entry=0xffff880018c55d58,
error_code=error_code@entry=0, address=address@entry=4)
at arch/x86/mm/fault.c:795
#4 0xffffffff816b381c in do_page_fault
(regs=regs@entry=0xffff880018c55d58, error_code=error_code@entry=0)
at arch/x86/mm/fault.c:1159
#5 0xffffffff816b2ff5 in do_async_page_fault
(regs=0xffff880018c55d58, error_code=0) at arch/x86/kernel/kvm.c:246
#6 <signal handler called>
#7 wait_task_zombie (p=0xffff88003a034500, wo=0xffff880018c55f00) at
kernel/exit.c:1224
#8 wait_consider_task (p=0xffff88003a034500, ptrace=0,
wo=0xffff880018c55f00) at kernel/exit.c:1591
#9 wait_consider_task (wo=0xffff880018c55f00, ptrace=0,
p=0xffff88003a034500) at kernel/exit.c:1544
#10 0xffffffff8105a910 in do_wait_thread (tsk=0xffff88002f510000,
wo=0xffff880018c55f00) at kernel/exit.c:1666
#11 do_wait (wo=wo@entry=0xffff880018c55f00) at kernel/exit.c:1735
#12 0xffffffff8105bd45 in sys_wait4 (upid=<optimized out>,
stat_addr=0x7fff40f4168c, options=<optimized out>,
ru=0x0 <irq_stack_union>) at kernel/exit.c:1865
#13 <signal handler called>
#14 0x00007f58c4d7f4ea in ?? ()
#15 0xffff88000000001b in ?? ()
#16 0xdead4ead001e001e in ?? ()
#17 0x00000000ffffffff in ?? ()
#18 0xffffffffffffffff in ?? ()
#19 0xffffffff8280e5e8 in __key.30461 ()
#20 0xffffffff8205f850 in lock_classes ()
#21 0x0000000000000000 in ?? ()
(gdb) down
#7 wait_task_zombie (p=0xffff88003a034500, wo=0xffff880018c55f00) at
kernel/exit.c:1224
1224 kuid_t two= task_uid(p);
[ 23.324284] BUG: unable to handle kernel NULL pointer dereference at 0000000000000004
[ 23.324284] IP: [<ffffffff8105a1a0>] wait_consider_task+0x5b0/0xc20
[ 23.324284] PGD 2fa48067 PUD 39ff4067 PMD 0
[ 23.324284] Oops: 0000 [#1] SMP
......
It looks like it crashes at that point with a NULL dereference. I
expanded the arguments to from_kuid_munged() onto separate lines so
that gdb could point at a specific line.
bill
* Re: BUG: wait_task_zombie NULL dereference
From: Bill Huey (hui) @ 2012-12-04 14:03 UTC
To: Linux Kernel Mailing List; +Cc: ebiederm
I should add that I encountered this on 3.6.0 with some mild
modifications to the scheduler path that enqueue/dequeue a task before
any of the schedule exit logic gets hit. The SCHED_FF/FIFO rebalancer
does much the same so I can't imagine that being the source of the
problem.
I could be wrong, however.
bill
On Tue, Dec 4, 2012 at 5:48 AM, Bill Huey (hui) <bill.huey@gmail.com> wrote:
> I'm hitting this under a heavy scheduler test load with SCHED_RR tasks
> exiting normally after completion and the parent exiting with some of
> the pthreads still running:
* Re: BUG: wait_task_zombie NULL dereference
From: Eric W. Biederman @ 2012-12-04 19:20 UTC
To: Bill Huey (hui); +Cc: Linux Kernel Mailing List
"Bill Huey (hui)" <bill.huey@gmail.com> writes:
> I should add that I encountered this on 3.6.0 with some mild
> modifications to the scheduler path that enqueue/dequeue a task before
> any of the schedule exit logic gets hit. The SCHED_FF/FIFO rebalancer
> does much the same so I can't imagine that being the source of the
> problem.
>
> I could be wrong however.
In 3.6, from_kuid_munged should only be expanded to the inline noop
version.
The line you quote does not exist in wait_task_zombie in kernel/exit.c,
and has not existed there in Linus's tree. Since I can't see the code,
I can't help.
I suspect the bug relates to your local modifications.
Eric