* POSIX timer deletion race?
@ 2007-07-13 18:19 Jeremy Katz
2007-07-17 8:53 ` [PATCH] posix-timer: fix deletion race Thomas Gleixner
0 siblings, 1 reply; 24+ messages in thread
From: Jeremy Katz @ 2007-07-13 18:19 UTC (permalink / raw)
To: linux-kernel
[also sent to George Anzinger]
Hi,
I'm seeing a problem related to deleting posix timers.
Using a moderately patched 2.6.14 kernel, I am encountering a trio of BUGs
involving the posix timers. This is running on an SMP xeon (Intel MPCBL
0030 and 0001) system with 200 threads managing a large number of timers.
The issue appears to be that itimer_delete() and sys_timer_delete() both do
something along the lines of this:
sys_timer_delete(timer_t timer_id)
{
...
unlock_timer(timer, flags);
release_posix_timer(timer, IT_ID_SET);
return 0;
As such, there's a window of opportunity for something else to grab the timer
and attempt an operation of one sort or another. Eventually, either
sigqueue_free() or send_sigqueue() detect that the signal queue has been
deleted, and call BUG_ON:
kernel/signal.c:
void sigqueue_free(struct sigqueue *q)
{
unsigned long flags;
BUG_ON(!(q->flags & SIGQUEUE_PREALLOC));
...
Reading the kernel.org pristine 2.6.22, the same sequence occurs.
kernel BUG at <bad filename>:52971!
invalid operand: 0000 [#1]
SMP
LTT NESTING LEVEL : 0
Modules linked in: ipmi_watchdog ipmi_si ipmi_devintf ipmi_msghandler
softdog b
in
fmt_misc video thermal processor fan button battery ac sctp uhci_hcd
usbcore ip
6_
tables ip_tables ipv6
CPU: 0
EIP: 0060:[<c0138fdb>] Not tainted VLI
EFLAGS: 00010246 (2.6.14.7-selinux1-WR1.4aq_cgl)
EIP is at sigqueue_free+0x3b/0x7d
eax: 00000000 ebx: de28f9c4 ecx: ddcd9e60 edx: 00000292
esi: de3336f8 edi: 00000000 ebp: df829f90 esp: df829f8c
ds: 007b es: 007b ss: 0068
Process hrtm_test (pid: 16282, threadinfo=df829000 task=dfe9d350)
Stack: 00000292 df829fa0 c013ff57 df829000 de3336f8 df829fb4 c0140802
00000286
000003b9 00000000 df829000 c039e3ec 000003b9 4a248ff4 5a019aa0
00000000
00000000 b42fc3c8 00000107 c039007b 0000007b 00000107 4a2460c0
00000073
Call Trace:
[<c0103fad>] show_stack+0x7a/0x90
[<c010412b>] show_registers+0x14f/0x1c7
[<c010516b>] die+0x11a/0x195
[<c039ff79>] do_trap+0x1991/0x205b
[<c0105417>] do_invalid_op+0xa3/0xad
[<c0103cdb>] error_code+0x4f/0x54
[<c013ff57>] release_posix_timer+0x1b/0x7b
[<c0140802>] sys_timer_delete+0xd9/0x10f
[<c039e3ec>] no_syscall_entry_trace+0xb/0xf
Code: 2e 83 e0 fe 89 43 0c 8b 83 90 00 00 00 f0 ff 48 0c 8b 83 90 00 00 00
e8 6
6
df ff ff 89 da a1 e0 ea 4b c0 e8 69 56 02 00 5b 5d c3 <0f> 0b eb ce b8 00
3e 46
c
0 e8 61 50 26 00 8b 43 08 e8 69 50 26
The other two occur via calls to itimer_delete(), and posix_timer_event(), in
similar fashion. Unfortunately, I don't have test code that I can share.
Version details:
Linux thermopolis-1
2.6.14.7-selinux1-WR1.4aq_cgl #18 SMP Thu
Jul 12 20:03:02 PDT 2007 i686 GNU/Linux
Gnu C 3.4.4
util-linux 2.12
mount 2.12
module-init-tools 3.2.2
e2fsprogs 1.38
reiserfsprogs 3.6.19
xfsprogs 2.7.11
PPP 2.4.3
Linux C Library 2.3.6
Dynamic linker (ldd) 2.3.6
Linux C++ Library 6.0.3
Procps 3.2.6
Net-tools 1.60
Kbd 1.12
Sh-utils 5.2.1
udev 079
Modules Loaded ipmi_watchdog
sd_mod ipmi_si ipmi_devintf
ipmi_msghandler softdog binfmt_misc video
thermal processor fan button battery ac
usb_storage scsi_mod sctp uhci_hcd usbcore
ip6_tables ip_tables ipv6
--------------------------------
Jeremy Katz, Senior Engineer, Wind River
Telephone: +1 510 749 2901
^ permalink raw reply [flat|nested] 24+ messages in thread
* [PATCH] posix-timer: fix deletion race
2007-07-13 18:19 POSIX timer deletion race? Jeremy Katz
@ 2007-07-17 8:53 ` Thomas Gleixner
2007-07-17 9:57 ` Ingo Molnar
` (2 more replies)
0 siblings, 3 replies; 24+ messages in thread
From: Thomas Gleixner @ 2007-07-17 8:53 UTC (permalink / raw)
To: Jeremy Katz
Cc: linux-kernel, Andrew Morton, Ingo Molnar, Oleg Nesterov, Stable Team
Jeremy Katz experienced a posix-timer related bug on 2.6.14. This is
caused by a subtle race, which is there since the original posix timer
commit and persists until today.
timer_delete does:
lock_timer();
timer->it_process = NULL;
unlock_timer();
release_posix_timer();
timer->it_process is checked in lock_timer() to prevent access to a
timer, which is on the way to be deleted, but the check happens after
idr_lock is dropped. This allows release_posix_timer() to delete the
timer before the lock code can check the timer:
CPU 0 CPU 1
lock_timer();
timer->it_process = NULL;
unlock_timer();
lock_timer()
spin_lock(idr_lock);
timer = idr_find();
spin_lock(timer->lock);
spin_unlock(idr_lock);
release_posix_timer();
spin_lock(idr_lock);
idr_remove(timer);
spin_unlock(idr_lock);
free_timer(timer);
if (timer->......)
Change the locking to prevent this.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
diff --git a/kernel/posix-timers.c b/kernel/posix-timers.c
index 329ce01..c2ac6fd 100644
--- a/kernel/posix-timers.c
+++ b/kernel/posix-timers.c
@@ -605,13 +605,14 @@ static struct k_itimer * lock_timer(timer_t timer_id, unsigned long *flags)
timr = (struct k_itimer *) idr_find(&posix_timers_id, (int) timer_id);
if (timr) {
spin_lock(&timr->it_lock);
- spin_unlock(&idr_lock);
if ((timr->it_id != timer_id) || !(timr->it_process) ||
timr->it_process->tgid != current->tgid) {
- unlock_timer(timr, *flags);
+ spin_unlock(&timr->it_lock);
+ spin_unlock_irqrestore(&idr_lock, *flags);
timr = NULL;
- }
+ } else
+ spin_unlock(&idr_lock);
} else
spin_unlock_irqrestore(&idr_lock, *flags);
^ permalink raw reply related [flat|nested] 24+ messages in thread
* Re: [PATCH] posix-timer: fix deletion race
2007-07-17 8:53 ` [PATCH] posix-timer: fix deletion race Thomas Gleixner
@ 2007-07-17 9:57 ` Ingo Molnar
2007-07-17 18:29 ` Jeremy Katz
2007-07-17 13:07 ` Oleg Nesterov
2007-07-17 18:39 ` Jeremy Katz
2 siblings, 1 reply; 24+ messages in thread
From: Ingo Molnar @ 2007-07-17 9:57 UTC (permalink / raw)
To: Thomas Gleixner
Cc: Jeremy Katz, linux-kernel, Andrew Morton, Oleg Nesterov, Stable Team
* Thomas Gleixner <tglx@linutronix.de> wrote:
> Jeremy Katz experienced a posix-timer related bug on 2.6.14. This is
> caused by a subtle race, which is there since the original posix timer
> commit and persists until today.
>
> timer_delete does:
> lock_timer();
> timer->it_process = NULL;
> unlock_timer();
> release_posix_timer();
>
> timer->it_process is checked in lock_timer() to prevent access to a
> timer, which is on the way to be deleted, but the check happens after
> idr_lock is dropped. This allows release_posix_timer() to delete the
> timer before the lock code can check the timer:
>
> CPU 0 CPU 1
> lock_timer();
> timer->it_process = NULL;
> unlock_timer();
> lock_timer()
> spin_lock(idr_lock);
> timer = idr_find();
> spin_lock(timer->lock);
> spin_unlock(idr_lock);
> release_posix_timer();
> spin_lock(idr_lock);
> idr_remove(timer);
> spin_unlock(idr_lock);
> free_timer(timer);
> if (timer->......)
>
> Change the locking to prevent this.
>
> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
nice one! The race looks pretty narrow - Jeremy, does your Xens have
hyperthreading? (or are there any heavy SMI sources perhaps that could
open up this race.) If not then there might be some other bug lurking in
there as well.
Acked-by: Ingo Molnar <mingo@elte.hu>
Ingo
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [PATCH] posix-timer: fix deletion race
2007-07-17 8:53 ` [PATCH] posix-timer: fix deletion race Thomas Gleixner
2007-07-17 9:57 ` Ingo Molnar
@ 2007-07-17 13:07 ` Oleg Nesterov
2007-07-17 17:02 ` Thomas Gleixner
2007-07-17 18:39 ` Jeremy Katz
2 siblings, 1 reply; 24+ messages in thread
From: Oleg Nesterov @ 2007-07-17 13:07 UTC (permalink / raw)
To: Thomas Gleixner
Cc: Jeremy Katz, linux-kernel, Andrew Morton, Ingo Molnar, Stable Team
On 07/17, Thomas Gleixner wrote:
>
> Jeremy Katz experienced a posix-timer related bug on 2.6.14. This is
> caused by a subtle race, which is there since the original posix timer
> commit and persists until today.
>
> timer_delete does:
> lock_timer();
> timer->it_process = NULL;
> unlock_timer();
> release_posix_timer();
>
> timer->it_process is checked in lock_timer() to prevent access to a
> timer, which is on the way to be deleted, but the check happens after
> idr_lock is dropped. This allows release_posix_timer() to delete the
> timer before the lock code can check the timer:
>
> CPU 0 CPU 1
> lock_timer();
> timer->it_process = NULL;
> unlock_timer();
> lock_timer()
> spin_lock(idr_lock);
> timer = idr_find();
> spin_lock(timer->lock);
> spin_unlock(idr_lock);
> release_posix_timer();
> spin_lock(idr_lock);
> idr_remove(timer);
> spin_unlock(idr_lock);
> free_timer(timer);
> if (timer->......)
This is funny. I do remember this bug was discussed a couple of years ago,
and the conclusion was "we can fix it later" :)
> --- a/kernel/posix-timers.c
> +++ b/kernel/posix-timers.c
> @@ -605,13 +605,14 @@ static struct k_itimer * lock_timer(timer_t timer_id, unsigned long *flags)
> timr = (struct k_itimer *) idr_find(&posix_timers_id, (int) timer_id);
> if (timr) {
> spin_lock(&timr->it_lock);
> - spin_unlock(&idr_lock);
>
> if ((timr->it_id != timer_id) || !(timr->it_process) ||
> timr->it_process->tgid != current->tgid) {
> - unlock_timer(timr, *flags);
> + spin_unlock(&timr->it_lock);
> + spin_unlock_irqrestore(&idr_lock, *flags);
> timr = NULL;
> - }
> + } else
> + spin_unlock(&idr_lock);
> } else
> spin_unlock_irqrestore(&idr_lock, *flags);
I think we can make a simpler patch,
--- posix-timers.c~ 2007-06-29 14:45:04.000000000 +0400
+++ posix-timers.c 2007-07-17 16:59:45.000000000 +0400
@@ -449,6 +449,9 @@ static void release_posix_timer(struct k
idr_remove(&posix_timers_id, tmr->it_id);
spin_unlock_irqrestore(&idr_lock, flags);
}
+
+ spin_unlock_wait(tmr->it_lock);
+
sigqueue_free(tmr->sigq);
if (unlikely(tmr->it_process) &&
tmr->it_sigev_notify == (SIGEV_SIGNAL|SIGEV_THREAD_ID))
What do you think? Instead of complicating the lock_timer(), release_posix_timer()
just makes sure that nobody can "use" tmr.
(Actually, I am not sure this is my idea, perhaps something like above was suggested
by somebody else, I forgot the discussion completely).
Oleg.
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [PATCH] posix-timer: fix deletion race
2007-07-17 13:07 ` Oleg Nesterov
@ 2007-07-17 17:02 ` Thomas Gleixner
0 siblings, 0 replies; 24+ messages in thread
From: Thomas Gleixner @ 2007-07-17 17:02 UTC (permalink / raw)
To: Oleg Nesterov
Cc: Jeremy Katz, linux-kernel, Andrew Morton, Ingo Molnar, Stable Team
On Tue, 2007-07-17 at 17:07 +0400, Oleg Nesterov wrote:
> I think we can make a simpler patch,
>
> --- posix-timers.c~ 2007-06-29 14:45:04.000000000 +0400
> +++ posix-timers.c 2007-07-17 16:59:45.000000000 +0400
> @@ -449,6 +449,9 @@ static void release_posix_timer(struct k
> idr_remove(&posix_timers_id, tmr->it_id);
> spin_unlock_irqrestore(&idr_lock, flags);
> }
> +
> + spin_unlock_wait(tmr->it_lock);
> +
> sigqueue_free(tmr->sigq);
> if (unlikely(tmr->it_process) &&
> tmr->it_sigev_notify == (SIGEV_SIGNAL|SIGEV_THREAD_ID))
>
>
> What do you think? Instead of complicating the lock_timer(), release_posix_timer()
> just makes sure that nobody can "use" tmr.
Hmm, I prefer to fix that at the place where the race actually happens.
tglx
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [PATCH] posix-timer: fix deletion race
2007-07-17 9:57 ` Ingo Molnar
@ 2007-07-17 18:29 ` Jeremy Katz
0 siblings, 0 replies; 24+ messages in thread
From: Jeremy Katz @ 2007-07-17 18:29 UTC (permalink / raw)
To: Ingo Molnar
Cc: Thomas Gleixner, linux-kernel, Andrew Morton, Oleg Nesterov, Stable Team
On Tue, 17 Jul 2007, Ingo Molnar wrote:
> nice one! The race looks pretty narrow - Jeremy, does your Xens have
> hyperthreading? (or are there any heavy SMI sources perhaps that could
> open up this race.) If not then there might be some other bug lurking in
> there as well.
Affirmative. 2 cores, 2 hyperthreading units per.
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [PATCH] posix-timer: fix deletion race
2007-07-17 8:53 ` [PATCH] posix-timer: fix deletion race Thomas Gleixner
2007-07-17 9:57 ` Ingo Molnar
2007-07-17 13:07 ` Oleg Nesterov
@ 2007-07-17 18:39 ` Jeremy Katz
2007-07-17 20:17 ` Thomas Gleixner
2 siblings, 1 reply; 24+ messages in thread
From: Jeremy Katz @ 2007-07-17 18:39 UTC (permalink / raw)
To: Thomas Gleixner
Cc: linux-kernel, Andrew Morton, Ingo Molnar, Oleg Nesterov, Stable Team
On Tue, 17 Jul 2007, Thomas Gleixner wrote:
> Jeremy Katz experienced a posix-timer related bug on 2.6.14. This is
> caused by a subtle race, which is there since the original posix timer
> commit and persists until today.
>
> timer_delete does:
> lock_timer();
> timer->it_process = NULL;
> unlock_timer();
> release_posix_timer();
>
> timer->it_process is checked in lock_timer() to prevent access to a
> timer, which is on the way to be deleted, but the check happens after
> idr_lock is dropped. This allows release_posix_timer() to delete the
> timer before the lock code can check the timer:
>
> CPU 0 CPU 1
> lock_timer();
> timer->it_process = NULL;
> unlock_timer();
> lock_timer()
> spin_lock(idr_lock);
> timer = idr_find();
> spin_lock(timer->lock);
> spin_unlock(idr_lock);
> release_posix_timer();
> spin_lock(idr_lock);
> idr_remove(timer);
> spin_unlock(idr_lock);
> free_timer(timer);
> if (timer->......)
>
> Change the locking to prevent this.
I tried the patch with my test case, but still see the issue.
Here's my explanation of the double free race:
CPU 0 CPU 1
sys_timer_delete():
lock_timer();
...
unlock_timer(); itimer_delete()
release_posix_timer(): spin_lock_irqsave(timer...)
... ...
sigqueue_free() unlock_timer()
... sigqueue_free()
BUG_ON(!(q->flags...
the timer fires during deletion:
CPU 0 CPU 1
sys_timer_delete():
lock_timer();
...
unlock_timer(); posix_timer_fn()
release_posix_timer(): spin_lock_irqsave(timer...)
... ...
sigqueue_free() posix_timer_event()
... ...
send_sigqueue()
BUG_ON(!(q->flags
Currently, this appears to fix the problem. The system survived 19 hours
before I rebooted to try other things, while the I usually see at least
one of the BUGs within 15 minutes.
--- linux-2.6.14-cgl.original/kernel/posix-timers.c 2007-07-11 16:06:47.000000000 -0700
+++ linux-2.6.14-cgl/kernel/posix-timers.c 2007-07-16 15:25:43.000000000 -0700
@@ -339,7 +339,9 @@ static int posix_timer_fn(void *data)
int si_private = 0;
int ret = HRTIMER_NORESTART;
- spin_lock_irqsave(&timr->it_lock, flags);
+ if(lock_timer(timr->it_id, &flags) == NULL) {
+ return ret;
+ }
if (timr->it.real.interval.tv64 != 0)
si_private = ++timr->it_requeue_pending;
@@ -411,8 +413,10 @@ static void release_posix_timer(struct k
{
if (it_id_set) {
unsigned long flags;
+ /* unmangle timer id */
+ int it_id = tmr->it_id & INT_MAX;
spin_lock_irqsave(&idr_lock, flags);
- idr_remove(&posix_timers_id, tmr->it_id);
+ idr_remove(&posix_timers_id, it_id);
spin_unlock_irqrestore(&idr_lock, flags);
}
sigqueue_free(tmr->sigq);
@@ -545,8 +549,11 @@ sys_timer_create(const clockid_t which_c
*/
out:
- if (error)
+ if (error) {
+ /* mangle the timer id for the release */
+ new_timer->it_id |= ~INT_MAX;
release_posix_timer(new_timer, it_id_set);
+ }
return error;
}
@@ -822,6 +829,8 @@ retry_delete:
put_task_struct(timer->it_process);
timer->it_process = NULL;
}
+ /* mangle the timer ID to prevent anyone else from finding it */
+ timer->it_id |= ~INT_MAX;
unlock_timer(timer, flags);
release_posix_timer(timer, IT_ID_SET);
return 0;
@@ -835,7 +844,9 @@ static inline void itimer_delete(struct
unsigned long flags;
retry_delete:
- spin_lock_irqsave(&timer->it_lock, flags);
+ /* timer already deleted? */
+ if (lock_timer(timer->it_id, &flags) == NULL)
+ return;
if (timer_delete_hook(timer) == TIMER_RETRY) {
unlock_timer(timer, flags);
@@ -851,6 +862,8 @@ retry_delete:
put_task_struct(timer->it_process);
timer->it_process = NULL;
}
+ /* mangle the timer ID to prevent anyone else from finding it */
+ timer->it_id |= ~INT_MAX;
unlock_timer(timer, flags);
release_posix_timer(timer, IT_ID_SET);
}
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [PATCH] posix-timer: fix deletion race
2007-07-17 18:39 ` Jeremy Katz
@ 2007-07-17 20:17 ` Thomas Gleixner
2007-07-17 23:12 ` Jeremy Katz
0 siblings, 1 reply; 24+ messages in thread
From: Thomas Gleixner @ 2007-07-17 20:17 UTC (permalink / raw)
To: Jeremy Katz
Cc: linux-kernel, Andrew Morton, Ingo Molnar, Oleg Nesterov, Stable Team
On Tue, 2007-07-17 at 11:39 -0700, Jeremy Katz wrote:
> I tried the patch with my test case, but still see the issue.
> Here's my explanation of the double free race:
> CPU 0 CPU 1
> sys_timer_delete():
> lock_timer();
> ...
> unlock_timer(); itimer_delete()
> release_posix_timer(): spin_lock_irqsave(timer...)
> ... ...
> sigqueue_free() unlock_timer()
> ... sigqueue_free()
> BUG_ON(!(q->flags...
With 2.6.14 or with current mainline ?
I doubt that this can happen, as itimer_delete() is only called, when
there are no more references to the shared sig struct. At this point no
other related thread can run sys_timer_delete().
If this happens, then something else is completely broken.
> the timer fires during deletion:
> CPU 0 CPU 1
> sys_timer_delete():
> lock_timer();
> ...
> unlock_timer(); posix_timer_fn()
> release_posix_timer(): spin_lock_irqsave(timer...)
> ... ...
> sigqueue_free() posix_timer_event()
> ... ...
> send_sigqueue()
> BUG_ON(!(q->flags
>
I do not buy that either. In sys_timer_delete() the active timer is
canceled. If we can not cancel it then we retry until the timer callback
has been processed. This is even true for 2.6.14 (pre hrtimer).
> --- linux-2.6.14-cgl.original/kernel/posix-timers.c 2007-07-11 16:06:47.000000000 -0700
> +++ linux-2.6.14-cgl/kernel/posix-timers.c 2007-07-16 15:25:43.000000000 -0700
> @@ -339,7 +339,9 @@ static int posix_timer_fn(void *data)
> int si_private = 0;
> int ret = HRTIMER_NORESTART;
>
> - spin_lock_irqsave(&timr->it_lock, flags);
> + if(lock_timer(timr->it_id, &flags) == NULL) {
> + return ret;
> + }
This is wrong. You look at something which is not valid anymore, if your
theory is correct. You are just pampering over the real bug.
> @@ -835,7 +844,9 @@ static inline void itimer_delete(struct
> unsigned long flags;
>
> retry_delete:
> - spin_lock_irqsave(&timer->it_lock, flags);
> + /* timer already deleted? */
> + if (lock_timer(timer->it_id, &flags) == NULL)
> + return;
Same here.
Can you reproduce the problem against mainline + my patch applied ?
tglx
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [PATCH] posix-timer: fix deletion race
2007-07-17 20:17 ` Thomas Gleixner
@ 2007-07-17 23:12 ` Jeremy Katz
2007-07-17 23:58 ` Jeremy Katz
0 siblings, 1 reply; 24+ messages in thread
From: Jeremy Katz @ 2007-07-17 23:12 UTC (permalink / raw)
To: Thomas Gleixner
Cc: linux-kernel, Andrew Morton, Ingo Molnar, Oleg Nesterov, Stable Team
On Tue, 17 Jul 2007, Thomas Gleixner wrote:
> On Tue, 2007-07-17 at 11:39 -0700, Jeremy Katz wrote:
>> I tried the patch with my test case, but still see the issue.
>> Here's my explanation of the double free race:
>> CPU 0 CPU 1
>> sys_timer_delete():
>> lock_timer();
>> ...
>> unlock_timer(); itimer_delete()
>> release_posix_timer(): spin_lock_irqsave(timer...)
>> ... ...
>> sigqueue_free() unlock_timer()
>> ... sigqueue_free()
>> BUG_ON(!(q->flags...
>
> With 2.6.14 or with current mainline ?
I haven't been keeping notes quite as studiously as I should have been,
but this just occurred with 2.6.22.1 + the hrt6 patch + your proposed fix:
------------[ cut here ]------------
Kernel BUG at c0125987 [verbose debug info unavailable]
invalid opcode: 0000 [#1]
SMP
Modules linked in:
CPU: 0
EIP: 0060:[<c0125987>] Not tainted VLI
EFLAGS: 00010246 (2.6.22-hrt1-WR1.4aq_cgl #1)
EIP is at sigqueue_free+0x23/0x72
eax: 00000000 ebx: f713a7d8 ecx: f79e25e0 edx: 00000202
esi: f6f4fa1c edi: 00000283 ebp: f7207e54 esp: f7207e4c
ds: 007b es: 007b fs: 00d8 gs: 0033 ss: 0068
Process hrtm_test (pid: 19369, ti=f7207000 task=f729ea50 task.ti=f7207000)
Stack: 00000202 f6f4fa1c f7207e64 c012c7ca f6f4fa1c f6f4fa24 f7207e78
c012d110
f77463c0 f77463fc 00000001 f7207e88 c012d14b f77463c0 f729ea50
f7207eb4
c011e57c f729eea4 00000006 f729ea50 f7207ea4 c01244e8 00000006
f77463c0
Call Trace:
[<c0103504>] show_trace_log_lvl+0x1a/0x30
[<c01035bb>] show_stack_log_lvl+0x8d/0xaa
[<c01037f5>] show_registers+0x1cd/0x2cb
[<c0103a4a>] die+0x113/0x207
[<c0395cf5>] do_trap+0x8f/0xc6
[<c0103d35>] do_invalid_op+0x88/0x92
[<c0395ac2>] error_code+0x72/0x78
[<c012c7ca>] release_posix_timer+0x1b/0x7a
[<c012d110>] itimer_delete+0x9b/0xc0
[<c012d14b>] exit_itimers+0x16/0x21
[<c011e57c>] do_exit+0x300/0x3e8
[<c011e6b2>] do_group_exit+0x29/0x6f
[<c012643a>] get_signal_to_deliver+0x210/0x298
[<c010256d>] do_signal+0x5c/0x158
[<c01026a5>] do_notify_resume+0x3c/0x3f
[<c0102866>] work_notifysig+0x13/0x19
> I doubt that this can happen, as itimer_delete() is only called, when
> there are no more references to the shared sig struct. At this point no
> other related thread can run sys_timer_delete().
I came to the above explanation after failing to find another path that
clears the SIGQUEUE_PREALLOC flag. Some form of memory corruption could
be to blame.
>> the timer fires during deletion:
>> CPU 0 CPU 1
>> sys_timer_delete():
>> lock_timer();
>> ...
>> unlock_timer(); posix_timer_fn()
>> release_posix_timer(): spin_lock_irqsave(timer...)
>> ... ...
>> sigqueue_free() posix_timer_event()
>> ... ...
>> send_sigqueue()
>> BUG_ON(!(q->flags
>>
>
> I do not buy that either. In sys_timer_delete() the active timer is
> canceled. If we can not cancel it then we retry until the timer callback
> has been processed. This is even true for 2.6.14 (pre hrtimer).
>
>> --- linux-2.6.14-cgl.original/kernel/posix-timers.c 2007-07-11 16:06:47.000000000 -0700
>> +++ linux-2.6.14-cgl/kernel/posix-timers.c 2007-07-16 15:25:43.000000000 -0700
>> @@ -339,7 +339,9 @@ static int posix_timer_fn(void *data)
>> int si_private = 0;
>> int ret = HRTIMER_NORESTART;
>>
>> - spin_lock_irqsave(&timr->it_lock, flags);
>> + if(lock_timer(timr->it_id, &flags) == NULL) {
>> + return ret;
>> + }
>
> This is wrong. You look at something which is not valid anymore, if your
> theory is correct. You are just pampering over the real bug.
Noticed right after I sent the patch out. My goal was to validate and
lock the timer through a single interface, but itimer_delete() and
posix_timer_fn() currently recieve a k_itimer * instead of a timer_t. I
should have reworked them.
>> @@ -835,7 +844,9 @@ static inline void itimer_delete(struct
>> unsigned long flags;
>>
>> retry_delete:
>> - spin_lock_irqsave(&timer->it_lock, flags);
>> + /* timer already deleted? */
>> + if (lock_timer(timer->it_id, &flags) == NULL)
>> + return;
>
> Same here.
Thanks for the reply,
Jeremy
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [PATCH] posix-timer: fix deletion race
2007-07-17 23:12 ` Jeremy Katz
@ 2007-07-17 23:58 ` Jeremy Katz
2007-07-18 6:05 ` Thomas Gleixner
2007-07-18 16:11 ` Oleg Nesterov
0 siblings, 2 replies; 24+ messages in thread
From: Jeremy Katz @ 2007-07-17 23:58 UTC (permalink / raw)
To: Thomas Gleixner
Cc: linux-kernel, Andrew Morton, Ingo Molnar, Oleg Nesterov, Stable Team
On Tue, 17 Jul 2007, Jeremy Katz wrote:
> On Tue, 17 Jul 2007, Thomas Gleixner wrote:
>
>> With 2.6.14 or with current mainline ?
>
> I haven't been keeping notes quite as studiously as I should have been, but
> this just occurred with 2.6.22.1 + the hrt6 patch + your proposed fix:
Scratch that. I had infrastructure problems, and ended up using the wrong
build.
This is with the patch (and 2.6.22.1 and hrt6):
------------[ cut here ]------------
Kernel BUG at c0125adb [verbose debug info unavailable]
invalid opcode: 0000 [#1]
SMP
Modules linked in:
CPU: 3
EIP: 0060:[<c0125adb>] Not tainted VLI
EFLAGS: 00010246 (2.6.22.1-WR1.4aq_cgl #2)
EIP is at sigqueue_free+0x23/0x72
eax: 00000000 ebx: f6eaf1a8 ecx: f6d4b888 edx: 00000202
esi: f73e2ba8 edi: 00000000 ebp: f7c7bf8c esp: f7c7bf84
ds: 007b es: 007b fs: 00d8 gs: 0033 ss: 0068
Process hrtm_test (pid: 15340, ti=f7c7b000 task=f6c0f030 task.ti=f7c7b000)
Stack: 00000202 f73e2ba8 f7c7bf9c c012c941 f73e2ba8 000002fb f7c7bfb0
c012d1c0
00000286 000002fb 00000000 f7c7b000 c01027ca 000002fb 466afff4
08064190
00000000 00000000 ae8d43c8 00000107 ffff007b c010007b 00000000
00000107
Call Trace:
[<c0103504>] show_trace_log_lvl+0x1a/0x30
[<c01035bb>] show_stack_log_lvl+0x8d/0xaa
[<c01037f5>] show_registers+0x1cd/0x2cb
[<c0103a4a>] die+0x113/0x207
[<c039aab5>] do_trap+0x8f/0xc6
[<c0103d35>] do_invalid_op+0x88/0x92
[<c039a882>] error_code+0x72/0x78
[<c012c941>] release_posix_timer+0x1b/0x7a
[<c012d1c0>] sys_timer_delete+0xd7/0x10c
[<c01027ca>] syscall_call+0x7/0xb
=======================
Code: 74 04 83 48 08 01 5d c3 55 89 e5 56 53 89 c3 8b 40 08 a8 01 74 15 39
1b 75 15 83 e0 fe 89 43 08 89 d8 e8 1d ec ff ff 5b 5e 5d c3 <0f> 0b eb fe
64 a1 00 c0 4a c0 8b b0 50 04 00 00 b8 00 1e 46 c0
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [PATCH] posix-timer: fix deletion race
2007-07-17 23:58 ` Jeremy Katz
@ 2007-07-18 6:05 ` Thomas Gleixner
2007-07-18 6:58 ` Thomas Gleixner
2007-07-18 16:11 ` Oleg Nesterov
1 sibling, 1 reply; 24+ messages in thread
From: Thomas Gleixner @ 2007-07-18 6:05 UTC (permalink / raw)
To: Jeremy Katz
Cc: linux-kernel, Andrew Morton, Ingo Molnar, Oleg Nesterov, Stable Team
On Tue, 2007-07-17 at 16:58 -0700, Jeremy Katz wrote:
> Scratch that. I had infrastructure problems, and ended up using the wrong
> build.
> EFLAGS: 00010246 (2.6.22.1-WR1.4aq_cgl #2)
Hmm. Are there any other patches on that kernel ?
Is there a chance that you can whip up a test program which is not
secret sauce ?
Also can you please enable CONFIG_PROVE_LOCKING, which might catch any
locking problem, which might be related to this.
tglx
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [PATCH] posix-timer: fix deletion race
2007-07-18 6:05 ` Thomas Gleixner
@ 2007-07-18 6:58 ` Thomas Gleixner
2007-07-18 19:02 ` Jeremy Katz
0 siblings, 1 reply; 24+ messages in thread
From: Thomas Gleixner @ 2007-07-18 6:58 UTC (permalink / raw)
To: Jeremy Katz
Cc: linux-kernel, Andrew Morton, Ingo Molnar, Oleg Nesterov, Stable Team
On Wed, 2007-07-18 at 08:05 +0200, Thomas Gleixner wrote:
> On Tue, 2007-07-17 at 16:58 -0700, Jeremy Katz wrote:
>
> > Scratch that. I had infrastructure problems, and ended up using the wrong
> > build.
>
> > EFLAGS: 00010246 (2.6.22.1-WR1.4aq_cgl #2)
>
> Hmm. Are there any other patches on that kernel ?
>
> Is there a chance that you can whip up a test program which is not
> secret sauce ?
>
> Also can you please enable CONFIG_PROVE_LOCKING, which might catch any
> locking problem, which might be related to this.
Another test: Can you please disable CONFIG_SCHED_SMT to narrow it down
further ?
tglx
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [PATCH] posix-timer: fix deletion race
2007-07-17 23:58 ` Jeremy Katz
2007-07-18 6:05 ` Thomas Gleixner
@ 2007-07-18 16:11 ` Oleg Nesterov
2007-07-18 19:08 ` Jeremy Katz
1 sibling, 1 reply; 24+ messages in thread
From: Oleg Nesterov @ 2007-07-18 16:11 UTC (permalink / raw)
To: Jeremy Katz
Cc: Thomas Gleixner, linux-kernel, Andrew Morton, Ingo Molnar, Stable Team
On 07/17, Jeremy Katz wrote:
>
> This is with the patch (and 2.6.22.1 and hrt6):
>
> ------------[ cut here ]------------
> Kernel BUG at c0125adb [verbose debug info unavailable]
> invalid opcode: 0000 [#1]
> SMP
> Modules linked in:
> CPU: 3
> EIP: 0060:[<c0125adb>] Not tainted VLI
> EFLAGS: 00010246 (2.6.22.1-WR1.4aq_cgl #2)
> EIP is at sigqueue_free+0x23/0x72
> eax: 00000000 ebx: f6eaf1a8 ecx: f6d4b888 edx: 00000202
> esi: f73e2ba8 edi: 00000000 ebp: f7c7bf8c esp: f7c7bf84
> ds: 007b es: 007b fs: 00d8 gs: 0033 ss: 0068
> Process hrtm_test (pid: 15340, ti=f7c7b000 task=f6c0f030 task.ti=f7c7b000)
> Stack: 00000202 f73e2ba8 f7c7bf9c c012c941 f73e2ba8 000002fb f7c7bfb0
> c012d1c0
> 00000286 000002fb 00000000 f7c7b000 c01027ca 000002fb 466afff4
> 08064190
> 00000000 00000000 ae8d43c8 00000107 ffff007b c010007b 00000000
> 00000107
> Call Trace:
> [<c0103504>] show_trace_log_lvl+0x1a/0x30
> [<c01035bb>] show_stack_log_lvl+0x8d/0xaa
> [<c01037f5>] show_registers+0x1cd/0x2cb
> [<c0103a4a>] die+0x113/0x207
> [<c039aab5>] do_trap+0x8f/0xc6
> [<c0103d35>] do_invalid_op+0x88/0x92
> [<c039a882>] error_code+0x72/0x78
> [<c012c941>] release_posix_timer+0x1b/0x7a
> [<c012d1c0>] sys_timer_delete+0xd7/0x10c
> [<c01027ca>] syscall_call+0x7/0xb
This looks really impossible.
I believe I see another bug in sys_timer_create(), but it is not related
to this problem.
sys_timer_create() inserts a partly initialized new_timer into posix_timers_id
and drops idr_lock. Suppose that another thread does sys_timer_delete(). If it
sees ->it_process != NULL, lock_timer() succeeds.
If the timer was not fully initialized at this time (or another CPU sees the
result of STOREs out of order), we can have multiple probles.
list_del() may oops, we can do put_task_struct() before sys_timer_create()
does get_task_struct(), we may leak the task_struct if sys_timer_delete()
doesn't see SIGEV_THREAD_ID yet.
Jeremy, I agree with Thomas that your patch should not be right, but it
does make a difference. Perhaps this is just the timing, but who knows.
Could you add some printk's to be sure that lock_timer() actually fails
while it never should?
Oleg.
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [PATCH] posix-timer: fix deletion race
2007-07-18 6:58 ` Thomas Gleixner
@ 2007-07-18 19:02 ` Jeremy Katz
2007-07-18 23:43 ` Jeremy Katz
0 siblings, 1 reply; 24+ messages in thread
From: Jeremy Katz @ 2007-07-18 19:02 UTC (permalink / raw)
To: Thomas Gleixner
Cc: linux-kernel, Andrew Morton, Ingo Molnar, Oleg Nesterov, Stable Team
On Wed, 18 Jul 2007, Thomas Gleixner wrote:
> On Wed, 2007-07-18 at 08:05 +0200, Thomas Gleixner wrote:
>> On Tue, 2007-07-17 at 16:58 -0700, Jeremy Katz wrote:
>>
>>> EFLAGS: 00010246 (2.6.22.1-WR1.4aq_cgl #2)
>>
>> Hmm. Are there any other patches on that kernel ?
Just hrt6 and your proposed fix. The build system I'm using appends the
WR1.4aq_cgl, but isn't patching anything.
>> Is there a chance that you can whip up a test program which is not
>> secret sauce ?
I've taken a brief stab at it, but haven't produced anything successful
thus far. Unfortunately, I don't even have complete source to the test
code, and the people that do have expressed an inability to release it due
to contractual obligations.
>> Also can you please enable CONFIG_PROVE_LOCKING, which might catch any
>> locking problem, which might be related to this.
>
> Another test: Can you please disable CONFIG_SCHED_SMT to narrow it down
> further ?
I'll try both of these.
Jeremy
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [PATCH] posix-timer: fix deletion race
2007-07-18 16:11 ` Oleg Nesterov
@ 2007-07-18 19:08 ` Jeremy Katz
2007-07-20 14:15 ` Oleg Nesterov
0 siblings, 1 reply; 24+ messages in thread
From: Jeremy Katz @ 2007-07-18 19:08 UTC (permalink / raw)
To: Oleg Nesterov
Cc: Thomas Gleixner, linux-kernel, Andrew Morton, Ingo Molnar, Stable Team
On Wed, 18 Jul 2007, Oleg Nesterov wrote:
> Jeremy, I agree with Thomas that your patch should not be right, but it
> does make a difference. Perhaps this is just the timing, but who knows.
> Could you add some printk's to be sure that lock_timer() actually fails
> while it never should?
Agreed.
Unfortunately, adding any significant output appears to alter the
situation to the point where the issue either does not occur, or takes
significantly longer to surface.
Jeremy
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [PATCH] posix-timer: fix deletion race
2007-07-18 19:02 ` Jeremy Katz
@ 2007-07-18 23:43 ` Jeremy Katz
2007-07-19 5:50 ` Thomas Gleixner
0 siblings, 1 reply; 24+ messages in thread
From: Jeremy Katz @ 2007-07-18 23:43 UTC (permalink / raw)
To: Thomas Gleixner
Cc: linux-kernel, Andrew Morton, Ingo Molnar, Oleg Nesterov, Stable Team
On Wed, 18 Jul 2007, Jeremy Katz wrote:
> On Wed, 18 Jul 2007, Thomas Gleixner wrote:
>
>>> Also can you please enable CONFIG_PROVE_LOCKING, which might catch any
>>> locking problem, which might be related to this.
>>
>> Another test: Can you please disable CONFIG_SCHED_SMT to narrow it down
>> further ?
>
> I'll try both of these.
I'm still seeing the sys_timer_delete version with your patch, and
sys_timer_delete and posix_timer_event without. The itimer_delete version
is currently missing in action, but getting any particular one to perform
on demand is currently not in my power.
Jeremy
Stack dumps with 2.6.22.1 + hrt6:
------------[ cut here ]------------
Kernel BUG at c01266fb [verbose debug info unavailable]
invalid opcode: 0000 [#1]
SMP
Modules linked in:
CPU: 1
EIP: 0060:[<c01266fb>] Not tainted VLI
EFLAGS: 00010246 (2.6.22.1-WR1.4aq_cgl #4)
EIP is at sigqueue_free+0x23/0x75
eax: 00000000 ebx: f6f5f868 ecx: 00000000 edx: 00000000
esi: f73d1614 edi: 00000000 ebp: f737ff8c esp: f737ff84
ds: 007b es: 007b fs: 00d8 gs: 0033 ss: 0068
Process hrtm_test (pid: 1442, ti=f737f000 task=f738b590 task.ti=f737f000)
Stack: 00000286 f73d1614 f737ff9c c012d771 f73d1614 0000067c f737ffb0
c012dfff
00000282 0000067c 00000000 f737f000 c0102802 0000067c 466afff4
92607898
00000000 00000000 936a43c8 00000107 c010007b 0000007b 00000000
00000107
Call Trace:
[<c01035c4>] show_trace_log_lvl+0x1a/0x30
[<c010367b>] show_stack_log_lvl+0x8d/0xaa
[<c01038b5>] show_registers+0x1cd/0x2cb
[<c0103b0a>] die+0x113/0x207
[<c03a925d>] do_trap+0x8f/0xc6
[<c0103df5>] do_invalid_op+0x88/0x92
[<c03a902a>] error_code+0x72/0x78
[<c012d771>] release_posix_timer+0x1b/0x7a
[<c012dfff>] sys_timer_delete+0xda/0x10f
[<c0102802>] syscall_call+0x7/0xb
=======================
Code: 74 04 83 48 08 01 5d c3 55 89 e5 56 53 89 c3 8b 40 08 a8 01 74 15 39
1b 75 15 83 e0 fe 89 43 08 89 d8 e8 ed eb ff ff 5b 5e 5d c3 <0f> 0b eb fe
64 a1 00f0 4f c0 8b b0 50 04 00 00 b8 00 3e 4b c0
EIP: [<c01266fb>] sigqueue_free+0x23/0x75 SS:ESP 0068:f737ff84
------------[ cut here ]------------
Kernel BUG at c012682d [verbose debug info unavailable]
invalid opcode: 0000 [#5]
SMP
Modules linked in:
CPU: 3
EIP: 0060:[<c012682d>] Not tainted VLI
EFLAGS: 00010046 (2.6.22.1-WR1.4aq_cgl #4)
EIP is at send_sigqueue+0xe0/0xe8
eax: 00000020 ebx: f6d44c58 ecx: f71e8aa0 edx: f6d44c58
esi: f71e8aa0 edi: 00000020 ebp: c050cf84 esp: c050cf70
ds: 007b es: 007b fs: 00d8 gs: 0000 ss: 0068
Process swapper (pid: 0, ti=c050c000 task=c2e00aa0 task.ti=c2e02000)
Stack: 00000000 f6f6bd9c f6f6bd94 f6f6bde0 f6d44ce4 c050cf94 c012d519
f6f6bd94
00000003 c050cfc0 c012d595 c2a41990 c2a41980 c050cfb4 00000046
00000000
00000282 c012d54d f6f6bde0 c2a41980 c050cfdc c0131c14 00000000
c2a41a18
Call Trace:
[<c01035c4>] show_trace_log_lvl+0x1a/0x30
[<c010367b>] show_stack_log_lvl+0x8d/0xaa
[<c01038b5>] show_registers+0x1cd/0x2cb
[<c0103b0a>] die+0x113/0x207
[<c03a925d>] do_trap+0x8f/0xc6
[<c0103df5>] do_invalid_op+0x88/0x92
[<c03a902a>] error_code+0x72/0x78
[<c012d519>] posix_timer_event+0x71/0xa5
[<c012d595>] posix_timer_fn+0x48/0xdd
[<c0131c14>] run_hrtimer_softirq+0x5a/0xba
[<c01211ac>] __do_softirq+0x7e/0xf9
[<c0104708>] do_softirq+0x8c/0x101
[<c0121292>] irq_exit+0x4b/0x4d
[<c010f315>] smp_apic_timer_interrupt+0x2f/0x39
[<c01032db>] apic_timer_interrupt+0x33/0x38
[<c0100b5c>] mwait_idle+0x12/0x14
[<c0100a11>] cpu_idle+0x7b/0x7d
[<c04d06d5>] start_secondary+0xe8/0x30c
[<00000000>] 0x0
=======================
Code: 45 ec 01 00 00 00 eb c8 89 fa 89 f0 e8 a0 99 06 00 eb 93 81 7b 14 fe
ff 01 00 75 13 83 43 1c 01 eb ae c7 45 ec ff ff ff ff eb b8 <0f> 0b eb fe
0f 0b eb fe 55 89 e5 57 89 c7 56 89 ce 53 89 d3 83
EIP: [<c012682d>] send_sigqueue+0xe0/0xe8 SS:ESP 0068:c050cf70
Kernel panic - not syncing: Fatal exception in interrupt
With 2.6.22.1 + hrt6 + tglx's proposed fix:
------------[ cut here ]------------
Kernel BUG at c01266fb [verbose debug info unavailable]
invalid opcode: 0000 [#1]
SMP
Modules linked in:
CPU: 0
EIP: 0060:[<c01266fb>] Not tainted VLI
EFLAGS: 00010246 (2.6.22.1-WR1.4aq_cgl #5)
EIP is at sigqueue_free+0x23/0x75
eax: 00000000 ebx: f7349358 ecx: 00000000 edx: 00000000
esi: f7354194 edi: 00000000 ebp: f709af8c esp: f709af84
ds: 007b es: 007b fs: 00d8 gs: 0033 ss: 0068
Process hrtm_test (pid: 1424, ti=f709a000 task=f70d0ae0 task.ti=f709a000)
Stack: 00000286 f7354194 f709af9c c012d771 f7354194 00000190 f709afb0
c012e00c
00000282 00000190 00000000 f709a000 c0102802 00000190 466afff4
8fa00fe0
00000000 00000000 972753c8 00000107 ffff007b c010007b 00000000
00000107
Call Trace:
[<c01035c4>] show_trace_log_lvl+0x1a/0x30
[<c010367b>] show_stack_log_lvl+0x8d/0xaa
[<c01038b5>] show_registers+0x1cd/0x2cb
[<c0103b0a>] die+0x113/0x207
[<c03a926d>] do_trap+0x8f/0xc6
[<c0103df5>] do_invalid_op+0x88/0x92
[<c03a903a>] error_code+0x72/0x78
[<c012d771>] release_posix_timer+0x1b/0x7a
[<c012e00c>] sys_timer_delete+0xda/0x10f
[<c0102802>] syscall_call+0x7/0xb
=======================
Code: 74 04 83 48 08 01 5d c3 55 89 e5 56 53 89 c3 8b 40 08 a8 01 74 15 39
1b 75 15 83 e0 fe 89 43 08 89 d8 e8 ed eb ff ff 5b 5e 5d c3 <0f> 0b eb fe
64 a1 00 f0 4f c0 8b b0 50 04 00 00 b8 00 3e 4b c0
EIP: [<c01266fb>] sigqueue_free+0x23/0x75 SS:ESP 0068:f709af84
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [PATCH] posix-timer: fix deletion race
2007-07-18 23:43 ` Jeremy Katz
@ 2007-07-19 5:50 ` Thomas Gleixner
2007-07-19 19:24 ` Jeremy Katz
0 siblings, 1 reply; 24+ messages in thread
From: Thomas Gleixner @ 2007-07-19 5:50 UTC (permalink / raw)
To: Jeremy Katz
Cc: linux-kernel, Andrew Morton, Ingo Molnar, Oleg Nesterov, Stable Team
On Wed, 2007-07-18 at 16:43 -0700, Jeremy Katz wrote:
> On Wed, 18 Jul 2007, Jeremy Katz wrote:
>
> > On Wed, 18 Jul 2007, Thomas Gleixner wrote:
> >
> >>> Also can you please enable CONFIG_PROVE_LOCKING, which might catch any
> >>> locking problem, which might be related to this.
> >>
> >> Another test: Can you please disable CONFIG_SCHED_SMT to narrow it down
> >> further ?
> >
> > I'll try both of these.
>
> I'm still seeing the sys_timer_delete version with your patch, and
> sys_timer_delete and posix_timer_event without. The itimer_delete version
> is currently missing in action, but getting any particular one to perform
> on demand is currently not in my power.
Ok, let me summarize:
2.6.22 + hrt6
Both problems are there whether CONFIG_SCHED_SMT is on or not
2.6.22 + hrt6 + posixtimer patch
Both problems are there whether CONFIG_SCHED_SMT is on
The timer callback problem is gone when CONFIG_SCHED_SMT is off
Correct ?
tglx
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [PATCH] posix-timer: fix deletion race
2007-07-19 5:50 ` Thomas Gleixner
@ 2007-07-19 19:24 ` Jeremy Katz
0 siblings, 0 replies; 24+ messages in thread
From: Jeremy Katz @ 2007-07-19 19:24 UTC (permalink / raw)
To: Thomas Gleixner
Cc: linux-kernel, Andrew Morton, Ingo Molnar, Oleg Nesterov, Stable Team
On Thu, 19 Jul 2007, Thomas Gleixner wrote:
> On Wed, 2007-07-18 at 16:43 -0700, Jeremy Katz wrote:
>> On Wed, 18 Jul 2007, Jeremy Katz wrote:
>>
>>> On Wed, 18 Jul 2007, Thomas Gleixner wrote:
>>>
>>>>> Also can you please enable CONFIG_PROVE_LOCKING, which might catch any
>>>>> locking problem, which might be related to this.
>>>>
>>>> Another test: Can you please disable CONFIG_SCHED_SMT to narrow it down
>>>> further ?
>>>
>>> I'll try both of these.
>>
>> I'm still seeing the sys_timer_delete version with your patch, and
>> sys_timer_delete and posix_timer_event without. The itimer_delete version
>> is currently missing in action, but getting any particular one to perform
>> on demand is currently not in my power.
>
> Ok, let me summarize:
>
> 2.6.22 + hrt6
>
> Both problems are there whether CONFIG_SCHED_SMT is on or not
If both means BUG triggered in sigqueue_free and send_sigqueue, then yes.
>
> 2.6.22 + hrt6 + posixtimer patch
>
> Both problems are there whether CONFIG_SCHED_SMT is on
> The timer callback problem is gone when CONFIG_SCHED_SMT is off
It turns out that the callback problem just took longer to show up with
CONFIG_SCHED_SMT off. I saw this when I came in this morning:
------------[ cut here ]------------
Kernel BUG at c012682d [verbose debug info unavailable]
invalid opcode: 0000 [#6]
SMP
Modules linked in:
CPU: 3
EIP: 0060:[<c012682d>] Not tainted VLI
EFLAGS: 00010046 (2.6.22.1-WR1.4aq_cgl #5)
EIP is at send_sigqueue+0xe0/0xe8
eax: 00000020 ebx: f735e7d8 ecx: f73ea030 edx: f735e7d8
esi: f73ea030 edi: 00000020 ebp: c050cf84 esp: c050cf70
ds: 007b es: 007b fs: 00d8 gs: 0033 ss: 0068
Process hrtm_test (pid: 11102, ti=c050c000 task=f6a2d550 task.ti=f6fb7000)
Stack: 00000000 f722c89c f722c894 f722c8e0 f735e864 c050cf94 c012d519
f722c894
00000025 c050cfc0 c012d595 c2a41990 c2a41980 c050cfb4 00000046
00000000
00000282 c012d54d f722c8e0 c2a41980 c050cfdc c0131c20 00000000
c2a41a18
Call Trace:
[<c01035c4>] show_trace_log_lvl+0x1a/0x30
[<c010367b>] show_stack_log_lvl+0x8d/0xaa
[<c01038b5>] show_registers+0x1cd/0x2cb
[<c0103b0a>] die+0x113/0x207
[<c03a926d>] do_trap+0x8f/0xc6
[<c0103df5>] do_invalid_op+0x88/0x92
[<c03a903a>] error_code+0x72/0x78
[<c012d519>] posix_timer_event+0x71/0xa5
[<c012d595>] posix_timer_fn+0x48/0xdd
[<c0131c20>] run_hrtimer_softirq+0x5a/0xba
[<c01211ac>] __do_softirq+0x7e/0xf9
[<c0104708>] do_softirq+0x8c/0x101
[<c0121292>] irq_exit+0x4b/0x4d
[<c010f315>] smp_apic_timer_interrupt+0x2f/0x39
[<c01032db>] apic_timer_interrupt+0x33/0x38
[<c0125308>] __sigqueue_free+0x24/0x33
[<c01266f7>] sigqueue_free+0x1f/0x75
[<c012d771>] release_posix_timer+0x1b/0x7a
[<c012e00c>] sys_timer_delete+0xda/0x10f
[<c0102802>] syscall_call+0x7/0xb
=======================
Code: 45 ec 01 00 00 00 eb c8 89 fa 89 f0 e8 ac 99 06 00 eb 93 81 7b 14 fe
ff 01 00 75 13 83 43 1c 01 eb ae c7 45 ec ff ff ff ff eb b8 <0f> 0b eb fe
0f 0b eb fe 55 89 e5 57 89 c7 56 89 ce 53 89 d3 83
EIP: [<c012682d>] send_sigqueue+0xe0/0xe8 SS:ESP 0068:c050cf70
Kernel panic - not syncing: Fatal exception in interrupt
BUG: spinlock lockup on CPU#2, hrtm_test/1355, f722c89c
[<c01035c4>] show_trace_log_lvl+0x1a/0x30
[<c01035ec>] show_trace+0x12/0x14
[<c01036e6>] dump_stack+0x16/0x18
[<c0275dca>] __spin_lock_debug+0xb3/0xc5
[<c0275e31>] _raw_spin_lock+0x55/0x73
[<c03a8ba6>] _spin_lock+0x32/0x38
[<c012daa4>] lock_timer+0x3d/0xab
[<c012df5d>] sys_timer_delete+0x2b/0x10f
[<c0102802>] syscall_call+0x7/0xb
=======================
BUG: spinlock lockup on CPU#0, hrtm_test/1374, c04918d0
BUG: spinlock lockup on CPU#1, hrtm_test/1387, c04918d0
[<c01035c4>] show_trace_log_lvl+0x1a/0x30
[<c01035ec>] show_trace+0x12/0x14
[<c01036e6>] dump_stack+0x16/0x18
[<c0275dca>] __spin_lock_debug+0xb3/0xc5
[<c0275e31>] _raw_spin_lock+0x55/0x73
[<c03a8961>] _spin_lock_irqsave+0x3f/0x4b
[<c012da85>] lock_timer+0x1e/0xab
[<c012df5d>] sys_timer_delete+0x2b/0x10f
[<c0102802>] syscall_call+0x7/0xb
=======================
[<c01035c4>] show_trace_log_lvl+0x1a/0x30
[<c01035ec>] show_trace+0x12/0x14
[<c01036e6>] dump_stack+0x16/0x18
[<c0275dca>] __spin_lock_debug+0xb3/0xc5
[<c0275e31>] _raw_spin_lock+0x55/0x73
[<c03a8961>] _spin_lock_irqsave+0x3f/0x4b
[<c012da85>] lock_timer+0x1e/0xab
[<c012df5d>] sys_timer_delete+0x2b/0x10f
[<c0102802>] syscall_call+0x7/0xb
=======================
Jeremy
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [PATCH] posix-timer: fix deletion race
2007-07-18 19:08 ` Jeremy Katz
@ 2007-07-20 14:15 ` Oleg Nesterov
2007-07-24 2:07 ` Jeremy Katz
0 siblings, 1 reply; 24+ messages in thread
From: Oleg Nesterov @ 2007-07-20 14:15 UTC (permalink / raw)
To: Jeremy Katz
Cc: Thomas Gleixner, linux-kernel, Andrew Morton, Ingo Molnar, Stable Team
On 07/18, Jeremy Katz wrote:
>
> On Wed, 18 Jul 2007, Oleg Nesterov wrote:
>
> >Jeremy, I agree with Thomas that your patch should not be right, but it
> >does make a difference. Perhaps this is just the timing, but who knows.
> >Could you add some printk's to be sure that lock_timer() actually fails
> >while it never should?
>
> Agreed.
>
> Unfortunately, adding any significant output appears to alter the
> situation to the point where the issue either does not occur, or takes
> significantly longer to surface.
No, no, I didn't mean any significant output. You changed itimer_delete()
> - spin_lock_irqsave(&timer->it_lock, flags);
> + /* timer already deleted? */
> + if (lock_timer(timer->it_id, &flags) == NULL)
> + return;
This change should not help, lock_timer() should always succeed here.
But since it makes a difference, we can make something like
if (lock_timer(timer->it_id, &flags) == NULL) {
printk("Impossible! but it happened.\n");
return;
}
The same for posix_timer_fn().
I still can't believe we have a double-free problem, this looks imposiible.
Do you see the
"idr_remove called for id=%d which is not allocated.\n"
in syslog?
Could you try the patch below? Perhaps we have some wierd problem with
->sigq corruption.
Oleg.
--- t/kernel/posix-timers.c~ 2007-06-29 14:45:04.000000000 +0400
+++ t/kernel/posix-timers.c 2007-07-20 18:07:59.000000000 +0400
@@ -443,13 +443,16 @@ static struct k_itimer * alloc_posix_tim
#define IT_ID_NOT_SET 0
static void release_posix_timer(struct k_itimer *tmr, int it_id_set)
{
+ sigqueue_free(tmr->sigq);
+ tmr->sigq = NULL;
+
if (it_id_set) {
unsigned long flags;
spin_lock_irqsave(&idr_lock, flags);
idr_remove(&posix_timers_id, tmr->it_id);
spin_unlock_irqrestore(&idr_lock, flags);
}
- sigqueue_free(tmr->sigq);
+
if (unlikely(tmr->it_process) &&
tmr->it_sigev_notify == (SIGEV_SIGNAL|SIGEV_THREAD_ID))
put_task_struct(tmr->it_process);
@@ -615,6 +618,7 @@ static struct k_itimer * lock_timer(time
} else
spin_unlock_irqrestore(&idr_lock, *flags);
+ BUG_ON(timr && !timr->sigq);
return timr;
}
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [PATCH] posix-timer: fix deletion race
2007-07-20 14:15 ` Oleg Nesterov
@ 2007-07-24 2:07 ` Jeremy Katz
2007-07-24 14:51 ` Oleg Nesterov
0 siblings, 1 reply; 24+ messages in thread
From: Jeremy Katz @ 2007-07-24 2:07 UTC (permalink / raw)
To: Oleg Nesterov
Cc: Thomas Gleixner, linux-kernel, Andrew Morton, Ingo Molnar, Stable Team
On Fri, 20 Jul 2007, Oleg Nesterov wrote:
> On 07/18, Jeremy Katz wrote:
>>
>> On Wed, 18 Jul 2007, Oleg Nesterov wrote:
>>
>>> Jeremy, I agree with Thomas that your patch should not be right, but it
>>> does make a difference. Perhaps this is just the timing, but who knows.
>>> Could you add some printk's to be sure that lock_timer() actually fails
>>> while it never should?
>>
>> Agreed.
>>
>> Unfortunately, adding any significant output appears to alter the
>> situation to the point where the issue either does not occur, or takes
>> significantly longer to surface.
>
> No, no, I didn't mean any significant output. You changed itimer_delete()
>
> > - spin_lock_irqsave(&timer->it_lock, flags);
> > + /* timer already deleted? */
> > + if (lock_timer(timer->it_id, &flags) == NULL)
> > + return;
>
> This change should not help, lock_timer() should always succeed here.
> But since it makes a difference, we can make something like
>
> if (lock_timer(timer->it_id, &flags) == NULL) {
> printk("Impossible! but it happened.\n");
> return;
> }
>
> The same for posix_timer_fn().
Ahh, of course. I did try that at some point, and remember seeing at
least the occasional failure. This time, taking the spinlock and then
checking for a valid timer ID, I did not see the locking fail. I did see
the attempt to use a freed sigqueue, further suggesting my 'fix' merely
altered the timing sufficiently to hide the issue.
> I still can't believe we have a double-free problem, this looks imposiible.
> Do you see the
>
> "idr_remove called for id=%d which is not allocated.\n"
>
> in syslog?
No. I also added some accounting with atomic counters, and don't see
evidence of a second call to release_posix_timer.
> Could you try the patch below? Perhaps we have some wierd problem with
> ->sigq corruption.
Tried, with apparent effect.
To add to the body of data: Turning off hyperthreading in the hardware
does not resolve the issue. Limiting the system to one CPU does appear to
work.
Jeremy
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [PATCH] posix-timer: fix deletion race
2007-07-24 2:07 ` Jeremy Katz
@ 2007-07-24 14:51 ` Oleg Nesterov
2007-07-24 18:36 ` Jeremy Katz
0 siblings, 1 reply; 24+ messages in thread
From: Oleg Nesterov @ 2007-07-24 14:51 UTC (permalink / raw)
To: Jeremy Katz
Cc: Thomas Gleixner, linux-kernel, Andrew Morton, Ingo Molnar, Stable Team
On 07/23, Jeremy Katz wrote:
>
> On Fri, 20 Jul 2007, Oleg Nesterov wrote:
>
> >I still can't believe we have a double-free problem, this looks imposiible.
> >Do you see the
> >
> > "idr_remove called for id=%d which is not allocated.\n"
> >
> >in syslog?
>
> No. I also added some accounting with atomic counters, and don't see
> evidence of a second call to release_posix_timer.
Interesting. Could you show the patch? Where does sys_timer_create() set
counter == 1?
> >Could you try the patch below? Perhaps we have some wierd problem with
> >->sigq corruption.
>
> Tried, with apparent effect.
You mean, "null pointer dereference" in release_posix_timer() ? Or it was
BUG_ON(timr && !timr->sigq) in lock_timer() ?
I assume you didn't apply any other patches except Thomas's, yes?
Oleg.
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [PATCH] posix-timer: fix deletion race
2007-07-24 14:51 ` Oleg Nesterov
@ 2007-07-24 18:36 ` Jeremy Katz
2007-07-24 20:43 ` Oleg Nesterov
0 siblings, 1 reply; 24+ messages in thread
From: Jeremy Katz @ 2007-07-24 18:36 UTC (permalink / raw)
To: Oleg Nesterov
Cc: Thomas Gleixner, linux-kernel, Andrew Morton, Ingo Molnar, Stable Team
On Tue, 24 Jul 2007, Oleg Nesterov wrote:
> On 07/23, Jeremy Katz wrote:
>>
>> On Fri, 20 Jul 2007, Oleg Nesterov wrote:
>>
>>> I still can't believe we have a double-free problem, this looks imposiible.
>>> Do you see the
>>>
>>> "idr_remove called for id=%d which is not allocated.\n"
>>>
>>> in syslog?
>>
>> No. I also added some accounting with atomic counters, and don't see
>> evidence of a second call to release_posix_timer.
>
> Interesting. Could you show the patch? Where does sys_timer_create() set
> counter == 1?
--- kernel/posix-timers.c.old 2007-07-24 11:21:29.000000000 -0700
+++ kernel/posix-timers.c 2007-07-20 15:49:51.000000000 -0700
@@ -409,13 +409,17 @@ static struct k_itimer * alloc_posix_tim
#define IT_ID_NOT_SET 0
static void release_posix_timer(struct k_itimer *tmr, int it_id_set)
{
+ atomic_inc(&tmr->freecount);
@@ -443,6 +447,7 @@ sys_timer_create(const clockid_t which_c
new_timer = alloc_posix_timer();
if (unlikely(!new_timer))
return -EAGAIN;
+ atomic_set(&new_timer->freecount, 0);
--- include/linux/posix-timers.h.old 2007-07-24 11:32:36.000000000 -0700
+++ include/linux/posix-timers.h 2007-07-20 15:15:44.000000000 -0700
@@ -62,6 +62,7 @@ struct k_itimer {
unsigned long expires;
} mmtimer;
} it;
+ atomic_t freecount;
};
struct k_clock {
Examination was done through a kgdb session when it caught the bug.
freecount always had the expected value, 1 during a delete operation,
and 0 otherwise.
>>> Could you try the patch below? Perhaps we have some wierd problem with
>>> ->sigq corruption.
>>
>> Tried, with apparent effect.
>
> You mean, "null pointer dereference" in release_posix_timer() ? Or it was
> BUG_ON(timr && !timr->sigq) in lock_timer() ?
Sorry. That should have been "without apparent effect".
> I assume you didn't apply any other patches except Thomas's, yes?
Alone, as well as with Thomas' patch. Neither way changed the behavior.
Jeremy
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [PATCH] posix-timer: fix deletion race
2007-07-24 18:36 ` Jeremy Katz
@ 2007-07-24 20:43 ` Oleg Nesterov
2007-07-24 21:19 ` Jeremy Katz
0 siblings, 1 reply; 24+ messages in thread
From: Oleg Nesterov @ 2007-07-24 20:43 UTC (permalink / raw)
To: Jeremy Katz
Cc: Thomas Gleixner, linux-kernel, Andrew Morton, Ingo Molnar, Stable Team
On 07/24, Jeremy Katz wrote:
>
> On Tue, 24 Jul 2007, Oleg Nesterov wrote:
>
> >Interesting. Could you show the patch? Where does sys_timer_create() set
> >counter == 1?
>
> --- kernel/posix-timers.c.old 2007-07-24 11:21:29.000000000 -0700
> +++ kernel/posix-timers.c 2007-07-20 15:49:51.000000000 -0700
> @@ -409,13 +409,17 @@ static struct k_itimer * alloc_posix_tim
> #define IT_ID_NOT_SET 0
> static void release_posix_timer(struct k_itimer *tmr, int it_id_set)
> {
> + atomic_inc(&tmr->freecount);
> @@ -443,6 +447,7 @@ sys_timer_create(const clockid_t which_c
> new_timer = alloc_posix_timer();
> if (unlikely(!new_timer))
> return -EAGAIN;
> + atomic_set(&new_timer->freecount, 0);
>
> --- include/linux/posix-timers.h.old 2007-07-24 11:32:36.000000000 -0700
> +++ include/linux/posix-timers.h 2007-07-20 15:15:44.000000000 -0700
> @@ -62,6 +62,7 @@ struct k_itimer {
> unsigned long expires;
> } mmtimer;
> } it;
> + atomic_t freecount;
> };
>
> struct k_clock {
>
>
> Examination was done through a kgdb session when it caught the bug.
> freecount always had the expected value, 1 during a delete operation,
> and 0 otherwise.
OK, thanks.
> >>>Could you try the patch below? Perhaps we have some wierd problem with
> >>>->sigq corruption.
> >>
> >>Tried, with apparent effect.
> >
> >You mean, "null pointer dereference" in release_posix_timer() ? Or it was
> >BUG_ON(timr && !timr->sigq) in lock_timer() ?
>
> Sorry. That should have been "without apparent effect".
Sorry. I confused completely.
So. You mean that even with that patch you _still_ see the BUG_ON(!SIGQUEUE_PREALLOC)
in sigqueue_free() ?
Oleg.
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [PATCH] posix-timer: fix deletion race
2007-07-24 20:43 ` Oleg Nesterov
@ 2007-07-24 21:19 ` Jeremy Katz
0 siblings, 0 replies; 24+ messages in thread
From: Jeremy Katz @ 2007-07-24 21:19 UTC (permalink / raw)
To: Oleg Nesterov
Cc: Thomas Gleixner, linux-kernel, Andrew Morton, Ingo Molnar, Stable Team
On Wed, 25 Jul 2007, Oleg Nesterov wrote:
> On 07/24, Jeremy Katz wrote:
>>
>> Sorry. That should have been "without apparent effect".
>
> Sorry. I confused completely.
>
> So. You mean that even with that patch you _still_ see the
> BUG_ON(!SIGQUEUE_PREALLOC) in sigqueue_free() ?
Yes. I did not notice any difference in the system's behavior after
applying the patch.
Jeremy
^ permalink raw reply [flat|nested] 24+ messages in thread
end of thread, other threads:[~2007-07-24 21:21 UTC | newest]
Thread overview: 24+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2007-07-13 18:19 POSIX timer deletion race? Jeremy Katz
2007-07-17 8:53 ` [PATCH] posix-timer: fix deletion race Thomas Gleixner
2007-07-17 9:57 ` Ingo Molnar
2007-07-17 18:29 ` Jeremy Katz
2007-07-17 13:07 ` Oleg Nesterov
2007-07-17 17:02 ` Thomas Gleixner
2007-07-17 18:39 ` Jeremy Katz
2007-07-17 20:17 ` Thomas Gleixner
2007-07-17 23:12 ` Jeremy Katz
2007-07-17 23:58 ` Jeremy Katz
2007-07-18 6:05 ` Thomas Gleixner
2007-07-18 6:58 ` Thomas Gleixner
2007-07-18 19:02 ` Jeremy Katz
2007-07-18 23:43 ` Jeremy Katz
2007-07-19 5:50 ` Thomas Gleixner
2007-07-19 19:24 ` Jeremy Katz
2007-07-18 16:11 ` Oleg Nesterov
2007-07-18 19:08 ` Jeremy Katz
2007-07-20 14:15 ` Oleg Nesterov
2007-07-24 2:07 ` Jeremy Katz
2007-07-24 14:51 ` Oleg Nesterov
2007-07-24 18:36 ` Jeremy Katz
2007-07-24 20:43 ` Oleg Nesterov
2007-07-24 21:19 ` Jeremy Katz
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).