linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [BUG] Race condition with it_real_fn in kernel/itimer.c
@ 2005-06-15 16:23 Steven Rostedt
  2005-06-15 17:35 ` Steven Rostedt
  2005-06-15 20:25 ` Andrew Morton
  0 siblings, 2 replies; 7+ messages in thread
From: Steven Rostedt @ 2005-06-15 16:23 UTC (permalink / raw)
  To: LKML; +Cc: Andrew Morton, Ingo Molnar

OK, I found this bug on an older version of Ingo's RT kernel with my own
customizations. This is a very hard to get race condition but my logging
traced it pretty good and this looks like it may also be a bug for both
Ingo's RT kernel and the vanilla kernel. This was on an SMP machine.

Here's the race (since this was initiated with XFree86, I'll use it as
the userland process that started this):

XFree86: calls sys_call 
    -> sys_setitimer
       -> do_setitimer 
           (grabs tsk->sighand->siglock)
           -> del_timer_sync
     which has the following code:

	for_each_online_cpu(i) {
		base = &per_cpu(tvec_bases, i);
		if (base->running_timer == timer) {
			while (base->running_timer == timer) {
				cpu_relax();
				preempt_check_resched();
			}
			break;
		}
	}

If the timer hasn't gone off yet on another cpu, it will spin until it
is finished. Now here's the problem:

ksoftirqd: calls do_softirq -> ... -> run_timer_softirq
      -> __run_timers
        -> it_real_fn
            -> send_group_sig_info
              -> group_send_sig_info
                  (grabs p->sighand->siglock)

Now, since the ksoftirqd is what changes running_timer, we have a
deadlock! 

What would be the harm in doing something like:

--- linux-2.6.12-rc6/kernel/itimer.c.orig	2005-06-15 12:14:13.000000000 -0400
+++ linux-2.6.12-rc6/kernel/itimer.c	2005-06-15 12:18:31.000000000 -0400
@@ -153,11 +153,15 @@
 
 	switch (which) {
 	case ITIMER_REAL:
+	try_again:
 		spin_lock_irq(&tsk->sighand->siglock);
 		interval = tsk->signal->it_real_incr;
 		val = it_real_value(tsk->signal);
-		if (val)
+		if (val) {
+			spin_unlock_irq(&tsk->sighand->siglock);
 			del_timer_sync(&tsk->signal->real_timer);
+			goto try_again;
+		}
 		tsk->signal->it_real_incr =
 			timeval_to_jiffies(&value->it_interval);
 		it_real_arm(tsk, timeval_to_jiffies(&value->it_value));


-- Steve



^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [BUG] Race condition with it_real_fn in kernel/itimer.c
  2005-06-15 16:23 [BUG] Race condition with it_real_fn in kernel/itimer.c Steven Rostedt
@ 2005-06-15 17:35 ` Steven Rostedt
  2005-06-15 20:25 ` Andrew Morton
  1 sibling, 0 replies; 7+ messages in thread
From: Steven Rostedt @ 2005-06-15 17:35 UTC (permalink / raw)
  To: LKML; +Cc: Ingo Molnar, Andrew Morton

On Wed, 2005-06-15 at 12:23 -0400, Steven Rostedt wrote:

> 
> 	for_each_online_cpu(i) {
> 		base = &per_cpu(tvec_bases, i);
> 		if (base->running_timer == timer) {
> 			while (base->running_timer == timer) {
> 				cpu_relax();
> 				preempt_check_resched();
> 			}
> 			break;
> 		}
> 	}
> 
> If the timer hasn't gone off yet on another cpu, it will spin until it
> is finished. Now here's the problem:

This was suppose to say...

If the timer is currently going off on another cpu.


-- Steve



^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [BUG] Race condition with it_real_fn in kernel/itimer.c
  2005-06-15 16:23 [BUG] Race condition with it_real_fn in kernel/itimer.c Steven Rostedt
  2005-06-15 17:35 ` Steven Rostedt
@ 2005-06-15 20:25 ` Andrew Morton
  2005-06-15 21:01   ` Steven Rostedt
  1 sibling, 1 reply; 7+ messages in thread
From: Andrew Morton @ 2005-06-15 20:25 UTC (permalink / raw)
  To: Steven Rostedt; +Cc: linux-kernel, mingo, Roland McGrath

Steven Rostedt <rostedt@goodmis.org> wrote:
>
> OK, I found this bug on an older version of Ingo's RT kernel with my own
> customizations. This is a very hard to get race condition but my logging
> traced it pretty good and this looks like it may also be a bug for both
> Ingo's RT kernel and the vanilla kernel. This was on an SMP machine.
> 
> Here's the race (since this was initiated with XFree86, I'll use it as
> the userland process that started this):
> 
> XFree86: calls sys_call 
>     -> sys_setitimer
>        -> do_setitimer 
>            (grabs tsk->sighand->siglock)
>            -> del_timer_sync
>      which has the following code:
> 
> 	for_each_online_cpu(i) {
> 		base = &per_cpu(tvec_bases, i);
> 		if (base->running_timer == timer) {
> 			while (base->running_timer == timer) {
> 				cpu_relax();
> 				preempt_check_resched();
> 			}
> 			break;
> 		}
> 	}
> 
> If the timer hasn't gone off yet on another cpu, it will spin until it
> is finished. Now here's the problem:
> 
> ksoftirqd: calls do_softirq -> ... -> run_timer_softirq
>       -> __run_timers
>         -> it_real_fn
>             -> send_group_sig_info
>               -> group_send_sig_info
>                   (grabs p->sighand->siglock)
> 
> Now, since the ksoftirqd is what changes running_timer, we have a
> deadlock! 

Yes, that's a deadlock.

> What would be the harm in doing something like:
> 
> --- linux-2.6.12-rc6/kernel/itimer.c.orig	2005-06-15 12:14:13.000000000 -0400
> +++ linux-2.6.12-rc6/kernel/itimer.c	2005-06-15 12:18:31.000000000 -0400
> @@ -153,11 +153,15 @@
>  
>  	switch (which) {
>  	case ITIMER_REAL:
> +	try_again:
>  		spin_lock_irq(&tsk->sighand->siglock);
>  		interval = tsk->signal->it_real_incr;
>  		val = it_real_value(tsk->signal);
> -		if (val)
> +		if (val) {
> +			spin_unlock_irq(&tsk->sighand->siglock);
>  			del_timer_sync(&tsk->signal->real_timer);
> +			goto try_again;
> +		}
>  		tsk->signal->it_real_incr =
>  			timeval_to_jiffies(&value->it_interval);
>  		it_real_arm(tsk, timeval_to_jiffies(&value->it_value));
> 
> 

And that will fix it.  (Labels start in column zero, and a comment is
needed here).

However I wonder if it would be sufficient to remove the del_timer_sync()
call altogether and just do mod_timer() in it_real_arm().

If the handler happens to be running on another CPU and if the handler
tries to run mod_timer() _after_ the do_setitimer() has run mod_timer(),
the handler will use the desired value of it_real_incr anyway.



^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [BUG] Race condition with it_real_fn in kernel/itimer.c
  2005-06-15 20:25 ` Andrew Morton
@ 2005-06-15 21:01   ` Steven Rostedt
  0 siblings, 0 replies; 7+ messages in thread
From: Steven Rostedt @ 2005-06-15 21:01 UTC (permalink / raw)
  To: Andrew Morton; +Cc: linux-kernel, mingo, Roland McGrath

On Wed, 2005-06-15 at 13:25 -0700, Andrew Morton wrote:

> 
> And that will fix it.  (Labels start in column zero, and a comment is
> needed here).

I blame emacs for that bad label :-)

> 
> However I wonder if it would be sufficient to remove the del_timer_sync()
> call altogether and just do mod_timer() in it_real_arm().
> 
> If the handler happens to be running on another CPU and if the handler
> tries to run mod_timer() _after_ the do_setitimer() has run mod_timer(),
> the handler will use the desired value of it_real_incr anyway.
> 

So do you prefer a patch like the following?

--- linux-2.6.12-rc6/kernel/itimer.c.orig	2005-06-15 16:33:13.000000000 -0400
+++ linux-2.6.12-rc6/kernel/itimer.c	2005-06-15 16:42:45.000000000 -0400
@@ -118,6 +118,8 @@
  */
 static inline void it_real_arm(struct task_struct *p, unsigned long interval)
 {
+	unsigned long expires;
+
 	p->signal->it_real_value = interval; /* XXX unnecessary field?? */
 	if (interval == 0)
 		return;
@@ -127,8 +129,8 @@
 	 * the interval requested. This could happen if
 	 * time requested % (usecs per jiffy) is more than the usecs left
 	 * in the current jiffy */
-	p->signal->real_timer.expires = jiffies + interval + 1;
-	add_timer(&p->signal->real_timer);
+	expires = jiffies + interval + 1;
+	mod_timer(&p->signal->real_timer, expires);
 }
 
 void it_real_fn(unsigned long __data)
@@ -156,8 +158,6 @@
 		spin_lock_irq(&tsk->sighand->siglock);
 		interval = tsk->signal->it_real_incr;
 		val = it_real_value(tsk->signal);
-		if (val)
-			del_timer_sync(&tsk->signal->real_timer);
 		tsk->signal->it_real_incr =
 			timeval_to_jiffies(&value->it_interval);
 		it_real_arm(tsk, timeval_to_jiffies(&value->it_value));


Now the question is, what happens on the following scenario?

ksoftirqd:

  calls it_real_func 

process:

   calls do_setitimer blocks on siglock;

ksoftirqd: unlocks siglock calls it_real_arm and after it assigns
expires it takes an interrupt before calling mod_timer.

process:

   calls it_real_arm and does the changes to mod_timer first.

ksoftirqd: comes back from interrupt and then calls mod_timer with the
wrong value.

This may be a small chance in hell of happening, and the result may not
be to drastic, but this is still a race condition.  So far I think that
my unconditional calling of del_timer_sync, although inefficient, it
doesn't have any races.

-- Steve


^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [BUG] Race condition with it_real_fn in kernel/itimer.c
  2005-06-15 18:37 ` Steven Rostedt
@ 2005-06-16  9:03   ` Oleg Nesterov
  0 siblings, 0 replies; 7+ messages in thread
From: Oleg Nesterov @ 2005-06-16  9:03 UTC (permalink / raw)
  To: Steven Rostedt; +Cc: Andrew Morton, Ingo Molnar, linux-kernel

Steven Rostedt wrote:
> 
> On Wed, 2005-06-15 at 21:39 +0400, Oleg Nesterov wrote:
> >
> > I think we don't need del_timer_sync() at all, just del_timer().
> >
> [...]
>
> it_real_arm unprotected! And you can see here that it_real_arm is also
> called and they both call add_timer! This would not work, so far the
> first patch seems to handle this.

Yes, you are right, thanks.

> PS. Don't strip the CC list.

I am sorry. It's because I am not subscribed to lkml, I saw your message
on http://marc.theaimsgroup.com/. You might know is there an lkml archive
which does not hide recipients list (or in mbox format) ?

Oleg.

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [BUG] Race condition with it_real_fn in kernel/itimer.c
  2005-06-15 17:39 Oleg Nesterov
@ 2005-06-15 18:37 ` Steven Rostedt
  2005-06-16  9:03   ` Oleg Nesterov
  0 siblings, 1 reply; 7+ messages in thread
From: Steven Rostedt @ 2005-06-15 18:37 UTC (permalink / raw)
  To: Oleg Nesterov; +Cc: Andrew Morton, Ingo Molnar, linux-kernel

On Wed, 2005-06-15 at 21:39 +0400, Oleg Nesterov wrote:
> Steven Rostedt wrote:
> >
> > +	try_again:
> >  		spin_lock_irq(&tsk->sighand->siglock);
> >  		interval = tsk->signal->it_real_incr;
> >  		val = it_real_value(tsk->signal);
> > -		if (val)
> > +		if (val) {
> > +			spin_unlock_irq(&tsk->sighand->siglock);
> >  			del_timer_sync(&tsk->signal->real_timer);
> > +			goto try_again;
> 
> I think we don't need del_timer_sync() at all, just del_timer().
> 
> Because it_real_value() returns 0 when timer is not pending. And
> in this case the timer may still be running, but do_setitimer()
> doesn't call del_timer_sync().

OK, so is this the better patch?

[Andrew, do NOT use the following]

--- linux-2.6.12-rc6/kernel/itimer.c.orig	2005-06-15 12:14:13.000000000 -0400
+++ linux-2.6.12-rc6/kernel/itimer.c	2005-06-15 14:06:23.000000000 -0400
@@ -157,7 +157,7 @@
 		interval = tsk->signal->it_real_incr;
 		val = it_real_value(tsk->signal);
 		if (val)
-			del_timer_sync(&tsk->signal->real_timer);
+			del_timer(&tsk->signal->real_timer);
 		tsk->signal->it_real_incr =
 			timeval_to_jiffies(&value->it_interval);
 		it_real_arm(tsk, timeval_to_jiffies(&value->it_value));


I haven't played too much with the itimer, what harm can happen if the
timer is running while this is deleted?  [examines code here] This also
looks bad. Since the softirq function can be running and then call
it_real_arm unprotected! And you can see here that it_real_arm is also
called and they both call add_timer! This would not work, so far the
first patch seems to handle this. 


-- Steve

PS. Don't strip the CC list.


^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [BUG] Race condition with it_real_fn in kernel/itimer.c
@ 2005-06-15 17:39 Oleg Nesterov
  2005-06-15 18:37 ` Steven Rostedt
  0 siblings, 1 reply; 7+ messages in thread
From: Oleg Nesterov @ 2005-06-15 17:39 UTC (permalink / raw)
  To: Steven Rostedt; +Cc: linux-kernel

Steven Rostedt wrote:
>
> +	try_again:
>  		spin_lock_irq(&tsk->sighand->siglock);
>  		interval = tsk->signal->it_real_incr;
>  		val = it_real_value(tsk->signal);
> -		if (val)
> +		if (val) {
> +			spin_unlock_irq(&tsk->sighand->siglock);
>  			del_timer_sync(&tsk->signal->real_timer);
> +			goto try_again;

I think we don't need del_timer_sync() at all, just del_timer().

Because it_real_value() returns 0 when timer is not pending. And
in this case the timer may still be running, but do_setitimer()
doesn't call del_timer_sync().

Oleg.

^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2005-06-16  8:54 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2005-06-15 16:23 [BUG] Race condition with it_real_fn in kernel/itimer.c Steven Rostedt
2005-06-15 17:35 ` Steven Rostedt
2005-06-15 20:25 ` Andrew Morton
2005-06-15 21:01   ` Steven Rostedt
2005-06-15 17:39 Oleg Nesterov
2005-06-15 18:37 ` Steven Rostedt
2005-06-16  9:03   ` Oleg Nesterov

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).