linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* Re: [BUG] Race condition with it_real_fn in kernel/itimer.c
@ 2005-06-15 17:39 Oleg Nesterov
  2005-06-15 18:37 ` Steven Rostedt
  0 siblings, 1 reply; 12+ messages in thread
From: Oleg Nesterov @ 2005-06-15 17:39 UTC (permalink / raw)
  To: Steven Rostedt; +Cc: linux-kernel

Steven Rostedt wrote:
>
> +	try_again:
>  		spin_lock_irq(&tsk->sighand->siglock);
>  		interval = tsk->signal->it_real_incr;
>  		val = it_real_value(tsk->signal);
> -		if (val)
> +		if (val) {
> +			spin_unlock_irq(&tsk->sighand->siglock);
>  			del_timer_sync(&tsk->signal->real_timer);
> +			goto try_again;

I think we don't need del_timer_sync() at all, just del_timer().

Because it_real_value() returns 0 when timer is not pending. And
in this case the timer may still be running, but do_setitimer()
doesn't call del_timer_sync().

Oleg.

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [BUG] Race condition with it_real_fn in kernel/itimer.c
  2005-06-15 17:39 [BUG] Race condition with it_real_fn in kernel/itimer.c Oleg Nesterov
@ 2005-06-15 18:37 ` Steven Rostedt
  2005-06-15 19:34   ` [PATCH] " Steven Rostedt
  2005-06-16  9:03   ` [BUG] Race condition with it_real_fn in kernel/itimer.c Oleg Nesterov
  0 siblings, 2 replies; 12+ messages in thread
From: Steven Rostedt @ 2005-06-15 18:37 UTC (permalink / raw)
  To: Oleg Nesterov; +Cc: Andrew Morton, Ingo Molnar, linux-kernel

On Wed, 2005-06-15 at 21:39 +0400, Oleg Nesterov wrote:
> Steven Rostedt wrote:
> >
> > +	try_again:
> >  		spin_lock_irq(&tsk->sighand->siglock);
> >  		interval = tsk->signal->it_real_incr;
> >  		val = it_real_value(tsk->signal);
> > -		if (val)
> > +		if (val) {
> > +			spin_unlock_irq(&tsk->sighand->siglock);
> >  			del_timer_sync(&tsk->signal->real_timer);
> > +			goto try_again;
> 
> I think we don't need del_timer_sync() at all, just del_timer().
> 
> Because it_real_value() returns 0 when timer is not pending. And
> in this case the timer may still be running, but do_setitimer()
> doesn't call del_timer_sync().

OK, so is this the better patch?

[Andrew, do NOT use the following]

--- linux-2.6.12-rc6/kernel/itimer.c.orig	2005-06-15 12:14:13.000000000 -0400
+++ linux-2.6.12-rc6/kernel/itimer.c	2005-06-15 14:06:23.000000000 -0400
@@ -157,7 +157,7 @@
 		interval = tsk->signal->it_real_incr;
 		val = it_real_value(tsk->signal);
 		if (val)
-			del_timer_sync(&tsk->signal->real_timer);
+			del_timer(&tsk->signal->real_timer);
 		tsk->signal->it_real_incr =
 			timeval_to_jiffies(&value->it_interval);
 		it_real_arm(tsk, timeval_to_jiffies(&value->it_value));


I haven't played too much with the itimer, what harm can happen if the
timer is running while this is deleted?  [examines code here] This also
looks bad. Since the softirq function can be running and then call
it_real_arm unprotected! And you can see here that it_real_arm is also
called and they both call add_timer! This would not work, so far the
first patch seems to handle this. 


-- Steve

PS. Don't strip the CC list.


^ permalink raw reply	[flat|nested] 12+ messages in thread

* [PATCH] Re: [BUG] Race condition with it_real_fn in kernel/itimer.c
  2005-06-15 18:37 ` Steven Rostedt
@ 2005-06-15 19:34   ` Steven Rostedt
  2005-06-16  7:44     ` Oleg Nesterov
  2005-06-16  9:03   ` [BUG] Race condition with it_real_fn in kernel/itimer.c Oleg Nesterov
  1 sibling, 1 reply; 12+ messages in thread
From: Steven Rostedt @ 2005-06-15 19:34 UTC (permalink / raw)
  To: Oleg Nesterov; +Cc: Andrew Morton, Ingo Molnar, linux-kernel

On Wed, 2005-06-15 at 14:37 -0400, Steven Rostedt wrote:

> 
> I haven't played too much with the itimer, what harm can happen if the
> timer is running while this is deleted?  [examines code here] This also
> looks bad. Since the softirq function can be running and then call
> it_real_arm unprotected! And you can see here that it_real_arm is also
> called and they both call add_timer! This would not work, so far the
> first patch seems to handle this. 

Ha! There's even another problem that I just noticed.  it_real_value
could return zero while the function is running! So you don't get the
protection here either!  Take a look here to see what the problem is.

static unsigned long it_real_value(struct signal_struct *sig)
{
	unsigned long val = 0;
	if (timer_pending(&sig->real_timer)) {
		val = sig->real_timer.expires - jiffies;

		/* look out for negative/zero itimer.. */
		if ((long) val <= 0)
			val = 1;
	}
	return val;
}

and 

static inline int timer_pending(const struct timer_list * timer)
{
	return timer->base != NULL;
}

and

static inline void __run_timers(tvec_base_t *base)
{
  ...
			timer = list_entry(head->next,struct timer_list,entry);
 			fn = timer->function;
 			data = timer->data;

			list_del(&timer->entry);
			set_running_timer(base, timer);
			smp_wmb();
			timer->base = NULL;
			spin_unlock_irq(&base->lock);
			{
				u32 preempt_count = preempt_count();
				fn(data);
...
}


So, timer_pending tests if timer->base is NULL, but here we see that
timer->base IS NULL before the function is called, and as I have said
earlier, the it_real_arm can be called on two CPUS simultaneously. So
here's another patch that should fix this race condition too.


--- linux-2.6.12-rc6/kernel/itimer.c.orig	2005-06-15 12:14:13.000000000 -0400
+++ linux-2.6.12-rc6/kernel/itimer.c	2005-06-15 15:31:12.000000000 -0400
@@ -156,8 +156,15 @@
 		spin_lock_irq(&tsk->sighand->siglock);
 		interval = tsk->signal->it_real_incr;
 		val = it_real_value(tsk->signal);
-		if (val)
-			del_timer_sync(&tsk->signal->real_timer);
+		/*
+		 * Call del_timer_sync unconditionally, since we don't
+		 * know if it is running or not. We also need to unlock
+		 * the siglock so that the it_real_fn called by ksoftirqd
+		 * doesn't wait for us.
+		 */
+		spin_unlock(&tsk->sighand->siglock);
+		del_timer_sync(&tsk->signal->real_timer);
+		spin_lock(&tsk->sighand->siglock);
 		tsk->signal->it_real_incr =
 			timeval_to_jiffies(&value->it_interval);
 		it_real_arm(tsk, timeval_to_jiffies(&value->it_value));


-- Steve




^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH] Re: [BUG] Race condition with it_real_fn in kernel/itimer.c
  2005-06-15 19:34   ` [PATCH] " Steven Rostedt
@ 2005-06-16  7:44     ` Oleg Nesterov
  2005-06-16 11:33       ` Steven Rostedt
  0 siblings, 1 reply; 12+ messages in thread
From: Oleg Nesterov @ 2005-06-16  7:44 UTC (permalink / raw)
  To: Steven Rostedt; +Cc: Andrew Morton, Ingo Molnar, linux-kernel

Steven Rostedt wrote:
>
> So, timer_pending tests if timer->base is NULL, but here we see that
> timer->base IS NULL before the function is called, and as I have said
> earlier, the it_real_arm can be called on two CPUS simultaneously. So
> here's another patch that should fix this race condition too.
>
> [...]
>
> +		/*
> +		 * Call del_timer_sync unconditionally, since we don't
> +		 * know if it is running or not. We also need to unlock
> +		 * the siglock so that the it_real_fn called by ksoftirqd
> +		 * doesn't wait for us.
> +		 */
> +		spin_unlock(&tsk->sighand->siglock);
> +		del_timer_sync(&tsk->signal->real_timer);
> +		spin_lock(&tsk->sighand->siglock);

I don't think this is 100% correct. After del_timer_sync() returns another
thread can come and call do_setitimer() and re-arm the timer (because with
your patch we are dropping tsk->sighand->siglock here). So this patch does
not garantees that the timer is not queued/running after del_timer_sync(),
and the it_real_arm can be called on two CPUS simultaneously again.

There is a try_to_del_timer_sync() in the -mm tree which is suitable here:

	again:
		spin_lock_irq(&tsk->sighand->siglock);
		if (try_to_del_timer_sync(&tsk->signal->real_timer) < 0) {
			spin_unlock_irq(&tsk->sighand->siglock);
			goto again;
		}

Oleg.

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [BUG] Race condition with it_real_fn in kernel/itimer.c
  2005-06-15 18:37 ` Steven Rostedt
  2005-06-15 19:34   ` [PATCH] " Steven Rostedt
@ 2005-06-16  9:03   ` Oleg Nesterov
  1 sibling, 0 replies; 12+ messages in thread
From: Oleg Nesterov @ 2005-06-16  9:03 UTC (permalink / raw)
  To: Steven Rostedt; +Cc: Andrew Morton, Ingo Molnar, linux-kernel

Steven Rostedt wrote:
> 
> On Wed, 2005-06-15 at 21:39 +0400, Oleg Nesterov wrote:
> >
> > I think we don't need del_timer_sync() at all, just del_timer().
> >
> [...]
>
> it_real_arm unprotected! And you can see here that it_real_arm is also
> called and they both call add_timer! This would not work, so far the
> first patch seems to handle this.

Yes, you are right, thanks.

> PS. Don't strip the CC list.

I am sorry. It's because I am not subscribed to lkml, I saw your message
on http://marc.theaimsgroup.com/. You might know is there an lkml archive
which does not hide recipients list (or in mbox format) ?

Oleg.

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH] Re: [BUG] Race condition with it_real_fn in kernel/itimer.c
  2005-06-16  7:44     ` Oleg Nesterov
@ 2005-06-16 11:33       ` Steven Rostedt
  2005-06-16 11:44         ` Steven Rostedt
  2005-06-16 14:30         ` [PATCH] Re: [BUG] Race condition with it_real_fn inkernel/itimer.c Oleg Nesterov
  0 siblings, 2 replies; 12+ messages in thread
From: Steven Rostedt @ 2005-06-16 11:33 UTC (permalink / raw)
  To: Oleg Nesterov; +Cc: Roland McGrath, Andrew Morton, Ingo Molnar, linux-kernel

On Thu, 2005-06-16 at 11:44 +0400, Oleg Nesterov wrote:
> Steven Rostedt wrote:
> >
> > So, timer_pending tests if timer->base is NULL, but here we see that
> > timer->base IS NULL before the function is called, and as I have said
> > earlier, the it_real_arm can be called on two CPUS simultaneously. So
> > here's another patch that should fix this race condition too.
> >
> > [...]
> >
> > +		/*
> > +		 * Call del_timer_sync unconditionally, since we don't
> > +		 * know if it is running or not. We also need to unlock
> > +		 * the siglock so that the it_real_fn called by ksoftirqd
> > +		 * doesn't wait for us.
> > +		 */
> > +		spin_unlock(&tsk->sighand->siglock);
> > +		del_timer_sync(&tsk->signal->real_timer);
> > +		spin_lock(&tsk->sighand->siglock);
> 
> I don't think this is 100% correct. After del_timer_sync() returns another
> thread can come and call do_setitimer() and re-arm the timer (because with
> your patch we are dropping tsk->sighand->siglock here). So this patch does
> not garantees that the timer is not queued/running after del_timer_sync(),
> and the it_real_arm can be called on two CPUS simultaneously again.
> 

I first thought that too, but then looking at the code I noticed:

int do_setitimer(int which, struct itimerval *value, struct itimerval *ovalue)
{
        struct task_struct *tsk = current;

Where tsk is current.  So the only ones that can change the
tsk->signal->real_timer seems to be the task itself and ksoftirqd. So
between del_timer_sync (which handles the ksoftirqd part) and the
spin_lock, there's no one else that can modify tsk->signal->real_timer.
So I don't believe that there is a race condition here.

[thinks about this a little]

Oh wait, is ->signal shared among threads?  Damn, I think so! So you are
right, another _thread_ can come and change this. I forgot about threads
(they're evil! ;-).

> There is a try_to_del_timer_sync() in the -mm tree which is suitable here:
> 
> 	again:
> 		spin_lock_irq(&tsk->sighand->siglock);
> 		if (try_to_del_timer_sync(&tsk->signal->real_timer) < 0) {
> 			spin_unlock_irq(&tsk->sighand->siglock);
> 			goto again;
> 		}

OK, for the -mm branch this may work. But for the current tree, we may
need to do something else.  Like this ugly patch. But it should work!


int do_setitimer(int which, struct itimerval *value, struct itimerval *ovalue)
{
        struct task_struct *tsk = current;
	static spinlock_t lock = SPIN_LOCK_UNLOCKED;

[...]
		spin_lock(&lock);
		spin_unlock(&tsk->sighand->siglock);
		del_timer_sync(&tsk->signal->real_timer);
		spin_lock(&tsk->sighand->siglock);
		spin_unlock(&lock);

This would handle the case for threads in the main line kernel, but it
looks (to me) pretty ugly, but should work. I also don't like this
because it is shared among all tasks!

Andrew, (or Roland since I see Andrew added you to the list)
  
 What do you think? Should try_to_del_timer_sync be brought over to the
mainline, or have the above ugly code added?

-- Steve



^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH] Re: [BUG] Race condition with it_real_fn in kernel/itimer.c
  2005-06-16 11:33       ` Steven Rostedt
@ 2005-06-16 11:44         ` Steven Rostedt
  2005-06-16 14:30         ` [PATCH] Re: [BUG] Race condition with it_real_fn inkernel/itimer.c Oleg Nesterov
  1 sibling, 0 replies; 12+ messages in thread
From: Steven Rostedt @ 2005-06-16 11:44 UTC (permalink / raw)
  To: Oleg Nesterov; +Cc: Roland McGrath, Andrew Morton, Ingo Molnar, linux-kernel

On Thu, 2005-06-16 at 07:33 -0400, Steven Rostedt wrote:

> int do_setitimer(int which, struct itimerval *value, struct itimerval *ovalue)
> {
>         struct task_struct *tsk = current;
> 	static spinlock_t lock = SPIN_LOCK_UNLOCKED;
> 
> [...]
> 		spin_lock(&lock);
> 		spin_unlock(&tsk->sighand->siglock);
> 		del_timer_sync(&tsk->signal->real_timer);
> 		spin_lock(&tsk->sighand->siglock);
> 		spin_unlock(&lock);

OK, I just got out of bed, so I'm not too with it :-) 

This is pretty much a guaranteed deadlock!  So the first spin_lock needs
to go before the siglock. That should do it!


	case ITIMER_REAL:
		spin_lock_irq(&lock);
		spin_lock(&tsk->sighand->siglock);
		[...]
		spin_unlock(&tsk->sighand->siglock);
		del_timer_sync(&tsk->signal->real_timer);
		spin_lock(&tsk->sighand->siglock);
		spin_unlock(&lock);

We just need to keep two do_setitimer calls from grabbing the siglock.
That first string of code didn't prevent that.

-- Steve


^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH] Re: [BUG] Race condition with it_real_fn inkernel/itimer.c
  2005-06-16 11:33       ` Steven Rostedt
  2005-06-16 11:44         ` Steven Rostedt
@ 2005-06-16 14:30         ` Oleg Nesterov
  1 sibling, 0 replies; 12+ messages in thread
From: Oleg Nesterov @ 2005-06-16 14:30 UTC (permalink / raw)
  To: Steven Rostedt; +Cc: Roland McGrath, Andrew Morton, Ingo Molnar, linux-kernel

Steven Rostedt wrote:
> 
> Andrew, (or Roland since I see Andrew added you to the list)
> 
>  What do you think? Should try_to_del_timer_sync be brought over to the
> mainline, or have the above ugly code added?

On the other hand, if 2 threads calls do_setitimer() in parallel
they are asking for trouble. So may be it is possible to ignore
this minor problem and stay with your patch?

Oleg.

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [BUG] Race condition with it_real_fn in kernel/itimer.c
  2005-06-15 20:25 ` Andrew Morton
@ 2005-06-15 21:01   ` Steven Rostedt
  0 siblings, 0 replies; 12+ messages in thread
From: Steven Rostedt @ 2005-06-15 21:01 UTC (permalink / raw)
  To: Andrew Morton; +Cc: linux-kernel, mingo, Roland McGrath

On Wed, 2005-06-15 at 13:25 -0700, Andrew Morton wrote:

> 
> And that will fix it.  (Labels start in column zero, and a comment is
> needed here).

I blame emacs for that bad label :-)

> 
> However I wonder if it would be sufficient to remove the del_timer_sync()
> call altogether and just do mod_timer() in it_real_arm().
> 
> If the handler happens to be running on another CPU and if the handler
> tries to run mod_timer() _after_ the do_setitimer() has run mod_timer(),
> the handler will use the desired value of it_real_incr anyway.
> 

So do you prefer a patch like the following?

--- linux-2.6.12-rc6/kernel/itimer.c.orig	2005-06-15 16:33:13.000000000 -0400
+++ linux-2.6.12-rc6/kernel/itimer.c	2005-06-15 16:42:45.000000000 -0400
@@ -118,6 +118,8 @@
  */
 static inline void it_real_arm(struct task_struct *p, unsigned long interval)
 {
+	unsigned long expires;
+
 	p->signal->it_real_value = interval; /* XXX unnecessary field?? */
 	if (interval == 0)
 		return;
@@ -127,8 +129,8 @@
 	 * the interval requested. This could happen if
 	 * time requested % (usecs per jiffy) is more than the usecs left
 	 * in the current jiffy */
-	p->signal->real_timer.expires = jiffies + interval + 1;
-	add_timer(&p->signal->real_timer);
+	expires = jiffies + interval + 1;
+	mod_timer(&p->signal->real_timer, expires);
 }
 
 void it_real_fn(unsigned long __data)
@@ -156,8 +158,6 @@
 		spin_lock_irq(&tsk->sighand->siglock);
 		interval = tsk->signal->it_real_incr;
 		val = it_real_value(tsk->signal);
-		if (val)
-			del_timer_sync(&tsk->signal->real_timer);
 		tsk->signal->it_real_incr =
 			timeval_to_jiffies(&value->it_interval);
 		it_real_arm(tsk, timeval_to_jiffies(&value->it_value));


Now the question is, what happens on the following scenario?

ksoftirqd:

  calls it_real_func 

process:

   calls do_setitimer blocks on siglock;

ksoftirqd: unlocks siglock calls it_real_arm and after it assigns
expires it takes an interrupt before calling mod_timer.

process:

   calls it_real_arm and does the changes to mod_timer first.

ksoftirqd: comes back from interrupt and then calls mod_timer with the
wrong value.

This may be a small chance in hell of happening, and the result may not
be to drastic, but this is still a race condition.  So far I think that
my unconditional calling of del_timer_sync, although inefficient, it
doesn't have any races.

-- Steve


^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [BUG] Race condition with it_real_fn in kernel/itimer.c
  2005-06-15 16:23 Steven Rostedt
  2005-06-15 17:35 ` Steven Rostedt
@ 2005-06-15 20:25 ` Andrew Morton
  2005-06-15 21:01   ` Steven Rostedt
  1 sibling, 1 reply; 12+ messages in thread
From: Andrew Morton @ 2005-06-15 20:25 UTC (permalink / raw)
  To: Steven Rostedt; +Cc: linux-kernel, mingo, Roland McGrath

Steven Rostedt <rostedt@goodmis.org> wrote:
>
> OK, I found this bug on an older version of Ingo's RT kernel with my own
> customizations. This is a very hard to get race condition but my logging
> traced it pretty good and this looks like it may also be a bug for both
> Ingo's RT kernel and the vanilla kernel. This was on an SMP machine.
> 
> Here's the race (since this was initiated with XFree86, I'll use it as
> the userland process that started this):
> 
> XFree86: calls sys_call 
>     -> sys_setitimer
>        -> do_setitimer 
>            (grabs tsk->sighand->siglock)
>            -> del_timer_sync
>      which has the following code:
> 
> 	for_each_online_cpu(i) {
> 		base = &per_cpu(tvec_bases, i);
> 		if (base->running_timer == timer) {
> 			while (base->running_timer == timer) {
> 				cpu_relax();
> 				preempt_check_resched();
> 			}
> 			break;
> 		}
> 	}
> 
> If the timer hasn't gone off yet on another cpu, it will spin until it
> is finished. Now here's the problem:
> 
> ksoftirqd: calls do_softirq -> ... -> run_timer_softirq
>       -> __run_timers
>         -> it_real_fn
>             -> send_group_sig_info
>               -> group_send_sig_info
>                   (grabs p->sighand->siglock)
> 
> Now, since the ksoftirqd is what changes running_timer, we have a
> deadlock! 

Yes, that's a deadlock.

> What would be the harm in doing something like:
> 
> --- linux-2.6.12-rc6/kernel/itimer.c.orig	2005-06-15 12:14:13.000000000 -0400
> +++ linux-2.6.12-rc6/kernel/itimer.c	2005-06-15 12:18:31.000000000 -0400
> @@ -153,11 +153,15 @@
>  
>  	switch (which) {
>  	case ITIMER_REAL:
> +	try_again:
>  		spin_lock_irq(&tsk->sighand->siglock);
>  		interval = tsk->signal->it_real_incr;
>  		val = it_real_value(tsk->signal);
> -		if (val)
> +		if (val) {
> +			spin_unlock_irq(&tsk->sighand->siglock);
>  			del_timer_sync(&tsk->signal->real_timer);
> +			goto try_again;
> +		}
>  		tsk->signal->it_real_incr =
>  			timeval_to_jiffies(&value->it_interval);
>  		it_real_arm(tsk, timeval_to_jiffies(&value->it_value));
> 
> 

And that will fix it.  (Labels start in column zero, and a comment is
needed here).

However I wonder if it would be sufficient to remove the del_timer_sync()
call altogether and just do mod_timer() in it_real_arm().

If the handler happens to be running on another CPU and if the handler
tries to run mod_timer() _after_ the do_setitimer() has run mod_timer(),
the handler will use the desired value of it_real_incr anyway.



^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [BUG] Race condition with it_real_fn in kernel/itimer.c
  2005-06-15 16:23 Steven Rostedt
@ 2005-06-15 17:35 ` Steven Rostedt
  2005-06-15 20:25 ` Andrew Morton
  1 sibling, 0 replies; 12+ messages in thread
From: Steven Rostedt @ 2005-06-15 17:35 UTC (permalink / raw)
  To: LKML; +Cc: Ingo Molnar, Andrew Morton

On Wed, 2005-06-15 at 12:23 -0400, Steven Rostedt wrote:

> 
> 	for_each_online_cpu(i) {
> 		base = &per_cpu(tvec_bases, i);
> 		if (base->running_timer == timer) {
> 			while (base->running_timer == timer) {
> 				cpu_relax();
> 				preempt_check_resched();
> 			}
> 			break;
> 		}
> 	}
> 
> If the timer hasn't gone off yet on another cpu, it will spin until it
> is finished. Now here's the problem:

This was suppose to say...

If the timer is currently going off on another cpu.


-- Steve



^ permalink raw reply	[flat|nested] 12+ messages in thread

* [BUG] Race condition with it_real_fn in kernel/itimer.c
@ 2005-06-15 16:23 Steven Rostedt
  2005-06-15 17:35 ` Steven Rostedt
  2005-06-15 20:25 ` Andrew Morton
  0 siblings, 2 replies; 12+ messages in thread
From: Steven Rostedt @ 2005-06-15 16:23 UTC (permalink / raw)
  To: LKML; +Cc: Andrew Morton, Ingo Molnar

OK, I found this bug on an older version of Ingo's RT kernel with my own
customizations. This is a very hard to get race condition but my logging
traced it pretty good and this looks like it may also be a bug for both
Ingo's RT kernel and the vanilla kernel. This was on an SMP machine.

Here's the race (since this was initiated with XFree86, I'll use it as
the userland process that started this):

XFree86: calls sys_call 
    -> sys_setitimer
       -> do_setitimer 
           (grabs tsk->sighand->siglock)
           -> del_timer_sync
     which has the following code:

	for_each_online_cpu(i) {
		base = &per_cpu(tvec_bases, i);
		if (base->running_timer == timer) {
			while (base->running_timer == timer) {
				cpu_relax();
				preempt_check_resched();
			}
			break;
		}
	}

If the timer hasn't gone off yet on another cpu, it will spin until it
is finished. Now here's the problem:

ksoftirqd: calls do_softirq -> ... -> run_timer_softirq
      -> __run_timers
        -> it_real_fn
            -> send_group_sig_info
              -> group_send_sig_info
                  (grabs p->sighand->siglock)

Now, since the ksoftirqd is what changes running_timer, we have a
deadlock! 

What would be the harm in doing something like:

--- linux-2.6.12-rc6/kernel/itimer.c.orig	2005-06-15 12:14:13.000000000 -0400
+++ linux-2.6.12-rc6/kernel/itimer.c	2005-06-15 12:18:31.000000000 -0400
@@ -153,11 +153,15 @@
 
 	switch (which) {
 	case ITIMER_REAL:
+	try_again:
 		spin_lock_irq(&tsk->sighand->siglock);
 		interval = tsk->signal->it_real_incr;
 		val = it_real_value(tsk->signal);
-		if (val)
+		if (val) {
+			spin_unlock_irq(&tsk->sighand->siglock);
 			del_timer_sync(&tsk->signal->real_timer);
+			goto try_again;
+		}
 		tsk->signal->it_real_incr =
 			timeval_to_jiffies(&value->it_interval);
 		it_real_arm(tsk, timeval_to_jiffies(&value->it_value));


-- Steve



^ permalink raw reply	[flat|nested] 12+ messages in thread

end of thread, other threads:[~2005-06-16 14:21 UTC | newest]

Thread overview: 12+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2005-06-15 17:39 [BUG] Race condition with it_real_fn in kernel/itimer.c Oleg Nesterov
2005-06-15 18:37 ` Steven Rostedt
2005-06-15 19:34   ` [PATCH] " Steven Rostedt
2005-06-16  7:44     ` Oleg Nesterov
2005-06-16 11:33       ` Steven Rostedt
2005-06-16 11:44         ` Steven Rostedt
2005-06-16 14:30         ` [PATCH] Re: [BUG] Race condition with it_real_fn inkernel/itimer.c Oleg Nesterov
2005-06-16  9:03   ` [BUG] Race condition with it_real_fn in kernel/itimer.c Oleg Nesterov
  -- strict thread matches above, loose matches on Subject: below --
2005-06-15 16:23 Steven Rostedt
2005-06-15 17:35 ` Steven Rostedt
2005-06-15 20:25 ` Andrew Morton
2005-06-15 21:01   ` Steven Rostedt

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).