* Re: [PATCH -rt] Fix TASKLET_STATE_SCHED WARN_ON()
From: Oleg Nesterov @ 2007-06-15 15:52 UTC
  To: john stultz
  Cc: Ingo Molnar, Paul E. McKenney, Steven Rostedt, Thomas Gleixner,
	linux-kernel

john stultz wrote:
>
> The following additional patch should correct this issue. Although,
> since we weren't actually hitting it, the issue is a bit theoretical,
> so I've not been able to prove it really fixes anything.

Could you please look at the message below? I sent it privately nearly a
month ago, but I think these problems are still not fixed.

Oleg.

---------------------------------------------------------------------------

> +__tasklet_action(struct softirq_action *a, struct tasklet_struct *list)
> +{
> +	int loops = 1000000;
>  
>  	while (list) {
>  		struct tasklet_struct *t = list;
>  
>  		list = list->next;
> +		/*
> +		 * Should always succeed - after a tasklet got on the
> +		 * list (after getting the SCHED bit set from 0 to 1),
> +		 * nothing but the tasklet softirq it got queued to can
> +		 * lock it:
> +		 */
> +		if (!tasklet_trylock(t)) {
> +			WARN_ON(1);
> +			continue;
> +		}
> +
> +		t->next = NULL;
> +
> +		/*
> +		 * If we cannot handle the tasklet because it's disabled,
> +		 * mark it as pending. tasklet_enable() will later
> +		 * re-schedule the tasklet.
> +		 */
> +		if (unlikely(atomic_read(&t->count))) {
> +out_disabled:
> +			/* implicit unlock: */
> +			wmb();
> +			t->state = TASKLET_STATEF_PENDING;

What if tasklet_enable() happens just before this line?

After the next tasklet_schedule() we have all bits set: SCHED, RUN, PENDING.
The next invocation of __tasklet_action() clears SCHED, but tasklet_tryunlock()
below can never succeed because of PENDING.
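
For reference, the unlock primitive is essentially a single cmpxchg; a
sketch of what I believe the -rt helper looks like (details may differ):

	/*
	 * The only transition allowed is the exact RUN -> 0 one.  Any
	 * other bit set in ->state, e.g. PENDING, makes the cmpxchg
	 * fail.
	 */
	static inline int tasklet_tryunlock(struct tasklet_struct *t)
	{
		return cmpxchg(&t->state, TASKLET_STATEF_RUN, 0) ==
			TASKLET_STATEF_RUN;
	}

So with state == RUN|PENDING and ->count back at zero, none of the three
exits in the while loop quoted below can fire: the cmpxchg fails, the
count check fails, and SCHED is already clear. The CPU spins in that
loop forever.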

> +again:
> +		t->func(t->data);
> +
> +		/*
> +		 * Try to unlock the tasklet. We must use cmpxchg, because
> +		 * another CPU might have scheduled or disabled the tasklet.
> +		 * We only allow the STATE_RUN -> 0 transition here.
> +		 */
> +		while (!tasklet_tryunlock(t)) {
> +			/*
> +			 * If it got disabled meanwhile, bail out:
> +			 */
> +			if (atomic_read(&t->count))
> +				goto out_disabled;
> +			/*
> +			 * If it got scheduled meanwhile, re-execute
> +			 * the tasklet function:
> +			 */
> +			if (test_and_clear_bit(TASKLET_STATE_SCHED, &t->state))
> +				goto again;

TASKLET_STATE_SCHED could also be set by tasklet_kill(); in that case it
is not nice to call t->func() again (though probably harmless).
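
(For reference, mainline tasklet_kill() takes the SCHED bit itself to
block further scheduling; a sketch from memory, trimmed:)

	void tasklet_kill(struct tasklet_struct *t)
	{
		/* Grab SCHED so nobody else can schedule the tasklet. */
		while (test_and_set_bit(TASKLET_STATE_SCHED, &t->state)) {
			do
				yield();
			while (test_bit(TASKLET_STATE_SCHED, &t->state));
		}
		/* Wait for a running instance to finish... */
		tasklet_unlock_wait(t);
		/* ...then drop the bit again. */
		clear_bit(TASKLET_STATE_SCHED, &t->state);
	}

If that test_and_set_bit() lands while __tasklet_action() is spinning in
its unlock loop, the test_and_clear_bit() quoted above sees SCHED set
and re-runs t->func().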


* [PATCH -rt] Fix TASKLET_STATE_SCHED WARN_ON()
From: john stultz @ 2007-06-06  2:17 UTC
  To: Ingo Molnar
  Cc: Thomas Gleixner, Steven Rostedt, Paul E. McKenney, lkml

Hey Ingo,
	So we've been seeing the following trace fairly frequently on our SMP
boxes when running kernbench:

BUG: at kernel/softirq.c:639 __tasklet_action()

Call Trace:
 [<ffffffff8106d5da>] dump_trace+0xaa/0x32a
 [<ffffffff8106d89b>] show_trace+0x41/0x5c
 [<ffffffff8106d8cb>] dump_stack+0x15/0x17
 [<ffffffff81094a97>] __tasklet_action+0xdf/0x12e
 [<ffffffff81094f76>] tasklet_action+0x27/0x29
 [<ffffffff8109530a>] ksoftirqd+0x16c/0x271
 [<ffffffff81033d4d>] kthread+0xf5/0x128
 [<ffffffff8105ff68>] child_rip+0xa/0x12


Paul also pointed this out a while back: http://lkml.org/lkml/2007/2/25/1


Anyway, I think I finally found the issue. It's a bit hard to explain,
but the idea is: while __tasklet_action is running the tasklet function
on CPU1, CPU2 may call tasklet_schedule(). If CPU2 is preempted right
after it marks the TASKLET_STATE_SCHED bit, __tasklet_action on CPU1 can
re-run the function, clear the bit and unlock the tasklet before CPU2
enters __tasklet_common_schedule. Once __tasklet_common_schedule locks
the tasklet, we then add it to the list with TASKLET_STATE_SCHED
*unset*.
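
Roughly, the window is between the test_and_set_bit() in
tasklet_schedule() and the tasklet_trylock() in
__tasklet_common_schedule(). A sketch of the scheduling side (in -rt
this goes through __tasklet_schedule(); details may differ):

	void tasklet_schedule(struct tasklet_struct *t)
	{
		/* CPU2 sets SCHED here... */
		if (!test_and_set_bit(TASKLET_STATE_SCHED, &t->state))
			/*
			 * ...and may be preempted before the queueing
			 * side takes the tasklet lock.  Meanwhile CPU1,
			 * still holding RUN, sees SCHED, clears it,
			 * re-runs t->func() and unlocks.  CPU2 then
			 * resumes, locks the tasklet and queues it with
			 * SCHED already clear.
			 */
			__tasklet_schedule(t);
	}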

I've verified that this race occurs with a WARN_ON in
__tasklet_common_schedule().


This fix avoids the race by checking, *after* we've locked the tasklet,
that the STATE_SCHED bit is still set before adding it to the list.

Does it look ok to you?

thanks
-john

Signed-off-by: John Stultz <johnstul@us.ibm.com>

Index: 2.6-rt/kernel/softirq.c
===================================================================
--- 2.6-rt.orig/kernel/softirq.c	2007-06-05 18:30:54.000000000 -0700
+++ 2.6-rt/kernel/softirq.c	2007-06-05 18:36:44.000000000 -0700
@@ -544,10 +544,17 @@ static void inline
 __tasklet_common_schedule(struct tasklet_struct *t, struct tasklet_head *head, unsigned int nr)
 {
 	if (tasklet_trylock(t)) {
-		WARN_ON(t->next != NULL);
-		t->next = head->list;
-		head->list = t;
-		raise_softirq_irqoff(nr);
+		/* We may have been preempted before tasklet_trylock
+		 * and __tasklet_action may have already run.
+		 * So double check the sched bit while the tasklet
+		 * is locked before adding it to the list.
+		 */
+		if (test_bit(TASKLET_STATE_SCHED, &t->state)) {
+			WARN_ON(t->next != NULL);
+			t->next = head->list;
+			head->list = t;
+			raise_softirq_irqoff(nr);
+		}
 		tasklet_unlock(t);
 	}
 }


