* [PATCH RT] tasklet/rt: Prevent tasklets from going into infinite spin in RT
@ 2011-11-30  1:55 Steven Rostedt
  2011-11-30  3:50 ` Clark Williams
                   ` (3 more replies)
  0 siblings, 4 replies; 14+ messages in thread
From: Steven Rostedt @ 2011-11-30  1:55 UTC (permalink / raw)
  To: LKML, RT
  Cc: Thomas Gleixner, Ingo Molnar, Luis Claudio R. Goncalves, Clark Williams

Ingo,

I forward ported this code from 2.6.33.9-rt31, but I think you were the
original author, as I found most of this code in the
"tasklet-redesign.patch" from my broken out 2.6.24-rt patches. I
committed it into my git tree (stable-rt) under your name, and added the
Signed-off-by that you had in that patch, if you have any objections,
please let me know. This patch should never see mainline, but it will
probably be going into the -rt branch. I wrote up this change log, if
there's something you don't like in it, let me know and I'll fix it.

Luis and Clark (I love saying that),

I booted this patch against 3.0-rt stable, and it didn't crash ;)
Could you apply it and see if it fixes the hang that you've been seeing.

Thanks,

-- Steve

commit 8a69de1b881e32b2a23452e6f25c39587c6f5e0e
Author: Ingo Molnar <mingo@elte.hu>
Date:   Tue Nov 29 20:18:22 2011 -0500

    tasklet/rt: Prevent tasklets from going into infinite spin in RT
    
    When CONFIG_PREEMPT_RT_FULL is enabled, tasklets run as threads,
    and spinlocks turn into mutexes. But this can cause issues with
    tasks disabling tasklets. A tasklet runs under ksoftirqd, and
    if a tasklet is disabled with tasklet_disable(), the tasklet
    count is increased. When a tasklet runs, it checks this counter
    and if it is set, it adds itself back on the softirq queue and
    returns.

    The problem arises in RT because ksoftirqd will see that a softirq
    is ready to run (the tasklet softirq just re-armed itself), and will
    not sleep, but instead run the softirqs again. The tasklet softirq
    will still see that the count is non-zero, will not execute the
    tasklet, and will requeue it on the softirq again, which will
    cause ksoftirqd to run it again and again and again.

    It gets worse because ksoftirqd runs as a real-time thread.
    If it preempted the task that disabled tasklets, and that task
    has migration disabled, or can't run for other reasons, the tasklet
    softirq will never run because the count will never be zero, and
    ksoftirqd will go into an infinite loop. As ksoftirqd is an RT
    task, this becomes a big problem.

    This is a hack of a solution: when a disabled tasklet runs,
    instead of requeueing it on the softirq, the tasklet softirq
    defers it. When tasklet_enable() is called and tasklets are
    waiting, tasklet_enable() kicks the tasklets to continue.
    This prevents the lockup caused by ksoftirqd going into an
    infinite loop.
    
    Signed-off-by: Ingo Molnar <mingo@elte.hu>
    [ ported to 3.0-rt ]
    Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
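
For illustration, here is a minimal driver-side sketch of the pattern the
changelog describes. It is hypothetical (the demo_* names are made up and not
part of this patch) and only shows how tasklet_disable()/tasklet_enable()
interact with an already-scheduled tasklet on RT:

#include <linux/interrupt.h>
#include <linux/module.h>

static void demo_func(unsigned long data)
{
	/* tasklet body; on PREEMPT_RT_FULL this runs in a ksoftirqd thread */
}

static DECLARE_TASKLET(demo_tasklet, demo_func, 0);

static void demo_reconfigure(void)
{
	/*
	 * tasklet_disable() increments demo_tasklet.count. If the tasklet
	 * is already on the per-CPU softirq list, the old tasklet_action()
	 * would simply requeue it, and an RT-priority ksoftirqd could spin
	 * on it until this task gets to run tasklet_enable() below.
	 */
	tasklet_disable(&demo_tasklet);

	/* ... modify state that must not race with demo_func() ... */

	/*
	 * With this patch the deferred tasklet was parked with
	 * TASKLET_STATE_PENDING set; tasklet_enable() re-schedules it.
	 */
	tasklet_enable(&demo_tasklet);
}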

diff --git a/include/linux/interrupt.h b/include/linux/interrupt.h
index a62158f..3142442 100644
--- a/include/linux/interrupt.h
+++ b/include/linux/interrupt.h
@@ -501,8 +501,9 @@ extern void __send_remote_softirq(struct call_single_data *cp, int cpu,
      to be executed on some cpu at least once after this.
    * If the tasklet is already scheduled, but its execution is still not
      started, it will be executed only once.
-   * If this tasklet is already running on another CPU (or schedule is called
-     from tasklet itself), it is rescheduled for later.
+   * If this tasklet is already running on another CPU, it is rescheduled
+     for later.
+   * Schedule must not be called from the tasklet itself (a lockup occurs)
    * Tasklet is strictly serialized wrt itself, but not
      wrt another tasklets. If client needs some intertask synchronization,
      he makes it with spinlocks.
@@ -527,27 +528,36 @@ struct tasklet_struct name = { NULL, 0, ATOMIC_INIT(1), func, data }
 enum
 {
 	TASKLET_STATE_SCHED,	/* Tasklet is scheduled for execution */
-	TASKLET_STATE_RUN	/* Tasklet is running (SMP only) */
+	TASKLET_STATE_RUN,	/* Tasklet is running (SMP only) */
+	TASKLET_STATE_PENDING	/* Tasklet is pending */
 };
 
-#ifdef CONFIG_SMP
+#define TASKLET_STATEF_SCHED	(1 << TASKLET_STATE_SCHED)
+#define TASKLET_STATEF_RUN	(1 << TASKLET_STATE_RUN)
+#define TASKLET_STATEF_PENDING	(1 << TASKLET_STATE_PENDING)
+
+#if defined(CONFIG_SMP) || defined(CONFIG_PREEMPT_RT_FULL)
 static inline int tasklet_trylock(struct tasklet_struct *t)
 {
 	return !test_and_set_bit(TASKLET_STATE_RUN, &(t)->state);
 }
 
+static inline int tasklet_tryunlock(struct tasklet_struct *t)
+{
+	return cmpxchg(&t->state, TASKLET_STATEF_RUN, 0) == TASKLET_STATEF_RUN;
+}
+
 static inline void tasklet_unlock(struct tasklet_struct *t)
 {
 	smp_mb__before_clear_bit(); 
 	clear_bit(TASKLET_STATE_RUN, &(t)->state);
 }
 
-static inline void tasklet_unlock_wait(struct tasklet_struct *t)
-{
-	while (test_bit(TASKLET_STATE_RUN, &(t)->state)) { barrier(); }
-}
+extern void tasklet_unlock_wait(struct tasklet_struct *t);
+
 #else
 #define tasklet_trylock(t) 1
+#define tasklet_tryunlock(t)	1
 #define tasklet_unlock_wait(t) do { } while (0)
 #define tasklet_unlock(t) do { } while (0)
 #endif
@@ -596,17 +606,8 @@ static inline void tasklet_disable(struct tasklet_struct *t)
 	smp_mb();
 }
 
-static inline void tasklet_enable(struct tasklet_struct *t)
-{
-	smp_mb__before_atomic_dec();
-	atomic_dec(&t->count);
-}
-
-static inline void tasklet_hi_enable(struct tasklet_struct *t)
-{
-	smp_mb__before_atomic_dec();
-	atomic_dec(&t->count);
-}
+extern  void tasklet_enable(struct tasklet_struct *t);
+extern  void tasklet_hi_enable(struct tasklet_struct *t);
 
 extern void tasklet_kill(struct tasklet_struct *t);
 extern void tasklet_kill_immediate(struct tasklet_struct *t, unsigned int cpu);
diff --git a/kernel/softirq.c b/kernel/softirq.c
index 026a283..3489d06 100644
--- a/kernel/softirq.c
+++ b/kernel/softirq.c
@@ -21,6 +21,7 @@
 #include <linux/freezer.h>
 #include <linux/kthread.h>
 #include <linux/rcupdate.h>
+#include <linux/delay.h>
 #include <linux/ftrace.h>
 #include <linux/smp.h>
 #include <linux/tick.h>
@@ -670,15 +671,45 @@ struct tasklet_head
 static DEFINE_PER_CPU(struct tasklet_head, tasklet_vec);
 static DEFINE_PER_CPU(struct tasklet_head, tasklet_hi_vec);
 
+static void inline
+__tasklet_common_schedule(struct tasklet_struct *t, struct tasklet_head *head, unsigned int nr)
+{
+	if (tasklet_trylock(t)) {
+again:
+		/* We may have been preempted before tasklet_trylock
+		 * and __tasklet_action may have already run.
+		 * So double check the sched bit while the tasklet
+		 * is locked before adding it to the list.
+		 */
+		if (test_bit(TASKLET_STATE_SCHED, &t->state)) {
+			t->next = NULL;
+			*head->tail = t;
+			head->tail = &(t->next);
+			raise_softirq_irqoff(nr);
+			tasklet_unlock(t);
+		} else {
+			/* This is subtle. If we hit the corner case above
+			 * It is possible that we get preempted right here,
+			 * and another task has successfully called
+			 * tasklet_schedule(), then this function, and
+			 * failed on the trylock. Thus we must be sure
+			 * before releasing the tasklet lock, that the
+			 * SCHED_BIT is clear. Otherwise the tasklet
+			 * may get its SCHED_BIT set, but not added to the
+			 * list
+			 */
+			if (!tasklet_tryunlock(t))
+				goto again;
+		}
+	}
+}
+
 void __tasklet_schedule(struct tasklet_struct *t)
 {
 	unsigned long flags;
 
 	local_irq_save(flags);
-	t->next = NULL;
-	*__this_cpu_read(tasklet_vec.tail) = t;
-	__this_cpu_write(tasklet_vec.tail, &(t->next));
-	raise_softirq_irqoff(TASKLET_SOFTIRQ);
+	__tasklet_common_schedule(t, &__get_cpu_var(tasklet_vec), TASKLET_SOFTIRQ);
 	local_irq_restore(flags);
 }
 
@@ -689,10 +720,7 @@ void __tasklet_hi_schedule(struct tasklet_struct *t)
 	unsigned long flags;
 
 	local_irq_save(flags);
-	t->next = NULL;
-	*__this_cpu_read(tasklet_hi_vec.tail) = t;
-	__this_cpu_write(tasklet_hi_vec.tail,  &(t->next));
-	raise_softirq_irqoff(HI_SOFTIRQ);
+	__tasklet_common_schedule(t, &__get_cpu_var(tasklet_hi_vec), HI_SOFTIRQ);
 	local_irq_restore(flags);
 }
 
@@ -700,50 +728,119 @@ EXPORT_SYMBOL(__tasklet_hi_schedule);
 
 void __tasklet_hi_schedule_first(struct tasklet_struct *t)
 {
-	BUG_ON(!irqs_disabled());
-
-	t->next = __this_cpu_read(tasklet_hi_vec.head);
-	__this_cpu_write(tasklet_hi_vec.head, t);
-	__raise_softirq_irqoff(HI_SOFTIRQ);
+	__tasklet_hi_schedule(t);
 }
 
 EXPORT_SYMBOL(__tasklet_hi_schedule_first);
+ 
+void  tasklet_enable(struct tasklet_struct *t)
+{
+	if (!atomic_dec_and_test(&t->count))
+		return;
+	if (test_and_clear_bit(TASKLET_STATE_PENDING, &t->state))
+		tasklet_schedule(t);
+}
+ 
+EXPORT_SYMBOL(tasklet_enable);
 
-static void tasklet_action(struct softirq_action *a)
+void  tasklet_hi_enable(struct tasklet_struct *t)
 {
-	struct tasklet_struct *list;
+	if (!atomic_dec_and_test(&t->count))
+		return;
+	if (test_and_clear_bit(TASKLET_STATE_PENDING, &t->state))
+		tasklet_hi_schedule(t);
+}
 
-	local_irq_disable();
-	list = __this_cpu_read(tasklet_vec.head);
-	__this_cpu_write(tasklet_vec.head, NULL);
-	__this_cpu_write(tasklet_vec.tail, &__get_cpu_var(tasklet_vec).head);
-	local_irq_enable();
+EXPORT_SYMBOL(tasklet_hi_enable);
+
+static void
+__tasklet_action(struct softirq_action *a, struct tasklet_struct *list)
+{
+	int loops = 1000000;
 
 	while (list) {
 		struct tasklet_struct *t = list;
 
 		list = list->next;
 
-		if (tasklet_trylock(t)) {
-			if (!atomic_read(&t->count)) {
-				if (!test_and_clear_bit(TASKLET_STATE_SCHED, &t->state))
-					BUG();
-				t->func(t->data);
-				tasklet_unlock(t);
-				continue;
-			}
-			tasklet_unlock(t);
+		/*
+		 * Should always succeed - after a tasklet got on the
+		 * list (after getting the SCHED bit set from 0 to 1),
+		 * nothing but the tasklet softirq it got queued to can
+		 * lock it:
+		 */
+		if (!tasklet_trylock(t)) {
+			WARN_ON(1);
+			continue;
 		}
 
-		local_irq_disable();
 		t->next = NULL;
-		*__this_cpu_read(tasklet_vec.tail) = t;
-		__this_cpu_write(tasklet_vec.tail, &(t->next));
-		__raise_softirq_irqoff(TASKLET_SOFTIRQ);
-		local_irq_enable();
+
+		/*
+		 * If we cannot handle the tasklet because it's disabled,
+		 * mark it as pending. tasklet_enable() will later
+		 * re-schedule the tasklet.
+		 */
+		if (unlikely(atomic_read(&t->count))) {
+out_disabled:
+			/* implicit unlock: */
+			wmb();
+			t->state = TASKLET_STATEF_PENDING;
+			continue;
+		}
+
+		/*
+		 * After this point on the tasklet might be rescheduled
+		 * on another CPU, but it can only be added to another
+		 * CPU's tasklet list if we unlock the tasklet (which we
+		 * don't do yet).
+		 */
+		if (!test_and_clear_bit(TASKLET_STATE_SCHED, &t->state))
+			WARN_ON(1);
+
+again:
+		t->func(t->data);
+
+		/*
+		 * Try to unlock the tasklet. We must use cmpxchg, because
+		 * another CPU might have scheduled or disabled the tasklet.
+		 * We only allow the STATE_RUN -> 0 transition here.
+		 */
+		while (!tasklet_tryunlock(t)) {
+			/*
+			 * If it got disabled meanwhile, bail out:
+			 */
+			if (atomic_read(&t->count))
+				goto out_disabled;
+			/*
+			 * If it got scheduled meanwhile, re-execute
+			 * the tasklet function:
+			 */
+			if (test_and_clear_bit(TASKLET_STATE_SCHED, &t->state))
+				goto again;
+			if (!--loops) {
+				printk("hm, tasklet state: %08lx\n", t->state);
+				WARN_ON(1);
+				tasklet_unlock(t);
+				break;
+			}
+		}
 	}
 }
 
+static void tasklet_action(struct softirq_action *a)
+{
+	struct tasklet_struct *list;
+
+	local_irq_disable();
+	list = __get_cpu_var(tasklet_vec).head;
+	__get_cpu_var(tasklet_vec).head = NULL;
+	__get_cpu_var(tasklet_vec).tail = &__get_cpu_var(tasklet_vec).head;
+	local_irq_enable();
+
+	__tasklet_action(a, list);
+}
+
 static void tasklet_hi_action(struct softirq_action *a)
 {
 	struct tasklet_struct *list;
@@ -754,29 +851,7 @@ static void tasklet_hi_action(struct softirq_action *a)
 	__this_cpu_write(tasklet_hi_vec.tail, &__get_cpu_var(tasklet_hi_vec).head);
 	local_irq_enable();
 
-	while (list) {
-		struct tasklet_struct *t = list;
-
-		list = list->next;
-
-		if (tasklet_trylock(t)) {
-			if (!atomic_read(&t->count)) {
-				if (!test_and_clear_bit(TASKLET_STATE_SCHED, &t->state))
-					BUG();
-				t->func(t->data);
-				tasklet_unlock(t);
-				continue;
-			}
-			tasklet_unlock(t);
-		}
-
-		local_irq_disable();
-		t->next = NULL;
-		*__this_cpu_read(tasklet_hi_vec.tail) = t;
-		__this_cpu_write(tasklet_hi_vec.tail, &(t->next));
-		__raise_softirq_irqoff(HI_SOFTIRQ);
-		local_irq_enable();
-	}
+	__tasklet_action(a, list);
 }
 
 
@@ -799,7 +874,7 @@ void tasklet_kill(struct tasklet_struct *t)
 
 	while (test_and_set_bit(TASKLET_STATE_SCHED, &t->state)) {
 		do {
-			yield();
+			msleep(1);
 		} while (test_bit(TASKLET_STATE_SCHED, &t->state));
 	}
 	tasklet_unlock_wait(t);
@@ -1005,6 +1080,23 @@ void __init softirq_init(void)
 	open_softirq(HI_SOFTIRQ, tasklet_hi_action);
 }
 
+#if defined(CONFIG_SMP) || defined(CONFIG_PREEMPT_RT_FULL)
+void tasklet_unlock_wait(struct tasklet_struct *t)
+{
+	while (test_bit(TASKLET_STATE_RUN, &(t)->state)) {
+		/*
+		 * Hack for now to avoid this busy-loop:
+		 */
+#ifdef CONFIG_PREEMPT_RT_FULL
+		msleep(1);
+#else
+		barrier();
+#endif
+	}
+}
+EXPORT_SYMBOL(tasklet_unlock_wait);
+#endif
+
 static int run_ksoftirqd(void * __bind_cpu)
 {
 	ksoftirqd_set_sched_params();




* Re: [PATCH RT] tasklet/rt: Prevent tasklets from going into infinite spin in RT
  2011-11-30  1:55 [PATCH RT] tasklet/rt: Prevent tasklets from going into infinite spin in RT Steven Rostedt
@ 2011-11-30  3:50 ` Clark Williams
  2011-11-30  6:24   ` Mike Kravetz
  2011-11-30  4:32 ` Mike Galbraith
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 14+ messages in thread
From: Clark Williams @ 2011-11-30  3:50 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: LKML, RT, Thomas Gleixner, Ingo Molnar, Luis Claudio R. Goncalves


On Tue, 29 Nov 2011 20:55:20 -0500
Steven Rostedt <rostedt@goodmis.org> wrote:

> Ingo,
> 
> I forward ported this code from 2.6.33.9-rt31, but I think you were the
> original author, as I found most of this code in the
> "tasklet-redesign.patch" from my broken out 2.6.24-rt patches. I
> committed it into my git tree (stable-rt) under your name, and added the
> Signed-off-by that you had in that patch, if you have any objections,
> please let me know. This patch should never see mainline, but it will
> probably be going into the -rt branch. I wrote up this change log, if
> there's something you don't like in it, let me know and I'll fix it.
> 
> Luis and Clark (I love saying that),

No matter how many times you say it, I'm still not trekking across the
USA with Luis. :)

> 
> I booted this patch against 3.0-rt stable, and it didn't crash ;)
> Could you apply it and see if it fixes the hang that you've been seeing.
> 

Yeah, since Luis is a couple of time zones ahead of us, maybe it'll all
be fixed when we get up in the morning.

Clark



* Re: [PATCH RT] tasklet/rt: Prevent tasklets from going into infinite spin in RT
  2011-11-30  1:55 [PATCH RT] tasklet/rt: Prevent tasklets from going into infinite spin in RT Steven Rostedt
  2011-11-30  3:50 ` Clark Williams
@ 2011-11-30  4:32 ` Mike Galbraith
  2011-11-30  7:34   ` Mike Galbraith
  2011-11-30 10:24   ` Thomas Gleixner
  2011-11-30  8:46   ` Tim Sander
  2011-11-30 14:24   ` Luis Claudio R. Goncalves
  3 siblings, 2 replies; 14+ messages in thread
From: Mike Galbraith @ 2011-11-30  4:32 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: LKML, RT, Thomas Gleixner, Ingo Molnar,
	Luis Claudio R. Goncalves, Clark Williams

On Tue, 2011-11-29 at 20:55 -0500, Steven Rostedt wrote:
> Ingo,
> 
> I forward ported this code from 2.6.33.9-rt31, but I think you were the
> original author, as I found most of this code in the
> "tasklet-redesign.patch" from my broken out 2.6.24-rt patches. I
> committed it into my git tree (stable-rt) under your name, and added the
> Signed-off-by that you had in that patch, if you have any objections,
> please let me know. This patch should never see mainline, but it will
> probably be going into the -rt branch. I wrote up this change log, if
> there's something you don't like in it, let me know and I'll fix it.

I'm oh so happy to see this.  I've been going nuts trying to figure out
why the heck 33-rt doesn't go bonkers, but 30+ rt does.

> Luis and Clark (I love saying that),
> 
> I booted this patch against 3.0-rt stable, and it didn't crash ;)
> Could you apply it and see if it fixes the hang that you've been seeing.

I'll most certainly be testing it too.  With the below, and the
conditional yield thingy disabled, all I have to do is boot x3550 M3
box, and it'll hang very frequently, but not always, with sirq-tasklet
going stark raving mad.  Yielding fix^Wmakes it not do the bad thing.

(somewhat less disgusting version of sirq threads patch;)

sched, rt: resurrect softirq threads for RT_FULL

Signed-off-by: Mike Galbraith <efault@gmx.de>
---
 include/linux/interrupt.h |   46 ++++++++++
 kernel/irq/Kconfig        |    7 +
 kernel/sched.c            |    4 
 kernel/softirq.c          |  194 ++++++++++++++++++++++++++++++++--------------
 4 files changed, 191 insertions(+), 60 deletions(-)

Index: linux-3.2-rt/kernel/irq/Kconfig
===================================================================
--- linux-3.2-rt.orig/kernel/irq/Kconfig
+++ linux-3.2-rt/kernel/irq/Kconfig
@@ -60,6 +60,13 @@ config IRQ_DOMAIN
 config IRQ_FORCED_THREADING
        bool
 
+# Support forced sirq threading
+config SIRQ_FORCED_THREADING
+       bool "Forced Soft IRQ threading"
+       depends on PREEMPT_RT_FULL
+	help
+	  Split ksoftirqd into per SOFTIRQ threads
+
 config SPARSE_IRQ
 	bool "Support sparse irq numbering"
 	depends on HAVE_SPARSE_IRQ
Index: linux-3.2-rt/include/linux/interrupt.h
===================================================================
--- linux-3.2-rt.orig/include/linux/interrupt.h
+++ linux-3.2-rt/include/linux/interrupt.h
@@ -442,6 +442,9 @@ enum
 	NR_SOFTIRQS
 };
 
+/* Update when adding new softirqs. */
+#define SOFTIRQ_MASK_ALL 0x3ff
+
 /* map softirq index to softirq name. update 'softirq_to_name' in
  * kernel/softirq.c when adding a new softirq.
  */
@@ -457,10 +460,16 @@ struct softirq_action
 };
 
 #ifndef CONFIG_PREEMPT_RT_FULL
+#define NR_SOFTIRQ_THREADS 1
 asmlinkage void do_softirq(void);
 asmlinkage void __do_softirq(void);
 static inline void thread_do_softirq(void) { do_softirq(); }
 #else
+#ifdef CONFIG_SIRQ_FORCED_THREADING
+#define NR_SOFTIRQ_THREADS NR_SOFTIRQS
+#else
+#define NR_SOFTIRQ_THREADS 1
+#endif
 extern void thread_do_softirq(void);
 #endif
 
@@ -486,12 +495,43 @@ extern void softirq_check_pending_idle(v
  */
 DECLARE_PER_CPU(struct list_head [NR_SOFTIRQS], softirq_work_list);
 
-DECLARE_PER_CPU(struct task_struct *, ksoftirqd);
+struct softirqdata {
+	int			mask;
+	struct task_struct	*tsk;
+};
+
+DECLARE_PER_CPU(struct softirqdata [NR_SOFTIRQ_THREADS], ksoftirqd);
+
+static inline bool this_cpu_ksoftirqd(struct task_struct *p)
+{
+	int i;
+
+	for (i = 0; i < NR_SOFTIRQ_THREADS; i++) {
+		if (p == __get_cpu_var(ksoftirqd)[i].tsk)
+			return true;
+	}
 
-static inline struct task_struct *this_cpu_ksoftirqd(void)
+	return false;
+}
+
+#ifdef CONFIG_PREEMPT_RT_FULL
+static inline int task_sirq_mask(struct task_struct *p)
+{
+	int i;
+
+	for (i = 0; i < NR_SOFTIRQ_THREADS; i++) {
+		if (p == __get_cpu_var(ksoftirqd)[i].tsk)
+			return __get_cpu_var(ksoftirqd)[i].mask;
+	}
+
+	return SOFTIRQ_MASK_ALL;
+}
+#else
+static inline int task_sirq_mask(struct task_struct *p)
 {
-	return this_cpu_read(ksoftirqd);
+	return SOFTIRQ_MASK_ALL;
 }
+#endif
 
 /* Try to send a softirq to a remote cpu.  If this cannot be done, the
  * work will be queued to the local cpu.
Index: linux-3.2-rt/kernel/sched.c
===================================================================
--- linux-3.2-rt.orig/kernel/sched.c
+++ linux-3.2-rt/kernel/sched.c
@@ -2082,7 +2082,7 @@ void account_system_vtime(struct task_st
 	 */
 	if (hardirq_count())
 		__this_cpu_add(cpu_hardirq_time, delta);
-	else if (in_serving_softirq() && curr != this_cpu_ksoftirqd())
+	else if (in_serving_softirq() && !this_cpu_ksoftirqd(curr))
 		__this_cpu_add(cpu_softirq_time, delta);
 
 	irq_time_write_end();
@@ -4062,7 +4062,7 @@ static void irqtime_account_process_tick
 		cpustat->irq = cputime64_add(cpustat->irq, tmp);
 	} else if (irqtime_account_si_update()) {
 		cpustat->softirq = cputime64_add(cpustat->softirq, tmp);
-	} else if (this_cpu_ksoftirqd() == p) {
+	} else if (this_cpu_ksoftirqd(p)) {
 		/*
 		 * ksoftirqd time do not get accounted in cpu_softirq_time.
 		 * So, we have to handle it separately here.
Index: linux-3.2-rt/kernel/softirq.c
===================================================================
--- linux-3.2-rt.orig/kernel/softirq.c
+++ linux-3.2-rt/kernel/softirq.c
@@ -55,13 +55,31 @@ EXPORT_SYMBOL(irq_stat);
 
 static struct softirq_action softirq_vec[NR_SOFTIRQS] __cacheline_aligned_in_smp;
 
-DEFINE_PER_CPU(struct task_struct *, ksoftirqd);
+DEFINE_PER_CPU(struct softirqdata[NR_SOFTIRQ_THREADS], ksoftirqd);
 
 char *softirq_to_name[NR_SOFTIRQS] = {
 	"HI", "TIMER", "NET_TX", "NET_RX", "BLOCK", "BLOCK_IOPOLL",
 	"TASKLET", "SCHED", "HRTIMER", "RCU"
 };
 
+static const char *softirq_to_thread_name [] =
+{
+#ifdef CONFIG_SIRQ_FORCED_THREADING
+	[HI_SOFTIRQ]		= "sirq-high",
+	[TIMER_SOFTIRQ]		= "sirq-timer",
+	[NET_TX_SOFTIRQ]	= "sirq-net-tx",
+	[NET_RX_SOFTIRQ]	= "sirq-net-rx",
+	[BLOCK_SOFTIRQ]		= "sirq-blk",
+	[BLOCK_IOPOLL_SOFTIRQ]	= "sirq-blk-pol",
+	[TASKLET_SOFTIRQ]	= "sirq-tasklet",
+	[SCHED_SOFTIRQ]		= "sirq-sched",
+	[HRTIMER_SOFTIRQ]	= "sirq-hrtimer",
+	[RCU_SOFTIRQ]		= "sirq-rcu",
+#else
+	[HI_SOFTIRQ]		= "ksoftirqd",
+#endif
+};
+
 #ifdef CONFIG_NO_HZ
 # ifdef CONFIG_PREEMPT_RT_FULL
 /*
@@ -77,32 +95,38 @@ char *softirq_to_name[NR_SOFTIRQS] = {
 void softirq_check_pending_idle(void)
 {
 	static int rate_limit;
-	u32 warnpending = 0, pending = local_softirq_pending();
+	u32 warnpending = 0, pending = local_softirq_pending(), mask;
+	int i = 0;
 
 	if (rate_limit >= 10)
 		return;
 
-	if (pending) {
-		struct task_struct *tsk;
+	for (i = 0; pending && i < NR_SOFTIRQ_THREADS; i++) {
+		mask =  __get_cpu_var(ksoftirqd)[i].mask;
 
-		tsk = __get_cpu_var(ksoftirqd);
-		/*
-		 * The wakeup code in rtmutex.c wakes up the task
-		 * _before_ it sets pi_blocked_on to NULL under
-		 * tsk->pi_lock. So we need to check for both: state
-		 * and pi_blocked_on.
-		 */
-		raw_spin_lock(&tsk->pi_lock);
+		if (pending & mask) {
+			struct task_struct *tsk;
+
+			tsk = __get_cpu_var(ksoftirqd)[i].tsk;
+			/*
+			 * The wakeup code in rtmutex.c wakes up the task
+			 * _before_ it sets pi_blocked_on to NULL under
+			 * tsk->pi_lock. So we need to check for both: state
+			 * and pi_blocked_on.
+			 */
+			raw_spin_lock(&tsk->pi_lock);
 
-		if (!tsk->pi_blocked_on && !(tsk->state == TASK_RUNNING))
-			warnpending = 1;
+			if (!tsk->pi_blocked_on && !(tsk->state == TASK_RUNNING))
+				warnpending |= pending & mask;
 
-		raw_spin_unlock(&tsk->pi_lock);
+			raw_spin_unlock(&tsk->pi_lock);
+			pending &= ~mask;
+		}
 	}
 
 	if (warnpending) {
 		printk(KERN_ERR "NOHZ: local_softirq_pending %02x\n",
-		       pending);
+		       warnpending);
 		rate_limit++;
 	}
 }
@@ -131,11 +155,18 @@ void softirq_check_pending_idle(void)
  */
 static void wakeup_softirqd(void)
 {
-	/* Interrupts are disabled: no need to stop preemption */
-	struct task_struct *tsk = __this_cpu_read(ksoftirqd);
+	struct task_struct *tsk;
+	u32 pending = local_softirq_pending(), mask, i;
 
-	if (tsk && tsk->state != TASK_RUNNING)
-		wake_up_process(tsk);
+	/* Interrupts are disabled: no need to stop preemption */
+	for (i = 0; pending && i < NR_SOFTIRQ_THREADS; i++) {
+		mask = __get_cpu_var(ksoftirqd)[i].mask;
+		if (!(pending & mask))
+			continue;
+		tsk = __get_cpu_var(ksoftirqd)[i].tsk;
+		if (tsk && tsk->state != TASK_RUNNING)
+			wake_up_process(tsk);
+	}
 }
 
 static void handle_pending_softirqs(u32 pending, int cpu, int need_rcu_bh_qs)
@@ -384,11 +415,11 @@ static inline void ksoftirqd_clr_sched_p
 static DEFINE_LOCAL_IRQ_LOCK(local_softirq_lock);
 static DEFINE_PER_CPU(struct task_struct *, local_softirq_runner);
 
-static void __do_softirq_common(int need_rcu_bh_qs);
+static void __do_softirq_common(u32 mask, int need_rcu_bh_qs);
 
-void __do_softirq(void)
+void __do_softirq(u32 mask)
 {
-	__do_softirq_common(0);
+	__do_softirq_common(mask, 0);
 }
 
 void __init softirq_early_init(void)
@@ -414,7 +445,7 @@ void local_bh_enable(void)
 
 		local_irq_disable();
 		if (local_softirq_pending())
-			__do_softirq();
+			__do_softirq(SOFTIRQ_MASK_ALL);
 		local_irq_enable();
 		local_unlock(local_softirq_lock);
 		WARN_ON(current->softirq_nestcnt != 1);
@@ -453,7 +484,7 @@ EXPORT_SYMBOL(in_serving_softirq);
  * Called with bh and local interrupts disabled. For full RT cpu must
  * be pinned.
  */
-static void __do_softirq_common(int need_rcu_bh_qs)
+static void __do_softirq_common(u32 mask, int need_rcu_bh_qs)
 {
 	u32 pending = local_softirq_pending();
 	int cpu = smp_processor_id();
@@ -461,17 +492,14 @@ static void __do_softirq_common(int need
 	current->softirq_nestcnt++;
 
 	/* Reset the pending bitmask before enabling irqs */
-	set_softirq_pending(0);
+	set_softirq_pending(pending & ~mask);
 
 	__get_cpu_var(local_softirq_runner) = current;
 
 	lockdep_softirq_enter();
 
-	handle_pending_softirqs(pending, cpu, need_rcu_bh_qs);
-
-	pending = local_softirq_pending();
-	if (pending)
-		wakeup_softirqd();
+	handle_pending_softirqs(pending & mask, cpu, need_rcu_bh_qs);
+	wakeup_softirqd();
 
 	lockdep_softirq_exit();
 	__get_cpu_var(local_softirq_runner) = NULL;
@@ -481,6 +509,8 @@ static void __do_softirq_common(int need
 
 static int __thread_do_softirq(int cpu)
 {
+	u32 mask;
+
 	/*
 	 * Prevent the current cpu from going offline.
 	 * pin_current_cpu() can reenable preemption and block on the
@@ -498,6 +528,8 @@ static int __thread_do_softirq(int cpu)
 		unpin_current_cpu();
 		return -1;
 	}
+
+	mask = task_sirq_mask(current);
 	preempt_enable();
 	local_lock(local_softirq_lock);
 	local_irq_disable();
@@ -505,8 +537,8 @@ static int __thread_do_softirq(int cpu)
 	 * We cannot switch stacks on RT as we want to be able to
 	 * schedule!
 	 */
-	if (local_softirq_pending())
-		__do_softirq_common(cpu >= 0);
+	if (local_softirq_pending() & mask)
+		__do_softirq_common(mask, cpu >= 0);
 	local_unlock(local_softirq_lock);
 	unpin_current_cpu();
 	preempt_disable();
@@ -1005,24 +1037,59 @@ void __init softirq_init(void)
 	open_softirq(HI_SOFTIRQ, tasklet_hi_action);
 }
 
+/* Drop priority and yield() if we may starve other sirq threads. */
+static int ksoftirqd_cond_yield(struct task_struct *p, u32 mask)
+{
+#ifdef CONFIG_SIRQ_FORCED_THREADING
+	u32 pending = local_softirq_pending();
+	struct sched_param param;
+	int prio, policy = p->policy;
+
+	if (!pending || !(pending & mask))
+		return 1;
+
+	if (policy != SCHED_FIFO && policy != SCHED_RR)
+		return 0;
+
+	prio = p->rt_priority;
+
+	if (prio != MAX_RT_PRIO-1) {
+		param.sched_priority = MAX_RT_PRIO-1;
+		sched_setscheduler(p, policy, &param);
+	}
+	yield();
+	if (p->policy == policy && p->rt_priority == MAX_RT_PRIO-1) {
+		param.sched_priority = prio;
+		sched_setscheduler(p, policy, &param);
+	}
+
+	return 1;
+#else
+	return 0;
+#endif
+}
+
 static int run_ksoftirqd(void * __bind_cpu)
 {
+	u32 mask = task_sirq_mask(current);
+
 	ksoftirqd_set_sched_params();
 
 	set_current_state(TASK_INTERRUPTIBLE);
 
 	while (!kthread_should_stop()) {
 		preempt_disable();
-		if (!local_softirq_pending())
+		if (!(local_softirq_pending() & mask))
 			schedule_preempt_disabled();
 
 		__set_current_state(TASK_RUNNING);
 
-		while (local_softirq_pending()) {
+		while (local_softirq_pending() & mask) {
 			if (ksoftirqd_do_softirq((long) __bind_cpu))
 				goto wait_to_die;
 			__preempt_enable_no_resched();
-			cond_resched();
+			if (!ksoftirqd_cond_yield(current, mask))
+				cond_resched();
 			preempt_disable();
 			rcu_note_context_switch((long)__bind_cpu);
 		}
@@ -1108,41 +1175,58 @@ static int __cpuinit cpu_callback(struct
 				  unsigned long action,
 				  void *hcpu)
 {
-	int hotcpu = (unsigned long)hcpu;
+	int hotcpu = (unsigned long)hcpu, i;
 	struct task_struct *p;
 
 	switch (action & ~CPU_TASKS_FROZEN) {
 	case CPU_UP_PREPARE:
-		p = kthread_create_on_node(run_ksoftirqd,
+		for (i = 0; i < NR_SOFTIRQ_THREADS; i++) {
+			per_cpu(ksoftirqd, hotcpu)[i].mask = SOFTIRQ_MASK_ALL;
+			per_cpu(ksoftirqd, hotcpu)[i].tsk = NULL;
+		}
+		for (i = 0; i < NR_SOFTIRQ_THREADS; i++) {
+			p = kthread_create_on_node(run_ksoftirqd,
 					   hcpu,
 					   cpu_to_node(hotcpu),
-					   "ksoftirqd/%d", hotcpu);
-		if (IS_ERR(p)) {
-			printk("ksoftirqd for %i failed\n", hotcpu);
-			return notifier_from_errno(PTR_ERR(p));
+					   "%s/%d", softirq_to_thread_name[i], hotcpu);
+			if (IS_ERR(p)) {
+				printk(KERN_ERR "%s/%d failed\n",
+					   softirq_to_thread_name[i], hotcpu);
+				return notifier_from_errno(PTR_ERR(p));
+			}
+			kthread_bind(p, hotcpu);
+			per_cpu(ksoftirqd, hotcpu)[i].tsk = p;
+			if (NR_SOFTIRQ_THREADS > 1)
+				per_cpu(ksoftirqd, hotcpu)[i].mask = 1 << i;
 		}
-		kthread_bind(p, hotcpu);
-  		per_cpu(ksoftirqd, hotcpu) = p;
  		break;
 	case CPU_ONLINE:
-		wake_up_process(per_cpu(ksoftirqd, hotcpu));
+		for (i = 0; i < NR_SOFTIRQ_THREADS; i++)
+			wake_up_process(per_cpu(ksoftirqd, hotcpu)[i].tsk);
 		break;
 #ifdef CONFIG_HOTPLUG_CPU
-	case CPU_UP_CANCELED:
-		if (!per_cpu(ksoftirqd, hotcpu))
-			break;
-		/* Unbind so it can run.  Fall thru. */
-		kthread_bind(per_cpu(ksoftirqd, hotcpu),
-			     cpumask_any(cpu_online_mask));
+	case CPU_UP_CANCELED: {
+		for (i = 0; i < NR_SOFTIRQ_THREADS; i++) {
+			p = per_cpu(ksoftirqd, hotcpu)[i].tsk;
+			if (!p)
+				continue;
+			/* Unbind so it can run. */
+			kthread_bind(p, cpumask_any(cpu_online_mask));
+		}
+	}
 	case CPU_POST_DEAD: {
 		static const struct sched_param param = {
 			.sched_priority = MAX_RT_PRIO-1
 		};
 
-		p = per_cpu(ksoftirqd, hotcpu);
-		per_cpu(ksoftirqd, hotcpu) = NULL;
-		sched_setscheduler_nocheck(p, SCHED_FIFO, &param);
-		kthread_stop(p);
+		for (i = 0; i < NR_SOFTIRQ_THREADS; i++) {
+			p = per_cpu(ksoftirqd, hotcpu)[i].tsk;
+			per_cpu(ksoftirqd, hotcpu)[i].tsk = NULL;
+			if (!p)
+				continue;
+			sched_setscheduler_nocheck(p, SCHED_FIFO, &param);
+			kthread_stop(p);
+		}
 		takeover_tasklets(hotcpu);
 		break;
 	}




* Re: [PATCH RT] tasklet/rt: Prevent tasklets from going into infinite spin in RT
  2011-11-30  3:50 ` Clark Williams
@ 2011-11-30  6:24   ` Mike Kravetz
  2011-11-30 15:56     ` Mike Kravetz
  0 siblings, 1 reply; 14+ messages in thread
From: Mike Kravetz @ 2011-11-30  6:24 UTC (permalink / raw)
  To: Clark Williams
  Cc: Steven Rostedt, LKML, RT, Thomas Gleixner, Ingo Molnar,
	Luis Claudio R. Goncalves

On Tue, Nov 29, 2011 at 09:50:39PM -0600, Clark Williams wrote:
> On Tue, 29 Nov 2011 20:55:20 -0500
> Steven Rostedt <rostedt@goodmis.org> wrote:
> 
> > Ingo,
> > 
> > I forward ported this code from 2.6.33.9-rt31, but I think you were the
> > original author, as I found most of this code in the
> > "tasklet-redesign.patch" from my broken out 2.6.24-rt patches. I
> > committed it into my git tree (stable-rt) under your name, and added the
> > Signed-off-by that you had in that patch, if you have any objections,
> > please let me know. This patch should never see mainline, but it will
> > probably be going into the -rt branch. I wrote up this change log, if
> > there's something you don't like in it, let me know and I'll fix it.
> > 
> > Luis and Clark (I love saying that),
> 
> No matter how many times you say it, I'm still not trekking across the
> USA with Luis. :)
> 
> > I booted this patch against 3.0-rt stable, and it didn't crash ;)
> > Could you apply it and see if it fixes the hang that you've been seeing.
> 
> Yeah, since Luis is a couple of time zones ahead of us, maybe it'll all
> be fixed when we get up in the morning.

I've got it running on a system here that frequently encountered the
condition.  I will let it perform some automated testing while I sleep.
Should have results by morning (west coast US).

-- 
Mike



* Re: [PATCH RT] tasklet/rt: Prevent tasklets from going into infinite spin in RT
  2011-11-30  4:32 ` Mike Galbraith
@ 2011-11-30  7:34   ` Mike Galbraith
  2011-11-30 10:24   ` Thomas Gleixner
  1 sibling, 0 replies; 14+ messages in thread
From: Mike Galbraith @ 2011-11-30  7:34 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: LKML, RT, Thomas Gleixner, Ingo Molnar,
	Luis Claudio R. Goncalves, Clark Williams

On Wed, 2011-11-30 at 05:32 +0100, Mike Galbraith wrote:
> On Tue, 2011-11-29 at 20:55 -0500, Steven Rostedt wrote:

> > I booted this patch against 3.0-rt stable, and it didn't crash ;)
> > Could you apply it and see if it fixes the hang that you've been seeing.
> 
> I'll most certainly be testing it too.

Tested-by: Mike Galbraith <efault@gmx.de>

The other me tested it too.  Coincidentally, he was having the same darn
problem with a fat enterprise config (on the same x3550 M3 too).

Tested-by: Mike Galbraith <mgalbraith@suse.de>

Thanks, I can now turn x3550 (by Pratt & Whitney) off for a bit.

	-Mike



* Re: [PATCH RT] tasklet/rt: Prevent tasklets from going into infinite spin in RT
  2011-11-30  1:55 [PATCH RT] tasklet/rt: Prevent tasklets from going into infinite spin in RT Steven Rostedt
@ 2011-11-30  8:46   ` Tim Sander
  2011-11-30  4:32 ` Mike Galbraith
                     ` (2 subsequent siblings)
  3 siblings, 0 replies; 14+ messages in thread
From: Tim Sander @ 2011-11-30  8:46 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: LKML, RT, Thomas Gleixner, Ingo Molnar,
	Luis Claudio R. Goncalves, Clark Williams

Hi Steven
> I booted this patch against 3.0-rt stable, and it didn't crash ;)
> Could you apply it and see if it fixes the hang that you've been seeing.
Unfortunately I am also experiencing these problems with ksoftirqd running wild.

So I am also going to test this patch and see if it fixes our problem. It seems
as if this bug is much more probable after the UBIFS update and when
"sched: RT throttling activated" hits.

Best regards
Tim

PS: Steven, I just forgot the cc list, so you got this mail twice, sorry.


* Re: [PATCH RT] tasklet/rt: Prevent tasklets from going into infinite spin in RT
  2011-11-30  4:32 ` Mike Galbraith
  2011-11-30  7:34   ` Mike Galbraith
@ 2011-11-30 10:24   ` Thomas Gleixner
  2011-11-30 12:45     ` Mike Galbraith
  2011-12-01  9:12     ` [PATCH RT] resurrect softirq threads for RT_FULL Mike Galbraith
  1 sibling, 2 replies; 14+ messages in thread
From: Thomas Gleixner @ 2011-11-30 10:24 UTC (permalink / raw)
  To: Mike Galbraith
  Cc: Steven Rostedt, LKML, RT, Ingo Molnar, Luis Claudio R. Goncalves,
	Clark Williams

On Wed, 30 Nov 2011, Mike Galbraith wrote:

> On Tue, 2011-11-29 at 20:55 -0500, Steven Rostedt wrote:
> > Ingo,
> > 
> > I forward ported this code from 2.6.33.9-rt31, but I think you were the
> > original author, as I found most of this code in the
> > "tasklet-redesign.patch" from my broken out 2.6.24-rt patches. I
> > committed it into my git tree (stable-rt) under your name, and added the
> > Signed-off-by that you had in that patch, if you have any objections,
> > please let me know. This patch should never see mainline, but it will
> > probably be going into the -rt branch. I wrote up this change log, if
> > there's something you don't like in it, let me know and I'll fix it.
> 
> I'm oh so happy to see this.  I've been going nuts trying to figure out
> why the heck 33-rt doesn't go bonkers, but 30+ rt does.
> 
> > Luis and Clark (I love saying that),
> > 
> > I booted this patch against 3.0-rt stable, and it didn't crash ;)
> > Could you apply it and see if it fixes the hang that you've been seeing.
> 
> I'll most certainly be testing it too.  With the below, and the
> conditional yield thingy disabled, all I have to do is boot x3550 M3
> box, and it'll hang very frequently, but not always, with sirq-tasklet
> going stark raving mad.  Yielding fix^Wmakes it not do the bad thing.
> 
> (somewhat less disgusting version of sirq threads patch;)
> 
> sched, rt: resurrect softirq threads for RT_FULL
> 
> Signed-off-by: Mike Galbraith <efault@gmx.de>
> ---
>  include/linux/interrupt.h |   46 ++++++++++
>  kernel/irq/Kconfig        |    7 +
>  kernel/sched.c            |    4 
>  kernel/softirq.c          |  194 ++++++++++++++++++++++++++++++++--------------
>  4 files changed, 191 insertions(+), 60 deletions(-)
> 
> Index: linux-3.2-rt/kernel/irq/Kconfig
> ===================================================================
> --- linux-3.2-rt.orig/kernel/irq/Kconfig
> +++ linux-3.2-rt/kernel/irq/Kconfig
> @@ -60,6 +60,13 @@ config IRQ_DOMAIN
>  config IRQ_FORCED_THREADING
>         bool
>  
> +# Support forced sirq threading
> +config SIRQ_FORCED_THREADING
> +       bool "Forced Soft IRQ threading"
> +       depends on PREEMPT_RT_FULL
> +	help
> +	  Split ksoftirqd into per SOFTIRQ threads
> +
>  config SPARSE_IRQ
>  	bool "Support sparse irq numbering"
>  	depends on HAVE_SPARSE_IRQ
> Index: linux-3.2-rt/include/linux/interrupt.h
> ===================================================================
> --- linux-3.2-rt.orig/include/linux/interrupt.h
> +++ linux-3.2-rt/include/linux/interrupt.h
> @@ -442,6 +442,9 @@ enum
>  	NR_SOFTIRQS
>  };
>  
> +/* Update when adding new softirqs. */
> +#define SOFTIRQ_MASK_ALL 0x3ff
> +
>  /* map softirq index to softirq name. update 'softirq_to_name' in
>   * kernel/softirq.c when adding a new softirq.
>   */
> @@ -457,10 +460,16 @@ struct softirq_action
>  };
>  
>  #ifndef CONFIG_PREEMPT_RT_FULL
> +#define NR_SOFTIRQ_THREADS 1
>  asmlinkage void do_softirq(void);
>  asmlinkage void __do_softirq(void);
>  static inline void thread_do_softirq(void) { do_softirq(); }
>  #else
> +#ifdef CONFIG_SIRQ_FORCED_THREADING
> +#define NR_SOFTIRQ_THREADS NR_SOFTIRQS
> +#else
> +#define NR_SOFTIRQ_THREADS 1
> +#endif
>  extern void thread_do_softirq(void);
>  #endif
>  
> @@ -486,12 +495,43 @@ extern void softirq_check_pending_idle(v
>   */
>  DECLARE_PER_CPU(struct list_head [NR_SOFTIRQS], softirq_work_list);
>  
> -DECLARE_PER_CPU(struct task_struct *, ksoftirqd);
> +struct softirqdata {
> +	int			mask;
> +	struct task_struct	*tsk;
> +};
> +
> +DECLARE_PER_CPU(struct softirqdata [NR_SOFTIRQ_THREADS], ksoftirqd);
> +
> +static inline bool this_cpu_ksoftirqd(struct task_struct *p)
> +{
> +	int i;
> +
> +	for (i = 0; i < NR_SOFTIRQ_THREADS; i++) {
> +		if (p == __get_cpu_var(ksoftirqd)[i].tsk)
> +			return true;

You are not serious about that loop, are you ?

> +	}
>  
> -static inline struct task_struct *this_cpu_ksoftirqd(void)
> +	return false;
> +}
> +
> +#ifdef CONFIG_PREEMPT_RT_FULL
> +static inline int task_sirq_mask(struct task_struct *p)
> +{
> +	int i;
> +
> +	for (i = 0; i < NR_SOFTIRQ_THREADS; i++) {
> +		if (p == __get_cpu_var(ksoftirqd)[i].tsk)
> +			return __get_cpu_var(ksoftirqd)[i].mask;

Looks like you are

> @@ -131,11 +155,18 @@ void softirq_check_pending_idle(void)
>   */
>  static void wakeup_softirqd(void)
>  {
> -	/* Interrupts are disabled: no need to stop preemption */
> -	struct task_struct *tsk = __this_cpu_read(ksoftirqd);
> +	struct task_struct *tsk;
> +	u32 pending = local_softirq_pending(), mask, i;
>  
> -	if (tsk && tsk->state != TASK_RUNNING)
> -		wake_up_process(tsk);
> +	/* Interrupts are disabled: no need to stop preemption */
> +	for (i = 0; pending && i < NR_SOFTIRQ_THREADS; i++) {
> +		mask = __get_cpu_var(ksoftirqd)[i].mask;
> +		if (!(pending & mask))
> +			continue;
> +		tsk = __get_cpu_var(ksoftirqd)[i].tsk;
> +		if (tsk && tsk->state != TASK_RUNNING)
> +			wake_up_process(tsk);
> +	}
>  }

Damned serious it seems. :)

I was looking into that as well, though I did not want to inflict it
on 3.0 at this point.

Thanks,

	tglx


* Re: [PATCH RT] tasklet/rt: Prevent tasklets from going into infinite spin in RT
  2011-11-30 10:24   ` Thomas Gleixner
@ 2011-11-30 12:45     ` Mike Galbraith
  2011-11-30 13:04       ` Mike Galbraith
  2011-12-01  9:12     ` [PATCH RT] resurrect softirq threads for RT_FULL Mike Galbraith
  1 sibling, 1 reply; 14+ messages in thread
From: Mike Galbraith @ 2011-11-30 12:45 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: Steven Rostedt, LKML, RT, Ingo Molnar, Luis Claudio R. Goncalves,
	Clark Williams

On Wed, 2011-11-30 at 11:24 +0100, Thomas Gleixner wrote:
> On Wed, 30 Nov 2011, Mike Galbraith wrote:

> > @@ -486,12 +495,43 @@ extern void softirq_check_pending_idle(v
> >   */
> >  DECLARE_PER_CPU(struct list_head [NR_SOFTIRQS], softirq_work_list);
> >  
> > -DECLARE_PER_CPU(struct task_struct *, ksoftirqd);
> > +struct softirqdata {
> > +	int			mask;
> > +	struct task_struct	*tsk;
> > +};
> > +
> > +DECLARE_PER_CPU(struct softirqdata [NR_SOFTIRQ_THREADS], ksoftirqd);
> > +
> > +static inline bool this_cpu_ksoftirqd(struct task_struct *p)
> > +{
> > +	int i;
> > +
> > +	for (i = 0; i < NR_SOFTIRQ_THREADS; i++) {
> > +		if (p == __get_cpu_var(ksoftirqd)[i].tsk)
> > +			return true;
> 
> You are not serious about that loop, are you ?

Well, it's a short loop, especially when NR_SOFTIRQ_THREADS = 1 (poof).

> > @@ -131,11 +155,18 @@ void softirq_check_pending_idle(void)
> >   */
> >  static void wakeup_softirqd(void)
> >  {
> > -	/* Interrupts are disabled: no need to stop preemption */
> > -	struct task_struct *tsk = __this_cpu_read(ksoftirqd);
> > +	struct task_struct *tsk;
> > +	u32 pending = local_softirq_pending(), mask, i;
> >  
> > -	if (tsk && tsk->state != TASK_RUNNING)
> > -		wake_up_process(tsk);
> > +	/* Interrupts are disabled: no need to stop preemption */
> > +	for (i = 0; pending && i < NR_SOFTIRQ_THREADS; i++) {
> > +		mask = __get_cpu_var(ksoftirqd)[i].mask;
> > +		if (!(pending & mask))
> > +			continue;
> > +		tsk = __get_cpu_var(ksoftirqd)[i].tsk;
> > +		if (tsk && tsk->state != TASK_RUNNING)
> > +			wake_up_process(tsk);
> > +	}
> >  }
> 
> Damned serious it seems. :)

Deadly serious :)

For the problem at hand (timer wakeups), NR_SOFTIRQ_THREADS could be 2.
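
Purely as an illustration of that remark (not part of Mike's patch, and
assuming NR_SOFTIRQ_THREADS were defined as 2 in kernel/softirq.c with the
softirqdata array and SOFTIRQ_MASK_ALL from the patch above), the split could
give the timer softirqs their own thread and leave everything else to a
second one; the mask names and helper below are made up:

/* hypothetical two-way split for NR_SOFTIRQ_THREADS == 2 */
#define SIRQ_TIMER_MASK	((1 << TIMER_SOFTIRQ) | (1 << HRTIMER_SOFTIRQ))
#define SIRQ_OTHER_MASK	(SOFTIRQ_MASK_ALL & ~SIRQ_TIMER_MASK)

static void demo_init_sirq_masks(int cpu)
{
	/* thread 0 handles timer wakeups, thread 1 handles the rest */
	per_cpu(ksoftirqd, cpu)[0].mask = SIRQ_TIMER_MASK;
	per_cpu(ksoftirqd, cpu)[1].mask = SIRQ_OTHER_MASK;
}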

> I was looking into that as well, though I did not want to inflict it
> on 3.0 at this point.
> 
> Thanks,
> 
> 	tglx




* Re: [PATCH RT] tasklet/rt: Prevent tasklets from going into infinite spin in RT
  2011-11-30 12:45     ` Mike Galbraith
@ 2011-11-30 13:04       ` Mike Galbraith
  0 siblings, 0 replies; 14+ messages in thread
From: Mike Galbraith @ 2011-11-30 13:04 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: Steven Rostedt, LKML, RT, Ingo Molnar, Luis Claudio R. Goncalves,
	Clark Williams

On Wed, 2011-11-30 at 13:45 +0100, Mike Galbraith wrote:
> On Wed, 2011-11-30 at 11:24 +0100, Thomas Gleixner wrote:
> > On Wed, 30 Nov 2011, Mike Galbraith wrote:

> > > @@ -131,11 +155,18 @@ void softirq_check_pending_idle(void)
> > >   */
> > >  static void wakeup_softirqd(void)
> > >  {
> > > -	/* Interrupts are disabled: no need to stop preemption */
> > > -	struct task_struct *tsk = __this_cpu_read(ksoftirqd);
> > > +	struct task_struct *tsk;
> > > +	u32 pending = local_softirq_pending(), mask, i;
> > >  
> > > -	if (tsk && tsk->state != TASK_RUNNING)
> > > -		wake_up_process(tsk);
> > > +	/* Interrupts are disabled: no need to stop preemption */
> > > +	for (i = 0; pending && i < NR_SOFTIRQ_THREADS; i++) {
> > > +		mask = __get_cpu_var(ksoftirqd)[i].mask;
> > > +		if (!(pending & mask))
> > > +			continue;
> > > +		tsk = __get_cpu_var(ksoftirqd)[i].tsk;
> > > +		if (tsk && tsk->state != TASK_RUNNING)
> > > +			wake_up_process(tsk);
> > > +	}
> > >  }
> > 
> > Damned serious it seems. :)

'course here I should have just used the busy bits directly.

	-Mike



* Re: [PATCH RT] tasklet/rt: Prevent tasklets from going into infinite spin in RT
  2011-11-30  1:55 [PATCH RT] tasklet/rt: Prevent tasklets from going into infinite spin in RT Steven Rostedt
@ 2011-11-30 14:24   ` Luis Claudio R. Goncalves
  2011-11-30  4:32 ` Mike Galbraith
                     ` (2 subsequent siblings)
  3 siblings, 0 replies; 14+ messages in thread
From: Luis Claudio R. Goncalves @ 2011-11-30 14:24 UTC (permalink / raw)
  To: Steven Rostedt; +Cc: LKML, RT, Thomas Gleixner, Ingo Molnar, Clark Williams

On Tue, Nov 29, 2011 at 08:55:20PM -0500, Steven Rostedt wrote:
...
| I booted this patch against 3.0-rt stable, and it didn't crash ;)
| Could you apply it and see if it fixes the hang that you've been seeing.

Steven,

I have a test box that would present the issue every 2 or 3 reboots,
consistently. Using your patch I have already rebooted the box 30 times
successfully. And the test keeps going on... I feel confident to add my:

Tested-by: Luis Claudio R. Gonçalves <lgoncalv@redhat.com>

Cheers,
Luis



* Re: [PATCH RT] tasklet/rt: Prevent tasklets from going into infinite spin in RT
  2011-11-30  6:24   ` Mike Kravetz
@ 2011-11-30 15:56     ` Mike Kravetz
  0 siblings, 0 replies; 14+ messages in thread
From: Mike Kravetz @ 2011-11-30 15:56 UTC (permalink / raw)
  To: Clark Williams
  Cc: Steven Rostedt, LKML, RT, Thomas Gleixner, Ingo Molnar,
	Luis Claudio R. Goncalves

On Tue, Nov 29, 2011 at 10:24:31PM -0800, Mike Kravetz wrote:
> On Tue, Nov 29, 2011 at 09:50:39PM -0600, Clark Williams wrote:
> > On Tue, 29 Nov 2011 20:55:20 -0500
> > Steven Rostedt <rostedt@goodmis.org> wrote:
> > 
> > > Ingo,
> > > 
> > > I forward ported this code from 2.6.33.9-rt31, but I think you were the
> > > original author, as I found most of this code in the
> > > "tasklet-redesign.patch" from my broken out 2.6.24-rt patches. I
> > > committed it into my git tree (stable-rt) under your name, and added the
> > > Signed-off-by that you had in that patch, if you have any objections,
> > > please let me know. This patch should never see mainline, but it will
> > > probably be going into the -rt branch. I wrote up this change log, if
> > > there's something you don't like in it, let me know and I'll fix it.
> > > 
> > > Luis and Clark (I love saying that),
> > 
> > No matter how many times you say it, I'm still not trekking across the
> > USA with Luis. :)
> > 
> > > I booted this patch against 3.0-rt stable, and it didn't crash ;)
> > > Could you apply it and see if it fixes the hang that you've been seeing.
> > 
> > Yeah, since Luis is a couple of time zones ahead of us, maybe it'll all
> > be fixed when we get up in the morning.
> 
> I've got it running on a system here that frequently encountered the
> condition.  I will let it perform some automated testing while I sleep.
> Should have results by morning (west coast US).

133 successful test executions (just normal reboot/restart of the system).  Without
the patch, the system would hang/stall during restart more than 50% of the time.

-- 
Mike



* [PATCH RT] resurrect softirq threads for RT_FULL
  2011-11-30 10:24   ` Thomas Gleixner
  2011-11-30 12:45     ` Mike Galbraith
@ 2011-12-01  9:12     ` Mike Galbraith
  1 sibling, 0 replies; 14+ messages in thread
From: Mike Galbraith @ 2011-12-01  9:12 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: Steven Rostedt, LKML, RT, Ingo Molnar, Luis Claudio R. Goncalves,
	Clark Williams

Un-hijack thread.

On Wed, 2011-11-30 at 11:24 +0100, Thomas Gleixner wrote:
> On Wed, 30 Nov 2011, Mike Galbraith wrote:

> > @@ -486,12 +495,43 @@ extern void softirq_check_pending_idle(v
> >   */
> >  DECLARE_PER_CPU(struct list_head [NR_SOFTIRQS], softirq_work_list);
> >  
> > -DECLARE_PER_CPU(struct task_struct *, ksoftirqd);
> > +struct softirqdata {
> > +	int			mask;
> > +	struct task_struct	*tsk;
> > +};
> > +
> > +DECLARE_PER_CPU(struct softirqdata [NR_SOFTIRQ_THREADS], ksoftirqd);
> > +
> > +static inline bool this_cpu_ksoftirqd(struct task_struct *p)
> > +{
> > +	int i;
> > +
> > +	for (i = 0; i < NR_SOFTIRQ_THREADS; i++) {
> > +		if (p == __get_cpu_var(ksoftirqd)[i].tsk)
> > +			return true;
> 
> You are not serious about that loop, are you ?

After some dainbramaged removal, it might look a little better.

sched, rt: resurrect softirq threads for RT_FULL

Signed-off-by: Mike Galbraith <efault@gmx.de>
---
 include/linux/interrupt.h |   21 ++++-
 kernel/irq/Kconfig        |    7 +
 kernel/sched.c            |    4 -
 kernel/softirq.c          |  168 ++++++++++++++++++++++++++++++++--------------
 4 files changed, 145 insertions(+), 55 deletions(-)

Index: linux-3.2-rt/kernel/irq/Kconfig
===================================================================
--- linux-3.2-rt.orig/kernel/irq/Kconfig
+++ linux-3.2-rt/kernel/irq/Kconfig
@@ -60,6 +60,13 @@ config IRQ_DOMAIN
 config IRQ_FORCED_THREADING
        bool
 
+# Support forced sirq threading
+config SIRQ_FORCED_THREADING
+       bool "Forced Soft IRQ threading"
+       depends on PREEMPT_RT_FULL
+	help
+	  Split ksoftirqd into per SOFTIRQ threads
+
 config SPARSE_IRQ
 	bool "Support sparse irq numbering"
 	depends on HAVE_SPARSE_IRQ
Index: linux-3.2-rt/include/linux/interrupt.h
===================================================================
--- linux-3.2-rt.orig/include/linux/interrupt.h
+++ linux-3.2-rt/include/linux/interrupt.h
@@ -442,6 +442,9 @@ enum
 	NR_SOFTIRQS
 };
 
+/* Update when adding new softirqs. */
+#define SOFTIRQ_MASK_ALL 0x3ff
+
 /* map softirq index to softirq name. update 'softirq_to_name' in
  * kernel/softirq.c when adding a new softirq.
  */
@@ -457,10 +460,16 @@ struct softirq_action
 };
 
 #ifndef CONFIG_PREEMPT_RT_FULL
+#define NR_SOFTIRQ_THREADS 1
 asmlinkage void do_softirq(void);
 asmlinkage void __do_softirq(void);
 static inline void thread_do_softirq(void) { do_softirq(); }
 #else
+#ifdef CONFIG_SIRQ_FORCED_THREADING
+#define NR_SOFTIRQ_THREADS NR_SOFTIRQS
+#else
+#define NR_SOFTIRQ_THREADS 1
+#endif
 extern void thread_do_softirq(void);
 #endif
 
@@ -486,11 +495,17 @@ extern void softirq_check_pending_idle(v
  */
 DECLARE_PER_CPU(struct list_head [NR_SOFTIRQS], softirq_work_list);
 
-DECLARE_PER_CPU(struct task_struct *, ksoftirqd);
+struct softirqdata {
+	int			mask;
+	struct task_struct	*tsk;
+};
+
+DECLARE_PER_CPU(struct softirqdata [NR_SOFTIRQ_THREADS], ksoftirqd);
+DECLARE_PER_CPU(struct task_struct *, local_softirq_thread);
 
-static inline struct task_struct *this_cpu_ksoftirqd(void)
+static inline int task_is_ksoftirqd(struct task_struct *p)
 {
-	return this_cpu_read(ksoftirqd);
+	return p == this_cpu_read(local_softirq_thread);
 }
 
 /* Try to send a softirq to a remote cpu.  If this cannot be done, the
Index: linux-3.2-rt/kernel/sched.c
===================================================================
--- linux-3.2-rt.orig/kernel/sched.c
+++ linux-3.2-rt/kernel/sched.c
@@ -2082,7 +2082,7 @@ void account_system_vtime(struct task_st
 	 */
 	if (hardirq_count())
 		__this_cpu_add(cpu_hardirq_time, delta);
-	else if (in_serving_softirq() && curr != this_cpu_ksoftirqd())
+	else if (in_serving_softirq() && !task_is_ksoftirqd(curr))
 		__this_cpu_add(cpu_softirq_time, delta);
 
 	irq_time_write_end();
@@ -4062,7 +4062,7 @@ static void irqtime_account_process_tick
 		cpustat->irq = cputime64_add(cpustat->irq, tmp);
 	} else if (irqtime_account_si_update()) {
 		cpustat->softirq = cputime64_add(cpustat->softirq, tmp);
-	} else if (this_cpu_ksoftirqd() == p) {
+	} else if (task_is_ksoftirqd(p)) {
 		/*
 		 * ksoftirqd time do not get accounted in cpu_softirq_time.
 		 * So, we have to handle it separately here.
Index: linux-3.2-rt/kernel/softirq.c
===================================================================
--- linux-3.2-rt.orig/kernel/softirq.c
+++ linux-3.2-rt/kernel/softirq.c
@@ -56,13 +56,32 @@ EXPORT_SYMBOL(irq_stat);
 
 static struct softirq_action softirq_vec[NR_SOFTIRQS] __cacheline_aligned_in_smp;
 
-DEFINE_PER_CPU(struct task_struct *, ksoftirqd);
+DEFINE_PER_CPU(struct softirqdata[NR_SOFTIRQ_THREADS], ksoftirqd);
+DEFINE_PER_CPU(struct task_struct *, local_softirq_thread);
 
 char *softirq_to_name[NR_SOFTIRQS] = {
 	"HI", "TIMER", "NET_TX", "NET_RX", "BLOCK", "BLOCK_IOPOLL",
 	"TASKLET", "SCHED", "HRTIMER", "RCU"
 };
 
+static const char *softirq_to_thread_name [] =
+{
+#ifdef CONFIG_SIRQ_FORCED_THREADING
+	[HI_SOFTIRQ]		= "sirq-high",
+	[TIMER_SOFTIRQ]		= "sirq-timer",
+	[NET_TX_SOFTIRQ]	= "sirq-net-tx",
+	[NET_RX_SOFTIRQ]	= "sirq-net-rx",
+	[BLOCK_SOFTIRQ]		= "sirq-blk",
+	[BLOCK_IOPOLL_SOFTIRQ]	= "sirq-blk-pol",
+	[TASKLET_SOFTIRQ]	= "sirq-tasklet",
+	[SCHED_SOFTIRQ]		= "sirq-sched",
+	[HRTIMER_SOFTIRQ]	= "sirq-hrtimer",
+	[RCU_SOFTIRQ]		= "sirq-rcu",
+#else
+	[HI_SOFTIRQ]		= "ksoftirqd",
+#endif
+};
+
 #ifdef CONFIG_NO_HZ
 # ifdef CONFIG_PREEMPT_RT_FULL
 /*
@@ -78,15 +97,23 @@ char *softirq_to_name[NR_SOFTIRQS] = {
 void softirq_check_pending_idle(void)
 {
 	static int rate_limit;
-	u32 warnpending = 0, pending = local_softirq_pending();
+	u32 pending = local_softirq_pending(), mask = pending;
+	int i = 0;
 
 	if (rate_limit >= 10)
 		return;
 
-	if (pending) {
+	for (i = 0; pending && i < NR_SOFTIRQ_THREADS; i++) {
 		struct task_struct *tsk;
 
-		tsk = __get_cpu_var(ksoftirqd);
+		if (NR_SOFTIRQ_THREADS > 1) {
+			mask = 1 << i;
+
+			if (!(pending & mask))
+				continue;
+		}
+
+		tsk = __get_cpu_var(ksoftirqd)[i].tsk;
 		/*
 		 * The wakeup code in rtmutex.c wakes up the task
 		 * _before_ it sets pi_blocked_on to NULL under
@@ -95,13 +122,13 @@ void softirq_check_pending_idle(void)
 		 */
 		raw_spin_lock(&tsk->pi_lock);
 
-		if (!tsk->pi_blocked_on && !(tsk->state == TASK_RUNNING))
-			warnpending = 1;
+		if (tsk->pi_blocked_on || tsk->state == TASK_RUNNING)
+			pending &= ~mask;
 
 		raw_spin_unlock(&tsk->pi_lock);
 	}
 
-	if (warnpending) {
+	if (pending) {
 		printk(KERN_ERR "NOHZ: local_softirq_pending %02x\n",
 		       pending);
 		rate_limit++;
@@ -132,11 +159,17 @@ void softirq_check_pending_idle(void)
  */
 static void wakeup_softirqd(void)
 {
-	/* Interrupts are disabled: no need to stop preemption */
-	struct task_struct *tsk = __this_cpu_read(ksoftirqd);
+	struct task_struct *tsk;
+	u32 pending = local_softirq_pending(), i;
 
-	if (tsk && tsk->state != TASK_RUNNING)
-		wake_up_process(tsk);
+	/* Interrupts are disabled: no need to stop preemption */
+	for (i = 0; pending && i < NR_SOFTIRQ_THREADS; i++) {
+		if (NR_SOFTIRQ_THREADS > 1 && !(pending & (1 << i)))
+			continue;
+		tsk = __get_cpu_var(ksoftirqd)[i].tsk;
+		if (tsk && tsk->state != TASK_RUNNING)
+			wake_up_process(tsk);
+	}
 }
 
 static void handle_pending_softirqs(u32 pending, int cpu, int need_rcu_bh_qs)
@@ -385,11 +418,11 @@ static inline void ksoftirqd_clr_sched_p
 static DEFINE_LOCAL_IRQ_LOCK(local_softirq_lock);
 static DEFINE_PER_CPU(struct task_struct *, local_softirq_runner);
 
-static void __do_softirq_common(int need_rcu_bh_qs);
+static void __do_softirq_common(u32 mask, int need_rcu_bh_qs);
 
-void __do_softirq(void)
+void __do_softirq(u32 mask)
 {
-	__do_softirq_common(0);
+	__do_softirq_common(mask, 0);
 }
 
 void __init softirq_early_init(void)
@@ -415,7 +448,7 @@ void local_bh_enable(void)
 
 		local_irq_disable();
 		if (local_softirq_pending())
-			__do_softirq();
+			__do_softirq(SOFTIRQ_MASK_ALL);
 		local_irq_enable();
 		local_unlock(local_softirq_lock);
 		WARN_ON(current->softirq_nestcnt != 1);
@@ -454,7 +487,7 @@ EXPORT_SYMBOL(in_serving_softirq);
  * Called with bh and local interrupts disabled. For full RT cpu must
  * be pinned.
  */
-static void __do_softirq_common(int need_rcu_bh_qs)
+static void __do_softirq_common(u32 mask, int need_rcu_bh_qs)
 {
 	u32 pending = local_softirq_pending();
 	int cpu = smp_processor_id();
@@ -462,17 +495,18 @@ static void __do_softirq_common(int need
 	current->softirq_nestcnt++;
 
 	/* Reset the pending bitmask before enabling irqs */
-	set_softirq_pending(0);
+	set_softirq_pending(pending & ~mask);
 
 	__get_cpu_var(local_softirq_runner) = current;
 
-	lockdep_softirq_enter();
+	/* Tell accounting that we're a softirq thread */
+	if (NR_SOFTIRQ_THREADS > 1 && !need_rcu_bh_qs)
+		__get_cpu_var(local_softirq_thread) = current;
 
-	handle_pending_softirqs(pending, cpu, need_rcu_bh_qs);
+	lockdep_softirq_enter();
 
-	pending = local_softirq_pending();
-	if (pending)
-		wakeup_softirqd();
+	handle_pending_softirqs(pending & mask, cpu, need_rcu_bh_qs);
+	wakeup_softirqd();
 
 	lockdep_softirq_exit();
 	__get_cpu_var(local_softirq_runner) = NULL;
@@ -480,7 +514,7 @@ static void __do_softirq_common(int need
 	current->softirq_nestcnt--;
 }
 
-static int __thread_do_softirq(int cpu)
+static int __thread_do_softirq(u32 mask, int cpu)
 {
 	/*
 	 * Prevent the current cpu from going offline.
@@ -506,8 +540,8 @@ static int __thread_do_softirq(int cpu)
 	 * We cannot switch stacks on RT as we want to be able to
 	 * schedule!
 	 */
-	if (local_softirq_pending())
-		__do_softirq_common(cpu >= 0);
+	if (local_softirq_pending() & mask)
+		__do_softirq_common(mask, cpu >= 0);
 	local_unlock(local_softirq_lock);
 	unpin_current_cpu();
 	preempt_disable();
@@ -522,14 +556,14 @@ void thread_do_softirq(void)
 {
 	if (!in_serving_softirq()) {
 		preempt_disable();
-		__thread_do_softirq(-1);
+		__thread_do_softirq(SOFTIRQ_MASK_ALL, -1);
 		preempt_enable();
 	}
 }
 
-static int ksoftirqd_do_softirq(int cpu)
+static int ksoftirqd_do_softirq(u32 mask, int cpu)
 {
-	return __thread_do_softirq(cpu);
+	return __thread_do_softirq(mask, cpu);
 }
 
 static inline void local_bh_disable_nort(void) { }
@@ -1097,21 +1131,38 @@ void tasklet_unlock_wait(struct tasklet_
 EXPORT_SYMBOL(tasklet_unlock_wait);
 #endif
 
+static inline int ksoftirqd_mask(struct task_struct *p)
+{
+#ifdef CONFIG_SIRQ_FORCED_THREADING
+	int i;
+
+	for (i = 0; i < NR_SOFTIRQ_THREADS; i++) {
+		if (p == __get_cpu_var(ksoftirqd)[i].tsk)
+			return __get_cpu_var(ksoftirqd)[i].mask;
+	}
+
+#endif
+	return SOFTIRQ_MASK_ALL;
+}
+
 static int run_ksoftirqd(void * __bind_cpu)
 {
+	u32 mask = ksoftirqd_mask(current);
+
 	ksoftirqd_set_sched_params();
+	this_cpu_write(local_softirq_thread, current);
 
 	set_current_state(TASK_INTERRUPTIBLE);
 
 	while (!kthread_should_stop()) {
 		preempt_disable();
-		if (!local_softirq_pending())
+		if (!(local_softirq_pending() & mask))
 			schedule_preempt_disabled();
 
 		__set_current_state(TASK_RUNNING);
 
-		while (local_softirq_pending()) {
-			if (ksoftirqd_do_softirq((long) __bind_cpu))
+		while (local_softirq_pending() & mask) {
+			if (ksoftirqd_do_softirq(mask, (long) __bind_cpu))
 				goto wait_to_die;
 			__preempt_enable_no_resched();
 			cond_resched();
@@ -1200,41 +1251,58 @@ static int __cpuinit cpu_callback(struct
 				  unsigned long action,
 				  void *hcpu)
 {
-	int hotcpu = (unsigned long)hcpu;
+	int hotcpu = (unsigned long)hcpu, i;
 	struct task_struct *p;
 
 	switch (action & ~CPU_TASKS_FROZEN) {
 	case CPU_UP_PREPARE:
-		p = kthread_create_on_node(run_ksoftirqd,
+		for (i = 0; i < NR_SOFTIRQ_THREADS; i++) {
+			per_cpu(ksoftirqd, hotcpu)[i].mask = SOFTIRQ_MASK_ALL;
+			per_cpu(ksoftirqd, hotcpu)[i].tsk = NULL;
+		}
+		for (i = 0; i < NR_SOFTIRQ_THREADS; i++) {
+			p = kthread_create_on_node(run_ksoftirqd,
 					   hcpu,
 					   cpu_to_node(hotcpu),
-					   "ksoftirqd/%d", hotcpu);
-		if (IS_ERR(p)) {
-			printk("ksoftirqd for %i failed\n", hotcpu);
-			return notifier_from_errno(PTR_ERR(p));
+					   "%s/%d", softirq_to_thread_name[i], hotcpu);
+			if (IS_ERR(p)) {
+				printk(KERN_ERR "%s/%d failed\n",
+					   softirq_to_thread_name[i], hotcpu);
+				return notifier_from_errno(PTR_ERR(p));
+			}
+			kthread_bind(p, hotcpu);
+			per_cpu(ksoftirqd, hotcpu)[i].tsk = p;
+			if (NR_SOFTIRQ_THREADS > 1)
+				per_cpu(ksoftirqd, hotcpu)[i].mask = 1 << i;
 		}
-		kthread_bind(p, hotcpu);
-  		per_cpu(ksoftirqd, hotcpu) = p;
  		break;
 	case CPU_ONLINE:
-		wake_up_process(per_cpu(ksoftirqd, hotcpu));
+		for (i = 0; i < NR_SOFTIRQ_THREADS; i++)
+			wake_up_process(per_cpu(ksoftirqd, hotcpu)[i].tsk);
 		break;
 #ifdef CONFIG_HOTPLUG_CPU
-	case CPU_UP_CANCELED:
-		if (!per_cpu(ksoftirqd, hotcpu))
-			break;
-		/* Unbind so it can run.  Fall thru. */
-		kthread_bind(per_cpu(ksoftirqd, hotcpu),
-			     cpumask_any(cpu_online_mask));
+	case CPU_UP_CANCELED: {
+		for (i = 0; i < NR_SOFTIRQ_THREADS; i++) {
+			p = per_cpu(ksoftirqd, hotcpu)[i].tsk;
+			if (!p)
+				continue;
+			/* Unbind so it can run. */
+			kthread_bind(p, cpumask_any(cpu_online_mask));
+		}
+	}
 	case CPU_POST_DEAD: {
 		static const struct sched_param param = {
 			.sched_priority = MAX_RT_PRIO-1
 		};
 
-		p = per_cpu(ksoftirqd, hotcpu);
-		per_cpu(ksoftirqd, hotcpu) = NULL;
-		sched_setscheduler_nocheck(p, SCHED_FIFO, &param);
-		kthread_stop(p);
+		for (i = 0; i < NR_SOFTIRQ_THREADS; i++) {
+			p = per_cpu(ksoftirqd, hotcpu)[i].tsk;
+			per_cpu(ksoftirqd, hotcpu)[i].tsk = NULL;
+			if (!p)
+				continue;
+			sched_setscheduler_nocheck(p, SCHED_FIFO, &param);
+			kthread_stop(p);
+		}
 		takeover_tasklets(hotcpu);
 		break;
 	}



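For readers skimming the diff above: the core idea is that each per-CPU softirq thread is handed a bit mask and only services the softirqs in that mask; with SIRQ_FORCED_THREADING enabled the mask is a single bit per thread, otherwise the lone ksoftirqd keeps SOFTIRQ_MASK_ALL. Below is a minimal user-space sketch of that dispatch loop, intended only as an illustration -- the constants and names are borrowed from the patch, but the program itself is hypothetical and is not kernel code.

/*
 * Stand-alone sketch of the mask-based wakeup/dispatch used by the patch:
 * each "thread" index owns one bit of the pending word when there is more
 * than one softirq thread, or the whole word when there is only one.
 */
#include <stdio.h>

#define NR_SOFTIRQS		10
#define SOFTIRQ_MASK_ALL	0x3ff		/* update when adding new softirqs */
#define NR_SOFTIRQ_THREADS	NR_SOFTIRQS	/* would be 1 without SIRQ_FORCED_THREADING */

static const char *softirq_to_name[NR_SOFTIRQS] = {
	"HI", "TIMER", "NET_TX", "NET_RX", "BLOCK", "BLOCK_IOPOLL",
	"TASKLET", "SCHED", "HRTIMER", "RCU"
};

/* Show which thread would be woken for a given pending word. */
static void dispatch(unsigned int pending)
{
	for (int i = 0; pending && i < NR_SOFTIRQ_THREADS; i++) {
		unsigned int mask = SOFTIRQ_MASK_ALL;

		if (NR_SOFTIRQ_THREADS > 1) {
			mask = 1u << i;
			if (!(pending & mask))
				continue;
		}
		printf("thread %d serves mask %#x (%s)\n",
		       i, pending & mask, softirq_to_name[i]);
		pending &= ~mask;
	}
}

int main(void)
{
	/* e.g. TIMER and NET_RX pending on this CPU */
	dispatch((1u << 1) | (1u << 3));
	return 0;
}

With the forced-threading config off, the same loop degenerates to the old behaviour: one thread, one SOFTIRQ_MASK_ALL pass over everything pending.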

Thread overview: 14+ messages
2011-11-30  1:55 [PATCH RT] tasklet/rt: Prevent tasklets from going into infinite spin in RT Steven Rostedt
2011-11-30  3:50 ` Clark Williams
2011-11-30  6:24   ` Mike Kravetz
2011-11-30 15:56     ` Mike Kravetz
2011-11-30  4:32 ` Mike Galbraith
2011-11-30  7:34   ` Mike Galbraith
2011-11-30 10:24   ` Thomas Gleixner
2011-11-30 12:45     ` Mike Galbraith
2011-11-30 13:04       ` Mike Galbraith
2011-12-01  9:12     ` [PATCH RT] resurrect softirq threads for RT_FULL Mike Galbraith
2011-11-30  8:46 ` [PATCH RT] tasklet/rt: Prevent tasklets from going into infinite spin in RT Tim Sander
2011-11-30  8:46   ` Tim Sander
2011-11-30 14:24 ` Luis Claudio R. Goncalves
2011-11-30 14:24   ` Luis Claudio R. Goncalves
