linux-kernel.vger.kernel.org archive mirror
* [RFC PATCH 0/6] Convert all tasklets to workqueues
@ 2007-06-22  4:00 Steven Rostedt
  2007-06-22  4:00 ` [RFC PATCH 1/6] Convert the RCU tasklet into a softirq Steven Rostedt
                   ` (9 more replies)
  0 siblings, 10 replies; 127+ messages in thread
From: Steven Rostedt @ 2007-06-22  4:00 UTC (permalink / raw)
  To: LKML
  Cc: Linus Torvalds, Ingo Molnar, Andrew Morton, Thomas Gleixner,
	Christoph Hellwig, john stultz, Oleg Nesterov, Paul E. McKenney,
	Dipankar Sarma, David S. Miller, matthew.wilcox, kuznet


There's a very nice paper by Matthew Wilcox that describes Softirqs,
Tasklets, Bottom Halves, Task Queues, Work Queues and Timers [1].
The paper describes the history of these items.  Softirqs and
tasklets were created to replace bottom halves after a company (Mindcraft)
showed that Microsoft on a 4x SMP box would outdo Linux. It was discovered
that this was due to a bottleneck caused by the design of Bottom Halves.
So Alexey Kuznetsov and Dave Miller [1] (and I'm sure others) created
softirqs and tasklets to multithread the bottom halves.

This worked well, and for the time being it shut up Microsoft^WMindcraft
from saying Linux was slow at networking.

Time passed, and Linux developed other nifty tools, like kthreads and
work queues. These run in a process context and are not as menacing to
latencies as softirqs and tasklets are.  Specifically, a tasklet
acts like a task in that its function can only run on one CPU
at a time. The same tasklet cannot run on multiple CPUs.  So in that
aspect it is like a task (a task can only exist on one CPU at a time).
But a tasklet is much harder on the rest of the system because it
runs in interrupt context.  This means that if a higher priority process
wants to run, it must wait for the tasklet to finish before doing so.
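
To make the serialization rule above concrete, here is a minimal userspace
sketch (not kernel code). It mimics the two state bits that the tasklet
header in patch 2/6 calls TASKLET_STATE_SCHED and TASKLET_STATE_RUN, using
C11 atomics. Every demo_* name is hypothetical and exists only for this
illustration; the real kernel uses test_and_set_bit() and per-CPU lists.

```c
#include <assert.h>
#include <stdatomic.h>

#define DEMO_STATE_SCHED 0x1u	/* queued for execution */
#define DEMO_STATE_RUN   0x2u	/* currently executing on some CPU */

struct demo_tasklet {
	atomic_uint state;
	void (*func)(unsigned long);
	unsigned long data;
};

/* Returns 1 if this call queued the tasklet, 0 if it was already pending. */
static int demo_schedule(struct demo_tasklet *t)
{
	return !(atomic_fetch_or(&t->state, DEMO_STATE_SCHED) & DEMO_STATE_SCHED);
}

/* Returns 1 if the handler ran, 0 if it is running elsewhere or not pending. */
static int demo_run(struct demo_tasklet *t)
{
	int ran = 0;

	/* "trylock": only one caller at a time may own the RUN bit */
	if (atomic_fetch_or(&t->state, DEMO_STATE_RUN) & DEMO_STATE_RUN)
		return 0;	/* someone else is running it; retry later */

	if (atomic_fetch_and(&t->state, ~DEMO_STATE_SCHED) & DEMO_STATE_SCHED) {
		t->func(t->data);
		ran = 1;
	}
	atomic_fetch_and(&t->state, ~DEMO_STATE_RUN);	/* "unlock" */
	return ran;
}

static unsigned long demo_calls;

static void demo_handler(unsigned long data)
{
	demo_calls += data;
}

/* Walks through the cases; returns the total number of handler calls. */
int demo_tasklet_walkthrough(void)
{
	struct demo_tasklet t = { .func = demo_handler, .data = 1 };

	atomic_init(&t.state, 0u);
	demo_calls = 0;

	assert(demo_schedule(&t) == 1);	/* first schedule queues it */
	assert(demo_schedule(&t) == 0);	/* second one is a no-op */
	assert(demo_run(&t) == 1);	/* handler runs exactly once */
	assert(demo_run(&t) == 0);	/* nothing pending any more */

	/* Pretend another CPU is in the middle of running the handler. */
	atomic_fetch_or(&t.state, DEMO_STATE_RUN);
	assert(demo_schedule(&t) == 1);	/* scheduling is still allowed */
	assert(demo_run(&t) == 0);	/* but it cannot run concurrently */
	atomic_fetch_and(&t.state, ~DEMO_STATE_RUN);
	assert(demo_run(&t) == 1);	/* runs once the other CPU is done */

	return (int)demo_calls;
}
```

Calling demo_tasklet_walkthrough() from a main() exercises all three cases.
The sketch says nothing about latency: in the kernel the handler would run
in softirq context, which is the part this series objects to.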

For the most part, tasklets today are not used for time-critical functions.
Running tasklets in thread context is not harmful to performance of
the overall system. But running them in interrupt context is, since
they increase the overall latency for high priority tasks.

Even in Matthew's paper, he says that work queues have replaced tasklets.
But this is not truly the case.  Tasklets are common and plentiful.
But to go and replace each driver that uses a tasklet with a work queue
would be very painful.

I've developed this way to replace all tasklets with work queues without
having to change all the drivers that use them.  I created an API that
uses the tasklet API as a wrapper to a work queue.  This API doesn't need
to be permanent. It shows 1) that work queues can replace tasklets, and
2) we can remove a duplicate functionality from the kernel.  This API
only needs to be around until we have removed all uses of tasklets from
all drivers.
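
The wrapper idea can be sketched in userspace C. This is only an
illustration under stated assumptions, not the code from patch 6/6: the
demo_work queue below is a single-threaded stand-in for a work queue, and
every demo_* name is hypothetical. The point it shows is that a driver
keeps calling the tasklet-style init and schedule functions with its
unchanged handler signature, while the handler actually runs when the
worker drains the queue.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for a work item; a real work queue runs these in a kernel thread. */
struct demo_work {
	struct demo_work *next;
	void (*fn)(struct demo_work *w);
	int pending;
};

static struct demo_work *demo_head;
static struct demo_work **demo_tail = &demo_head;

static void demo_queue_work(struct demo_work *w)
{
	if (w->pending)
		return;		/* already queued: run only once */
	w->pending = 1;
	w->next = NULL;
	*demo_tail = w;
	demo_tail = &w->next;
}

/* What the worker thread would do; returns the number of items run. */
static int demo_run_pending(void)
{
	int n = 0;

	while (demo_head) {
		struct demo_work *w = demo_head;

		demo_head = w->next;
		if (!demo_head)
			demo_tail = &demo_head;
		w->pending = 0;
		w->fn(w);
		n++;
	}
	return n;
}

/* Tasklet-shaped API layered on top of the work item. */
struct demo_tasklet {
	struct demo_work work;
	void (*func)(unsigned long);
	unsigned long data;
};

static void demo_tasklet_work_fn(struct demo_work *w)
{
	/* recover the enclosing tasklet from its embedded work item */
	struct demo_tasklet *t = (struct demo_tasklet *)
		((char *)w - offsetof(struct demo_tasklet, work));

	t->func(t->data);
}

static void demo_tasklet_init(struct demo_tasklet *t,
			      void (*func)(unsigned long), unsigned long data)
{
	t->work.next = NULL;
	t->work.fn = demo_tasklet_work_fn;
	t->work.pending = 0;
	t->func = func;
	t->data = data;
}

static void demo_tasklet_schedule(struct demo_tasklet *t)
{
	demo_queue_work(&t->work);
}

/* "Driver" side: same handler signature as a tasklet handler. */
static unsigned long demo_sum;

static void demo_driver_handler(unsigned long data)
{
	demo_sum += data;
}

int demo_wrapper_walkthrough(void)
{
	struct demo_tasklet t;
	int ran;

	demo_sum = 0;
	demo_tasklet_init(&t, demo_driver_handler, 5);
	demo_tasklet_schedule(&t);
	demo_tasklet_schedule(&t);	/* already pending, ignored */
	assert(demo_sum == 0);		/* nothing runs until the worker does */

	ran = demo_run_pending();
	assert(ran == 1 && demo_sum == 5);
	return ran;
}
```

Embedding the work item inside the tasklet structure is one possible
design; it lets the wrapper recover the tasklet without a lookup table.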

I just want to state that tasklets served their time well. But it's time
to give them an honorable discharge.  So let's get rid of tasklets and
give them a standing salute as they leave :-)


This patch series does the following:

1) Changes the RCU tasklet into a softirq. The RCU tasklet *is* a 
performance critical function, and changing it to a softirq gives it
even more performance, and removes overhead. This has already been done
in the RT kernel, and should be applied regardless of the rest of the
patches in the series.

2) Splits out the tasklets from softirq.c.  This too should be done anyway.
Tasklets are not softirqs, and they have their own semantics, so they
deserve their own file. Also it makes it a lot cleaner to replace them
with something else :-)

3/4) Add an API to the tasklets to allow a driver to see if a tasklet
is scheduled.  The DRM driver does its own little thing with tasklets
and reads into the internals of the tasklet. These patches give the
DRM driver an API to do that a little bit cleaner.

The above patches really should go into the kernel, if for no other
reason than as a cleanup patch set.

5) Move tasklet.h to tasklet_softirq.h and have tasklet.h include it.

6) This is the magic to make tasklets into work queues. It allows for
the kernel to be configured either with the normal tasklets, as it is
today, or with the tasklets-as-work-queues.
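
For item 1, the difference between scheduling a tasklet and raising a
dedicated softirq can be sketched in userspace C. This is a simplified
model, not kernel code: it assumes softirqs are a small table of handlers
plus a pending bitmask that is drained from the lowest index upward, which
is why the position of RCU_SOFTIRQ in the enum matters. All demo_* names
and the enum values are hypothetical.

```c
#include <assert.h>

/* Hypothetical subset of the softirq numbers, lowest index runs first. */
enum { DEMO_HI, DEMO_TIMER, DEMO_TASKLET, DEMO_RCU, DEMO_MAX };

static void (*demo_vec[DEMO_MAX])(void);
static unsigned int demo_pending;

static int demo_order[DEMO_MAX];
static int demo_order_len;

/* Register a handler for one softirq number (done once at init). */
static void demo_open_softirq(int nr, void (*action)(void))
{
	demo_vec[nr] = action;
}

/* Mark a softirq pending; raising it again before it runs changes nothing. */
static void demo_raise_softirq(int nr)
{
	demo_pending |= 1u << nr;
}

/* Drain the pending mask, walking the bits from index 0 upward. */
static void demo_do_softirq(void)
{
	unsigned int pending = demo_pending;
	int nr = 0;

	demo_pending = 0;
	while (pending) {
		if ((pending & 1u) && demo_vec[nr])
			demo_vec[nr]();
		pending >>= 1;
		nr++;
	}
}

static void demo_tasklet_action(void)
{
	demo_order[demo_order_len++] = DEMO_TASKLET;
}

static void demo_rcu_action(void)
{
	demo_order[demo_order_len++] = DEMO_RCU;
}

/* Returns how many handlers ran. */
int demo_softirq_walkthrough(void)
{
	demo_order_len = 0;
	demo_pending = 0;
	demo_open_softirq(DEMO_TASKLET, demo_tasklet_action);
	demo_open_softirq(DEMO_RCU, demo_rcu_action);

	demo_raise_softirq(DEMO_RCU);
	demo_raise_softirq(DEMO_RCU);		/* still a single pending bit */
	demo_raise_softirq(DEMO_TASKLET);
	demo_do_softirq();

	assert(demo_order_len == 2);
	assert(demo_order[0] == DEMO_TASKLET);	/* lower index first */
	assert(demo_order[1] == DEMO_RCU);	/* RCU runs last */
	return demo_order_len;
}
```

In this model a direct softirq skips the per-tasklet list handling and
state bits entirely; the handler is reached with one table lookup.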

I've booted these patches on 5 machines, i386 and x86_64, and when
I can get my powerbook loaded with Linux, I'll try it there too.

I'd like to give thanks to Ingo Molnar and Oleg Nesterov for reviewing 
my initial patch series and giving me some pointers.


[1] www.wil.cx/matthew/lca2003/paper.pdf

-- Steve


* [RFC PATCH 1/6] Convert the RCU tasklet into a softirq
  2007-06-22  4:00 [RFC PATCH 0/6] Convert all tasklets to workqueues Steven Rostedt
@ 2007-06-22  4:00 ` Steven Rostedt
  2007-06-22  7:10   ` Christoph Hellwig
  2007-06-22  4:00 ` [RFC PATCH 2/6] Split out tasklets from softirq.c Steven Rostedt
                   ` (8 subsequent siblings)
  9 siblings, 1 reply; 127+ messages in thread
From: Steven Rostedt @ 2007-06-22  4:00 UTC (permalink / raw)
  To: LKML
  Cc: Linus Torvalds, Ingo Molnar, Andrew Morton, Thomas Gleixner,
	Christoph Hellwig, john stultz, Oleg Nesterov, Paul E. McKenney,
	Dipankar Sarma, David S. Miller, matthew.wilcox, kuznet

[-- Attachment #1: convert-rcu-tasklet-to-softirq.patch --]
[-- Type: text/plain, Size: 2973 bytes --]

I believe this was originally done by Dipankar Sarma. I pulled these
changes from the -rt kernel.

For better performance, RCU should use a softirq instead of a
tasklet.

Signed-off-by: Steven Rostedt <rostedt@goodmis.org>

Index: linux-2.6-test/include/linux/interrupt.h
===================================================================
--- linux-2.6-test.orig/include/linux/interrupt.h
+++ linux-2.6-test/include/linux/interrupt.h
@@ -245,6 +245,9 @@ enum
 #ifdef CONFIG_HIGH_RES_TIMERS
 	HRTIMER_SOFTIRQ,
 #endif
+	RCU_SOFTIRQ,	/* Preferable RCU should always be the last softirq */
+	/* Entries after this are ignored in split softirq mode */
+	MAX_SOFTIRQ,
 };
 
 /* softirq mask and active fields moved to irq_cpustat_t in
Index: linux-2.6-test/kernel/rcupdate.c
===================================================================
--- linux-2.6-test.orig/kernel/rcupdate.c
+++ linux-2.6-test/kernel/rcupdate.c
@@ -67,7 +67,6 @@ DEFINE_PER_CPU(struct rcu_data, rcu_data
 DEFINE_PER_CPU(struct rcu_data, rcu_bh_data) = { 0L };
 
 /* Fake initialization required by compiler */
-static DEFINE_PER_CPU(struct tasklet_struct, rcu_tasklet) = {NULL};
 static int blimit = 10;
 static int qhimark = 10000;
 static int qlowmark = 100;
@@ -253,7 +252,7 @@ static void rcu_do_batch(struct rcu_data
 	if (!rdp->donelist)
 		rdp->donetail = &rdp->donelist;
 	else
-		tasklet_schedule(&per_cpu(rcu_tasklet, rdp->cpu));
+		raise_softirq(RCU_SOFTIRQ);
 }
 
 /*
@@ -405,7 +404,6 @@ static void rcu_offline_cpu(int cpu)
 					&per_cpu(rcu_bh_data, cpu));
 	put_cpu_var(rcu_data);
 	put_cpu_var(rcu_bh_data);
-	tasklet_kill_immediate(&per_cpu(rcu_tasklet, cpu), cpu);
 }
 
 #else
@@ -417,7 +415,7 @@ static void rcu_offline_cpu(int cpu)
 #endif
 
 /*
- * This does the RCU processing work from tasklet context. 
+ * This does the RCU processing work from softirq context.
  */
 static void __rcu_process_callbacks(struct rcu_ctrlblk *rcp,
 					struct rcu_data *rdp)
@@ -462,7 +460,7 @@ static void __rcu_process_callbacks(stru
 		rcu_do_batch(rdp);
 }
 
-static void rcu_process_callbacks(unsigned long unused)
+static void rcu_process_callbacks(struct softirq_action *unused)
 {
 	__rcu_process_callbacks(&rcu_ctrlblk, &__get_cpu_var(rcu_data));
 	__rcu_process_callbacks(&rcu_bh_ctrlblk, &__get_cpu_var(rcu_bh_data));
@@ -526,7 +524,7 @@ void rcu_check_callbacks(int cpu, int us
 		rcu_bh_qsctr_inc(cpu);
 	} else if (!in_softirq())
 		rcu_bh_qsctr_inc(cpu);
-	tasklet_schedule(&per_cpu(rcu_tasklet, cpu));
+	raise_softirq(RCU_SOFTIRQ);
 }
 
 static void rcu_init_percpu_data(int cpu, struct rcu_ctrlblk *rcp,
@@ -549,7 +547,7 @@ static void __devinit rcu_online_cpu(int
 
 	rcu_init_percpu_data(cpu, &rcu_ctrlblk, rdp);
 	rcu_init_percpu_data(cpu, &rcu_bh_ctrlblk, bh_rdp);
-	tasklet_init(&per_cpu(rcu_tasklet, cpu), rcu_process_callbacks, 0UL);
+	open_softirq(RCU_SOFTIRQ, rcu_process_callbacks, NULL);
 }
 
 static int __cpuinit rcu_cpu_notify(struct notifier_block *self,

-- 


* [RFC PATCH 2/6] Split out tasklets from softirq.c
  2007-06-22  4:00 [RFC PATCH 0/6] Convert all tasklets to workqueues Steven Rostedt
  2007-06-22  4:00 ` [RFC PATCH 1/6] Convert the RCU tasklet into a softirq Steven Rostedt
@ 2007-06-22  4:00 ` Steven Rostedt
  2007-06-22  7:11   ` Christoph Hellwig
  2007-06-22 13:45   ` Akinobu Mita
  2007-06-22  4:00 ` [RFC PATCH 3/6] Add a tasklet is-scheduled API Steven Rostedt
                   ` (7 subsequent siblings)
  9 siblings, 2 replies; 127+ messages in thread
From: Steven Rostedt @ 2007-06-22  4:00 UTC (permalink / raw)
  To: LKML
  Cc: Linus Torvalds, Ingo Molnar, Andrew Morton, Thomas Gleixner,
	Christoph Hellwig, john stultz, Oleg Nesterov, Paul E. McKenney,
	Dipankar Sarma, David S. Miller, matthew.wilcox, kuznet

[-- Attachment #1: split-out-tasklet.patch --]
[-- Type: text/plain, Size: 18098 bytes --]

Tasklets are really a separate entity from softirqs, so they
deserve their own file. Also this allows us to easily replace
tasklets with something else ;-)

Signed-off-by: Steven Rostedt <rostedt@goodmis.org>

Index: linux-2.6-test/include/linux/interrupt.h
===================================================================
--- linux-2.6-test.orig/include/linux/interrupt.h
+++ linux-2.6-test/include/linux/interrupt.h
@@ -13,6 +13,7 @@
 #include <linux/irqflags.h>
 #include <linux/bottom_half.h>
 #include <linux/device.h>
+#include <linux/tasklet.h>
 #include <asm/atomic.h>
 #include <asm/ptrace.h>
 #include <asm/system.h>
@@ -268,117 +269,6 @@ extern void FASTCALL(raise_softirq_irqof
 extern void FASTCALL(raise_softirq(unsigned int nr));
 
 
-/* Tasklets --- multithreaded analogue of BHs.
-
-   Main feature differing them of generic softirqs: tasklet
-   is running only on one CPU simultaneously.
-
-   Main feature differing them of BHs: different tasklets
-   may be run simultaneously on different CPUs.
-
-   Properties:
-   * If tasklet_schedule() is called, then tasklet is guaranteed
-     to be executed on some cpu at least once after this.
-   * If the tasklet is already scheduled, but its excecution is still not
-     started, it will be executed only once.
-   * If this tasklet is already running on another CPU (or schedule is called
-     from tasklet itself), it is rescheduled for later.
-   * Tasklet is strictly serialized wrt itself, but not
-     wrt another tasklets. If client needs some intertask synchronization,
-     he makes it with spinlocks.
- */
-
-struct tasklet_struct
-{
-	struct tasklet_struct *next;
-	unsigned long state;
-	atomic_t count;
-	void (*func)(unsigned long);
-	unsigned long data;
-};
-
-#define DECLARE_TASKLET(name, func, data) \
-struct tasklet_struct name = { NULL, 0, ATOMIC_INIT(0), func, data }
-
-#define DECLARE_TASKLET_DISABLED(name, func, data) \
-struct tasklet_struct name = { NULL, 0, ATOMIC_INIT(1), func, data }
-
-
-enum
-{
-	TASKLET_STATE_SCHED,	/* Tasklet is scheduled for execution */
-	TASKLET_STATE_RUN	/* Tasklet is running (SMP only) */
-};
-
-#ifdef CONFIG_SMP
-static inline int tasklet_trylock(struct tasklet_struct *t)
-{
-	return !test_and_set_bit(TASKLET_STATE_RUN, &(t)->state);
-}
-
-static inline void tasklet_unlock(struct tasklet_struct *t)
-{
-	smp_mb__before_clear_bit(); 
-	clear_bit(TASKLET_STATE_RUN, &(t)->state);
-}
-
-static inline void tasklet_unlock_wait(struct tasklet_struct *t)
-{
-	while (test_bit(TASKLET_STATE_RUN, &(t)->state)) { barrier(); }
-}
-#else
-#define tasklet_trylock(t) 1
-#define tasklet_unlock_wait(t) do { } while (0)
-#define tasklet_unlock(t) do { } while (0)
-#endif
-
-extern void FASTCALL(__tasklet_schedule(struct tasklet_struct *t));
-
-static inline void tasklet_schedule(struct tasklet_struct *t)
-{
-	if (!test_and_set_bit(TASKLET_STATE_SCHED, &t->state))
-		__tasklet_schedule(t);
-}
-
-extern void FASTCALL(__tasklet_hi_schedule(struct tasklet_struct *t));
-
-static inline void tasklet_hi_schedule(struct tasklet_struct *t)
-{
-	if (!test_and_set_bit(TASKLET_STATE_SCHED, &t->state))
-		__tasklet_hi_schedule(t);
-}
-
-
-static inline void tasklet_disable_nosync(struct tasklet_struct *t)
-{
-	atomic_inc(&t->count);
-	smp_mb__after_atomic_inc();
-}
-
-static inline void tasklet_disable(struct tasklet_struct *t)
-{
-	tasklet_disable_nosync(t);
-	tasklet_unlock_wait(t);
-	smp_mb();
-}
-
-static inline void tasklet_enable(struct tasklet_struct *t)
-{
-	smp_mb__before_atomic_dec();
-	atomic_dec(&t->count);
-}
-
-static inline void tasklet_hi_enable(struct tasklet_struct *t)
-{
-	smp_mb__before_atomic_dec();
-	atomic_dec(&t->count);
-}
-
-extern void tasklet_kill(struct tasklet_struct *t);
-extern void tasklet_kill_immediate(struct tasklet_struct *t, unsigned int cpu);
-extern void tasklet_init(struct tasklet_struct *t,
-			 void (*func)(unsigned long), unsigned long data);
-
 /*
  * Autoprobing for irqs:
  *
Index: linux-2.6-test/include/linux/tasklet.h
===================================================================
--- /dev/null
+++ linux-2.6-test/include/linux/tasklet.h
@@ -0,0 +1,120 @@
+#ifndef _LINUX_TASKLET_H
+#define _LINUX_TASKLET_H
+
+/* Tasklets --- multithreaded analogue of BHs.
+
+   Main feature differing them of generic softirqs: tasklet
+   is running only on one CPU simultaneously.
+
+   Main feature differing them of BHs: different tasklets
+   may be run simultaneously on different CPUs.
+
+   Properties:
+   * If tasklet_schedule() is called, then tasklet is guaranteed
+     to be executed on some cpu at least once after this.
+   * If the tasklet is already scheduled, but its excecution is still not
+     started, it will be executed only once.
+   * If this tasklet is already running on another CPU (or schedule is called
+     from tasklet itself), it is rescheduled for later.
+   * Tasklet is strictly serialized wrt itself, but not
+     wrt another tasklets. If client needs some intertask synchronization,
+     he makes it with spinlocks.
+ */
+
+struct tasklet_struct
+{
+	struct tasklet_struct *next;
+	unsigned long state;
+	atomic_t count;
+	void (*func)(unsigned long);
+	unsigned long data;
+};
+
+#define DECLARE_TASKLET(name, func, data) \
+struct tasklet_struct name = { NULL, 0, ATOMIC_INIT(0), func, data }
+
+#define DECLARE_TASKLET_DISABLED(name, func, data) \
+struct tasklet_struct name = { NULL, 0, ATOMIC_INIT(1), func, data }
+
+
+enum
+{
+	TASKLET_STATE_SCHED,	/* Tasklet is scheduled for execution */
+	TASKLET_STATE_RUN	/* Tasklet is running (SMP only) */
+};
+
+#ifdef CONFIG_SMP
+static inline int tasklet_trylock(struct tasklet_struct *t)
+{
+	return !test_and_set_bit(TASKLET_STATE_RUN, &(t)->state);
+}
+
+static inline void tasklet_unlock(struct tasklet_struct *t)
+{
+	smp_mb__before_clear_bit(); 
+	clear_bit(TASKLET_STATE_RUN, &(t)->state);
+}
+
+static inline void tasklet_unlock_wait(struct tasklet_struct *t)
+{
+	while (test_bit(TASKLET_STATE_RUN, &(t)->state)) { barrier(); }
+}
+#else
+#define tasklet_trylock(t) 1
+#define tasklet_unlock_wait(t) do { } while (0)
+#define tasklet_unlock(t) do { } while (0)
+#endif
+
+extern void FASTCALL(__tasklet_schedule(struct tasklet_struct *t));
+
+static inline void tasklet_schedule(struct tasklet_struct *t)
+{
+	if (!test_and_set_bit(TASKLET_STATE_SCHED, &t->state))
+		__tasklet_schedule(t);
+}
+
+extern void FASTCALL(__tasklet_hi_schedule(struct tasklet_struct *t));
+
+static inline void tasklet_hi_schedule(struct tasklet_struct *t)
+{
+	if (!test_and_set_bit(TASKLET_STATE_SCHED, &t->state))
+		__tasklet_hi_schedule(t);
+}
+
+
+static inline void tasklet_disable_nosync(struct tasklet_struct *t)
+{
+	atomic_inc(&t->count);
+	smp_mb__after_atomic_inc();
+}
+
+static inline void tasklet_disable(struct tasklet_struct *t)
+{
+	tasklet_disable_nosync(t);
+	tasklet_unlock_wait(t);
+	smp_mb();
+}
+
+static inline void tasklet_enable(struct tasklet_struct *t)
+{
+	smp_mb__before_atomic_dec();
+	atomic_dec(&t->count);
+}
+
+static inline void tasklet_hi_enable(struct tasklet_struct *t)
+{
+	smp_mb__before_atomic_dec();
+	atomic_dec(&t->count);
+}
+
+extern void tasklet_kill(struct tasklet_struct *t);
+extern void tasklet_kill_immediate(struct tasklet_struct *t, unsigned int cpu);
+extern void tasklet_init(struct tasklet_struct *t,
+			 void (*func)(unsigned long), unsigned long data);
+
+#ifdef CONFIG_HOTPLUG_CPU
+void takeover_tasklets(unsigned int cpu);
+#endif /* CONFIG_HOTPLUG_CPU */
+
+#endif /* _LINUX_TASKLET_H */
+
Index: linux-2.6-test/kernel/Makefile
===================================================================
--- linux-2.6-test.orig/kernel/Makefile
+++ linux-2.6-test/kernel/Makefile
@@ -21,6 +21,7 @@ obj-$(CONFIG_FUTEX) += futex.o
 ifeq ($(CONFIG_COMPAT),y)
 obj-$(CONFIG_FUTEX) += futex_compat.o
 endif
+obj-y += tasklet.o
 obj-$(CONFIG_RT_MUTEXES) += rtmutex.o
 obj-$(CONFIG_DEBUG_RT_MUTEXES) += rtmutex-debug.o
 obj-$(CONFIG_RT_MUTEX_TESTER) += rtmutex-tester.o
Index: linux-2.6-test/kernel/softirq.c
===================================================================
--- linux-2.6-test.orig/kernel/softirq.c
+++ linux-2.6-test/kernel/softirq.c
@@ -348,144 +348,6 @@ void open_softirq(int nr, void (*action)
 	softirq_vec[nr].action = action;
 }
 
-/* Tasklets */
-struct tasklet_head
-{
-	struct tasklet_struct *list;
-};
-
-/* Some compilers disobey section attribute on statics when not
-   initialized -- RR */
-static DEFINE_PER_CPU(struct tasklet_head, tasklet_vec) = { NULL };
-static DEFINE_PER_CPU(struct tasklet_head, tasklet_hi_vec) = { NULL };
-
-void fastcall __tasklet_schedule(struct tasklet_struct *t)
-{
-	unsigned long flags;
-
-	local_irq_save(flags);
-	t->next = __get_cpu_var(tasklet_vec).list;
-	__get_cpu_var(tasklet_vec).list = t;
-	raise_softirq_irqoff(TASKLET_SOFTIRQ);
-	local_irq_restore(flags);
-}
-
-EXPORT_SYMBOL(__tasklet_schedule);
-
-void fastcall __tasklet_hi_schedule(struct tasklet_struct *t)
-{
-	unsigned long flags;
-
-	local_irq_save(flags);
-	t->next = __get_cpu_var(tasklet_hi_vec).list;
-	__get_cpu_var(tasklet_hi_vec).list = t;
-	raise_softirq_irqoff(HI_SOFTIRQ);
-	local_irq_restore(flags);
-}
-
-EXPORT_SYMBOL(__tasklet_hi_schedule);
-
-static void tasklet_action(struct softirq_action *a)
-{
-	struct tasklet_struct *list;
-
-	local_irq_disable();
-	list = __get_cpu_var(tasklet_vec).list;
-	__get_cpu_var(tasklet_vec).list = NULL;
-	local_irq_enable();
-
-	while (list) {
-		struct tasklet_struct *t = list;
-
-		list = list->next;
-
-		if (tasklet_trylock(t)) {
-			if (!atomic_read(&t->count)) {
-				if (!test_and_clear_bit(TASKLET_STATE_SCHED, &t->state))
-					BUG();
-				t->func(t->data);
-				tasklet_unlock(t);
-				continue;
-			}
-			tasklet_unlock(t);
-		}
-
-		local_irq_disable();
-		t->next = __get_cpu_var(tasklet_vec).list;
-		__get_cpu_var(tasklet_vec).list = t;
-		__raise_softirq_irqoff(TASKLET_SOFTIRQ);
-		local_irq_enable();
-	}
-}
-
-static void tasklet_hi_action(struct softirq_action *a)
-{
-	struct tasklet_struct *list;
-
-	local_irq_disable();
-	list = __get_cpu_var(tasklet_hi_vec).list;
-	__get_cpu_var(tasklet_hi_vec).list = NULL;
-	local_irq_enable();
-
-	while (list) {
-		struct tasklet_struct *t = list;
-
-		list = list->next;
-
-		if (tasklet_trylock(t)) {
-			if (!atomic_read(&t->count)) {
-				if (!test_and_clear_bit(TASKLET_STATE_SCHED, &t->state))
-					BUG();
-				t->func(t->data);
-				tasklet_unlock(t);
-				continue;
-			}
-			tasklet_unlock(t);
-		}
-
-		local_irq_disable();
-		t->next = __get_cpu_var(tasklet_hi_vec).list;
-		__get_cpu_var(tasklet_hi_vec).list = t;
-		__raise_softirq_irqoff(HI_SOFTIRQ);
-		local_irq_enable();
-	}
-}
-
-
-void tasklet_init(struct tasklet_struct *t,
-		  void (*func)(unsigned long), unsigned long data)
-{
-	t->next = NULL;
-	t->state = 0;
-	atomic_set(&t->count, 0);
-	t->func = func;
-	t->data = data;
-}
-
-EXPORT_SYMBOL(tasklet_init);
-
-void tasklet_kill(struct tasklet_struct *t)
-{
-	if (in_interrupt())
-		printk("Attempt to kill tasklet from interrupt\n");
-
-	while (test_and_set_bit(TASKLET_STATE_SCHED, &t->state)) {
-		do
-			yield();
-		while (test_bit(TASKLET_STATE_SCHED, &t->state));
-	}
-	tasklet_unlock_wait(t);
-	clear_bit(TASKLET_STATE_SCHED, &t->state);
-}
-
-EXPORT_SYMBOL(tasklet_kill);
-
-void __init softirq_init(void)
-{
-	open_softirq(TASKLET_SOFTIRQ, tasklet_action, NULL);
-	open_softirq(HI_SOFTIRQ, tasklet_hi_action, NULL);
-}
-
 static int ksoftirqd(void * __bind_cpu)
 {
 	set_user_nice(current, 19);
@@ -532,58 +394,6 @@ wait_to_die:
 	return 0;
 }
 
-#ifdef CONFIG_HOTPLUG_CPU
-/*
- * tasklet_kill_immediate is called to remove a tasklet which can already be
- * scheduled for execution on @cpu.
- *
- * Unlike tasklet_kill, this function removes the tasklet
- * _immediately_, even if the tasklet is in TASKLET_STATE_SCHED state.
- *
- * When this function is called, @cpu must be in the CPU_DEAD state.
- */
-void tasklet_kill_immediate(struct tasklet_struct *t, unsigned int cpu)
-{
-	struct tasklet_struct **i;
-
-	BUG_ON(cpu_online(cpu));
-	BUG_ON(test_bit(TASKLET_STATE_RUN, &t->state));
-
-	if (!test_bit(TASKLET_STATE_SCHED, &t->state))
-		return;
-
-	/* CPU is dead, so no lock needed. */
-	for (i = &per_cpu(tasklet_vec, cpu).list; *i; i = &(*i)->next) {
-		if (*i == t) {
-			*i = t->next;
-			return;
-		}
-	}
-	BUG();
-}
-
-static void takeover_tasklets(unsigned int cpu)
-{
-	struct tasklet_struct **i;
-
-	/* CPU is dead, so no lock needed. */
-	local_irq_disable();
-
-	/* Find end, append list for that CPU. */
-	for (i = &__get_cpu_var(tasklet_vec).list; *i; i = &(*i)->next);
-	*i = per_cpu(tasklet_vec, cpu).list;
-	per_cpu(tasklet_vec, cpu).list = NULL;
-	raise_softirq_irqoff(TASKLET_SOFTIRQ);
-
-	for (i = &__get_cpu_var(tasklet_hi_vec).list; *i; i = &(*i)->next);
-	*i = per_cpu(tasklet_hi_vec, cpu).list;
-	per_cpu(tasklet_hi_vec, cpu).list = NULL;
-	raise_softirq_irqoff(HI_SOFTIRQ);
-
-	local_irq_enable();
-}
-#endif /* CONFIG_HOTPLUG_CPU */
-
 static int __cpuinit cpu_callback(struct notifier_block *nfb,
 				  unsigned long action,
 				  void *hcpu)
Index: linux-2.6-test/kernel/tasklet.c
===================================================================
--- /dev/null
+++ linux-2.6-test/kernel/tasklet.c
@@ -0,0 +1,203 @@
+/*
+ *	linux/kernel/softirq.c
+ *
+ *	Copyright (C) 1992 Linus Torvalds
+ *
+ * Rewritten. Old one was good in 2.2, but in 2.3 it was immoral. --ANK (990903)
+ *
+ *      Removed from softirq.c by Steven Rostedt
+ */
+#include <linux/interrupt.h>
+#include <linux/cpu.h>
+
+/* Tasklets */
+struct tasklet_head
+{
+	struct tasklet_struct *list;
+};
+
+/* Some compilers disobey section attribute on statics when not
+   initialized -- RR */
+static DEFINE_PER_CPU(struct tasklet_head, tasklet_vec) = { NULL };
+static DEFINE_PER_CPU(struct tasklet_head, tasklet_hi_vec) = { NULL };
+
+void fastcall __tasklet_schedule(struct tasklet_struct *t)
+{
+	unsigned long flags;
+
+	local_irq_save(flags);
+	t->next = __get_cpu_var(tasklet_vec).list;
+	__get_cpu_var(tasklet_vec).list = t;
+	raise_softirq_irqoff(TASKLET_SOFTIRQ);
+	local_irq_restore(flags);
+}
+
+EXPORT_SYMBOL(__tasklet_schedule);
+
+void fastcall __tasklet_hi_schedule(struct tasklet_struct *t)
+{
+	unsigned long flags;
+
+	local_irq_save(flags);
+	t->next = __get_cpu_var(tasklet_hi_vec).list;
+	__get_cpu_var(tasklet_hi_vec).list = t;
+	raise_softirq_irqoff(HI_SOFTIRQ);
+	local_irq_restore(flags);
+}
+
+EXPORT_SYMBOL(__tasklet_hi_schedule);
+
+static void tasklet_action(struct softirq_action *a)
+{
+	struct tasklet_struct *list;
+
+	local_irq_disable();
+	list = __get_cpu_var(tasklet_vec).list;
+	__get_cpu_var(tasklet_vec).list = NULL;
+	local_irq_enable();
+
+	while (list) {
+		struct tasklet_struct *t = list;
+
+		list = list->next;
+
+		if (tasklet_trylock(t)) {
+			if (!atomic_read(&t->count)) {
+				if (!test_and_clear_bit(TASKLET_STATE_SCHED, &t->state))
+					BUG();
+				t->func(t->data);
+				tasklet_unlock(t);
+				continue;
+			}
+			tasklet_unlock(t);
+		}
+
+		local_irq_disable();
+		t->next = __get_cpu_var(tasklet_vec).list;
+		__get_cpu_var(tasklet_vec).list = t;
+		__raise_softirq_irqoff(TASKLET_SOFTIRQ);
+		local_irq_enable();
+	}
+}
+
+static void tasklet_hi_action(struct softirq_action *a)
+{
+	struct tasklet_struct *list;
+
+	local_irq_disable();
+	list = __get_cpu_var(tasklet_hi_vec).list;
+	__get_cpu_var(tasklet_hi_vec).list = NULL;
+	local_irq_enable();
+
+	while (list) {
+		struct tasklet_struct *t = list;
+
+		list = list->next;
+
+		if (tasklet_trylock(t)) {
+			if (!atomic_read(&t->count)) {
+				if (!test_and_clear_bit(TASKLET_STATE_SCHED, &t->state))
+					BUG();
+				t->func(t->data);
+				tasklet_unlock(t);
+				continue;
+			}
+			tasklet_unlock(t);
+		}
+
+		local_irq_disable();
+		t->next = __get_cpu_var(tasklet_hi_vec).list;
+		__get_cpu_var(tasklet_hi_vec).list = t;
+		__raise_softirq_irqoff(HI_SOFTIRQ);
+		local_irq_enable();
+	}
+}
+
+
+void tasklet_init(struct tasklet_struct *t,
+		  void (*func)(unsigned long), unsigned long data)
+{
+	t->next = NULL;
+	t->state = 0;
+	atomic_set(&t->count, 0);
+	t->func = func;
+	t->data = data;
+}
+
+EXPORT_SYMBOL(tasklet_init);
+
+void tasklet_kill(struct tasklet_struct *t)
+{
+	if (in_interrupt())
+		printk("Attempt to kill tasklet from interrupt\n");
+
+	while (test_and_set_bit(TASKLET_STATE_SCHED, &t->state)) {
+		do
+			yield();
+		while (test_bit(TASKLET_STATE_SCHED, &t->state));
+	}
+	tasklet_unlock_wait(t);
+	clear_bit(TASKLET_STATE_SCHED, &t->state);
+}
+
+EXPORT_SYMBOL(tasklet_kill);
+
+void __init softirq_init(void)
+{
+	open_softirq(TASKLET_SOFTIRQ, tasklet_action, NULL);
+	open_softirq(HI_SOFTIRQ, tasklet_hi_action, NULL);
+}
+
+#ifdef CONFIG_HOTPLUG_CPU
+/*
+ * tasklet_kill_immediate is called to remove a tasklet which can already be
+ * scheduled for execution on @cpu.
+ *
+ * Unlike tasklet_kill, this function removes the tasklet
+ * _immediately_, even if the tasklet is in TASKLET_STATE_SCHED state.
+ *
+ * When this function is called, @cpu must be in the CPU_DEAD state.
+ */
+void tasklet_kill_immediate(struct tasklet_struct *t, unsigned int cpu)
+{
+	struct tasklet_struct **i;
+
+	BUG_ON(cpu_online(cpu));
+	BUG_ON(test_bit(TASKLET_STATE_RUN, &t->state));
+
+	if (!test_bit(TASKLET_STATE_SCHED, &t->state))
+		return;
+
+	/* CPU is dead, so no lock needed. */
+	for (i = &per_cpu(tasklet_vec, cpu).list; *i; i = &(*i)->next) {
+		if (*i == t) {
+			*i = t->next;
+			return;
+		}
+	}
+	BUG();
+}
+
+void takeover_tasklets(unsigned int cpu)
+{
+	struct tasklet_struct **i;
+
+	/* CPU is dead, so no lock needed. */
+	local_irq_disable();
+
+	/* Find end, append list for that CPU. */
+	for (i = &__get_cpu_var(tasklet_vec).list; *i; i = &(*i)->next);
+	*i = per_cpu(tasklet_vec, cpu).list;
+	per_cpu(tasklet_vec, cpu).list = NULL;
+	raise_softirq_irqoff(TASKLET_SOFTIRQ);
+
+	for (i = &__get_cpu_var(tasklet_hi_vec).list; *i; i = &(*i)->next);
+	*i = per_cpu(tasklet_hi_vec, cpu).list;
+	per_cpu(tasklet_hi_vec, cpu).list = NULL;
+	raise_softirq_irqoff(HI_SOFTIRQ);
+
+	local_irq_enable();
+}
+#endif /* CONFIG_HOTPLUG_CPU */
+
+

-- 


* [RFC PATCH 3/6] Add a tasklet is-scheduled API
  2007-06-22  4:00 [RFC PATCH 0/6] Convert all tasklets to workqueues Steven Rostedt
  2007-06-22  4:00 ` [RFC PATCH 1/6] Convert the RCU tasklet into a softirq Steven Rostedt
  2007-06-22  4:00 ` [RFC PATCH 2/6] Split out tasklets from softirq.c Steven Rostedt
@ 2007-06-22  4:00 ` Steven Rostedt
  2007-06-22  4:00 ` [RFC PATCH 4/6] Make DRM use the tasklet is-sched API Steven Rostedt
                   ` (6 subsequent siblings)
  9 siblings, 0 replies; 127+ messages in thread
From: Steven Rostedt @ 2007-06-22  4:00 UTC (permalink / raw)
  To: LKML
  Cc: Linus Torvalds, Ingo Molnar, Andrew Morton, Thomas Gleixner,
	Christoph Hellwig, john stultz, Oleg Nesterov, Paul E. McKenney,
	Dipankar Sarma, David S. Miller, matthew.wilcox, kuznet

[-- Attachment #1: tasklet-state-api.patch --]
[-- Type: text/plain, Size: 773 bytes --]

This patch adds a tasklet_is_scheduled API to allow a driver
to know if its tasklet is already scheduled.

Signed-off-by: Steven Rostedt <rostedt@goodmis.org>

Index: linux-2.6-test/include/linux/tasklet.h
===================================================================
--- linux-2.6-test.orig/include/linux/tasklet.h
+++ linux-2.6-test/include/linux/tasklet.h
@@ -107,6 +107,11 @@ static inline void tasklet_hi_enable(str
 	atomic_dec(&t->count);
 }
 
+static inline int tasklet_is_scheduled(struct tasklet_struct *t)
+{
+	return test_bit(TASKLET_STATE_SCHED, &t->state);
+}
+
 extern void tasklet_kill(struct tasklet_struct *t);
 extern void tasklet_kill_immediate(struct tasklet_struct *t, unsigned int cpu);
 extern void tasklet_init(struct tasklet_struct *t,

-- 


* [RFC PATCH 4/6] Make DRM use the tasklet is-sched API
  2007-06-22  4:00 [RFC PATCH 0/6] Convert all tasklets to workqueues Steven Rostedt
                   ` (2 preceding siblings ...)
  2007-06-22  4:00 ` [RFC PATCH 3/6] Add a tasklet is-scheduled API Steven Rostedt
@ 2007-06-22  4:00 ` Steven Rostedt
  2007-06-22  6:36   ` Daniel Walker
  2007-06-22  4:00 ` [RFC PATCH 5/6] Move tasklet.h to tasklet_softirq.h Steven Rostedt
                   ` (5 subsequent siblings)
  9 siblings, 1 reply; 127+ messages in thread
From: Steven Rostedt @ 2007-06-22  4:00 UTC (permalink / raw)
  To: LKML
  Cc: Linus Torvalds, Ingo Molnar, Andrew Morton, Thomas Gleixner,
	Christoph Hellwig, john stultz, Oleg Nesterov, Paul E. McKenney,
	Dipankar Sarma, David S. Miller, matthew.wilcox, kuznet

[-- Attachment #1: tasklet-driver-hacks.patch --]
[-- Type: text/plain, Size: 737 bytes --]

Update the DRM driver to use the new tasklet API, which does not rely
on the tasklet implementation details.

Signed-off-by: Steven Rostedt <rostedt@goodmis.org>


Index: linux-2.6.21-rt9/drivers/char/drm/drm_irq.c
===================================================================
--- linux-2.6.21-rt9.orig/drivers/char/drm/drm_irq.c
+++ linux-2.6.21-rt9/drivers/char/drm/drm_irq.c
@@ -461,7 +461,7 @@ void drm_locked_tasklet(drm_device_t *de
 	static DECLARE_TASKLET(drm_tasklet, drm_locked_tasklet_func, 0);
 
 	if (!drm_core_check_feature(dev, DRIVER_HAVE_IRQ) ||
-	    test_bit(TASKLET_STATE_SCHED, &drm_tasklet.state))
+	    tasklet_is_scheduled(&drm_tasklet))
 		return;
 
 	spin_lock_irqsave(&dev->tasklet_lock, irqflags);

-- 


* [RFC PATCH 5/6] Move tasklet.h to tasklet_softirq.h
  2007-06-22  4:00 [RFC PATCH 0/6] Convert all tasklets to workqueues Steven Rostedt
                   ` (3 preceding siblings ...)
  2007-06-22  4:00 ` [RFC PATCH 4/6] Make DRM use the tasklet is-sched API Steven Rostedt
@ 2007-06-22  4:00 ` Steven Rostedt
  2007-06-22  4:00 ` [RFC PATCH 6/6] Convert tasklets to work queues Steven Rostedt
                   ` (4 subsequent siblings)
  9 siblings, 0 replies; 127+ messages in thread
From: Steven Rostedt @ 2007-06-22  4:00 UTC (permalink / raw)
  To: LKML
  Cc: Linus Torvalds, Ingo Molnar, Andrew Morton, Thomas Gleixner,
	Christoph Hellwig, john stultz, Oleg Nesterov, Paul E. McKenney,
	Dipankar Sarma, David S. Miller, matthew.wilcox, kuznet

[-- Attachment #1: move-tasklet_h-to-tasklet_softirq_h.patch --]
[-- Type: text/plain, Size: 7765 bytes --]

Getting ready for the two versions of tasklet implementations,
we move tasklet.h to tasklet_softirq.h and just include it in
tasklet.h.

Signed-off-by: Steven Rostedt <rostedt@goodmis.org>

Index: linux-2.6-test/include/linux/tasklet.h
===================================================================
--- linux-2.6-test.orig/include/linux/tasklet.h
+++ linux-2.6-test/include/linux/tasklet.h
@@ -1,125 +1,6 @@
 #ifndef _LINUX_TASKLET_H
 #define _LINUX_TASKLET_H
 
-/* Tasklets --- multithreaded analogue of BHs.
+#include <linux/tasklet_softirq.h>
 
-   Main feature differing them of generic softirqs: tasklet
-   is running only on one CPU simultaneously.
-
-   Main feature differing them of BHs: different tasklets
-   may be run simultaneously on different CPUs.
-
-   Properties:
-   * If tasklet_schedule() is called, then tasklet is guaranteed
-     to be executed on some cpu at least once after this.
-   * If the tasklet is already scheduled, but its excecution is still not
-     started, it will be executed only once.
-   * If this tasklet is already running on another CPU (or schedule is called
-     from tasklet itself), it is rescheduled for later.
-   * Tasklet is strictly serialized wrt itself, but not
-     wrt another tasklets. If client needs some intertask synchronization,
-     he makes it with spinlocks.
- */
-
-struct tasklet_struct
-{
-	struct tasklet_struct *next;
-	unsigned long state;
-	atomic_t count;
-	void (*func)(unsigned long);
-	unsigned long data;
-};
-
-#define DECLARE_TASKLET(name, func, data) \
-struct tasklet_struct name = { NULL, 0, ATOMIC_INIT(0), func, data }
-
-#define DECLARE_TASKLET_DISABLED(name, func, data) \
-struct tasklet_struct name = { NULL, 0, ATOMIC_INIT(1), func, data }
-
-
-enum
-{
-	TASKLET_STATE_SCHED,	/* Tasklet is scheduled for execution */
-	TASKLET_STATE_RUN	/* Tasklet is running (SMP only) */
-};
-
-#ifdef CONFIG_SMP
-static inline int tasklet_trylock(struct tasklet_struct *t)
-{
-	return !test_and_set_bit(TASKLET_STATE_RUN, &(t)->state);
-}
-
-static inline void tasklet_unlock(struct tasklet_struct *t)
-{
-	smp_mb__before_clear_bit(); 
-	clear_bit(TASKLET_STATE_RUN, &(t)->state);
-}
-
-static inline void tasklet_unlock_wait(struct tasklet_struct *t)
-{
-	while (test_bit(TASKLET_STATE_RUN, &(t)->state)) { barrier(); }
-}
-#else
-#define tasklet_trylock(t) 1
-#define tasklet_unlock_wait(t) do { } while (0)
-#define tasklet_unlock(t) do { } while (0)
 #endif
-
-extern void FASTCALL(__tasklet_schedule(struct tasklet_struct *t));
-
-static inline void tasklet_schedule(struct tasklet_struct *t)
-{
-	if (!test_and_set_bit(TASKLET_STATE_SCHED, &t->state))
-		__tasklet_schedule(t);
-}
-
-extern void FASTCALL(__tasklet_hi_schedule(struct tasklet_struct *t));
-
-static inline void tasklet_hi_schedule(struct tasklet_struct *t)
-{
-	if (!test_and_set_bit(TASKLET_STATE_SCHED, &t->state))
-		__tasklet_hi_schedule(t);
-}
-
-
-static inline void tasklet_disable_nosync(struct tasklet_struct *t)
-{
-	atomic_inc(&t->count);
-	smp_mb__after_atomic_inc();
-}
-
-static inline void tasklet_disable(struct tasklet_struct *t)
-{
-	tasklet_disable_nosync(t);
-	tasklet_unlock_wait(t);
-	smp_mb();
-}
-
-static inline void tasklet_enable(struct tasklet_struct *t)
-{
-	smp_mb__before_atomic_dec();
-	atomic_dec(&t->count);
-}
-
-static inline void tasklet_hi_enable(struct tasklet_struct *t)
-{
-	smp_mb__before_atomic_dec();
-	atomic_dec(&t->count);
-}
-
-static inline int tasklet_is_scheduled(struct tasklet_struct *t)
-{
-	return test_bit(TASKLET_STATE_SCHED, &t->state);
-}
-
-extern void tasklet_kill(struct tasklet_struct *t);
-extern void tasklet_kill_immediate(struct tasklet_struct *t, unsigned int cpu);
-extern void tasklet_init(struct tasklet_struct *t,
-			 void (*func)(unsigned long), unsigned long data);
-
-#ifdef CONFIG_HOTPLUG_CPU
-void takeover_tasklets(unsigned int cpu);
-#endif /* CONFIG_HOTPLUG_CPU */
-
-#endif /* _LINUX_TASKLET_H */
-
Index: linux-2.6-test/include/linux/tasklet_softirq.h
===================================================================
--- /dev/null
+++ linux-2.6-test/include/linux/tasklet_softirq.h
@@ -0,0 +1,129 @@
+#ifndef _LINUX_TASKLET_SOFTIRQ_H
+#define _LINUX_TASKLET_SOFTIRQ_H
+
+#ifndef _LINUX_TASKLET_H
+# error "Do not include this header directly! use linux/tasklet.h"
+#endif
+
+/* Tasklets --- multithreaded analogue of BHs.
+
+   Main feature differing them of generic softirqs: tasklet
+   is running only on one CPU simultaneously.
+
+   Main feature differing them of BHs: different tasklets
+   may be run simultaneously on different CPUs.
+
+   Properties:
+   * If tasklet_schedule() is called, then tasklet is guaranteed
+     to be executed on some cpu at least once after this.
+   * If the tasklet is already scheduled, but its excecution is still not
+     started, it will be executed only once.
+   * If this tasklet is already running on another CPU (or schedule is called
+     from tasklet itself), it is rescheduled for later.
+   * Tasklet is strictly serialized wrt itself, but not
+     wrt another tasklets. If client needs some intertask synchronization,
+     he makes it with spinlocks.
+ */
+
+struct tasklet_struct
+{
+	struct tasklet_struct *next;
+	unsigned long state;
+	atomic_t count;
+	void (*func)(unsigned long);
+	unsigned long data;
+};
+
+#define DECLARE_TASKLET(name, func, data) \
+struct tasklet_struct name = { NULL, 0, ATOMIC_INIT(0), func, data }
+
+#define DECLARE_TASKLET_DISABLED(name, func, data) \
+struct tasklet_struct name = { NULL, 0, ATOMIC_INIT(1), func, data }
+
+
+enum
+{
+	TASKLET_STATE_SCHED,	/* Tasklet is scheduled for execution */
+	TASKLET_STATE_RUN	/* Tasklet is running (SMP only) */
+};
+
+#ifdef CONFIG_SMP
+static inline int tasklet_trylock(struct tasklet_struct *t)
+{
+	return !test_and_set_bit(TASKLET_STATE_RUN, &(t)->state);
+}
+
+static inline void tasklet_unlock(struct tasklet_struct *t)
+{
+	smp_mb__before_clear_bit(); 
+	clear_bit(TASKLET_STATE_RUN, &(t)->state);
+}
+
+static inline void tasklet_unlock_wait(struct tasklet_struct *t)
+{
+	while (test_bit(TASKLET_STATE_RUN, &(t)->state)) { barrier(); }
+}
+#else
+#define tasklet_trylock(t) 1
+#define tasklet_unlock_wait(t) do { } while (0)
+#define tasklet_unlock(t) do { } while (0)
+#endif
+
+extern void FASTCALL(__tasklet_schedule(struct tasklet_struct *t));
+
+static inline void tasklet_schedule(struct tasklet_struct *t)
+{
+	if (!test_and_set_bit(TASKLET_STATE_SCHED, &t->state))
+		__tasklet_schedule(t);
+}
+
+extern void FASTCALL(__tasklet_hi_schedule(struct tasklet_struct *t));
+
+static inline void tasklet_hi_schedule(struct tasklet_struct *t)
+{
+	if (!test_and_set_bit(TASKLET_STATE_SCHED, &t->state))
+		__tasklet_hi_schedule(t);
+}
+
+
+static inline void tasklet_disable_nosync(struct tasklet_struct *t)
+{
+	atomic_inc(&t->count);
+	smp_mb__after_atomic_inc();
+}
+
+static inline void tasklet_disable(struct tasklet_struct *t)
+{
+	tasklet_disable_nosync(t);
+	tasklet_unlock_wait(t);
+	smp_mb();
+}
+
+static inline void tasklet_enable(struct tasklet_struct *t)
+{
+	smp_mb__before_atomic_dec();
+	atomic_dec(&t->count);
+}
+
+static inline void tasklet_hi_enable(struct tasklet_struct *t)
+{
+	smp_mb__before_atomic_dec();
+	atomic_dec(&t->count);
+}
+
+static inline int tasklet_is_scheduled(struct tasklet_struct *t)
+{
+	return test_bit(TASKLET_STATE_SCHED, &t->state);
+}
+
+extern void tasklet_kill(struct tasklet_struct *t);
+extern void tasklet_kill_immediate(struct tasklet_struct *t, unsigned int cpu);
+extern void tasklet_init(struct tasklet_struct *t,
+			 void (*func)(unsigned long), unsigned long data);
+
+#ifdef CONFIG_HOTPLUG_CPU
+void takeover_tasklets(unsigned int cpu);
+#endif /* CONFIG_HOTPLUG_CPU */
+
+#endif /* _LINUX_TASKLET_SOFTIRQ_H */
+

-- 

^ permalink raw reply	[flat|nested] 127+ messages in thread

* [RFC PATCH 6/6] Convert tasklets to work queues
  2007-06-22  4:00 [RFC PATCH 0/6] Convert all tasklets to workqueues Steven Rostedt
                   ` (4 preceding siblings ...)
  2007-06-22  4:00 ` [RFC PATCH 5/6] Move tasklet.h to tasklet_softirq.h Steven Rostedt
@ 2007-06-22  4:00 ` Steven Rostedt
  2007-06-22  7:06   ` Daniel Walker
  2007-06-23 11:15   ` Arnd Bergmann
  2007-06-22  7:09 ` [RFC PATCH 0/6] Convert all tasklets to workqueues Christoph Hellwig
                   ` (3 subsequent siblings)
  9 siblings, 2 replies; 127+ messages in thread
From: Steven Rostedt @ 2007-06-22  4:00 UTC (permalink / raw)
  To: LKML
  Cc: Linus Torvalds, Ingo Molnar, Andrew Morton, Thomas Gleixner,
	Christoph Hellwig, john stultz, Oleg Nesterov, Paul E. McKenney,
	Dipankar Sarma, David S. Miller, matthew.wilcox, kuznet

[-- Attachment #1: tasklets-to-workqueues.patch --]
[-- Type: text/plain, Size: 8569 bytes --]

This patch gives drivers an alternative to tasklets by introducing a
"work_tasklet". When the kernel is configured to use work_tasklets,
each tasklet is queued as work on a work queue instead of being run
from the tasklet softirq.  The API is still the same, and the drivers
don't know that a work queue is being used.

This is (for now) a proof-of-concept approach to using work queues
instead of tasklets.  More can be done. For one thing, we could make
individual work queues for each tasklet instead of using the global
ktaskletd_wq.  But for now it's just easier to do this.

Signed-off-by: Steven Rostedt <rostedt@goodmis.org>


Index: linux-2.6-test/kernel/Kconfig.preempt
===================================================================
--- linux-2.6-test.orig/kernel/Kconfig.preempt
+++ linux-2.6-test/kernel/Kconfig.preempt
@@ -63,3 +63,18 @@ config PREEMPT_BKL
 	  Say Y here if you are building a kernel for a desktop system.
 	  Say N if you are unsure.
 
+config TASKLETS_AS_WORKQUEUES
+	bool "Treat tasklets as workqueues (EXPERIMENTAL)"
+	depends on EXPERIMENTAL
+	help
+	  Tasklets are an old solution to an old problem with respect
+	  to SMP.  But today they are not necessary anymore.
+	  There are better solutions to the problem that tasklets
+	  are trying to solve.
+
+	  This option converts tasklets into workqueues and
+	  removes the tasklet softirq.  This is a clean up until
+	  we get rid of all tasklets that are currently in
+	  the kernel.
+
+	  Say Y if you are unsure and brave (not very well tested code!).
Index: linux-2.6-test/kernel/Makefile
===================================================================
--- linux-2.6-test.orig/kernel/Makefile
+++ linux-2.6-test/kernel/Makefile
@@ -21,7 +21,11 @@ obj-$(CONFIG_FUTEX) += futex.o
 ifeq ($(CONFIG_COMPAT),y)
 obj-$(CONFIG_FUTEX) += futex_compat.o
 endif
+ifeq ($(CONFIG_TASKLETS_AS_WORKQUEUES), y)
+obj-y += tasklet_work.o
+else
 obj-y += tasklet.o
+endif
 obj-$(CONFIG_RT_MUTEXES) += rtmutex.o
 obj-$(CONFIG_DEBUG_RT_MUTEXES) += rtmutex-debug.o
 obj-$(CONFIG_RT_MUTEX_TESTER) += rtmutex-tester.o
Index: linux-2.6-test/init/main.c
===================================================================
--- linux-2.6-test.orig/init/main.c
+++ linux-2.6-test/init/main.c
@@ -121,6 +121,11 @@ extern void time_init(void);
 /* Default late time init is NULL. archs can override this later. */
 void (*late_time_init)(void);
 extern void softirq_init(void);
+#ifdef CONFIG_TASKLETS_AS_WORKQUEUES
+  extern void init_tasklets(void);
+#else
+# define init_tasklets() do { } while(0)
+#endif
 
 /* Untouched command line saved by arch-specific code. */
 char __initdata boot_command_line[COMMAND_LINE_SIZE];
@@ -706,6 +711,7 @@ static void __init do_basic_setup(void)
 {
 	/* drivers will send hotplug events */
 	init_workqueues();
+	init_tasklets();
 	usermodehelper_init();
 	driver_init();
 	init_irq_proc();
Index: linux-2.6-test/include/linux/tasklet.h
===================================================================
--- linux-2.6-test.orig/include/linux/tasklet.h
+++ linux-2.6-test/include/linux/tasklet.h
@@ -1,6 +1,10 @@
 #ifndef _LINUX_TASKLET_H
 #define _LINUX_TASKLET_H
 
-#include <linux/tasklet_softirq.h>
+#ifdef CONFIG_TASKLETS_AS_WORKQUEUES
+# include <linux/tasklet_work.h>
+#else
+# include <linux/tasklet_softirq.h>
+#endif
 
 #endif
Index: linux-2.6-test/include/linux/tasklet_work.h
===================================================================
--- /dev/null
+++ linux-2.6-test/include/linux/tasklet_work.h
@@ -0,0 +1,62 @@
+#ifndef _LINUX_WORK_TASKLET_H
+#define _LINUX_WORK_TASKLET_H
+
+#ifndef _LINUX_INTERRUPT_H
+# error "Do not include this header directly! use linux/interrupt.h"
+#endif
+
+#include <linux/workqueue.h>
+
+extern void work_tasklet_exec(struct work_struct *work);
+
+struct tasklet_struct
+{
+	struct work_struct work;
+	struct list_head list;
+	unsigned long state;
+	atomic_t count;
+	void (*func)(unsigned long);
+	unsigned long data;
+	char *n;
+};
+
+#define DECLARE_TASKLET(name, func, data)				\
+	struct tasklet_struct name = {					\
+		__WORK_INITIALIZER((name).work, work_tasklet_exec),	\
+		LIST_HEAD_INIT((name).list),				\
+		0,							\
+		ATOMIC_INIT(0),						\
+		func,							\
+		data,							\
+		#name							\
+	}
+
+#define DECLARE_TASKLET_DISABLED(name, func, data)			\
+	struct tasklet_struct name = {					\
+		__WORK_INITIALIZER((name).work, work_tasklet_exec),	\
+		LIST_HEAD_INIT((name).list),				\
+		0,							\
+		ATOMIC_INIT(1),						\
+		func,							\
+		data,							\
+ 		#name							\
+	}
+
+void tasklet_schedule(struct tasklet_struct *t);
+#define tasklet_hi_schedule tasklet_schedule
+extern fastcall void tasklet_enable(struct tasklet_struct *t);
+#define tasklet_hi_enable tasklet_enable
+
+void tasklet_disable_nosync(struct tasklet_struct *t);
+void tasklet_disable(struct tasklet_struct *t);
+
+extern int tasklet_is_scheduled(struct tasklet_struct *t);
+
+extern void tasklet_kill(struct tasklet_struct *t);
+extern void tasklet_kill_immediate(struct tasklet_struct *t, unsigned int cpu);
+extern void tasklet_init(struct tasklet_struct *t,
+			 void (*func)(unsigned long), unsigned long data);
+void takeover_tasklets(unsigned int cpu);
+
+
+#endif /* _LINUX_WORK_TASKLET_H */
Index: linux-2.6-test/kernel/tasklet_work.c
===================================================================
--- /dev/null
+++ linux-2.6-test/kernel/tasklet_work.c
@@ -0,0 +1,138 @@
+/*
+ *	linux/kernel/tasklet_work.c
+ *
+ *	Copyright (C) 2007 Steven Rostedt, Red Hat
+ *
+ */
+
+#include <linux/interrupt.h>
+
+static struct workqueue_struct *ktaskletd_wq;
+
+enum
+{
+	TASKLET_STATE_SCHED,	/* Tasklet is scheduled for execution */
+	TASKLET_STATE_RUN,	/* Tasklet is running (SMP only) */
+	TASKLET_STATE_PENDING	/* Tasklet is pending */
+};
+
+#define TASKLET_STATEF_SCHED	(1 << TASKLET_STATE_SCHED)
+#define TASKLET_STATEF_RUN	(1 << TASKLET_STATE_RUN)
+#define TASKLET_STATEF_PENDING	(1 << TASKLET_STATE_PENDING)
+
+void tasklet_schedule(struct tasklet_struct *t)
+{
+	BUG_ON(!ktaskletd_wq);
+	pr_debug("scheduling tasklet %s %p\n", t->n, t);
+	queue_work(ktaskletd_wq, &t->work);
+}
+
+EXPORT_SYMBOL(tasklet_schedule);
+
+int tasklet_is_scheduled(struct tasklet_struct *t)
+{
+	int ret;
+	ret = work_pending(&t->work);
+	pr_debug("sched %s pending=%d\n", t->n, ret);
+	return ret;
+}
+
+EXPORT_SYMBOL(tasklet_is_scheduled);
+
+void tasklet_disable_nosync(struct tasklet_struct *t)
+{
+	pr_debug("disable tasklet %s %p\n", t->n, t);
+	atomic_inc(&t->count);
+	smp_mb__after_atomic_inc();
+}
+
+EXPORT_SYMBOL(tasklet_disable_nosync);
+
+void tasklet_disable(struct tasklet_struct *t)
+{
+	tasklet_disable_nosync(t);
+	pr_debug("flush tasklet %s %p\n", t->n, t);
+	flush_workqueue(ktaskletd_wq);
+	smp_mb();
+}
+
+EXPORT_SYMBOL(tasklet_disable);
+
+void work_tasklet_exec(struct work_struct *work)
+{
+	struct tasklet_struct *t =
+		container_of(work, struct tasklet_struct, work);
+
+	if (unlikely(atomic_read(&t->count))) {
+		pr_debug("tasklet disabled %s %p\n", t->n, t);
+		set_bit(TASKLET_STATE_PENDING, &t->state);
+		smp_mb();
+		/* make sure we were not just enabled */
+		if (likely(atomic_read(&t->count)))
+			goto out;
+		clear_bit(TASKLET_STATE_PENDING, &t->state);
+	}
+
+	local_bh_disable();
+	pr_debug("run tasklet %s %p\n", t->n, t);
+	t->func(t->data);
+	local_bh_enable();
+
+out:
+	return;
+}
+
+EXPORT_SYMBOL(work_tasklet_exec);
+
+void __init softirq_init(void)
+{
+}
+
+void init_tasklets(void)
+{
+	ktaskletd_wq = create_workqueue("tasklets");
+	BUG_ON(!ktaskletd_wq);
+}
+
+void takeover_tasklets(unsigned int cpu)
+{
+	pr_debug("Implement takeover tasklets??\n");
+}
+
+void tasklet_init(struct tasklet_struct *t,
+		  void (*func)(unsigned long), unsigned long data)
+{
+	INIT_WORK(&t->work, work_tasklet_exec);
+	INIT_LIST_HEAD(&t->list);
+	t->state = 0;
+	atomic_set(&t->count, 0);
+	t->func = func;
+	t->data = data;
+	t->n = "anonymous";
+	pr_debug("anonymous tasklet %p set at %p\n",
+		t, __builtin_return_address(0));
+}
+
+EXPORT_SYMBOL(tasklet_init);
+
+void fastcall tasklet_enable(struct tasklet_struct *t)
+{
+	pr_debug("enable tasklet %s (count was %d)\n",
+		 t->n, atomic_read(&t->count));
+	if (!atomic_dec_and_test(&t->count))
+		return;
+	if (test_and_clear_bit(TASKLET_STATE_PENDING, &t->state)) {
+		pr_debug("tasklet %s was pending\n", t->n);
+		tasklet_schedule(t);
+	}
+}
+
+EXPORT_SYMBOL(tasklet_enable);
+
+void tasklet_kill(struct tasklet_struct *t)
+{
+	pr_debug("kill tasklet %s\n", t->n);
+	flush_workqueue(ktaskletd_wq);
+}
+
+EXPORT_SYMBOL(tasklet_kill);

-- 

^ permalink raw reply	[flat|nested] 127+ messages in thread

* Re: [RFC PATCH 4/6] Make DRM use the tasklet is-sched API
  2007-06-22  4:00 ` [RFC PATCH 4/6] Make DRM use the tasklet is-sched API Steven Rostedt
@ 2007-06-22  6:36   ` Daniel Walker
  2007-06-22  6:49     ` Thomas Gleixner
  0 siblings, 1 reply; 127+ messages in thread
From: Daniel Walker @ 2007-06-22  6:36 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: LKML, Linus Torvalds, Ingo Molnar, Andrew Morton,
	Thomas Gleixner, Christoph Hellwig, john stultz, Oleg Nesterov,
	Paul E. McKenney, Dipankar Sarma, David S. Miller,
	matthew.wilcox, kuznet

On Fri, 2007-06-22 at 00:00 -0400, Steven Rostedt wrote:
> plain text document attachment (tasklet-driver-hacks.patch)
> Update the DRM driver to use the new tasklet API, which does not rely
> on the tasklet implementation details.
> 
> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
> 
> 
> Index: linux-2.6.21-rt9/drivers/char/drm/drm_irq.c
> ===================================================================
> --- linux-2.6.21-rt9.orig/drivers/char/drm/drm_irq.c
> +++ linux-2.6.21-rt9/drivers/char/drm/drm_irq.c
> @@ -461,7 +461,7 @@ void drm_locked_tasklet(drm_device_t *de
>  	static DECLARE_TASKLET(drm_tasklet, drm_locked_tasklet_func, 0);
>  
>  	if (!drm_core_check_feature(dev, DRIVER_HAVE_IRQ) ||
> -	    test_bit(TASKLET_STATE_SCHED, &drm_tasklet.state))
> +	    tasklet_is_scheduled(&drm_tasklet))
>  		return;
>  
>  	spin_lock_irqsave(&dev->tasklet_lock, irqflags);
> 


No sense in having a patch just for this, may as well merge this with
patch 3 ..

Daniel


^ permalink raw reply	[flat|nested] 127+ messages in thread

* Re: [RFC PATCH 4/6] Make DRM use the tasklet is-sched API
  2007-06-22  6:36   ` Daniel Walker
@ 2007-06-22  6:49     ` Thomas Gleixner
  2007-06-22  7:08       ` Daniel Walker
  2007-06-22 16:10       ` Arnd Bergmann
  0 siblings, 2 replies; 127+ messages in thread
From: Thomas Gleixner @ 2007-06-22  6:49 UTC (permalink / raw)
  To: Daniel Walker
  Cc: Steven Rostedt, LKML, Linus Torvalds, Ingo Molnar, Andrew Morton,
	Christoph Hellwig, john stultz, Oleg Nesterov, Paul E. McKenney,
	Dipankar Sarma, David S. Miller, matthew.wilcox, kuznet

On Thu, 2007-06-21 at 23:36 -0700, Daniel Walker wrote:
> On Fri, 2007-06-22 at 00:00 -0400, Steven Rostedt wrote:
> > plain text document attachment (tasklet-driver-hacks.patch)
> > Update the DRM driver to use the new tasklet API, which does not rely
> > on the tasklet implementation details.
> > 
> > Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
> > 
> > 
> > Index: linux-2.6.21-rt9/drivers/char/drm/drm_irq.c
> > ===================================================================
> > --- linux-2.6.21-rt9.orig/drivers/char/drm/drm_irq.c
> > +++ linux-2.6.21-rt9/drivers/char/drm/drm_irq.c
> > @@ -461,7 +461,7 @@ void drm_locked_tasklet(drm_device_t *de
> >  	static DECLARE_TASKLET(drm_tasklet, drm_locked_tasklet_func, 0);
> >  
> >  	if (!drm_core_check_feature(dev, DRIVER_HAVE_IRQ) ||
> > -	    test_bit(TASKLET_STATE_SCHED, &drm_tasklet.state))
> > +	    tasklet_is_scheduled(&drm_tasklet))
> >  		return;
> >  
> >  	spin_lock_irqsave(&dev->tasklet_lock, irqflags);
> > 
> 
> 
> No sense in having a patch just for this, may as well merge this with
> patch 3 ..

Wrong. Patch 3 adds the API and this one makes use of it. Steven's split
makes perfect sense.

	tglx



^ permalink raw reply	[flat|nested] 127+ messages in thread

* Re: [RFC PATCH 6/6] Convert tasklets to work queues
  2007-06-22  4:00 ` [RFC PATCH 6/6] Convert tasklets to work queues Steven Rostedt
@ 2007-06-22  7:06   ` Daniel Walker
  2007-06-22 13:29     ` Steven Rostedt
  2007-06-23 11:15   ` Arnd Bergmann
  1 sibling, 1 reply; 127+ messages in thread
From: Daniel Walker @ 2007-06-22  7:06 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: LKML, Linus Torvalds, Ingo Molnar, Andrew Morton,
	Thomas Gleixner, Christoph Hellwig, john stultz, Oleg Nesterov,
	Paul E. McKenney, Dipankar Sarma, David S. Miller,
	matthew.wilcox, kuznet

On Fri, 2007-06-22 at 00:00 -0400, Steven Rostedt wrote:
> plain text document attachment (tasklets-to-workqueues.patch)
> This patch creates an alternative for drivers from using tasklets.
> It creates a "work_tasklet". When configured to use work_tasklets
> instead of tasklets, instead of creating tasklets, a work queue
> is made in its place.  The API is still the same, and the drivers
> don't know that a work queue is being used.
> 
> This is (for now) a proof of concept approach to using work queues
> instead of tasklets.  More can be done. For one thing, we could make
> individual work queues for each tasklet instead of using the global
> ktasklet_wq.  But for now it's just easier to do this.
> 
> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
> 
> 
> Index: linux-2.6-test/kernel/Kconfig.preempt
> ===================================================================
> --- linux-2.6-test.orig/kernel/Kconfig.preempt
> +++ linux-2.6-test/kernel/Kconfig.preempt
> @@ -63,3 +63,18 @@ config PREEMPT_BKL
>  	  Say Y here if you are building a kernel for a desktop system.
>  	  Say N if you are unsure.
>  
> +config TASKLETS_AS_WORKQUEUES
> +	bool "Treat tasklets as workqueues (EXPERIMENTAL)"
> +	depends on EXPERIMENTAL
> +	help
> +	  Tasklets are an old solution to an old problem with respect
> +	  to SMP.  But today they are not necessary anymore.
> +	  There are better solutions to the problem that tasklets
> +	  are trying to solve.
> +
> +	  This option converts tasklets into workqueues and
> +	  removes the tasklet softirq.  This is a clean up until
> +	  we get rid of all tasklets that are currently in
> +	  the kernel.
> +
> +	  Say Y if you are unsure and brave (not very well tested code!).
> Index: linux-2.6-test/kernel/Makefile
> ===================================================================
> --- linux-2.6-test.orig/kernel/Makefile
> +++ linux-2.6-test/kernel/Makefile
> @@ -21,7 +21,11 @@ obj-$(CONFIG_FUTEX) += futex.o
>  ifeq ($(CONFIG_COMPAT),y)
>  obj-$(CONFIG_FUTEX) += futex_compat.o
>  endif
> +ifeq ($(CONFIG_TASKLETS_AS_WORKQUEUES), y)
> +obj-y += tasklet_work.o
> +else
>  obj-y += tasklet.o
> +endif
>  obj-$(CONFIG_RT_MUTEXES) += rtmutex.o
>  obj-$(CONFIG_DEBUG_RT_MUTEXES) += rtmutex-debug.o
>  obj-$(CONFIG_RT_MUTEX_TESTER) += rtmutex-tester.o
> Index: linux-2.6-test/init/main.c
> ===================================================================
> --- linux-2.6-test.orig/init/main.c
> +++ linux-2.6-test/init/main.c
> @@ -121,6 +121,11 @@ extern void time_init(void);
>  /* Default late time init is NULL. archs can override this later. */
>  void (*late_time_init)(void);
>  extern void softirq_init(void);
> +#ifdef CONFIG_TASKLETS_AS_WORKQUEUES
> +  extern void init_tasklets(void);
> +#else
> +# define init_tasklets() do { } while(0)
> +#endif
>  
>  /* Untouched command line saved by arch-specific code. */
>  char __initdata boot_command_line[COMMAND_LINE_SIZE];
> @@ -706,6 +711,7 @@ static void __init do_basic_setup(void)
>  {
>  	/* drivers will send hotplug events */
>  	init_workqueues();
> +	init_tasklets();
>  	usermodehelper_init();
>  	driver_init();
>  	init_irq_proc();
> Index: linux-2.6-test/include/linux/tasklet.h
> ===================================================================
> --- linux-2.6-test.orig/include/linux/tasklet.h
> +++ linux-2.6-test/include/linux/tasklet.h
> @@ -1,6 +1,10 @@
>  #ifndef _LINUX_TASKLET_H
>  #define _LINUX_TASKLET_H
>  
> -#include <linux/tasklet_softirq.h>
> +#ifdef CONFIG_TASKLETS_AS_WORKQUEUES
> +# include <linux/tasklet_work.h>
> +#else
> +# include <linux/tasklet_softirq.h>
> +#endif
>  
>  #endif
> Index: linux-2.6-test/include/linux/tasklet_work.h
> ===================================================================
> --- /dev/null
> +++ linux-2.6-test/include/linux/tasklet_work.h
> @@ -0,0 +1,62 @@
> +#ifndef _LINUX_WORK_TASKLET_H
> +#define _LINUX_WORK_TASKLET_H
> +
> +#ifndef _LINUX_INTERRUPT_H
> +# error "Do not include this header directly! use linux/interrupt.h"
> +#endif
> +
> +#include <linux/workqueue.h>
> +
> +extern void work_tasklet_exec(struct work_struct *work);
> +
> +struct tasklet_struct
> +{
> +	struct work_struct work;
> +	struct list_head list;
> +	unsigned long state;
> +	atomic_t count;
> +	void (*func)(unsigned long);
> +	unsigned long data;
> +	char *n;
> +};
> +
> +#define DECLARE_TASKLET(name, func, data)				\
> +	struct tasklet_struct name = {					\
> +		__WORK_INITIALIZER((name).work, work_tasklet_exec),	\
> +		LIST_HEAD_INIT((name).list),				\
> +		0,							\
> +		ATOMIC_INIT(0),						\
> +		func,							\
> +		data,							\
> +		#name							\
> +	}
> +
> +#define DECLARE_TASKLET_DISABLED(name, func, data)			\
> +	struct tasklet_struct name = {					\
> +		__WORK_INITIALIZER((name).work, work_tasklet_exec),	\
> +		LIST_HEAD_INIT((name).list),				\
> +		0,							\
> +		ATOMIC_INIT(1),						\
> +		func,							\
> +		data,							\
> + 		#name							\
> +	}
> +
> +void tasklet_schedule(struct tasklet_struct *t);
> +#define tasklet_hi_schedule tasklet_schedule
> +extern fastcall void tasklet_enable(struct tasklet_struct *t);
> +#define tasklet_hi_enable tasklet_enable
> +
> +void tasklet_disable_nosync(struct tasklet_struct *t);
> +void tasklet_disable(struct tasklet_struct *t);
> +
> +extern int tasklet_is_scheduled(struct tasklet_struct *t);
> +
> +extern void tasklet_kill(struct tasklet_struct *t);
> +extern void tasklet_kill_immediate(struct tasklet_struct *t, unsigned int cpu);
> +extern void tasklet_init(struct tasklet_struct *t,
> +			 void (*func)(unsigned long), unsigned long data);
> +void takeover_tasklets(unsigned int cpu);
> +
> +
> +#endif /* _LINUX_WORK_TASKLET_H */
> Index: linux-2.6-test/kernel/tasklet_work.c
> ===================================================================
> --- /dev/null
> +++ linux-2.6-test/kernel/tasklet_work.c
> @@ -0,0 +1,138 @@
> +/*
> + *	linux/kernel/work_tasklet.c
> + *
> + *	Copyright (C) 2007 Steven Rostedt, Red Hat
> + *
> + */
> +
> +#include <linux/interrupt.h>
> +
> +static struct workqueue_struct *ktaskletd_wq;
> +
> +enum
> +{
> +	TASKLET_STATE_SCHED,	/* Tasklet is scheduled for execution */
> +	TASKLET_STATE_RUN,	/* Tasklet is running (SMP only) */
> +	TASKLET_STATE_PENDING	/* Tasklet is pending */
> +};
> +
> +#define TASKLET_STATEF_SCHED	(1 << TASKLET_STATE_SCHED)
> +#define TASKLET_STATEF_RUN	(1 << TASKLET_STATE_RUN)
> +#define TASKLET_STATEF_PENDING	(1 << TASKLET_STATE_PENDING)
> +
> +void tasklet_schedule(struct tasklet_struct *t)
> +{
> +	BUG_ON(!ktaskletd_wq);
> +	pr_debug("scheduling tasklet %s %p\n", t->n, t);

I'd change these pr_debug lines to "tasklet : scheduling %s %p\n" for
readability ..
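
As a rough illustration of the suggested format, here is a userspace
sketch (not kernel code). TASKLET_PREFIX and model_tasklet_msg() are
made-up names, and snprintf() stands in for pr_debug():

```c
#include <stdio.h>

/* Hypothetical userspace stand-in for pr_debug(); not the kernel macro. */
#define TASKLET_PREFIX "tasklet: "

/* Format "<prefix><action> <name>" into buf; returns snprintf()'s result. */
static int model_tasklet_msg(char *buf, size_t len,
			     const char *action, const char *name)
{
	return snprintf(buf, len, TASKLET_PREFIX "%s %s", action, name);
}
```

With one shared prefix, every message from this file can be found with a
single grep.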

> +	queue_work(ktaskletd_wq, &t->work);
> +}
> +
> +EXPORT_SYMBOL(tasklet_schedule);
> +
> +int tasklet_is_scheduled(struct tasklet_struct *t)
> +{
> +	int ret;
> +	ret = work_pending(&t->work);
> +	pr_debug("sched %s pending=%d\n", t->n, ret);
> +	return ret;
> +}
> +
> +EXPORT_SYMBOL(tasklet_is_scheduled);
> +
> +void tasklet_disable_nosync(struct tasklet_struct *t)
> +{
> +	pr_debug("disable tasklet %s %p\n", t->n, t);
> +	atomic_inc(&t->count);
> +	smp_mb__after_atomic_inc();
> +}
> +
> +EXPORT_SYMBOL(tasklet_disable_nosync);
> +
> +void tasklet_disable(struct tasklet_struct *t)
> +{
> +	tasklet_disable_nosync(t);
> +	pr_debug("flush tasklet %s %p\n", t->n, t);
> +	flush_workqueue(ktaskletd_wq);
> +	smp_mb();
> +}
> +
> +EXPORT_SYMBOL(tasklet_disable);
> +
> +void work_tasklet_exec(struct work_struct *work)
> +{
> +	struct tasklet_struct *t =
> +		container_of(work, struct tasklet_struct, work);
> +
> +	if (unlikely(atomic_read(&t->count))) {
> +		pr_debug("tasklet disabled %s %p\n", t->n, t);
> +		set_bit(TASKLET_STATE_PENDING, &t->state);
> +		smp_mb();
> +		/* make sure we were not just enabled */
> +		if (likely(atomic_read(&t->count)))
> +			goto out;
> +		clear_bit(TASKLET_STATE_PENDING, &t->state);

smp_mb__before_clear_bit ?
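
For context on the question above, the disable/enable handshake in the
quoted code can be modeled in userspace roughly as follows. This is only
a sketch with C11 atomics, which are sequentially consistent by default,
so it does not settle whether the kernel code needs an explicit barrier
before clear_bit(). All model_* names are made up for illustration.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Userspace model only; fields mirror the patch but this is not kernel code. */
struct model_tasklet {
	atomic_int count;	/* > 0 means disabled */
	atomic_bool pending;	/* stands in for TASKLET_STATE_PENDING */
	int runs;		/* how many times the handler ran */
	int requeues;		/* how many times enable re-queued the work */
};

static void model_init(struct model_tasklet *t, int count)
{
	atomic_init(&t->count, count);
	atomic_init(&t->pending, false);
	t->runs = 0;
	t->requeues = 0;
}

/* Models work_tasklet_exec(): returns true if the handler ran. */
static bool model_exec(struct model_tasklet *t)
{
	if (atomic_load(&t->count) != 0) {
		/* seq_cst store then load: the role of set_bit() + smp_mb() */
		atomic_store(&t->pending, true);
		if (atomic_load(&t->count) != 0)
			return false;	/* still disabled: stay pending */
		atomic_store(&t->pending, false);	/* enabled meanwhile */
	}
	t->runs++;
	return true;
}

/* Models tasklet_enable(): returns true if the work was re-queued. */
static bool model_enable(struct model_tasklet *t)
{
	if (atomic_fetch_sub(&t->count, 1) != 1)
		return false;	/* someone else still holds it disabled */
	if (atomic_exchange(&t->pending, false)) {
		t->requeues++;
		return true;
	}
	return false;
}
```

In this model, a tasklet that is executed while disabled is parked as
pending, and the enable call that drops the count to zero re-queues it.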

> +	}
> +
> +	local_bh_disable();
> +	pr_debug("run tasklet %s %p\n", t->n, t);
> +	t->func(t->data);
> +	local_bh_enable();
> +
> +out:
> +	return;
> +}
> +
> +EXPORT_SYMBOL(work_tasklet_exec);
> +
> +void __init softirq_init(void)
> +{
> +}

ifdef's ?

> +void init_tasklets(void)
> +{
> +	ktaskletd_wq = create_workqueue("tasklets");
> +	BUG_ON(!ktaskletd_wq);
> +}
> +
> +void takeover_tasklets(unsigned int cpu)
> +{
> +	pr_debug("Implement takeover tasklets??\n");
> +}

hmm .. Looks like it's for migration of tasklets .. I take it you're not
sure that's handled ? Try cpu hotplug ..
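
To make the migration idea concrete: the softirq-based tasklet code, as
far as can be told, handles a dead CPU by appending its pending list to
the current CPU's list. The userspace sketch below models just that
splice; model_item and model_takeover() are made-up names. Whether the
work queue version needs an equivalent, or whether the workqueue code's
own hotplug handling already covers it, remains the open question.

```c
#include <stddef.h>

/* Userspace model of a pending-work list entry; not a kernel structure. */
struct model_item {
	struct model_item *next;
	int id;
};

/*
 * Append every item queued on the dead CPU's list to the tail of the
 * live CPU's list, and leave the dead CPU's list empty.
 */
static void model_takeover(struct model_item **live, struct model_item **dead)
{
	struct model_item **tail = live;

	/* Walk to the terminating NULL pointer of the live list. */
	while (*tail)
		tail = &(*tail)->next;
	*tail = *dead;
	*dead = NULL;
}
```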

> +void tasklet_init(struct tasklet_struct *t,
> +		  void (*func)(unsigned long), unsigned long data)
> +{
> +	INIT_WORK(&t->work, work_tasklet_exec);
> +	INIT_LIST_HEAD(&t->list);
> +	t->state = 0;
> +	atomic_set(&t->count, 0);
> +	t->func = func;
> +	t->data = data;
> +	t->n = "anonymous";

Is this "anonymous" just used for debugging ? If so, you could fill it
with a kallsyms lookup of the __builtin_return_address() ..

> +	pr_debug("anonymous tasklet %p set at %p\n",
> +		t, __builtin_return_address(0));
> +}
> +
> +EXPORT_SYMBOL(tasklet_init);
> +
> +void fastcall tasklet_enable(struct tasklet_struct *t)
> +{
> +	pr_debug("enable tasklet %s (count was %d)\n",
> +		 t->n, atomic_read(&t->count));
> +	if (!atomic_dec_and_test(&t->count))
> +		return;
> +	if (test_and_clear_bit(TASKLET_STATE_PENDING, &t->state)) {
> +		pr_debug("tasklet %s was pending\n", t->n);
> +		tasklet_schedule(t);
> +	}
> +}
> +
> +EXPORT_SYMBOL(tasklet_enable);
> +
> +void tasklet_kill(struct tasklet_struct *t)
> +{
> +	pr_debug("kill tasklet %s\n", t->n);
> +	flush_workqueue(ktaskletd_wq);
> +}
> +
> +EXPORT_SYMBOL(tasklet_kill);
> 


^ permalink raw reply	[flat|nested] 127+ messages in thread

* Re: [RFC PATCH 4/6] Make DRM use the tasklet is-sched API
  2007-06-22  6:49     ` Thomas Gleixner
@ 2007-06-22  7:08       ` Daniel Walker
  2007-06-22 12:15         ` Steven Rostedt
  2007-06-22 16:10       ` Arnd Bergmann
  1 sibling, 1 reply; 127+ messages in thread
From: Daniel Walker @ 2007-06-22  7:08 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: Steven Rostedt, LKML, Linus Torvalds, Ingo Molnar, Andrew Morton,
	Christoph Hellwig, john stultz, Oleg Nesterov, Paul E. McKenney,
	Dipankar Sarma, David S. Miller, matthew.wilcox, kuznet

On Fri, 2007-06-22 at 08:49 +0200, Thomas Gleixner wrote:
> On Thu, 2007-06-21 at 23:36 -0700, Daniel Walker wrote:
> > On Fri, 2007-06-22 at 00:00 -0400, Steven Rostedt wrote:
> > > plain text document attachment (tasklet-driver-hacks.patch)
> > > Update the DRM driver to use the new tasklet API, which does not rely
> > > on the tasklet implementation details.
> > > 
> > > Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
> > > 
> > > 
> > > Index: linux-2.6.21-rt9/drivers/char/drm/drm_irq.c
> > > ===================================================================
> > > --- linux-2.6.21-rt9.orig/drivers/char/drm/drm_irq.c
> > > +++ linux-2.6.21-rt9/drivers/char/drm/drm_irq.c
> > > @@ -461,7 +461,7 @@ void drm_locked_tasklet(drm_device_t *de
> > >  	static DECLARE_TASKLET(drm_tasklet, drm_locked_tasklet_func, 0);
> > >  
> > >  	if (!drm_core_check_feature(dev, DRIVER_HAVE_IRQ) ||
> > > -	    test_bit(TASKLET_STATE_SCHED, &drm_tasklet.state))
> > > +	    tasklet_is_scheduled(&drm_tasklet))
> > >  		return;
> > >  
> > >  	spin_lock_irqsave(&dev->tasklet_lock, irqflags);
> > > 
> > 
> > 
> > No sense in having a patch just for this, may as well merge this with
> > patch 3 ..
> 
> Wrong. patch 3 adds the API and this one makes use of it. Steven's split
> makes perfect sense.

Clearly it doesn't make sense to me ;) .. The patches are too small to
split them up that way ..

Daniel


^ permalink raw reply	[flat|nested] 127+ messages in thread

* Re: [RFC PATCH 0/6] Convert all tasklets to workqueues
  2007-06-22  4:00 [RFC PATCH 0/6] Convert all tasklets to workqueues Steven Rostedt
                   ` (5 preceding siblings ...)
  2007-06-22  4:00 ` [RFC PATCH 6/6] Convert tasklets to work queues Steven Rostedt
@ 2007-06-22  7:09 ` Christoph Hellwig
  2007-06-22  7:51   ` Ingo Molnar
  2007-06-22 12:32   ` Steven Rostedt
  2007-06-22 14:25 ` Arjan van de Ven
                   ` (2 subsequent siblings)
  9 siblings, 2 replies; 127+ messages in thread
From: Christoph Hellwig @ 2007-06-22  7:09 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: LKML, Linus Torvalds, Ingo Molnar, Andrew Morton,
	Thomas Gleixner, Christoph Hellwig, john stultz, Oleg Nesterov,
	Paul E. McKenney, Dipankar Sarma, David S. Miller,
	matthew.wilcox, kuznet

On Fri, Jun 22, 2007 at 12:00:14AM -0400, Steven Rostedt wrote:
> For the most part, tasklets today are not used for time critical functions.
> Running tasklets in thread context is not harmful to performance of
> the overall system. But running them in interrupt context is, since
> they increase the overall latency for high priority tasks.

I think we probably want some numbers, at least for tasklets used in
potentially performance critical code.

> Even in Matthew's paper, he says that work queues have replaced tasklets.
> But this is not truly the case.  Tasklets are common and plentiful.
> But to go and replace each driver that uses a tasklet with a work queue
> would be very painful.
> 
> I've developed this way to replace all tasklets with work queues without
> having to change all the drivers that use them.  I created an API that
> uses the tasklet API as a wrapper to a work queue.  This API doesn't need
> to be permanent. It shows 1) that work queues can replace tasklets, and
> 2) we can remove a duplicate functionality from the kernel.  This API
> only needs to be around until we removed all uses of tasklets from
> all drivers.

I don't like this wrapping at all.  What you're doing is a tradeoff to
do less work today in exchange for more maintenance overhead and more crufty
code in the future.  So while I sympathize a lot with trying to get rid
of tasklets I'd rather convert individual drivers over until
all users are gone.  It's not exactly a very complicated conversion either.
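
A rough user-space sketch of what such a per-driver conversion involves
is below.  Everything in it is a stand-in: struct foo_dev, foo_rx_work()
and the mock_*() helpers are hypothetical names, and the mock work_struct
only mimics the shape of the kernel's INIT_WORK()/schedule_work()
interface.  It illustrates the main mechanical change: a tasklet handler
receives an unsigned long cookie, while a work handler receives the work
item and recovers its owner with container_of().

```c
#include <stddef.h>

/* Minimal user-space stand-in; this is NOT the kernel definition. */
struct work_struct {
	void (*func)(struct work_struct *work);
};

#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Hypothetical driver state. */
struct foo_dev {
	int irq_events;              /* events handled so far */
	struct work_struct rx_work;  /* was: struct tasklet_struct rx_tasklet */
};

/*
 * Before (tasklet style, shown only as a comment):
 *	void foo_rx_tasklet(unsigned long data)
 *	{
 *		struct foo_dev *dev = (struct foo_dev *)data;
 *		...
 *	}
 *
 * After: the handler recovers its device from the embedded work item.
 */
static void foo_rx_work(struct work_struct *work)
{
	struct foo_dev *dev = container_of(work, struct foo_dev, rx_work);

	dev->irq_events++;
}

/* Stand-in for INIT_WORK(): record the handler. */
static void mock_init_work(struct work_struct *work,
			   void (*func)(struct work_struct *))
{
	work->func = func;
}

/*
 * Stand-in for schedule_work(): runs the handler synchronously here,
 * whereas the kernel would defer it to a worker thread.
 */
static void mock_schedule_work(struct work_struct *work)
{
	work->func(work);
}
```

In a real driver the handler would then run in process context, so any
locking that assumed softirq context would likely need a second look.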

> 6) This is the magic to make tasklets into work queues. It allows for
> the kernel to be configured either with the normal tasklets, as it is
> today, or with the tasklets-as-work-queues.

And this is something that might be fine for benchmarking, but not something
we should put in.  Keeping two wildly different implementations of core
functionality with very different behaviour around is quite bad.  Better
kill tasklets once and for all.


^ permalink raw reply	[flat|nested] 127+ messages in thread

* Re: [RFC PATCH 1/6] Convert the RCU tasklet into a softirq
  2007-06-22  4:00 ` [RFC PATCH 1/6] Convert the RCU tasklet into a softirq Steven Rostedt
@ 2007-06-22  7:10   ` Christoph Hellwig
  2007-06-22  7:43     ` Ingo Molnar
  0 siblings, 1 reply; 127+ messages in thread
From: Christoph Hellwig @ 2007-06-22  7:10 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: LKML, Linus Torvalds, Ingo Molnar, Andrew Morton,
	Thomas Gleixner, Christoph Hellwig, john stultz, Oleg Nesterov,
	Paul E. McKenney, Dipankar Sarma, David S. Miller,
	matthew.wilcox, kuznet

On Fri, Jun 22, 2007 at 12:00:15AM -0400, Steven Rostedt wrote:
> I believe this was originally done by Dipankar Sarma. I pulled these
> changes from the -rt kernel.
> 
> For better performance, RCU should use a softirq instead of a
> tasklet.

I was under the impression we had merged this a while ago due to
tasklet bottlenecks on big machines.  Any reason why it has been delayed
so long?

^ permalink raw reply	[flat|nested] 127+ messages in thread

* Re: [RFC PATCH 2/6] Split out tasklets from softirq.c
  2007-06-22  4:00 ` [RFC PATCH 2/6] Split out tasklets from softirq.c Steven Rostedt
@ 2007-06-22  7:11   ` Christoph Hellwig
  2007-06-22 12:40     ` Steven Rostedt
  2007-06-22 13:45   ` Akinobu Mita
  1 sibling, 1 reply; 127+ messages in thread
From: Christoph Hellwig @ 2007-06-22  7:11 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: LKML, Linus Torvalds, Ingo Molnar, Andrew Morton,
	Thomas Gleixner, Christoph Hellwig, john stultz, Oleg Nesterov,
	Paul E. McKenney, Dipankar Sarma, David S. Miller,
	matthew.wilcox, kuznet

On Fri, Jun 22, 2007 at 12:00:16AM -0400, Steven Rostedt wrote:
> Tasklets are really a separate entity from softirqs, so they
> deserve their own file. Also this allows us to easily replace
> tasklets for something else ;-)

It's a bit pointless when softirq.h still always includes it.  A while
ago I had a patch that split it out and made all users include it directly.
But reviving this patch would be rather pointless if we just want to kill
tasklets in the end anyway.


^ permalink raw reply	[flat|nested] 127+ messages in thread

* Re: [RFC PATCH 1/6] Convert the RCU tasklet into a softirq
  2007-06-22  7:10   ` Christoph Hellwig
@ 2007-06-22  7:43     ` Ingo Molnar
  2007-06-22 12:35       ` Steven Rostedt
  0 siblings, 1 reply; 127+ messages in thread
From: Ingo Molnar @ 2007-06-22  7:43 UTC (permalink / raw)
  To: Christoph Hellwig, Steven Rostedt, LKML, Linus Torvalds,
	Andrew Morton, Thomas Gleixner, john stultz, Oleg Nesterov,
	Paul E. McKenney, Dipankar Sarma, David S. Miller,
	matthew.wilcox, kuznet


* Christoph Hellwig <hch@infradead.org> wrote:

> On Fri, Jun 22, 2007 at 12:00:15AM -0400, Steven Rostedt wrote:
> > I believe this was originally done by Dipankar Sarma. I pulled these
> > changes from the -rt kernel.
> > 
> > For better performance, RCU should use a softirq instead of a
> > tasklet.
> 
> I was under the impression we had merged this a while ago due to
> tasklet bottlenecks on big machines.  Any reason why it has been 
> delayed so long?

i thought so too, up until a few weeks ago. I think i might have mixed 
it up mentally with the scheduler rebalancing tasklet: that was 
converted to softirqs on my request and made me forget about the RCU 
tasklet thing.

	Ingo

^ permalink raw reply	[flat|nested] 127+ messages in thread

* Re: [RFC PATCH 0/6] Convert all tasklets to workqueues
  2007-06-22  7:09 ` [RFC PATCH 0/6] Convert all tasklets to workqueues Christoph Hellwig
@ 2007-06-22  7:51   ` Ingo Molnar
  2007-06-22  7:53     ` Christoph Hellwig
  2007-06-22 12:32   ` Steven Rostedt
  1 sibling, 1 reply; 127+ messages in thread
From: Ingo Molnar @ 2007-06-22  7:51 UTC (permalink / raw)
  To: Christoph Hellwig, Steven Rostedt, LKML, Linus Torvalds,
	Andrew Morton, Thomas Gleixner, john stultz, Oleg Nesterov,
	Paul E. McKenney, Dipankar Sarma, David S. Miller,
	matthew.wilcox, kuznet


* Christoph Hellwig <hch@infradead.org> wrote:

> I think we probably want some numbers, at least for tasklets used in 
> potentially performance critical code.

which actual in-kernel tasklets do you have in mind? I'm not aware of
any in performance critical code. (now that both the RCU and the sched
tasklets have been fixed.)

	Ingo

^ permalink raw reply	[flat|nested] 127+ messages in thread

* Re: [RFC PATCH 0/6] Convert all tasklets to workqueues
  2007-06-22  7:51   ` Ingo Molnar
@ 2007-06-22  7:53     ` Christoph Hellwig
  2007-06-22 11:23       ` Ingo Molnar
  0 siblings, 1 reply; 127+ messages in thread
From: Christoph Hellwig @ 2007-06-22  7:53 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Christoph Hellwig, Steven Rostedt, LKML, Linus Torvalds,
	Andrew Morton, Thomas Gleixner, john stultz, Oleg Nesterov,
	Paul E. McKenney, Dipankar Sarma, David S. Miller,
	matthew.wilcox, kuznet

On Fri, Jun 22, 2007 at 09:51:35AM +0200, Ingo Molnar wrote:
> 
> * Christoph Hellwig <hch@infradead.org> wrote:
> 
> > I think we probably want some numbers, at least for tasklets used in 
> > potentially performance critical code.
> 
> which actual in-kernel tasklets do you have in mind? I'm not aware of 
> any in performance critical code. (now that both the RCU and the sched 
> tasklet has been fixed.)

the one in megaraid_sas for example is in a performance-critical path
and was put in quite recently.

^ permalink raw reply	[flat|nested] 127+ messages in thread

* Re: [RFC PATCH 0/6] Convert all tasklets to workqueues
  2007-06-22  7:53     ` Christoph Hellwig
@ 2007-06-22 11:23       ` Ingo Molnar
  0 siblings, 0 replies; 127+ messages in thread
From: Ingo Molnar @ 2007-06-22 11:23 UTC (permalink / raw)
  To: Christoph Hellwig, Steven Rostedt, LKML, Linus Torvalds,
	Andrew Morton, Thomas Gleixner, john stultz, Oleg Nesterov,
	Paul E. McKenney, Dipankar Sarma, David S. Miller,
	matthew.wilcox, kuznet


* Christoph Hellwig <hch@infradead.org> wrote:

> > which actual in-kernel tasklets do you have in mind? I'm not aware 
> > of any in performance critical code. (now that both the RCU and the 
> > sched tasklet has been fixed.)
> 
> the one in megaraid_sas for example is in a performance-critical path 
> and was put in quite recently.

ah, i thought core kernel. I took a look, and i doubt that doing the 
megaraid SAS thing over a kernel thread would show up on the radar. If 
it does then it should probably be done in its separate per-CPU 
workqueue anyway, not in a globally scheduled thing like a tasklet. 
Tasklets just don't scale.

	Ingo

^ permalink raw reply	[flat|nested] 127+ messages in thread

* Re: [RFC PATCH 4/6] Make DRM use the tasklet is-sched API
  2007-06-22  7:08       ` Daniel Walker
@ 2007-06-22 12:15         ` Steven Rostedt
  2007-06-22 15:36           ` Daniel Walker
  0 siblings, 1 reply; 127+ messages in thread
From: Steven Rostedt @ 2007-06-22 12:15 UTC (permalink / raw)
  To: Daniel Walker
  Cc: Thomas Gleixner, LKML, Linus Torvalds, Ingo Molnar,
	Andrew Morton, Christoph Hellwig, john stultz, Oleg Nesterov,
	Paul E. McKenney, Dipankar Sarma, David S. Miller,
	matthew.wilcox, kuznet

On Fri, 2007-06-22 at 00:08 -0700, Daniel Walker wrote:

> > > No sense in having a patch just for this, may as well merge this with
> > > patch 3 ..
> > 
> > Wrong. patch 3 adds the API and this one makes use of it. Steven's split
> > makes perfect sense.
> 
> Clearly it doesn't make sense to me ;) .. The patches are too small to
> split them up that way ..

Daniel, you of all people should know. It's not the size of the patch
that matters, it's the way you use the patch ;-)


No, these two patches should *not* be merged into one. If these are sitting
in -mm, and someone were to change the DRM to not use the API and
someone else changed their driver to use the API, then what?  Does
Andrew keep these maintenance patches on top of each other?

The split lets the DRM patch be dropped or replaced while keeping the
API patch around in case another driver uses the API.

The two patches have two different objectives, even though they are
related and currently on a 1 to 1 basis. Regardless, the patches should
stay separate.
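
For reference, the kind of accessor the DRM hunk relies on could look
like the user-space sketch below.  The real definition lives in patch 3
and is not quoted in this message, so the body here is an assumption
based on the test_bit(TASKLET_STATE_SCHED, ...) call that the hunk
replaces; the sketch_* names and the bit value are illustrative only.

```c
#include <stdbool.h>

/* Models the TASKLET_STATE_SCHED bit number; the value is illustrative. */
enum { SKETCH_TASKLET_STATE_SCHED = 0 };

/* Stand-in for struct tasklet_struct: only the state word matters here. */
struct sketch_tasklet {
	unsigned long state;
};

/* Callers ask a question instead of poking at t->state themselves. */
static inline bool sketch_tasklet_is_scheduled(const struct sketch_tasklet *t)
{
	return (t->state >> SKETCH_TASKLET_STATE_SCHED) & 1UL;
}
```

Hiding the bit test behind a helper like this is what lets the
implementation underneath change without touching the driver again.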

-- Steve



^ permalink raw reply	[flat|nested] 127+ messages in thread

* Re: [RFC PATCH 0/6] Convert all tasklets to workqueues
  2007-06-22  7:09 ` [RFC PATCH 0/6] Convert all tasklets to workqueues Christoph Hellwig
  2007-06-22  7:51   ` Ingo Molnar
@ 2007-06-22 12:32   ` Steven Rostedt
  2007-06-22 12:38     ` Ingo Molnar
  1 sibling, 1 reply; 127+ messages in thread
From: Steven Rostedt @ 2007-06-22 12:32 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: LKML, Linus Torvalds, Ingo Molnar, Andrew Morton,
	Thomas Gleixner, john stultz, Oleg Nesterov, Paul E. McKenney,
	Dipankar Sarma, David S. Miller, kuznet

Christoph,

Thanks for taking the time to look at my patches!


On Fri, 2007-06-22 at 08:09 +0100, Christoph Hellwig wrote:
> > I've developed this way to replace all tasklets with work queues without
> > having to change all the drivers that use them.  I created an API that
> > uses the tasklet API as a wrapper to a work queue.  This API doesn't need
> > to be permanent. It shows 1) that work queues can replace tasklets, and
> > 2) we can remove a duplicate functionality from the kernel.  This API
> > only needs to be around until we removed all uses of tasklets from
> > all drivers.
> 
> I don't like this wrapping at all.  What you're doing is a tradeoff to
> do less work today in exchange for more maintenance overhead and more crufty
> code in the future.  So while I sympathize a lot with trying to get rid
> of tasklets I'd rather prefer to convert individual drivers over until
> all users are gone.  It's not exactly a very complicated conversion either.
> 
> > 6) This is the magic to make tasklets into work queues. It allows for
> > the kernel to be configured either with the normal tasklets, as it is
> > today, or with the tasklets-as-work-queues.
> 
> And this is something that might be fine for benchmarking, but not something
> we should put in.  Keeping two wildly different implementation of core
> functionality with very different behaviour around is quite bad.  Better
> kill tasklets once for all.

Honestly, I highly doubted that this would make it up to Linus's tree.
My aim was to get this into -mm, and perhaps even turn on the
tasklets-as-workqueues as default.

The objective of these patches was more of a proof-of-concept, showing
that the tasklets are indeed obsolete today.  By putting this patch set
into -mm, we can get a good idea of the regressions that this conversion
would cause, hopefully before we did the real change of converting a
driver's tasklet into a work queue.

I only have a limited amount of hardware to test on. I believe this is a
good candidate for a permanent resident in -mm and can display the
effects of using work queues instead of tasklets in a wider arena.

This patch is too much of a kludge to put into Linus's tree, and if it
were in that tree, it would probably cause laziness and make it even
less likely that driver authors would get rid of their tasklets.  But I
argue it is perfect for the -mm tree, because it allows users to easily
(with a config option) turn it on and off and run benchmarks to see its
effect.


-- Steve


^ permalink raw reply	[flat|nested] 127+ messages in thread

* Re: [RFC PATCH 1/6] Convert the RCU tasklet into a softirq
  2007-06-22  7:43     ` Ingo Molnar
@ 2007-06-22 12:35       ` Steven Rostedt
  2007-06-22 12:55         ` Ingo Molnar
  0 siblings, 1 reply; 127+ messages in thread
From: Steven Rostedt @ 2007-06-22 12:35 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Christoph Hellwig, LKML, Linus Torvalds, Andrew Morton,
	Thomas Gleixner, john stultz, Oleg Nesterov, Paul E. McKenney,
	Dipankar Sarma, David S. Miller, matthew.wilcox, kuznet

On Fri, 2007-06-22 at 09:43 +0200, Ingo Molnar wrote:
> * Christoph Hellwig <hch@infradead.org> wrote:
> 
> > On Fri, Jun 22, 2007 at 12:00:15AM -0400, Steven Rostedt wrote:
> > > I believe this was originally done by Dipankar Sarma. I pulled these
> > > changes from the -rt kernel.
> > > 
> > > For better performance, RCU should use a softirq instead of a
> > > tasklet.
> > 
> > I was under the impression we had merged this a while ago due to
> > tasklet bottlenecks on big machines.  Any reason why it has been 
> > delayed so long?
> 
> i thought so too, up until a few weeks ago. I think i might have mixed 
> it up mentally with the scheduler rebalancing tasklet: that was 
> converted to softirqs on my request and made me forget about the RCU 
> tasklet thing.
> 

I think it was dropped because it was part of a large patch series that
didn't make it into the kernel, and unfortunately this patch was dropped
along with it.


If you do pull this patch in, I would like to get a Signed-off-by from
Dipankar and give him credit, since I think he was the original author.
I just pulled it out of the -rt branch.

-- Steve



^ permalink raw reply	[flat|nested] 127+ messages in thread

* Re: [RFC PATCH 0/6] Convert all tasklets to workqueues
  2007-06-22 12:32   ` Steven Rostedt
@ 2007-06-22 12:38     ` Ingo Molnar
  2007-06-22 12:58       ` Steven Rostedt
  0 siblings, 1 reply; 127+ messages in thread
From: Ingo Molnar @ 2007-06-22 12:38 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: Christoph Hellwig, LKML, Linus Torvalds, Andrew Morton,
	Thomas Gleixner, john stultz, Oleg Nesterov, Paul E. McKenney,
	Dipankar Sarma, David S. Miller, kuznet


* Steven Rostedt <rostedt@goodmis.org> wrote:

> > And this is something that might be fine for benchmarking, but not something
> > we should put in.  Keeping two wildly different implementation of core
> > functionality with very different behaviour around is quite bad.  Better
> > kill tasklets once for all.
> 
> Honestly, I highly doubted that this would make it up to Linus's tree.

that's where it belongs - but it first needs the cleanups suggested by 
Christoph.

> My aim was to get this into -mm, [...]

that would be the first step towards getting it upstream.

> and perhaps even turn on the tasklets-as-workqueues as default.

that is a hack that shouldn't be in the patch. People can unapply/apply a
patch just as well as they can flip a .config switch.

	Ingo

^ permalink raw reply	[flat|nested] 127+ messages in thread

* Re: [RFC PATCH 2/6] Split out tasklets from softirq.c
  2007-06-22  7:11   ` Christoph Hellwig
@ 2007-06-22 12:40     ` Steven Rostedt
  0 siblings, 0 replies; 127+ messages in thread
From: Steven Rostedt @ 2007-06-22 12:40 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: LKML, Linus Torvalds, Ingo Molnar, Andrew Morton,
	Thomas Gleixner, john stultz, Oleg Nesterov, Paul E. McKenney,
	Dipankar Sarma, David S. Miller, kuznet

On Fri, 2007-06-22 at 08:11 +0100, Christoph Hellwig wrote:
> On Fri, Jun 22, 2007 at 12:00:16AM -0400, Steven Rostedt wrote:
> > Tasklets are really a separate entity from softirqs, so they
> > deserve their own file. Also this allows us to easily replace
> > tasklets for something else ;-)
> 
> It's a bit pointless when softirq.h still always includes it.  A while
> ago I had a patch that split it out and made all users include it directly.
> But reviving this patch would be rather pointless if we just want to kill
> tasklets in the end anyway.
> 

Actually, if these patches do make it into -mm, then patches 1-4 should
make it into Linus's tree. That will make Andrew's maintenance of these
patches much easier, since they would make the -mm changes not so
intrusive.

Since the stripping out of tasklets is more of a cosmetic change and
does not change functionality of the kernel, I don't see why it can't
make it up to Linus's tree. Especially if we are planning on removing
tasklets altogether.

-- Steve



^ permalink raw reply	[flat|nested] 127+ messages in thread

* Re: [RFC PATCH 1/6] Convert the RCU tasklet into a softirq
  2007-06-22 12:35       ` Steven Rostedt
@ 2007-06-22 12:55         ` Ingo Molnar
  0 siblings, 0 replies; 127+ messages in thread
From: Ingo Molnar @ 2007-06-22 12:55 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: Christoph Hellwig, LKML, Linus Torvalds, Andrew Morton,
	Thomas Gleixner, john stultz, Oleg Nesterov, Paul E. McKenney,
	Dipankar Sarma, David S. Miller, matthew.wilcox, kuznet


* Steven Rostedt <rostedt@goodmis.org> wrote:

> If you do pull this patch in, I would like to get a sign-off-by from 
> Dipankar and give him credit, since I think he was the original 
> author. I just pulled it out of the -rt branch.

you can preserve authorship by sending the patch with metadata that says
From: Dipankar.

	Ingo

^ permalink raw reply	[flat|nested] 127+ messages in thread

* Re: [RFC PATCH 0/6] Convert all tasklets to workqueues
  2007-06-22 12:38     ` Ingo Molnar
@ 2007-06-22 12:58       ` Steven Rostedt
  2007-06-22 13:12         ` Ingo Molnar
  2007-06-22 13:13         ` Andrew Morton
  0 siblings, 2 replies; 127+ messages in thread
From: Steven Rostedt @ 2007-06-22 12:58 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Christoph Hellwig, LKML, Linus Torvalds, Andrew Morton,
	Thomas Gleixner, john stultz, Oleg Nesterov, Paul E. McKenney,
	Dipankar Sarma, David S. Miller, kuznet

On Fri, 2007-06-22 at 14:38 +0200, Ingo Molnar wrote:
> * Steven Rostedt <rostedt@goodmis.org> wrote:
> 
> > > And this is something that might be fine for benchmarking, but not something
> > > we should put in.  Keeping two wildly different implementation of core
> > > functionality with very different behaviour around is quite bad.  Better
> > > kill tasklets once for all.
> > 
> > Honestly, I highly doubted that this would make it up to Linus's tree.
> 
> that's where it belongs - but it first needs the cleanups suggested by 
> Christoph.

I had the impression that he didn't want it in, but instead wanted each
driver to be changed separately.

> 
> > My aim was to get this into -mm, [...]
> 
> that would be the first step towards getting it upstream.
> 
> > and perhaps even turn on the tasklets-as-workqueues as default.
> 
> that is a hack that shouldn't be in the patch. People can unapply/apply a
> patch just as well as they can flip a .config switch.

So should the patch be then to not even have the tasklet_softirq there
at all?  Have the patch simply replace the tasklets with workqueues, and
if someone doesn't like that, then they can simply remove the patch?

-- Steve



^ permalink raw reply	[flat|nested] 127+ messages in thread

* Re: [RFC PATCH 0/6] Convert all tasklets to workqueues
  2007-06-22 12:58       ` Steven Rostedt
@ 2007-06-22 13:12         ` Ingo Molnar
  2007-06-22 14:27           ` Steven Rostedt
  2007-06-22 13:13         ` Andrew Morton
  1 sibling, 1 reply; 127+ messages in thread
From: Ingo Molnar @ 2007-06-22 13:12 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: Christoph Hellwig, LKML, Linus Torvalds, Andrew Morton,
	Thomas Gleixner, john stultz, Oleg Nesterov, Paul E. McKenney,
	Dipankar Sarma, David S. Miller, kuznet


* Steven Rostedt <rostedt@goodmis.org> wrote:

> > that's where it belongs - but it first needs the cleanups suggested 
> > by Christoph.
> 
> I had the impression that he didn't want it in, but instead wanted 
> each driver to be changed separately.

that can be done too in a later stage. We cannot deprecate an API from 
one day to another. But wrapping it sanely via an existing framework 
makes complete sense.

> > > and perhaps even turn on the tasklets-as-workqueues as default.
> > 
> > that is a hack that shouldn't be in the patch. People can
> > unapply/apply a patch just as well as they can flip a .config 
> > switch.
> 
> So should the patch be then to not even have the tasklet_softirq there 
> at all?  Have the patch simply replace the tasklets with workqueues, 
> and if someone doesn't like that, then they can simply remove the 
> patch?

yes, replace the softirq based tasklet implementation with a workqueue
based implementation, but the tasklet API itself should still stay.

ok, enough idle talking, let's see the next round of patches? :)

please remove all the pr_debug() lines as well - they are handy for 
development but are quite unnecessary for something headed 
upstream-wards. And please replace all the BUG_ON()s with WARN_ON_ONCE() 
...

	Ingo

^ permalink raw reply	[flat|nested] 127+ messages in thread

* Re: [RFC PATCH 0/6] Convert all tasklets to workqueues
  2007-06-22 12:58       ` Steven Rostedt
  2007-06-22 13:12         ` Ingo Molnar
@ 2007-06-22 13:13         ` Andrew Morton
  2007-06-22 13:26           ` Ingo Molnar
  2007-06-22 13:35           ` Steven Rostedt
  1 sibling, 2 replies; 127+ messages in thread
From: Andrew Morton @ 2007-06-22 13:13 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: mingo, hch, linux-kernel, torvalds, tglx, johnstul, oleg,
	paulmck, dipankar, davem, kuznet

> On Fri, 22 Jun 2007 08:58:44 -0400 Steven Rostedt <rostedt@goodmis.org> wrote:
> On Fri, 2007-06-22 at 14:38 +0200, Ingo Molnar wrote:
> > * Steven Rostedt <rostedt@goodmis.org> wrote:
> > 
> > > > And this is something that might be fine for benchmarking, but not something
> > > > we should put in.  Keeping two wildly different implementation of core
> > > > functionality with very different behaviour around is quite bad.  Better
> > > > kill tasklets once for all.
> > > 
> > > Honestly, I highly doubted that this would make it up to Linus's tree.
> > 
> > that's where it belongs - but it first needs the cleanups suggested by 
> > Christoph.
> 
> I had the impression that he didn't want it in, but instead wanted each
> driver to be changed separately.

I do think that would be a better approach.  Apart from the cleanliness
issue, the driver-by-driver conversion would make it much easier to hunt
down any regressions or various funninesses.


^ permalink raw reply	[flat|nested] 127+ messages in thread

* Re: [RFC PATCH 0/6] Convert all tasklets to workqueues
  2007-06-22 13:13         ` Andrew Morton
@ 2007-06-22 13:26           ` Ingo Molnar
  2007-06-22 13:41             ` Andrew Morton
  2007-06-22 13:35           ` Steven Rostedt
  1 sibling, 1 reply; 127+ messages in thread
From: Ingo Molnar @ 2007-06-22 13:26 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Steven Rostedt, hch, linux-kernel, torvalds, tglx, johnstul,
	oleg, paulmck, dipankar, davem, kuznet


* Andrew Morton <akpm@linux-foundation.org> wrote:

> I do think that would be a better approach.  Apart from the 
> cleanliness issue, the driver-by-driver conversion would make it much 
> easier to hunt down any regressions or various funninesses.

there are 120 tasklet_init()s in the tree and 224 tasklet_schedule()s. 
Pushing it into thread context should work just fine (Steve's patchset 
certainly works on my testbox), as even today we can execute softirqs 
(and hence tasklets) in ksoftirqd. In fact, -rt has been executing 
tasklets in task context for over 2.5 years meanwhile. Do we really want
to upset the whole API? Realistically it just won't ever be removed, like
the BKL.

	Ingo

^ permalink raw reply	[flat|nested] 127+ messages in thread

* Re: [RFC PATCH 6/6] Convert tasklets to work queues
  2007-06-22  7:06   ` Daniel Walker
@ 2007-06-22 13:29     ` Steven Rostedt
  2007-06-22 15:52       ` Oleg Nesterov
  0 siblings, 1 reply; 127+ messages in thread
From: Steven Rostedt @ 2007-06-22 13:29 UTC (permalink / raw)
  To: Daniel Walker
  Cc: LKML, Linus Torvalds, Ingo Molnar, Andrew Morton,
	Thomas Gleixner, Christoph Hellwig, john stultz, Oleg Nesterov,
	Paul E. McKenney, Dipankar Sarma, David S. Miller, kuznet

On Fri, 2007-06-22 at 00:06 -0700, Daniel Walker wrote:

> > +void tasklet_schedule(struct tasklet_struct *t)
> > +{
> > +	BUG_ON(!ktaskletd_wq);
> > +	pr_debug("scheduling tasklet %s %p\n", t->n, t);
> 
> I'd change these pr_debug lines to "tasklet : scheduling %s %p\n" for
> readability ..

As Ingo suggested, the next round won't even have them.



> struct tasklet_struct, work);
> > +
> > +	if (unlikely(atomic_read(&t->count))) {
> > +		pr_debug("tasklet disabled %s %p\n", t->n, t);
> > +		set_bit(TASKLET_STATE_PENDING, &t->state);
> > +		smp_mb();
> > +		/* make sure we were not just enabled */
> > +		if (likely(atomic_read(&t->count)))
> > +			goto out;
> > +		clear_bit(TASKLET_STATE_PENDING, &t->state);
> 
> smp_mb__before_clear_bit ?

The smp_mb() is needed before the atomic_read. But since that atomic
read is in a conditional, no more barriers are needed if we continue.

Thanks go out to Oleg for pointing out the smp_mb was needed!
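
To spell out that ordering, here is a rough user-space model of the
disabled-tasklet path, using C11 atomics in place of atomic_t, set_bit()
and smp_mb().  The model_* names are made up for illustration, and this
is a sketch of the pattern under those assumptions, not the patch code
itself: mark the tasklet pending, issue a full barrier, then re-read the
disable count before deciding whether to bail out.

```c
#include <stdatomic.h>
#include <stdbool.h>

struct model_tasklet {
	atomic_int count;     /* > 0 means the tasklet is disabled */
	atomic_bool pending;  /* models TASKLET_STATE_PENDING */
};

/* Returns true if the caller should run the tasklet body now. */
static bool model_should_run(struct model_tasklet *t)
{
	if (atomic_load_explicit(&t->count, memory_order_relaxed) > 0) {
		atomic_store_explicit(&t->pending, true, memory_order_relaxed);
		/*
		 * Full fence, standing in for smp_mb(): the pending store
		 * must be ordered before the re-read of count below.
		 */
		atomic_thread_fence(memory_order_seq_cst);
		/* Make sure we were not just enabled. */
		if (atomic_load_explicit(&t->count, memory_order_relaxed) > 0)
			return false;  /* still disabled; enable reschedules */
		/* Enabled in the meantime: take the work back ourselves. */
		atomic_store_explicit(&t->pending, false, memory_order_relaxed);
	}
	return true;
}

/* Returns true if the caller must reschedule the tasklet. */
static bool model_enable(struct model_tasklet *t)
{
	/* Old value 1 means the count just dropped to zero. */
	if (atomic_fetch_sub(&t->count, 1) != 1)
		return false;  /* still disabled by someone else */
	return atomic_exchange(&t->pending, false);
}
```

Without the fence, the re-read of count could be satisfied before the
pending store is visible to the enabling CPU, and a wakeup could be lost.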


> > +
> > +void __init softirq_init(void)
> > +{
> > +}
> 
> ifdef's ?

Nah, ifdefs are ugly. Even uglier than stubbed functions. I guess I
could simply put it in the header as a define do {} while(0).
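
Something along these lines, as a user-space sketch.  The
MODEL_SOFTIRQ_TASKLETS switch and model_start_kernel() are hypothetical
names used only to show the idiom; the do { } while (0) form makes the
stub expand to exactly one statement, so it stays safe inside an
unbraced if/else.

```c
#include <stdbool.h>

#ifdef MODEL_SOFTIRQ_TASKLETS           /* hypothetical switch, for illustration */
void softirq_init(void);
#else
#define softirq_init() do { } while (0) /* expands to a single empty statement */
#endif

static int boot_steps;

/* Hypothetical caller showing the stub used as an ordinary statement. */
static void model_start_kernel(bool early)
{
	if (early)
		softirq_init();   /* no braces needed around the stub */
	else
		boot_steps++;
	boot_steps++;
}
```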

> 
> > +void init_tasklets(void)
> > +{
> > +	ktaskletd_wq = create_workqueue("tasklets");
> > +	BUG_ON(!ktaskletd_wq);
> > +}
> > +
> > +void takeover_tasklets(unsigned int cpu)
> > +{
> > +	pr_debug("Implement takeover tasklets??\n");
> > +}
> 
> hmm .. Looks like it's for migration of tasklets .. I take it you're not
> sure that's handled ? Try cpu hotplug ..


Actually, I believe that the workqueues will handle it themselves, so I
don't need to do any special handling. Just another advantage of
converting tasklets into work queues.

> 
> > +void tasklet_init(struct tasklet_struct *t,
> > +		  void (*func)(unsigned long), unsigned long data)
> > +{
> > +	INIT_WORK(&t->work, work_tasklet_exec);
> > +	INIT_LIST_HEAD(&t->list);
> > +	t->state = 0;
> > +	atomic_set(&t->count, 0);
> > +	t->func = func;
> > +	t->data = data;
> > +	t->n = "anonymous";
> 
> Is this "anonymous" just used for debugging ? If so you could fill it
> with a kallsyms lookup with the __builtin_return_address() ..

Yeah, they were started for debugging, but I also thought about making
some sort of interface for user land to see what was defined. So using
kallsyms might not look good. It's easy enough to find if an anonymous
tasklet gives me trouble.


-- Steve



^ permalink raw reply	[flat|nested] 127+ messages in thread

* Re: [RFC PATCH 0/6] Convert all tasklets to workqueues
  2007-06-22 13:13         ` Andrew Morton
  2007-06-22 13:26           ` Ingo Molnar
@ 2007-06-22 13:35           ` Steven Rostedt
  1 sibling, 0 replies; 127+ messages in thread
From: Steven Rostedt @ 2007-06-22 13:35 UTC (permalink / raw)
  To: Andrew Morton
  Cc: mingo, hch, linux-kernel, torvalds, tglx, johnstul, oleg,
	paulmck, dipankar, davem, kuznet

On Fri, 2007-06-22 at 06:13 -0700, Andrew Morton wrote:
> > On Fri, 22 Jun 2007 08:58:44 -0400 Steven Rostedt <rostedt@goodmis.org> wrote:
> > On Fri, 2007-06-22 at 14:38 +0200, Ingo Molnar wrote:
> > > * Steven Rostedt <rostedt@goodmis.org> wrote:

> > > > Honestly, I highly doubted that this would make it up to Linus's tree.
> > > 
> > > that's where it belongs - but it first needs the cleanups suggested by 
> > > Christoph.
> > 
> > I had the impression that he didn't want it in, but instead wanted each
> > driver to be changed separately.
> 
> I do think that would be a better approach.  Apart from the cleanliness
> issue, the driver-by-driver conversion would make it much easier to hunt
> down any regressions or various funninesses.
> 

Actually, I disagree that going driver by driver makes it easier to hunt
down regressions. Perhaps a regression is caused by having two different
drivers have their tasklets converted to work queues. Or there might be
a case where one tasklet is somehow related to another tasklet.

Switching all tasklets at once can pinpoint the problem rather quickly.
Then it's easy to find which driver was the culprit.

-- Steve



^ permalink raw reply	[flat|nested] 127+ messages in thread

* Re: [RFC PATCH 0/6] Convert all tasklets to workqueues
  2007-06-22 13:26           ` Ingo Molnar
@ 2007-06-22 13:41             ` Andrew Morton
  2007-06-22 14:00               ` Ingo Molnar
  0 siblings, 1 reply; 127+ messages in thread
From: Andrew Morton @ 2007-06-22 13:41 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: rostedt, hch, linux-kernel, torvalds, tglx, johnstul, oleg,
	paulmck, dipankar, davem, kuznet

> On Fri, 22 Jun 2007 15:26:22 +0200 Ingo Molnar <mingo@elte.hu> wrote:
> 
> * Andrew Morton <akpm@linux-foundation.org> wrote:
> 
> > I do think that would be a better approach.  Apart from the 
> > cleanliness issue, the driver-by-driver conversion would make it much 
> > easier to hunt down any regressions or various funninesses.
> 
> there are 120 tasklet_init()s in the tree and 224 tasklet_schedule()s. 

couple of hours?

> Pushing it into thread context should work just fine (Steve's patchset 
> certainly works on my testbox), as even today we can execute softirqs 
> (and hence tasklets) in ksoftirqd. In fact, -rt has been executing 
> tasklets in task context for over 2.5 years meanwhile. Do we really want 
> to upset the whole API? Realistically it just wont ever be removed, like 
> the BKL.

We can remove it.  It might need to remain deprecated for a year, but we
shouldn't plan on leaving the old interface hanging around forever.


^ permalink raw reply	[flat|nested] 127+ messages in thread

* Re: [RFC PATCH 2/6] Split out tasklets from softirq.c
  2007-06-22  4:00 ` [RFC PATCH 2/6] Split out tasklets from softirq.c Steven Rostedt
  2007-06-22  7:11   ` Christoph Hellwig
@ 2007-06-22 13:45   ` Akinobu Mita
  2007-06-22 13:58     ` Steven Rostedt
  1 sibling, 1 reply; 127+ messages in thread
From: Akinobu Mita @ 2007-06-22 13:45 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: LKML, Linus Torvalds, Ingo Molnar, Andrew Morton,
	Thomas Gleixner, Christoph Hellwig, john stultz, Oleg Nesterov,
	Paul E. McKenney, Dipankar Sarma, David S. Miller,
	matthew.wilcox, kuznet

2007/6/22, Steven Rostedt <rostedt@goodmis.org>:

> +static inline void tasklet_unlock_wait(struct tasklet_struct *t)
> +{
> +       while (test_bit(TASKLET_STATE_RUN, &(t)->state)) { barrier(); }
> +}

BTW, can we use cpu_relax() instead of barrier() in this busy-wait loop?

^ permalink raw reply	[flat|nested] 127+ messages in thread

* Re: [RFC PATCH 2/6] Split out tasklets from softirq.c
  2007-06-22 13:45   ` Akinobu Mita
@ 2007-06-22 13:58     ` Steven Rostedt
  0 siblings, 0 replies; 127+ messages in thread
From: Steven Rostedt @ 2007-06-22 13:58 UTC (permalink / raw)
  To: Akinobu Mita
  Cc: LKML, Linus Torvalds, Ingo Molnar, Andrew Morton,
	Thomas Gleixner, Christoph Hellwig, john stultz, Oleg Nesterov,
	Paul E. McKenney, Dipankar Sarma, David S. Miller,
	matthew.wilcox, kuznet

On Fri, 2007-06-22 at 22:45 +0900, Akinobu Mita wrote:
> 2007/6/22, Steven Rostedt <rostedt@goodmis.org>:
> 
> > +static inline void tasklet_unlock_wait(struct tasklet_struct *t)
> > +{
> > +       while (test_bit(TASKLET_STATE_RUN, &(t)->state)) { barrier(); }
> > +}
> 
> BTW, can we use cpu_relax() instead of barrier() in this busy-wait loop?
> 

Probably, but not in this patch series.  That's part of the code I'm
trying to remove ;-)
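
For reference, the difference being asked about can be sketched in
userspace C.  This is only a model, not the kernel code: barrier_model()
and cpu_relax_model() are hypothetical stand-ins for the kernel's
barrier() and cpu_relax(), and struct tasklet_model is an assumed
stand-in for struct tasklet_struct.  The idea is that barrier() only
stops the compiler from caching the load, while cpu_relax() additionally
hints to the CPU that it is spinning (the "pause" instruction on x86).

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Bit number of the RUN flag; in the kernel enum, SCHED is 0 and RUN is 1. */
#define TASKLET_STATE_RUN_BIT 1u

/* Assumed userspace stand-in for struct tasklet_struct (state word only). */
struct tasklet_model {
	atomic_ulong state;
};

/* Compiler-only barrier: roughly what barrier() gives you. */
static inline void barrier_model(void)
{
	atomic_signal_fence(memory_order_seq_cst);
}

/* Spin-wait hint: roughly what cpu_relax() gives you on x86. */
static inline void cpu_relax_model(void)
{
#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
	__builtin_ia32_pause();	/* pause instruction plus compiler barrier */
#else
	barrier_model();	/* fallback: no CPU hint available here */
#endif
}

static inline bool test_bit_model(unsigned int nr, atomic_ulong *addr)
{
	return (atomic_load_explicit(addr, memory_order_acquire) >> nr) & 1ul;
}

/*
 * Same shape as tasklet_unlock_wait(), but relaxing the CPU on each spin.
 * Returns the number of spins so the behaviour can be observed.
 */
static unsigned long tasklet_unlock_wait_model(struct tasklet_model *t)
{
	unsigned long spins = 0;

	assert(t != NULL);
	while (test_bit_model(TASKLET_STATE_RUN_BIT, &t->state)) {
		cpu_relax_model();
		spins++;
	}
	return spins;
}
```

The loop condition and exit behaviour are unchanged; only the body of
the spin differs.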

-- Steve



^ permalink raw reply	[flat|nested] 127+ messages in thread

* Re: [RFC PATCH 0/6] Convert all tasklets to workqueues
  2007-06-22 13:41             ` Andrew Morton
@ 2007-06-22 14:00               ` Ingo Molnar
  0 siblings, 0 replies; 127+ messages in thread
From: Ingo Molnar @ 2007-06-22 14:00 UTC (permalink / raw)
  To: Andrew Morton
  Cc: rostedt, hch, linux-kernel, torvalds, tglx, johnstul, oleg,
	paulmck, dipankar, davem, kuznet


* Andrew Morton <akpm@linux-foundation.org> wrote:

> > there are 120 tasklet_init()s in the tree and 224 
> > tasklet_schedule()s.
> 
> couple of hours?

hm, what would you replace it with? Another new API? Or convert to
workqueues, manually adding a local_bh_disable()/enable() pair around the
worker function? If the latter, then converting it to workqueues isnt
just a couple of hours i think. Maybe i'm overestimating the effort.

	Ingo

^ permalink raw reply	[flat|nested] 127+ messages in thread

* Re: [RFC PATCH 0/6] Convert all tasklets to workqueues
  2007-06-22  4:00 [RFC PATCH 0/6] Convert all tasklets to workqueues Steven Rostedt
                   ` (6 preceding siblings ...)
  2007-06-22  7:09 ` [RFC PATCH 0/6] Convert all tasklets to workqueues Christoph Hellwig
@ 2007-06-22 14:25 ` Arjan van de Ven
  2007-06-22 14:42   ` Steven Rostedt
  2007-06-22 17:16 ` Linus Torvalds
  2007-06-23  5:14 ` Stephen Hemminger
  9 siblings, 1 reply; 127+ messages in thread
From: Arjan van de Ven @ 2007-06-22 14:25 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: LKML, Linus Torvalds, Ingo Molnar, Andrew Morton,
	Thomas Gleixner, Christoph Hellwig, john stultz, Oleg Nesterov,
	Paul E. McKenney, Dipankar Sarma, David S. Miller,
	matthew.wilcox, kuznet


> The most part, tasklets today are not used for time critical functions.
> Running tasklets in thread context is not harmful to performance of
> the overall system. 

That is a bold statement...

> But running them in interrupt context is, since
> they increase the overall latency for high priority tasks.


.... since by moving this to process context you do add latency between
the hardirq and the tasklet, and I can easily see that having a
performance impact.



^ permalink raw reply	[flat|nested] 127+ messages in thread

* Re: [RFC PATCH 0/6] Convert all tasklets to workqueues
  2007-06-22 13:12         ` Ingo Molnar
@ 2007-06-22 14:27           ` Steven Rostedt
  0 siblings, 0 replies; 127+ messages in thread
From: Steven Rostedt @ 2007-06-22 14:27 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Christoph Hellwig, LKML, Linus Torvalds, Andrew Morton,
	Thomas Gleixner, john stultz, Oleg Nesterov, Paul E. McKenney,
	Dipankar Sarma, David S. Miller, kuznet

On Fri, 2007-06-22 at 15:12 +0200, Ingo Molnar wrote:
> * Steven Rostedt <rostedt@goodmis.org> wrote:
> 

> yes, the softirq based tasklet implementation with workqueue based 
> implementation, but the tasklet API itself should still stay.

done.

> 
> ok, enough idle talking, lets see the next round of patches? :)

Still need to run through testing, and I still don't have distcc
working. So it may take a while ;-)

> 
> please remove all the pr_debug() lines as well - they are handy for 
> development but are quite unnecessary for something headed 

done (but I did keep one).

> upstream-wards. And please replace all the BUG_ON()s with WARN_ON_ONCE() 

I had two BUG_ON's and they were both for not having the ktaskletd_wq
allocated. If we just put a WARN_ON, we will right afterward get a BUG
from a bad memory reference, and put the system into an unknown state.
Are you sure you still want me to convert them to WARN_ON?

-- Steve



^ permalink raw reply	[flat|nested] 127+ messages in thread

* Re: [RFC PATCH 0/6] Convert all tasklets to workqueues
  2007-06-22 14:25 ` Arjan van de Ven
@ 2007-06-22 14:42   ` Steven Rostedt
  2007-06-22 14:43     ` Arjan van de Ven
  0 siblings, 1 reply; 127+ messages in thread
From: Steven Rostedt @ 2007-06-22 14:42 UTC (permalink / raw)
  To: Arjan van de Ven
  Cc: LKML, Linus Torvalds, Ingo Molnar, Andrew Morton,
	Thomas Gleixner, Christoph Hellwig, john stultz, Oleg Nesterov,
	Paul E. McKenney, Dipankar Sarma, David S. Miller,
	matthew.wilcox, kuznet

On Fri, 2007-06-22 at 07:25 -0700, Arjan van de Ven wrote:
> > The most part, tasklets today are not used for time critical functions.
> > Running tasklets in thread context is not harmful to performance of
> > the overall system. 
> 
> That is a bold statement...
> 
> > But running them in interrupt context is, since
> > they increase the overall latency for high priority tasks.
> 
> 
> .... since by moving this to process context you do add latency between
> the hardirq and the tasklet, and I can see that having performance
> impact easily.

This is stated on the assumption that pretty much all performance
critical tasklets have been removed (although Christoph just mentioned
megaraid_sas, but after I made this statement).

We've been running tasklets as threads in the -rt kernel for some time
now, and that hasn't bothered us.

The problem with tasklets is that they still don't provide guaranteed
performance. They can still be pushed off to ksoftirqd on a busy system.
But having *all* tasklets as they are today keeps them at a higher
priority than any process in the system.

-- Steve



^ permalink raw reply	[flat|nested] 127+ messages in thread

* Re: [RFC PATCH 0/6] Convert all tasklets to workqueues
  2007-06-22 14:42   ` Steven Rostedt
@ 2007-06-22 14:43     ` Arjan van de Ven
  0 siblings, 0 replies; 127+ messages in thread
From: Arjan van de Ven @ 2007-06-22 14:43 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: LKML, Linus Torvalds, Ingo Molnar, Andrew Morton,
	Thomas Gleixner, Christoph Hellwig, john stultz, Oleg Nesterov,
	Paul E. McKenney, Dipankar Sarma, David S. Miller,
	matthew.wilcox, kuznet


> 
> This is stated on the assumption that pretty much all performance
> critical tasklets have been removed (although Christoph just mentioned
> megaraid_sas, but after I made this statement).
> 
> We've been running tasklets as threads in the -rt kernel for some time
> now, and that hasn't bothered us.
> 
> The problem with tasklets is that they still don't provide guaranteed
> performance. They can still be pushed off to ksoftirqd on a busy system.
> But having *all* tasklets as they are today keeps them at a higher
> priority than any process in the system.


don't get me wrong, I'm not against this conversion, it just needs to be
measured in detail... or preferably done step by step so that the
performance guys can undo one piece at a time until they find a guilty
party ;)

-- 
if you want to mail me at work (you don't), use arjan (at) linux.intel.com
Test the interaction between Linux and your BIOS via http://www.linuxfirmwarekit.org


^ permalink raw reply	[flat|nested] 127+ messages in thread

* Re: [RFC PATCH 4/6] Make DRM use the tasklet is-sched API
  2007-06-22 12:15         ` Steven Rostedt
@ 2007-06-22 15:36           ` Daniel Walker
  2007-06-22 22:38             ` Ingo Molnar
  0 siblings, 1 reply; 127+ messages in thread
From: Daniel Walker @ 2007-06-22 15:36 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: Thomas Gleixner, LKML, Linus Torvalds, Ingo Molnar,
	Andrew Morton, Christoph Hellwig, john stultz, Oleg Nesterov,
	Paul E. McKenney, Dipankar Sarma, David S. Miller,
	matthew.wilcox, kuznet

On Fri, 2007-06-22 at 08:15 -0400, Steven Rostedt wrote:
> On Fri, 2007-06-22 at 00:08 -0700, Daniel Walker wrote:
> 
> > > > No sense in having a patch just for this, may as well merge this with
> > > > patch 3 ..
> > > 
> > > Wrong. patch 3 adds the API and this one makes use of it. Stevens split
> > > makes perfectly sense.
> > 
> > Clearly it doesn't make sense to me ;) .. The patches are too small to
> > split them up that way ..
> 
> Daniel, you of all people should know. It's not the size of the patch
> that matters, it's the way you use the patch ;-)
> 

Are you trying to say that you think I have a small patch Steven ;) ?

> No these two patches should *not* be merged to one. If these are sitting
> in -mm, and someone were to change the DRM to not to use the API and
> someone else changed their driver to use the API, then what?  Does
> Andrew keep these maintenance patches on top of each other?

I read this 5 times at least .. I don't think I'm following what you're
saying .. It sounds like you might be thinking too many steps ahead tho..

> The split lets the DRM patch be dropped or replaced while keeping the
> API patch around in case another driver uses the API.

Ok, but there are no other users currently, and I think it's unlikely
you'll have others in the future since TASKLET_STATE_SCHED seems more
like an internal part of tasklets .. This drm user seems like the one
aberration.

> The two patches have two different objectives, even though they are
> related and currently on a 1 to 1 basis. The patches regardless, should
> stay separate.

I'm not convinced yet .. One more stab?

Daniel


^ permalink raw reply	[flat|nested] 127+ messages in thread

* Re: [RFC PATCH 6/6] Convert tasklets to work queues
  2007-06-22 13:29     ` Steven Rostedt
@ 2007-06-22 15:52       ` Oleg Nesterov
  2007-06-22 16:35         ` Steven Rostedt
  0 siblings, 1 reply; 127+ messages in thread
From: Oleg Nesterov @ 2007-06-22 15:52 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: Daniel Walker, LKML, Linus Torvalds, Ingo Molnar, Andrew Morton,
	Thomas Gleixner, Christoph Hellwig, john stultz,
	Paul E. McKenney, Dipankar Sarma, David S. Miller, kuznet

On 06/22, Steven Rostedt wrote:
>
> > truct tasklet_struct, work);
> > > +
> > > +	if (unlikely(atomic_read(&t->count))) {
> > > +		pr_debug("tasklet disabled %s %p\n", t->n, t);
> > > +		set_bit(TASKLET_STATE_PENDING, &t->state);
> > > +		smp_mb();
> > > +		/* make sure we were not just enabled */
> > > +		if (likely(atomic_read(&t->count)))
> > > +			goto out;
> > > +		clear_bit(TASKLET_STATE_PENDING, &t->state);

Looking closer, I think this is not right, and the smp_mb__before_clear_bit()
can't help.


				/* t->count == 1 */

work_tasklet_exec()					tasklet_enable()

...

set_bit(TASKLET_STATE_PENDING);				atomic_dec_and_test(&t->count);


				/* t->count == 0 */


// False
if (atomic_read(&t->count))
	goto out;

							// True
							if (test_and_clear_bit(_PENDING))
								tasklet_schedule();


clear_bit(TASKLET_STATE_PENDING);

execute t->func();


So, t->func() will be executed twice because tasklet_enable() does
tasklet_schedule().


So I think we need a fix for work_tasklet_exec,

-		clear_bit(TASKLET_STATE_PENDING);
+		if (!test_and_clear_bit(TASKLET_STATE_PENDING))
			goto out;
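
The interleaving above can be replayed deterministically in userspace to
see why the test_and_clear_bit() variant avoids the double execution.
This is only a sketch under assumptions: struct tasklet_model, the
*_model() helpers and the PENDING bit number are hypothetical stand-ins,
and the two "CPUs" are simply replayed in a fixed order on one thread.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

enum { TASKLET_STATE_PENDING_BIT = 2 };	/* assumed bit number */

struct tasklet_model {
	atomic_int count;	/* non-zero means disabled */
	atomic_ulong state;
};

static void set_bit_model(unsigned int nr, atomic_ulong *addr)
{
	atomic_fetch_or(addr, 1ul << nr);
}

static void clear_bit_model(unsigned int nr, atomic_ulong *addr)
{
	atomic_fetch_and(addr, ~(1ul << nr));
}

/* Atomically clear the bit and report whether it was set before. */
static bool test_and_clear_bit_model(unsigned int nr, atomic_ulong *addr)
{
	unsigned long mask = 1ul << nr;

	return (atomic_fetch_and(addr, ~mask) & mask) != 0;
}

/*
 * Replay the race.  use_fix selects test_and_clear_bit() in the worker;
 * enable_wins selects whether tasklet_enable() examines PENDING before
 * the worker does.  Returns how many times t->func() ends up running.
 */
static int replay_race(bool use_fix, bool enable_wins)
{
	struct tasklet_model t;
	int runs = 0, rescheduled = 0;

	atomic_init(&t.count, 1);	/* tasklet starts out disabled */
	atomic_init(&t.state, 0);

	/* worker: sees the tasklet disabled and marks it pending */
	if (atomic_load(&t.count))
		set_bit_model(TASKLET_STATE_PENDING_BIT, &t.state);

	/* tasklet_enable(): atomic_dec_and_test(&t->count) */
	bool now_enabled = (atomic_fetch_sub(&t.count, 1) == 1);

	/* worker: re-check finds count == 0, so it does not goto out */
	bool falls_through = (atomic_load(&t.count) == 0);

	if (enable_wins && now_enabled &&
	    test_and_clear_bit_model(TASKLET_STATE_PENDING_BIT, &t.state))
		rescheduled++;		/* tasklet_schedule() */

	if (falls_through) {
		if (!use_fix) {
			clear_bit_model(TASKLET_STATE_PENDING_BIT, &t.state);
			runs++;		/* execute t->func() */
		} else if (test_and_clear_bit_model(TASKLET_STATE_PENDING_BIT,
						    &t.state)) {
			runs++;		/* we own PENDING: execute t->func() */
		}			/* else: goto out */
	}

	if (!enable_wins && now_enabled &&
	    test_and_clear_bit_model(TASKLET_STATE_PENDING_BIT, &t.state))
		rescheduled++;

	/* each reschedule later runs t->func() once more */
	return runs + rescheduled;
}
```

In this model replay_race(false, true) counts two executions and
replay_race(true, true) counts one; whichever side clears PENDING first
is the only one that runs (or reschedules) the function.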



Steven, a very stupid suggestion, could you move the code for tasklet_enable()
up, closer to tasklet_disable() ?

Oleg.


^ permalink raw reply	[flat|nested] 127+ messages in thread

* Re: [RFC PATCH 4/6] Make DRM use the tasklet is-sched API
  2007-06-22  6:49     ` Thomas Gleixner
  2007-06-22  7:08       ` Daniel Walker
@ 2007-06-22 16:10       ` Arnd Bergmann
  2007-06-22 16:56         ` Steven Rostedt
  2007-06-22 18:24         ` Christoph Hellwig
  1 sibling, 2 replies; 127+ messages in thread
From: Arnd Bergmann @ 2007-06-22 16:10 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: Daniel Walker, Steven Rostedt, LKML, Linus Torvalds, Ingo Molnar,
	Andrew Morton, Christoph Hellwig, john stultz, Oleg Nesterov,
	Paul E. McKenney, Dipankar Sarma, David S. Miller,
	matthew.wilcox, kuznet

On Friday 22 June 2007, Thomas Gleixner wrote:
> > > Index: linux-2.6.21-rt9/drivers/char/drm/drm_irq.c
> > > ===================================================================
> > > --- linux-2.6.21-rt9.orig/drivers/char/drm/drm_irq.c
> > > +++ linux-2.6.21-rt9/drivers/char/drm/drm_irq.c
> > > @@ -461,7 +461,7 @@ void drm_locked_tasklet(drm_device_t *de
> > >     static DECLARE_TASKLET(drm_tasklet, drm_locked_tasklet_func, 0);
> > >  
> > >     if (!drm_core_check_feature(dev, DRIVER_HAVE_IRQ) ||
> > > -       test_bit(TASKLET_STATE_SCHED, &drm_tasklet.state))
> > > +       tasklet_is_scheduled(&drm_tasklet))
> > >             return;
> > >  
> > >     spin_lock_irqsave(&dev->tasklet_lock, irqflags);
> > > 
> > 
> > 
> > No sense in having a patch just for this, may as well merge this with
> > patch 3 ..
> 
> Wrong. patch 3 adds the API and this one makes use of it. Stevens split
> makes perfectly sense.

Wouldn't the easy solution be to get rid of drm_locked_tasklet
entirely and convert i915_vblank_tasklet(), the only user, to use
a work queue right away?

The drm_locked_tasklet() function seems to have multiple bugs anyway,
so getting rid of it can only help, and it avoids exporting a new
tasklet_is_scheduled() interface.

	Arnd <><

^ permalink raw reply	[flat|nested] 127+ messages in thread

* Re: [RFC PATCH 6/6] Convert tasklets to work queues
  2007-06-22 15:52       ` Oleg Nesterov
@ 2007-06-22 16:35         ` Steven Rostedt
  0 siblings, 0 replies; 127+ messages in thread
From: Steven Rostedt @ 2007-06-22 16:35 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: Daniel Walker, LKML, Linus Torvalds, Ingo Molnar, Andrew Morton,
	Thomas Gleixner, Christoph Hellwig, john stultz,
	Paul E. McKenney, Dipankar Sarma, David S. Miller, kuznet

On Fri, 2007-06-22 at 19:52 +0400, Oleg Nesterov wrote:
> On 06/22, Steven Rostedt wrote:
> >
> > > truct tasklet_struct, work);
> > > > +
> > > > +	if (unlikely(atomic_read(&t->count))) {
> > > > +		pr_debug("tasklet disabled %s %p\n", t->n, t);
> > > > +		set_bit(TASKLET_STATE_PENDING, &t->state);
> > > > +		smp_mb();
> > > > +		/* make sure we were not just enabled */
> > > > +		if (likely(atomic_read(&t->count)))
> > > > +			goto out;
> > > > +		clear_bit(TASKLET_STATE_PENDING, &t->state);

Yeah, I knew of the race but didn't think that running a tasklet
function twice would cause much harm here.  But not running it when it
needs to run can have quite a negative impact.

> 
> So, t->func() will be executed twice because tasklet_enable() does
> tasklet_schedule().
> 
> 
> So I think we need a fix for work_tasklet_exec,
> 
> -		clear_bit(TASKLET_STATE_PENDING);
> +		if (!test_and_clear_bit(TASKLET_STATE_PENDING))
> 			goto out;
> 

OK, I like this. I'll add it in the next round.


> 
> 
> Steven, a very stupid suggestion, could you move the code for tasklet_enable()
> up, closer to tasklet_disable() ?

Not a stupid suggestion. I'll accommodate it.

Thanks,

-- Steve



^ permalink raw reply	[flat|nested] 127+ messages in thread

* Re: [RFC PATCH 4/6] Make DRM use the tasklet is-sched API
  2007-06-22 16:10       ` Arnd Bergmann
@ 2007-06-22 16:56         ` Steven Rostedt
  2007-06-22 18:24         ` Christoph Hellwig
  1 sibling, 0 replies; 127+ messages in thread
From: Steven Rostedt @ 2007-06-22 16:56 UTC (permalink / raw)
  To: Arnd Bergmann
  Cc: Thomas Gleixner, Daniel Walker, LKML, Linus Torvalds,
	Ingo Molnar, Andrew Morton, Christoph Hellwig, john stultz,
	Oleg Nesterov, Paul E. McKenney, Dipankar Sarma, David S. Miller,
	matthew.wilcox, kuznet

On Fri, 2007-06-22 at 18:10 +0200, Arnd Bergmann wrote:

> Wouldn't the easy solution be to get rid of drm_locked_tasklet
> entirely and convert i915_vblank_tasklet(), the only user, to use
> a work queue right away?

You recommend making the i915 use a workqueue to do this instead, and
removing the drm_locked_tasklet code from drm_irq.c completely?

Unfortunately, I don't have any boxes with an i915, so I would not feel
comfortable with doing this myself.

> 
> The drm_locked_tasklet() function seems to have multiple bugs anyway,
> so getting rid of it can only help, and it avoids exporting a new
> tasklet_is_scheduled() interface.

To be able to at least continue my work, I'll just keep these as
separate patches.

Thanks,

-- Steve



^ permalink raw reply	[flat|nested] 127+ messages in thread

* Re: [RFC PATCH 0/6] Convert all tasklets to workqueues
  2007-06-22  4:00 [RFC PATCH 0/6] Convert all tasklets to workqueues Steven Rostedt
                   ` (7 preceding siblings ...)
  2007-06-22 14:25 ` Arjan van de Ven
@ 2007-06-22 17:16 ` Linus Torvalds
  2007-06-22 17:31   ` Steven Rostedt
                     ` (2 more replies)
  2007-06-23  5:14 ` Stephen Hemminger
  9 siblings, 3 replies; 127+ messages in thread
From: Linus Torvalds @ 2007-06-22 17:16 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: LKML, Ingo Molnar, Andrew Morton, Thomas Gleixner,
	Christoph Hellwig, john stultz, Oleg Nesterov, Paul E. McKenney,
	Dipankar Sarma, David S. Miller, matthew.wilcox, kuznet



On Fri, 22 Jun 2007, Steven Rostedt wrote:
> 
> I just want to state that tasklets served their time well. But it's time
> to give them an honorable discharge.  So lets get rid of tasklets and
> given them a standing salute as they leave :-)

Well, independently of whether we actually discharge them or not, I do 
tend to always like things that split independent concepts up (whether 
they then end up being _implemented_ independently of each other or not is 
a separate issue).

So patches 1-4 all look fine to me. In fact, 5 looks ok too.

Whether we actually then want to do 6 is another matter. I think we'd need 
some measuring and discussion about that.

I'm absolutely 100% sure that we do *not* want to be in a situation where 
we have two different implementations of tasklets, and just keep the 
CONFIG variable and let people just choose one or the other.

So imnsho doing #6 is really something that makes sense only in a "let's 
measure this and decide which implementation is actually the better one", 
_not_ in the sense of merging it into the standard kernel and letting them 
fight it out in the long run.

But I'd happily merge 1-4 regardless after 2.6.22 is out.

Leaving patch 6 as a "only makes sense after we actually have some numbers 
about it", and patch 5 is a "could go either way" as far as I'm concerned 
(ie I could merge it together with the 1-4 series, but I think it's 
equally valid to just see it as a companion to 6).

Does that make sense to people?

		Linus

^ permalink raw reply	[flat|nested] 127+ messages in thread

* Re: [RFC PATCH 0/6] Convert all tasklets to workqueues
  2007-06-22 17:16 ` Linus Torvalds
@ 2007-06-22 17:31   ` Steven Rostedt
  2007-06-22 18:32   ` Christoph Hellwig
  2007-06-22 20:40   ` Ingo Molnar
  2 siblings, 0 replies; 127+ messages in thread
From: Steven Rostedt @ 2007-06-22 17:31 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: LKML, Ingo Molnar, Andrew Morton, Thomas Gleixner,
	Christoph Hellwig, john stultz, Oleg Nesterov, Paul E. McKenney,
	Dipankar Sarma, David S. Miller, kuznet

On Fri, 2007-06-22 at 10:16 -0700, Linus Torvalds wrote:
> 

> So patches 1-4 all look fine to me. In fact, 5 looks ok too.

Great!

> Leaving patch 6 as a "only makes sense after we actually have some numbers 
> about it", and patch 5 is a "could go either way" as far as I'm concerned 
> (ie I could merge it together with the 1-4 series, but I think it's 
> equally valid to just see it as a companion to 6).

Actually, in my next series, I removed patch 5 and simply replaced
the tasklets with the workqueue implementation, leaving no config
option. It will be out shortly, after I finish my testing.

I also kept the file name tasklet_work.c instead of overwriting
tasklet.c, just because it's easier to review the patch that way.

-- Steve





^ permalink raw reply	[flat|nested] 127+ messages in thread

* Re: [RFC PATCH 4/6] Make DRM use the tasklet is-sched API
  2007-06-22 16:10       ` Arnd Bergmann
  2007-06-22 16:56         ` Steven Rostedt
@ 2007-06-22 18:24         ` Christoph Hellwig
  2007-06-22 23:38           ` Dave Airlie
  1 sibling, 1 reply; 127+ messages in thread
From: Christoph Hellwig @ 2007-06-22 18:24 UTC (permalink / raw)
  To: Arnd Bergmann
  Cc: Thomas Gleixner, Daniel Walker, Steven Rostedt, LKML,
	Linus Torvalds, Ingo Molnar, Andrew Morton, Christoph Hellwig,
	john stultz, Oleg Nesterov, Paul E. McKenney, Dipankar Sarma,
	David S. Miller, matthew.wilcox, kuznet

On Fri, Jun 22, 2007 at 06:10:55PM +0200, Arnd Bergmann wrote:
> On Friday 22 June 2007, Thomas Gleixner wrote:
> > > > Index: linux-2.6.21-rt9/drivers/char/drm/drm_irq.c
> > > > ===================================================================
> > > > --- linux-2.6.21-rt9.orig/drivers/char/drm/drm_irq.c
> > > > +++ linux-2.6.21-rt9/drivers/char/drm/drm_irq.c
> > > > @@ -461,7 +461,7 @@ void drm_locked_tasklet(drm_device_t *de
> > > > >     static DECLARE_TASKLET(drm_tasklet, drm_locked_tasklet_func, 0);
> > > > >  
> > > > >     if (!drm_core_check_feature(dev, DRIVER_HAVE_IRQ) ||
> > > > > -       test_bit(TASKLET_STATE_SCHED, &drm_tasklet.state))
> > > > > +       tasklet_is_scheduled(&drm_tasklet))
> > > > >             return;
> > > > >  
> > > > >     spin_lock_irqsave(&dev->tasklet_lock, irqflags);
> > > > 
> > > 
> > > 
> > > No sense in having a patch just for this, may as well merge this with
> > > patch 3 ..
> > 
> > Wrong. patch 3 adds the API and this one makes use of it. Stevens split
> > makes perfectly sense.
> 
> Wouldn't the easy solution be to get rid of drm_locked_tasklet
> entirely and convert i915_vblank_tasklet(), the only user, to use
> a work queue right away?
> 
> The drm_locked_tasklet() function seems to have multiple bugs anyway,
> so getting rid of it can only help, and it avoids exporting a new
> tasklet_is_scheduled() interface.

That's exactly what I thought when looking over this code.  There's
some really crappy code in that area, and it should simply be
rewritten.

^ permalink raw reply	[flat|nested] 127+ messages in thread

* Re: [RFC PATCH 0/6] Convert all tasklets to workqueues
  2007-06-22 17:16 ` Linus Torvalds
  2007-06-22 17:31   ` Steven Rostedt
@ 2007-06-22 18:32   ` Christoph Hellwig
  2007-06-22 20:40   ` Ingo Molnar
  2 siblings, 0 replies; 127+ messages in thread
From: Christoph Hellwig @ 2007-06-22 18:32 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Steven Rostedt, LKML, Ingo Molnar, Andrew Morton,
	Thomas Gleixner, Christoph Hellwig, john stultz, Oleg Nesterov,
	Paul E. McKenney, Dipankar Sarma, David S. Miller,
	matthew.wilcox, kuznet

On Fri, Jun 22, 2007 at 10:16:47AM -0700, Linus Torvalds wrote:
> 
> 
> On Fri, 22 Jun 2007, Steven Rostedt wrote:
> > 
> > I just want to state that tasklets served their time well. But it's time
> > to give them an honorable discharge.  So lets get rid of tasklets and
> > given them a standing salute as they leave :-)
> 
> Well, independently of whether we actually discharge them or not, I do 
> tend to always like things that split independent concepts up (whether 
> they then end up being _implemented_ independently of each other or not is 
> a separate issue).
> 
> So patches 1-4 all look fine to me. In fact, 5 looks ok too.

I don't think we should put 3 and 4 in.  The code in drm is just crap and
should just be rewritten.  I'll cook up a patch, but I can't actually
test it due to lack of hardware.


^ permalink raw reply	[flat|nested] 127+ messages in thread

* Re: [RFC PATCH 0/6] Convert all tasklets to workqueues
  2007-06-22 17:16 ` Linus Torvalds
  2007-06-22 17:31   ` Steven Rostedt
  2007-06-22 18:32   ` Christoph Hellwig
@ 2007-06-22 20:40   ` Ingo Molnar
  2007-06-22 21:00     ` Christoph Hellwig
                       ` (2 more replies)
  2 siblings, 3 replies; 127+ messages in thread
From: Ingo Molnar @ 2007-06-22 20:40 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Steven Rostedt, LKML, Andrew Morton, Thomas Gleixner,
	Christoph Hellwig, john stultz, Oleg Nesterov, Paul E. McKenney,
	Dipankar Sarma, David S. Miller, matthew.wilcox, kuznet


* Linus Torvalds <torvalds@linux-foundation.org> wrote:

> Whether we actually then want to do 6 is another matter. I think we'd 
> need some measuring and discussion about that.

basically tasklets have a number of limitations:

 - tasklets have certain latency limitations over real tasks. (for
   example they are not guaranteed to be re-executed when they are
   triggered while they are running, so artificial latencies can be
   introduced into the kernel workflow)

 - tasklets have certain execution limitations. (only atomic functions
   can be executed in them)

 - tasklets have certain fairness limitations. (they are executed in
   softirq context and thus preempt everything, even if there is some
   potentially more important, high-priority task waiting to be
   executed.)

 - the 'priority levels' approach of softirqs is not really 
   self-documenting - unlike real locks. As a result we've got some 
   vague coupling between network softirq processing and timer softirq 
   processing, which spilled over into tasklets as well. The 'hi' and
   'low' concept of tasklets isnt really used either. We should reduce 
   the amount of such opaque 'coupling' between workflows - it should be 
   spelled out explicitly via some synchronization construct.

 - tasklets are duplicated infrastructure (over existing workqueues) 
   that, if it's possible to do it compatibly, would be a good idea to 
   eliminate.

when it comes to 'deferred processing', we've basically got two 'prime' 
choices for deferred processing:

 - if it's high-performance then it goes into a softirq.

 - if performance is not important, or robustness and flexibility is 
   more important than performance, then workqueues are used.

basically tasklets do _neither_ really well. They are too 'global' to 
scale really well on SMP (even the RCU tasklet wasnt a real tasklet: it 
was a _per CPU tasklet_, which almost by definition is equivalent to a 
softirq, with some extra glue overhead ...), and tasklets are also too 
much tied to softirqs to be used as a generic processing context.

that's why i'd like them to be gently but firmly phased out =B-)

	Ingo

^ permalink raw reply	[flat|nested] 127+ messages in thread

* Re: [RFC PATCH 0/6] Convert all tasklets to workqueues
  2007-06-22 20:40   ` Ingo Molnar
@ 2007-06-22 21:00     ` Christoph Hellwig
  2007-06-22 21:10       ` Ingo Molnar
  2007-06-22 21:13       ` Thomas Gleixner
  2007-06-22 21:37     ` Linus Torvalds
  2007-06-22 21:53     ` Daniel Walker
  2 siblings, 2 replies; 127+ messages in thread
From: Christoph Hellwig @ 2007-06-22 21:00 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Linus Torvalds, Steven Rostedt, LKML, Andrew Morton,
	Thomas Gleixner, Christoph Hellwig, john stultz, Oleg Nesterov,
	Paul E. McKenney, Dipankar Sarma, David S. Miller,
	matthew.wilcox, kuznet

On Fri, Jun 22, 2007 at 10:40:58PM +0200, Ingo Molnar wrote:
> when it comes to 'deferred processing', we've basically got two 'prime' 
> choices for deferred processing:
> 
>  - if it's high-performance then it goes into a softirq.
> 
>  - if performance is not important, or robustness and flexibility is 
>    more important than performance, then workqueues are used.
> 
> basically tasklets do _neither_ really well. They are too 'global' to 
> scale really well on SMP (even the RCU tasklet wasnt a real tasklet: it 
> was a _per CPU tasklet_, which almost by definition is equivalent to a 
> softirq, with some extra glue overhead ...), and tasklets are also too 
> much tied to softirqs to be used as a generic processing context.
> 
> that's why i'd like them to be gently but firmly phased out =B-)

Note that we also have a lot of inefficiency in the way we do deferred
processing.  Think of a setup where an XFS filesystem runs over
a megaraid adapter.

 (1) we get a real hardirq, which just clears the interrupt and then
     defers to a tasklet
 (2) tasklet walks the producer / consumer queue and then calls scsi_done
     for each completed scsi command which ends up doing
     raise_softirq_irqoff(BLOCK_SOFTIRQ);
 (3) block softirq does the heavy lifting for command completion and finally
     calls back into the bio's completion routine
 (4) xfs wants to avoid irq safe locking and thus defers the command to a
     kthread

This is rather inefficient due to all the (semi-)context switches already,
and it is by far not the worst setup, given that a lot of dm modules can
involve another thread in the process.

Now if we just plainly convert tasklets to a thread-based abstraction, this
existing code becomes really dumb because we go from hardirq to process
context to go back to softirq context to go back to process context.

Ouch!
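
To put rough numbers on that, here is a small C model of the two
completion paths.  It is a simplification under stated assumptions: the
enum, the path arrays and the counting helpers are hypothetical, each
step (1)-(4) is reduced to the kind of context it runs in, and the
tasklet-to-block-softirq hop is treated as staying in softirq context
even though it is arguably a semi-context switch of its own.

```c
#include <assert.h>
#include <stddef.h>

enum ctx { CTX_HARDIRQ, CTX_SOFTIRQ, CTX_PROCESS };

/* Steps (1)-(4) with tasklets as they are today. */
static const enum ctx path_today[] = {
	CTX_HARDIRQ,	/* (1) megaraid hardirq */
	CTX_SOFTIRQ,	/* (2) tasklet walks the queue */
	CTX_SOFTIRQ,	/* (3) BLOCK_SOFTIRQ completion */
	CTX_PROCESS,	/* (4) xfs kthread */
};

/* The same steps if the tasklet runs in a thread instead. */
static const enum ctx path_threaded[] = {
	CTX_HARDIRQ,	/* (1) megaraid hardirq */
	CTX_PROCESS,	/* (2) tasklet, now in a worker thread */
	CTX_SOFTIRQ,	/* (3) BLOCK_SOFTIRQ completion */
	CTX_PROCESS,	/* (4) xfs kthread */
};

/* Count how often consecutive steps run in a different kind of context. */
static size_t count_context_changes(const enum ctx *path, size_t n)
{
	size_t changes = 0;

	assert(path != NULL);
	for (size_t i = 1; i < n; i++)
		if (path[i] != path[i - 1])
			changes++;
	return changes;
}

/* Count hand-offs into process context, i.e. thread wakeups. */
static size_t count_thread_handoffs(const enum ctx *path, size_t n)
{
	size_t handoffs = 0;

	assert(path != NULL);
	for (size_t i = 1; i < n; i++)
		if (path[i] == CTX_PROCESS && path[i - 1] != CTX_PROCESS)
			handoffs++;
	return handoffs;
}
```

Under this model the existing path has two context changes and one
thread hand-off per completion, while the thread-based path has three
changes and two hand-offs.  It says nothing about the actual cost of
each hop, which is what would need benchmarking.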

I think we need to put a little more thought into how we can optimize our
irq path for the full stack.  Using irqthreads in an intelligent way might
be one option, but we'll need a lot of heavy benchmarking whatever way
we go.

^ permalink raw reply	[flat|nested] 127+ messages in thread

* Re: [RFC PATCH 0/6] Convert all tasklets to workqueues
  2007-06-22 21:00     ` Christoph Hellwig
@ 2007-06-22 21:10       ` Ingo Molnar
  2007-06-22 21:13       ` Thomas Gleixner
  1 sibling, 0 replies; 127+ messages in thread
From: Ingo Molnar @ 2007-06-22 21:10 UTC (permalink / raw)
  To: Christoph Hellwig, Linus Torvalds, Steven Rostedt, LKML,
	Andrew Morton, Thomas Gleixner, john stultz, Oleg Nesterov,
	Paul E. McKenney, Dipankar Sarma, David S. Miller,
	matthew.wilcox, kuznet


* Christoph Hellwig <hch@infradead.org> wrote:

> Note that we also have a lot of inefficiency in the way we do deferred 
> processing.  Think of a setup where an XFS filesystem runs over
> a megaraid adapter.
> 
>  (1) we get a real hardirq, which just clears the interrupt and then
>      defers to a tasklet
>  (2) tasklet walks the producer / consumer queue and then calls scsi_done
>      for each completed scsi command which ends up doing
>      raise_softirq_irqoff(BLOCK_SOFTIRQ);
>  (3) block softirq does the heavy lifting for command completion and finally
>      calls back into the bio's completion routine
>  (4) xfs wants to avoid irq safe locking and thus defers the command to a
>      kthread

i don't understand - why is a tasklet used at all? Why not do it straight
in the BLOCK_SOFTIRQ? Using tasklets there is extra, unnecessary 
overhead already.

	Ingo


* Re: [RFC PATCH 0/6] Convert all tasklets to workqueues
  2007-06-22 21:00     ` Christoph Hellwig
  2007-06-22 21:10       ` Ingo Molnar
@ 2007-06-22 21:13       ` Thomas Gleixner
  1 sibling, 0 replies; 127+ messages in thread
From: Thomas Gleixner @ 2007-06-22 21:13 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Ingo Molnar, Linus Torvalds, Steven Rostedt, LKML, Andrew Morton,
	john stultz, Oleg Nesterov, Paul E. McKenney, Dipankar Sarma,
	David S. Miller, matthew.wilcox, kuznet

On Fri, 2007-06-22 at 22:00 +0100, Christoph Hellwig wrote:
> Note that we also have a lot of inefficiency in the way we do deferred
> processing.  Think of a setup where an XFS filesystem runs over
> a megaraid adapter.
> 
>  (1) we get a real hardirq, which just clears the interrupt and then
>      defers to a tasklet
>  (2) tasklet walks the producer / consumer queue and then calls scsi_done
>      for each completed scsi command which ends up doing
>      raise_softirq_irqoff(BLOCK_SOFTIRQ);
>  (3) block softirq does the heavy lifting for command completion and finally
>      calls back into the bio's completion routine
>  (4) xfs wants to avoid irq safe locking and thus defers the command to a
>      kthread
> 
> This is already rather inefficient because of all the (semi-)context
> switches, and it is far from the worst setup, given that a lot of dm
> modules can involve another thread in the process.
> 
> Now if we just plainly convert tasklets to a thread-based abstraction, this
> existing code becomes really dumb, because we go from hardirq to process
> context, back to softirq context, and back to process context again.
> 
> Ouch!
> 
> I think we need to put a little more thought into how we can optimize our
> irq path for the full stack.  Using irqthreads in an intelligent way might
> be one option, but we'll need a lot of heavy benchmarking whatever way
> we go.

Your above scenario screams for a threaded interrupt handler, where you
actually can unify a lot of this into one single context.

	tglx




* Re: [RFC PATCH 0/6] Convert all tasklets to workqueues
  2007-06-22 20:40   ` Ingo Molnar
  2007-06-22 21:00     ` Christoph Hellwig
@ 2007-06-22 21:37     ` Linus Torvalds
  2007-06-22 21:59       ` Ingo Molnar
  2007-06-22 21:53     ` Daniel Walker
  2 siblings, 1 reply; 127+ messages in thread
From: Linus Torvalds @ 2007-06-22 21:37 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Steven Rostedt, LKML, Andrew Morton, Thomas Gleixner,
	Christoph Hellwig, john stultz, Oleg Nesterov, Paul E. McKenney,
	Dipankar Sarma, David S. Miller, matthew.wilcox, kuznet



On Fri, 22 Jun 2007, Ingo Molnar wrote:
> 
> * Linus Torvalds <torvalds@linux-foundation.org> wrote:
> 
> > Whether we actually then want to do 6 is another matter. I think we'd 
> > need some measuring and discussion about that.
> 
> basically tasklets have a number of limitations:

I'm not disputing that they aren't pretty.

But none of your arguments really touch on the deeper issue:

 - Is the new code *technically*better*?

Don't get me wrong. I like cleanups as much as the next guy. I have no 
problem at all with the first four patches in the series, because those 
are cleanups that have no technical impact apart from that "cleanup-ness".

The reason I ask about patch #6 is simply that in the end "clean code" 
matters less than "good results".

I'm a _huge_ believer in "clean code", but the fact is, I'm an even bigger 
believer in "reality bites". I'd really like to see some numbers.

If the numbers say that there is no performance difference (or even 
better: that the new code performs better or fixes some latency issue or 
whatever), I'll be very happy. But if the numbers say that it's worse, no 
amount of cleanliness really changes that. 

		Linus


* Re: [RFC PATCH 0/6] Convert all tasklets to workqueues
  2007-06-22 20:40   ` Ingo Molnar
  2007-06-22 21:00     ` Christoph Hellwig
  2007-06-22 21:37     ` Linus Torvalds
@ 2007-06-22 21:53     ` Daniel Walker
  2007-06-22 22:09       ` david
  2007-06-22 22:15       ` Ingo Molnar
  2 siblings, 2 replies; 127+ messages in thread
From: Daniel Walker @ 2007-06-22 21:53 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Linus Torvalds, Steven Rostedt, LKML, Andrew Morton,
	Thomas Gleixner, Christoph Hellwig, john stultz, Oleg Nesterov,
	Paul E. McKenney, Dipankar Sarma, David S. Miller,
	matthew.wilcox, kuznet

On Fri, 2007-06-22 at 22:40 +0200, Ingo Molnar wrote:

> 
>  - tasklets have certain fairness limitations. (they are executed in
>    softirq context and thus preempt everything, even if there is some
>    potentially more important, high-priority task waiting to be
>    executed.)

Since -rt has been executing tasklets in process context for a long
time, I'm not sure this change would cause too many regressions. However,
it seems like implicit dependencies on "tasklets preempt everything"
might crop up. The other issue is if they don't "preempt
everything" (most of the time), what default priority do we give them
(all of the time)? It seems like Christoph's suggestion of converting
all the tasklets individually might be a better option, to deal with
specific pitfalls.

Daniel



* Re: [RFC PATCH 0/6] Convert all tasklets to workqueues
  2007-06-22 21:37     ` Linus Torvalds
@ 2007-06-22 21:59       ` Ingo Molnar
  2007-06-22 22:09         ` Ingo Molnar
                           ` (6 more replies)
  0 siblings, 7 replies; 127+ messages in thread
From: Ingo Molnar @ 2007-06-22 21:59 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Steven Rostedt, LKML, Andrew Morton, Thomas Gleixner,
	Christoph Hellwig, john stultz, Oleg Nesterov, Paul E. McKenney,
	Dipankar Sarma, David S. Miller, matthew.wilcox, kuznet


* Linus Torvalds <torvalds@linux-foundation.org> wrote:

> If the numbers say that there is no performance difference (or even 
> better: that the new code performs better or fixes some latency issue 
> or whatever), I'll be very happy. But if the numbers say that it's 
> worse, no amount of cleanliness really changes that.

Most of the tasklet uses are in rarely used or arcane drivers - in fact 
none of my 10 test-boxes utilizes _any_ tasklet in any way that could 
even get close to mattering to performance. In other words: i just 
cannot test this, nor do i think that others will really test this. I.e. 
if we don't approach this problem in some other way, nothing will happen
and Steve's patch will be stalled forever and will live in -rt forever. 
(which might be a correct end result too, but i'm just not giving up 
this easily :-)

so how about the following, different approach: anyone who has a tasklet 
in any performance-sensitive codepath, please yell now. We'll also do a 
proactive search for such places. We can convert those places to 
softirqs, or move them back into hardirq context. Once this is done - 
and i doubt it will go beyond 1-2 places - we can just mass-convert the 
other 110 places to the lame but compatible solution of doing them in a 
global thread context.

[ and on a similar note, i still haven't given up on seeing all BKL use
  gone from the kernel. I expect it to happen any decade now ;-) ]

	Ingo


* Re: [RFC PATCH 0/6] Convert all tasklets to workqueues
  2007-06-22 21:59       ` Ingo Molnar
@ 2007-06-22 22:09         ` Ingo Molnar
  2007-06-22 22:43           ` Roland Dreier
  2007-06-22 22:58         ` Steven Rostedt
                           ` (5 subsequent siblings)
  6 siblings, 1 reply; 127+ messages in thread
From: Ingo Molnar @ 2007-06-22 22:09 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Steven Rostedt, LKML, Andrew Morton, Thomas Gleixner,
	Christoph Hellwig, john stultz, Oleg Nesterov, Paul E. McKenney,
	Dipankar Sarma, David S. Miller, matthew.wilcox, kuznet


* Ingo Molnar <mingo@elte.hu> wrote:

> [ and on a similar note, i still haven't given up on seeing all BKL
>   use gone from the kernel. I expect it to happen any decade now ;-) ]

2.6.21 had 476 lock_kernel() calls. 2.6.22-git has 473 lock_kernel() 
calls currently. With that kind of flux we'll see the BKL gone in about 
40 years =B-)
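
The 40-year figure can be reproduced with a back-of-the-envelope
calculation. The sketch below assumes a constant removal rate and roughly
four kernel releases per year; both are assumptions made for illustration,
and the function names are hypothetical.

```c
/* Releases needed to reach zero calls at a constant removal rate,
 * rounded up; -1 if no calls are being removed at all. */
static int releases_to_zero(int remaining, int removed_per_release)
{
	if (removed_per_release <= 0)
		return -1;
	return (remaining + removed_per_release - 1) / removed_per_release;
}

/* Convert the release count to years. releases_per_year is an
 * assumption supplied by the caller, not a measured value. */
static double years_to_zero(int remaining, int removed_per_release,
			    double releases_per_year)
{
	int releases = releases_to_zero(remaining, removed_per_release);

	if (releases < 0 || releases_per_year <= 0.0)
		return -1.0;
	return releases / releases_per_year;
}
```

With the numbers above, releases_to_zero(473, 476 - 473) is 158 releases,
and years_to_zero(473, 3, 4.0) is 39.5 years.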

'struct semaphore' use on the other hand has gone down by 10% in this 
release, which is a good rate. I guess the lack of lockdep coverage for 
semaphores might be one of the driving forces? ;-)

	Ingo


* Re: [RFC PATCH 0/6] Convert all tasklets to workqueues
  2007-06-22 21:53     ` Daniel Walker
@ 2007-06-22 22:09       ` david
  2007-06-22 22:15         ` Daniel Walker
  2007-06-22 22:15       ` Ingo Molnar
  1 sibling, 1 reply; 127+ messages in thread
From: david @ 2007-06-22 22:09 UTC (permalink / raw)
  To: Daniel Walker
  Cc: Ingo Molnar, Linus Torvalds, Steven Rostedt, LKML, Andrew Morton,
	Thomas Gleixner, Christoph Hellwig, john stultz, Oleg Nesterov,
	Paul E. McKenney, Dipankar Sarma, David S. Miller,
	matthew.wilcox, kuznet

On Fri, 22 Jun 2007, Daniel Walker wrote:

> 
> On Fri, 2007-06-22 at 22:40 +0200, Ingo Molnar wrote:
>
>>
>>  - tasklets have certain fairness limitations. (they are executed in
>>    softirq context and thus preempt everything, even if there is some
>>    potentially more important, high-priority task waiting to be
>>    executed.)
>
> Since -rt has been executing tasklets in process context for a long
> time, I'm not sure this change would cause too many regressions. However,
> it seems like implicit dependencies on "tasklets preempt everything"
> might crop up. The other issue is if they don't "preempt
> everything" (most of the time), what default priority do we give them
> (all of the time)? It seems like Christoph's suggestion of converting
> all the tasklets individually might be a better option, to deal with
> specific pitfalls.

that would be the safe way to do it, but it will take a lot of time and a 
lot of testing.

it's probably better to try the big-bang change and only if you see
problems go back and break things down.

remember, these changes have been in use in -rt for a while. there's 
reason to believe that they aren't going to cause drastic problems.

David Lang


* Re: [RFC PATCH 0/6] Convert all tasklets to workqueues
  2007-06-22 21:53     ` Daniel Walker
  2007-06-22 22:09       ` david
@ 2007-06-22 22:15       ` Ingo Molnar
  1 sibling, 0 replies; 127+ messages in thread
From: Ingo Molnar @ 2007-06-22 22:15 UTC (permalink / raw)
  To: Daniel Walker
  Cc: Linus Torvalds, Steven Rostedt, LKML, Andrew Morton,
	Thomas Gleixner, Christoph Hellwig, john stultz, Oleg Nesterov,
	Paul E. McKenney, Dipankar Sarma, David S. Miller,
	matthew.wilcox, kuznet


* Daniel Walker <dwalker@mvista.com> wrote:

> >  - tasklets have certain fairness limitations. (they are executed in
> >    softirq context and thus preempt everything, even if there is 
> >    some potentially more important, high-priority task waiting to be 
> >    executed.)
> 
> Since -rt has been executing tasklets in process context for a long 
> time, I'm not sure this change would cause too many regressions.
> However, it seems like implicit dependencies on "tasklets preempt 
> everything" might crop up. The other issue is if they don't "preempt 
> everything" (most of the time), what default priority do we give them 
> (all of the time)? [...]

there is no such guarantee at all (of 'instant preemption'), even with 
current, softirq-based tasklets. A tasklet might be 'stolen' by another 
CPU. It might be delayed to the next timer tick (or other softirq 
execution). Or it might be delayed into a ksoftirqd context, which 
currently runs at nice +19. So your worry of implicit execution 
dependencies is unfounded, because, if they existed, they would be bad 
(and triggerable) bugs today too.

	Ingo


* Re: [RFC PATCH 0/6] Convert all tasklets to workqueues
  2007-06-22 22:09       ` david
@ 2007-06-22 22:15         ` Daniel Walker
  2007-06-22 22:44           ` Ingo Molnar
  0 siblings, 1 reply; 127+ messages in thread
From: Daniel Walker @ 2007-06-22 22:15 UTC (permalink / raw)
  To: david
  Cc: Ingo Molnar, Linus Torvalds, Steven Rostedt, LKML, Andrew Morton,
	Thomas Gleixner, Christoph Hellwig, john stultz, Oleg Nesterov,
	Paul E. McKenney, Dipankar Sarma, David S. Miller,
	matthew.wilcox, kuznet

On Fri, 2007-06-22 at 15:09 -0700, david@lang.hm wrote:
> On Fri, 22 Jun 2007, Daniel Walker wrote:
> 
> > 
> > On Fri, 2007-06-22 at 22:40 +0200, Ingo Molnar wrote:
> >
> >>
> >>  - tasklets have certain fairness limitations. (they are executed in
> >>    softirq context and thus preempt everything, even if there is some
> >>    potentially more important, high-priority task waiting to be
> >>    executed.)
> >
> > Since -rt has been executing tasklets in process context for a long
> > time, I'm not sure this change would cause too many regressions. However,
> > it seems like implicit dependencies on "tasklets preempt everything"
> > might crop up. The other issue is if they don't "preempt
> > everything" (most of the time), what default priority do we give them
> > (all of the time)? It seems like Christoph's suggestion of converting
> > all the tasklets individually might be a better option, to deal with
> > specific pitfalls.
> 
> that would be the safe way to do it, but it will take a lot of time and a 
> lot of testing.
> 
> it's probably better to try the big-bang change and only if you see
> problems go back and break things down.

For testing I'd agree, but not for a kernel that is supposed to be
stable.

> remember, these changes have been in use in -rt for a while. there's 
> reason to believe that they aren't going to cause drastic problems.

Since I've been working with -rt (~2 years now I think) it's clear that
the number of testers of the patch isn't all that high compared to the
stable kernel. There are tons of drivers which get no coverage by -rt
patch users.

So the fact that something similar is in -rt is good, but it's not a
silver bullet ..

Daniel



* Re: [RFC PATCH 4/6] Make DRM use the tasklet is-sched API
  2007-06-22 15:36           ` Daniel Walker
@ 2007-06-22 22:38             ` Ingo Molnar
  2007-06-22 23:28               ` Daniel Walker
  0 siblings, 1 reply; 127+ messages in thread
From: Ingo Molnar @ 2007-06-22 22:38 UTC (permalink / raw)
  To: Daniel Walker
  Cc: Steven Rostedt, Thomas Gleixner, LKML, Linus Torvalds,
	Andrew Morton, Christoph Hellwig, john stultz, Oleg Nesterov,
	Paul E. McKenney, Dipankar Sarma, David S. Miller,
	matthew.wilcox, kuznet


* Daniel Walker <dwalker@mvista.com> wrote:

> > The two patches have two different objectives, even though they are 
> > related and currently on a 1 to 1 basis. The patches regardless, 
> > should stay separate.
> 
> I'm not convinced yet .. One more stab?

uhm, i don't think Steve needs to 'convince' you. You have been (once
again ...) given extensive explanations about a really trivial topic,
but you keep pushing and demanding very aggressively without apparently
reading the answers. If you don't want to or cannot understand it's
really your problem to solve, not Steve's ...

	Ingo


* Re: [RFC PATCH 0/6] Convert all tasklets to workqueues
  2007-06-22 22:09         ` Ingo Molnar
@ 2007-06-22 22:43           ` Roland Dreier
  2007-06-22 22:57             ` Alan Cox
  0 siblings, 1 reply; 127+ messages in thread
From: Roland Dreier @ 2007-06-22 22:43 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Linus Torvalds, Steven Rostedt, LKML, Andrew Morton,
	Thomas Gleixner, Christoph Hellwig, john stultz, Oleg Nesterov,
	Paul E. McKenney, Dipankar Sarma, David S. Miller,
	matthew.wilcox, kuznet

 > > [ and on a similar note, i still haven't given up on seeing all BKL
 > >   use gone from the kernel. I expect it to happen any decade now ;-) ]

 > 2.6.21 had 476 lock_kernel() calls. 2.6.22-git has 473 lock_kernel() 
 > calls currently. With that kind of flux we'll see the BKL gone in about 
 > 40 years =B-)

 > 'struct semaphore' use on the other hand has gone down by 10% in this 
 > release, which is a good rate. I guess the lack of lockdep coverage for 
 > semaphores might be one of the driving forces? ;-)

The problem with removing uses of the BKL is that a "lock_kernel()"
gives no clue about what it is protecting against, and so it requires
a lot of very difficult auditing to replace with appropriate locking.

To take a couple of examples at random: fs/ext4/ioctl.c takes the BKL
in ext4_compat_ioctl() around the call to ext4_ioctl().  Kind of sad
that a "next-generation" FS still uses the BKL, but who understands
things well enough to say how all the cases in ext4_ioctl() are
relying on being called with the BKL held?

As a second example, msr_seek() in arch/i386/kernel/msr.c... is the
inode semaphore enough or not?  Who understands the implications well
enough to say?

Most semaphores on the other hand can be replaced by mutexes or
completions in a fairly straightforward way.

 - R.


* Re: [RFC PATCH 0/6] Convert all tasklets to workqueues
  2007-06-22 22:15         ` Daniel Walker
@ 2007-06-22 22:44           ` Ingo Molnar
  2007-06-22 23:28             ` Daniel Walker
  0 siblings, 1 reply; 127+ messages in thread
From: Ingo Molnar @ 2007-06-22 22:44 UTC (permalink / raw)
  To: Daniel Walker
  Cc: david, Linus Torvalds, Steven Rostedt, LKML, Andrew Morton,
	Thomas Gleixner, Christoph Hellwig, john stultz, Oleg Nesterov,
	Paul E. McKenney, Dipankar Sarma, David S. Miller,
	matthew.wilcox, kuznet


* Daniel Walker <dwalker@mvista.com> wrote:

> > remember, these changes have been in use in -rt for a while. there's 
> > reason to believe that they aren't going to cause drastic problems.
> 
> Since I've been working with -rt (~2 years now I think) it's clear 
> that the number of testers of the patch isn't all that high compared 
> to the stable kernel . [...]

You haven't been watching it too closely i guess :-) The -rt kernel often
pops up regressions before mainline does, especially when it comes to
arcane hardware often used by embedded vendors [ =B-) ]. It even
triggers certain high-end scalability and race bugs before the mainline
kernel does, due to its unique scheduling behavior.

So yes, -rt obviously does not have as wide a tester base as the
mainline kernel (but it's by no means small); it nevertheless has a
tester base that is partly orthogonal to the mainline kernel.

Furthermore, -rt has a wide enough tester base that something not
having caused problems in it for years is certainly at least a good
indicator that it isn't going to cause drastic problems ...
which was the point to begin with.

	Ingo


* Re: [RFC PATCH 0/6] Convert all tasklets to workqueues
  2007-06-22 22:43           ` Roland Dreier
@ 2007-06-22 22:57             ` Alan Cox
  0 siblings, 0 replies; 127+ messages in thread
From: Alan Cox @ 2007-06-22 22:57 UTC (permalink / raw)
  To: Roland Dreier
  Cc: Ingo Molnar, Linus Torvalds, Steven Rostedt, LKML, Andrew Morton,
	Thomas Gleixner, Christoph Hellwig, john stultz, Oleg Nesterov,
	Paul E. McKenney, Dipankar Sarma, David S. Miller,
	matthew.wilcox, kuznet

> As a second example, msr_seek() in arch/i386/kernel/msr.c... is the
> inode semaphore enough or not?  Who understands the implications well
> enough to say?

lseek is one of the nasty remaining cases. tty is another real horror
that needs further work but we slowly get closer - drivers/char is almost
but not entirely lock_kernel free now and several users look quite easy
to swat (everyone but tty_io.c)

Alan


* Re: [RFC PATCH 0/6] Convert all tasklets to workqueues
  2007-06-22 21:59       ` Ingo Molnar
  2007-06-22 22:09         ` Ingo Molnar
@ 2007-06-22 22:58         ` Steven Rostedt
  2007-06-23  6:23         ` Dave Airlie
                           ` (4 subsequent siblings)
  6 siblings, 0 replies; 127+ messages in thread
From: Steven Rostedt @ 2007-06-22 22:58 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Linus Torvalds, LKML, Andrew Morton, Thomas Gleixner,
	Christoph Hellwig, john stultz, Oleg Nesterov, Paul E. McKenney,
	Dipankar Sarma, David S. Miller, matthew.wilcox, kuznet

On Fri, 2007-06-22 at 23:59 +0200, Ingo Molnar wrote:
> * Linus Torvalds <torvalds@linux-foundation.org> wrote:
> 
> > If the numbers say that there is no performance difference (or even 
> > better: that the new code performs better or fixes some latency issue 
> > or whatever), I'll be very happy. But if the numbers say that it's 
> > worse, no amount of cleanliness really changes that.
> 
> Most of the tasklet uses are in rarely used or arcane drivers - in fact 
> none of my 10 test-boxes utilizes _any_ tasklet in any way that could 
> even get close to mattering to performance. In other words: i just 
> cannot test this, nor do i think that others will really test this. 

This is exactly why I included that CONFIG option in the first series:
I only have a handful of hardware that actually uses tasklets.
And all those pr_debugs I had were turned on on most of my boxes.  I
was not flooded with prints either (every function including
tasklet_schedule had a print).

So, basically, I can't do benchmarks. I was hoping to get this into -mm
with an easy way for people who have hardware that uses tasklets
extensively to run it with tasklets on and off to see if there is a
difference.  My fear of not having a config option to switch between the
two (for -mm only) is that we may lose benchmarking from those that are
not comfortable removing this patch from -mm.  There are people out
there that download and test the -mm tree straight from kernel.org.
Just because someone compiles their own kernel doesn't mean they can (or
will) patch it.

-- Steve





* Re: [RFC PATCH 4/6] Make DRM use the tasklet is-sched API
  2007-06-22 22:38             ` Ingo Molnar
@ 2007-06-22 23:28               ` Daniel Walker
  0 siblings, 0 replies; 127+ messages in thread
From: Daniel Walker @ 2007-06-22 23:28 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Steven Rostedt, Thomas Gleixner, LKML, Linus Torvalds,
	Andrew Morton, Christoph Hellwig, john stultz, Oleg Nesterov,
	Paul E. McKenney, Dipankar Sarma, David S. Miller,
	matthew.wilcox, kuznet

On Sat, 2007-06-23 at 00:38 +0200, Ingo Molnar wrote:
> * Daniel Walker <dwalker@mvista.com> wrote:
> 
> > > The two patches have two different objectives, even though they are 
> > > related and currently on a 1 to 1 basis. The patches regardless, 
> > > should stay separate.
> > 
> > I'm not convinced yet .. One more stab?
> 
> uhm, i don't think Steve needs to 'convince' you. You have been (once
> again ...) given extensive explanations about a really trivial topic,
> but you keep pushing and demanding very aggressively without apparently
> reading the answers. If you don't want to or cannot understand it's
> really your problem to solve, not Steve's ...

You're right, Steven doesn't need to convince me. He requested comments and
I gave them; if he or you want to ignore them, that's your prerogative.

How can you say I didn't read the answers .. I said "I read this 5
times" ! It doesn't get much more explicit than that .. In fact, my
response was mainly because I thought his explanation was confusing.. How
can you fault _me_ for that, it's the medium we're using.

It seems like you assume all my emails are "demanding" and "aggressive",
where I'm just asking questions and making comments .. Who am I to be
making demands of anyone ?

<joke>
I DEMAND you do what I say! You better do it or else! I'll force Linus
to ignore you, so you better do it.
</joke>

Daniel



* Re: [RFC PATCH 0/6] Convert all tasklets to workqueues
  2007-06-22 22:44           ` Ingo Molnar
@ 2007-06-22 23:28             ` Daniel Walker
  0 siblings, 0 replies; 127+ messages in thread
From: Daniel Walker @ 2007-06-22 23:28 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: david, Linus Torvalds, Steven Rostedt, LKML, Andrew Morton,
	Thomas Gleixner, Christoph Hellwig, john stultz, Oleg Nesterov,
	Paul E. McKenney, Dipankar Sarma, David S. Miller,
	matthew.wilcox, kuznet

On Sat, 2007-06-23 at 00:44 +0200, Ingo Molnar wrote:
> * Daniel Walker <dwalker@mvista.com> wrote:
> 
> > > remember, these changes have been in use in -rt for a while. there's 
> > > reason to believe that they aren't going to cause drastic problems.
> > 
> > Since I've been working with -rt (~2 years now I think) it's clear 
> > that the number of testers of the patch isn't all that high compared 
> > to the stable kernel . [...]
> 
> You haven't been watching it too closely i guess :-) The -rt kernel often
> pops up regressions before mainline does, especially when it comes to 
> arcane hardware often used by embedded vendors [ =B-) ]. It even 
> triggers certain high-end scalability and race bugs before the mainline 
> kernel does, due to its unique scheduling behavior.

Don't assume anything Ingo ;)

> So yes, -rt obviously does not have as wide a tester base as the
> mainline kernel (but it's by no means small); it nevertheless has a
> tester base that is partly orthogonal to the mainline kernel.
> 
> Furthermore, -rt has a wide enough tester base that something not
> having caused problems in it for years is certainly at least a good
> indicator that it isn't going to cause drastic problems ...
> which was the point to begin with.

As I said, it's a plus but not a silver bullet ? Are you saying I was
wrong ?

Daniel



* Re: [RFC PATCH 4/6] Make DRM use the tasklet is-sched API
  2007-06-22 18:24         ` Christoph Hellwig
@ 2007-06-22 23:38           ` Dave Airlie
  0 siblings, 0 replies; 127+ messages in thread
From: Dave Airlie @ 2007-06-22 23:38 UTC (permalink / raw)
  To: Christoph Hellwig, Arnd Bergmann, Thomas Gleixner, Daniel Walker,
	Steven Rostedt, LKML, Linus Torvalds, Ingo Molnar, Andrew Morton,
	john stultz, Oleg Nesterov, Paul E. McKenney, Dipankar Sarma,
	David S. Miller, matthew.wilcox, kuznet

> >
> > The drm_locked_tasklet() function seems to have multiple bugs anyway,
> > so getting rid of it can only help, and it avoids exporting a new
> > tasklet_is_scheduled() interface.
>
> That's exactly what I thought when looking over this code.  There's
> some really crappy code in that area, and it should simply be
> rewritten.

Can someone submit a patch or even a better review? btw removing the
core stuff and putting it in i915 isn't acceptable, I'll have support for
this feature for other hw coming up so the generic code needs to be
available...

Dave.


* Re: [RFC PATCH 0/6] Convert all tasklets to workqueues
  2007-06-22  4:00 [RFC PATCH 0/6] Convert all tasklets to workqueues Steven Rostedt
                   ` (8 preceding siblings ...)
  2007-06-22 17:16 ` Linus Torvalds
@ 2007-06-23  5:14 ` Stephen Hemminger
  9 siblings, 0 replies; 127+ messages in thread
From: Stephen Hemminger @ 2007-06-23  5:14 UTC (permalink / raw)
  To: Steven Rostedt; +Cc: linux-kernel

On Fri, 22 Jun 2007 00:00:14 -0400
Steven Rostedt <rostedt@goodmis.org> wrote:

> 
> There's a very nice paper by Matthew Wilcox that describes Softirqs,
> Tasklets, Bottom Halves, Task Queues, Work Queues and Timers[1].
> The paper describes the history of these items.  Softirqs and
> tasklets were created to replace bottom halves after a company (Mindcraft)
> showed that Microsoft on a 4x SMP box would outdo Linux. It was discovered
> that this was due to a bottleneck caused by the design of Bottom Halves.
> So Alexey Kuznetsov and Dave Miller [1] (and I'm sure others) created
> softirqs and tasklets to multithread the bottom halves.
> 
> This worked well, and for the time it shut up Microsoft^WMindcraft from
> saying Linux was slow at networking.
> 
> Time passed, and Linux developed other nifty tools, like kthreads and
> work queues. These run in a process context and are not as menacing to
> latencies as softirqs and tasklets are.  Specifically, a tasklet
> acts as a task by only being able to run the function on one CPU
> at a time. The same tasklet cannot run on multiple CPUs.  So in that
> aspect it is like a task (a task can only exist on one CPU at a time).
> But a tasklet is much harder on the rest of the system because it
> runs in interrupt context.  This means that if a higher priority process
> wants to run, it must wait for the tasklet to finish before doing so.
> 
> For the most part, tasklets today are not used for time critical functions.
> Running tasklets in thread context is not harmful to performance of
> the overall system. But running them in interrupt context is, since
> they increase the overall latency for high priority tasks.
>

You will need to search and convert all network drivers that are using
tasklets. Drivers like the ipw2200 will need to be converted to NAPI,
and you probably have to fix a couple of places in the ieee80211 stack as well.



* Re: [RFC PATCH 0/6] Convert all tasklets to workqueues
  2007-06-22 21:59       ` Ingo Molnar
  2007-06-22 22:09         ` Ingo Molnar
  2007-06-22 22:58         ` Steven Rostedt
@ 2007-06-23  6:23         ` Dave Airlie
  2007-06-24 15:16         ` Jonathan Corbet
                           ` (3 subsequent siblings)
  6 siblings, 0 replies; 127+ messages in thread
From: Dave Airlie @ 2007-06-23  6:23 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Linus Torvalds, Steven Rostedt, LKML, Andrew Morton,
	Thomas Gleixner, Christoph Hellwig, john stultz, Oleg Nesterov,
	Paul E. McKenney, Dipankar Sarma, David S. Miller,
	matthew.wilcox, kuznet

>
> Most of the tasklet uses are in rarely used or arcane drivers - in fact
> none of my 10 test-boxes utilizes _any_ tasklet in any way that could
> even get close to mattering to performance. In other words: i just
> cannot test this, nor do i think that others will really test this. I.e.
> if we don't approach this problem in some other way, nothing will happen
> and Steve's patch will be stalled forever and will live in -rt forever.
> (which might be a correct end result too, but i'm just not giving up
> this easily :-)

I've no idea but the drm uses tasklets to schedule the page flip or
front/back blit as close to the vblank interrupt as possible to avoid
tearing on vblank sync'ed applications (and compositing managers).

I'm not sure we have any way to quantify this other than: the closer to
the irq the flip happens, the better the chance of the screen not tearing.
Going forward, as compositing is used more on loaded systems, I wonder
whether we will see more tearing.

I await the cleanups for the DRM code, it came from TG to support the
above feature on Intel hw.

Dave.

* Re: [RFC PATCH 6/6] Convert tasklets to work queues
  2007-06-22  4:00 ` [RFC PATCH 6/6] Convert tasklets to work queues Steven Rostedt
  2007-06-22  7:06   ` Daniel Walker
@ 2007-06-23 11:15   ` Arnd Bergmann
  1 sibling, 0 replies; 127+ messages in thread
From: Arnd Bergmann @ 2007-06-23 11:15 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: LKML, Linus Torvalds, Ingo Molnar, Andrew Morton,
	Thomas Gleixner, Christoph Hellwig, john stultz, Oleg Nesterov,
	Paul E. McKenney, Dipankar Sarma, David S. Miller, kuznet

On Friday 22 June 2007, Steven Rostedt wrote:
> This patch creates an alternative for drivers from using tasklets.
> It creates a "work_tasklet". When configured to use work_tasklets
> instead of tasklets, instead of creating tasklets, a work queue
> is made in its place.  The API is still the same, and the drivers
> don't know that a work queue is being used.
> 

Perhaps the API can be slimmed down in the process, because half
of the tasklet interface functions are hardly used at all.

> +#define DECLARE_TASKLET(name, func, data)				\
> +	struct tasklet_struct name = {					\
> +		__WORK_INITIALIZER((name).work, work_tasklet_exec),	\
> +		LIST_HEAD_INIT((name).list),				\
> +		0,							\
> +		ATOMIC_INIT(0),						\
> +		func,							\
> +		data,							\
> +		#name							\
> +	}

18 users of this macro. Maybe too much for the start, but if we convert
all of them to use either tasklet_init or use work queues directly,
the macro can go away.

> +#define DECLARE_TASKLET_DISABLED(name, func, data)			\
> +	struct tasklet_struct name = {					\
> +		__WORK_INITIALIZER((name).work, work_tasklet_exec),	\
> +		LIST_HEAD_INIT((name).list),				\
> +		0,							\
> +		ATOMIC_INIT(1),						\
> +		func,							\
> +		data,							\
> + 		#name							\
> +	}

this one is easier, there are only four users in total: three input
drivers, and tipc.

> +void tasklet_schedule(struct tasklet_struct *t);
> +#define tasklet_hi_schedule tasklet_schedule
> +extern fastcall void tasklet_enable(struct tasklet_struct *t);
> +#define tasklet_hi_enable tasklet_enable

there are 34 files using tasklet_hi_* functions. In theory, these
could be converted to the non-hi version with a simple search and
replace, if it's clear that there is not much point in keeping them.

The most common use of tasklet_hi is in the alsa drivers. If it
actually makes a difference for them already, maybe there should
be an alsa softirq instead of moving them all over to work queues.

> +void tasklet_disable_nosync(struct tasklet_struct *t);

only has two users, bcm43xx and sc92031. If both are
converted to work queues, the interface can be removed.

> +extern void tasklet_kill_immediate(struct tasklet_struct *t, unsigned int cpu);

only one user in total, rcupdate.c. You already take care of that,
it seems the declaration is just a leftover.

	Arnd <><

* Re: [RFC PATCH 0/6] Convert all tasklets to workqueues
  2007-06-22 21:59       ` Ingo Molnar
                           ` (2 preceding siblings ...)
  2007-06-23  6:23         ` Dave Airlie
@ 2007-06-24 15:16         ` Jonathan Corbet
  2007-06-24 15:52           ` Steven Rostedt
                             ` (2 more replies)
  2007-06-25 18:48         ` Kristian Høgsberg
                           ` (2 subsequent siblings)
  6 siblings, 3 replies; 127+ messages in thread
From: Jonathan Corbet @ 2007-06-24 15:16 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Steven Rostedt, LKML, Andrew Morton, Thomas Gleixner,
	Christoph Hellwig, john stultz, Oleg Nesterov, Paul E. McKenney,
	Dipankar Sarma, David S. Miller, matthew.wilcox, kuznet,
	Linus Torvalds

Ingo Molnar <mingo@elte.hu> wrote:

> so how about the following, different approach: anyone who has a tasklet 
> in any performance-sensitive codepath, please yell now.

The cafe_ccic (OLPC) camera driver uses a tasklet to move frames out of
the DMA buffers in the streaming I/O path.  With this change in place,
I'd worry that the possibility of dropping frames would increase,
especially considering that (1) this is running on OLPC hardware, and 
(2) there is typically a streaming video application running in user
space. 

Obviously some testing is called for here.  I will make an attempt to do
that testing, but the next few weeks involve some insane travel which
will make that hard.  Stay tuned.

Thanks,

jon

Jonathan Corbet / LWN.net / corbet@lwn.net

* Re: [RFC PATCH 0/6] Convert all tasklets to workqueues
  2007-06-24 15:16         ` Jonathan Corbet
@ 2007-06-24 15:52           ` Steven Rostedt
  2007-06-25 16:50           ` Tilman Schmidt
  2007-06-26  0:00           ` Jonathan Corbet
  2 siblings, 0 replies; 127+ messages in thread
From: Steven Rostedt @ 2007-06-24 15:52 UTC (permalink / raw)
  To: Jonathan Corbet
  Cc: Ingo Molnar, LKML, Andrew Morton, Thomas Gleixner,
	Christoph Hellwig, john stultz, Oleg Nesterov, Paul E. McKenney,
	Dipankar Sarma, David S. Miller, matthew.wilcox, kuznet,
	Linus Torvalds


On Sun, 24 Jun 2007, Jonathan Corbet wrote:
>
> The cafe_ccic (OLPC) camera driver uses a tasklet to move frames out of
> the DMA buffers in the streaming I/O path.  With this change in place,
> I'd worry that the possibility of dropping frames would increase,
> especially considering that (1) this is running on OLPC hardware, and
> (2) there is typically a streaming video application running in user
> space.
>
> Obviously some testing is called for here.  I will make an attempt to do
> that testing, but the next few weeks involve some insane travel which
> will make that hard.  Stay tuned.
>

Jon,

Thanks for pointing this driver out. I'd greatly appreciate any report
you give us regarding the effect of having the tasklet as a thread.

-- Steve


* Re: [RFC PATCH 0/6] Convert all tasklets to workqueues
  2007-06-24 15:16         ` Jonathan Corbet
  2007-06-24 15:52           ` Steven Rostedt
@ 2007-06-25 16:50           ` Tilman Schmidt
  2007-06-25 17:06             ` Steven Rostedt
  2007-06-25 19:52             ` Stephen Hemminger
  2007-06-26  0:00           ` Jonathan Corbet
  2 siblings, 2 replies; 127+ messages in thread
From: Tilman Schmidt @ 2007-06-25 16:50 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Steven Rostedt, LKML, Andrew Morton, Thomas Gleixner,
	Christoph Hellwig, john stultz, Oleg Nesterov, Paul E. McKenney,
	Dipankar Sarma, David S. Miller, matthew.wilcox, kuznet,
	Linus Torvalds

Ingo Molnar <mingo@elte.hu> wrote:
> so how about the following, different approach: anyone who has a tasklet 
> in any performance-sensitive codepath, please yell now.

The Siemens Gigaset ISDN base driver uses tasklets in its isochronous
data paths. These will be scheduled for each completion of an isochronous
URB, or every 8 msec for each of the four isochronous pipes if both B
channels are connected. The driver uses three URBs for each pipe, always
maintaining two in flight while processing the third one. So the tasklet
has to run within 16 ms from being scheduled in order to avoid packet
loss (in the receive path) or data underrun (in the transmit path).
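
The deadline above follows from simple arithmetic, sketched here in C.
The function name and its parameters are made up for illustration and do
not come from the Gigaset driver; the only assumption is the one stated
above, that all URBs of a pipe except the one being processed stay in
flight.

```c
/* Illustrative only: not code from the Gigaset driver. */

/*
 * Time budget, in milliseconds, for handling one completed URB.
 * While one URB is being processed, the remaining (urbs_per_pipe - 1)
 * URBs are still in flight, and each of them covers urb_period_ms of
 * data.  Once they have all completed, the pipe runs dry.
 */
unsigned int urb_deadline_ms(unsigned int urbs_per_pipe,
                             unsigned int urb_period_ms)
{
        if (urbs_per_pipe == 0)
                return 0;
        return (urbs_per_pipe - 1) * urb_period_ms;
}
```

With three URBs per pipe and 8 ms of data per URB,
urb_deadline_ms(3, 8) evaluates to 16.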

Does that qualify as performance sensitive for the purpose of this
discussion?

Thanks,
Tilman

-- 
Tilman Schmidt                          E-Mail: tilman@imap.cc
Bonn, Germany
This message consists of 100% recycled bits.
Unopened, best before: (see reverse side)


* Re: [RFC PATCH 0/6] Convert all tasklets to workqueues
  2007-06-25 16:50           ` Tilman Schmidt
@ 2007-06-25 17:06             ` Steven Rostedt
  2007-06-25 20:50               ` Tilman Schmidt
  2007-06-25 19:52             ` Stephen Hemminger
  1 sibling, 1 reply; 127+ messages in thread
From: Steven Rostedt @ 2007-06-25 17:06 UTC (permalink / raw)
  To: Tilman Schmidt
  Cc: Ingo Molnar, LKML, Andrew Morton, Thomas Gleixner,
	Christoph Hellwig, john stultz, Oleg Nesterov, Paul E. McKenney,
	Dipankar Sarma, David S. Miller, matthew.wilcox, kuznet,
	Linus Torvalds

On Mon, 2007-06-25 at 18:50 +0200, Tilman Schmidt wrote:

> The Siemens Gigaset ISDN base driver uses tasklets in its isochronous
> data paths. These will be scheduled for each completion of an isochronous
> URB, or every 8 msec for each of the four isochronous pipes if both B
> channels are connected. The driver uses three URBs for each pipe, always
> maintaining two in flight while processing the third one. So the tasklet
> has to run within 16 ms from being scheduled in order to avoid packet
> loss (in the receive path) or data underrun (in the transmit path).
> 
> Does that qualify as performance sensitive for the purpose of this
> discussion?

Actually, no.  16ms, even 8ms is an incredible amount of time. Unless
you have a thread that is running at a higher priority than the thread
that handles the work queue performing the task, you would have no
problems making that deadline.  If you did miss the deadline without
having ridiculously high prio tasks, I would think that you would miss
your deadline with tasklets as well. Unless the large latency has to do
with preempt_disable, but that large of a latency would be IMHO a bug.

-- Steve
 


* Re: [RFC PATCH 0/6] Convert all tasklets to workqueues
  2007-06-22 21:59       ` Ingo Molnar
                           ` (3 preceding siblings ...)
  2007-06-24 15:16         ` Jonathan Corbet
@ 2007-06-25 18:48         ` Kristian Høgsberg
  2007-06-25 19:11           ` Steven Rostedt
  2007-06-25 21:15           ` Ingo Molnar
  2007-06-26  1:46         ` Dan Williams
  2007-06-28  5:48         ` Jeff Garzik
  6 siblings, 2 replies; 127+ messages in thread
From: Kristian Høgsberg @ 2007-06-25 18:48 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Linus Torvalds, Steven Rostedt, LKML, Andrew Morton,
	Thomas Gleixner, Christoph Hellwig, john stultz, Oleg Nesterov,
	Paul E. McKenney, Dipankar Sarma, David S. Miller,
	matthew.wilcox, kuznet

On Fri, 2007-06-22 at 23:59 +0200, Ingo Molnar wrote:
> so how about the following, different approach: anyone who has a tasklet 
> in any performance-sensitive codepath, please yell now. We'll also do a 
> proactive search for such places. We can convert those places to 
> softirqs, or move them back into hardirq context. Once this is done - 
> and i doubt it will go beyond 1-2 places - we can just mass-convert the 
> other 110 places to the lame but compatible solution of doing them in a 
> global thread context.

OK, here's a yell.  I'm using tasklets in the new firewire stack for all
interrupt handling.  All my interrupt handler does is read out the event
mask and schedule the appropriate tasklets.  Most of these tasklets
typically just end up scheduling work or completing a completion, so
moving it to a workqueue is pretty pointless.  In particular, the
isochronous DMA events must be handled with as little latency as
possible, so a workqueue in that code path would be pretty bad.

I'm not strongly attached to tasklets, and it sounds like I got it wrong
and used the wrong delayed execution mechanism.  But that's just another
data point that suggests that there are too many of these.  I guess I
need to sit down and look into porting that to softirqs?

However, I don't really understand how you can discuss a wholesale
replacing of tasklets with workqueues, given the very different
execution semantics of the two mechanisms.  I would think that others
have used tasklets for similar purposes as I have and moving that to
workqueues just has to break a bunch of stuff.  I don't know the various
places tasklets are used as well as other people in this thread, but I
think deprecating them and moving code to either softirqs or workqueues
on a case by case basis is a better approach.  That way we also avoid
the gross wrappers.

Kristian



* Re: [RFC PATCH 0/6] Convert all tasklets to workqueues
  2007-06-25 18:48         ` Kristian Høgsberg
@ 2007-06-25 19:11           ` Steven Rostedt
  2007-06-25 20:07             ` Kristian Høgsberg
  2007-06-25 21:15           ` Ingo Molnar
  1 sibling, 1 reply; 127+ messages in thread
From: Steven Rostedt @ 2007-06-25 19:11 UTC (permalink / raw)
  To: Kristian Høgsberg
  Cc: Ingo Molnar, Linus Torvalds, LKML, Andrew Morton,
	Thomas Gleixner, Christoph Hellwig, john stultz, Oleg Nesterov,
	Paul E. McKenney, Dipankar Sarma, David S. Miller, kuznet

On Mon, 2007-06-25 at 14:48 -0400, Kristian Høgsberg wrote:

> OK, here's a yell.  I'm using tasklets in the new firewire stack for all

Thanks for speaking up!

> interrupt handling.  All my interrupt handler does is read out the event
> mask and schedule the appropriate tasklets.  Most of these tasklets
> typically just end up scheduling work or completing a completion, so
> moving it to a workqueue is pretty pointless.  In particular, the
> isochronous DMA events must be handled with as little latency as
> possible, so a workqueue in that code path would be pretty bad.
> 
> I'm not strongly attached to tasklets, and it sounds like I got it wrong
> and used the wrong delayed execution mechanism.  But that's just another
> data point that suggests that there are too many of these.  I guess I
> need to sit down and look into porting that to softirqs?
> 
> However, I don't really understand how you can discuss a wholesale
> replacing of tasklets with workqueues, given the very different
> execution semantics of the two mechanisms.  I would think that others
> have used tasklets for similar purposes as I have and moving that to
> workqueues just has to break a bunch of stuff.  I don't know the various
> places tasklets are used as well as other people in this thread, but I
> think deprecating them and moving code to either softirqs or workqueues
> on a case by case basis is a better approach.  That way we also avoid
> the gross wrappers.

The gross wrappers were a perfect way to shed light on something that is
overused, and should most likely be replaced.

Does your system need the functions that are in tasklets to be
non-reentrant?  I wonder how many "irq critical" functions used
tasklets just because adding a softirq requires too much (no generic
softirq code).  A tasklet is constrained to run on one CPU at a time,
and it is not guaranteed to run on the CPU it was scheduled on.
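
The non-reentrancy asked about above can be illustrated with a small
userspace model built on C11 atomics. This is a sketch, not kernel code:
every model_* name is invented here, and the two state bits only mimic,
as an assumption, the roles that TASKLET_STATE_SCHED and
TASKLET_STATE_RUN play for a real tasklet.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Userspace model only; every model_* name is hypothetical. */
enum {
        MODEL_SCHED = 1u << 0,  /* queued, waiting to be run */
        MODEL_RUN   = 1u << 1   /* currently executing on some CPU */
};

struct model_tasklet {
        atomic_uint state;
        void (*func)(unsigned long);
        unsigned long data;
};

/* Sample callback for demonstrations: adds data to a global counter. */
unsigned long model_hits;

void model_count(unsigned long data)
{
        model_hits += data;
}

/*
 * Mark the tasklet pending.  Returns true if this call queued it and
 * false if it was already pending, in which case the two requests
 * collapse into a single run.
 */
bool model_schedule(struct model_tasklet *t)
{
        unsigned int old = atomic_fetch_or(&t->state, MODEL_SCHED);

        return !(old & MODEL_SCHED);
}

/*
 * What one CPU does when it finds the tasklet on its list.  Returns true
 * if func ran here.  Returns false if another CPU is already running it
 * (the caller would leave it pending and retry later) or if it was not
 * pending at all.
 */
bool model_run(struct model_tasklet *t)
{
        unsigned int old = atomic_fetch_or(&t->state, MODEL_RUN);

        if (old & MODEL_RUN)
                return false;   /* RUN is held elsewhere: do not re-enter */

        old = atomic_fetch_and(&t->state, ~(unsigned int)MODEL_SCHED);
        if (old & MODEL_SCHED)
                t->func(t->data);

        atomic_fetch_and(&t->state, ~(unsigned int)MODEL_RUN);
        return (old & MODEL_SCHED) != 0;
}
```

In this model, scheduling twice before a run produces a single call to
func, and model_run() backs off while the RUN bit is held elsewhere. The
second property is the non-reentrancy in question.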

Perhaps it's time to add a new functionality while removing tasklets.
Things that are ok to bounce around CPUs (like tasklets do) can most
likely be replaced by a work queue. But these highly critical tasks
probably would benefit from being a softirq.

Maybe we should be looking at something like GENERIC_SOFTIRQ to run
functions that a driver could add. But they would run only on the CPU
that scheduled them, and do not guarantee non-reentrant as tasklets do
today.

I think I even found a bug in a driver that was trying to get around the
non-reentrancy of a tasklet (will post this soon).

It's looking like only a few tasklets have this critical requirement,
and are the candidates to move to a more generic softirq.  The rest of
the tasklets would be switched to work queues, and this gross wrapper of
mine can do that in the meantime so we can find those that should be
converted to a generic softirq.

-- Steve



* Re: [RFC PATCH 0/6] Convert all tasklets to workqueues
  2007-06-25 16:50           ` Tilman Schmidt
  2007-06-25 17:06             ` Steven Rostedt
@ 2007-06-25 19:52             ` Stephen Hemminger
  1 sibling, 0 replies; 127+ messages in thread
From: Stephen Hemminger @ 2007-06-25 19:52 UTC (permalink / raw)
  To: Ingo Molnar; +Cc: Tilman Schmidt, linux-kernel

On Mon, 25 Jun 2007 18:50:03 +0200
Tilman Schmidt <tilman@imap.cc> wrote:

> Ingo Molnar <mingo@elte.hu> wrote:
> > so how about the following, different approach: anyone who has a tasklet 
> > in any performance-sensitive codepath, please yell now.


Getting rid of tasklets may seem like a good idea. But doing it by changing
them all to workqueues would have bad consequences for networking.

The first issue is that it would change the semantic assumptions in the
places where tasklets are used. Many places "know" that a tasklet runs in soft
irq context so extra locking is not needed.

The performance overhead of changing to workqueues could also be disastrous for
some devices. There are 10G device drivers that use tasklets to handle transmit
completion.


Here is a more detailed list of how network devices are using tasklets:

Receive packet handling: ifb, ppp, ipw2200, ipw2100
Receive buffer refill: acenic, s2io
Receive & Transmit: sc92031, sundance
Transmit buffer allocation: smc91x
Phy handling: skge

Sorry, if you are going to get rid of tasklets, you need to fix all the
network drivers first.

-- 
Stephen Hemminger <shemminger@linux-foundation.org>

* Re: [RFC PATCH 0/6] Convert all tasklets to workqueues
  2007-06-25 19:11           ` Steven Rostedt
@ 2007-06-25 20:07             ` Kristian Høgsberg
  2007-06-25 20:31               ` Steven Rostedt
  0 siblings, 1 reply; 127+ messages in thread
From: Kristian Høgsberg @ 2007-06-25 20:07 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: Ingo Molnar, Linus Torvalds, LKML, Andrew Morton,
	Thomas Gleixner, Christoph Hellwig, john stultz, Oleg Nesterov,
	Paul E. McKenney, Dipankar Sarma, David S. Miller, kuznet

On Mon, 2007-06-25 at 15:11 -0400, Steven Rostedt wrote:
> On Mon, 2007-06-25 at 14:48 -0400, Kristian Høgsberg wrote:
> ...
> > However, I don't really understand how you can discuss a wholesale
> > replacing of tasklets with workqueues, given the very different
> > execution semantics of the two mechanisms.  I would think that others
> > have used tasklets for similar purposes as I have and moving that to
> > workqueues just has to break a bunch of stuff.  I don't know the various
> > places tasklets are used as well as other people in this thread, but I
> > think deprecating them and moving code to either softirqs or workqueues
> > on a case by case basis is a better approach.  That way we also avoid
> > the gross wrappers.
> 
> The gross wrappers were a perfect way to shed light on something that is
> overused, and should most likely be replaced.
> 
> Does your system need the functions that are in tasklets to be
> non-reentrant?  I wonder how many "irq critical" functions used
> tasklets just because adding a softirq requires too much (no generic
> softirq code).  A tasklet is constrained to run on one CPU at a time,
> and it is not guaranteed to run on the CPU it was scheduled on.

When I started the firewire work, I wasn't aware that tasklets were
going away, but I knew that doing too much work in the interrupt handler
was frowned upon, for good reasons.  So I was looking at softirqs vs
tasklets, and since using softirqs means you have to go add yourself to
the big bitmask, I opted for tasklets.  The comment in interrupt.h
directly recommends this.  As it stands, the firewire stack does
actually rely on the non-reentrancy of tasklets, but that's not a
deal-breaker, I can add the necessary locking.

> Perhaps it's time to add a new functionality while removing tasklets.
> Things that are ok to bounce around CPUs (like tasklets do) can most
> likely be replaced by a work queue. But these highly critical tasks
> probably would benefit from being a softirq.
> 
> Maybe we should be looking at something like GENERIC_SOFTIRQ to run
> functions that a driver could add. But they would run only on the CPU
> that scheduled them, and do not guarantee non-reentrant as tasklets do
> today.

Sounds like this will fill the gap.  Of course, this won't reduce the
number of delayed-execution mechanisms available...

Kristian


* Re: [RFC PATCH 0/6] Convert all tasklets to workqueues
  2007-06-25 20:07             ` Kristian Høgsberg
@ 2007-06-25 20:31               ` Steven Rostedt
  2007-06-25 21:08                 ` Kristian Høgsberg
  0 siblings, 1 reply; 127+ messages in thread
From: Steven Rostedt @ 2007-06-25 20:31 UTC (permalink / raw)
  To: Kristian Høgsberg
  Cc: Ingo Molnar, Linus Torvalds, LKML, Andrew Morton,
	Thomas Gleixner, Christoph Hellwig, john stultz, Oleg Nesterov,
	Paul E. McKenney, Dipankar Sarma, David S. Miller, kuznet

On Mon, 2007-06-25 at 16:07 -0400, Kristian Høgsberg wrote:

> > Maybe we should be looking at something like GENERIC_SOFTIRQ to run
> > functions that a driver could add. But they would run only on the CPU
> > that scheduled them, and do not guarantee non-reentrant as tasklets do
> > today.
> 
> Sounds like this will fill the gap.  Of course, this won't reduce the
> number of delayed-execution mechanisms available...

I disagree. Adding a generic softirq is not really adding another
delayed-execution, it's just extending the softirq.  It would not have
any different semantics than a normal softirq, except that it would be
dynamic for modules to use.  A tasklet has different concepts than
softirq. It adds non-reentrancy and that the tasklet function can run
where it wasn't scheduled.

Adding a generic softirq would keep the same concepts as a softirq but
just extend the users for it. Tasklets are probably not the best for
critical sections since it can be postponed longer if it was scheduled
on another CPU that is handling a bunch of other tasklets. So the
latency of running the tasklet is higher and you lose a cache advantage
by jumping to another CPU to execute.

Work queues already exist, so all we need to do to replace tasklets is
to extend softirqs for those critical cases that tasklets are used, and
replace the rest with work queues. Moving the non-critical tasklets
to work queues will even lower the latency to execution of the more
critical tasklets.

-- Steve



* Re: [RFC PATCH 0/6] Convert all tasklets to workqueues
  2007-06-25 17:06             ` Steven Rostedt
@ 2007-06-25 20:50               ` Tilman Schmidt
  2007-06-25 21:03                 ` Steven Rostedt
  0 siblings, 1 reply; 127+ messages in thread
From: Tilman Schmidt @ 2007-06-25 20:50 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: Ingo Molnar, LKML, Andrew Morton, Thomas Gleixner,
	Christoph Hellwig, john stultz, Oleg Nesterov, Paul E. McKenney,
	Dipankar Sarma, David S. Miller, Linus Torvalds

On 25.06.2007 19:06, Steven Rostedt wrote:
> On Mon, 2007-06-25 at 18:50 +0200, Tilman Schmidt wrote:
> 
>> The Siemens Gigaset ISDN base driver uses tasklets in its isochronous
>> data paths. [...]
>> Does that qualify as performance sensitive for the purpose of this
>> discussion?
> 
> Actually, no.  16ms, even 8ms is an incredible amount of time. Unless
> you have a thread that is running at a higher priority than the thread
> that handles the work queue performing the task, you would have no
> problems making that deadline.

Ok, I'm reassured. I'll look into converting these to a work queue
then, although I can't promise when I'll get around to it.

In fact, if these timing requirements are so easy to meet, perhaps
it doesn't even need its own work queue, and just making each
tasklet into a work item and queueing them to the global queue
with schedule_work() would do? Or am I getting too reckless now?

Thanks,
Tilman

-- 
Tilman Schmidt                          E-Mail: tilman@imap.cc
Bonn, Germany
This message consists of 100% recycled bits.
Unopened, best before: (see reverse side)


* Re: [RFC PATCH 0/6] Convert all tasklets to workqueues
  2007-06-25 20:50               ` Tilman Schmidt
@ 2007-06-25 21:03                 ` Steven Rostedt
  0 siblings, 0 replies; 127+ messages in thread
From: Steven Rostedt @ 2007-06-25 21:03 UTC (permalink / raw)
  To: Tilman Schmidt
  Cc: Ingo Molnar, LKML, Andrew Morton, Thomas Gleixner,
	Christoph Hellwig, john stultz, Oleg Nesterov, Paul E. McKenney,
	Dipankar Sarma, David S. Miller, Linus Torvalds

On Mon, 2007-06-25 at 22:50 +0200, Tilman Schmidt wrote:

> Ok, I'm reassured. I'll look into converting these to a work queue
> then, although I can't promise when I'll get around to it.
> 
> In fact, if these timing requirements are so easy to meet, perhaps
> it doesn't even need its own work queue, and just making each
> tasklet into a work item and queueing them to the global queue
> with schedule_work() would do? Or am I getting too reckless now?

I'm sure you probably wouldn't have a problem with using just
schedule_work. But that's shared and you don't know with what. A
function in the keventd work queue can call schedule (not recommended,
but with closed source drivers, you'd never know. schedule_work is
EXPORT_SYMBOL not EXPORT_SYMBOL_GPL).

So, if you convert it to work queues, I'd strongly recommend adding a
new instance.

-- Steve



* Re: [RFC PATCH 0/6] Convert all tasklets to workqueues
  2007-06-25 20:31               ` Steven Rostedt
@ 2007-06-25 21:08                 ` Kristian Høgsberg
  0 siblings, 0 replies; 127+ messages in thread
From: Kristian Høgsberg @ 2007-06-25 21:08 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: Ingo Molnar, Linus Torvalds, LKML, Andrew Morton,
	Thomas Gleixner, Christoph Hellwig, john stultz, Oleg Nesterov,
	Paul E. McKenney, Dipankar Sarma, David S. Miller, kuznet

On Mon, 2007-06-25 at 16:31 -0400, Steven Rostedt wrote:
> On Mon, 2007-06-25 at 16:07 -0400, Kristian Høgsberg wrote:
> 
> > > Maybe we should be looking at something like GENERIC_SOFTIRQ to run
> > > functions that a driver could add. But they would run only on the CPU
> > > that scheduled them, and do not guarantee non-reentrant as tasklets do
> > > today.
> > 
> > Sounds like this will fill the gap.  Of course, this won't reduce the
> > number of delayed-execution mechanisms available...
> 
> I disagree. Adding a generic softirq is not really adding another
> delayed-execution, it's just extending the softirq.  It would not have
> any different semantics than a normal softirq, except that it would be
> dynamic for modules to use.  A tasklet has different concepts than
> softirq. It adds non-reentrancy and that the tasklet function can run
> where it wasn't scheduled.

Hmm, yeah, true.  Ok, sounds useful, let me know when you have a patch.
I'll try porting the firewire stack and let you know how it works.

Just to make sure I got this right: softirqs will always be scheduled on
the irq handling CPU and if an interrupt schedules multiple softirqs
from one handler invocation, the softirqs will be executed in the order
they are scheduled?  And the reentrancy that softirqs do not protect
against (as opposed to tasklets) is the case where different CPUs handle
the interrupt and each schedule the same softirq for execution?

Kristian



* Re: [RFC PATCH 0/6] Convert all tasklets to workqueues
  2007-06-25 18:48         ` Kristian Høgsberg
  2007-06-25 19:11           ` Steven Rostedt
@ 2007-06-25 21:15           ` Ingo Molnar
  2007-06-25 23:36             ` Stefan Richter
  1 sibling, 1 reply; 127+ messages in thread
From: Ingo Molnar @ 2007-06-25 21:15 UTC (permalink / raw)
  To: Kristian Høgsberg
  Cc: Linus Torvalds, Steven Rostedt, LKML, Andrew Morton,
	Thomas Gleixner, Christoph Hellwig, john stultz, Oleg Nesterov,
	Paul E. McKenney, Dipankar Sarma, David S. Miller,
	matthew.wilcox, kuznet


* Kristian Høgsberg <krh@redhat.com> wrote:

> OK, here's a yell.  I'm using tasklets in the new firewire stack for 
> all interrupt handling.  All my interrupt handler does is read out the 
> event mask and schedule the appropriate tasklets.  Most of these 
> tasklets typically just end up scheduling work or completing a 
> completion, so moving it to a workqueue is pretty pointless.  In 
> particular, the isochronous DMA events must be handled with as little 
> latency as possible, so a workqueue in that code path would be pretty 
> bad.

regarding workqueues - would it be possible for you to test Steve's 
patch and get us performance numbers? Do you have any test with tons of 
tasklet activity that would definitely show the performance impact of 
workqueues? Workqueue priority can be set, and your handler should 
probably be SCHED_FIFO.

right now the tasklet-emulation workqueue is globally locked, etc., but 
if you use per-cpu workqueues then you'd probably get better scalability 
than tasklets. (yes, despite the extra scheduling (which only costs ~1 
microsecond) that the workqueue has to do.) Scheduling is pretty cheap, 
the basic overhead of servicing a single interrupt is often 10 times 
more expensive than a context-switch.
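
To put the two figures above side by side, here is a back-of-the-envelope
sketch in C. The helper name is invented, and the inputs are only the
rough estimates quoted above (about 1 microsecond per context switch,
roughly ten times that per interrupt), not measurements.

```c
/*
 * Extra cost, in percent of the interrupt's own cost, of adding one
 * context switch to the handling of each interrupt.
 * Both arguments are in nanoseconds; the name is illustrative only.
 */
unsigned int extra_cost_percent(unsigned int irq_cost_ns,
                                unsigned int switch_cost_ns)
{
        if (irq_cost_ns == 0)
                return 0;
        return (100u * switch_cost_ns) / irq_cost_ns;
}
```

With a 10 microsecond interrupt and a 1 microsecond switch,
extra_cost_percent(10000, 1000) gives 10, i.e. about a tenth on top of
the cost of each interrupt.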

> I'm not strongly attached to tasklets, and it sounds like I got it 
> wrong and used the wrong delayed execution mechanism.  But that's just 
> another data point that suggests that there are too many of these.  I 
> guess I need to sit down and look into porting that to softirqs?

i'd like to stress that your approach is completely fine and valid - and 
if what we propose impacts performance negatively without any acceptable 
(and robust) replacement solution offered by us then our patch wont be 
done - simple as that. Softirqs could be an additional (performance) 
advantage on SMP systems with multiple firewire interrupt sources, but 
it would have to be measured too.

	Ingo

* Re: [RFC PATCH 0/6] Convert all tasklets to workqueues
  2007-06-25 21:15           ` Ingo Molnar
@ 2007-06-25 23:36             ` Stefan Richter
  2007-06-26  0:46               ` Steven Rostedt
  0 siblings, 1 reply; 127+ messages in thread
From: Stefan Richter @ 2007-06-25 23:36 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Kristian Høgsberg, Linus Torvalds, Steven Rostedt, LKML,
	Andrew Morton, Thomas Gleixner, Christoph Hellwig, john stultz,
	Oleg Nesterov, Paul E. McKenney, Dipankar Sarma, David S. Miller,
	matthew.wilcox, kuznet

Ingo Molnar wrote:
> regarding workqueues - would it be possible for you to test Steve's 
> patch and get us performance numbers? Do you have any test with tons of 
> tasklet activity that would definitely show the performance impact of 
> workqueues?

I can't speak for Kristian, nor do I have test equipment for isochronous
applications, but I know that there are people out there who do data
acquisition on as many FireWire buses as they can stuff boards into
their boxes.  There are also FireWire cards with 2 or 4 controllers per
board; and each controller can receive or transmit on several channels.

Depending on the buffering scheme, there may be one (?) interrupt per
channel and isochronous cycle.  Or an interrupt when the buffer is
full.  Some application programmers use large buffers; others want small
buffers.  An isochronous cycle is 125us.
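
For a sense of scale, the worst case sketched above (one interrupt per
channel per isochronous cycle) works out as follows. The function is a
hypothetical helper, and the channel count passed to it is whatever a
given setup uses; nothing here is taken from the firewire code.

```c
/*
 * Interrupts per second if every channel raises one interrupt per
 * isochronous cycle.  cycle_us is the cycle length in microseconds.
 * Illustrative helper only.
 */
unsigned long long iso_irqs_per_second(unsigned int channels,
                                       unsigned int cycle_us)
{
        if (cycle_us == 0)
                return 0;
        return (unsigned long long)channels * 1000000ULL / cycle_us;
}
```

With the 125 us cycle, iso_irqs_per_second(1, 125) is 8000, and four
channels give 32000.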

Asynchronous I/O can even produce much higher interrupt rates.  I think
IP over 1394 might indeed cause interrupt rates that are moderately
higher than 1/125us during normal traffic.  SBP-2 ( = 1394 storage) is
not as much affected because the bulk of data is transferred without
interrupts.  So I suppose some eth1394 bandwidth tests with this patch
series might make sense... alas I'm short of spare time.  (Would be
interesting to see whether the old ohci1394 driver is blown to bits with
the patch series; it's an old driver with who-knows what assumptions in
there.)

> Workqueue priority can be set, and your handler should 
> probably be SCHED_FIFO.

Does this cooperate nicely with a SCHED_FIFO thread of a userspace data
acquisition program or audio server or the like?
-- 
Stefan Richter
-=====-=-=== -==- ==-=-
http://arcgraph.de/sr/


* Re: [RFC PATCH 0/6] Convert all tasklets to workqueues
  2007-06-24 15:16         ` Jonathan Corbet
  2007-06-24 15:52           ` Steven Rostedt
  2007-06-25 16:50           ` Tilman Schmidt
@ 2007-06-26  0:00           ` Jonathan Corbet
  2007-06-26  0:52             ` Steven Rostedt
  2 siblings, 1 reply; 127+ messages in thread
From: Jonathan Corbet @ 2007-06-26  0:00 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Steven Rostedt, LKML, Andrew Morton, Thomas Gleixner,
	Christoph Hellwig, john stultz, Oleg Nesterov, Paul E. McKenney,
	Dipankar Sarma, David S. Miller, matthew.wilcox, kuznet,
	Linus Torvalds

A couple of days ago I said:

> The cafe_ccic (OLPC) camera driver uses a tasklet to move frames out of
> the DMA buffers in the streaming I/O path....
> 
> Obviously some testing is called for here.  I will make an attempt to do
> that testing

I've done that testing - I have an OLPC B3 unit running V2 of the
tasklet->workqueue patch, and all seems well.  30 FPS to the display and
no dropped frames.  The tasklets/0 process is running 3-5% CPU, in case
that's interesting.  For whatever reason, I see about 3% *more* idle
time when running just mplayer than I did without the patch.

Consider my minor qualms withdrawn, there doesn't seem to be any trouble
in this area.

Thanks,

jon

Jonathan Corbet / LWN.net / corbet@lwn.net


* Re: [RFC PATCH 0/6] Convert all tasklets to workqueues
  2007-06-25 23:36             ` Stefan Richter
@ 2007-06-26  0:46               ` Steven Rostedt
  0 siblings, 0 replies; 127+ messages in thread
From: Steven Rostedt @ 2007-06-26  0:46 UTC (permalink / raw)
  To: Stefan Richter
  Cc: Ingo Molnar, Kristian Høgsberg, Linus Torvalds, LKML,
	Andrew Morton, Thomas Gleixner, Christoph Hellwig, john stultz,
	Oleg Nesterov, Paul E. McKenney, Dipankar Sarma, David S. Miller,
	matthew.wilcox, kuznet

On Tue, 2007-06-26 at 01:36 +0200, Stefan Richter wrote:

> I can't speak for Kristian, nor do I have test equipment for isochronous
> applications, but I know that there are people out there who do data
> acquisition on as many FireWire buses as they can stuff boards into
> their boxes.  There are also FireWire cards with 2 or 4 controllers per
> board; and each controller can receive or transmit on several channels.
> 
> Depending on the buffering scheme, there may be one (?) interrupt per
> channel and isochronous cycle.  Or an interrupt when the buffer is
> full.  Some application programmers use large buffers; others want small
> buffers.  An isochronous cycle is 125us.
> 
> Asynchronous I/O can even produce much higher interrupt rates.  I think
> IP over 1394 might indeed cause interrupt rates that are moderately
> higher than 1/125us during normal traffic.  SBP-2 ( = 1394 storage) is
> not as much affected because the bulk of data is transferred without
> interrupts.  So I suppose some eth1394 bandwidth tests with this patch
> series might make sense... alas I'm short of spare time.  (Would be
> interesting to see whether the old ohci1394 driver is blown to bits with
> the patch series; it's an old driver with who-knows what assumptions in
> there.)

Hi, any testing of the patches would be much appreciated. I don't have
access to any boxes that might have problems with running tasklets as
work queues. So if you know others with this equipment and can pass the
patches off to them, it will hopefully help us know if this patch helps,
hurts, or just doesn't make a difference.

> 
> > Workqueue priority can be set, and your handler should 
> > probably be SCHED_FIFO.
> 
> Does this cooperate nicely with a SCHED_FIFO thread of a userspace data
> acquisition program or audio server or the like?

Well, if you put the prio of the work queue higher, it won't be any
different than a tasklet. A tasklet runs at an even higher priority than
any thread on the system.

-- Steve




* Re: [RFC PATCH 0/6] Convert all tasklets to workqueues
  2007-06-26  0:00           ` Jonathan Corbet
@ 2007-06-26  0:52             ` Steven Rostedt
  0 siblings, 0 replies; 127+ messages in thread
From: Steven Rostedt @ 2007-06-26  0:52 UTC (permalink / raw)
  To: Jonathan Corbet
  Cc: Ingo Molnar, LKML, Andrew Morton, Thomas Gleixner,
	Christoph Hellwig, john stultz, Oleg Nesterov, Paul E. McKenney,
	Dipankar Sarma, David S. Miller, matthew.wilcox, kuznet,
	Linus Torvalds

On Mon, 2007-06-25 at 18:00 -0600, Jonathan Corbet wrote:
> A couple of days ago I said:
> 
> > The cafe_ccic (OLPC) camera driver uses a tasklet to move frames out of
> > the DMA buffers in the streaming I/O path....
> > 
> > Obviously some testing is called for here.  I will make an attempt to do
> > that testing
> 
> I've done that testing - I have an OLPC B3 unit running V2 of the
> tasklet->workqueue patch, and all seems well.  30 FPS to the display and
> no dropped frames.  The tasklets/0 process is running 3-5% CPU, in case
> that's interesting.  For whatever reason, I see about 3% *more* idle
> time when running just mplayer than I did without the patch.
> 
> Consider my minor qualms withdrawn, there doesn't seem to be any trouble
> in this area.

Jon, thanks a lot!

This is great news. I wonder if converting tasklets to work queues also
helps with other softirqs.  Before, softirqs could not preempt a
tasklet, since tasklets run as a softirq. With tasklets as work queues,
what's left as a softirq can now preempt tasklets. Perhaps this can even
help with performance.

-- Steve
  



* Re: [RFC PATCH 0/6] Convert all tasklets to workqueues
  2007-06-22 21:59       ` Ingo Molnar
                           ` (4 preceding siblings ...)
  2007-06-25 18:48         ` Kristian Høgsberg
@ 2007-06-26  1:46         ` Dan Williams
  2007-06-26  2:01           ` Steven Rostedt
  2007-06-28  5:48         ` Jeff Garzik
  6 siblings, 1 reply; 127+ messages in thread
From: Dan Williams @ 2007-06-26  1:46 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Linus Torvalds, Steven Rostedt, LKML, Andrew Morton,
	Thomas Gleixner, Christoph Hellwig, john stultz, Oleg Nesterov,
	Paul E. McKenney, Dipankar Sarma, David S. Miller,
	matthew.wilcox, kuznet

[-- Attachment #1: Type: text/plain, Size: 1081 bytes --]

> so how about the following, different approach: anyone who has a tasklet
> in any performance-sensitive codepath, please yell now. We'll also do a
> proactive search for such places. We can convert those places to
> softirqs, or move them back into hardirq context. Once this is done -
> and i doubt it will go beyond 1-2 places - we can just mass-convert the
> other 110 places to the lame but compatible solution of doing them in a
> global thread context.
>

I have a driver / testcase that reacts negatively to a workqueue
conversion.  This is with the iop-adma driver on an ARM based platform
re-syncing a degraded raid5 array.  The driver is currently in -mm and
it uses tasklets to run a short callback routine upon completion of an
offloaded memcpy or xor operation.  Quick tests show that
write-throughput does not go down too much, but resync speed, as
reported by /proc/mdstat, drops from ~50MB/s to ~30MB/s.

Context switches on this platform flush the L1 cache so bouncing
between a workqueue and the MD thread is painful.

The conversion patch is attached.

--
Dan

[-- Attachment #2: iop-adma-workq-conv.patch --]
[-- Type: text/x-patch, Size: 3739 bytes --]

diff --git a/drivers/dma/iop-adma.c b/drivers/dma/iop-adma.c
index 5d8a6cf..7e89003 100644
--- a/drivers/dma/iop-adma.c
+++ b/drivers/dma/iop-adma.c
@@ -41,6 +41,8 @@
 #define tx_to_iop_adma_slot(tx) \
 	container_of(tx, struct iop_adma_desc_slot, async_tx)
 
+static struct workqueue_struct *iop_adma_workqueue;
+
 /**
  * iop_adma_free_slots - flags descriptor slots for reuse
  * @slot: Slot to free
@@ -273,9 +275,11 @@ iop_adma_slot_cleanup(struct iop_adma_chan *iop_chan)
 	spin_unlock_bh(&iop_chan->lock);
 }
 
-static void iop_adma_tasklet(unsigned long data)
+static void iop_adma_work_routine(struct work_struct *work)
 {
-	struct iop_adma_chan *chan = (struct iop_adma_chan *) data;
+	struct iop_adma_chan *chan =
+		container_of(work, struct iop_adma_chan, work);
+
 	__iop_adma_slot_cleanup(chan);
 }
 
@@ -370,7 +374,7 @@ retry:
 		goto retry;
 
 	/* try to free some slots if the allocation fails */
-	tasklet_schedule(&iop_chan->irq_tasklet);
+	queue_work(iop_adma_workqueue, &iop_chan->work);
 
 	return NULL;
 }
@@ -704,7 +708,7 @@ iop_adma_prep_dma_zero_sum(struct dma_chan *chan, unsigned int src_cnt,
 static void iop_adma_dependency_added(struct dma_chan *chan)
 {
 	struct iop_adma_chan *iop_chan = to_iop_adma_chan(chan);
-	tasklet_schedule(&iop_chan->irq_tasklet);
+	queue_work(iop_adma_workqueue, &iop_chan->work);
 }
 
 static void iop_adma_free_chan_resources(struct dma_chan *chan)
@@ -785,7 +789,7 @@ static irqreturn_t iop_adma_eot_handler(int irq, void *data)
 
 	dev_dbg(chan->device->common.dev, "%s\n", __FUNCTION__);
 
-	tasklet_schedule(&chan->irq_tasklet);
+	queue_work(iop_adma_workqueue, &chan->work);
 
 	iop_adma_device_clear_eot_status(chan);
 
@@ -798,7 +802,7 @@ static irqreturn_t iop_adma_eoc_handler(int irq, void *data)
 
 	dev_dbg(chan->device->common.dev, "%s\n", __FUNCTION__);
 
-	tasklet_schedule(&chan->irq_tasklet);
+	queue_work(iop_adma_workqueue, &chan->work);
 
 	iop_adma_device_clear_eoc_status(chan);
 
@@ -1244,8 +1248,6 @@ static int __devinit iop_adma_probe(struct platform_device *pdev)
 		ret = -ENOMEM;
 		goto err_free_iop_chan;
 	}
-	tasklet_init(&iop_chan->irq_tasklet, iop_adma_tasklet, (unsigned long)
-		iop_chan);
 
 	/* clear errors before enabling interrupts */
 	iop_adma_device_clear_err_status(iop_chan);
@@ -1268,11 +1270,13 @@ static int __devinit iop_adma_probe(struct platform_device *pdev)
 
 	spin_lock_init(&iop_chan->lock);
 	init_timer(&iop_chan->cleanup_watchdog);
-	iop_chan->cleanup_watchdog.data = (unsigned long) iop_chan;
-	iop_chan->cleanup_watchdog.function = iop_adma_tasklet;
+	iop_chan->cleanup_watchdog.data = (unsigned long) &iop_chan->work;
+	iop_chan->cleanup_watchdog.function = iop_adma_work_routine;
 	INIT_LIST_HEAD(&iop_chan->chain);
 	INIT_LIST_HEAD(&iop_chan->all_slots);
 	INIT_RCU_HEAD(&iop_chan->common.rcu);
+	INIT_WORK(&iop_chan->work, iop_adma_work_routine);
+
 	iop_chan->common.device = dma_dev;
 	list_add_tail(&iop_chan->common.device_node, &dma_dev->channels);
 
@@ -1443,6 +1447,10 @@ static struct platform_driver iop_adma_driver = {
 
 static int __init iop_adma_init (void)
 {
+	iop_adma_workqueue = create_workqueue("iop-adma");
+	if (!iop_adma_workqueue)
+		return -ENODEV;
+
 	/* it's currently unsafe to unload this module */
 	/* if forced, worst case is that rmmod hangs */
 	__unsafe(THIS_MODULE);
diff --git a/include/asm-arm/hardware/iop_adma.h b/include/asm-arm/hardware/iop_adma.h
index 8eb5990..7d8742b 100644
--- a/include/asm-arm/hardware/iop_adma.h
+++ b/include/asm-arm/hardware/iop_adma.h
@@ -67,7 +67,7 @@ struct iop_adma_chan {
 	struct list_head all_slots;
 	struct timer_list cleanup_watchdog;
 	int slots_allocated;
-	struct tasklet_struct irq_tasklet;
+	struct work_struct work;
 };
 
 /**


* Re: [RFC PATCH 0/6] Convert all tasklets to workqueues
  2007-06-26  1:46         ` Dan Williams
@ 2007-06-26  2:01           ` Steven Rostedt
  2007-06-26  2:12             ` Dan Williams
  0 siblings, 1 reply; 127+ messages in thread
From: Steven Rostedt @ 2007-06-26  2:01 UTC (permalink / raw)
  To: Dan Williams
  Cc: Ingo Molnar, Linus Torvalds, LKML, Andrew Morton,
	Thomas Gleixner, Christoph Hellwig, john stultz, Oleg Nesterov,
	Paul E. McKenney, Dipankar Sarma, David S. Miller, kuznet

On Mon, 2007-06-25 at 18:46 -0700, Dan Williams wrote:
> 
> Context switches on this platform flush the L1 cache so bouncing
> between a workqueue and the MD thread is painful.

Why are context switches between two kernel threads flushing the L1
cache?  Is this a flaw in the ARM arch?  I would think the only thing
that needs to be done between a context switch of two kernel threads (or
even a user thread to a kernel thread) is update the general regs and
stack. The memory access (page_tables or whatever ARM uses) should stay
the same.

Perhaps something else is at fault here.

Thanks for testing!

-- Steve




* Re: [RFC PATCH 0/6] Convert all tasklets to workqueues
  2007-06-26  2:01           ` Steven Rostedt
@ 2007-06-26  2:12             ` Dan Williams
  2007-06-28 12:37               ` Steven Rostedt
  0 siblings, 1 reply; 127+ messages in thread
From: Dan Williams @ 2007-06-26  2:12 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: Ingo Molnar, Linus Torvalds, LKML, Andrew Morton,
	Thomas Gleixner, Christoph Hellwig, john stultz, Oleg Nesterov,
	Paul E. McKenney, Dipankar Sarma, David S. Miller, kuznet

On 6/25/07, Steven Rostedt <rostedt@goodmis.org> wrote:
> On Mon, 2007-06-25 at 18:46 -0700, Dan Williams wrote:
> >
> > Context switches on this platform flush the L1 cache so bouncing
> > between a workqueue and the MD thread is painful.
>
> Why are context switches between two kernel threads flushing the L1
> cache?  Is this a flaw in the ARM arch?  I would think the only thing
> that needs to be done between a context switch of two kernel threads (or
> even a user thread to a kernel thread) is update the general regs and
> stack. The memory access (page_tables or whatever ARM uses) should stay
> the same.
>
Yes you are right, ARM does not flush L1 when prev==next in switch_mm.

> Perhaps something else is at fault here.
>
I'll try and dig a bit deeper...


* Re: [RFC PATCH 0/6] Convert all tasklets to workqueues
  2007-06-22 21:59       ` Ingo Molnar
                           ` (5 preceding siblings ...)
  2007-06-26  1:46         ` Dan Williams
@ 2007-06-28  5:48         ` Jeff Garzik
  2007-06-28  9:23           ` Ingo Molnar
  6 siblings, 1 reply; 127+ messages in thread
From: Jeff Garzik @ 2007-06-28  5:48 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Linus Torvalds, Steven Rostedt, LKML, Andrew Morton,
	Thomas Gleixner, Christoph Hellwig, john stultz, Oleg Nesterov,
	Paul E. McKenney, Dipankar Sarma, David S. Miller,
	matthew.wilcox, kuznet

Ingo Molnar wrote:
> so how about the following, different approach: anyone who has a tasklet 
> in any performance-sensitive codepath, please yell now. We'll also do a 
> proactive search for such places. We can convert those places to 
> softirqs, or move them back into hardirq context. Once this is done - 
> and i doubt it will go beyond 1-2 places - we can just mass-convert the 
> other 110 places to the lame but compatible solution of doing them in a 
> global thread context.


Color me unconvinced.

Tasklets fill a niche not filled by either workqueues (slower, requiring 
context switches, and possibly much latency if all wq's processes are 
active) or softirqs (limited number of them, not flexible at all). 
Sure, tasklets kick over to ksoftirqd, but not immediately, and therein 
lies their value.

And moving code -back- into hardirq is just the wrong thing to do, usually.

This proposal is ENTIRELY derived from "not convenient to my project" 
logic AFAICS, rather than the more sound "not needed in the kernel."

	Jeff





* Re: [RFC PATCH 0/6] Convert all tasklets to workqueues
  2007-06-28  5:48         ` Jeff Garzik
@ 2007-06-28  9:23           ` Ingo Molnar
  2007-06-28 14:38             ` Alexey Kuznetsov
  2007-06-28 15:17             ` Jeff Garzik
  0 siblings, 2 replies; 127+ messages in thread
From: Ingo Molnar @ 2007-06-28  9:23 UTC (permalink / raw)
  To: Jeff Garzik
  Cc: Linus Torvalds, Steven Rostedt, LKML, Andrew Morton,
	Thomas Gleixner, Christoph Hellwig, john stultz, Oleg Nesterov,
	Paul E. McKenney, Dipankar Sarma, David S. Miller,
	matthew.wilcox, kuznet


* Jeff Garzik <jeff@garzik.org> wrote:

> Tasklets fill a niche not filled by either workqueues (slower, 
> requiring context switches, and possibly much latency if all wq's 
> processes are active) [...]

... workqueues are also possibly much more scalable (percpu workqueues 
are easy without changing anything in your code but the call where you 
create the workqueue).

the context-switch argument i'll believe if i see numbers. You'll 
probably need in excess of tens of thousands of irqs/sec to even be able 
to measure its overhead. (workqueues are driven by nice kernel threads 
so there's no TLB overhead, etc.)

the only remaining argument is latency: but workqueues are already 
pretty high-prio (with a default priority of nice -5) - and you can 
increase it even further. You can make it SCHED_FIFO prio 98 if latency 
is so important. Tasklets on the other hand are _unconditionally_ 
high-priority. So this argument is more of an arms-race argument: "i 
want _my_ processing to be done immediately!". The fact that workqueues 
can be preempted and that their priorities can be adjusted flexibly is 
an optional _bonus_, not a disadvantage. If low-prio workqueues hurt 
your workflow, make them high-prio.

> And moving code -back- into hardirq is just the wrong thing to do, 
> usually.

agreed - except if the in-tasklet processing is really thin and there's 
already a softirq layer in the workflow. (which was the case for the 
example that was cited.) In such a case moving either to the hardirq or 
to the softirq looks like the right thing - instead of the tasklet 
intermediary.

	Ingo


* Re: [RFC PATCH 0/6] Convert all tasklets to workqueues
  2007-06-26  2:12             ` Dan Williams
@ 2007-06-28 12:37               ` Steven Rostedt
  2007-06-28 16:37                 ` Oleg Nesterov
  2007-06-28 18:02                 ` Dan Williams
  0 siblings, 2 replies; 127+ messages in thread
From: Steven Rostedt @ 2007-06-28 12:37 UTC (permalink / raw)
  To: Dan Williams
  Cc: Ingo Molnar, Linus Torvalds, LKML, Andrew Morton,
	Thomas Gleixner, Christoph Hellwig, john stultz, Oleg Nesterov,
	Paul E. McKenney, Dipankar Sarma, David S. Miller, kuznet


Hi Dan,

On Mon, 25 Jun 2007, Dan Williams wrote:

> Yes you are right, ARM does not flush L1 when prev==next in switch_mm.
>
> > Perhaps something else is at fault here.
> >
> I'll try and dig a bit deeper...

BTW:

 static int __init iop_adma_init (void)
 {
+       iop_adma_workqueue = create_workqueue("iop-adma");
+       if (!iop_adma_workqueue)
+               return -ENODEV;
+

Could you also try upping the prio of all the "iop-adma" threads?

You should see thread names such as (on SMP) "iop-adma/0", "iop-adma/1"
... "iop-adma/N" where N = # of CPUs - 1.

do a "chrt -p -f 98 <pid>"  once for each of the thread's PIDs.  The
chrt can be found in the package "util-linux" on Red Hat / Fedora, and in
schedutils on Debian.

It just dawned on me that workqueues don't run at a high priority by
default.  So it's funny that I'm running all my current tasklets as low
priority work queues :-)

But that can certainly be a cause of high latency.  I need to update my
patches to make the workqueue thread a higher priority. All benchmarks on
this patch have been using a low priority work queue.

I also don't see any nice API to have the priority set for a workqueue
thread from within the kernel. Looks like one needs to be added,
otherwise, I need to have the wrapper dig into the workqueue structs to
find the thread that handles the workqueue.

Thanks,

-- Steve


* Re: [RFC PATCH 0/6] Convert all tasklets to workqueues
  2007-06-28  9:23           ` Ingo Molnar
@ 2007-06-28 14:38             ` Alexey Kuznetsov
  2007-06-28 15:23               ` Jeff Garzik
                                 ` (2 more replies)
  2007-06-28 15:17             ` Jeff Garzik
  1 sibling, 3 replies; 127+ messages in thread
From: Alexey Kuznetsov @ 2007-06-28 14:38 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Jeff Garzik, Linus Torvalds, Steven Rostedt, LKML, Andrew Morton,
	Thomas Gleixner, Christoph Hellwig, john stultz, Oleg Nesterov,
	Paul E. McKenney, Dipankar Sarma, David S. Miller,
	matthew.wilcox

Hello!

> the context-switch argument i'll believe if i see numbers. You'll 
> probably need in excess of tens of thousands of irqs/sec to even be able 
> to measure its overhead. (workqueues are driven by nice kernel threads 
> so there's no TLB overhead, etc.)

It was authors of the patch who were supposed to give some numbers,
at least one or two, just to prove the concept. :-)

According to my measurements (maybe, wrong) on 2.5GHz P4 tasklet
schedule and execution eats ~300ns, workqueue eats ~4usec.
On my 1.8GHz PM notebook (UP kernel), the numbers are 170ns and 1.2usec.

Formally looking awful, this result is positive: tasklets are almost
never used in hot paths. I am sure only about one such place: acenic
driver uses tasklet to refill rx queue. This generates not more than
3000 tasklet schedules per second. Even on P4, pure workqueue scheduling
will eat ~1% of bare cpu ticks.

Anyway, all the uses of tasklet should be verified:

The most dubious place is the popular Neterion 10Gbit driver, which uses
tasklet like acenic. But at 10Gbit, multiply acenic numbers and panic. :-)

Also, there exists some hardware which uses tasklets even harder,
but I have no idea what real frequencies are: f.e. sundance.

The case with acenic/s2io is quite special: normally network drivers
refill queues in irq handlers. It was Jes Sorensen's observation
that offloading refilling from irq improves performance, I do not
remember numbers. Probably, switching to workqueues will not affect
performance at all, probably it will just collapse, no idea.


> ... workqueues are also possibly much more scalable

I cannot figure out - scale in what direction? :-)
 

>						 (percpu workqueues 
> are easy without changing anything in your code but the call where you 
> create the workqueue).

I do not see how it is related to scalability. And the statement
does not even make sense. The patch already uses per-cpu workqueue
for tasklets, otherwise it would be a disaster: guaranteed cpu non-locality.

Tasklet is single thread by definition and purpose. Those few places
where people used tasklets to do per-cpu jobs (RCU f.e.) exist just because
they had troubles with allocating new softirq. Workqueues do not make
any difference: tasklet is not workqueue, it is work_struct, and you
still will have to allocate array of per-cpu work structs, everything
remains the same.


> the only remaining argument is latency:

You could set realtime priority by default, not a poor nice -5.
If some network adapters were killed just because I run some task
with nice --22, it would be just ridiculous.

Alexey


* Re: [RFC PATCH 0/6] Convert all tasklets to workqueues
  2007-06-28  9:23           ` Ingo Molnar
  2007-06-28 14:38             ` Alexey Kuznetsov
@ 2007-06-28 15:17             ` Jeff Garzik
  1 sibling, 0 replies; 127+ messages in thread
From: Jeff Garzik @ 2007-06-28 15:17 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Linus Torvalds, Steven Rostedt, LKML, Andrew Morton,
	Thomas Gleixner, Christoph Hellwig, john stultz, Oleg Nesterov,
	Paul E. McKenney, Dipankar Sarma, David S. Miller,
	matthew.wilcox, kuznet

Ingo Molnar wrote:
> * Jeff Garzik <jeff@garzik.org> wrote:
> 
>> Tasklets fill a niche not filled by either workqueues (slower, 
>> requiring context switches, and possibly much latency if all wq's 
>> processes are active) [...]
> 
> ... workqueues are also possibly much more scalable (percpu workqueues 
> are easy without changing anything in your code but the call where you 
> create the workqueue).

All that scalability is just overhead, and overkill, for what 
tasklets/softirqs are used for.


> the context-switch argument i'll believe if i see numbers. You'll 
> probably need in excess of tens of thousands of irqs/sec to even be able 
> to measure its overhead. (workqueues are driven by nice kernel threads 
> so there's no TLB overhead, etc.)

As Alexey said...  I would have thought YOU needed to provide numbers, 
rather than just handwaving as justification for tasklet removal.


> the only remaining argument is latency: but workqueues are already 
> pretty high-prio (with a default priority of nice -5) - and you can 
> increase it even further. You can make it SCHED_FIFO prio 98 if latency 
> is so important.

You skipped the very relevant latency killer:  N threads in wq, and you 
submit the (N+1)th task.

I just cannot see how that is acceptable replacement for a network 
driver that uses tasklets.  Who wants to wait that long for packet RX or TX?


> Tasklets on the other hand are _unconditionally_ 
> high-priority. So this argument is more of an arms-race argument: "i 
> want _my_ processing to be done immediately!". The fact that workqueues 
> can be preempted and that their priorities can be adjusted flexibly is 
> an optional _bonus_, not a disadvantage. If low-prio workqueues hurt 
> your workflow, make them high-prio.

How about letting us stick with a solution that is WORKING now?

Of course tasklets are unconditionally high priority.  So are hardirqs. 
  So are softirqs.  This is not a problem, this is an expected and 
assumed-upon feature of the system.


>> And moving code -back- into hardirq is just the wrong thing to do, 
>> usually.
> 
> agreed - except if the in-tasklet processing is really thin and there's 
> already a softirq layer in the workflow. (which was the case for the 
> example that was cited.) In such a case moving either to the hardirq or 
> to the softirq looks like the right thing - instead of the tasklet 
> intermediary.

Wrong, for all the examples I care about -- drivers.  Network drivers in 
particular.  Just look at the comment in include/linux/interrupt.h if it 
wasn't clear:

/* PLEASE, avoid to allocate new softirqs, if you need not _really_ high
    frequency threaded job scheduling. For almost all the purposes
    tasklets are more than enough. F.e. all serial device BHs et
    al. should be converted to tasklets, not to softirqs.
  */

There is a good reason for this advice, as hinted at by the code 
immediately following the comment:

	enum
	{
	        HI_SOFTIRQ=0,
	        TIMER_SOFTIRQ,
	        NET_TX_SOFTIRQ,
	        NET_RX_SOFTIRQ,
	        BLOCK_SOFTIRQ,
	        TASKLET_SOFTIRQ,
	        SCHED_SOFTIRQ,
	#ifdef CONFIG_HIGH_RES_TIMERS
	        HRTIMER_SOFTIRQ,
	#endif
	};

softirqs cannot really be used by drivers, because they are not modular. 
  They are a scarce resource in any case.

Guess what?  All this is why we have tasklets.

tasklet != workqueue

	Jeff




* Re: [RFC PATCH 0/6] Convert all tasklets to workqueues
  2007-06-28 14:38             ` Alexey Kuznetsov
@ 2007-06-28 15:23               ` Jeff Garzik
  2007-06-28 15:54               ` Steven Rostedt
  2007-06-28 16:00               ` Ingo Molnar
  2 siblings, 0 replies; 127+ messages in thread
From: Jeff Garzik @ 2007-06-28 15:23 UTC (permalink / raw)
  To: Alexey Kuznetsov
  Cc: Ingo Molnar, Linus Torvalds, Steven Rostedt, LKML, Andrew Morton,
	Thomas Gleixner, Christoph Hellwig, john stultz, Oleg Nesterov,
	Paul E. McKenney, Dipankar Sarma, David S. Miller,
	matthew.wilcox

Alexey Kuznetsov wrote:
> Hello!
> 
>> the context-switch argument i'll believe if i see numbers. You'll 
>> probably need in excess of tens of thousands of irqs/sec to even be able 
>> to measure its overhead. (workqueues are driven by nice kernel threads 
>> so there's no TLB overhead, etc.)
> 
> It was authors of the patch who were supposed to give some numbers,
> at least one or two, just to prove the concept. :-)
> 
> According to my measurements (maybe, wrong) on 2.5GHz P4 tasklet
> schedule and execution eats ~300ns, workqueue eats ~4usec.
> On my 1.8GHz PM notebook (UP kernel), the numbers are 170ns and 1.2usec.

Thanks :)


> Anyway, all the uses of tasklet should be verified:
> 
> The most dubious place is the popular Neterion 10Gbit driver, which uses
> tasklet like acenic. But at 10Gbit, multiply acenic numbers and panic. :-)
> 
> Also, there exists some hardware which uses tasklets even harder,
> but I have no idea what real frequencies are: f.e. sundance.
> 
> The case with acenic/s2io is quite special: normally network drivers
> refill queues in irq handlers. It was Jes Sorensen's observation
> that offloading refilling from irq improves performance, I do not
> remember numbers. Probably, switching to workqueues will not affect
> performance at all, probably it will just collapse, no idea.

CPUs have gotten so fast now that it's quite possible to run the tasklet 
in parallel with the next invocation of the interrupt handler.

But given the amount of tasklet use in network drivers, I do not think 
tasklets can just be magically equated to workqueues, without 
case-by-case analysis.


> Tasklet is single thread by definition and purpose. Those few places

Indeed!


>> the only remaining argument is latency:
> 
> You could set realtime priority by default, not a poor nice -5.
> If some network adapters were killed just because I run some task
> with nice --22, it would be just ridiculous.

Indeed.

	Jeff




* Re: [RFC PATCH 0/6] Convert all tasklets to workqueues
  2007-06-28 14:38             ` Alexey Kuznetsov
  2007-06-28 15:23               ` Jeff Garzik
@ 2007-06-28 15:54               ` Steven Rostedt
  2007-06-28 16:00               ` Ingo Molnar
  2 siblings, 0 replies; 127+ messages in thread
From: Steven Rostedt @ 2007-06-28 15:54 UTC (permalink / raw)
  To: Alexey Kuznetsov
  Cc: Ingo Molnar, Jeff Garzik, Linus Torvalds, LKML, Andrew Morton,
	Thomas Gleixner, Christoph Hellwig, john stultz, Oleg Nesterov,
	Paul E. McKenney, Dipankar Sarma, David S. Miller,
	matthew.wilcox


On Thu, 28 Jun 2007, Alexey Kuznetsov wrote:
> > the context-switch argument i'll believe if i see numbers. You'll
> > probably need in excess of tens of thousands of irqs/sec to even be able
> > to measure its overhead. (workqueues are driven by nice kernel threads
> > so there's no TLB overhead, etc.)
>
> It was authors of the patch who were supposed to give some numbers,
> at least one or two, just to prove the concept. :-)

The problem is that we don't have the hardware that uses tasklets in
critical ways. My original patch series had a debug print in every
function (tasklet_schedule and friends).   I got a few scattered prints on
all my boxes but no flooding of prints. So I can't show that this will
hurt, because on my boxes it does not.

>
> You could set realtime priority by default, not a poor nice -5.
> If some network adapters were killed just because I run some task
> with nice --22, it would be just ridiculous.

This is my fault in the patch series. I completely forgot to up the prio.
My next series will include a change where the tasklet work queue will run
at something like prio FIFO 98 (or maybe 99?)

This is a bit embarrassing that I forgot to do this, since I'm a
real-time developer ;-)

-- Steve



* Re: [RFC PATCH 0/6] Convert all tasklets to workqueues
  2007-06-28 14:38             ` Alexey Kuznetsov
  2007-06-28 15:23               ` Jeff Garzik
  2007-06-28 15:54               ` Steven Rostedt
@ 2007-06-28 16:00               ` Ingo Molnar
  2007-06-28 17:26                 ` Jeff Garzik
                                   ` (3 more replies)
  2 siblings, 4 replies; 127+ messages in thread
From: Ingo Molnar @ 2007-06-28 16:00 UTC (permalink / raw)
  To: Alexey Kuznetsov
  Cc: Jeff Garzik, Linus Torvalds, Steven Rostedt, LKML, Andrew Morton,
	Thomas Gleixner, Christoph Hellwig, john stultz, Oleg Nesterov,
	Paul E. McKenney, Dipankar Sarma, David S. Miller,
	matthew.wilcox


* Alexey Kuznetsov <kuznet@ms2.inr.ac.ru> wrote:

> > the context-switch argument i'll believe if i see numbers. You'll 
> > probably need in excess of tens of thousands of irqs/sec to even be 
> > able to measure its overhead. (workqueues are driven by nice kernel 
> > threads so there's no TLB overhead, etc.)
> 
> It was authors of the patch who were supposed to give some numbers, at 
> least one or two, just to prove the concept. :-)

sure enough! But it was not me who claimed that 'workqueues are slow'.

firstly, i'm not here at all to tell people what tools to use. I'm not 
trying to 'force' people away from a perfectly logical technological 
choice. I am just wondering out loud whether this particular tool, in 
its current usage pattern, makes much technological sense. My claim is: 
it could very well be that it doesnt make _much_ sense, and in that case 
we should provide a non-intrusive migration path away in terms of a 
compatible API wrapper to a saner (albeit by virtue of trying to emulate 
an existing API, slower) mechanism. The examples cited so far had the 
tasklet as an intermediary towards a softirq - what's the technological 
point in such a splitup?

> According to my measurements (maybe, wrong) on 2.5GHz P4 tasklet 
> schedule and execution eats ~300ns, workqueue eats ~4usec. On my 
> 1.8GHz PM notebook (UP kernel), the numbers are 170ns and 1.2usec.

I find the 4usecs cost on a P4 interesting and a bit too high - how did 
you measure it? (any test-patch for it i could try?) But i think even 
your current numbers partly prove my point: with 1.2 usecs and 10,000 
irqs/sec the cost is 1.2 msecs/sec, or 0.1%. And 10K irqs/sec themselves 
will eat up much more CPU time than that already.

> Formally looking awful, this result is positive: tasklets are almost 
> never used in hot paths. I am sure only about one such place: acenic 
> driver uses tasklet to refill rx queue. This generates not more than 
> 3000 tasklet schedules per second. Even on P4 a pure workqueue 
> schedule will eat ~1% of bare cpu ticks.

... and the irq cost itself will eat 5-10% of bare CPU ticks already.

> > ... workqueues are also possibly much more scalable
> 
> I cannot figure out - scale in what direction? :-)

workqueues can be per-cpu - for tasklets to be per-cpu you have to 
open-code them into per-cpu like rcu-tasklets did (which in essence 
turns them into more expensive softirqs).

> >						 (percpu workqueues
> > are easy without changing anything in your code but the call where 
> > you create the workqueue).
> 
> I do not see how it is related to scalability. And the statement does 
> not even make sense. The patch already uses per-cpu workqueue for 
> tasklets, otherwise it would be a disaster: guaranteed cpu 
> non-locality.

my argument was: workqueues are more scalable than tasklets in general.

Just look at the tasklet_disable() logic. We basically have a per-cpu 
list of tasklets that we poll in tasklet_action:

 static void tasklet_action(struct softirq_action *a)
 {
        [...]
        while (list) {
                struct tasklet_struct *t = list;

                list = list->next;

                if (tasklet_trylock(t)) {

and if the trylock fails, we just continue to meet this activated 
tasklet again and again, in this nice linear list.
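
The repolling described above can be sketched as a small userspace model.
This is only an illustration under stated assumptions, not the kernel's
tasklet_action(): struct toy_tasklet, toy_action() and toy_run_passes()
are hypothetical names, and the "disabled" flag merely stands in for a
tasklet that cannot run on this pass. It shows that a runnable entry is
visited once, while a disabled one is visited again on every pass.

```c
#include <assert.h>
#include <stddef.h>

/* Toy userspace model; none of these names exist in the kernel. */
struct toy_tasklet {
	struct toy_tasklet *next;
	int disabled;	/* stands in for a nonzero tasklet_disable() count */
	int ran;	/* times the handler was executed */
	int polled;	/* times a pass had to look at this entry */
};

/* One pass over the pending list; returns the entries that must be retried. */
static struct toy_tasklet *toy_action(struct toy_tasklet *list)
{
	struct toy_tasklet *retry = NULL;

	while (list) {
		struct toy_tasklet *t = list;

		list = list->next;
		t->polled++;
		if (!t->disabled) {
			t->ran++;		/* runnable: execute and drop it */
			t->next = NULL;
		} else {
			t->next = retry;	/* not runnable: requeue for next pass */
			retry = t;
		}
	}
	return retry;
}

/* Repeat the pass; each repetition models the softirq being raised again. */
static struct toy_tasklet *toy_run_passes(struct toy_tasklet *list, int passes)
{
	assert(passes >= 0);
	while (passes-- > 0)
		list = toy_action(list);
	return list;
}
```

In this model the polling cost grows with the number of passes times the
number of entries that stay disabled, which is the linear behaviour being
criticised.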

this happens to work in practice because 1) tasklets are used quite 
rarely! 2) tasklet_disable() is done relatively rarely and nobody truly 
runs tons of the same devices (which depend on a tasklet) on the same 
box, but still it's quite an unhealthy approach. Every time i look at 
the tasklet code it hurts - having fundamental stuff like that in the 
heart of Linux ;-)

also, the "be afraid of the hardirq or the process context" mantra is 
overblown as well. If something is too heavy for a hardirq, _it's too 
heavy for a tasklet too_. Most hardirqs are (or should be) running with 
interrupts enabled, which makes their difference to softirqs minuscule. 

The most scalable workloads dont involve any (or many) softirq middlemen 
at all: you queue work straight from the hardirq context to the target 
process context. And that's what you want to do _anyway_, because you 
want to create as little locally cached data for the hardirq context, as 
the target task could easily be on another CPU. (this is generally true 
for things like block IO, but it's also true for things like network 
IO.)

the most scalable solution would be _for the network adapter to figure 
out the target CPU for the packet_. Not many (if any) such adapters 
exist at the moment. (as it would involve allocating NR_CPUs irqs to 
that adapter alone.)

> Tasklet is single thread by definition and purpose. Those a few places 
> where people used tasklets to do per-cpu jobs (RCU f.e.) exist just 
> because they had troubles with allocating new softirq. [...]

no. The following tale is the true and only history of the RCU tasklet 
;-) The RCU guys first used a tasklet, then noticed its bad scalability 
(a particular VFS-intense benchmark regressed because only a single CPU 
would do RCU completion on an 8-way box) so they switched it to a 
per-cpu tasklet - without realizing that a per-cpu tasklet is in essence 
a softirq. I pointed it out to them (years down the road ...) then the 
"convert rcu-tasklet to softirq" patch was born.

> > the only remaining argument is latency:
> 
> You could set realtime priority by default, not a poor nice -5. If 
> some network adapters were killed just because I run some task with 
> nice --22, it would be just ridiculous.

there are only 20 negative nice levels ;-) And i dont really get the 
'you might kill the network adapter' argument, because the opposite is 
true just as much: tasklets from a totally uninteresting network adapter 
can kill your latency-sensitive application too.

So providing more flexibility in the prioritization of the work that 
goes on in the system (as long as it has no other drawbacks) can not be 
wrong. The "but you will shoot yourself in the foot" argument is really 
backwards in that context.

Tasklets are called 'task'-lets for a reason: they are poorly scheduled, 
inflexible tasks. They were written in an age when we didnt have 
workqueues, we didnt have kthreads and real men thought they wanted to 
do all their TCP/IP processing in softirq context [ am i heading down 
the road towards a showdown with DaveM here? ;-) ].

Now ... you (and Jeff, and others) are right and workqueues could be too 
slow for some of the cases (i said before that i'd be surprised if it 
were more than 1-2), in which case my argument changes to what i 
outlined above: if you want good scalability, dont use middlemen :-) 
Figure out the target task as early as possible and let it do as much of 
the remaining work as possible. _Increasing_ the amount of cached 
context (by doing delayed processing in tasklets or even softirqs on the 
same CPU where the hardirq arrived) only increases the cross-CPU cost. 
Keeping stuff in a softirq only makes (some) sense as long as you have 
no target task at all (routing, filtering, etc.).

	Ingo


* Re: [RFC PATCH 0/6] Convert all tasklets to workqueues
  2007-06-28 12:37               ` Steven Rostedt
@ 2007-06-28 16:37                 ` Oleg Nesterov
  2007-06-28 18:02                 ` Dan Williams
  1 sibling, 0 replies; 127+ messages in thread
From: Oleg Nesterov @ 2007-06-28 16:37 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: Dan Williams, Ingo Molnar, Linus Torvalds, LKML, Andrew Morton,
	Thomas Gleixner, Christoph Hellwig, john stultz,
	Paul E. McKenney, Dipankar Sarma, David S. Miller, kuznet

On 06/28, Steven Rostedt wrote:
> 
> I also don't see any nice API to have the priority set for a workqueue
> thread from within the kernel. Looks like one needs to be added,
> otherwise, I need to have the wrapper dig into the workqueue structs to
> find the thread that handles the workqueue.

It is not so trivial to implement properly. Note that CPU_UP creates a new
cwq->thread, so somehow workqueue should "remember" its priority. This means
we should record it in workqueue_struct. The most simple way is to add yet
another parameter to __create_workqueue(), but this is nasty.

So, perhaps we should add "long nice" to "struct workqueue_struct", and then

	void set_workqueue_nice(struct workqueue_struct *wq, long nice)
	{
		const cpumask_t *cpu_map = wq_cpu_map(wq);
		struct cpu_workqueue_struct *cwq;
		int cpu;

		wq->nice = nice;

		mutex_lock(&workqueue_mutex);

		for_each_cpu_mask(cpu, *cpu_map) {
			cwq = per_cpu_ptr(wq->cpu_wq, cpu);
			if (cwq->thread)
				set_user_nice(cwq->thread, nice);
		}

		mutex_unlock(&workqueue_mutex);
	}

We could use for_each_online_cpu() instead, but then we should check
is_single_threaded().

Oleg.



* Re: [RFC PATCH 0/6] Convert all tasklets to workqueues
  2007-06-28 16:00               ` Ingo Molnar
@ 2007-06-28 17:26                 ` Jeff Garzik
  2007-06-28 17:44                 ` Jeff Garzik
                                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 127+ messages in thread
From: Jeff Garzik @ 2007-06-28 17:26 UTC (permalink / raw)
  To: Ingo Molnar, Steven Rostedt
  Cc: Alexey Kuznetsov, Linus Torvalds, LKML, Andrew Morton,
	Thomas Gleixner, Christoph Hellwig, john stultz, Oleg Nesterov,
	Paul E. McKenney, Dipankar Sarma, David S. Miller,
	matthew.wilcox

Ingo Molnar wrote:
> my argument was: workqueues are more scalable than tasklets in general.

Here is my argument:  that is totally irrelevant to $subject, when it 
comes to dealing with managing existing [network driver] behavior and 
performance.

My overall objection is the attempt to replace apples with oranges.

Network drivers use tasklets TODAY.  Each driver -- in particular 
acenic, ns83820, and the 10Gbps drivers -- has been carefully tuned to 
use tasklets, hardirqs, and perhaps NAPI too.  Changing to workqueue 
WILL affect network driver hot paths, yet I see no analysis or 
measurement at all of the behavior differences.

If hackers are willing to revisit each network driver, rework the 
tasklet code to something more sane [in your opinion], and TEST it, I 
will review the patches and happily ACK away.

Given that I feel that course of action is unlikely (the preferred 
alternative apparently being "I don't use these drivers, but surely my 
changes are OK anyway"), I do not see how this effort can proceed as is.

Lots of time went into tuning these network drivers for the specific 
thread model they use.  Maybe that thread model is no longer in style. 
Maybe modern machine behavior dictates a different approach.  The point 
is... you don't know.

	Jeff




* Re: [RFC PATCH 0/6] Convert all tasklets to workqueues
  2007-06-28 16:00               ` Ingo Molnar
  2007-06-28 17:26                 ` Jeff Garzik
@ 2007-06-28 17:44                 ` Jeff Garzik
  2007-06-28 18:19                 ` Andrew Morton
  2007-06-29 11:34                 ` Alexey Kuznetsov
  3 siblings, 0 replies; 127+ messages in thread
From: Jeff Garzik @ 2007-06-28 17:44 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Alexey Kuznetsov, Linus Torvalds, Steven Rostedt, LKML,
	Andrew Morton, Thomas Gleixner, Christoph Hellwig, john stultz,
	Oleg Nesterov, Paul E. McKenney, Dipankar Sarma, David S. Miller

Ingo Molnar wrote:
> But it was not me who claimed that 'workqueues are slow'.

The claim was:  slower than tasklets.


> choice. I am just wondering out loud whether this particular tool, in 
> its current usage pattern, makes much technological sense. My claim is: 
> it could very well be that it doesnt make _much_ sense, and in that case 
> we should provide a non-intrusive migration path away in terms of a 
> compatible API wrapper to a saner (albeit by virtue of trying to emulate 
> an existing API, slower) mechanism. The examples cited so far had the 
> tasklet as an intermediary towards a softirq - what's the technological 
> point in such a splitup?

I already answered that in detail.  In sum, a driver cannot define its 
own softirq.  Softirqs are not modular.

Tasklets are the closest thing to softirqs for a driver.


> The most scalable workloads dont involve any (or many) softirq middlemen 
> at all: you queue work straight from the hardirq context to the target 
> process context. And that's what you want to do _anyway_, because you 
> want to create as little locally cached data for the hardirq context, as 
> the target task could easily be on another CPU. (this is generally true 
> for things like block IO, but it's also true for things like network 
> IO.)
> 
> the most scalable solution would be _for the network adapter to figure 
> out the target CPU for the packet_.

I agree completely.  Wanna implement this?  I will kiss your feet, and 
multi-core CPU vendors will worship you as a demi-god.

Until such time, we must deal with the network stack as it exists today, 
and the network drivers as they exist and work today.


> Not many (if any) such adapters 
> exist at the moment. (as it would involve allocating NR_CPUs irqs to 
> that adapter alone.)

Good news:  this is becoming the norm for modern NICs, especially 10Gbps.

Plenty of NICs already exist that support multiple RX rings (presumably 
one per CPU), and newer NICs will raise individual MSI[-X] interrupts 
based on the RX ring into which a packet was received.

In this area, NIC vendors are way ahead of the Linux net stack.

The Linux net stack is unfortunately not threaded enough to sanely deal 
with dividing /flows/ up across multiple CPUs, even if the NIC does 
support multiple transmit and receive queues.   [side note: initial 
multi-queue TX is being worked on, on netdev]


>> Tasklet is single thread by definition and purpose. Those a few places 
>> where people used tasklets to do per-cpu jobs (RCU f.e.) exist just 
>> because they had troubles with allocating new softirq. [...]
> 
> no. The following tale is the true and only history of the RCU tasklet 
> ;-) The RCU guys first used a tasklet, then noticed its bad scalability 
> (a particular VFS-intense benchmark regressed because only a single CPU 
> would do RCU completion on an 8-way box) so they switched it to a 
> per-cpu tasklet - without realizing that a per-cpu tasklet is in essence 
> a softirq. I pointed it out to them (years down the road ...) then the 
> "convert rcu-tasklet to softirq" patch was born.

You focused on the example rather than the key phrase:  tasklet is 
single thread by definition and purpose.

Wanting to change that without analysis of the impact illustrates the 
apples-to-oranges change being proposed.


> outlined above: if you want good scalability, dont use middlemen :-) 
> Figure out the target task as early as possible and let it do as much of 
> the remaining work as possible. _Increasing_ the amount of cached 
> context (by doing delayed processing in tasklets or even softirqs on the 
> same CPU where the hardirq arrived) only increases the cross-CPU cost. 
> Keeping stuff in a softirq only makes (some) sense as long as you have 
> no target task at all (routing, filtering, etc.).

I do not disagree with these theoretical musings :)

I care the most about the "who will do all this work?" question.  In 
network driver land, these changes impact hot paths.  I am lazy, and 
don't care to revisit each network driver hot path and carefully re-tune 
each based on this proposal.  Who is volunteering?

	Jeff




* Re: [RFC PATCH 0/6] Convert all tasklets to workqueues
  2007-06-28 12:37               ` Steven Rostedt
  2007-06-28 16:37                 ` Oleg Nesterov
@ 2007-06-28 18:02                 ` Dan Williams
  2007-06-28 20:46                   ` Steven Rostedt
  1 sibling, 1 reply; 127+ messages in thread
From: Dan Williams @ 2007-06-28 18:02 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: Ingo Molnar, Linus Torvalds, LKML, Andrew Morton,
	Thomas Gleixner, Christoph Hellwig, john stultz, Oleg Nesterov,
	Paul E. McKenney, Dipankar Sarma, David S. Miller, kuznet,
	Jeff Garzik

On 6/28/07, Steven Rostedt <rostedt@goodmis.org> wrote:
>
> Hi Dan,
>
> On Mon, 25 Jun 2007, Dan Williams wrote:
>
> > Yes you are right, ARM does not flush L1 when prev==next in switch_mm.
> >
> > > Perhaps something else is at fault here.
> > >
> > I'll try and dig a bit deeper...
>
> BTW:
>
>  static int __init iop_adma_init (void)
>  {
> +       iop_adma_workqueue = create_workqueue("iop-adma");
> +       if (!iop_adma_workqueue)
> +               return -ENODEV;
> +
>
> Could you also try upping the prio of all the "iop-adma" threads?
>
Unfortunately setting the thread to real time priority makes
throughput slightly worse.  Instead of floating around 35MB/s the
resync speed is stuck around 30MB/s:

[ iop-adma: hi-prio workqueue based callbacks ]
iq81340mc:~# cat /proc/mdstat
Personalities : [raid0] [raid1] [raid10] [raid6] [raid5] [raid4]
md0 : active raid5 sdd[4] sdc[2] sdb[1] sda[0]
      468872448 blocks level 5, 64k chunk, algorithm 2 [4/3] [UUU_]
      [>....................]  recovery =  3.9% (6107424/156290816) finish=84.9min speed=29448K/sec

The iop-adma tasklet cleans up any completed descriptors and in the
process calls any attached callbacks.  For the raid5 resync case the
callback is simply:

static void ops_complete_check(void *stripe_head_ref)
{
        struct stripe_head *sh = stripe_head_ref;
        int pd_idx = sh->pd_idx;

        pr_debug("%s: stripe %llu\n", __FUNCTION__,
                (unsigned long long)sh->sector);

        if (test_and_clear_bit(STRIPE_OP_MOD_DMA_CHECK, &sh->ops.pending) &&
                sh->ops.zero_sum_result == 0)
                set_bit(R5_UPTODATE, &sh->dev[pd_idx].flags);

        set_bit(STRIPE_OP_CHECK, &sh->ops.complete);
        set_bit(STRIPE_HANDLE, &sh->state);
        release_stripe(sh);
}


[ iop-adma: tasklet based callbacks ]
iq81340mc:~# cat /proc/mdstat
Personalities : [raid0] [raid1] [raid10] [raid6] [raid5] [raid4]
md0 : active raid5 sdd[4] sdc[2] sdb[1] sda[0]
      468872448 blocks level 5, 64k chunk, algorithm 2 [4/3] [UUU_]
      [=>...................]  recovery =  5.1% (8024248/156290816) finish=47.9min speed=51486K/sec

--
Dan


* Re: [RFC PATCH 0/6] Convert all tasklets to workqueues
  2007-06-28 16:00               ` Ingo Molnar
  2007-06-28 17:26                 ` Jeff Garzik
  2007-06-28 17:44                 ` Jeff Garzik
@ 2007-06-28 18:19                 ` Andrew Morton
  2007-06-28 20:07                   ` Ingo Molnar
  2007-06-29 11:34                 ` Alexey Kuznetsov
  3 siblings, 1 reply; 127+ messages in thread
From: Andrew Morton @ 2007-06-28 18:19 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Alexey Kuznetsov, Jeff Garzik, Linus Torvalds, Steven Rostedt,
	LKML, Thomas Gleixner, Christoph Hellwig, john stultz,
	Oleg Nesterov, Paul E. McKenney, Dipankar Sarma, David S. Miller,
	matthew.wilcox

On Thu, 28 Jun 2007 18:00:01 +0200 Ingo Molnar <mingo@elte.hu> wrote:

>  with 1.2 usecs and 10,000 
> irqs/sec the cost is 1.2 msecs/sec, or 0.1%.

off-by-10 error.


* Re: [RFC PATCH 0/6] Convert all tasklets to workqueues
  2007-06-28 18:19                 ` Andrew Morton
@ 2007-06-28 20:07                   ` Ingo Molnar
  0 siblings, 0 replies; 127+ messages in thread
From: Ingo Molnar @ 2007-06-28 20:07 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Alexey Kuznetsov, Jeff Garzik, Linus Torvalds, Steven Rostedt,
	LKML, Thomas Gleixner, Christoph Hellwig, john stultz,
	Oleg Nesterov, Paul E. McKenney, Dipankar Sarma, David S. Miller,
	matthew.wilcox


* Andrew Morton <akpm@linux-foundation.org> wrote:

> On Thu, 28 Jun 2007 18:00:01 +0200 Ingo Molnar <mingo@elte.hu> wrote:
> 
> >  with 1.2 usecs and 10,000 
> > irqs/sec the cost is 1.2 msecs/sec, or 0.1%.
> 
> off-by-10 error.

yeah, indeed - 12 msecs and 1.2% :-/
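
The corrected figure can be checked with a few lines of integer
arithmetic. This is a userspace sketch only; overhead_ppm() is a
hypothetical helper, and it assumes one CPU and a fixed per-event cost.
It expresses the overhead in parts per million of one CPU, so 12000 ppm
corresponds to 12 msecs/sec, or 1.2%.

```c
#include <assert.h>
#include <limits.h>

/*
 * Deferral overhead as parts per million of one CPU.
 * cost_ns * events_per_sec is nanoseconds of overhead per second;
 * dividing by 1e9 gives a fraction and multiplying by 1e6 gives ppm,
 * which together is a division by 1000.
 */
static unsigned long long overhead_ppm(unsigned long long cost_ns,
				       unsigned long long events_per_sec)
{
	/* Guard against overflow of the intermediate product. */
	assert(events_per_sec == 0 || cost_ns <= ULLONG_MAX / events_per_sec);
	return cost_ns * events_per_sec / 1000ULL;
}
```

With the numbers quoted in this thread, 1200 ns at 10,000 events/sec gives
12000 ppm (1.2%), and 170 ns at the same rate gives 1700 ppm (0.17%).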

	Ingo


* Re: [RFC PATCH 0/6] Convert all tasklets to workqueues
  2007-06-28 18:02                 ` Dan Williams
@ 2007-06-28 20:46                   ` Steven Rostedt
  2007-06-28 21:23                     ` Dan Williams
  0 siblings, 1 reply; 127+ messages in thread
From: Steven Rostedt @ 2007-06-28 20:46 UTC (permalink / raw)
  To: Dan Williams
  Cc: Ingo Molnar, Linus Torvalds, LKML, Andrew Morton,
	Thomas Gleixner, Christoph Hellwig, john stultz, Oleg Nesterov,
	Paul E. McKenney, Dipankar Sarma, David S. Miller, kuznet,
	Jeff Garzik

On Thu, 28 Jun 2007, Dan Williams wrote:
> >
> Unfortunately setting the thread to real time priority makes
> throughput slightly worse.  Instead of floating around 35MB/s the
> resync speed is stuck around 30MB/s:

That is really strange. If you raise the prio of the work queue it
gets worse?  Something really strange is happening here.  Are you using
CONFIG_PREEMPT?

-- Steve



* Re: [RFC PATCH 0/6] Convert all tasklets to workqueues
  2007-06-28 20:46                   ` Steven Rostedt
@ 2007-06-28 21:23                     ` Dan Williams
  2007-06-28 21:40                       ` Dan Williams
  2007-06-28 22:00                       ` Steven Rostedt
  0 siblings, 2 replies; 127+ messages in thread
From: Dan Williams @ 2007-06-28 21:23 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: Ingo Molnar, Linus Torvalds, LKML, Andrew Morton,
	Thomas Gleixner, Christoph Hellwig, john stultz, Oleg Nesterov,
	Paul E. McKenney, Dipankar Sarma, David S. Miller, kuznet,
	Jeff Garzik

On 6/28/07, Steven Rostedt <rostedt@goodmis.org> wrote:
> On Thu, 28 Jun 2007, Dan Williams wrote:
> > >
> > Unfortunately setting the thread to real time priority makes
> > throughput slightly worse.  Instead of floating around 35MB/s the
> > resync speed is stuck around 30MB/s:
>
> That is really strange. If you raise the prio of the work queue it
> gets worse?  Something really strange is happening here.  Are you using
> CONFIG_PREEMPT?
>
Everything thus far has been CONFIG_PREEMPT=n (the default for this platform).

With CONFIG_PREEMPT=y the resync is back in the 50MB/s range.

[iop-adma: hi-prio workqueue, CONFIG_PREEMPT=y]
iq81340mc:~# cat /proc/mdstat
Personalities : [raid0] [raid1] [raid10] [raid6] [raid5] [raid4]
md0 : active raid5 sdd[4] sdc[2] sdb[1] sda[0]
      468872448 blocks level 5, 64k chunk, algorithm 2 [4/3] [UUU_]
      [=>...................]  recovery =  5.8% (9136404/156290816) finish=46.1min speed=53161K/sec

The tasklet configuration stays in 50MB/s ballpark, and the default
priority (nice -5) workqueue case remains in the 30's with
CONFIG_PREEMPT=n.

--
Dan


* Re: [RFC PATCH 0/6] Convert all tasklets to workqueues
  2007-06-28 21:23                     ` Dan Williams
@ 2007-06-28 21:40                       ` Dan Williams
  2007-06-28 22:01                         ` Steven Rostedt
  2007-06-28 22:00                       ` Steven Rostedt
  1 sibling, 1 reply; 127+ messages in thread
From: Dan Williams @ 2007-06-28 21:40 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: Ingo Molnar, Linus Torvalds, LKML, Andrew Morton,
	Thomas Gleixner, Christoph Hellwig, john stultz, Oleg Nesterov,
	Paul E. McKenney, Dipankar Sarma, David S. Miller, kuznet,
	Jeff Garzik

On 6/28/07, Dan Williams <dan.j.williams@intel.com> wrote:
> Everything thus far has been CONFIG_PREEMPT=n (the default for this platform).
>
> With CONFIG_PREEMPT=y the resync is back in the 50MB/s range.
>
> [iop-adma: hi-prio workqueue, CONFIG_PREEMPT=y]
> iq81340mc:~# cat /proc/mdstat
> Personalities : [raid0] [raid1] [raid10] [raid6] [raid5] [raid4]
> md0 : active raid5 sdd[4] sdc[2] sdb[1] sda[0]
>       468872448 blocks level 5, 64k chunk, algorithm 2 [4/3] [UUU_]
>       [=>...................]  recovery =  5.8% (9136404/156290816) finish=46.1min speed=53161K/sec
>
> The tasklet configuration stays in 50MB/s ballpark, and the default
> priority (nice -5) workqueue case remains in the 30's with
> CONFIG_PREEMPT=n.
>
That last line should be CONFIG_PREEMPT=y

But I guess there is a reason it is still marked experimental...

iq81340mc:/data_dir# ./md0_verify.sh
kernel BUG at mm/page_alloc.c:363!
Unable to handle kernel NULL pointer dereference at virtual address 00000000
pgd = b29b8000
[00000000] *pgd=00000000
Internal error: Oops: 805 [#1] PREEMPT
Modules linked in:
CPU: 0    Not tainted  (2.6.22-rc6 #144)
PC is at __bug+0x20/0x2c
LR is at 0x403f37a0
pc : [<4002ef08>]    lr : [<403f37a0>]    psr: 60000093
sp : b28a3cc8  ip : 01d80000  fp : b28a3cd4
r10: 00000001  r9 : 00000001  r8 : 00000000
r7 : 0000004f  r6 : ffffffe0  r5 : 412919c0  r4 : 412919e0
r3 : 00000000  r2 : 403f37a0  r1 : 60000093  r0 : 00000026
Flags: nZCv  IRQs off  FIQs on  Mode SVC_32  Segment user
Control: 0400397f  Table: 729b8018  DAC: 00000015
Process md0_verify_kern (pid: 4188, stack limit = 0xb28a2258)
Stack: (0xb28a3cc8 to 0xb28a4000)
3cc0:                   b28a3d1c b28a3cd8 40070a90 4002eef4 b28a2000 403f6818
3ce0: 403f682c 40000013 403f6838 0000001f 00000010 4129c3c0 20000013 15573000
3d00: 4129c3c0 00000000 b28a4550 00000000 b28a3d2c b28a3d20 40070b54 400706b8
3d20: b28a3d44 b28a3d30 40073e7c 40070b4c 4129c3c0 15574000 b28a3d58 b28a3d48
3d40: 40084734 40073e34 b3e0fdcc b28a3dcc b28a3d5c 40079cd8 400846f8 ffffffff
3d60: 00000000 b39a4860 b28a3ddc 00000000 00000001 15574000 b28a2000 4040d548
3d80: b28a4550 b3357240 72d9e0ff ffffffff 00000000 15573fff 00003ffe 15574000
3da0: 15573fff 4040d548 b28a3ddc b2c15320 b28a2000 00000000 b333cc00 b3357240
3dc0: b28a3e04 b28a3dd0 4007d724 40079844 b28a3dd8 00000000 00000015 4040d548
3de0: b3357240 b28f6920 b28a2000 b3357240 b3357240 b3a4f9c0 b28a3e18 b28a3e08
3e00: 4003f2d8 4007d6b4 b3a462e0 b28a3e5c b28a3e1c 40095e70 4003f2a0 00000000
3e20: 00000000 b28a3e3c 00000013 b28a3e5c b28a3e3c b333cc10 b3c15cc0 b333cc00
3e40: 00000080 b28a3fb0 00000000 00000000 b28a3f2c b28a3e60 400c8270 400958f0
3e60: 00000000 b28a3ec8 b28a2000 b3c15c00 00000003 00000000 b29e54a0 00000014
3e80: 00000044 b38b4f00 00000002 00000000 00000000 00000001 403f6aac 403f6aac
3ea0: b333cc7c b3a462e0 000200d2 00000000 403f6aa8 b28a2000 b333cd30 b28a3f08
3ec0: b28a3ecc 40070c38 4006fdbc 000200d2 403f6aac 00000010 41295a60 b28a2000
3ee0: b333cc7c 00000000 b333cc00 b333cc00 00000003 0001fff3 0000004f 0000004f
3f00: b333cc00 403f72a8 b28a2000 400c7e60 00000000 b333cc00 b28a3fb0 fffffffe
3f20: b28a3f5c b28a3f30 40095054 400c7e6c 00000000 b4503000 000bf408 b333cc00
3f40: 00000000 000bf628 b28a2000 b28a3fb0 b28a3f84 b28a3f60 40096c3c 40094f58
3f60: b4503000 000bf408 b28a3fb0 b4503000 4002b0e4 156da000 b28a3fa4 b28a3f88
3f80: 4002e3cc 40096b24 000bf408 000bf628 000bf788 0000000b 00000000 b28a3fa8
3fa0: 4002af20 4002e39c 000bf408 000bf628 000bf788 000bf628 000bf408 00000000
3fc0: 000bf408 000bf628 000bf788 ffffffff 000bf628 000bf408 156da000 000be848
3fe0: 15651550 3eb7e4ac 0002b4f0 1565158c 60000010 000bf788 00000000 00000000
Backtrace:
[<4002eee8>] (__bug+0x0/0x2c) from [<40070a90>] (free_hot_cold_page+0x3e4/0x434)
[<400706ac>] (free_hot_cold_page+0x0/0x434) from [<40070b54>]
(free_hot_page+0x14/0x18)
[<40070b40>] (free_hot_page+0x0/0x18) from [<40073e7c>] (put_page+0x54/0x174)
[<40073e28>] (put_page+0x0/0x174) from [<40084734>]
(free_page_and_swap_cache+0x48/0x64)
 r5:15574000 r4:4129c3c0
[<400846ec>] (free_page_and_swap_cache+0x0/0x64) from [<40079cd8>]
(unmap_vmas+0x4a0/0x67c)
 r4:b3e0fdcc
[<40079838>] (unmap_vmas+0x0/0x67c) from [<4007d724>] (exit_mmap+0x7c/0x158)
[<4007d6a8>] (exit_mmap+0x0/0x158) from [<4003f2d8>] (mmput+0x44/0x100)
[<4003f294>] (mmput+0x0/0x100) from [<40095e70>] (flush_old_exec+0x58c/0x9d8)
 r4:b3a462e0
[<400958e4>] (flush_old_exec+0x0/0x9d8) from [<400c8270>]
(load_elf_binary+0x410/0x18fc)
[<400c7e60>] (load_elf_binary+0x0/0x18fc) from [<40095054>]
(search_binary_handler+0x108/0x2b4)
[<40094f4c>] (search_binary_handler+0x0/0x2b4) from [<40096c3c>]
(do_execve+0x124/0x1e4)
[<40096b18>] (do_execve+0x0/0x1e4) from [<4002e3cc>] (sys_execve+0x3c/0x5c)
[<4002e390>] (sys_execve+0x0/0x5c) from [<4002af20>] (ret_fast_syscall+0x0/0x3c)
 r7:0000000b r6:000bf788 r5:000bf628 r4:000bf408
Code: e1a01000 e59f000c eb004d60 e3a03000 (e5833000)
note: md0_verify_kern[4188] exited with preempt_count 4


* Re: [RFC PATCH 0/6] Convert all tasklets to workqueues
  2007-06-28 21:23                     ` Dan Williams
  2007-06-28 21:40                       ` Dan Williams
@ 2007-06-28 22:00                       ` Steven Rostedt
  1 sibling, 0 replies; 127+ messages in thread
From: Steven Rostedt @ 2007-06-28 22:00 UTC (permalink / raw)
  To: Dan Williams
  Cc: Ingo Molnar, Linus Torvalds, LKML, Andrew Morton,
	Thomas Gleixner, Christoph Hellwig, john stultz, Oleg Nesterov,
	Paul E. McKenney, Dipankar Sarma, David S. Miller, kuznet,
	Jeff Garzik


On Thu, 28 Jun 2007, Dan Williams wrote:

> > CONFIG_PREEMPT?
> >
> Everything thus far has been CONFIG_PREEMPT=n (the default for this platform).
>
> With CONFIG_PREEMPT=y the resync is back in the 50MB/s range.

So with upping the prio for the work queue you got back your performance?

>
> [iop-adma: hi-prio workqueue, CONFIG_PREEMPT=y]
> iq81340mc:~# cat /proc/mdstat
> Personalities : [raid0] [raid1] [raid10] [raid6] [raid5] [raid4]
> md0 : active raid5 sdd[4] sdc[2] sdb[1] sda[0]
>       468872448 blocks level 5, 64k chunk, algorithm 2 [4/3] [UUU_]
>       [=>...................]  recovery =  5.8% (9136404/156290816) finish=46.1min speed=53161K/sec
>
> The tasklet configuration stays in 50MB/s ballpark, and the default
> priority (nice -5) workqueue case remains in the 30's with
> CONFIG_PREEMPT=n.

[noted: should be CONFIG_PREEMPT=y]

This is expected. Seems you may have other things running at a higher prio.

-- Steve



* Re: [RFC PATCH 0/6] Convert all tasklets to workqueues
  2007-06-28 21:40                       ` Dan Williams
@ 2007-06-28 22:01                         ` Steven Rostedt
  0 siblings, 0 replies; 127+ messages in thread
From: Steven Rostedt @ 2007-06-28 22:01 UTC (permalink / raw)
  To: Dan Williams
  Cc: Ingo Molnar, Linus Torvalds, LKML, Andrew Morton,
	Thomas Gleixner, Christoph Hellwig, john stultz, Oleg Nesterov,
	Paul E. McKenney, Dipankar Sarma, David S. Miller, kuznet,
	Jeff Garzik



--

>
> But I guess there is a reason it is still marked experimental...
>
> iq81340mc:/data_dir# ./md0_verify.sh
> kernel BUG at mm/page_alloc.c:363!

Well, at least this uncovered something :-)

I'll look into this too.

-- Steve


* Re: [RFC PATCH 0/6] Convert all tasklets to workqueues
  2007-06-28 16:00               ` Ingo Molnar
                                   ` (2 preceding siblings ...)
  2007-06-28 18:19                 ` Andrew Morton
@ 2007-06-29 11:34                 ` Alexey Kuznetsov
  2007-06-29 11:48                   ` Duncan Sands
                                     ` (3 more replies)
  3 siblings, 4 replies; 127+ messages in thread
From: Alexey Kuznetsov @ 2007-06-29 11:34 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Jeff Garzik, Linus Torvalds, Steven Rostedt, LKML, Andrew Morton,
	Thomas Gleixner, Christoph Hellwig, john stultz, Oleg Nesterov,
	Paul E. McKenney, Dipankar Sarma, David S. Miller,
	matthew.wilcox

Hello!

> I find the 4usecs cost on a P4 interesting and a bit too high - how did 
> you measure it?

Simple and stupid:

int flag;

static void do_test(unsigned long dummy)
{
	flag = 1;
}

static void do_test_wq(void *dummy)
{
	flag = 1;
}

static void measure_tasklet0(void)
{
	int i;
	int cnt = 0;
	DECLARE_TASKLET(test, do_test, 0);
	unsigned long start = jiffies;

	for (i=0; i<1000000; i++) {
		flag = 0;
		local_bh_disable();
		tasklet_schedule(&test);
		local_bh_enable();
		while (flag == 0) {
			schedule();
			cnt++;
		} /*while (flag == 0)*/;
	}
	printk("tasklet0: %lu %d\n", jiffies - start, cnt);
}

static void measure_tasklet1(void)
{
	int i;
	int cnt = 0;
	DECLARE_TASKLET(test, do_test, 0);
	unsigned long start = jiffies;

	for (i=0; i<1000000; i++) {
		flag = 0;
		local_bh_disable();
		tasklet_schedule(&test);
		local_bh_enable();
		do {
			schedule();
			cnt++;
		} while (flag == 0);
	}
	printk("tasklet1: %lu %d\n", jiffies - start, cnt);
}

static void measure_workqueue(void)
{
	int i;
	int cnt = 0;
	unsigned long start;
	DECLARE_WORK(test, do_test_wq, 0);
	struct workqueue_struct * wq;

	start = jiffies;

	wq = create_workqueue("testq");

	for (i=0; i<1000000; i++) {
		flag = 0;
		queue_work(wq, &test);
		do {
			schedule();
			cnt++;
		} while (flag == 0);
	}
	printk("wq: %lu %d\n", jiffies - start, cnt);
	destroy_workqueue(wq);
}



> tasklet as an intermediary towards a softirq - what's the technological 
> point in such a splitup?

"... work_struct as intermediary towards a workqueue - what's the technological
point in such a splitup?" Nonsense? Yes, but it is exactly what you said. :-)

softirq is just a context and engine to run something. Exactly like
workqueue task. struct tasklet is work_struct, it is just a thing to run.


> workqueues can be per-cpu - for tasklets to be per-cpu you have to 
> open-code them into per-cpu like rcu-tasklets did

I feel I have to repeat: tasklet==work_struct, workqueue==softirq. 

Essentially, you said that workqueues "scale" in direction of increasing
amount of softirqs. This is _correct_, but the word is different: "flexible"
is the word. As for performance, scalability, blah-blah, workqueues
are definitely worse. And this is OK, you do not need to conceal this.

This is the price which we pay for flexibility and for niceness to realtime.

That's what should be said in advertisement notes instead of propaganda.



> Just look at the tasklet_disable() logic.

Do not count this.

Done this way because nobody needed that thing, except for _one_ place
in keyboard/console driver, which was very difficult to fix that time,
when vt code was utterly messy and not smp safe at all.

start_bh_atomic() was successfully killed, but we had to preserve analogue
of disable_bh() with the same semantics for some time.
It is deliberately implemented in a way, which does not impact hot paths
and is easy to remove.

It is sad that some usb drivers started to use this creepy and
useless thing.


> also, the "be afraid of the hardirq or the process context" mantra is 
> overblown as well. If something is too heavy for a hardirq, _it's too 
> heavy for a tasklet too_. Most hardirqs are (or should be) running with 
> interrupts enabled, which makes their difference to softirqs miniscule.

Incorrect.

The difference between softirqs and hardirqs lies not in their "heaviness".
It is in reentrancy protection, which has to be done with local_irq_disable(),
unless networking is isolated from hardirqs. That's all.
Networking is too hairy to allow to be executed with disabled hardirqs.
And moving this hairiness to process context requires
<irony mode> a little </> more effort than converting tasklets to work queues.


> The most scalable workloads dont involve any (or many) softirq middlemen 
> at all: you queue work straight from the hardirq context to the target 
> process context.

Do you really see something common between this Holy Grail Quest and
tasklets/workqueues? Come on. :-)

Actually, this is a step backwards. Instead of execution in the correct
context, you create a new dummy context.  This is the place where the goals
of realtime and the Holy Grail Quest split.


> true just as much: tasklets from a totally uninteresting network adapter 
> can kill your latency-sensitive application too.

If I start a process running at nice --22, I sign up for killing the latency
of nice 0 processes. But I did not sign up for killing network/scsi adapters.
"Latency-sensitive applications" use real time priority as well,
so that they will compete with tasklets fairly.


* Re: [RFC PATCH 0/6] Convert all tasklets to workqueues
  2007-06-29 11:34                 ` Alexey Kuznetsov
@ 2007-06-29 11:48                   ` Duncan Sands
  2007-06-29 13:36                     ` Alexey Kuznetsov
  2007-06-29 12:29                   ` Ingo Molnar
                                     ` (2 subsequent siblings)
  3 siblings, 1 reply; 127+ messages in thread
From: Duncan Sands @ 2007-06-29 11:48 UTC (permalink / raw)
  To: Alexey Kuznetsov
  Cc: Ingo Molnar, Jeff Garzik, Linus Torvalds, Steven Rostedt, LKML,
	Andrew Morton, Thomas Gleixner, Christoph Hellwig, john stultz,
	Oleg Nesterov, Paul E. McKenney, Dipankar Sarma, David S. Miller,
	matthew.wilcox

Hi,

> > Just look at the tasklet_disable() logic.
> 
> Do not count this.
> 
> Done this way because nobody needed that thing, except for _one_ place
> in keyboard/console driver, which was very difficult to fix that time,
> when vt code was utterly messy and not smp safe at all.
> 
> start_bh_atomic() was successfully killed, but we had to preserve analogue
> of disable_bh() with the same semantics for some time.
> It is deliberately implemented in a way, which does not impact hot paths
> and is easy to remove.
> 
> It is sad that some usb drivers started to use this creepy and
> useless thing.

the usbatm USB driver uses it in the methods for opening and closing a new network
connection, and on device disconnect.  Yes, tasklet_disable could be eliminated by
adding a spinlock.  However this would mean taking the lock every time a packet is
received or transmitted.  As it is, typically open occurs once, when the computer
boots, and close and disconnect also occur once each, when the computer shuts down.
I felt that three calls to tasklet_disable were better than a gazillion calls to
spin_(un)lock.  Please feel free to educate me :)

Ciao,

Duncan.

* Re: [RFC PATCH 0/6] Convert all tasklets to workqueues
  2007-06-29 11:34                 ` Alexey Kuznetsov
  2007-06-29 11:48                   ` Duncan Sands
@ 2007-06-29 12:29                   ` Ingo Molnar
  2007-06-29 13:25                     ` Alexey Kuznetsov
  2007-06-29 13:41                   ` Steven Rostedt
  2007-06-29 15:51                   ` Oleg Nesterov
  3 siblings, 1 reply; 127+ messages in thread
From: Ingo Molnar @ 2007-06-29 12:29 UTC (permalink / raw)
  To: Alexey Kuznetsov
  Cc: Jeff Garzik, Linus Torvalds, Steven Rostedt, LKML, Andrew Morton,
	Thomas Gleixner, Christoph Hellwig, john stultz, Oleg Nesterov,
	Paul E. McKenney, Dipankar Sarma, David S. Miller,
	matthew.wilcox


* Alexey Kuznetsov <kuznet@ms2.inr.ac.ru> wrote:

> > also, the "be afraid of the hardirq or the process context" mantra 
> > is overblown as well. If something is too heavy for a hardirq, _it's 
> > too heavy for a tasklet too_. Most hardirqs are (or should be) 
                                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> > running with interrupts enabled, which makes their difference to 
    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> > softirqs miniscule.
    ^^^^^^^^^^^^^^^^^^
> 
> Incorrect.
> 
> The difference between softirqs and hardirqs lies not in their 
> "heaviness". It is in reentrancy protection, which has to be done with 
> local_irq_disable(), unless networking is isolated from hardirqs. 

i know that pretty well ;)

> That's all. Networking is too hairy to allow to be executed with 
> disabled hardirqs. And moving this hairiness to process context 
> requires <irony mode> a little </> more effort than converting 
> tasklets to work queues.

as i said above (see the underlined sentence), hardirq contexts already 
run just fine with hardirqs enabled. So your dismissal of executing that 
'hairy' bit in hardirq context is not that automatically true as you 
seem to assume i think.

also, network softirq locking dependencies arent all that magic or 
complex either: they do not operate on sockets that are already locked 
by a user context, they are per CPU and they are not preempted by 
'themselves', nor are they preempted by certain other softirqs (such as 
they are not preempted by the timer softirq). Am i missing some point of 
yours?

	Ingo

* Re: [RFC PATCH 0/6] Convert all tasklets to workqueues
  2007-06-29 12:29                   ` Ingo Molnar
@ 2007-06-29 13:25                     ` Alexey Kuznetsov
  2007-06-29 13:43                       ` Ingo Molnar
  0 siblings, 1 reply; 127+ messages in thread
From: Alexey Kuznetsov @ 2007-06-29 13:25 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Jeff Garzik, Linus Torvalds, Steven Rostedt, LKML, Andrew Morton,
	Thomas Gleixner, Christoph Hellwig, john stultz, Oleg Nesterov,
	Paul E. McKenney, Dipankar Sarma, David S. Miller,
	matthew.wilcox

Hello!

> > The difference between softirqs and hardirqs lies not in their 
> > "heaviness". It is in reentrancy protection, which has to be done with 
> > local_irq_disable(), unless networking is isolated from hardirqs. 
> 
> i know that pretty well ;)

You forgot about this again in the next sentence. :-)


> as i said above (see the underlined sentence), hardirq contexts already 
> run just fine with hardirqs enabled.

REENTRANCY PROTECTION! It does not matter _how_ they run, it matters what
context they preempt and what that context has to do to prevent that
preemption. If you still do not get the point, run
	sed -e 's/local_bh_/local_irq_/'
over net/* and kill softirqs. Everything will work just fine.
Moreover, if you deal only with a single TCP connection
(and sysctl tcp_low_latency is not set), even hardirq latency will not suck,
all real work is done at process context.


> also, network softirq locking dependencies arent all that magic or 
> complex either: they do not operate on sockets that are already locked 
> by a user context, they are per CPU and they are not preempted by 
> 'themselves', nor are they preempted by certain other softirqs (such as 
> they are not preempted by the timer softirq). Am i missing some point of 
> yours?

I would not say I understood what you wanted to say. :-)

Does my statement about sed match your view? :-)


What I know is that there are no hairy locking dependencies at all
and there is no magic there. Especially on the level of sockets.
The things which are troublesome are various shared trees/hash tables
(e.g. socket hash tables), which are modified both by incoming network
packets and process contexts.

I have some obscure suspicion that naive dream of "realtime" folks is
to move all those "bad" things to some kernel threads and to talk
to those threads passing some messages. I hope this suspicion is wrong,
otherwise I would say: go to Mach, pals. :-(

Alexey

* Re: [RFC PATCH 0/6] Convert all tasklets to workqueues
  2007-06-29 11:48                   ` Duncan Sands
@ 2007-06-29 13:36                     ` Alexey Kuznetsov
  2007-06-29 14:01                       ` Duncan Sands
  0 siblings, 1 reply; 127+ messages in thread
From: Alexey Kuznetsov @ 2007-06-29 13:36 UTC (permalink / raw)
  To: Duncan Sands
  Cc: Ingo Molnar, Jeff Garzik, Linus Torvalds, Steven Rostedt, LKML,
	Andrew Morton, Thomas Gleixner, Christoph Hellwig, john stultz,
	Oleg Nesterov, Paul E. McKenney, Dipankar Sarma, David S. Miller,
	matthew.wilcox

Hello!

> I felt that three calls to tasklet_disable were better than a gazillion calls to
> spin_(un)lock.

It is not better.

Actually, it also has something equivalent to a spinlock inside.
It raises some flag and waits for completion of already running
tasklets (cf. spin_lock_bh). And if tasklet_schedule happens while
it is disabled, it tries to take that lock a gazillion
times until the tasklet is reenabled.

In the old days that was acceptable: you had not a gazillion attempts
but just a few. But for some time now (also long ago already) it has
been disastrous.

It is really better just to avoid calling tasklet_schedule(),
when you do not want it to be executed. :-)
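
The retry behaviour described above can be illustrated with a toy userspace
model. This is only a sketch, not kernel code: the struct and function names
are hypothetical, and it assumes the simplified rule that a scheduled but
disabled tasklet stays queued and forces the softirq to be raised again on
every pass.

```c
#include <stdbool.h>

/* Toy userspace model, not kernel code: all names are hypothetical. */
struct toy_tasklet {
	int  disable_count;   /* >0 while the toy "tasklet_disable()" is in effect */
	bool scheduled;       /* set by the toy "tasklet_schedule()" */
	int  runs;            /* how many times the handler actually ran */
};

/* One softirq pass: run the tasklet if it is enabled, otherwise leave it
 * queued and report that the softirq has to be raised again. */
static bool toy_softirq_pass(struct toy_tasklet *t)
{
	if (!t->scheduled)
		return false;           /* nothing pending */
	if (t->disable_count == 0) {
		t->scheduled = false;
		t->runs++;
		return false;           /* handled, nothing left to retry */
	}
	return true;                    /* still disabled: re-raise and retry */
}

/* Count the wasted passes while the tasklet stays disabled for
 * 'disabled_passes' softirq invocations and is then re-enabled. */
static long toy_wasted_passes(struct toy_tasklet *t, long disabled_passes)
{
	long wasted = 0;

	t->disable_count = 1;           /* toy "tasklet_disable()" */
	t->scheduled = true;            /* toy "tasklet_schedule()" */
	for (long i = 0; i < disabled_passes; i++)
		if (toy_softirq_pass(t))
			wasted++;
	t->disable_count = 0;           /* toy "tasklet_enable()" */
	toy_softirq_pass(t);            /* this pass finally runs the handler */
	return wasted;
}
```

In this model every pass made while the tasklet is disabled is pure overhead,
which is why not scheduling it in the first place is the cheaper option.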

Alexey

* Re: [RFC PATCH 0/6] Convert all tasklets to workqueues
  2007-06-29 11:34                 ` Alexey Kuznetsov
  2007-06-29 11:48                   ` Duncan Sands
  2007-06-29 12:29                   ` Ingo Molnar
@ 2007-06-29 13:41                   ` Steven Rostedt
  2007-06-29 14:24                     ` Jeff Garzik
                                       ` (2 more replies)
  2007-06-29 15:51                   ` Oleg Nesterov
  3 siblings, 3 replies; 127+ messages in thread
From: Steven Rostedt @ 2007-06-29 13:41 UTC (permalink / raw)
  To: Alexey Kuznetsov
  Cc: Ingo Molnar, Jeff Garzik, Linus Torvalds, LKML, Andrew Morton,
	Thomas Gleixner, Christoph Hellwig, john stultz, Oleg Nesterov,
	Paul E. McKenney, Dipankar Sarma, David S. Miller,
	matthew.wilcox


On Fri, 29 Jun 2007, Alexey Kuznetsov wrote:
> Hello!
>
> > I find the 4usecs cost on a P4 interesting and a bit too high - how did
> > you measure it?
>
> Simple and stupid:

Noted ;-)

> static void measure_tasklet0(void)
> {
> 	int i;
> 	int cnt = 0;
> 	DECLARE_TASKLET(test, do_test, 0);
> 	unsigned long start = jiffies;

Not a very accurate measurement (jiffies that is).

>
> 	for (i=0; i<1000000; i++) {
> 		flag = 0;
> 		local_bh_disable();
> 		tasklet_schedule(&test);
> 		local_bh_enable();
> 		while (flag == 0) {
> 			schedule();
> 			cnt++;
> 		} /*while (flag == 0)*/;
> 	}
> 	printk("tasklet0: %lu %d\n", jiffies - start, cnt);
> }
>

[...]

>
> static void measure_workqueue(void)
> {
> 	int i;
> 	int cnt = 0;
> 	unsigned long start;
> 	DECLARE_WORK(test, do_test_wq, 0);
> 	struct workqueue_struct * wq;
>
> 	start = jiffies;
>
> 	wq = create_workqueue("testq");
>
> 	for (i=0; i<1000000; i++) {
> 		flag = 0;
> 		queue_work(wq, &test);
> 		do {
> 			schedule();

Since the work queue *is* a thread, you are running a busy loop here. Even
though you call schedule, this thread still may have quota available, and
will not yield to the work queue.  Unless of course this caller is of lower
priority. But even then, I'm not sure how quickly the scheduler would
choose the work queue.



> 			cnt++;
> 		} while (flag == 0);
> 	}
> 	printk("wq: %lu %d\n", jiffies - start, cnt);
> 	destroy_workqueue(wq);
> }
>
>
>

> and is easy to remove.
>
> It is sad that some usb drivers started to use this creepy and
> useless thing.
>
>
> > also, the "be afraid of the hardirq or the process context" mantra is
> > overblown as well. If something is too heavy for a hardirq, _it's too
> > heavy for a tasklet too_. Most hardirqs are (or should be) running with
> > interrupts enabled, which makes their difference to softirqs miniscule.
>
> Incorrect.
>
> The difference between softirqs and hardirqs lies not in their "heaviness".
> It is in reentrancy protection, which has to be done with local_irq_disable(),
> unless networking is isolated from hardirqs. That's all.
> Networking is too hairy to allow to be executed with disabled hardirqs.
> And moving this hairiness to process context requires
> <irony mode> a little </> more effort than converting tasklets to work queues.
>

I do really want to point out something in the Subject line. **RFC**
:-)

I had very little hope for this magic switch to get into mainline. (maybe
get it into -mm)  But the thing is that tasklets IMHO are overused.
As Ingo said, there are probably only 2 or 3 places in the kernel that
a switch to work queues couldn't solve. Those places could then
probably be solved by a different design (yes that would take work).

Tasklets are there and people will continue to use them when they
shouldn't for as long as they exist. Tasklets are there because there
weren't work queues or kthreads at the time of solving the problem that
tasklets solved.

So if you can still keep the same performance without tasklets, I say we
get rid of them. I've also met too many device driver writers that want
the lowest possible latency for their device, and do so by sacrificing the
latency of other things in the system that may be even more critical.

Also note, that the more tasklets we have, the higher the latency will be
for other tasklets. There are only two prios you can currently give a
tasklet, so competing devices will need to fight each other without the
admin being able to have as much control over the result.

-- Steve


* Re: [RFC PATCH 0/6] Convert all tasklets to workqueues
  2007-06-29 13:25                     ` Alexey Kuznetsov
@ 2007-06-29 13:43                       ` Ingo Molnar
  2007-06-29 15:23                         ` Alexey Kuznetsov
  0 siblings, 1 reply; 127+ messages in thread
From: Ingo Molnar @ 2007-06-29 13:43 UTC (permalink / raw)
  To: Alexey Kuznetsov
  Cc: Jeff Garzik, Linus Torvalds, Steven Rostedt, LKML, Andrew Morton,
	Thomas Gleixner, Christoph Hellwig, john stultz, Oleg Nesterov,
	Paul E. McKenney, Dipankar Sarma, David S. Miller,
	matthew.wilcox


* Alexey Kuznetsov <kuznet@ms2.inr.ac.ru> wrote:

> > as i said above (see the underlined sentence), hardirq contexts 
> > already run just fine with hardirqs enabled.
> 
> REENTRANCY PROTECTION! It does not matter _how_ they run, it matters 
> what context they preempt and what that context has to do to prevent 
> that preemption. [...]

again, there is no reason why this couldnt be done in a hardirq context. 
If a hardirq preempts another hardirq and the first hardirq already 
processes the 'softnet work', you dont do it from the second one but 
queue it with the first one. (into the already existing 
sd->completion_queue for tx work or queue->poll_list for rx work) It 
would be a simple additional flag in softnet_data.
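
The "queue it with the first one" idea can be sketched as a toy userspace
model. Everything here is hypothetical (it is not the real softnet_data, and
the names are made up for illustration); it only shows how a single flag keeps
the processing loop from being entered twice when a nested hardirq arrives.

```c
#include <stdbool.h>

/* Toy userspace model; names are hypothetical, not the real softnet_data. */
struct toy_softnet {
	bool processing;   /* the "simple additional flag" */
	int  queued;       /* pending work items */
	int  handled;      /* work items processed so far */
	int  depth;        /* current nesting of the processing loop */
	int  max_depth;    /* deepest nesting ever observed */
};

static void toy_hardirq(struct toy_softnet *sd, int nested_irqs);

/* Drain the queue; after each item a further hardirq may "preempt" us. */
static void toy_process(struct toy_softnet *sd, int nested_irqs)
{
	sd->depth++;
	if (sd->depth > sd->max_depth)
		sd->max_depth = sd->depth;
	while (sd->queued > 0) {
		sd->queued--;
		sd->handled++;
		if (nested_irqs > 0) {
			nested_irqs--;
			toy_hardirq(sd, 0);   /* simulated nested hardirq */
		}
	}
	sd->depth--;
}

/* Every hardirq queues its work; only the outermost one processes it.
 * A real implementation would also have to close the window between the
 * last queue check and clearing the flag; this single-threaded model
 * ignores that. */
static void toy_hardirq(struct toy_softnet *sd, int nested_irqs)
{
	sd->queued++;
	if (sd->processing)
		return;                       /* the first hardirq will pick it up */
	sd->processing = true;
	toy_process(sd, nested_irqs);
	sd->processing = false;
}
```

With three simulated nested hardirqs, all four work items are handled by the
outermost invocation and the processing loop is never entered recursively.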

once we forget about 'hardirq contexts run with irqs disabled', _there 
is just no technological point for softirqs_. They are an unnecessary 
abstraction!

once we concede that point, reentrancy protection does not have to be 
done via local_bh_disable(). For example we run just fine without it in 
-rt, local_bh_disable() is a NOP there. How is it done? By controlling 
execution of the _workflow_ that a softirq does. By implementing 
non-reentrancy via another, more flexible mechanism. (and by carefully 
fixing a few _other_, non-workflow assumptions that softnet does/did, 
such as the per-cpu-ness of softnet_data.)

Are we talking about the very same thing perhaps, just from a different 
angle? ;-)

	Ingo

* Re: [RFC PATCH 0/6] Convert all tasklets to workqueues
  2007-06-29 13:36                     ` Alexey Kuznetsov
@ 2007-06-29 14:01                       ` Duncan Sands
  2007-06-29 16:34                         ` Alexey Kuznetsov
  0 siblings, 1 reply; 127+ messages in thread
From: Duncan Sands @ 2007-06-29 14:01 UTC (permalink / raw)
  To: Alexey Kuznetsov
  Cc: Ingo Molnar, Jeff Garzik, Linus Torvalds, Steven Rostedt, LKML,
	Andrew Morton, Thomas Gleixner, Christoph Hellwig, john stultz,
	Oleg Nesterov, Paul E. McKenney, Dipankar Sarma, David S. Miller,
	matthew.wilcox

> In the old days that was acceptable: you had not a gazillion attempts
> but just a few. But for some time now (also long ago already) it has
> been disastrous.

What changed?  And can it be fixed?

Thanks,

Duncan.

* Re: [RFC PATCH 0/6] Convert all tasklets to workqueues
  2007-06-29 13:41                   ` Steven Rostedt
@ 2007-06-29 14:24                     ` Jeff Garzik
  2007-06-29 14:26                     ` Oleg Nesterov
  2007-06-29 14:27                     ` Alexey Kuznetsov
  2 siblings, 0 replies; 127+ messages in thread
From: Jeff Garzik @ 2007-06-29 14:24 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: Alexey Kuznetsov, Ingo Molnar, Linus Torvalds, LKML,
	Andrew Morton, Thomas Gleixner, Christoph Hellwig, john stultz,
	Oleg Nesterov, Paul E. McKenney, Dipankar Sarma, David S. Miller,
	matthew.wilcox

Steven Rostedt wrote:
> I had very little hope for this magic switch to get into mainline. (maybe
> get it into -mm)  But the thing is that tasklets IMHO are overused.
> As Ingo said, there are probably only 2 or 3 places in the kernel that
> a switch to work queues couldn't solve.

This is purely a guess, backed by zero evidence.

These network drivers were hand-tuned to use tasklets.  Sure it will 
WORK as a workqueue, but that says nothing about equivalence.


> Those places could then
> probably be solved by a different design (yes that would take work).

Network driver patches welcome :)


> Tasklets are there because there
> weren't work queues or kthreads at the time of solving the problem that
> tasklets solved.

Completely false, at least in network driver land.  Threads existed and 
were used (proof: 8139too, among others).

Kernel threads were not used for hot path network packet shovelling 
because they were too damn slow.  Tasklets were single-threaded, fast, 
simple and immediate.  Workqueues today are simple and almost-fast.

	Jeff



* Re: [RFC PATCH 0/6] Convert all tasklets to workqueues
  2007-06-29 13:41                   ` Steven Rostedt
  2007-06-29 14:24                     ` Jeff Garzik
@ 2007-06-29 14:26                     ` Oleg Nesterov
  2007-06-29 19:04                       ` Alexey Kuznetsov
  2007-06-29 14:27                     ` Alexey Kuznetsov
  2 siblings, 1 reply; 127+ messages in thread
From: Oleg Nesterov @ 2007-06-29 14:26 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: Alexey Kuznetsov, Ingo Molnar, Jeff Garzik, Linus Torvalds, LKML,
	Andrew Morton, Thomas Gleixner, Christoph Hellwig, john stultz,
	Paul E. McKenney, Dipankar Sarma, David S. Miller,
	matthew.wilcox

On 06/29, Steven Rostedt wrote:
> 
> On Fri, 29 Jun 2007, Alexey Kuznetsov wrote:
> >
> > static void measure_workqueue(void)
> > {
> > 	int i;
> > 	int cnt = 0;
> > 	unsigned long start;
> > 	DECLARE_WORK(test, do_test_wq, 0);
> > 	struct workqueue_struct * wq;
> >
> > 	start = jiffies;
> >
> > 	wq = create_workqueue("testq");

Also, create_workqueue() is very costly. The order of the last 2 lines
should be reversed.

Oleg.


* Re: [RFC PATCH 0/6] Convert all tasklets to workqueues
  2007-06-29 13:41                   ` Steven Rostedt
  2007-06-29 14:24                     ` Jeff Garzik
  2007-06-29 14:26                     ` Oleg Nesterov
@ 2007-06-29 14:27                     ` Alexey Kuznetsov
  2 siblings, 0 replies; 127+ messages in thread
From: Alexey Kuznetsov @ 2007-06-29 14:27 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: Ingo Molnar, Jeff Garzik, Linus Torvalds, LKML, Andrew Morton,
	Thomas Gleixner, Christoph Hellwig, john stultz, Oleg Nesterov,
	Paul E. McKenney, Dipankar Sarma, David S. Miller,
	matthew.wilcox

Hello!

> Not a very accurate measurement (jiffies that is).

Believe it or not, but the measurement has nanosecond precision.
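
The precision claim follows from simple arithmetic: the loop runs 1000000
times, so an error of one jiffy in the total is spread over a million
iterations. A minimal userspace sketch, assuming HZ=1000 (the thread does not
say which HZ was used) and hypothetical helper names:

```c
#include <stdint.h>

/* Assumed values for illustration; the thread does not state HZ. */
#define ASSUMED_HZ  1000u      /* jiffies per second (assumption) */
#define ITERATIONS  1000000u   /* loop count used in the test functions */

/* Average cost of one iteration in nanoseconds for a printed jiffies delta. */
static uint64_t ns_per_iteration(uint64_t jiffies_delta)
{
	uint64_t total_ns = jiffies_delta * (1000000000ull / ASSUMED_HZ);

	return total_ns / ITERATIONS;
}

/* Error per iteration, in nanoseconds, when the total is off by one jiffy. */
static uint64_t resolution_ns(void)
{
	return (1000000000ull / ASSUMED_HZ) / ITERATIONS;
}
```

Under these assumptions a printed delta of 4000 jiffies corresponds to 4000 ns
(4 usecs) per iteration, and being off by one jiffy shifts the result by 1 ns.
With HZ=100 the same reasoning gives a resolution of 10 ns.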


> Since the work queue *is* a thread, you are running a busy loop here. Even
> though you call schedule, this thread still may have quota available, and
> will not yield to the work queue.  Unless of course this caller is of lower
> priority. But even then, I'm not sure how quickly the scheduler would
> choose the work queue.

Instantly. That's why cnt is printed. Unless cnt==1000000, the result
is invalid.


> get it into -mm)  But the thing is that tasklets IMHO are overused.

You preach to a choir. :-)

Alexey

* Re: [RFC PATCH 0/6] Convert all tasklets to workqueues
  2007-06-29 13:43                       ` Ingo Molnar
@ 2007-06-29 15:23                         ` Alexey Kuznetsov
  0 siblings, 0 replies; 127+ messages in thread
From: Alexey Kuznetsov @ 2007-06-29 15:23 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Jeff Garzik, Linus Torvalds, Steven Rostedt, LKML, Andrew Morton,
	Thomas Gleixner, Christoph Hellwig, john stultz, Oleg Nesterov,
	Paul E. McKenney, Dipankar Sarma, David S. Miller,
	matthew.wilcox

Hello!

> again, there is no reason why this couldnt be done in a hardirq context. 
> If a hardirq preempts another hardirq and the first hardirq already 
> processes the 'softnet work', you dont do it from the second one but 
> queue it with the first one. (into the already existing 
> sd->completion_queue for tx work or queue->poll_list for rx work) It 
> would be a simple additional flag in softnet_data.

This is kind of obvious. It is just a description of softnet.


> once we forget about 'hardirq contexts run with irqs disabled', _there 
> is just no technological point for softirqs_. They are an unnecessary 
> abstraction!

The first paragraph describes softirq, nothing else.

I have already understood when you say "technological",
you mean "terminological". "softirq" is just a term to describe
"softnet" workflow in an intelligible way. Call it from inside
irq handler, rather than in irq_exit, this changes _NOTHING_.

I understood that you describe the original prehistoric
softnet model. You just want to replace the softirq run at irq_exit
with an explicit soft{net,scsi,whatever}_call, which could
execute immediately or can be queued for later. I hope I am wrong,
because this is... mmm... not progress.


> -rt, local_bh_disable() is a NOP there. How is it done?
...
> Are we talking about the very same thing perhaps, just from a different 
> angle? ;-)

When talking about softnet, yes.

No, when talking about "implementing non-reentrancy via another,
more flexible mechanism". We are not on the same page.
I am afraid even the books are different. :-)

I need to think about this and really read -rt code, this sounds so crazy
that it can be even correct.

Timeout, we are far out of topic anyway.

Alexey

* Re: [RFC PATCH 0/6] Convert all tasklets to workqueues
  2007-06-29 11:34                 ` Alexey Kuznetsov
                                     ` (2 preceding siblings ...)
  2007-06-29 13:41                   ` Steven Rostedt
@ 2007-06-29 15:51                   ` Oleg Nesterov
  2007-06-29 16:21                     ` Alexey Kuznetsov
  3 siblings, 1 reply; 127+ messages in thread
From: Oleg Nesterov @ 2007-06-29 15:51 UTC (permalink / raw)
  To: Alexey Kuznetsov
  Cc: Ingo Molnar, Jeff Garzik, Linus Torvalds, Steven Rostedt, LKML,
	Andrew Morton, Thomas Gleixner, Christoph Hellwig, john stultz,
	Paul E. McKenney, Dipankar Sarma, David S. Miller,
	matthew.wilcox

On 06/29, Alexey Kuznetsov wrote:
>
> > Just look at the tasklet_disable() logic.
> 
> Do not count this.

A slightly off-topic question, tasklet_kill(t) doesn't try to steal
t from tasklet_head.list if t was scheduled, but waits until t completes.

If I understand correctly, this is because tasklet_head.list is protected
by local_irq_save(), and t could be scheduled on another CPU, so we just
can't steal it, yes?

If we use workqueues, we can change the semantics of tasklet_kill() so
that it really cancels an already scheduled tasklet.

The question is: would it be the wrong/good change?

Oleg.


* Re: [RFC PATCH 0/6] Convert all tasklets to workqueues
  2007-06-29 15:51                   ` Oleg Nesterov
@ 2007-06-29 16:21                     ` Alexey Kuznetsov
  2007-06-29 16:52                       ` Oleg Nesterov
  0 siblings, 1 reply; 127+ messages in thread
From: Alexey Kuznetsov @ 2007-06-29 16:21 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: Ingo Molnar, Jeff Garzik, Linus Torvalds, Steven Rostedt, LKML,
	Andrew Morton, Thomas Gleixner, Christoph Hellwig, john stultz,
	Paul E. McKenney, Dipankar Sarma, David S. Miller,
	matthew.wilcox

Hello!

> If I understand correctly, this is because tasklet_head.list is protected
> by local_irq_save(), and t could be scheduled on another CPU, so we just
> can't steal it, yes?

Yes. All that code is written to avoid synchronization as much as possible.


> If we use workqueues, we can change the semantics of tasklet_kill() so
> that it really cancels an already scheduled tasklet.
> 
> The question is: would it be the wrong/good change?

If it does not add another usec to tasklet_schedule(), it would be good.

Alexey

* Re: [RFC PATCH 0/6] Convert all tasklets to workqueues
  2007-06-29 14:01                       ` Duncan Sands
@ 2007-06-29 16:34                         ` Alexey Kuznetsov
  0 siblings, 0 replies; 127+ messages in thread
From: Alexey Kuznetsov @ 2007-06-29 16:34 UTC (permalink / raw)
  To: Duncan Sands
  Cc: Ingo Molnar, Jeff Garzik, Linus Torvalds, Steven Rostedt, LKML,
	Andrew Morton, Thomas Gleixner, Christoph Hellwig, john stultz,
	Oleg Nesterov, Paul E. McKenney, Dipankar Sarma, David S. Miller,
	matthew.wilcox

Hello!

> What changed? 

The softirq remains raised for such a tasklet. In the old days the softirq
was processed once per invocation, in schedule and on syscall exit, and this
was relatively harmless. Since softirqs are very weakly moderated now, it
results in strong cpu hogging.


> And can it be fixed?

With current tasklets this can be fixed only by introducing additional
synchronization cost to tasklet schedule. Not good. Better to fix
the caller.

If Ingo knows about this, I guess it is already fixed in -rt tasklet
or can be fixed. They are really flexible and extensible:
-rt tasklet cost is so high that even another thousand cycles
will remain unnoticed. :-)

Alexey

* Re: [RFC PATCH 0/6] Convert all tasklets to workqueues
  2007-06-29 16:21                     ` Alexey Kuznetsov
@ 2007-06-29 16:52                       ` Oleg Nesterov
  2007-06-29 17:09                         ` Oleg Nesterov
  0 siblings, 1 reply; 127+ messages in thread
From: Oleg Nesterov @ 2007-06-29 16:52 UTC (permalink / raw)
  To: Alexey Kuznetsov
  Cc: Ingo Molnar, Jeff Garzik, Linus Torvalds, Steven Rostedt, LKML,
	Andrew Morton, Thomas Gleixner, Christoph Hellwig, john stultz,
	Paul E. McKenney, Dipankar Sarma, David S. Miller,
	matthew.wilcox

On 06/29, Alexey Kuznetsov wrote:
> 
> > If I understand correctly, this is because tasklet_head.list is protected
> > by local_irq_save(), and t could be scheduled on another CPU, so we just
> > can't steal it, yes?
> 
> Yes. All that code is written to avoid synchronization as much as possible.

Thanks!

> 
> > If we use workqueues, we can change the semantics of tasklet_kill() so
> > that it really cancels an already scheduled tasklet.
> > 
> > The question is: would it be the wrong/good change?
> 
> If it does not add another usec to tasklet_schedule(), it would be good.

No, it won't slow down tasklet_schedule(). Instead it will speed up tasklet_kill().


Steven, unless you have some objections, could you change tasklet_kill() ?

> +static inline void tasklet_kill(struct tasklet_struct *t)
>  {
> -       return test_bit(TASKLET_STATE_SCHED, &t->state);
> +       flush_workqueue(ktaskletd_wq);
>  }

Just change flush_workqueue(ktaskletd_wq) to cancel_work_sync(t->work).
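
To make the difference concrete, here is a small userspace model of the
two approaches. It is a sketch only: model_work, model_flush_queue() and
model_cancel_work() are hypothetical names, and the model ignores the
"sync" part (waiting for a handler that is already running). A flush
has to run everything that is queued, while a cancel touches only the
one item and really dequeues it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a queued work item; not struct work_struct. */
struct model_work {
	bool pending;	/* queued and not yet run */
	int runs;	/* how many times the handler ran */
};

/*
 * Flush style: every pending item on the queue runs before we return.
 * Returns how many items had to run.
 */
static int model_flush_queue(struct model_work *queue, size_t n)
{
	int ran = 0;

	for (size_t i = 0; i < n; i++) {
		if (queue[i].pending) {
			queue[i].pending = false;
			queue[i].runs++;
			ran++;
		}
	}
	return ran;
}

/*
 * Cancel style: only the given item is touched, and it is dequeued
 * without running. Returns true if it was pending.
 */
static bool model_cancel_work(struct model_work *w)
{
	bool was_pending;

	assert(w != NULL);
	was_pending = w->pending;
	w->pending = false;
	return was_pending;
}
```

Cancelling one item out of three leaves the other two queued and never
runs the cancelled handler; a flush would have run all three.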

Oleg.



* Re: [RFC PATCH 0/6] Convert all tasklets to workqueues
  2007-06-29 16:52                       ` Oleg Nesterov
@ 2007-06-29 17:09                         ` Oleg Nesterov
  2007-06-30 11:25                           ` Oleg Nesterov
  0 siblings, 1 reply; 127+ messages in thread
From: Oleg Nesterov @ 2007-06-29 17:09 UTC (permalink / raw)
  To: Alexey Kuznetsov
  Cc: Ingo Molnar, Jeff Garzik, Linus Torvalds, Steven Rostedt, LKML,
	Andrew Morton, Thomas Gleixner, Christoph Hellwig, john stultz,
	Paul E. McKenney, Dipankar Sarma, David S. Miller, matthew

(the email address of Matthew Wilcox looks wrong, changed to matthew@wil.cx)

On 06/29, Oleg Nesterov wrote:
>
> Steven, unless you have some objections, could you change tasklet_kill() ?
> 
> > +static inline void tasklet_kill(struct tasklet_struct *t)
> >  {
> > -       return test_bit(TASKLET_STATE_SCHED, &t->state);
> > +       flush_workqueue(ktaskletd_wq);
> >  }
> 
> > Just change flush_workqueue(ktaskletd_wq) to cancel_work_sync(t->work).

Ugh, tasklet_disable() should be changed as well.

> @@ -84,35 +50,35 @@ static inline void tasklet_disable_nosyn
>  static inline void tasklet_disable(struct tasklet_struct *t)
>  {
>         tasklet_disable_nosync(t);
> -       tasklet_unlock_wait(t);
> -       smp_mb();
> -}
> -
> -static inline void tasklet_enable(struct tasklet_struct *t)
> -{
> -       smp_mb__before_atomic_dec();
> -       atomic_dec(&t->count);
> +       flush_workqueue(ktaskletd_wq);
> +       /* flush_workqueue should provide us a barrier */
>  }

Suppose we have the tasklets T1 and T2, both are scheduled on the
same CPU. T1 takes some spinlock LOCK.

Currently it is possible to do

	spin_lock(LOCK);
	tasklet_disable(T2);

With this patch, the above code hangs.


The simplest fix is to use wait_on_work(t->work) instead of
flush_workqueue(). Currently it is static, but we can export it.
This change will speed up tasklet_disable(), btw.
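
The T1/T2 scenario can be sketched as a tiny userspace model. All names
here (model_wait, model_work, model_would_hang()) are made up, and the
model rests on two assumptions: the caller already holds LOCK, and
waiting on a single work item only blocks if that item's handler is
running right now. Under those assumptions a flush must wait for T1,
which can never get LOCK, while waiting on T2 alone does not.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Two ways the caller can wait; both names are hypothetical. */
enum model_wait { MODEL_FLUSH_ALL, MODEL_WAIT_ON_ONE };

struct model_work {
	bool needs_lock;	/* handler takes the lock the caller holds */
	bool running;		/* handler is executing right now */
};

/*
 * Returns true if the caller, while holding the lock, would end up
 * waiting for a handler that itself needs that lock.
 */
static bool model_would_hang(enum model_wait how,
			     const struct model_work *queue, size_t n,
			     size_t target)
{
	assert(target < n);
	for (size_t i = 0; i < n; i++) {
		/* A flush waits for everything; the other only for target. */
		bool must_wait = (how == MODEL_FLUSH_ALL) ||
				 (i == target && queue[i].running);

		if (must_wait && queue[i].needs_lock)
			return true;
	}
	return false;
}
```

With queue = { T1 (needs LOCK), T2 } and target = T2, the flush variant
reports a hang and the wait-on-one variant does not.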

A better fix imho is to use cancel_work_sync() again, but this
needs some complications to preserve TASKLET_STATE_PENDING.

This in turn means that cancel_work_sync() should return "int", but
not "void". This change makes sense regardless, I'll try to make a
patch on Sunday.

Oleg.



* Re: [RFC PATCH 0/6] Convert all tasklets to workqueues
  2007-06-29 14:26                     ` Oleg Nesterov
@ 2007-06-29 19:04                       ` Alexey Kuznetsov
  0 siblings, 0 replies; 127+ messages in thread
From: Alexey Kuznetsov @ 2007-06-29 19:04 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: Steven Rostedt, Ingo Molnar, Jeff Garzik, Linus Torvalds, LKML,
	Andrew Morton, Thomas Gleixner, Christoph Hellwig, john stultz,
	Paul E. McKenney, Dipankar Sarma, David S. Miller,
	matthew.wilcox

Hello!

> Also, create_workqueue() is very costly. The last 2 lines should be
> reverted.

Indeed.

The result improves from 3988 nanoseconds to 3975. :-)
Actually, the difference is within statistical variance,
which is about 20 ns.

Alexey


* Re: [RFC PATCH 0/6] Convert all tasklets to workqueues
  2007-06-29 17:09                         ` Oleg Nesterov
@ 2007-06-30 11:25                           ` Oleg Nesterov
  0 siblings, 0 replies; 127+ messages in thread
From: Oleg Nesterov @ 2007-06-30 11:25 UTC (permalink / raw)
  To: Alexey Kuznetsov
  Cc: Ingo Molnar, Jeff Garzik, Linus Torvalds, Steven Rostedt, LKML,
	Andrew Morton, Thomas Gleixner, Christoph Hellwig, john stultz,
	Paul E. McKenney, Dipankar Sarma, David S. Miller, matthew

On 06/29, Oleg Nesterov wrote:
>
> Suppose we have the tasklets T1 and T2, both are scheduled on the
> same CPU. T1 takes some spinlock LOCK.
> 
> Currently it is possible to do
> 
> 	spin_lock(LOCK);
> 	tasklet_disable(T2);
> 
> With this patch, the above code hangs.

I am stupid. Yes, flush_workqueue() is evil and should not be used, but
if we use workqueues, tasklet_disable() becomes might_sleep() anyway.
This is an incompatible and unavoidable change.

grep, grep...

	net/bluetooth/hci_core.c:hci_rx_task()

		read_lock(&hci_task_lock)
		hci_event_packet()
			hci_num_comp_pkts_evt()
				tasklet_disable(&hdev->tx_task)
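
Why this call chain is a problem can be shown with a userspace model of
the might_sleep() check. Everything below is hypothetical (the model_*
names, the depth counter, the violation counter); it only assumes that
holding a read lock puts the caller in atomic context and that a
workqueue-based tasklet_disable() may sleep.

```c
#include <assert.h>
#include <stdbool.h>

/* > 0 while the model is inside a lock-held (atomic) section. */
static int model_atomic_depth;
/* Counts "sleeping call in atomic context" events. */
static int model_sleep_violations;

static void model_read_lock(void)
{
	model_atomic_depth++;
}

static void model_read_unlock(void)
{
	assert(model_atomic_depth > 0);
	model_atomic_depth--;
}

/* Records a violation if a possibly sleeping call runs in atomic context. */
static void model_might_sleep(void)
{
	if (model_atomic_depth > 0)
		model_sleep_violations++;
}

/* With workqueues, disabling has to wait, so it may sleep. */
static void model_tasklet_disable(void)
{
	model_might_sleep();
}

/* Mirrors the shape of the call chain above, collapsed to one level. */
static void model_hci_rx_task(void)
{
	model_read_lock();
	model_tasklet_disable();
	model_read_unlock();
}
```

Calling model_tasklet_disable() on its own is fine; calling it through
model_hci_rx_task() records one violation.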

Oleg.



end of thread (last message 2007-06-30 11:25 UTC)

Thread overview: 127+ messages
2007-06-22  4:00 [RFC PATCH 0/6] Convert all tasklets to workqueues Steven Rostedt
2007-06-22  4:00 ` [RFC PATCH 1/6] Convert the RCU tasklet into a softirq Steven Rostedt
2007-06-22  7:10   ` Christoph Hellwig
2007-06-22  7:43     ` Ingo Molnar
2007-06-22 12:35       ` Steven Rostedt
2007-06-22 12:55         ` Ingo Molnar
2007-06-22  4:00 ` [RFC PATCH 2/6] Split out tasklets from softirq.c Steven Rostedt
2007-06-22  7:11   ` Christoph Hellwig
2007-06-22 12:40     ` Steven Rostedt
2007-06-22 13:45   ` Akinobu Mita
2007-06-22 13:58     ` Steven Rostedt
2007-06-22  4:00 ` [RFC PATCH 3/6] Add a tasklet is-scheduled API Steven Rostedt
2007-06-22  4:00 ` [RFC PATCH 4/6] Make DRM use the tasklet is-sched API Steven Rostedt
2007-06-22  6:36   ` Daniel Walker
2007-06-22  6:49     ` Thomas Gleixner
2007-06-22  7:08       ` Daniel Walker
2007-06-22 12:15         ` Steven Rostedt
2007-06-22 15:36           ` Daniel Walker
2007-06-22 22:38             ` Ingo Molnar
2007-06-22 23:28               ` Daniel Walker
2007-06-22 16:10       ` Arnd Bergmann
2007-06-22 16:56         ` Steven Rostedt
2007-06-22 18:24         ` Christoph Hellwig
2007-06-22 23:38           ` Dave Airlie
2007-06-22  4:00 ` [RFC PATCH 5/6] Move tasklet.h to tasklet_softirq.h Steven Rostedt
2007-06-22  4:00 ` [RFC PATCH 6/6] Convert tasklets to work queues Steven Rostedt
2007-06-22  7:06   ` Daniel Walker
2007-06-22 13:29     ` Steven Rostedt
2007-06-22 15:52       ` Oleg Nesterov
2007-06-22 16:35         ` Steven Rostedt
2007-06-23 11:15   ` Arnd Bergmann
2007-06-22  7:09 ` [RFC PATCH 0/6] Convert all tasklets to workqueues Christoph Hellwig
2007-06-22  7:51   ` Ingo Molnar
2007-06-22  7:53     ` Christoph Hellwig
2007-06-22 11:23       ` Ingo Molnar
2007-06-22 12:32   ` Steven Rostedt
2007-06-22 12:38     ` Ingo Molnar
2007-06-22 12:58       ` Steven Rostedt
2007-06-22 13:12         ` Ingo Molnar
2007-06-22 14:27           ` Steven Rostedt
2007-06-22 13:13         ` Andrew Morton
2007-06-22 13:26           ` Ingo Molnar
2007-06-22 13:41             ` Andrew Morton
2007-06-22 14:00               ` Ingo Molnar
2007-06-22 13:35           ` Steven Rostedt
2007-06-22 14:25 ` Arjan van de Ven
2007-06-22 14:42   ` Steven Rostedt
2007-06-22 14:43     ` Arjan van de Ven
2007-06-22 17:16 ` Linus Torvalds
2007-06-22 17:31   ` Steven Rostedt
2007-06-22 18:32   ` Christoph Hellwig
2007-06-22 20:40   ` Ingo Molnar
2007-06-22 21:00     ` Christoph Hellwig
2007-06-22 21:10       ` Ingo Molnar
2007-06-22 21:13       ` Thomas Gleixner
2007-06-22 21:37     ` Linus Torvalds
2007-06-22 21:59       ` Ingo Molnar
2007-06-22 22:09         ` Ingo Molnar
2007-06-22 22:43           ` Roland Dreier
2007-06-22 22:57             ` Alan Cox
2007-06-22 22:58         ` Steven Rostedt
2007-06-23  6:23         ` Dave Airlie
2007-06-24 15:16         ` Jonathan Corbet
2007-06-24 15:52           ` Steven Rostedt
2007-06-25 16:50           ` Tilman Schmidt
2007-06-25 17:06             ` Steven Rostedt
2007-06-25 20:50               ` Tilman Schmidt
2007-06-25 21:03                 ` Steven Rostedt
2007-06-25 19:52             ` Stephen Hemminger
2007-06-26  0:00           ` Jonathan Corbet
2007-06-26  0:52             ` Steven Rostedt
2007-06-25 18:48         ` Kristian Høgsberg
2007-06-25 19:11           ` Steven Rostedt
2007-06-25 20:07             ` Kristian Høgsberg
2007-06-25 20:31               ` Steven Rostedt
2007-06-25 21:08                 ` Kristian Høgsberg
2007-06-25 21:15           ` Ingo Molnar
2007-06-25 23:36             ` Stefan Richter
2007-06-26  0:46               ` Steven Rostedt
2007-06-26  1:46         ` Dan Williams
2007-06-26  2:01           ` Steven Rostedt
2007-06-26  2:12             ` Dan Williams
2007-06-28 12:37               ` Steven Rostedt
2007-06-28 16:37                 ` Oleg Nesterov
2007-06-28 18:02                 ` Dan Williams
2007-06-28 20:46                   ` Steven Rostedt
2007-06-28 21:23                     ` Dan Williams
2007-06-28 21:40                       ` Dan Williams
2007-06-28 22:01                         ` Steven Rostedt
2007-06-28 22:00                       ` Steven Rostedt
2007-06-28  5:48         ` Jeff Garzik
2007-06-28  9:23           ` Ingo Molnar
2007-06-28 14:38             ` Alexey Kuznetsov
2007-06-28 15:23               ` Jeff Garzik
2007-06-28 15:54               ` Steven Rostedt
2007-06-28 16:00               ` Ingo Molnar
2007-06-28 17:26                 ` Jeff Garzik
2007-06-28 17:44                 ` Jeff Garzik
2007-06-28 18:19                 ` Andrew Morton
2007-06-28 20:07                   ` Ingo Molnar
2007-06-29 11:34                 ` Alexey Kuznetsov
2007-06-29 11:48                   ` Duncan Sands
2007-06-29 13:36                     ` Alexey Kuznetsov
2007-06-29 14:01                       ` Duncan Sands
2007-06-29 16:34                         ` Alexey Kuznetsov
2007-06-29 12:29                   ` Ingo Molnar
2007-06-29 13:25                     ` Alexey Kuznetsov
2007-06-29 13:43                       ` Ingo Molnar
2007-06-29 15:23                         ` Alexey Kuznetsov
2007-06-29 13:41                   ` Steven Rostedt
2007-06-29 14:24                     ` Jeff Garzik
2007-06-29 14:26                     ` Oleg Nesterov
2007-06-29 19:04                       ` Alexey Kuznetsov
2007-06-29 14:27                     ` Alexey Kuznetsov
2007-06-29 15:51                   ` Oleg Nesterov
2007-06-29 16:21                     ` Alexey Kuznetsov
2007-06-29 16:52                       ` Oleg Nesterov
2007-06-29 17:09                         ` Oleg Nesterov
2007-06-30 11:25                           ` Oleg Nesterov
2007-06-28 15:17             ` Jeff Garzik
2007-06-22 21:53     ` Daniel Walker
2007-06-22 22:09       ` david
2007-06-22 22:15         ` Daniel Walker
2007-06-22 22:44           ` Ingo Molnar
2007-06-22 23:28             ` Daniel Walker
2007-06-22 22:15       ` Ingo Molnar
2007-06-23  5:14 ` Stephen Hemminger
