All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 00/19] tracing/workqueue: events support/enhancements, worklets tracing, fixes
@ 2009-04-30  0:27 Frederic Weisbecker
  2009-04-30  0:27 ` [PATCH 01/19] ftrace, workqueuetrace: make workqueue tracepoints use TRACE_EVENT macro Frederic Weisbecker
                   ` (19 more replies)
  0 siblings, 20 replies; 23+ messages in thread
From: Frederic Weisbecker @ 2009-04-30  0:27 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: LKML, Li Zefan, Frederic Weisbecker, Zhao Lei, Steven Rostedt,
	Tom Zanussi, Oleg Nesterov, Andrew Morton, KOSAKI Motohiro

Hi,

This patchset brings various features for the workqueue tracer.

The support of events tracing (tracepoint expanding):
	- workqueue_creation
	- workqueue_flush
	- workqueue_destruction

    Also for the worklets:
	- worklet_enqueue
	- worklet_enqueue_delayed
	- worklet_execute
	- worklet_complete
	- worklet_cancel

Now, most of the lifecycle of worklets and workqueues is available.

The histogram tracing now also supports the worklets. You can
measure the max and average execution time of a worklet.

A good bunch of Oleg's comments have been addressed, at least the
critical ones (races which can cause freed pointer dereferences, memory
leak, etc...). The rest is more about the design itself, such as using
the workqueue as the identifier instead of the thread and other things
that are in discussion.

Thanks!

Frederic.

The following changes since commit a0e39ed378fb6ba916522764cd508fa7d42ad495:
  Heiko Carstens (1):
        tracing: fix build failure on s390

are available in the git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/frederic/random-tracing.git tracing/workqueue

Frederic Weisbecker (6):
      tracing/workqueue: turn cpu_workqueue_stats::inserted into unsigned int
      tracing/workqueue: add execution time average column
      tracing/workqueue: add open/release file handlers
      tracing/workqueue: defer workqueue stat release if needed
      tracing/workqueue: use the original cpu affinity on probe_workqueue_destruction
      tracing/workqueue: provide documentation for the workqueue tracer

KOSAKI Motohiro (2):
      workqueue_tracepoint: introduce workqueue_handler_exit tracepoint and rename workqueue_execution to workqueue_handler_entry
      tracing/workqueue: add max execution time mesurement for per worklet

Zhaolei (11):
      ftrace, workqueuetrace: make workqueue tracepoints use TRACE_EVENT macro
      trace_workqueue: use list_for_each_entry() instead of list_for_each_entry_safe()
      trace_workqueue: remove cpu_workqueue_stats->first_entry
      trace_workqueue: remove blank line between each cpu
      trace_workqueue: add worklet information
      workqueue_tracepoint: add workqueue_flush and worklet_cancel tracepoint
      workqueue_tracepoint: change tracepoint name to fit worklet and workqueue lifecycle
      workqueue_trace: separate worklet_insertion into worklet_enqueue and worklet_enqueue_delayed
      tracing/workqueue: turn workfunc_stats::inserted into unsigned int
      tracing/workqueue: avoid accessing task_struct's member variable in stat file read
      tracing/workqueue: use list_for_each_entry instead of list_for_each_entry_safe in probe_workqueue_destruction()

 Documentation/trace/workqueue.txt |  114 ++++++++++
 include/trace/events/workqueue.h  |  200 +++++++++++++++++
 include/trace/workqueue.h         |   25 --
 kernel/trace/trace_stat.c         |    8 +
 kernel/trace/trace_stat.h         |    3 +
 kernel/trace/trace_workqueue.c    |  444 +++++++++++++++++++++++++++++--------
 kernel/workqueue.c                |   45 +++--
 7 files changed, 697 insertions(+), 142 deletions(-)
 create mode 100644 Documentation/trace/workqueue.txt
 create mode 100644 include/trace/events/workqueue.h
 delete mode 100644 include/trace/workqueue.h

^ permalink raw reply	[flat|nested] 23+ messages in thread

* [PATCH 01/19] ftrace, workqueuetrace: make workqueue tracepoints use TRACE_EVENT macro
  2009-04-30  0:27 [PATCH 00/19] tracing/workqueue: events support/enhancements, worklets tracing, fixes Frederic Weisbecker
@ 2009-04-30  0:27 ` Frederic Weisbecker
  2009-04-30  0:27 ` [PATCH 02/19] trace_workqueue: use list_for_each_entry() instead of list_for_each_entry_safe() Frederic Weisbecker
                   ` (18 subsequent siblings)
  19 siblings, 0 replies; 23+ messages in thread
From: Frederic Weisbecker @ 2009-04-30  0:27 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: LKML, Li Zefan, Zhaolei, Steven Rostedt, Tom Zanussi,
	Oleg Nesterov, Andrew Morton, KOSAKI Motohiro,
	Frederic Weisbecker

From: Zhaolei <zhaolei@cn.fujitsu.com>

v3: zhaolei@cn.fujitsu.com: Change TRACE_EVENT definition to new format
    introduced by Steven Rostedt: consolidate trace and trace_event headers
v2: kosaki@jp.fujitsu.com: print the function names instead of addr, and zap
    the work addr
v1: zhaolei@cn.fujitsu.com: Make workqueue tracepoints use TRACE_EVENT macro

TRACE_EVENT is a more generic way to define tracepoints.
Doing so adds these new capabilities to the tracepoints:

  - zero-copy and per-cpu splice() tracing
  - binary tracing without printf overhead
  - structured logging records exposed under /debug/tracing/events
  - trace events embedded in function tracer output and other plugins
  - user-defined, per tracepoint filter expressions

Then, this patch converts DEFINE_TRACE to TRACE_EVENT in workqueue related
tracepoints.

[ Impact: expand workqueue tracer to events tracing ]

Signed-off-by: Zhao Lei <zhaolei@cn.fujitsu.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Tom Zanussi <tzanussi@gmail.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
---
 include/trace/events/workqueue.h |  100 ++++++++++++++++++++++++++++++++++++++
 include/trace/workqueue.h        |   25 ---------
 kernel/trace/trace_workqueue.c   |    2 +-
 kernel/workqueue.c               |   11 +---
 4 files changed, 103 insertions(+), 35 deletions(-)

diff --git a/include/trace/events/workqueue.h b/include/trace/events/workqueue.h
new file mode 100644
index 0000000..035f1bf
--- /dev/null
+++ b/include/trace/events/workqueue.h
@@ -0,0 +1,100 @@
+#if !defined(_TRACE_WORKQUEUE_H) || defined(TRACE_HEADER_MULTI_READ)
+#define _TRACE_WORKQUEUE_H
+
+#include <linux/workqueue.h>
+#include <linux/sched.h>
+#include <linux/tracepoint.h>
+
+#undef TRACE_SYSTEM
+#define TRACE_SYSTEM workqueue
+
+TRACE_EVENT(workqueue_insertion,
+
+	TP_PROTO(struct task_struct *wq_thread, struct work_struct *work),
+
+	TP_ARGS(wq_thread, work),
+
+	TP_STRUCT__entry(
+		__array(char,		thread_comm,	TASK_COMM_LEN)
+		__field(pid_t,		thread_pid)
+		__field(work_func_t,	func)
+	),
+
+	TP_fast_assign(
+		memcpy(__entry->thread_comm, wq_thread->comm, TASK_COMM_LEN);
+		__entry->thread_pid	= wq_thread->pid;
+		__entry->func		= work->func;
+	),
+
+	TP_printk("thread=%s:%d func=%pF", __entry->thread_comm,
+		__entry->thread_pid, __entry->func)
+);
+
+TRACE_EVENT(workqueue_execution,
+
+	TP_PROTO(struct task_struct *wq_thread, struct work_struct *work),
+
+	TP_ARGS(wq_thread, work),
+
+	TP_STRUCT__entry(
+		__array(char,		thread_comm,	TASK_COMM_LEN)
+		__field(pid_t,		thread_pid)
+		__field(work_func_t,	func)
+	),
+
+	TP_fast_assign(
+		memcpy(__entry->thread_comm, wq_thread->comm, TASK_COMM_LEN);
+		__entry->thread_pid	= wq_thread->pid;
+		__entry->func		= work->func;
+	),
+
+	TP_printk("thread=%s:%d func=%pF", __entry->thread_comm,
+		__entry->thread_pid, __entry->func)
+);
+
+/* Trace the creation of one workqueue thread on a cpu */
+TRACE_EVENT(workqueue_creation,
+
+	TP_PROTO(struct task_struct *wq_thread, int cpu),
+
+	TP_ARGS(wq_thread, cpu),
+
+	TP_STRUCT__entry(
+		__array(char,	thread_comm,	TASK_COMM_LEN)
+		__field(pid_t,	thread_pid)
+		__field(int,	cpu)
+	),
+
+	TP_fast_assign(
+		memcpy(__entry->thread_comm, wq_thread->comm, TASK_COMM_LEN);
+		__entry->thread_pid	= wq_thread->pid;
+		__entry->cpu		= cpu;
+	),
+
+	TP_printk("thread=%s:%d cpu=%d", __entry->thread_comm,
+		__entry->thread_pid, __entry->cpu)
+);
+
+TRACE_EVENT(workqueue_destruction,
+
+	TP_PROTO(struct task_struct *wq_thread),
+
+	TP_ARGS(wq_thread),
+
+	TP_STRUCT__entry(
+		__array(char,	thread_comm,	TASK_COMM_LEN)
+		__field(pid_t,	thread_pid)
+	),
+
+	TP_fast_assign(
+		memcpy(__entry->thread_comm, wq_thread->comm, TASK_COMM_LEN);
+		__entry->thread_pid	= wq_thread->pid;
+	),
+
+	TP_printk("thread=%s:%d", __entry->thread_comm, __entry->thread_pid)
+);
+
+#endif /* _TRACE_WORKQUEUE_H */
+
+/* This part must be outside protection */
+#include <trace/define_trace.h>
diff --git a/include/trace/workqueue.h b/include/trace/workqueue.h
deleted file mode 100644
index 7626523..0000000
--- a/include/trace/workqueue.h
+++ /dev/null
@@ -1,25 +0,0 @@
-#ifndef __TRACE_WORKQUEUE_H
-#define __TRACE_WORKQUEUE_H
-
-#include <linux/tracepoint.h>
-#include <linux/workqueue.h>
-#include <linux/sched.h>
-
-DECLARE_TRACE(workqueue_insertion,
-	   TP_PROTO(struct task_struct *wq_thread, struct work_struct *work),
-	   TP_ARGS(wq_thread, work));
-
-DECLARE_TRACE(workqueue_execution,
-	   TP_PROTO(struct task_struct *wq_thread, struct work_struct *work),
-	   TP_ARGS(wq_thread, work));
-
-/* Trace the creation of one workqueue thread on a cpu */
-DECLARE_TRACE(workqueue_creation,
-	   TP_PROTO(struct task_struct *wq_thread, int cpu),
-	   TP_ARGS(wq_thread, cpu));
-
-DECLARE_TRACE(workqueue_destruction,
-	   TP_PROTO(struct task_struct *wq_thread),
-	   TP_ARGS(wq_thread));
-
-#endif /* __TRACE_WORKQUEUE_H */
diff --git a/kernel/trace/trace_workqueue.c b/kernel/trace/trace_workqueue.c
index 984b917..cfe56d3 100644
--- a/kernel/trace/trace_workqueue.c
+++ b/kernel/trace/trace_workqueue.c
@@ -6,7 +6,7 @@
  */
 
 
-#include <trace/workqueue.h>
+#include <trace/events/workqueue.h>
 #include <linux/list.h>
 #include <linux/percpu.h>
 #include "trace_stat.h"
diff --git a/kernel/workqueue.c b/kernel/workqueue.c
index f71fb2a..0668795 100644
--- a/kernel/workqueue.c
+++ b/kernel/workqueue.c
@@ -33,7 +33,8 @@
 #include <linux/kallsyms.h>
 #include <linux/debug_locks.h>
 #include <linux/lockdep.h>
-#include <trace/workqueue.h>
+#define CREATE_TRACE_POINTS
+#include <trace/events/workqueue.h>
 
 /*
  * The per-CPU workqueue (if single thread, we always use the first
@@ -124,8 +125,6 @@ struct cpu_workqueue_struct *get_wq_data(struct work_struct *work)
 	return (void *) (atomic_long_read(&work->data) & WORK_STRUCT_WQ_DATA_MASK);
 }
 
-DEFINE_TRACE(workqueue_insertion);
-
 static void insert_work(struct cpu_workqueue_struct *cwq,
 			struct work_struct *work, struct list_head *head)
 {
@@ -262,8 +261,6 @@ int queue_delayed_work_on(int cpu, struct workqueue_struct *wq,
 }
 EXPORT_SYMBOL_GPL(queue_delayed_work_on);
 
-DEFINE_TRACE(workqueue_execution);
-
 static void run_workqueue(struct cpu_workqueue_struct *cwq)
 {
 	spin_lock_irq(&cwq->lock);
@@ -753,8 +750,6 @@ init_cpu_workqueue(struct workqueue_struct *wq, int cpu)
 	return cwq;
 }
 
-DEFINE_TRACE(workqueue_creation);
-
 static int create_workqueue_thread(struct cpu_workqueue_struct *cwq, int cpu)
 {
 	struct sched_param param = { .sched_priority = MAX_RT_PRIO-1 };
@@ -860,8 +855,6 @@ struct workqueue_struct *__create_workqueue_key(const char *name,
 }
 EXPORT_SYMBOL_GPL(__create_workqueue_key);
 
-DEFINE_TRACE(workqueue_destruction);
-
 static void cleanup_workqueue_thread(struct cpu_workqueue_struct *cwq)
 {
 	/*
-- 
1.6.2.3


^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH 02/19] trace_workqueue: use list_for_each_entry() instead of list_for_each_entry_safe()
  2009-04-30  0:27 [PATCH 00/19] tracing/workqueue: events support/enhancements, worklets tracing, fixes Frederic Weisbecker
  2009-04-30  0:27 ` [PATCH 01/19] ftrace, workqueuetrace: make workqueue tracepoints use TRACE_EVENT macro Frederic Weisbecker
@ 2009-04-30  0:27 ` Frederic Weisbecker
  2009-04-30  0:27 ` [PATCH 03/19] trace_workqueue: remove cpu_workqueue_stats->first_entry Frederic Weisbecker
                   ` (17 subsequent siblings)
  19 siblings, 0 replies; 23+ messages in thread
From: Frederic Weisbecker @ 2009-04-30  0:27 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: LKML, Li Zefan, Zhaolei, Steven Rostedt, Tom Zanussi,
	Oleg Nesterov, Andrew Morton, Frederic Weisbecker

From: Zhaolei <zhaolei@cn.fujitsu.com>

No need to use list_for_each_entry_safe() in iteration without deleting
any node, we can use list_for_each_entry() instead.

[ Impact: cleanup ]

Signed-off-by: Zhao Lei <zhaolei@cn.fujitsu.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Tom Zanussi <tzanussi@gmail.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
---
 kernel/trace/trace_workqueue.c |   10 ++++------
 1 files changed, 4 insertions(+), 6 deletions(-)

diff --git a/kernel/trace/trace_workqueue.c b/kernel/trace/trace_workqueue.c
index cfe56d3..128b64b 100644
--- a/kernel/trace/trace_workqueue.c
+++ b/kernel/trace/trace_workqueue.c
@@ -47,12 +47,11 @@ probe_workqueue_insertion(struct task_struct *wq_thread,
 			  struct work_struct *work)
 {
 	int cpu = cpumask_first(&wq_thread->cpus_allowed);
-	struct cpu_workqueue_stats *node, *next;
+	struct cpu_workqueue_stats *node;
 	unsigned long flags;
 
 	spin_lock_irqsave(&workqueue_cpu_stat(cpu)->lock, flags);
-	list_for_each_entry_safe(node, next, &workqueue_cpu_stat(cpu)->list,
-							list) {
+	list_for_each_entry(node, &workqueue_cpu_stat(cpu)->list, list) {
 		if (node->pid == wq_thread->pid) {
 			atomic_inc(&node->inserted);
 			goto found;
@@ -69,12 +68,11 @@ probe_workqueue_execution(struct task_struct *wq_thread,
 			  struct work_struct *work)
 {
 	int cpu = cpumask_first(&wq_thread->cpus_allowed);
-	struct cpu_workqueue_stats *node, *next;
+	struct cpu_workqueue_stats *node;
 	unsigned long flags;
 
 	spin_lock_irqsave(&workqueue_cpu_stat(cpu)->lock, flags);
-	list_for_each_entry_safe(node, next, &workqueue_cpu_stat(cpu)->list,
-							list) {
+	list_for_each_entry(node, &workqueue_cpu_stat(cpu)->list, list) {
 		if (node->pid == wq_thread->pid) {
 			node->executed++;
 			goto found;
-- 
1.6.2.3


^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH 03/19] trace_workqueue: remove cpu_workqueue_stats->first_entry
  2009-04-30  0:27 [PATCH 00/19] tracing/workqueue: events support/enhancements, worklets tracing, fixes Frederic Weisbecker
  2009-04-30  0:27 ` [PATCH 01/19] ftrace, workqueuetrace: make workqueue tracepoints use TRACE_EVENT macro Frederic Weisbecker
  2009-04-30  0:27 ` [PATCH 02/19] trace_workqueue: use list_for_each_entry() instead of list_for_each_entry_safe() Frederic Weisbecker
@ 2009-04-30  0:27 ` Frederic Weisbecker
  2009-04-30  0:27 ` [PATCH 04/19] trace_workqueue: remove blank line between each cpu Frederic Weisbecker
                   ` (16 subsequent siblings)
  19 siblings, 0 replies; 23+ messages in thread
From: Frederic Weisbecker @ 2009-04-30  0:27 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: LKML, Li Zefan, Zhaolei, Steven Rostedt, Tom Zanussi,
	Oleg Nesterov, Andrew Morton, Frederic Weisbecker

From: Zhaolei <zhaolei@cn.fujitsu.com>

cpu_workqueue_stats->first_entry is useless because we can retrieve the
header of a cpu workqueue using:
if (&cpu_workqueue_stats->list == workqueue_cpu_stat(cpu)->list.next)

[ Impact: cleanup ]

Signed-off-by: Zhao Lei <zhaolei@cn.fujitsu.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Tom Zanussi <tzanussi@gmail.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
---
 kernel/trace/trace_workqueue.c |    4 ----
 1 files changed, 0 insertions(+), 4 deletions(-)

diff --git a/kernel/trace/trace_workqueue.c b/kernel/trace/trace_workqueue.c
index 128b64b..890974a 100644
--- a/kernel/trace/trace_workqueue.c
+++ b/kernel/trace/trace_workqueue.c
@@ -16,8 +16,6 @@
 /* A cpu workqueue thread */
 struct cpu_workqueue_stats {
 	struct list_head            list;
-/* Useful to know if we print the cpu headers */
-	bool		            first_entry;
 	int		            cpu;
 	pid_t			    pid;
 /* Can be inserted from interrupt or user context, need to be atomic */
@@ -103,8 +101,6 @@ static void probe_workqueue_creation(struct task_struct *wq_thread, int cpu)
 	cws->pid = wq_thread->pid;
 
 	spin_lock_irqsave(&workqueue_cpu_stat(cpu)->lock, flags);
-	if (list_empty(&workqueue_cpu_stat(cpu)->list))
-		cws->first_entry = true;
 	list_add_tail(&cws->list, &workqueue_cpu_stat(cpu)->list);
 	spin_unlock_irqrestore(&workqueue_cpu_stat(cpu)->lock, flags);
 }
-- 
1.6.2.3


^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH 04/19] trace_workqueue: remove blank line between each cpu
  2009-04-30  0:27 [PATCH 00/19] tracing/workqueue: events support/enhancements, worklets tracing, fixes Frederic Weisbecker
                   ` (2 preceding siblings ...)
  2009-04-30  0:27 ` [PATCH 03/19] trace_workqueue: remove cpu_workqueue_stats->first_entry Frederic Weisbecker
@ 2009-04-30  0:27 ` Frederic Weisbecker
  2009-04-30  0:27 ` [PATCH 05/19] trace_workqueue: add worklet information Frederic Weisbecker
                   ` (15 subsequent siblings)
  19 siblings, 0 replies; 23+ messages in thread
From: Frederic Weisbecker @ 2009-04-30  0:27 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: LKML, Li Zefan, Zhaolei, KOSAKI Motohiro, Steven Rostedt,
	Tom Zanussi, Oleg Nesterov, Andrew Morton, Frederic Weisbecker

From: Zhaolei <zhaolei@cn.fujitsu.com>

The blankline between each cpu's workqueue stat is not necessary, because
the cpu number is enough to part them by eye.
Old style also caused a blankline below headline, and made code complex
by using lock, disableirq and get cpu var.

Old style:
 # CPU  INSERTED  EXECUTED   NAME
 # |      |         |          |

   0   8644       8644       events/0
   0      0          0       cpuset
   ...
   0      1          1       kdmflush

   1  35365      35365       events/1
   ...

New style:
 # CPU  INSERTED  EXECUTED   NAME
 # |      |         |          |

   0   8644       8644       events/0
   0      0          0       cpuset
   ...
   0      1          1       kdmflush
   1  35365      35365       events/1
   ...

[ Impact: provide more readable code ]

Signed-off-by: Zhao Lei <zhaolei@cn.fujitsu.com>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Tom Zanussi <tzanussi@gmail.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
---
 kernel/trace/trace_workqueue.c |    7 -------
 1 files changed, 0 insertions(+), 7 deletions(-)

diff --git a/kernel/trace/trace_workqueue.c b/kernel/trace/trace_workqueue.c
index 890974a..97fcea4 100644
--- a/kernel/trace/trace_workqueue.c
+++ b/kernel/trace/trace_workqueue.c
@@ -185,16 +185,9 @@ static void *workqueue_stat_next(void *prev, int idx)
 static int workqueue_stat_show(struct seq_file *s, void *p)
 {
 	struct cpu_workqueue_stats *cws = p;
-	unsigned long flags;
-	int cpu = cws->cpu;
 	struct pid *pid;
 	struct task_struct *tsk;
 
-	spin_lock_irqsave(&workqueue_cpu_stat(cpu)->lock, flags);
-	if (&cws->list == workqueue_cpu_stat(cpu)->list.next)
-		seq_printf(s, "\n");
-	spin_unlock_irqrestore(&workqueue_cpu_stat(cpu)->lock, flags);
-
 	pid = find_get_pid(cws->pid);
 	if (pid) {
 		tsk = get_pid_task(pid, PIDTYPE_PID);
-- 
1.6.2.3


^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH 05/19] trace_workqueue: add worklet information
  2009-04-30  0:27 [PATCH 00/19] tracing/workqueue: events support/enhancements, worklets tracing, fixes Frederic Weisbecker
                   ` (3 preceding siblings ...)
  2009-04-30  0:27 ` [PATCH 04/19] trace_workqueue: remove blank line between each cpu Frederic Weisbecker
@ 2009-04-30  0:27 ` Frederic Weisbecker
  2009-04-30  0:27 ` [PATCH 06/19] tracing/workqueue: turn cpu_workqueue_stats::inserted into unsigned int Frederic Weisbecker
                   ` (14 subsequent siblings)
  19 siblings, 0 replies; 23+ messages in thread
From: Frederic Weisbecker @ 2009-04-30  0:27 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: LKML, Li Zefan, Zhaolei, KOSAKI Motohiro, Steven Rostedt,
	Tom Zanussi, Oleg Nesterov, Andrew Morton, Frederic Weisbecker

From: Zhaolei <zhaolei@cn.fujitsu.com>

Now workqueue tracer can display per-worklet information as following:

 # cat [debugfs]/tracing/trace_stat/workqueues
 # CPU  INSERTED  EXECUTED   TASKNAME:PID
 # |      |         |        `-WORKFUNC
 # |      |         |          |
 CPU  INSERTED  EXECUTED   TASKNAME:PID
   0    157        157       events/0:5
   0      1          1        |-reg_todo+0x0/0x648
   0      3          3        |-check_corruption+0x0/0x2c
   0      1          1        |-power_supply_changed_work+0x0/0x3c
   0      2          2        |-console_callback+0x0/0x145
   0      9          9        |-do_cache_clean+0x0/0x6a
   0      1          1        |-wq_barrier_func+0x0/0x12
   0    129        129        |-vmstat_update+0x0/0x33
   0      8          8        |-mce_work_fn+0x0/0x31
   0      1          1        |-grace_ender+0x0/0x14
   0      2          2        `-rt_worker_func+0x0/0x2de
   0      0          0       cpuset:6
   0    656        656       khelper:7
   0    656        656        `-__call_usermodehelper+0x0/0xa0
   0      0          0       netns:11
   0    592        592       kblockd/0:13
   0     39         39        |-blk_unplug_work+0x0/0x1e
   0    553        553        `-cfq_kick_queue+0x0/0x2f
   0      0          0       kacpid:412
   0      0          0       kacpi_notify:415
   0      0          0       ata/0:463
   0      0          0       ata_aux:509
   0      0          0       ksuspend_usbd:515
   0      0          0       aio/0:717
   0      0          0       nfsiod:718
   0      0          0       crypto/0:719
   0      0          0       kpsmoused:1141
   0      0          0       kstriped:1248
   0      0          0       kondemand/0:1259
   0      0          0       usbhid_resumer:1260
   0     21         21       rpciod/0:1333
   0      1          1        |-xs_tcp_connect_worker4+0x0/0x1b
   0     19         19        |-xs_udp_connect_worker4+0x0/0x130
   0      1          1        `-xprt_autoclose+0x0/0x2c

[ Impact: extend workqueue tracing to worklets ]

Signed-off-by: Zhao Lei <zhaolei@cn.fujitsu.com>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Tom Zanussi <tzanussi@gmail.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
---
 kernel/trace/trace_workqueue.c |  219 ++++++++++++++++++++++++++++++++--------
 1 files changed, 175 insertions(+), 44 deletions(-)

diff --git a/kernel/trace/trace_workqueue.c b/kernel/trace/trace_workqueue.c
index 97fcea4..1a68b5a 100644
--- a/kernel/trace/trace_workqueue.c
+++ b/kernel/trace/trace_workqueue.c
@@ -12,19 +12,33 @@
 #include "trace_stat.h"
 #include "trace.h"
 
+/* A callback function saved in struct work_struct->func */
+struct cpu_workqueue_stats;
+struct workfunc_stats {
+	struct list_head		list;
+	struct cpu_workqueue_stats	*parent;
+
+	/* Addr of worklet's callback function */
+	work_func_t			func;
+	/* Can be inserted from interrupt or user context, need to be atomic */
+	atomic_t			inserted;
+	/*
+	 * Don't need to be atomic, works are serialized in a single workqueue
+	 * thread on a single CPU.
+	 */
+	unsigned int			executed;
+};
 
 /* A cpu workqueue thread */
 struct cpu_workqueue_stats {
 	struct list_head            list;
 	int		            cpu;
 	pid_t			    pid;
-/* Can be inserted from interrupt or user context, need to be atomic */
+	/* Refer to comments of struct workfunc_stats for detail */
 	atomic_t	            inserted;
-/*
- *  Don't need to be atomic, works are serialized in a single workqueue thread
- *  on a single CPU.
- */
 	unsigned int		    executed;
+	/* list of struct workfunc_stats in this workqueue */
+	struct list_head            workfunclist;
 };
 
 /* List of workqueue threads on one cpu */
@@ -39,6 +53,38 @@ struct workqueue_global_stats {
 static DEFINE_PER_CPU(struct workqueue_global_stats, all_workqueue_stat);
 #define workqueue_cpu_stat(cpu) (&per_cpu(all_workqueue_stat, cpu))
 
+/*
+ * Update record when insert a work into workqueue
+ * Caller need to hold cpu_workqueue_stats spin_lock
+ */
+int do_workqueue_insertion(struct cpu_workqueue_stats *cws,
+			   struct work_struct *work)
+{
+	struct workfunc_stats *wfstat;
+
+	atomic_inc(&cws->inserted);
+
+	list_for_each_entry(wfstat, &cws->workfunclist, list)
+		if (wfstat->func == work->func)
+			goto found;
+
+	/* not found, first insert */
+
+	/* work can be inserted from interrupt */
+	wfstat = kzalloc(sizeof(struct workfunc_stats), GFP_ATOMIC);
+	if (!wfstat)
+		return -ENOMEM;
+
+	wfstat->parent = cws;
+	wfstat->func = work->func;
+
+	list_add_tail(&wfstat->list, &cws->workfunclist);
+found:
+	atomic_inc(&wfstat->inserted);
+
+	return 0;
+}
+
 /* Insertion of a work */
 static void
 probe_workqueue_insertion(struct task_struct *wq_thread,
@@ -51,7 +97,8 @@ probe_workqueue_insertion(struct task_struct *wq_thread,
 	spin_lock_irqsave(&workqueue_cpu_stat(cpu)->lock, flags);
 	list_for_each_entry(node, &workqueue_cpu_stat(cpu)->list, list) {
 		if (node->pid == wq_thread->pid) {
-			atomic_inc(&node->inserted);
+			/* we ignore error of do_workqueue_insertion */
+			do_workqueue_insertion(node, work);
 			goto found;
 		}
 	}
@@ -67,17 +114,30 @@ probe_workqueue_execution(struct task_struct *wq_thread,
 {
 	int cpu = cpumask_first(&wq_thread->cpus_allowed);
 	struct cpu_workqueue_stats *node;
+	struct workfunc_stats *wfnode;
 	unsigned long flags;
 
 	spin_lock_irqsave(&workqueue_cpu_stat(cpu)->lock, flags);
-	list_for_each_entry(node, &workqueue_cpu_stat(cpu)->list, list) {
+
+	list_for_each_entry(node, &workqueue_cpu_stat(cpu)->list, list)
 		if (node->pid == wq_thread->pid) {
 			node->executed++;
-			goto found;
+			goto found_wq;
 		}
-	}
-	pr_debug("trace_workqueue: entry not found\n");
-found:
+	pr_debug("trace_workqueue: workqueue not found\n");
+	goto end;
+
+found_wq:
+	list_for_each_entry(wfnode, &node->workfunclist, list)
+		if (wfnode->func == work->func) {
+			wfnode->executed++;
+			goto found_wf;
+		}
+	pr_debug("trace_workqueue: worklet not found\n");
+	goto end;
+
+found_wf:
+end:
 	spin_unlock_irqrestore(&workqueue_cpu_stat(cpu)->lock, flags);
 }
 
@@ -85,6 +145,7 @@ found:
 static void probe_workqueue_creation(struct task_struct *wq_thread, int cpu)
 {
 	struct cpu_workqueue_stats *cws;
+	struct workfunc_stats *wfstat;
 	unsigned long flags;
 
 	WARN_ON(cpu < 0);
@@ -93,16 +154,37 @@ static void probe_workqueue_creation(struct task_struct *wq_thread, int cpu)
 	cws = kzalloc(sizeof(struct cpu_workqueue_stats), GFP_ATOMIC);
 	if (!cws) {
 		pr_warning("trace_workqueue: not enough memory\n");
-		return;
+		goto err_alloc_cws;
 	}
 	INIT_LIST_HEAD(&cws->list);
 	cws->cpu = cpu;
 
 	cws->pid = wq_thread->pid;
+	INIT_LIST_HEAD(&cws->workfunclist);
+
+	/*
+	 * add a dummy node to cpu_workqueue_stats->workfunclist to:
+	 * 1: help print function to print header of work(ex, workqueue info)
+	 * 2: used as a seq iter if no work exist in workqueue
+	 */
+	wfstat = kzalloc(sizeof(struct workfunc_stats), GFP_ATOMIC);
+	if (!wfstat) {
+		pr_warning("trace_workqueue: not enough memory\n");
+		goto err_alloc_wfstat;
+	}
+	wfstat->parent = cws;
+	list_add_tail(&wfstat->list, &cws->workfunclist);
 
 	spin_lock_irqsave(&workqueue_cpu_stat(cpu)->lock, flags);
 	list_add_tail(&cws->list, &workqueue_cpu_stat(cpu)->list);
 	spin_unlock_irqrestore(&workqueue_cpu_stat(cpu)->lock, flags);
+
+	return;
+
+err_alloc_wfstat:
+	kfree(cws);
+err_alloc_cws:
+	return;
 }
 
 /* Destruction of a cpu workqueue thread */
@@ -115,12 +197,22 @@ static void probe_workqueue_destruction(struct task_struct *wq_thread)
 
 	spin_lock_irqsave(&workqueue_cpu_stat(cpu)->lock, flags);
 	list_for_each_entry_safe(node, next, &workqueue_cpu_stat(cpu)->list,
-							list) {
-		if (node->pid == wq_thread->pid) {
-			list_del(&node->list);
-			kfree(node);
-			goto found;
+				 list) {
+		struct workfunc_stats *wfstat, *wfstatnext;
+
+		if (node->pid != wq_thread->pid)
+			continue;
+
+		list_for_each_entry_safe(wfstat, wfstatnext,
+					 &node->workfunclist, list) {
+			list_del(&wfstat->list);
+			kfree(wfstat);
 		}
+
+		list_del(&node->list);
+		kfree(node);
+
+		goto found;
 	}
 
 	pr_debug("trace_workqueue: don't find workqueue to destroy\n");
@@ -129,17 +221,23 @@ found:
 
 }
 
-static struct cpu_workqueue_stats *workqueue_stat_start_cpu(int cpu)
+static struct workfunc_stats *workqueue_stat_start_cpu(int cpu)
 {
 	unsigned long flags;
-	struct cpu_workqueue_stats *ret = NULL;
-
+	struct workfunc_stats *ret = NULL;
 
 	spin_lock_irqsave(&workqueue_cpu_stat(cpu)->lock, flags);
 
-	if (!list_empty(&workqueue_cpu_stat(cpu)->list))
-		ret = list_entry(workqueue_cpu_stat(cpu)->list.next,
-				 struct cpu_workqueue_stats, list);
+	if (!list_empty(&workqueue_cpu_stat(cpu)->list)) {
+		struct cpu_workqueue_stats *cws;
+		cws = list_entry(workqueue_cpu_stat(cpu)->list.next,
+			struct cpu_workqueue_stats, list);
+		/*
+		 * cpu_workqueue_stats->workfunclist at least have a dummy node
+		 */
+		ret = list_entry(cws->workfunclist.next, struct workfunc_stats,
+			list);
+	}
 
 	spin_unlock_irqrestore(&workqueue_cpu_stat(cpu)->lock, flags);
 
@@ -161,43 +259,75 @@ static void *workqueue_stat_start(struct tracer_stat *trace)
 
 static void *workqueue_stat_next(void *prev, int idx)
 {
-	struct cpu_workqueue_stats *prev_cws = prev;
-	int cpu = prev_cws->cpu;
+	struct workfunc_stats *prev_wfstat = prev;
+	int cpu = prev_wfstat->parent->cpu;
 	unsigned long flags;
 	void *ret = NULL;
 
 	spin_lock_irqsave(&workqueue_cpu_stat(cpu)->lock, flags);
-	if (list_is_last(&prev_cws->list, &workqueue_cpu_stat(cpu)->list)) {
+
+	if (!list_is_last(&prev_wfstat->list,
+	    &prev_wfstat->parent->workfunclist)) {
+		ret = list_entry(prev_wfstat->list.next, struct workfunc_stats,
+			list);
 		spin_unlock_irqrestore(&workqueue_cpu_stat(cpu)->lock, flags);
-		do {
-			cpu = cpumask_next(cpu, cpu_possible_mask);
-			if (cpu >= nr_cpu_ids)
-				return NULL;
-		} while (!(ret = workqueue_stat_start_cpu(cpu)));
 		return ret;
 	}
+
+	if (!list_is_last(&prev_wfstat->parent->list,
+	    &workqueue_cpu_stat(cpu)->list)) {
+		struct cpu_workqueue_stats *cws = list_entry(
+			prev_wfstat->parent->list.next,
+			struct cpu_workqueue_stats, list);
+		ret = list_entry(cws->workfunclist.next, struct workfunc_stats,
+			list);
+		spin_unlock_irqrestore(&workqueue_cpu_stat(cpu)->lock, flags);
+		return ret;
+	}
+
 	spin_unlock_irqrestore(&workqueue_cpu_stat(cpu)->lock, flags);
 
-	return list_entry(prev_cws->list.next, struct cpu_workqueue_stats,
-			  list);
+	do {
+		cpu = cpumask_next(cpu, cpu_possible_mask);
+		if (cpu >= nr_cpu_ids)
+			return NULL;
+	} while (!(ret = workqueue_stat_start_cpu(cpu)));
+
+	return ret;
 }
 
 static int workqueue_stat_show(struct seq_file *s, void *p)
 {
-	struct cpu_workqueue_stats *cws = p;
+	struct workfunc_stats *wfstat = p;
+	struct cpu_workqueue_stats *cws = wfstat->parent;
 	struct pid *pid;
 	struct task_struct *tsk;
 
-	pid = find_get_pid(cws->pid);
-	if (pid) {
-		tsk = get_pid_task(pid, PIDTYPE_PID);
-		if (tsk) {
-			seq_printf(s, "%3d %6d     %6u       %s\n", cws->cpu,
-				   atomic_read(&cws->inserted), cws->executed,
-				   tsk->comm);
-			put_task_struct(tsk);
+	if (!wfstat->func) {
+		/* It is first dummy node, need to print workqueue info */
+		pid = find_get_pid(cws->pid);
+		if (pid) {
+			tsk = get_pid_task(pid, PIDTYPE_PID);
+			if (tsk) {
+				seq_printf(s, "%3d %6d     %6u       %s:%d\n",
+					   cws->cpu,
+					   atomic_read(&cws->inserted),
+					   cws->executed,
+					   tsk->comm,
+					   cws->pid);
+				put_task_struct(tsk);
+			}
+			put_pid(pid);
 		}
-		put_pid(pid);
+	} else {
+		/* It is effect node, need to print workfunc info */
+		int lastwf = list_is_last(&wfstat->list, &cws->workfunclist);
+		seq_printf(s, "%3d %6d     %6u        %c-%pF\n",
+			cws->cpu,
+			atomic_read(&wfstat->inserted),
+			wfstat->executed,
+			lastwf ? '`' : '|',
+			wfstat->func);
 	}
 
 	return 0;
@@ -205,7 +335,8 @@ static int workqueue_stat_show(struct seq_file *s, void *p)
 
 static int workqueue_stat_headers(struct seq_file *s)
 {
-	seq_printf(s, "# CPU  INSERTED  EXECUTED   NAME\n");
+	seq_printf(s, "# CPU  INSERTED  EXECUTED   TASKNAME:PID\n");
+	seq_printf(s, "# |      |         |        `-WORKFUNC\n");
 	seq_printf(s, "# |      |         |          |\n");
 	return 0;
 }
-- 
1.6.2.3


^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH 06/19] tracing/workqueue: turn cpu_workqueue_stats::inserted into unsigned int
  2009-04-30  0:27 [PATCH 00/19] tracing/workqueue: events support/enhancements, worklets tracing, fixes Frederic Weisbecker
                   ` (4 preceding siblings ...)
  2009-04-30  0:27 ` [PATCH 05/19] trace_workqueue: add worklet information Frederic Weisbecker
@ 2009-04-30  0:27 ` Frederic Weisbecker
  2009-04-30  0:27 ` [PATCH 07/19] workqueue_tracepoint: introduce workqueue_handler_exit tracepoint and rename workqueue_execution to workqueue_handler_entry Frederic Weisbecker
                   ` (13 subsequent siblings)
  19 siblings, 0 replies; 23+ messages in thread
From: Frederic Weisbecker @ 2009-04-30  0:27 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: LKML, Li Zefan, Frederic Weisbecker, Zhao Lei, KOSAKI Motohiro,
	Steven Rostedt, Tom Zanussi, Oleg Nesterov, Andrew Morton

struct cpu_workqueue_stats used an atomic_t to protect against
concurrent incrementation of inserted worklets. But
trace_workqueue_insertion() is already inside a section protected
by the cpu workqueue lock.

Then we can safely convert it into an unsigned int.

[ Impact: cleanup ]

Reported-by: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Zhao Lei <zhaolei@cn.fujitsu.com>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Tom Zanussi <tzanussi@gmail.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
---
 kernel/trace/trace_workqueue.c |   10 +++++-----
 1 files changed, 5 insertions(+), 5 deletions(-)

diff --git a/kernel/trace/trace_workqueue.c b/kernel/trace/trace_workqueue.c
index 1a68b5a..7c632b9 100644
--- a/kernel/trace/trace_workqueue.c
+++ b/kernel/trace/trace_workqueue.c
@@ -20,7 +20,7 @@ struct workfunc_stats {
 
 	/* Addr of worklet's callback function */
 	work_func_t			func;
-	/* Can be inserted from interrupt or user context, need to be atomic */
+	/* Can be inserted on different workqueues, need to be atomic */
 	atomic_t			inserted;
 	/*
 	 * Don't need to be atomic, works are serialized in a single workqueue
@@ -34,8 +34,8 @@ struct cpu_workqueue_stats {
 	struct list_head            list;
 	int		            cpu;
 	pid_t			    pid;
-	/* Refer to comments of struct workfunc_stats for detail */
-	atomic_t	            inserted;
+	/* Protected by cpu workqueue lock */
+	unsigned int	            inserted;
 	unsigned int		    executed;
 	/* list of struct workfunc_stats in this workqueue */
 	struct list_head            workfunclist;
@@ -62,7 +62,7 @@ int do_workqueue_insertion(struct cpu_workqueue_stats *cws,
 {
 	struct workfunc_stats *wfstat;
 
-	atomic_inc(&cws->inserted);
+	cws->inserted++;
 
 	list_for_each_entry(wfstat, &cws->workfunclist, list)
 		if (wfstat->func == work->func)
@@ -311,7 +311,7 @@ static int workqueue_stat_show(struct seq_file *s, void *p)
 			if (tsk) {
 				seq_printf(s, "%3d %6d     %6u       %s:%d\n",
 					   cws->cpu,
-					   atomic_read(&cws->inserted),
+					   cws->inserted,
 					   cws->executed,
 					   tsk->comm,
 					   cws->pid);
-- 
1.6.2.3


^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH 07/19] workqueue_tracepoint: introduce workqueue_handler_exit tracepoint and rename workqueue_execution to workqueue_handler_entry
  2009-04-30  0:27 [PATCH 00/19] tracing/workqueue: events support/enhancements, worklets tracing, fixes Frederic Weisbecker
                   ` (5 preceding siblings ...)
  2009-04-30  0:27 ` [PATCH 06/19] tracing/workqueue: turn cpu_workqueue_stats::inserted into unsigned int Frederic Weisbecker
@ 2009-04-30  0:27 ` Frederic Weisbecker
  2009-04-30  0:27 ` [PATCH 08/19] workqueue_tracepoint: add workqueue_flush and worklet_cancel tracepoint Frederic Weisbecker
                   ` (12 subsequent siblings)
  19 siblings, 0 replies; 23+ messages in thread
From: Frederic Weisbecker @ 2009-04-30  0:27 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: LKML, Li Zefan, KOSAKI Motohiro, KOSAKI Motohiro, Zhao Lei,
	Steven Rostedt, Tom Zanussi, Oleg Nesterov, Andrew Morton,
	Frederic Weisbecker

From: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>

Entry/exit handler pair is a useful common tracepoint technique.
It can mesure the consumption time of the handler.

Then, also add a workqueue handler_exit tracepoint and rename "execution"
to "handler_entry".

Changelog:
v1: From KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
    introduce workqueue_handler_exit tracepoint
    rename workqueue_execution to workqueue_handler_entry
v2: From Zhao Lei <zhaolei@cn.fujitsu.com>
    Declare workqueue_handler_exit's work argument as void *, because
    we can't safely access its contents (might be freed).
    Add workqueue_handler_entry's &work_struct back to TP_printk,
    to let it used as a'key' for workqueue_handler_entry/exit pair.

[ Impact: prepare to trace the time execution of worklets ]

Signed-off-by: KOSAKI Motohiro <kosaki@jp.fujitsu.com>
Signed-off-by: Zhao Lei <zhaolei@cn.fujitsu.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Tom Zanussi <tzanussi@gmail.com>
Cc: Oleg Nesterov <oleg@redhat.com>,
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
---
 include/trace/events/workqueue.h |   36 ++++++++++++++++++++++++++++++------
 kernel/trace/trace_workqueue.c   |   10 +++++-----
 kernel/workqueue.c               |    3 ++-
 3 files changed, 37 insertions(+), 12 deletions(-)

diff --git a/include/trace/events/workqueue.h b/include/trace/events/workqueue.h
index 035f1bf..36f4939 100644
--- a/include/trace/events/workqueue.h
+++ b/include/trace/events/workqueue.h
@@ -30,26 +30,50 @@ TRACE_EVENT(workqueue_insertion,
 		__entry->thread_pid, __entry->func)
 );
 
-TRACE_EVENT(workqueue_execution,
+TRACE_EVENT(workqueue_handler_entry,
 
 	TP_PROTO(struct task_struct *wq_thread, struct work_struct *work),
 
 	TP_ARGS(wq_thread, work),
 
 	TP_STRUCT__entry(
-		__array(char,		thread_comm,	TASK_COMM_LEN)
-		__field(pid_t,		thread_pid)
-		__field(work_func_t,	func)
+		__array(char,			thread_comm,	TASK_COMM_LEN)
+		__field(pid_t,			thread_pid)
+		__field(struct work_struct *,	work)
+		__field(work_func_t,		func)
 	),
 
 	TP_fast_assign(
 		memcpy(__entry->thread_comm, wq_thread->comm, TASK_COMM_LEN);
 		__entry->thread_pid	= wq_thread->pid;
+		__entry->work		= work;
 		__entry->func		= work->func;
 	),
 
-	TP_printk("thread=%s:%d func=%pF", __entry->thread_comm,
-		__entry->thread_pid, __entry->func)
+	TP_printk("thread=%s:%d work=%p func=%pF", __entry->thread_comm,
+		__entry->thread_pid, __entry->work, __entry->func)
+);
+
+/* Declare work as void *, because we can't use work->... in after f(work) */
+TRACE_EVENT(workqueue_handler_exit,
+	TP_PROTO(struct task_struct *wq_thread, void *work),
+
+	TP_ARGS(wq_thread, work),
+
+	TP_STRUCT__entry(
+		__array(char,	thread_comm, TASK_COMM_LEN)
+		__field(pid_t,	thread_pid)
+		__field(void *,	work)
+	),
+
+	TP_fast_assign(
+		memcpy(__entry->thread_comm, wq_thread->comm, TASK_COMM_LEN);
+		__entry->thread_pid	= wq_thread->pid;
+		__entry->work		= work;
+	),
+
+	TP_printk("thread=%s:%d work=%p", __entry->thread_comm,
+		  __entry->thread_pid, __entry->work)
 );
 
 /* Trace the creation of one workqueue thread on a cpu */
diff --git a/kernel/trace/trace_workqueue.c b/kernel/trace/trace_workqueue.c
index 7c632b9..8d26db6 100644
--- a/kernel/trace/trace_workqueue.c
+++ b/kernel/trace/trace_workqueue.c
@@ -109,7 +109,7 @@ found:
 
 /* Execution of a work */
 static void
-probe_workqueue_execution(struct task_struct *wq_thread,
+probe_workqueue_entry(struct task_struct *wq_thread,
 			  struct work_struct *work)
 {
 	int cpu = cpumask_first(&wq_thread->cpus_allowed);
@@ -373,13 +373,13 @@ int __init trace_workqueue_early_init(void)
 	if (ret)
 		goto out;
 
-	ret = register_trace_workqueue_execution(probe_workqueue_execution);
+	ret = register_trace_workqueue_handler_entry(probe_workqueue_entry);
 	if (ret)
 		goto no_insertion;
 
 	ret = register_trace_workqueue_creation(probe_workqueue_creation);
 	if (ret)
-		goto no_execution;
+		goto no_handler_entry;
 
 	ret = register_trace_workqueue_destruction(probe_workqueue_destruction);
 	if (ret)
@@ -394,8 +394,8 @@ int __init trace_workqueue_early_init(void)
 
 no_creation:
 	unregister_trace_workqueue_creation(probe_workqueue_creation);
-no_execution:
-	unregister_trace_workqueue_execution(probe_workqueue_execution);
+no_handler_entry:
+	unregister_trace_workqueue_handler_entry(probe_workqueue_entry);
 no_insertion:
 	unregister_trace_workqueue_insertion(probe_workqueue_insertion);
 out:
diff --git a/kernel/workqueue.c b/kernel/workqueue.c
index 0668795..fbcdfa2 100644
--- a/kernel/workqueue.c
+++ b/kernel/workqueue.c
@@ -279,7 +279,6 @@ static void run_workqueue(struct cpu_workqueue_struct *cwq)
 		 */
 		struct lockdep_map lockdep_map = work->lockdep_map;
 #endif
-		trace_workqueue_execution(cwq->thread, work);
 		cwq->current_work = work;
 		list_del_init(cwq->worklist.next);
 		spin_unlock_irq(&cwq->lock);
@@ -288,7 +287,9 @@ static void run_workqueue(struct cpu_workqueue_struct *cwq)
 		work_clear_pending(work);
 		lock_map_acquire(&cwq->wq->lockdep_map);
 		lock_map_acquire(&lockdep_map);
+		trace_workqueue_handler_entry(cwq->thread, work);
 		f(work);
+		trace_workqueue_handler_exit(cwq->thread, work);
 		lock_map_release(&lockdep_map);
 		lock_map_release(&cwq->wq->lockdep_map);
 
-- 
1.6.2.3


^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH 08/19] workqueue_tracepoint: add workqueue_flush and worklet_cancel tracepoint
  2009-04-30  0:27 [PATCH 00/19] tracing/workqueue: events support/enhancements, worklets tracing, fixes Frederic Weisbecker
                   ` (6 preceding siblings ...)
  2009-04-30  0:27 ` [PATCH 07/19] workqueue_tracepoint: introduce workqueue_handler_exit tracepoint and rename workqueue_execution to workqueue_handler_entry Frederic Weisbecker
@ 2009-04-30  0:27 ` Frederic Weisbecker
  2009-04-30  0:27 ` [PATCH 09/19] workqueue_tracepoint: change tracepoint name to fit worklet and workqueue lifecycle Frederic Weisbecker
                   ` (11 subsequent siblings)
  19 siblings, 0 replies; 23+ messages in thread
From: Frederic Weisbecker @ 2009-04-30  0:27 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: LKML, Li Zefan, Zhaolei, Steven Rostedt, Tom Zanussi,
	Oleg Nesterov, Andrew Morton, KOSAKI Motohiro,
	Frederic Weisbecker

From: Zhaolei <zhaolei@cn.fujitsu.com>

Ingo Molnar <mingo@elte.hu> wrote:
> This allows the understanding of the life cycle of a workqueue and
> of worklets that enter that workqueue.

[ Impact: provide finer grained details about life of worklets ]

Signed-off-by: Zhao Lei <zhaolei@cn.fujitsu.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Tom Zanussi <tzanussi@gmail.com>
Cc: Oleg Nesterov <oleg@redhat.com>,
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
---
 include/trace/events/workqueue.h |   36 ++++++++++++++++++++++++++++++++++++
 kernel/workqueue.c               |    4 ++++
 2 files changed, 40 insertions(+), 0 deletions(-)

diff --git a/include/trace/events/workqueue.h b/include/trace/events/workqueue.h
index 36f4939..2e77e84 100644
--- a/include/trace/events/workqueue.h
+++ b/include/trace/events/workqueue.h
@@ -76,6 +76,23 @@ TRACE_EVENT(workqueue_handler_exit,
 		  __entry->thread_pid, __entry->work)
 );
 
+TRACE_EVENT(worklet_cancel,
+
+	TP_PROTO(struct work_struct *work),
+
+	TP_ARGS(work),
+
+	TP_STRUCT__entry(
+		__field(work_func_t,	func)
+	),
+
+	TP_fast_assign(
+		__entry->func	= work->func;
+	),
+
+	TP_printk("func=%pF", __entry->func)
+);
+
 /* Trace the creation of one workqueue thread on a cpu */
 TRACE_EVENT(workqueue_creation,
 
@@ -99,6 +116,25 @@ TRACE_EVENT(workqueue_creation,
 		__entry->thread_pid, __entry->cpu)
 );
 
+TRACE_EVENT(workqueue_flush,
+
+	TP_PROTO(struct task_struct *wq_thread),
+
+	TP_ARGS(wq_thread),
+
+	TP_STRUCT__entry(
+		__array(char,	thread_comm,	TASK_COMM_LEN)
+		__field(pid_t,	thread_pid)
+	),
+
+	TP_fast_assign(
+		memcpy(__entry->thread_comm, wq_thread->comm, TASK_COMM_LEN);
+		__entry->thread_pid	= wq_thread->pid;
+	),
+
+	TP_printk("thread=%s:%d", __entry->thread_comm, __entry->thread_pid)
+);
+
 TRACE_EVENT(workqueue_destruction,
 
 	TP_PROTO(struct task_struct *wq_thread),
diff --git a/kernel/workqueue.c b/kernel/workqueue.c
index fbcdfa2..99a81ba 100644
--- a/kernel/workqueue.c
+++ b/kernel/workqueue.c
@@ -366,6 +366,8 @@ static int flush_cpu_workqueue(struct cpu_workqueue_struct *cwq)
 	int active = 0;
 	struct wq_barrier barr;
 
+	trace_workqueue_flush(cwq->thread);
+
 	WARN_ON(cwq->thread == current);
 
 	spin_lock_irq(&cwq->lock);
@@ -542,6 +544,8 @@ static int __cancel_work_timer(struct work_struct *work,
 {
 	int ret;
 
+	trace_worklet_cancel(work);
+
 	do {
 		ret = (timer && likely(del_timer(timer)));
 		if (!ret)
-- 
1.6.2.3


^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH 09/19] workqueue_tracepoint: change tracepoint name to fit worklet and workqueue lifecycle
  2009-04-30  0:27 [PATCH 00/19] tracing/workqueue: events support/enhancements, worklets tracing, fixes Frederic Weisbecker
                   ` (7 preceding siblings ...)
  2009-04-30  0:27 ` [PATCH 08/19] workqueue_tracepoint: add workqueue_flush and worklet_cancel tracepoint Frederic Weisbecker
@ 2009-04-30  0:27 ` Frederic Weisbecker
  2009-04-30  0:27 ` [PATCH 10/19] workqueue_trace: separate worklet_insertion into worklet_enqueue and worklet_enqueue_delayed Frederic Weisbecker
                   ` (10 subsequent siblings)
  19 siblings, 0 replies; 23+ messages in thread
From: Frederic Weisbecker @ 2009-04-30  0:27 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: LKML, Li Zefan, Zhaolei, Steven Rostedt, Tom Zanussi,
	Oleg Nesterov, Andrew Morton, KOSAKI Motohiro,
	Frederic Weisbecker

From: Zhaolei <zhaolei@cn.fujitsu.com>

Change list:
  workqueue_insertion -> worklet_insertion
  workqueue_handler_entry -> worklet_execute
  workqueue_handler_exit -> worklet_complete

Tracepoint list after change:
  worklet_insertion
  worklet_execute
  worklet_complete
  worklet_cancel
  workqueue_creation
  workqueue_flush
  workqueue_destruction

[ Impact: cleanup, change names of tracepoints to be more accurate ]

Signed-off-by: Zhao Lei <zhaolei@cn.fujitsu.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Tom Zanussi <tzanussi@gmail.com>
Cc: Oleg Nesterov <oleg@redhat.com>,
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
---
 include/trace/events/workqueue.h |    6 +++---
 kernel/trace/trace_workqueue.c   |   21 ++++++++++-----------
 kernel/workqueue.c               |    6 +++---
 3 files changed, 16 insertions(+), 17 deletions(-)

diff --git a/include/trace/events/workqueue.h b/include/trace/events/workqueue.h
index 2e77e84..fb35cc7 100644
--- a/include/trace/events/workqueue.h
+++ b/include/trace/events/workqueue.h
@@ -8,7 +8,7 @@
 #undef TRACE_SYSTEM
 #define TRACE_SYSTEM workqueue
 
-TRACE_EVENT(workqueue_insertion,
+TRACE_EVENT(worklet_insertion,
 
 	TP_PROTO(struct task_struct *wq_thread, struct work_struct *work),
 
@@ -30,7 +30,7 @@ TRACE_EVENT(workqueue_insertion,
 		__entry->thread_pid, __entry->func)
 );
 
-TRACE_EVENT(workqueue_handler_entry,
+TRACE_EVENT(worklet_execute,
 
 	TP_PROTO(struct task_struct *wq_thread, struct work_struct *work),
 
@@ -55,7 +55,7 @@ TRACE_EVENT(workqueue_handler_entry,
 );
 
 /* Declare work as void *, because we can't use work->... in after f(work) */
-TRACE_EVENT(workqueue_handler_exit,
+TRACE_EVENT(worklet_complete,
 	TP_PROTO(struct task_struct *wq_thread, void *work),
 
 	TP_ARGS(wq_thread, work),
diff --git a/kernel/trace/trace_workqueue.c b/kernel/trace/trace_workqueue.c
index 8d26db6..e3e1218 100644
--- a/kernel/trace/trace_workqueue.c
+++ b/kernel/trace/trace_workqueue.c
@@ -57,8 +57,8 @@ static DEFINE_PER_CPU(struct workqueue_global_stats, all_workqueue_stat);
  * Update record when insert a work into workqueue
  * Caller need to hold cpu_workqueue_stats spin_lock
  */
-int do_workqueue_insertion(struct cpu_workqueue_stats *cws,
-			   struct work_struct *work)
+int do_worklet_insertion(struct cpu_workqueue_stats *cws,
+			 struct work_struct *work)
 {
 	struct workfunc_stats *wfstat;
 
@@ -87,7 +87,7 @@ found:
 
 /* Insertion of a work */
 static void
-probe_workqueue_insertion(struct task_struct *wq_thread,
+probe_worklet_insertion(struct task_struct *wq_thread,
 			  struct work_struct *work)
 {
 	int cpu = cpumask_first(&wq_thread->cpus_allowed);
@@ -97,8 +97,8 @@ probe_workqueue_insertion(struct task_struct *wq_thread,
 	spin_lock_irqsave(&workqueue_cpu_stat(cpu)->lock, flags);
 	list_for_each_entry(node, &workqueue_cpu_stat(cpu)->list, list) {
 		if (node->pid == wq_thread->pid) {
-			/* we ignore error of do_workqueue_insertion */
-			do_workqueue_insertion(node, work);
+			/* we ignore error of do_worklet_insertion */
+			do_worklet_insertion(node, work);
 			goto found;
 		}
 	}
@@ -109,8 +109,7 @@ found:
 
 /* Execution of a work */
 static void
-probe_workqueue_entry(struct task_struct *wq_thread,
-			  struct work_struct *work)
+probe_worklet_execute(struct task_struct *wq_thread, struct work_struct *work)
 {
 	int cpu = cpumask_first(&wq_thread->cpus_allowed);
 	struct cpu_workqueue_stats *node;
@@ -369,11 +368,11 @@ int __init trace_workqueue_early_init(void)
 {
 	int ret, cpu;
 
-	ret = register_trace_workqueue_insertion(probe_workqueue_insertion);
+	ret = register_trace_worklet_insertion(probe_worklet_insertion);
 	if (ret)
 		goto out;
 
-	ret = register_trace_workqueue_handler_entry(probe_workqueue_entry);
+	ret = register_trace_worklet_execute(probe_worklet_execute);
 	if (ret)
 		goto no_insertion;
 
@@ -395,9 +394,9 @@ int __init trace_workqueue_early_init(void)
 no_creation:
 	unregister_trace_workqueue_creation(probe_workqueue_creation);
 no_handler_entry:
-	unregister_trace_workqueue_handler_entry(probe_workqueue_entry);
+	unregister_trace_worklet_execute(probe_worklet_execute);
 no_insertion:
-	unregister_trace_workqueue_insertion(probe_workqueue_insertion);
+	unregister_trace_worklet_insertion(probe_worklet_insertion);
 out:
 	pr_warning("trace_workqueue: unable to trace workqueues\n");
 
diff --git a/kernel/workqueue.c b/kernel/workqueue.c
index 99a81ba..a3c9848 100644
--- a/kernel/workqueue.c
+++ b/kernel/workqueue.c
@@ -128,7 +128,7 @@ struct cpu_workqueue_struct *get_wq_data(struct work_struct *work)
 static void insert_work(struct cpu_workqueue_struct *cwq,
 			struct work_struct *work, struct list_head *head)
 {
-	trace_workqueue_insertion(cwq->thread, work);
+	trace_worklet_insertion(cwq->thread, work);
 
 	set_wq_data(work, cwq);
 	/*
@@ -287,9 +287,9 @@ static void run_workqueue(struct cpu_workqueue_struct *cwq)
 		work_clear_pending(work);
 		lock_map_acquire(&cwq->wq->lockdep_map);
 		lock_map_acquire(&lockdep_map);
-		trace_workqueue_handler_entry(cwq->thread, work);
+		trace_worklet_execute(cwq->thread, work);
 		f(work);
-		trace_workqueue_handler_exit(cwq->thread, work);
+		trace_worklet_complete(cwq->thread, work);
 		lock_map_release(&lockdep_map);
 		lock_map_release(&cwq->wq->lockdep_map);
 
-- 
1.6.2.3


^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH 10/19] workqueue_trace: separate worklet_insertion into worklet_enqueue and worklet_enqueue_delayed
  2009-04-30  0:27 [PATCH 00/19] tracing/workqueue: events support/enhancements, worklets tracing, fixes Frederic Weisbecker
                   ` (8 preceding siblings ...)
  2009-04-30  0:27 ` [PATCH 09/19] workqueue_tracepoint: change tracepoint name to fit worklet and workqueue lifecycle Frederic Weisbecker
@ 2009-04-30  0:27 ` Frederic Weisbecker
  2009-04-30  0:27 ` [PATCH 11/19] tracing/workqueue: turn workfunc_stats::inserted into unsigned int Frederic Weisbecker
                   ` (9 subsequent siblings)
  19 siblings, 0 replies; 23+ messages in thread
From: Frederic Weisbecker @ 2009-04-30  0:27 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: LKML, Li Zefan, Zhaolei, Steven Rostedt, Tom Zanussi,
	Oleg Nesterov, Andrew Morton, KOSAKI Motohiro,
	Frederic Weisbecker

From: Zhaolei <zhaolei@cn.fujitsu.com>

To get more information about delayed work.
Suggested by Ingo Molnar <mingo@elte.hu>
> i'd suggest the following complete set of events instead:
> TRACE_EVENT(worklet_enqueue              /* NEW */
> TRACE_EVENT(worklet_enqueue_delayed      /* NEW */
> i'd suggest a 'cpu' parameter to the enqueue events, to allow the
> mapping of the _on(..cpu) variants too.

[ Impact: take worklet delayed enqueue into account in worklet tracing ]

Signed-off-by: Zhao Lei <zhaolei@cn.fujitsu.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Tom Zanussi <tzanussi@gmail.com>
Cc: Oleg Nesterov <oleg@redhat.com>,
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
---
 include/trace/events/workqueue.h |   54 ++++++++++++++++++++++++++++++++-----
 kernel/trace/trace_workqueue.c   |   41 +++++++++++++++++++---------
 kernel/workqueue.c               |   19 ++++++++++---
 3 files changed, 88 insertions(+), 26 deletions(-)

diff --git a/include/trace/events/workqueue.h b/include/trace/events/workqueue.h
index fb35cc7..e4c74f2 100644
--- a/include/trace/events/workqueue.h
+++ b/include/trace/events/workqueue.h
@@ -8,26 +8,64 @@
 #undef TRACE_SYSTEM
 #define TRACE_SYSTEM workqueue
 
-TRACE_EVENT(worklet_insertion,
+TRACE_EVENT(worklet_enqueue,
 
-	TP_PROTO(struct task_struct *wq_thread, struct work_struct *work),
+	TP_PROTO(
+		struct task_struct *wq_thread,
+		struct work_struct *work,
+		int cpu
+	),
 
-	TP_ARGS(wq_thread, work),
+	TP_ARGS(wq_thread, work, cpu),
 
 	TP_STRUCT__entry(
-		__array(char,		thread_comm,	TASK_COMM_LEN)
-		__field(pid_t,		thread_pid)
-		__field(work_func_t,	func)
+		__array(char,			thread_comm,	TASK_COMM_LEN)
+		__field(pid_t,			thread_pid)
+		__field(work_func_t,		func)
+		__field(int,			cpu)
 	),
 
 	TP_fast_assign(
 		memcpy(__entry->thread_comm, wq_thread->comm, TASK_COMM_LEN);
 		__entry->thread_pid	= wq_thread->pid;
 		__entry->func		= work->func;
+		__entry->cpu		= cpu;
+	),
+
+	TP_printk("thread=%s:%d func=%pF cpu=%d", __entry->thread_comm,
+		__entry->thread_pid, __entry->func, __entry->cpu)
+);
+
+TRACE_EVENT(worklet_enqueue_delayed,
+
+	TP_PROTO(
+		struct task_struct *wq_thread,
+		struct work_struct *work,
+		int cpu,
+		unsigned long delay
+	),
+
+	TP_ARGS(wq_thread, work, cpu, delay),
+
+	TP_STRUCT__entry(
+		__array(char,			thread_comm,	TASK_COMM_LEN)
+		__field(pid_t,			thread_pid)
+		__field(work_func_t,		func)
+		__field(int,			cpu)
+		__field(unsigned long,		delay)
+	),
+
+	TP_fast_assign(
+		memcpy(__entry->thread_comm, wq_thread->comm, TASK_COMM_LEN);
+		__entry->thread_pid	= wq_thread->pid;
+		__entry->func		= work->func;
+		__entry->cpu		= cpu;
+		__entry->delay		= delay;
 	),
 
-	TP_printk("thread=%s:%d func=%pF", __entry->thread_comm,
-		__entry->thread_pid, __entry->func)
+	TP_printk("thread=%s:%d func=%pF cpu=%d delay=%lu",
+		__entry->thread_comm, __entry->thread_pid, __entry->func,
+		__entry->cpu, __entry->delay)
 );
 
 TRACE_EVENT(worklet_execute,
diff --git a/kernel/trace/trace_workqueue.c b/kernel/trace/trace_workqueue.c
index e3e1218..69ae4aa 100644
--- a/kernel/trace/trace_workqueue.c
+++ b/kernel/trace/trace_workqueue.c
@@ -57,7 +57,7 @@ static DEFINE_PER_CPU(struct workqueue_global_stats, all_workqueue_stat);
  * Update record when insert a work into workqueue
  * Caller need to hold cpu_workqueue_stats spin_lock
  */
-int do_worklet_insertion(struct cpu_workqueue_stats *cws,
+int do_worklet_enqueue(struct cpu_workqueue_stats *cws,
 			 struct work_struct *work)
 {
 	struct workfunc_stats *wfstat;
@@ -85,26 +85,34 @@ found:
 	return 0;
 }
 
-/* Insertion of a work */
+/* Enqueue of a work */
 static void
-probe_worklet_insertion(struct task_struct *wq_thread,
-			  struct work_struct *work)
+probe_worklet_enqueue(struct task_struct *wq_thread, struct work_struct *work,
+		      int cpu)
 {
-	int cpu = cpumask_first(&wq_thread->cpus_allowed);
+	int wqcpu = cpumask_first(&wq_thread->cpus_allowed);
 	struct cpu_workqueue_stats *node;
 	unsigned long flags;
 
-	spin_lock_irqsave(&workqueue_cpu_stat(cpu)->lock, flags);
-	list_for_each_entry(node, &workqueue_cpu_stat(cpu)->list, list) {
+	spin_lock_irqsave(&workqueue_cpu_stat(wqcpu)->lock, flags);
+	list_for_each_entry(node, &workqueue_cpu_stat(wqcpu)->list, list) {
 		if (node->pid == wq_thread->pid) {
 			/* we ignore error of do_worklet_insertion */
-			do_worklet_insertion(node, work);
+			do_worklet_enqueue(node, work);
 			goto found;
 		}
 	}
 	pr_debug("trace_workqueue: entry not found\n");
 found:
-	spin_unlock_irqrestore(&workqueue_cpu_stat(cpu)->lock, flags);
+	spin_unlock_irqrestore(&workqueue_cpu_stat(wqcpu)->lock, flags);
+}
+
+static void
+probe_worklet_enqueue_delayed(struct task_struct *wq_thread,
+			      struct work_struct *work, int cpu,
+			      unsigned long delay)
+{
+	probe_worklet_enqueue(wq_thread, work, cpu);
 }
 
 /* Execution of a work */
@@ -368,13 +376,18 @@ int __init trace_workqueue_early_init(void)
 {
 	int ret, cpu;
 
-	ret = register_trace_worklet_insertion(probe_worklet_insertion);
+	ret = register_trace_worklet_enqueue(probe_worklet_enqueue);
 	if (ret)
 		goto out;
 
+	ret = register_trace_worklet_enqueue_delayed(
+		probe_worklet_enqueue_delayed);
+	if (ret)
+		goto no_enqueue;
+
 	ret = register_trace_worklet_execute(probe_worklet_execute);
 	if (ret)
-		goto no_insertion;
+		goto no_enqueue_delayed;
 
 	ret = register_trace_workqueue_creation(probe_workqueue_creation);
 	if (ret)
@@ -395,8 +408,10 @@ no_creation:
 	unregister_trace_workqueue_creation(probe_workqueue_creation);
 no_handler_entry:
 	unregister_trace_worklet_execute(probe_worklet_execute);
-no_insertion:
-	unregister_trace_worklet_insertion(probe_worklet_insertion);
+no_enqueue_delayed:
+	unregister_trace_worklet_enqueue_delayed(probe_worklet_enqueue_delayed);
+no_enqueue:
+	unregister_trace_worklet_enqueue(probe_worklet_enqueue);
 out:
 	pr_warning("trace_workqueue: unable to trace workqueues\n");
 
diff --git a/kernel/workqueue.c b/kernel/workqueue.c
index a3c9848..0cc14b9 100644
--- a/kernel/workqueue.c
+++ b/kernel/workqueue.c
@@ -128,8 +128,6 @@ struct cpu_workqueue_struct *get_wq_data(struct work_struct *work)
 static void insert_work(struct cpu_workqueue_struct *cwq,
 			struct work_struct *work, struct list_head *head)
 {
-	trace_worklet_insertion(cwq->thread, work);
-
 	set_wq_data(work, cwq);
 	/*
 	 * Ensure that we get the right work->data if we see the
@@ -188,8 +186,12 @@ queue_work_on(int cpu, struct workqueue_struct *wq, struct work_struct *work)
 	int ret = 0;
 
 	if (!test_and_set_bit(WORK_STRUCT_PENDING, work_data_bits(work))) {
+		struct cpu_workqueue_struct *cwq;
+
 		BUG_ON(!list_empty(&work->entry));
-		__queue_work(wq_per_cpu(wq, cpu), work);
+		cwq = wq_per_cpu(wq, cpu);
+		__queue_work(cwq, work);
+		trace_worklet_enqueue(cwq->thread, work, cpu);
 		ret = 1;
 	}
 	return ret;
@@ -251,10 +253,17 @@ int queue_delayed_work_on(int cpu, struct workqueue_struct *wq,
 		timer->data = (unsigned long)dwork;
 		timer->function = delayed_work_timer_fn;
 
-		if (unlikely(cpu >= 0))
+		if (unlikely(cpu >= 0)) {
 			add_timer_on(timer, cpu);
-		else
+			trace_worklet_enqueue_delayed(
+				wq_per_cpu(wq, cpu)->thread, work, cpu, delay);
+		} else {
 			add_timer(timer);
+			trace_worklet_enqueue_delayed(
+				wq_per_cpu(wq, raw_smp_processor_id())->thread,
+				work, cpu, delay);
+		}
+
 		ret = 1;
 	}
 	return ret;
-- 
1.6.2.3


^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH 11/19] tracing/workqueue: turn workfunc_stats::inserted into unsigned int
  2009-04-30  0:27 [PATCH 00/19] tracing/workqueue: events support/enhancements, worklets tracing, fixes Frederic Weisbecker
                   ` (9 preceding siblings ...)
  2009-04-30  0:27 ` [PATCH 10/19] workqueue_trace: separate worklet_insertion into worklet_enqueue and worklet_enqueue_delayed Frederic Weisbecker
@ 2009-04-30  0:27 ` Frederic Weisbecker
  2009-04-30  0:27 ` [PATCH 12/19] tracing/workqueue: avoid accessing task_struct's member variable in stat file read Frederic Weisbecker
                   ` (8 subsequent siblings)
  19 siblings, 0 replies; 23+ messages in thread
From: Frederic Weisbecker @ 2009-04-30  0:27 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: LKML, Li Zefan, Zhaolei, Steven Rostedt, Tom Zanussi,
	KOSAKI Motohiro, Oleg Nesterov, Frederic Weisbecker,
	Andrew Morton

From: Zhaolei <zhaolei@cn.fujitsu.com>

struct workfunc_stats belongs to per_cpu struct of cpu_workqueue_stats,
which means we only need to serialize its access from one cpu.

And because trace_workqueue_insertion() is already inside a section
protected by the cpu workqueue lock, we can safely convert it into an
unsigned int.

[ Impact: locking cleanup]

Reported-by: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Zhao Lei <zhaolei@cn.fujitsu.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Tom Zanussi <tzanussi@gmail.com>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
---
 kernel/trace/trace_workqueue.c |   13 +++++--------
 1 files changed, 5 insertions(+), 8 deletions(-)

diff --git a/kernel/trace/trace_workqueue.c b/kernel/trace/trace_workqueue.c
index 69ae4aa..7a07d17 100644
--- a/kernel/trace/trace_workqueue.c
+++ b/kernel/trace/trace_workqueue.c
@@ -20,12 +20,9 @@ struct workfunc_stats {
 
 	/* Addr of worklet's callback function */
 	work_func_t			func;
-	/* Can be inserted on different workqueues, need to be atomic */
-	atomic_t			inserted;
-	/*
-	 * Don't need to be atomic, works are serialized in a single workqueue
-	 * thread on a single CPU.
-	 */
+
+	/* Protected by cpu workqueue lock */
+	unsigned int			inserted;
 	unsigned int			executed;
 };
 
@@ -80,7 +77,7 @@ int do_worklet_enqueue(struct cpu_workqueue_stats *cws,
 
 	list_add_tail(&wfstat->list, &cws->workfunclist);
 found:
-	atomic_inc(&wfstat->inserted);
+	wfstat->inserted++;
 
 	return 0;
 }
@@ -331,7 +328,7 @@ static int workqueue_stat_show(struct seq_file *s, void *p)
 		int lastwf = list_is_last(&wfstat->list, &cws->workfunclist);
 		seq_printf(s, "%3d %6d     %6u        %c-%pF\n",
 			cws->cpu,
-			atomic_read(&wfstat->inserted),
+			wfstat->inserted,
 			wfstat->executed,
 			lastwf ? '`' : '|',
 			wfstat->func);
-- 
1.6.2.3


^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH 12/19] tracing/workqueue: avoid accessing task_struct's member variable in stat file read
  2009-04-30  0:27 [PATCH 00/19] tracing/workqueue: events support/enhancements, worklets tracing, fixes Frederic Weisbecker
                   ` (10 preceding siblings ...)
  2009-04-30  0:27 ` [PATCH 11/19] tracing/workqueue: turn workfunc_stats::inserted into unsigned int Frederic Weisbecker
@ 2009-04-30  0:27 ` Frederic Weisbecker
  2009-04-30  0:27 ` [PATCH 13/19] tracing/workqueue: use list_for_each_entry instead of list_for_each_entry_safe in probe_workqueue_destruction() Frederic Weisbecker
                   ` (7 subsequent siblings)
  19 siblings, 0 replies; 23+ messages in thread
From: Frederic Weisbecker @ 2009-04-30  0:27 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: LKML, Li Zefan, Zhaolei, Steven Rostedt, Tom Zanussi,
	KOSAKI Motohiro, Oleg Nesterov, Frederic Weisbecker,
	Andrew Morton

From: Zhaolei <zhaolei@cn.fujitsu.com>

Workqueue task may be destroyed when we read the stat file.
We should avoid getting information from task_struct this time.

[ Impact: fix possible NULL pointer dereference ]

Reported-by: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Zhao Lei <zhaolei@cn.fujitsu.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Tom Zanussi <tzanussi@gmail.com>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
---
 kernel/trace/trace_workqueue.c |   50 ++++++++++++++++++++--------------------
 1 files changed, 25 insertions(+), 25 deletions(-)

diff --git a/kernel/trace/trace_workqueue.c b/kernel/trace/trace_workqueue.c
index 7a07d17..51ed57e 100644
--- a/kernel/trace/trace_workqueue.c
+++ b/kernel/trace/trace_workqueue.c
@@ -28,14 +28,22 @@ struct workfunc_stats {
 
 /* A cpu workqueue thread */
 struct cpu_workqueue_stats {
-	struct list_head            list;
-	int		            cpu;
-	pid_t			    pid;
+	struct list_head	list;
+	int			cpu;
+
 	/* Protected by cpu workqueue lock */
-	unsigned int	            inserted;
-	unsigned int		    executed;
+	unsigned int		inserted;
+	unsigned int		executed;
 	/* list of struct workfunc_stats in this workqueue */
-	struct list_head            workfunclist;
+	struct list_head	workfunclist;
+
+	/*
+	 * the task maybe destroyed when we read stat file
+	 * we define it to void * because we only use it as a identifier
+	 */
+	void			*task;
+	int			pid;
+	char			comm[TASK_COMM_LEN];
 };
 
 /* List of workqueue threads on one cpu */
@@ -93,7 +101,7 @@ probe_worklet_enqueue(struct task_struct *wq_thread, struct work_struct *work,
 
 	spin_lock_irqsave(&workqueue_cpu_stat(wqcpu)->lock, flags);
 	list_for_each_entry(node, &workqueue_cpu_stat(wqcpu)->list, list) {
-		if (node->pid == wq_thread->pid) {
+		if (node->task == wq_thread) {
 			/* we ignore error of do_worklet_insertion */
 			do_worklet_enqueue(node, work);
 			goto found;
@@ -124,7 +132,7 @@ probe_worklet_execute(struct task_struct *wq_thread, struct work_struct *work)
 	spin_lock_irqsave(&workqueue_cpu_stat(cpu)->lock, flags);
 
 	list_for_each_entry(node, &workqueue_cpu_stat(cpu)->list, list)
-		if (node->pid == wq_thread->pid) {
+		if (node->task == wq_thread) {
 			node->executed++;
 			goto found_wq;
 		}
@@ -163,7 +171,9 @@ static void probe_workqueue_creation(struct task_struct *wq_thread, int cpu)
 	INIT_LIST_HEAD(&cws->list);
 	cws->cpu = cpu;
 
+	cws->task = wq_thread;
 	cws->pid = wq_thread->pid;
+	strncpy(cws->comm, wq_thread->comm, TASK_COMM_LEN);
 	INIT_LIST_HEAD(&cws->workfunclist);
 
 	/*
@@ -204,7 +214,7 @@ static void probe_workqueue_destruction(struct task_struct *wq_thread)
 				 list) {
 		struct workfunc_stats *wfstat, *wfstatnext;
 
-		if (node->pid != wq_thread->pid)
+		if (node->task != wq_thread)
 			continue;
 
 		list_for_each_entry_safe(wfstat, wfstatnext,
@@ -304,25 +314,15 @@ static int workqueue_stat_show(struct seq_file *s, void *p)
 {
 	struct workfunc_stats *wfstat = p;
 	struct cpu_workqueue_stats *cws = wfstat->parent;
-	struct pid *pid;
-	struct task_struct *tsk;
 
 	if (!wfstat->func) {
 		/* It is first dummy node, need to print workqueue info */
-		pid = find_get_pid(cws->pid);
-		if (pid) {
-			tsk = get_pid_task(pid, PIDTYPE_PID);
-			if (tsk) {
-				seq_printf(s, "%3d %6d     %6u       %s:%d\n",
-					   cws->cpu,
-					   cws->inserted,
-					   cws->executed,
-					   tsk->comm,
-					   cws->pid);
-				put_task_struct(tsk);
-			}
-			put_pid(pid);
-		}
+		seq_printf(s, "%3d %6d     %6u       %s:%d\n",
+			   cws->cpu,
+			   cws->inserted,
+			   cws->executed,
+			   cws->comm,
+			   cws->pid);
 	} else {
 		/* It is effect node, need to print workfunc info */
 		int lastwf = list_is_last(&wfstat->list, &cws->workfunclist);
-- 
1.6.2.3


^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH 13/19] tracing/workqueue: use list_for_each_entry instead of list_for_each_entry_safe in probe_workqueue_destruction()
  2009-04-30  0:27 [PATCH 00/19] tracing/workqueue: events support/enhancements, worklets tracing, fixes Frederic Weisbecker
                   ` (11 preceding siblings ...)
  2009-04-30  0:27 ` [PATCH 12/19] tracing/workqueue: avoid accessing task_struct's member variable in stat file read Frederic Weisbecker
@ 2009-04-30  0:27 ` Frederic Weisbecker
  2009-04-30  0:27 ` [PATCH 14/19] tracing/workqueue: add max execution time mesurement for per worklet Frederic Weisbecker
                   ` (6 subsequent siblings)
  19 siblings, 0 replies; 23+ messages in thread
From: Frederic Weisbecker @ 2009-04-30  0:27 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: LKML, Li Zefan, Zhaolei, Steven Rostedt, Tom Zanussi,
	KOSAKI Motohiro, Oleg Nesterov, Frederic Weisbecker,
	Andrew Morton

From: Zhaolei <zhaolei@cn.fujitsu.com>

No need to use list_for_each_entry_safe() because we just remove one list node
which is first found in list iteration.
(we don't continue list iteration after remove list node)

[ Impact: cleanup ]

Signed-off-by: Zhao Lei <zhaolei@cn.fujitsu.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Tom Zanussi <tzanussi@gmail.com>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
---
 kernel/trace/trace_workqueue.c |    5 ++---
 1 files changed, 2 insertions(+), 3 deletions(-)

diff --git a/kernel/trace/trace_workqueue.c b/kernel/trace/trace_workqueue.c
index 51ed57e..b995bc0 100644
--- a/kernel/trace/trace_workqueue.c
+++ b/kernel/trace/trace_workqueue.c
@@ -206,12 +206,11 @@ static void probe_workqueue_destruction(struct task_struct *wq_thread)
 {
 	/* Workqueue only execute on one cpu */
 	int cpu = cpumask_first(&wq_thread->cpus_allowed);
-	struct cpu_workqueue_stats *node, *next;
+	struct cpu_workqueue_stats *node;
 	unsigned long flags;
 
 	spin_lock_irqsave(&workqueue_cpu_stat(cpu)->lock, flags);
-	list_for_each_entry_safe(node, next, &workqueue_cpu_stat(cpu)->list,
-				 list) {
+	list_for_each_entry(node, &workqueue_cpu_stat(cpu)->list, list) {
 		struct workfunc_stats *wfstat, *wfstatnext;
 
 		if (node->task != wq_thread)
-- 
1.6.2.3


^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH 14/19] tracing/workqueue: add max execution time mesurement for per worklet
  2009-04-30  0:27 [PATCH 00/19] tracing/workqueue: events support/enhancements, worklets tracing, fixes Frederic Weisbecker
                   ` (12 preceding siblings ...)
  2009-04-30  0:27 ` [PATCH 13/19] tracing/workqueue: use list_for_each_entry instead of list_for_each_entry_safe in probe_workqueue_destruction() Frederic Weisbecker
@ 2009-04-30  0:27 ` Frederic Weisbecker
  2009-04-30  0:27 ` [PATCH 15/19] tracing/workqueue: add execution time average column Frederic Weisbecker
                   ` (5 subsequent siblings)
  19 siblings, 0 replies; 23+ messages in thread
From: Frederic Weisbecker @ 2009-04-30  0:27 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: LKML, Li Zefan, KOSAKI Motohiro, Zhao Lei, Steven Rostedt,
	Tom Zanussi, Oleg Nesterov, Frederic Weisbecker, Andrew Morton

From: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>

It is useful for driver-developer and system administrator to
know which worklet runs how many time.

Change Log:
v1: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
    add max execution time mesurement for per workqueue
v2: Zhao Lei <zhaolei@cn.fujitsu.com>
    add max execution time mesurement for per worklet instead of workqueue

Output is like:
 # CPU INSERTED EXECUTED M_EXECus TASKNAME:PID
 #  |     |        |        |      `-WORKFUNC
 #  |     |        |        |        |
     0      905      973          events/0:5
     0        1        1       37  |-reg_todo+0x0/0x41f
     0       13       12     2033  |-rt_worker_func+0x0/0x219
     0       49       48     3157  |-mce_work_fn+0x0/0x31
     0       14       13      939  |-check_corruption+0x0/0x2a
     0      707      706     3826  |-vmstat_update+0x0/0x33
     0        1        1     1846  |-power_supply_changed_work+0x0/0x3c
     0       11       11     2879  |-console_callback+0x0/0xe1
     0       30       29     1579  |-do_cache_clean+0x0/0x37
     0       74       74     2841  |-test_work+0x0/0x53
     0        3        2      252  |-rekey_seq_generator+0x0/0x5a
     0        1        1       21  |-grace_ender+0x0/0x14
     0        1        1     3701  `-flush_to_ldisc+0x0/0x15d
     0        0        0          cpuset:6
     0      658      658          khelper:7
     0      658      658     5735  `-__call_usermodehelper+0x0/0x77
     0        0        0          netns:11
 ...

[ Impact: provide per-worklet latency measurement ]

Signed-off-by: Zhao Lei <zhaolei@cn.fujitsu.com>
Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Tom Zanussi <tzanussi@gmail.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
---
 kernel/trace/trace_workqueue.c |   83 +++++++++++++++++++++++++++++++++------
 1 files changed, 70 insertions(+), 13 deletions(-)

diff --git a/kernel/trace/trace_workqueue.c b/kernel/trace/trace_workqueue.c
index b995bc0..798a738 100644
--- a/kernel/trace/trace_workqueue.c
+++ b/kernel/trace/trace_workqueue.c
@@ -24,6 +24,17 @@ struct workfunc_stats {
 	/* Protected by cpu workqueue lock */
 	unsigned int			inserted;
 	unsigned int			executed;
+
+	/*
+	 * save latest work_struct's pointer to use as identifier in
+	 * probe_worklet_complete, because we can't use work_struct->...
+	 * after worklet got executed
+	 */
+	void				*work;
+
+	/* save execution time temporarily for calculate executed time */
+	u64				start_time;
+	u64				max_executed_time;
 };
 
 /* A cpu workqueue thread */
@@ -143,6 +154,8 @@ found_wq:
 	list_for_each_entry(wfnode, &node->workfunclist, list)
 		if (wfnode->func == work->func) {
 			wfnode->executed++;
+			wfnode->start_time = trace_clock_global();
+			wfnode->work = work;
 			goto found_wf;
 		}
 	pr_debug("trace_workqueue: worklet not found\n");
@@ -153,6 +166,43 @@ end:
 	spin_unlock_irqrestore(&workqueue_cpu_stat(cpu)->lock, flags);
 }
 
+/* Complete of a work */
+static void
+probe_worklet_complete(struct task_struct *wq_thread, void *work)
+{
+	int cpu = cpumask_first(&wq_thread->cpus_allowed);
+	struct cpu_workqueue_stats *node;
+	struct workfunc_stats *wfnode;
+	unsigned long flags;
+
+	spin_lock_irqsave(&workqueue_cpu_stat(cpu)->lock, flags);
+
+	list_for_each_entry(node, &workqueue_cpu_stat(cpu)->list, list)
+		if (node->task == wq_thread)
+			goto found_wq;
+	pr_debug("trace_workqueue: workqueue not found\n");
+	goto end;
+
+found_wq:
+	list_for_each_entry(wfnode, &node->workfunclist, list) {
+		u64 executed_time;
+
+		if (wfnode->work != work)
+			continue;
+
+		executed_time = trace_clock_global() - wfnode->start_time;
+		if (executed_time > wfnode->max_executed_time)
+			wfnode->max_executed_time = executed_time;
+		goto found_wf;
+	}
+	pr_debug("trace_workqueue: worklet not found\n");
+	goto end;
+
+found_wf:
+end:
+	spin_unlock_irqrestore(&workqueue_cpu_stat(cpu)->lock, flags);
+}
+
 /* Creation of a cpu workqueue thread */
 static void probe_workqueue_creation(struct task_struct *wq_thread, int cpu)
 {
@@ -316,7 +366,7 @@ static int workqueue_stat_show(struct seq_file *s, void *p)
 
 	if (!wfstat->func) {
 		/* It is first dummy node, need to print workqueue info */
-		seq_printf(s, "%3d %6d     %6u       %s:%d\n",
+		seq_printf(s, "  %3d   %6d   %6u          %s:%d\n",
 			   cws->cpu,
 			   cws->inserted,
 			   cws->executed,
@@ -325,10 +375,11 @@ static int workqueue_stat_show(struct seq_file *s, void *p)
 	} else {
 		/* It is effect node, need to print workfunc info */
 		int lastwf = list_is_last(&wfstat->list, &cws->workfunclist);
-		seq_printf(s, "%3d %6d     %6u        %c-%pF\n",
+		seq_printf(s, "  %3d   %6d   %6u   %6llu  %c-%pF\n",
 			cws->cpu,
 			wfstat->inserted,
 			wfstat->executed,
+			ns2usecs(wfstat->max_executed_time),
 			lastwf ? '`' : '|',
 			wfstat->func);
 	}
@@ -338,9 +389,9 @@ static int workqueue_stat_show(struct seq_file *s, void *p)
 
 static int workqueue_stat_headers(struct seq_file *s)
 {
-	seq_printf(s, "# CPU  INSERTED  EXECUTED   TASKNAME:PID\n");
-	seq_printf(s, "# |      |         |        `-WORKFUNC\n");
-	seq_printf(s, "# |      |         |          |\n");
+	seq_printf(s, "# CPU INSERTED EXECUTED M_EXECus	 TASKNAME:PID\n");
+	seq_printf(s, "#  |     |        |        |      `-WORKFUNC\n");
+	seq_printf(s, "#  |     |        |        |        |\n");
 	return 0;
 }
 
@@ -379,19 +430,23 @@ int __init trace_workqueue_early_init(void)
 	ret = register_trace_worklet_enqueue_delayed(
 		probe_worklet_enqueue_delayed);
 	if (ret)
-		goto no_enqueue;
+		goto out_worklet_enqueue;
 
 	ret = register_trace_worklet_execute(probe_worklet_execute);
 	if (ret)
-		goto no_enqueue_delayed;
+		goto out_worklet_enqueue_delayed;
+
+	ret = register_trace_worklet_complete(probe_worklet_complete);
+	if (ret)
+		goto out_worklet_execute;
 
 	ret = register_trace_workqueue_creation(probe_workqueue_creation);
 	if (ret)
-		goto no_handler_entry;
+		goto out_worklet_complete;
 
 	ret = register_trace_workqueue_destruction(probe_workqueue_destruction);
 	if (ret)
-		goto no_creation;
+		goto out_workqueue_creation;
 
 	for_each_possible_cpu(cpu) {
 		spin_lock_init(&workqueue_cpu_stat(cpu)->lock);
@@ -400,13 +455,15 @@ int __init trace_workqueue_early_init(void)
 
 	return 0;
 
-no_creation:
+out_workqueue_creation:
 	unregister_trace_workqueue_creation(probe_workqueue_creation);
-no_handler_entry:
+out_worklet_complete:
+	unregister_trace_worklet_complete(probe_worklet_complete);
+out_worklet_execute:
 	unregister_trace_worklet_execute(probe_worklet_execute);
-no_enqueue_delayed:
+out_worklet_enqueue_delayed:
 	unregister_trace_worklet_enqueue_delayed(probe_worklet_enqueue_delayed);
-no_enqueue:
+out_worklet_enqueue:
 	unregister_trace_worklet_enqueue(probe_worklet_enqueue);
 out:
 	pr_warning("trace_workqueue: unable to trace workqueues\n");
-- 
1.6.2.3


^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH 15/19] tracing/workqueue: add execution time average column
  2009-04-30  0:27 [PATCH 00/19] tracing/workqueue: events support/enhancements, worklets tracing, fixes Frederic Weisbecker
                   ` (13 preceding siblings ...)
  2009-04-30  0:27 ` [PATCH 14/19] tracing/workqueue: add max execution time mesurement for per worklet Frederic Weisbecker
@ 2009-04-30  0:27 ` Frederic Weisbecker
  2009-04-30  0:27 ` [PATCH 16/19] tracing/workqueue: add open/release file handlers Frederic Weisbecker
                   ` (4 subsequent siblings)
  19 siblings, 0 replies; 23+ messages in thread
From: Frederic Weisbecker @ 2009-04-30  0:27 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: LKML, Li Zefan, Frederic Weisbecker, Zhao Lei, Steven Rostedt,
	Tom Zanussi, KOSAKI Motohiro, Oleg Nesterov, Andrew Morton

We recently got a new column on the stats which shows the maximum execution
time of a worklet. This field is useful to find the worst latency
case a worklet can induce and propagate to the tasks waiting for pending
worklets on the queue.

To complete this worst case information, it's also useful to get
the time execution average of a worklet so that we also get the typical
case of latency it causes.

The header M_EXECus is also turned into "MAX us", so that we can have
a pair of MAX us / AVG us which can reflect more easily the role
of these columns.

Example of the new trace:

 # CPU INSERTED EXECUTED    MAX us   AVG us   TASKNAME:PID
 #  |     |        |           |      ` |     -WORKFUNC
 #  |     |        |           |        |          |
    0     1254     1332                     events/0:9
    0        1        1       39       39   |-hpet_work+0x0/0x340
    0        3        2      275      233   |-rt_worker_func+0x0/0x400
    0      167      166       69        9   |-vmstat_update+0x0/0x40
    0      776      776   106345      255   |-fb_flashcursor+0x0/0x3c0
    0      142      142      504       14   |-console_callback+0x0/0x1e0
    0       80       80   46941155   1340510   |-test_work+0x0/0x64
    0        3        2        0        0   |-lru_add_drain_per_cpu+0x0/0x10
    0        1        1        3        3   |-sync_cmos_clock+0x0/0x140
    0       81       81      179       60   `-flush_to_ldisc+0x0/0x2f0
    0        0        0                     cpuset:11
    0     1354     1354                     khelper:12
    0     1354     1354   143449      365   `-__call_usermodehelper+0x0/0xe0
    0        0        0                     kintegrityd/0:174
    0       39       39                     kblockd/0:180
    0       26       26      170       42   |-blk_unplug_work+0x0/0x110
    0       13       13       70       14   `-cfq_kick_queue+0x0/0x50
    0        0        0                     kacpid:186
    0        0        0                     kacpi_notify:187
    0        0        0                     tifm:498
    0      312      312                     ata/0:507
    0      312      312       29       23   `-ata_pio_task+0x0/0x280
    0        0        0                     ata_aux:512
    0        0        0                     aio/0:867
    0        0        0                     crypto/0:897
    0        0        0                     scsi_tgtd/0:2367
    0        0        0                     iscsi_eh:2387
    0        0        0                     kpsmoused:2521
    0        0        0                     hd-audio0:2714
    0        0        0                     krxrpcd/0:2734
    0        5        5                     reiserfs/0:2896
    0        5        5   268602    53724   `-flush_async_commits+0x0/0x90
    1      510      580                     events/1:10
    1        1        1       67       67   |-hpet_work+0x0/0x340
    1      168      167       79       11   |-vmstat_update+0x0/0x40
    1        1        1      441      441   |-power_supply_changed_work+0x0/0x60
    1        1        1      836      836   |-sysfs_add_func+0x0/0xb0
    1       80       80   46940792   1346733   |-test_work+0x0/0x64
    1        1        0        0        0   |-rekey_seq_generator+0x0/0x70
    1        3        3        0        0   |-lru_add_drain_per_cpu+0x0/0x10
    1      164      164     7065       57   |-console_callback+0x0/0x1e0
    1       91       91      315       69   `-flush_to_ldisc+0x0/0x2f0
    1        0        0                     kintegrityd/1:177
    1       44       44                     kblockd/1:181
    1       30       30       59       36   |-blk_unplug_work+0x0/0x110
    1       14       14      167       20   `-cfq_kick_queue+0x0/0x50
    1      258      258                     ata/1:509
    1      258      258    12968      119   `-ata_pio_task+0x0/0x280
    1        0        0                     aio/1:870
    1        0        0                     crypto/1:898
    1        0        0                     scsi_tgtd/1:2370
    1        0        0                     krxrpcd/1:2757
    1        3        3                     reiserfs/1:2897
    1        3        3        4        4   `-flush_async_commits+0x0/0x90

[ Impact: report the typical latency case caused by a worklet ]

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Zhao Lei <zhaolei@cn.fujitsu.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Tom Zanussi <tzanussi@gmail.com>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
---
 kernel/trace/trace_workqueue.c |   16 +++++++++++-----
 1 files changed, 11 insertions(+), 5 deletions(-)

diff --git a/kernel/trace/trace_workqueue.c b/kernel/trace/trace_workqueue.c
index 798a738..1c6555d 100644
--- a/kernel/trace/trace_workqueue.c
+++ b/kernel/trace/trace_workqueue.c
@@ -35,6 +35,7 @@ struct workfunc_stats {
 	/* save execution time temporarily for calculate executed time */
 	u64				start_time;
 	u64				max_executed_time;
+	u64				total_time;
 };
 
 /* A cpu workqueue thread */
@@ -191,6 +192,7 @@ found_wq:
 			continue;
 
 		executed_time = trace_clock_global() - wfnode->start_time;
+		wfnode->total_time += executed_time;
 		if (executed_time > wfnode->max_executed_time)
 			wfnode->max_executed_time = executed_time;
 		goto found_wf;
@@ -366,7 +368,7 @@ static int workqueue_stat_show(struct seq_file *s, void *p)
 
 	if (!wfstat->func) {
 		/* It is first dummy node, need to print workqueue info */
-		seq_printf(s, "  %3d   %6d   %6u          %s:%d\n",
+		seq_printf(s, "  %3d   %6d   %6u                     %s:%d\n",
 			   cws->cpu,
 			   cws->inserted,
 			   cws->executed,
@@ -375,11 +377,13 @@ static int workqueue_stat_show(struct seq_file *s, void *p)
 	} else {
 		/* It is effect node, need to print workfunc info */
 		int lastwf = list_is_last(&wfstat->list, &cws->workfunclist);
-		seq_printf(s, "  %3d   %6d   %6u   %6llu  %c-%pF\n",
+		seq_printf(s, "  %3d   %6d   %6u   %6llu   %6llu   %c-%pF\n",
 			cws->cpu,
 			wfstat->inserted,
 			wfstat->executed,
 			ns2usecs(wfstat->max_executed_time),
+			!wfstat->executed ? 0 :
+				ns2usecs(wfstat->total_time / wfstat->executed),
 			lastwf ? '`' : '|',
 			wfstat->func);
 	}
@@ -389,9 +393,11 @@ static int workqueue_stat_show(struct seq_file *s, void *p)
 
 static int workqueue_stat_headers(struct seq_file *s)
 {
-	seq_printf(s, "# CPU INSERTED EXECUTED M_EXECus	 TASKNAME:PID\n");
-	seq_printf(s, "#  |     |        |        |      `-WORKFUNC\n");
-	seq_printf(s, "#  |     |        |        |        |\n");
+	seq_printf(s, "# CPU INSERTED EXECUTED    MAX us   AVG us"
+			"   TASKNAME:PID\n");
+	seq_printf(s, "#  |     |        |           |      ` |"
+			"     -WORKFUNC\n");
+	seq_printf(s, "#  |     |        |           |        |          |\n");
 	return 0;
 }
 
-- 
1.6.2.3


^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH 16/19] tracing/workqueue: add open/release file handlers
  2009-04-30  0:27 [PATCH 00/19] tracing/workqueue: events support/enhancements, worklets tracing, fixes Frederic Weisbecker
                   ` (14 preceding siblings ...)
  2009-04-30  0:27 ` [PATCH 15/19] tracing/workqueue: add execution time average column Frederic Weisbecker
@ 2009-04-30  0:27 ` Frederic Weisbecker
  2009-04-30  0:27 ` [PATCH 17/19] tracing/workqueue: defer workqueue stat release if needed Frederic Weisbecker
                   ` (3 subsequent siblings)
  19 siblings, 0 replies; 23+ messages in thread
From: Frederic Weisbecker @ 2009-04-30  0:27 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: LKML, Li Zefan, Frederic Weisbecker, Zhao Lei, Steven Rostedt,
	Tom Zanussi, KOSAKI Motohiro, Oleg Nesterov, Andrew Morton

A stat tracer might want to free one of its entries in an atomic
context, this is the case for the workqueue tracer.

The problem is that the stat tracing framework can store a pointer
to a freed entry and pass it to the stat_next() and stat_show()
handlers.

Since this is the only tracer which suffers such issue for now, it's
better to implement the protection on the workqueue tracer layer
using a reader refcount.

These open/release callbacks are then added to help implementing
such facility.

[ Impact: prepare workqueue tracer to protect its entries freeing ]

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Zhao Lei <zhaolei@cn.fujitsu.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Tom Zanussi <tzanussi@gmail.com>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
---
 kernel/trace/trace_stat.c |    8 ++++++++
 kernel/trace/trace_stat.h |    3 +++
 2 files changed, 11 insertions(+), 0 deletions(-)

diff --git a/kernel/trace/trace_stat.c b/kernel/trace/trace_stat.c
index fdde3a4..04f3285 100644
--- a/kernel/trace/trace_stat.c
+++ b/kernel/trace/trace_stat.c
@@ -210,8 +210,13 @@ static int tracing_stat_open(struct inode *inode, struct file *file)
 	ret = seq_open(file, &trace_stat_seq_ops);
 	if (!ret) {
 		struct seq_file *m = file->private_data;
+
 		m->private = session;
+		if (session->ts->file_open)
+			session->ts->file_open();
 		ret = stat_seq_init(session);
+		if (ret && session->ts->file_release)
+			session->ts->file_release();
 	}
 
 	return ret;
@@ -228,6 +233,9 @@ static int tracing_stat_release(struct inode *i, struct file *f)
 	reset_stat_session(session);
 	mutex_unlock(&session->stat_mutex);
 
+	if (session->ts->file_release)
+		session->ts->file_release();
+
 	return 0;
 }
 
diff --git a/kernel/trace/trace_stat.h b/kernel/trace/trace_stat.h
index f3546a2..915bf2e 100644
--- a/kernel/trace/trace_stat.h
+++ b/kernel/trace/trace_stat.h
@@ -20,6 +20,9 @@ struct tracer_stat {
 	int			(*stat_show)(struct seq_file *s, void *p);
 	/* Print the headers of your stat entries */
 	int			(*stat_headers)(struct seq_file *s);
+	/* Hooks on file operations */
+	void			(*file_open)(void);
+	void			(*file_release)(void);
 };
 
 /*
-- 
1.6.2.3


^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH 17/19] tracing/workqueue: defer workqueue stat release if needed
  2009-04-30  0:27 [PATCH 00/19] tracing/workqueue: events support/enhancements, worklets tracing, fixes Frederic Weisbecker
                   ` (15 preceding siblings ...)
  2009-04-30  0:27 ` [PATCH 16/19] tracing/workqueue: add open/release file handlers Frederic Weisbecker
@ 2009-04-30  0:27 ` Frederic Weisbecker
  2009-04-30  0:27 ` [PATCH 18/19] tracing/workqueue: use the original cpu affinity on probe_workqueue_destruction Frederic Weisbecker
                   ` (2 subsequent siblings)
  19 siblings, 0 replies; 23+ messages in thread
From: Frederic Weisbecker @ 2009-04-30  0:27 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: LKML, Li Zefan, Frederic Weisbecker, Zhao Lei, Steven Rostedt,
	Tom Zanussi, KOSAKI Motohiro, Oleg Nesterov, Andrew Morton

The workqueue tracer might free its entries in atomic context.
But if a reader is already present in the stat file, it is
not safe to free any entry, otherwise a pointer to a freed entry
could be passed to stat_show/stat_next callbacks.

What we do here is listening to the file events open() and release()
and keep track of the number of readers in our file.

So once an entry has to be freed, if there is a reader, we store the
entry to a temporary list and we defer the actual freeing until
all readers go away.
Otherwise we can free it safely without the need to defer.

[ Impact: fix possible freed pointer dereference ]

Reported-by: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Zhao Lei <zhaolei@cn.fujitsu.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Tom Zanussi <tzanussi@gmail.com>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
---
 kernel/trace/trace_workqueue.c |   85 +++++++++++++++++++++++++++++++++------
 1 files changed, 72 insertions(+), 13 deletions(-)

diff --git a/kernel/trace/trace_workqueue.c b/kernel/trace/trace_workqueue.c
index 1c6555d..f39c5d3 100644
--- a/kernel/trace/trace_workqueue.c
+++ b/kernel/trace/trace_workqueue.c
@@ -70,6 +70,13 @@ struct workqueue_global_stats {
 static DEFINE_PER_CPU(struct workqueue_global_stats, all_workqueue_stat);
 #define workqueue_cpu_stat(cpu) (&per_cpu(all_workqueue_stat, cpu))
 
+/* To defer any workqueue freeing, we place them in this temporary list */
+static LIST_HEAD(free_wq_list);
+static DEFINE_SPINLOCK(free_wq_lock);
+
+/* Number of readers in our stat file */
+static int wq_file_ref;
+
 /*
  * Update record when insert a work into workqueue
  * Caller need to hold cpu_workqueue_stats spin_lock
@@ -253,6 +260,17 @@ err_alloc_cws:
 	return;
 }
 
+static void free_workqueue_stats(struct cpu_workqueue_stats *stat)
+{
+	struct workfunc_stats *wfstat, *next;
+
+	list_for_each_entry_safe(wfstat, next, &stat->workfunclist, list) {
+			list_del(&wfstat->list);
+			kfree(wfstat);
+	}
+	kfree(stat);
+}
+
 /* Destruction of a cpu workqueue thread */
 static void probe_workqueue_destruction(struct task_struct *wq_thread)
 {
@@ -263,19 +281,25 @@ static void probe_workqueue_destruction(struct task_struct *wq_thread)
 
 	spin_lock_irqsave(&workqueue_cpu_stat(cpu)->lock, flags);
 	list_for_each_entry(node, &workqueue_cpu_stat(cpu)->list, list) {
-		struct workfunc_stats *wfstat, *wfstatnext;
 
 		if (node->task != wq_thread)
 			continue;
 
-		list_for_each_entry_safe(wfstat, wfstatnext,
-					 &node->workfunclist, list) {
-			list_del(&wfstat->list);
-			kfree(wfstat);
-		}
-
 		list_del(&node->list);
-		kfree(node);
+
+		/*
+		 * We actually defer this workqueue freeing and
+		 * its worklets until no more readers are present on our
+		 * stat file. We are in atomic context here and can't wait
+		 * for the file and the previous copied entries that point
+		 * to this workqueue to be released.
+		 */
+		spin_lock(&free_wq_lock);
+		if (!wq_file_ref)
+			free_workqueue_stats(node);
+		else
+			list_add_tail(&node->list, &free_wq_list);
+		spin_unlock(&free_wq_lock);
 
 		goto found;
 	}
@@ -391,6 +415,39 @@ static int workqueue_stat_show(struct seq_file *s, void *p)
 	return 0;
 }
 
+/*
+ * Here we are sure that we have no more readers on our stat file
+ * and that further readers will block until we return from this function.
+ * We can then safely free these pending entries
+ */
+static void workqueue_stat_file_release(void)
+{
+	unsigned long flags;
+
+	spin_lock_irqsave(&free_wq_lock, flags);
+
+	if (!--wq_file_ref) {
+		struct cpu_workqueue_stats *node, *next;
+
+		list_for_each_entry_safe(node, next, &free_wq_list, list) {
+			list_del(&node->list);
+			free_workqueue_stats(node);
+		}
+	}
+
+	spin_unlock_irqrestore(&free_wq_lock, flags);
+};
+
+static void workqueue_stat_file_open(void)
+{
+	unsigned long flags;
+
+	spin_lock_irqsave(&free_wq_lock, flags);
+	wq_file_ref++;
+	spin_unlock_irqrestore(&free_wq_lock, flags);
+}
+
+		/**/
 static int workqueue_stat_headers(struct seq_file *s)
 {
 	seq_printf(s, "# CPU INSERTED EXECUTED    MAX us   AVG us"
@@ -402,11 +459,13 @@ static int workqueue_stat_headers(struct seq_file *s)
 }
 
 struct tracer_stat workqueue_stats __read_mostly = {
-	.name = "workqueues",
-	.stat_start = workqueue_stat_start,
-	.stat_next = workqueue_stat_next,
-	.stat_show = workqueue_stat_show,
-	.stat_headers = workqueue_stat_headers
+	.name		 = "workqueues",
+	.stat_start	 = workqueue_stat_start,
+	.stat_next	 = workqueue_stat_next,
+	.stat_show	 = workqueue_stat_show,
+	.stat_headers	 = workqueue_stat_headers,
+	.file_open	 = workqueue_stat_file_open,
+	.file_open	 = workqueue_stat_file_release,
 };
 
 
-- 
1.6.2.3


^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH 18/19] tracing/workqueue: use the original cpu affinity on probe_workqueue_destruction
  2009-04-30  0:27 [PATCH 00/19] tracing/workqueue: events support/enhancements, worklets tracing, fixes Frederic Weisbecker
                   ` (16 preceding siblings ...)
  2009-04-30  0:27 ` [PATCH 17/19] tracing/workqueue: defer workqueue stat release if needed Frederic Weisbecker
@ 2009-04-30  0:27 ` Frederic Weisbecker
  2009-04-30  0:27 ` [PATCH 19/19] tracing/workqueue: provide documentation for the workqueue tracer Frederic Weisbecker
  2009-04-30  8:45 ` [PATCH 00/19] tracing/workqueue: events support/enhancements, worklets tracing, fixes Ingo Molnar
  19 siblings, 0 replies; 23+ messages in thread
From: Frederic Weisbecker @ 2009-04-30  0:27 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: LKML, Li Zefan, Frederic Weisbecker, Zhao Lei, Steven Rostedt,
	Tom Zanussi, KOSAKI Motohiro, Oleg Nesterov, Andrew Morton

Currently, when a cpu workqueue thread is cleaned up, we retrieve its cpu
affinity by looking its task::cpus_allowed mask field.

But the cpu is no longer available on CPU_POST_DEAD case and this
task has been migrated, therefore its cpus_allowed mask has changed
and does not contain this cpu anymore. It means that we are looking
at the wrong cpu list to find it.

We solve it here by passing the original cpu of the workqueue thread to
cleanup_workqueue_thread() and to trace_workqueue_destruction().

[ Impact: fix possible memory leak ]

Reported-by: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Zhao Lei <zhaolei@cn.fujitsu.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Tom Zanussi <tzanussi@gmail.com>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
---
 include/trace/events/workqueue.h |    6 ++++--
 kernel/trace/trace_workqueue.c   |    4 +---
 kernel/workqueue.c               |    8 ++++----
 3 files changed, 9 insertions(+), 9 deletions(-)

diff --git a/include/trace/events/workqueue.h b/include/trace/events/workqueue.h
index e4c74f2..49608c7 100644
--- a/include/trace/events/workqueue.h
+++ b/include/trace/events/workqueue.h
@@ -175,18 +175,20 @@ TRACE_EVENT(workqueue_flush,
 
 TRACE_EVENT(workqueue_destruction,
 
-	TP_PROTO(struct task_struct *wq_thread),
+	TP_PROTO(struct task_struct *wq_thread, int cpu),
 
-	TP_ARGS(wq_thread),
+	TP_ARGS(wq_thread, cpu),
 
 	TP_STRUCT__entry(
 		__array(char,	thread_comm,	TASK_COMM_LEN)
 		__field(pid_t,	thread_pid)
+		__field(int,	cpu)
 	),
 
 	TP_fast_assign(
 		memcpy(__entry->thread_comm, wq_thread->comm, TASK_COMM_LEN);
 		__entry->thread_pid	= wq_thread->pid;
+		__entry->cpu		= cpu;
 	),
 
 	TP_printk("thread=%s:%d", __entry->thread_comm, __entry->thread_pid)
diff --git a/kernel/trace/trace_workqueue.c b/kernel/trace/trace_workqueue.c
index f39c5d3..eafb4a5 100644
--- a/kernel/trace/trace_workqueue.c
+++ b/kernel/trace/trace_workqueue.c
@@ -272,10 +272,8 @@ static void free_workqueue_stats(struct cpu_workqueue_stats *stat)
 }
 
 /* Destruction of a cpu workqueue thread */
-static void probe_workqueue_destruction(struct task_struct *wq_thread)
+static void probe_workqueue_destruction(struct task_struct *wq_thread, int cpu)
 {
-	/* Workqueue only execute on one cpu */
-	int cpu = cpumask_first(&wq_thread->cpus_allowed);
 	struct cpu_workqueue_stats *node;
 	unsigned long flags;
 
diff --git a/kernel/workqueue.c b/kernel/workqueue.c
index 0cc14b9..7112850 100644
--- a/kernel/workqueue.c
+++ b/kernel/workqueue.c
@@ -869,7 +869,7 @@ struct workqueue_struct *__create_workqueue_key(const char *name,
 }
 EXPORT_SYMBOL_GPL(__create_workqueue_key);
 
-static void cleanup_workqueue_thread(struct cpu_workqueue_struct *cwq)
+static void cleanup_workqueue_thread(struct cpu_workqueue_struct *cwq, int cpu)
 {
 	/*
 	 * Our caller is either destroy_workqueue() or CPU_POST_DEAD,
@@ -892,7 +892,7 @@ static void cleanup_workqueue_thread(struct cpu_workqueue_struct *cwq)
 	 * checks list_empty(), and a "normal" queue_work() can't use
 	 * a dead CPU.
 	 */
-	trace_workqueue_destruction(cwq->thread);
+	trace_workqueue_destruction(cwq->thread, cpu);
 	kthread_stop(cwq->thread);
 	cwq->thread = NULL;
 }
@@ -914,7 +914,7 @@ void destroy_workqueue(struct workqueue_struct *wq)
 	spin_unlock(&workqueue_lock);
 
 	for_each_cpu(cpu, cpu_map)
-		cleanup_workqueue_thread(per_cpu_ptr(wq->cpu_wq, cpu));
+		cleanup_workqueue_thread(per_cpu_ptr(wq->cpu_wq, cpu), cpu);
  	cpu_maps_update_done();
 
 	free_percpu(wq->cpu_wq);
@@ -958,7 +958,7 @@ undo:
 		case CPU_UP_CANCELED:
 			start_workqueue_thread(cwq, -1);
 		case CPU_POST_DEAD:
-			cleanup_workqueue_thread(cwq);
+			cleanup_workqueue_thread(cwq, cpu);
 			break;
 		}
 	}
-- 
1.6.2.3


^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH 19/19] tracing/workqueue: provide documentation for the workqueue tracer
  2009-04-30  0:27 [PATCH 00/19] tracing/workqueue: events support/enhancements, worklets tracing, fixes Frederic Weisbecker
                   ` (17 preceding siblings ...)
  2009-04-30  0:27 ` [PATCH 18/19] tracing/workqueue: use the original cpu affinity on probe_workqueue_destruction Frederic Weisbecker
@ 2009-04-30  0:27 ` Frederic Weisbecker
  2009-04-30  1:19   ` KOSAKI Motohiro
  2009-04-30  8:45 ` [PATCH 00/19] tracing/workqueue: events support/enhancements, worklets tracing, fixes Ingo Molnar
  19 siblings, 1 reply; 23+ messages in thread
From: Frederic Weisbecker @ 2009-04-30  0:27 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: LKML, Li Zefan, Frederic Weisbecker, Zhao Lei, Steven Rostedt,
	Tom Zanussi, KOSAKI Motohiro, Oleg Nesterov, Andrew Morton

Provide a quick documentation for both stat and event tracing with
the workqueue tracer.

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Zhao Lei <zhaolei@cn.fujitsu.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Tom Zanussi <tzanussi@gmail.com>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
---
 Documentation/trace/workqueue.txt |  114 +++++++++++++++++++++++++++++++++++++
 1 files changed, 114 insertions(+), 0 deletions(-)

diff --git a/Documentation/trace/workqueue.txt b/Documentation/trace/workqueue.txt
new file mode 100644
index 0000000..4019a8a
--- /dev/null
+++ b/Documentation/trace/workqueue.txt
@@ -0,0 +1,114 @@
+The workqueue tracer is a tool that can provide informations
+about worklets execution frequency, latency and worklets/workqueues
+lifecycle.
+
+== A statistical/histogram tracer ==
+
+This tracer provides a snapshot of the workqueues and worklets
+current state.
+
+Eg:
+
+# CPU INSERTED EXECUTED    MAX us   AVG us   TASKNAME:PID
+#  |     |        |           |      ` |     -WORKFUNC
+#  |     |        |           |        |          |
+    0     5903     5977                     events/0:9
+    0        1        1       38       38   |-hpet_work
+    0       49       48      295      224   |-rt_worker_func
+    0     2934     2933       73       11   |-vmstat_update
+    0      935      935    13143      264   |-console_callback
+    0       80       80   1634632587   45804402   |-test_work
+    0       10        9      239       85   |-rekey_seq_generator
+    0       10       10        0        0   |-lru_add_drain_per_cpu
+    0        1        1        4        4   |-sync_cmos_clock
+    0      651      651      366       65   |-flush_to_ldisc
+    0     1105     1105    13171      256   |-fb_flashcursor
+    0      127      120      213       18   `-dst_gc_task
+    0        0        0                     cpuset:11
+    0     1409     1409                     khelper:12
+    0     1409     1409   124030      754   `-__call_usermodehelper
+    0        0        0                     kintegrityd/0:174
+    0     5823     5823                     kblockd/0:179
+    0      336      336      176       40   |-blk_unplug_work
+    0     5487     5487      141       21   `-cfq_kick_queue
+    0        0        0                     kacpid:186
+    0        0        0                     kacpi_notify:187
+    0        0        0                     tifm:498
+    0    19505    19505                     ata/0:508
+    0    19505    19505      114       24   `-ata_pio_task
+    0        0        0                     ata_aux:511
+    0        0        0                     aio/0:871
+    0        0        0                     crypto/0:897
+    0        0        0                     scsi_tgtd/0:2368
+    0        0        0                     iscsi_eh:2386
+    0        0        0                     kpsmoused:2525
+    0        0        0                     hd-audio0:2714
+    0        0        0                     krxrpcd/0:2752
+    0      242      242                     reiserfs/0:2896
+    0      242      242   4563280    21918   `-flush_async_commits
+    1        0        0                     reiserfs/1:6438
+    1        0        0                     krxrpcd/1:6439
+    1        0        0                     scsi_tgtd/1:6440
+    1        0        0                     crypto/1:6441
+    1        0        0                     aio/1:6442
+    1       62       62                     ata/1:6443
+    1       62       62       28       24   `-ata_pio_task
+    1        9        9                     kblockd/1:6444
+    1        9        9      102       54   `-blk_unplug_work
+    1        0        0                     kintegrityd/1:6445
+    1     2544     2545                     events/1:6446
+    1     2541     2540       22       11   |-vmstat_update
+    1        1        1       15       15   |-hpet_work
+    1        2        2       10        6   `-lru_add_drain_per_cpu
+
+For each worklet, you can find how much it has been enqueued
+and executed. Also the MAX us and AVG us give you respectively
+the maximum and average time of execution in microseconds
+for a given worklet.
+
+A series of worklets is grouped under a workqueue using the
+following pattern:
+
+cpu_workqueue_thread[/cpu_nr]:pid
+|-work1
+|-work2
+...
+`-last_work
+
+If cpu_nr is not given, then the workqueue thread is
+a singlethread workqueue, bound to every possible cpu.
+
+Use the following commands to read a snapshot:
+
+mount -t debugfs debug /sys/kernel/debug/
+cat /sys/kernel/debug/tracing/trace_stat/workqueues
+
+
+== An event tracer ==
+
+The workqueue tracer also uses a set of events tracepoints which
+can be used to trace each key event from every workqueues or worklets.
+
+You can refer to Documentation/trace/events.txt for a detailed
+use of the events tracepoints. How to individually toggle them, how
+to use filters, which field is what, etc...
+
+For a simple capture of all workqueues events:
+
+# mount -t debugfs debug /sys/kernel/debug/
+# cd /sys/kernel/debug/tracing/events/workqueue
+# ls
+filter          worklet_complete  worklet_enqueue_delayed  workqueue_creation     workqueue_flush
+worklet_cancel  worklet_enqueue   worklet_execute          workqueue_destruction
+# for e in $(ls); do echo 1 > $e/enable; done
+# cat /sys/kernel/debug/tracing/trace
+
+ #
+ # TASK-PID    CPU#    TIMESTAMP  FUNCTION
+ #   | |       |          |         |
+
+<...>-5627  [000]  4597.858645: worklet_enqueue: thread=ata/0:508 func=ata_pio_task+0x0/0x280 cpu=0
+<...>-508   [000]  4597.858660: worklet_execute: thread=ata/0:508 work=ffff88007e9e67b8 func=ata_pio_task+0x0/0x280
+<...>-508   [000]  4597.858684: worklet_complete: thread=ata/0:508 work=ffff88007e9e67b8
+<...>-2437  [000]  4597.861259: worklet_cancel: func=ata_pio_task+0x0/0x280
+[...]
-- 
1.6.2.3


^ permalink raw reply related	[flat|nested] 23+ messages in thread

* Re: [PATCH 19/19] tracing/workqueue: provide documentation for the workqueue tracer
  2009-04-30  0:27 ` [PATCH 19/19] tracing/workqueue: provide documentation for the workqueue tracer Frederic Weisbecker
@ 2009-04-30  1:19   ` KOSAKI Motohiro
  0 siblings, 0 replies; 23+ messages in thread
From: KOSAKI Motohiro @ 2009-04-30  1:19 UTC (permalink / raw)
  To: Frederic Weisbecker
  Cc: kosaki.motohiro, Ingo Molnar, LKML, Li Zefan, Zhao Lei,
	Steven Rostedt, Tom Zanussi, Oleg Nesterov, Andrew Morton

> Provide a quick documentation for both stat and event tracing with
> the workqueue tracer.
> 
> Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
> Cc: Zhao Lei <zhaolei@cn.fujitsu.com>
> Cc: Steven Rostedt <rostedt@goodmis.org>
> Cc: Tom Zanussi <tzanussi@gmail.com>
> Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
> Cc: Oleg Nesterov <oleg@redhat.com>
> Cc: Andrew Morton <akpm@linux-foundation.org>
> ---
>  Documentation/trace/workqueue.txt |  114 +++++++++++++++++++++++++++++++++++++
>  1 files changed, 114 insertions(+), 0 deletions(-)

Looks good to me.
Thanks for very good documentation!







^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH 00/19] tracing/workqueue: events support/enhancements, worklets tracing, fixes
  2009-04-30  0:27 [PATCH 00/19] tracing/workqueue: events support/enhancements, worklets tracing, fixes Frederic Weisbecker
                   ` (18 preceding siblings ...)
  2009-04-30  0:27 ` [PATCH 19/19] tracing/workqueue: provide documentation for the workqueue tracer Frederic Weisbecker
@ 2009-04-30  8:45 ` Ingo Molnar
  2009-04-30 12:48   ` Frederic Weisbecker
  19 siblings, 1 reply; 23+ messages in thread
From: Ingo Molnar @ 2009-04-30  8:45 UTC (permalink / raw)
  To: Frederic Weisbecker
  Cc: LKML, Li Zefan, Zhao Lei, Steven Rostedt, Tom Zanussi,
	Oleg Nesterov, Andrew Morton, KOSAKI Motohiro


* Frederic Weisbecker <fweisbec@gmail.com> wrote:

> Hi,
> 
> This patchset brings various features for the workqueue tracer.
> 
> The support of events tracing (tracepoint expanding):
> 	- workqueue_creation
> 	- workqueue_flush
> 	- workqueue_destruction
> 
>     Also for the worklets:
> 	- worklet_enqueue
> 	- worklet_enqueue_delayed
> 	- worklet_execute
> 	- worklet_complete
> 	- worklet_cancel
> 
> Now, most of the lifecycle of worklets and workqueues is available.
> 
> The histogram tracing now also supports the worklets. You can
> measure the max and average execution time of a worklet.
> 
> A good bunch of Oleg's comments have been addressed, at least the
> critical ones (races which can cause freed pointer dereferences, memory
> leak, etc...). The rest is more about the design itself, such as using
> the workqueue as the identifier instead of the thread and other things
> that are in discussion.
> 
> Thanks!
> 
> Frederic.
> 
> The following changes since commit a0e39ed378fb6ba916522764cd508fa7d42ad495:
>   Heiko Carstens (1):
>         tracing: fix build failure on s390
> 
> are available in the git repository at:
> 
>   git://git.kernel.org/pub/scm/linux/kernel/git/frederic/random-tracing.git tracing/workqueue
> 
> Frederic Weisbecker (6):
>       tracing/workqueue: turn cpu_workqueue_stats::inserted into unsigned int
>       tracing/workqueue: add execution time average column
>       tracing/workqueue: add open/release file handlers
>       tracing/workqueue: defer workqueue stat release if needed
>       tracing/workqueue: use the original cpu affinity on probe_workqueue_destruction
>       tracing/workqueue: provide documentation for the workqueue tracer
> 
> KOSAKI Motohiro (2):
>       workqueue_tracepoint: introduce workqueue_handler_exit tracepoint and rename workqueue_execution to workqueue_handler_entry
>       tracing/workqueue: add max execution time mesurement for per worklet
> 
> Zhaolei (11):
>       ftrace, workqueuetrace: make workqueue tracepoints use TRACE_EVENT macro
>       trace_workqueue: use list_for_each_entry() instead of list_for_each_entry_safe()
>       trace_workqueue: remove cpu_workqueue_stats->first_entry
>       trace_workqueue: remove blank line between each cpu
>       trace_workqueue: add worklet information
>       workqueue_tracepoint: add workqueue_flush and worklet_cancel tracepoint
>       workqueue_tracepoint: change tracepoint name to fit worklet and workqueue lifecycle
>       workqueue_trace: separate worklet_insertion into worklet_enqueue and worklet_enqueue_delayed
>       tracing/workqueue: turn workfunc_stats::inserted into unsigned int
>       tracing/workqueue: avoid accessing task_struct's member variable in stat file read
>       tracing/workqueue: use list_for_each_entry instead of list_for_each_entry_safe in probe_workqueue_destruction()
> 
>  Documentation/trace/workqueue.txt |  114 ++++++++++
>  include/trace/events/workqueue.h  |  200 +++++++++++++++++
>  include/trace/workqueue.h         |   25 --
>  kernel/trace/trace_stat.c         |    8 +
>  kernel/trace/trace_stat.h         |    3 +
>  kernel/trace/trace_workqueue.c    |  444 +++++++++++++++++++++++++++++--------
>  kernel/workqueue.c                |   45 +++--
>  7 files changed, 697 insertions(+), 142 deletions(-)
>  create mode 100644 Documentation/trace/workqueue.txt
>  create mode 100644 include/trace/events/workqueue.h
>  delete mode 100644 include/trace/workqueue.h

I think one worry here - and this mirrors some of Andrew's concerns 
- is the debug code to instrumented code ratio. There's 500 lines of 
instrumentation for 1000 lines of code.

Is there a chance we can reduce that ratio, by factoring out bits of 
kernel/trace/trace_workqueue.c and perhaps compressing 
include/trace/events/workqueue.h some more?

Also, it would be nice to check how intuitive it is to visualize 
workqueue related details using the user-space ftrace-analyzer tool. 

Kernel developers are one thing - but most of the Linux developers 
are user-space developers so any workqueue tracing details have to 
be intuitive (and useful) to them. (even though workqueues are a 
fundamentally in-kernel abstraction)

	Ingo

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH 00/19] tracing/workqueue: events support/enhancements, worklets tracing, fixes
  2009-04-30  8:45 ` [PATCH 00/19] tracing/workqueue: events support/enhancements, worklets tracing, fixes Ingo Molnar
@ 2009-04-30 12:48   ` Frederic Weisbecker
  0 siblings, 0 replies; 23+ messages in thread
From: Frederic Weisbecker @ 2009-04-30 12:48 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: LKML, Li Zefan, Zhao Lei, Steven Rostedt, Tom Zanussi,
	Oleg Nesterov, Andrew Morton, KOSAKI Motohiro

On Thu, Apr 30, 2009 at 10:45:29AM +0200, Ingo Molnar wrote:
> 
> * Frederic Weisbecker <fweisbec@gmail.com> wrote:
> 
> > Hi,
> > 
> > This patchset brings various features for the workqueue tracer.
> > 
> > The support of events tracing (tracepoint expanding):
> > 	- workqueue_creation
> > 	- workqueue_flush
> > 	- workqueue_destruction
> > 
> >     Also for the worklets:
> > 	- worklet_enqueue
> > 	- worklet_enqueue_delayed
> > 	- worklet_execute
> > 	- worklet_complete
> > 	- worklet_cancel
> > 
> > Now, most of the lifecycle of worklets and workqueues is available.
> > 
> > The histogram tracing now also supports the worklets. You can
> > measure the max and average execution time of a worklet.
> > 
> > A good bunch of Oleg's comments have been addressed, at least the
> > critical ones (races which can cause freed pointer dereferences, memory
> > leak, etc...). The rest is more about the design itself, such as using
> > the workqueue as the identifier instead of the thread and other things
> > that are in discussion.
> > 
> > Thanks!
> > 
> > Frederic.
> > 
> > The following changes since commit a0e39ed378fb6ba916522764cd508fa7d42ad495:
> >   Heiko Carstens (1):
> >         tracing: fix build failure on s390
> > 
> > are available in the git repository at:
> > 
> >   git://git.kernel.org/pub/scm/linux/kernel/git/frederic/random-tracing.git tracing/workqueue
> > 
> > Frederic Weisbecker (6):
> >       tracing/workqueue: turn cpu_workqueue_stats::inserted into unsigned int
> >       tracing/workqueue: add execution time average column
> >       tracing/workqueue: add open/release file handlers
> >       tracing/workqueue: defer workqueue stat release if needed
> >       tracing/workqueue: use the original cpu affinity on probe_workqueue_destruction
> >       tracing/workqueue: provide documentation for the workqueue tracer
> > 
> > KOSAKI Motohiro (2):
> >       workqueue_tracepoint: introduce workqueue_handler_exit tracepoint and rename workqueue_execution to workqueue_handler_entry
> >       tracing/workqueue: add max execution time mesurement for per worklet
> > 
> > Zhaolei (11):
> >       ftrace, workqueuetrace: make workqueue tracepoints use TRACE_EVENT macro
> >       trace_workqueue: use list_for_each_entry() instead of list_for_each_entry_safe()
> >       trace_workqueue: remove cpu_workqueue_stats->first_entry
> >       trace_workqueue: remove blank line between each cpu
> >       trace_workqueue: add worklet information
> >       workqueue_tracepoint: add workqueue_flush and worklet_cancel tracepoint
> >       workqueue_tracepoint: change tracepoint name to fit worklet and workqueue lifecycle
> >       workqueue_trace: separate worklet_insertion into worklet_enqueue and worklet_enqueue_delayed
> >       tracing/workqueue: turn workfunc_stats::inserted into unsigned int
> >       tracing/workqueue: avoid accessing task_struct's member variable in stat file read
> >       tracing/workqueue: use list_for_each_entry instead of list_for_each_entry_safe in probe_workqueue_destruction()
> > 
> >  Documentation/trace/workqueue.txt |  114 ++++++++++
> >  include/trace/events/workqueue.h  |  200 +++++++++++++++++
> >  include/trace/workqueue.h         |   25 --
> >  kernel/trace/trace_stat.c         |    8 +
> >  kernel/trace/trace_stat.h         |    3 +
> >  kernel/trace/trace_workqueue.c    |  444 +++++++++++++++++++++++++++++--------
> >  kernel/workqueue.c                |   45 +++--
> >  7 files changed, 697 insertions(+), 142 deletions(-)
> >  create mode 100644 Documentation/trace/workqueue.txt
> >  create mode 100644 include/trace/events/workqueue.h
> >  delete mode 100644 include/trace/workqueue.h
> 
> I think one worry here - and this mirrors some of Andrew's concerns 
> - is the debug code to instrumented code ratio. There's 500 lines of 
> instrumentation for 1000 lines of code.
> 
> Is there a chance we can reduce that ratio, by factoring out bits of 
> kernel/trace/trace_workqueue.c and perhaps compressing 
> include/trace/events/workqueue.h some more?



Hehe, indeed. By adding code here and there, I guess we
forgot to unify common bits. I will try to factorize that.


 
> Also, it would be nice to check how intuitive it is to visualize 
> workqueue related details using the user-space ftrace-analyzer tool. 
> 
> Kernel developers are one thing - but most of the Linux developers 
> are user-space developers so any workqueue tracing details have to 
> be intuitive (and useful) to them. (even though workqueues are a 
> fundamentally in-kernel abstraction)
> 
> 	Ingo


Ok, I will see what it gives and give some examples.


^ permalink raw reply	[flat|nested] 23+ messages in thread

end of thread, other threads:[~2009-04-30 12:48 UTC | newest]

Thread overview: 23+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2009-04-30  0:27 [PATCH 00/19] tracing/workqueue: events support/enhancements, worklets tracing, fixes Frederic Weisbecker
2009-04-30  0:27 ` [PATCH 01/19] ftrace, workqueuetrace: make workqueue tracepoints use TRACE_EVENT macro Frederic Weisbecker
2009-04-30  0:27 ` [PATCH 02/19] trace_workqueue: use list_for_each_entry() instead of list_for_each_entry_safe() Frederic Weisbecker
2009-04-30  0:27 ` [PATCH 03/19] trace_workqueue: remove cpu_workqueue_stats->first_entry Frederic Weisbecker
2009-04-30  0:27 ` [PATCH 04/19] trace_workqueue: remove blank line between each cpu Frederic Weisbecker
2009-04-30  0:27 ` [PATCH 05/19] trace_workqueue: add worklet information Frederic Weisbecker
2009-04-30  0:27 ` [PATCH 06/19] tracing/workqueue: turn cpu_workqueue_stats::inserted into unsigned int Frederic Weisbecker
2009-04-30  0:27 ` [PATCH 07/19] workqueue_tracepoint: introduce workqueue_handler_exit tracepoint and rename workqueue_execution to workqueue_handler_entry Frederic Weisbecker
2009-04-30  0:27 ` [PATCH 08/19] workqueue_tracepoint: add workqueue_flush and worklet_cancel tracepoint Frederic Weisbecker
2009-04-30  0:27 ` [PATCH 09/19] workqueue_tracepoint: change tracepoint name to fit worklet and workqueue lifecycle Frederic Weisbecker
2009-04-30  0:27 ` [PATCH 10/19] workqueue_trace: separate worklet_insertion into worklet_enqueue and worklet_enqueue_delayed Frederic Weisbecker
2009-04-30  0:27 ` [PATCH 11/19] tracing/workqueue: turn workfunc_stats::inserted into unsigned int Frederic Weisbecker
2009-04-30  0:27 ` [PATCH 12/19] tracing/workqueue: avoid accessing task_struct's member variable in stat file read Frederic Weisbecker
2009-04-30  0:27 ` [PATCH 13/19] tracing/workqueue: use list_for_each_entry instead of list_for_each_entry_safe in probe_workqueue_destruction() Frederic Weisbecker
2009-04-30  0:27 ` [PATCH 14/19] tracing/workqueue: add max execution time mesurement for per worklet Frederic Weisbecker
2009-04-30  0:27 ` [PATCH 15/19] tracing/workqueue: add execution time average column Frederic Weisbecker
2009-04-30  0:27 ` [PATCH 16/19] tracing/workqueue: add open/release file handlers Frederic Weisbecker
2009-04-30  0:27 ` [PATCH 17/19] tracing/workqueue: defer workqueue stat release if needed Frederic Weisbecker
2009-04-30  0:27 ` [PATCH 18/19] tracing/workqueue: use the original cpu affinity on probe_workqueue_destruction Frederic Weisbecker
2009-04-30  0:27 ` [PATCH 19/19] tracing/workqueue: provide documentation for the workqueue tracer Frederic Weisbecker
2009-04-30  1:19   ` KOSAKI Motohiro
2009-04-30  8:45 ` [PATCH 00/19] tracing/workqueue: events support/enhancements, worklets tracing, fixes Ingo Molnar
2009-04-30 12:48   ` Frederic Weisbecker

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.