linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 0/4] [git pull] tip/tracing/ftrace
@ 2009-02-25  2:56 Steven Rostedt
  2009-02-25  2:56 ` [PATCH 1/4] tracing: add DEFINE_TRACE_FMT to tracepoint.h Steven Rostedt
                   ` (3 more replies)
  0 siblings, 4 replies; 51+ messages in thread
From: Steven Rostedt @ 2009-02-25  2:56 UTC (permalink / raw)
  To: linux-kernel
  Cc: Ingo Molnar, Andrew Morton, Thomas Gleixner, Peter Zijlstra,
	Frederic Weisbecker, Theodore Tso, Arjan van de Ven,
	Pekka Paalanen, Arnaldo Carvalho de Melo, Jason Baron,
	Martin Bligh, Mathieu Desnoyers, Frank Ch. Eigler,
	KOSAKI Motohiro, Jens Axboe, Masami Hiramatsu

Ingo,

Updates since the RFC:

 - Renamed DECLARE_TRACE_FMT to DEFINE_TRACE_FMT as commented by
   Andrew Morton.

 - Used TPFMT macro as suggested by Peter Zijlstra

Please pull the latest tip/tracing/ftrace tree, which can be found at:

  git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-2.6-trace.git
tip/tracing/ftrace


Steven Rostedt (4):
      tracing: add DEFINE_TRACE_FMT to tracepoint.h
      tracing: add event trace infrastructure
      tracing: add schedule events to event trace
      tracing: make event directory structure

----
 include/asm-generic/vmlinux.lds.h |   11 +-
 include/linux/tracepoint.h        |    3 +
 include/trace/sched.h             |   49 +-----
 include/trace/sched_event_types.h |   72 +++++++
 kernel/trace/Kconfig              |    9 +
 kernel/trace/Makefile             |    2 +
 kernel/trace/events.c             |   13 ++
 kernel/trace/trace_events.c       |  407 +++++++++++++++++++++++++++++++++++++
 kernel/trace/trace_events.h       |   55 +++++
 9 files changed, 572 insertions(+), 49 deletions(-)

-- 

^ permalink raw reply	[flat|nested] 51+ messages in thread

* [PATCH 1/4] tracing: add DEFINE_TRACE_FMT to tracepoint.h
  2009-02-25  2:56 [PATCH 0/4] [git pull] tip/tracing/ftrace Steven Rostedt
@ 2009-02-25  2:56 ` Steven Rostedt
  2009-02-25  6:27   ` Peter Zijlstra
  2009-02-25 16:13   ` Mathieu Desnoyers
  2009-02-25  2:56 ` [PATCH 2/4] tracing: add event trace infrastructure Steven Rostedt
                   ` (2 subsequent siblings)
  3 siblings, 2 replies; 51+ messages in thread
From: Steven Rostedt @ 2009-02-25  2:56 UTC (permalink / raw)
  To: linux-kernel
  Cc: Ingo Molnar, Andrew Morton, Thomas Gleixner, Peter Zijlstra,
	Frederic Weisbecker, Theodore Tso, Arjan van de Ven,
	Pekka Paalanen, Arnaldo Carvalho de Melo, Jason Baron,
	Martin Bligh, Mathieu Desnoyers, Frank Ch. Eigler,
	KOSAKI Motohiro, Jens Axboe, Masami Hiramatsu, Steven Rostedt

[-- Attachment #1: 0001-tracing-add-DEFINE_TRACE_FMT-to-tracepoint.h.patch --]
[-- Type: text/plain, Size: 852 bytes --]

From: Steven Rostedt <srostedt@redhat.com>

This patch creates a DEFINE_TRACE_FMT to map to DECLARE_TRACE.
This allows for the developers to place format strings and
args in with their tracepoint declaration. A tracer may now
override the DEFINE_TRACE_FMT macro and use it to record
a default format.

Signed-off-by: Steven Rostedt <srostedt@redhat.com>
---
 include/linux/tracepoint.h |    3 +++
 1 files changed, 3 insertions(+), 0 deletions(-)

diff --git a/include/linux/tracepoint.h b/include/linux/tracepoint.h
index 7570054..34ae464 100644
--- a/include/linux/tracepoint.h
+++ b/include/linux/tracepoint.h
@@ -153,4 +153,7 @@ static inline void tracepoint_synchronize_unregister(void)
 	synchronize_sched();
 }
 
+#define DEFINE_TRACE_FMT(name, proto, args, fmt)		\
+	DECLARE_TRACE(name, TPPROTO(proto), TPARGS(args))
+
 #endif
-- 
1.5.6.5

-- 

^ permalink raw reply related	[flat|nested] 51+ messages in thread

* [PATCH 2/4] tracing: add event trace infrastructure
  2009-02-25  2:56 [PATCH 0/4] [git pull] tip/tracing/ftrace Steven Rostedt
  2009-02-25  2:56 ` [PATCH 1/4] tracing: add DEFINE_TRACE_FMT to tracepoint.h Steven Rostedt
@ 2009-02-25  2:56 ` Steven Rostedt
  2009-02-25  3:45   ` Andrew Morton
                     ` (2 more replies)
  2009-02-25  2:56 ` [PATCH 3/4] tracing: add schedule events to event trace Steven Rostedt
  2009-02-25  2:56 ` [PATCH 4/4] tracing: make event directory structure Steven Rostedt
  3 siblings, 3 replies; 51+ messages in thread
From: Steven Rostedt @ 2009-02-25  2:56 UTC (permalink / raw)
  To: linux-kernel
  Cc: Ingo Molnar, Andrew Morton, Thomas Gleixner, Peter Zijlstra,
	Frederic Weisbecker, Theodore Tso, Arjan van de Ven,
	Pekka Paalanen, Arnaldo Carvalho de Melo, Jason Baron,
	Martin Bligh, Mathieu Desnoyers, Frank Ch. Eigler,
	KOSAKI Motohiro, Jens Axboe, Masami Hiramatsu, Steven Rostedt

[-- Attachment #1: 0002-tracing-add-event-trace-infrastructure.patch --]
[-- Type: text/plain, Size: 10093 bytes --]

From: Steven Rostedt <srostedt@redhat.com>

This patch creates the event tracing infrastructure of ftrace.
It will create the files:

 /debug/tracing/available_events
 /debug/tracing/set_event

The available_events will list the trace points that have been
registered with the event tracer.

set_events will allow the user to enable or disable an event hook.

example:

 # echo sched_wakeup > /debug/tracing/set_event

Will enable the sched_wakeup event (if it is registered).

 # echo "!sched_wakeup" >> /debug/tracing/set_event

Will disable the sched_wakeup event (and only that event).

 # echo > /debug/tracing/set_event

Will disable all events (notice the '>')

 # cat /debug/tracing/available_events > /debug/tracing/set_event

Will enable all registered event hooks.

Signed-off-by: Steven Rostedt <srostedt@redhat.com>
---
 include/asm-generic/vmlinux.lds.h |   11 ++-
 kernel/trace/Kconfig              |    9 ++
 kernel/trace/Makefile             |    1 +
 kernel/trace/trace_events.c       |  280 +++++++++++++++++++++++++++++++++++++
 kernel/trace/trace_events.h       |   52 +++++++
 5 files changed, 352 insertions(+), 1 deletions(-)
 create mode 100644 kernel/trace/trace_events.c
 create mode 100644 kernel/trace/trace_events.h

diff --git a/include/asm-generic/vmlinux.lds.h b/include/asm-generic/vmlinux.lds.h
index c61fab1..0add6b2 100644
--- a/include/asm-generic/vmlinux.lds.h
+++ b/include/asm-generic/vmlinux.lds.h
@@ -61,6 +61,14 @@
 #define BRANCH_PROFILE()
 #endif
 
+#ifdef CONFIG_EVENT_TRACER
+#define FTRACE_EVENTS()	VMLINUX_SYMBOL(__start_ftrace_events) = .;	\
+			*(_ftrace_events)				\
+			VMLINUX_SYMBOL(__stop_ftrace_events) = .;
+#else
+#define FTRACE_EVENTS()
+#endif
+
 /* .data section */
 #define DATA_DATA							\
 	*(.data)							\
@@ -81,7 +89,8 @@
 	*(__tracepoints)						\
 	VMLINUX_SYMBOL(__stop___tracepoints) = .;			\
 	LIKELY_PROFILE()		       				\
-	BRANCH_PROFILE()
+	BRANCH_PROFILE()						\
+	FTRACE_EVENTS()
 
 #define RO_DATA(align)							\
 	. = ALIGN((align));						\
diff --git a/kernel/trace/Kconfig b/kernel/trace/Kconfig
index 07877f4..999c6a2 100644
--- a/kernel/trace/Kconfig
+++ b/kernel/trace/Kconfig
@@ -159,6 +159,15 @@ config CONTEXT_SWITCH_TRACER
 	  This tracer gets called from the context switch and records
 	  all switching of tasks.
 
+config EVENT_TRACER
+	bool "Trace various events in the kernel"
+	depends on DEBUG_KERNEL
+	select TRACING
+	help
+	  This tracer hooks to various trace points in the kernel
+	  allowing the user to pick and choose which trace point they
+	  want to trace.
+
 config BOOT_TRACER
 	bool "Trace boot initcalls"
 	depends on DEBUG_KERNEL
diff --git a/kernel/trace/Makefile b/kernel/trace/Makefile
index 627090b..c736356 100644
--- a/kernel/trace/Makefile
+++ b/kernel/trace/Makefile
@@ -38,5 +38,6 @@ obj-$(CONFIG_POWER_TRACER) += trace_power.o
 obj-$(CONFIG_KMEMTRACE) += kmemtrace.o
 obj-$(CONFIG_WORKQUEUE_TRACER) += trace_workqueue.o
 obj-$(CONFIG_BLK_DEV_IO_TRACE)	+= blktrace.o
+obj-$(CONFIG_EVENT_TRACER) += trace_events.o
 
 libftrace-y := ftrace.o
diff --git a/kernel/trace/trace_events.c b/kernel/trace/trace_events.c
new file mode 100644
index 0000000..05bc80e
--- /dev/null
+++ b/kernel/trace/trace_events.c
@@ -0,0 +1,280 @@
+/*
+ * event tracer
+ *
+ * Copyright (C) 2008 Red Hat Inc, Steven Rostedt <srostedt@redhat.com>
+ *
+ */
+
+#include <linux/debugfs.h>
+#include <linux/uaccess.h>
+#include <linux/module.h>
+#include <linux/ctype.h>
+
+#include "trace_events.h"
+
+void event_trace_printk(unsigned long ip, const char *fmt, ...)
+{
+	va_list ap;
+
+	va_start(ap, fmt);
+	tracing_record_cmdline(current);
+	trace_vprintk(ip, task_curr_ret_stack(current), fmt, ap);
+	va_end(ap);
+}
+
+static void ftrace_clear_events(void)
+{
+	struct ftrace_event_call *call = (void *)__start_ftrace_events;
+
+
+	while ((unsigned long)call < (unsigned long)__stop_ftrace_events) {
+
+		if (call->enabled) {
+			call->enabled = 0;
+			call->unregfunc();
+		}
+		call++;
+	}
+}
+
+static int ftrace_set_clr_event(char *buf, int set)
+{
+	struct ftrace_event_call *call = (void *)__start_ftrace_events;
+
+
+	while ((unsigned long)call < (unsigned long)__stop_ftrace_events) {
+
+		if (strcmp(buf, call->name) != 0) {
+			call++;
+			continue;
+		}
+
+		if (set) {
+			/* Already set? */
+			if (call->enabled)
+				return 0;
+			call->enabled = 1;
+			call->regfunc();
+		} else {
+			/* Already cleared? */
+			if (!call->enabled)
+				return 0;
+			call->enabled = 0;
+			call->unregfunc();
+		}
+		return 0;
+	}
+	return -EINVAL;
+}
+
+/* 128 should be much more than enough */
+#define EVENT_BUF_SIZE		127
+
+static ssize_t
+ftrace_event_write(struct file *file, const char __user *ubuf,
+		   size_t cnt, loff_t *ppos)
+{
+	size_t read = 0;
+	int i, set = 1;
+	ssize_t ret;
+	char *buf;
+	char ch;
+
+	if (!cnt || cnt < 0)
+		return 0;
+
+	ret = get_user(ch, ubuf++);
+	if (ret)
+		return ret;
+	read++;
+	cnt--;
+
+	/* skip white space */
+	while (cnt && isspace(ch)) {
+		ret = get_user(ch, ubuf++);
+		if (ret)
+			return ret;
+		read++;
+		cnt--;
+	}
+
+	/* Only white space found? */
+	if (isspace(ch)) {
+		file->f_pos += read;
+		ret = read;
+		return ret;
+	}
+
+	buf = kmalloc(EVENT_BUF_SIZE+1, GFP_KERNEL);
+	if (!buf)
+		return -ENOMEM;
+
+	if (cnt > EVENT_BUF_SIZE)
+		cnt = EVENT_BUF_SIZE;
+
+	i = 0;
+	while (cnt && !isspace(ch)) {
+		if (!i && ch == '!')
+			set = 0;
+		else
+			buf[i++] = ch;
+
+		ret = get_user(ch, ubuf++);
+		if (ret)
+			goto out_free;
+		read++;
+		cnt--;
+	}
+	buf[i] = 0;
+
+	file->f_pos += read;
+
+	ret = ftrace_set_clr_event(buf, set);
+	if (ret)
+		goto out_free;
+
+	ret = read;
+
+ out_free:
+	kfree(buf);
+
+	return ret;
+}
+
+static void *
+t_next(struct seq_file *m, void *v, loff_t *pos)
+{
+	struct ftrace_event_call *call = m->private;
+	struct ftrace_event_call *next = call;
+
+	(*pos)++;
+
+	if ((unsigned long)call >= (unsigned long)__stop_ftrace_events)
+		return NULL;
+
+	m->private = ++next;
+
+	return call;
+}
+
+static void *t_start(struct seq_file *m, loff_t *pos)
+{
+	return t_next(m, NULL, pos);
+}
+
+static void *
+s_next(struct seq_file *m, void *v, loff_t *pos)
+{
+	struct ftrace_event_call *call = m->private;
+	struct ftrace_event_call *next;
+
+	(*pos)++;
+
+ retry:
+	if ((unsigned long)call >= (unsigned long)__stop_ftrace_events)
+		return NULL;
+
+	if (!call->enabled) {
+		call++;
+		goto retry;
+	}
+
+	next = call;
+	m->private = ++next;
+
+	return call;
+}
+
+static void *s_start(struct seq_file *m, loff_t *pos)
+{
+	return s_next(m, NULL, pos);
+}
+
+static int t_show(struct seq_file *m, void *v)
+{
+	struct ftrace_event_call *call = v;
+
+	seq_printf(m, "%s\n", call->name);
+
+	return 0;
+}
+
+static void t_stop(struct seq_file *m, void *p)
+{
+}
+
+static int
+ftrace_event_seq_open(struct inode *inode, struct file *file)
+{
+	int ret;
+	const struct seq_operations *seq_ops;
+
+	if ((file->f_mode & FMODE_WRITE) &&
+	    !(file->f_flags & O_APPEND))
+		ftrace_clear_events();
+
+	seq_ops = inode->i_private;
+	ret = seq_open(file, seq_ops);
+	if (!ret) {
+		struct seq_file *m = file->private_data;
+
+		m->private = __start_ftrace_events;
+	}
+	return ret;
+}
+
+static const struct seq_operations show_event_seq_ops = {
+	.start = t_start,
+	.next = t_next,
+	.show = t_show,
+	.stop = t_stop,
+};
+
+static const struct seq_operations show_set_event_seq_ops = {
+	.start = s_start,
+	.next = s_next,
+	.show = t_show,
+	.stop = t_stop,
+};
+
+static const struct file_operations ftrace_avail_fops = {
+	.open = ftrace_event_seq_open,
+	.read = seq_read,
+	.llseek = seq_lseek,
+	.release = seq_release,
+};
+
+static const struct file_operations ftrace_set_event_fops = {
+	.open = ftrace_event_seq_open,
+	.read = seq_read,
+	.write = ftrace_event_write,
+	.llseek = seq_lseek,
+	.release = seq_release,
+};
+
+static __init int event_trace_init(void)
+{
+	struct dentry *d_tracer;
+	struct dentry *entry;
+
+	d_tracer = tracing_init_dentry();
+	if (!d_tracer)
+		return 0;
+
+	entry = debugfs_create_file("available_events", 0444, d_tracer,
+				    (void *)&show_event_seq_ops,
+				    &ftrace_avail_fops);
+	if (!entry)
+		pr_warning("Could not create debugfs "
+			   "'available_events' entry\n");
+
+	entry = debugfs_create_file("set_event", 0644, d_tracer,
+				    (void *)&show_set_event_seq_ops,
+				    &ftrace_set_event_fops);
+	if (!entry)
+		pr_warning("Could not create debugfs "
+			   "'set_event' entry\n");
+
+	return 0;
+}
+fs_initcall(event_trace_init);
diff --git a/kernel/trace/trace_events.h b/kernel/trace/trace_events.h
new file mode 100644
index 0000000..39342f8
--- /dev/null
+++ b/kernel/trace/trace_events.h
@@ -0,0 +1,52 @@
+#ifndef _LINUX_KERNEL_TRACE_EVENTS_H
+#define _LINUX_KERNEL_TRACE_EVENTS_H
+
+#include <linux/ftrace.h>
+#include "trace.h"
+
+struct ftrace_event_call {
+	char		*name;
+	int		enabled;
+	int		(*regfunc)(void);
+	void		(*unregfunc)(void);
+};
+
+
+#undef TPFMT
+#define TPFMT(fmt, args...)	fmt "\n", ##args
+
+#undef DEFINE_TRACE_FMT
+#define DEFINE_TRACE_FMT(call, proto, args, fmt)			\
+static void ftrace_event_##call(proto)					\
+{									\
+	event_trace_printk(_RET_IP_, "(" #call ") " fmt);		\
+}									\
+									\
+static int ftrace_reg_event_##call(void)				\
+{									\
+	int ret;							\
+									\
+	ret = register_trace_##call(ftrace_event_##call);		\
+	if (!ret)							\
+		pr_info("event trace: Could not activate trace point "	\
+			"probe to " #call);				\
+	return ret;							\
+}									\
+									\
+static void ftrace_unreg_event_##call(void)				\
+{									\
+	unregister_trace_##call(ftrace_event_##call);			\
+}									\
+									\
+static struct ftrace_event_call __used					\
+__attribute__((section("_ftrace_events"))) event_##call = {		\
+	.name 			= #call,				\
+	.regfunc		= ftrace_reg_event_##call,		\
+	.unregfunc		= ftrace_unreg_event_##call,		\
+}
+
+void event_trace_printk(unsigned long ip, const char *fmt, ...);
+extern unsigned long __start_ftrace_events[];
+extern unsigned long __stop_ftrace_events[];
+
+#endif /* _LINUX_KERNEL_TRACE_EVENTS_H */
-- 
1.5.6.5

-- 

^ permalink raw reply related	[flat|nested] 51+ messages in thread

* [PATCH 3/4] tracing: add schedule events to event trace
  2009-02-25  2:56 [PATCH 0/4] [git pull] tip/tracing/ftrace Steven Rostedt
  2009-02-25  2:56 ` [PATCH 1/4] tracing: add DEFINE_TRACE_FMT to tracepoint.h Steven Rostedt
  2009-02-25  2:56 ` [PATCH 2/4] tracing: add event trace infrastructure Steven Rostedt
@ 2009-02-25  2:56 ` Steven Rostedt
  2009-02-25  6:29   ` Peter Zijlstra
  2009-02-25  2:56 ` [PATCH 4/4] tracing: make event directory structure Steven Rostedt
  3 siblings, 1 reply; 51+ messages in thread
From: Steven Rostedt @ 2009-02-25  2:56 UTC (permalink / raw)
  To: linux-kernel
  Cc: Ingo Molnar, Andrew Morton, Thomas Gleixner, Peter Zijlstra,
	Frederic Weisbecker, Theodore Tso, Arjan van de Ven,
	Pekka Paalanen, Arnaldo Carvalho de Melo, Jason Baron,
	Martin Bligh, Mathieu Desnoyers, Frank Ch. Eigler,
	KOSAKI Motohiro, Jens Axboe, Masami Hiramatsu, Steven Rostedt

[-- Attachment #1: 0003-tracing-add-schedule-events-to-event-trace.patch --]
[-- Type: text/plain, Size: 5501 bytes --]

From: Steven Rostedt <srostedt@redhat.com>

This patch changes the trace/sched.h to use the DECLARE_TRACE_FMT
such that they are automatically registered with the event tracer.

And it also adds the tracing sched headers to kernel/trace/events.c

Signed-off-by: Steven Rostedt <srostedt@redhat.com>
---
 include/trace/sched.h             |   49 +------------------------
 include/trace/sched_event_types.h |   72 +++++++++++++++++++++++++++++++++++++
 kernel/trace/Makefile             |    1 +
 kernel/trace/events.c             |   13 +++++++
 4 files changed, 87 insertions(+), 48 deletions(-)
 create mode 100644 include/trace/sched_event_types.h
 create mode 100644 kernel/trace/events.c

diff --git a/include/trace/sched.h b/include/trace/sched.h
index 0d81098..4e372a1 100644
--- a/include/trace/sched.h
+++ b/include/trace/sched.h
@@ -4,53 +4,6 @@
 #include <linux/sched.h>
 #include <linux/tracepoint.h>
 
-DECLARE_TRACE(sched_kthread_stop,
-	TPPROTO(struct task_struct *t),
-		TPARGS(t));
-
-DECLARE_TRACE(sched_kthread_stop_ret,
-	TPPROTO(int ret),
-		TPARGS(ret));
-
-DECLARE_TRACE(sched_wait_task,
-	TPPROTO(struct rq *rq, struct task_struct *p),
-		TPARGS(rq, p));
-
-DECLARE_TRACE(sched_wakeup,
-	TPPROTO(struct rq *rq, struct task_struct *p, int success),
-		TPARGS(rq, p, success));
-
-DECLARE_TRACE(sched_wakeup_new,
-	TPPROTO(struct rq *rq, struct task_struct *p, int success),
-		TPARGS(rq, p, success));
-
-DECLARE_TRACE(sched_switch,
-	TPPROTO(struct rq *rq, struct task_struct *prev,
-		struct task_struct *next),
-		TPARGS(rq, prev, next));
-
-DECLARE_TRACE(sched_migrate_task,
-	TPPROTO(struct task_struct *p, int orig_cpu, int dest_cpu),
-		TPARGS(p, orig_cpu, dest_cpu));
-
-DECLARE_TRACE(sched_process_free,
-	TPPROTO(struct task_struct *p),
-		TPARGS(p));
-
-DECLARE_TRACE(sched_process_exit,
-	TPPROTO(struct task_struct *p),
-		TPARGS(p));
-
-DECLARE_TRACE(sched_process_wait,
-	TPPROTO(struct pid *pid),
-		TPARGS(pid));
-
-DECLARE_TRACE(sched_process_fork,
-	TPPROTO(struct task_struct *parent, struct task_struct *child),
-		TPARGS(parent, child));
-
-DECLARE_TRACE(sched_signal_send,
-	TPPROTO(int sig, struct task_struct *p),
-		TPARGS(sig, p));
+#include <trace/sched_event_types.h>
 
 #endif
diff --git a/include/trace/sched_event_types.h b/include/trace/sched_event_types.h
new file mode 100644
index 0000000..a4f6629
--- /dev/null
+++ b/include/trace/sched_event_types.h
@@ -0,0 +1,72 @@
+
+/* use <trace/sched.h> instead */
+#ifndef DEFINE_TRACE_FMT
+# error Do not include this file directly.
+# error Unless you know what you are doing.
+#endif
+
+DEFINE_TRACE_FMT(sched_kthread_stop,
+	TPPROTO(struct task_struct *t),
+	TPARGS(t),
+	TPFMT("task %s:%d", t->comm, t->pid));
+
+DEFINE_TRACE_FMT(sched_kthread_stop_ret,
+	TPPROTO(int ret),
+	TPARGS(ret),
+	TPFMT("ret=%d", ret));
+
+DEFINE_TRACE_FMT(sched_wait_task,
+	TPPROTO(struct rq *rq, struct task_struct *p),
+	TPARGS(rq, p),
+	TPFMT("task %s:%d", p->comm, p->pid));
+
+DEFINE_TRACE_FMT(sched_wakeup,
+	TPPROTO(struct rq *rq, struct task_struct *p, int success),
+	TPARGS(rq, p, success),
+	TPFMT("task %s:%d %s",
+	      p->comm, p->pid, success?"succeeded":"failed"));
+
+DEFINE_TRACE_FMT(sched_wakeup_new,
+	TPPROTO(struct rq *rq, struct task_struct *p, int success),
+	TPARGS(rq, p, success),
+	TPFMT("task %s:%d",
+	      p->comm, p->pid, success?"succeeded":"failed"));
+
+DEFINE_TRACE_FMT(sched_switch,
+	TPPROTO(struct rq *rq, struct task_struct *prev,
+		struct task_struct *next),
+	TPARGS(rq, prev, next),
+	TPFMT("task %s:%d ==> %s:%d",
+	      prev->comm, prev->pid, next->comm, next->pid));
+
+DEFINE_TRACE_FMT(sched_migrate_task,
+	TPPROTO(struct task_struct *p, int orig_cpu, int dest_cpu),
+	TPARGS(p, orig_cpu, dest_cpu),
+	TPFMT("task %s:%d from: %d  to: %d",
+	      p->comm, p->pid, orig_cpu, dest_cpu));
+
+DEFINE_TRACE_FMT(sched_process_free,
+	TPPROTO(struct task_struct *p),
+	TPARGS(p),
+	TPFMT("task %s:%d", p->comm, p->pid));
+
+DEFINE_TRACE_FMT(sched_process_exit,
+	TPPROTO(struct task_struct *p),
+	TPARGS(p),
+	TPFMT("task %s:%d", p->comm, p->pid));
+
+DEFINE_TRACE_FMT(sched_process_wait,
+	TPPROTO(struct pid *pid),
+	TPARGS(pid),
+	TPFMT("pid %d", pid));
+
+DEFINE_TRACE_FMT(sched_process_fork,
+	TPPROTO(struct task_struct *parent, struct task_struct *child),
+	TPARGS(parent, child),
+	TPFMT("parent %s:%d  child %s:%d",
+	      parent->comm, parent->pid, child->comm, child->pid));
+
+DEFINE_TRACE_FMT(sched_signal_send,
+	TPPROTO(int sig, struct task_struct *p),
+	TPARGS(sig, p),
+	TPFMT("sig: %d   task %s:%d", sig, p->comm, p->pid));
diff --git a/kernel/trace/Makefile b/kernel/trace/Makefile
index c736356..664b6c0 100644
--- a/kernel/trace/Makefile
+++ b/kernel/trace/Makefile
@@ -39,5 +39,6 @@ obj-$(CONFIG_KMEMTRACE) += kmemtrace.o
 obj-$(CONFIG_WORKQUEUE_TRACER) += trace_workqueue.o
 obj-$(CONFIG_BLK_DEV_IO_TRACE)	+= blktrace.o
 obj-$(CONFIG_EVENT_TRACER) += trace_events.o
+obj-$(CONFIG_EVENT_TRACER) += events.o
 
 libftrace-y := ftrace.o
diff --git a/kernel/trace/events.c b/kernel/trace/events.c
new file mode 100644
index 0000000..38c89ee
--- /dev/null
+++ b/kernel/trace/events.c
@@ -0,0 +1,13 @@
+/*
+ * This is the place to register all trace points as events.
+ * Include the trace/<type>.h at the top.
+ * Include the trace/<type>_event_types.h at the bottom.
+ */
+
+/* trace/<type>.h here */
+#include <trace/sched.h>
+
+#include "trace_events.h"
+
+/* trace/<type>_event_types.h here */
+#include <trace/sched_event_types.h>
-- 
1.5.6.5

-- 

^ permalink raw reply related	[flat|nested] 51+ messages in thread

* [PATCH 4/4] tracing: make event directory structure
  2009-02-25  2:56 [PATCH 0/4] [git pull] tip/tracing/ftrace Steven Rostedt
                   ` (2 preceding siblings ...)
  2009-02-25  2:56 ` [PATCH 3/4] tracing: add schedule events to event trace Steven Rostedt
@ 2009-02-25  2:56 ` Steven Rostedt
  2009-02-25  6:59   ` Frederic Weisbecker
  3 siblings, 1 reply; 51+ messages in thread
From: Steven Rostedt @ 2009-02-25  2:56 UTC (permalink / raw)
  To: linux-kernel
  Cc: Ingo Molnar, Andrew Morton, Thomas Gleixner, Peter Zijlstra,
	Frederic Weisbecker, Theodore Tso, Arjan van de Ven,
	Pekka Paalanen, Arnaldo Carvalho de Melo, Jason Baron,
	Martin Bligh, Mathieu Desnoyers, Frank Ch. Eigler,
	KOSAKI Motohiro, Jens Axboe, Masami Hiramatsu, Steven Rostedt

[-- Attachment #1: 0004-tracing-make-event-directory-structure.patch --]
[-- Type: text/plain, Size: 6014 bytes --]

From: Steven Rostedt <srostedt@redhat.com>

This patch adds the directory /debug/tracing/events/ that will contain
all the registered trace points.

 # ls /debug/tracing/events/
sched_kthread_stop      sched_process_fork  sched_switch
sched_kthread_stop_ret  sched_process_free  sched_wait_task
sched_migrate_task      sched_process_wait  sched_wakeup
sched_process_exit      sched_signal_send   sched_wakeup_new

 # ls /debug/tracing/events/sched_switch/
enable

 # cat /debug/tracing/events/sched_switch/enable
1

 # cat /debug/tracing/set_event
sched_switch

Signed-off-by: Steven Rostedt <srostedt@redhat.com>
---
 kernel/trace/trace_events.c |  137 +++++++++++++++++++++++++++++++++++++++++--
 kernel/trace/trace_events.h |    7 ++-
 2 files changed, 137 insertions(+), 7 deletions(-)

diff --git a/kernel/trace/trace_events.c b/kernel/trace/trace_events.c
index 05bc80e..3bcb9df 100644
--- a/kernel/trace/trace_events.c
+++ b/kernel/trace/trace_events.c
@@ -12,6 +12,11 @@
 
 #include "trace_events.h"
 
+#define events_for_each(event)						\
+	for (event = __start_ftrace_events;				\
+	     (unsigned long)event < (unsigned long)__stop_ftrace_events; \
+	     event++)
+
 void event_trace_printk(unsigned long ip, const char *fmt, ...)
 {
 	va_list ap;
@@ -39,15 +44,16 @@ static void ftrace_clear_events(void)
 
 static int ftrace_set_clr_event(char *buf, int set)
 {
-	struct ftrace_event_call *call = (void *)__start_ftrace_events;
+	struct ftrace_event_call *call = __start_ftrace_events;
 
 
-	while ((unsigned long)call < (unsigned long)__stop_ftrace_events) {
+	events_for_each(call) {
 
-		if (strcmp(buf, call->name) != 0) {
-			call++;
+		if (!call->name)
+			continue;
+
+		if (strcmp(buf, call->name) != 0)
 			continue;
-		}
 
 		if (set) {
 			/* Already set? */
@@ -223,6 +229,67 @@ ftrace_event_seq_open(struct inode *inode, struct file *file)
 	return ret;
 }
 
+static ssize_t
+event_enable_read(struct file *filp, char __user *ubuf, size_t cnt,
+		  loff_t *ppos)
+{
+	struct ftrace_event_call *call = filp->private_data;
+	char *buf;
+
+	if (call->enabled)
+		buf = "1\n";
+	else
+		buf = "0\n";
+
+	return simple_read_from_buffer(ubuf, cnt, ppos, buf, 2);
+}
+
+static ssize_t
+event_enable_write(struct file *filp, const char __user *ubuf, size_t cnt,
+		   loff_t *ppos)
+{
+	struct ftrace_event_call *call = filp->private_data;
+	char buf[64];
+	unsigned long val;
+	int ret;
+
+	if (cnt >= sizeof(buf))
+		return -EINVAL;
+
+	if (copy_from_user(&buf, ubuf, cnt))
+		return -EFAULT;
+
+	buf[cnt] = 0;
+
+	ret = strict_strtoul(buf, 10, &val);
+	if (ret < 0)
+		return ret;
+
+	switch (val) {
+	case 0:
+		if (!call->enabled)
+			break;
+
+		call->enabled = 0;
+		call->unregfunc();
+		break;
+	case 1:
+		if (call->enabled)
+			break;
+
+		call->enabled = 1;
+		call->regfunc();
+		break;
+
+	default:
+		return -EINVAL;
+	}
+
+	*ppos += cnt;
+
+	return cnt;
+}
+
 static const struct seq_operations show_event_seq_ops = {
 	.start = t_start,
 	.next = t_next,
@@ -252,10 +319,59 @@ static const struct file_operations ftrace_set_event_fops = {
 	.release = seq_release,
 };
 
+static const struct file_operations ftrace_enable_fops = {
+	.open = tracing_open_generic,
+	.read = event_enable_read,
+	.write = event_enable_write,
+};
+
+static struct dentry *event_trace_events_dir(void)
+{
+	static struct dentry *d_tracer;
+	static struct dentry *d_events;
+
+	if (d_events)
+		return d_events;
+
+	d_tracer = tracing_init_dentry();
+	if (!d_tracer)
+		return NULL;
+
+	d_events = debugfs_create_dir("events", d_tracer);
+	if (!d_events)
+		pr_warning("Could not create debugfs "
+			   "'events' directory\n");
+
+	return d_events;
+}
+
+static int
+event_create_dir(struct ftrace_event_call *call, struct dentry *d_events)
+{
+	struct dentry *entry;
+
+	call->dir = debugfs_create_dir(call->name, d_events);
+	if (!call->dir) {
+		pr_warning("Could not create debugfs "
+			   "'%s' directory\n", call->name);
+		return -1;
+	}
+
+	entry = debugfs_create_file("enable", 0644, call->dir, call,
+				    &ftrace_enable_fops);
+	if (!entry)
+		pr_warning("Could not create debugfs "
+			   "'%s/enable' entry\n", call->name);
+
+	return 0;
+}
+
 static __init int event_trace_init(void)
 {
+	struct ftrace_event_call *call = __start_ftrace_events;
 	struct dentry *d_tracer;
 	struct dentry *entry;
+	struct dentry *d_events;
 
 	d_tracer = tracing_init_dentry();
 	if (!d_tracer)
@@ -275,6 +391,17 @@ static __init int event_trace_init(void)
 		pr_warning("Could not create debugfs "
 			   "'set_event' entry\n");
 
+	d_events = event_trace_events_dir();
+	if (!d_events)
+		return 0;
+
+	events_for_each(call) {
+		/* The linker may leave blanks */
+		if (!call->name)
+			continue;
+		event_create_dir(call, d_events);
+	}
+
 	return 0;
 }
 fs_initcall(event_trace_init);
diff --git a/kernel/trace/trace_events.h b/kernel/trace/trace_events.h
index 39342f8..cb8455b 100644
--- a/kernel/trace/trace_events.h
+++ b/kernel/trace/trace_events.h
@@ -1,11 +1,13 @@
 #ifndef _LINUX_KERNEL_TRACE_EVENTS_H
 #define _LINUX_KERNEL_TRACE_EVENTS_H
 
+#include <linux/debugfs.h>
 #include <linux/ftrace.h>
 #include "trace.h"
 
 struct ftrace_event_call {
 	char		*name;
+	struct dentry	*dir;
 	int		enabled;
 	int		(*regfunc)(void);
 	void		(*unregfunc)(void);
@@ -39,6 +41,7 @@ static void ftrace_unreg_event_##call(void)				\
 }									\
 									\
 static struct ftrace_event_call __used					\
+__attribute__((__aligned__(4)))						\
 __attribute__((section("_ftrace_events"))) event_##call = {		\
 	.name 			= #call,				\
 	.regfunc		= ftrace_reg_event_##call,		\
@@ -46,7 +49,7 @@ __attribute__((section("_ftrace_events"))) event_##call = {		\
 }
 
 void event_trace_printk(unsigned long ip, const char *fmt, ...);
-extern unsigned long __start_ftrace_events[];
-extern unsigned long __stop_ftrace_events[];
+extern struct ftrace_event_call __start_ftrace_events[];
+extern struct ftrace_event_call __stop_ftrace_events[];
 
 #endif /* _LINUX_KERNEL_TRACE_EVENTS_H */
-- 
1.5.6.5

-- 

^ permalink raw reply related	[flat|nested] 51+ messages in thread

* Re: [PATCH 2/4] tracing: add event trace infrastructure
  2009-02-25  2:56 ` [PATCH 2/4] tracing: add event trace infrastructure Steven Rostedt
@ 2009-02-25  3:45   ` Andrew Morton
  2009-02-25  4:08     ` Steven Rostedt
  2009-02-25 13:37     ` Theodore Tso
  2009-02-25  9:07   ` Lai Jiangshan
  2009-02-25  9:21   ` Lai Jiangshan
  2 siblings, 2 replies; 51+ messages in thread
From: Andrew Morton @ 2009-02-25  3:45 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: linux-kernel, Ingo Molnar, Thomas Gleixner, Peter Zijlstra,
	Frederic Weisbecker, Theodore Tso, Arjan van de Ven,
	Pekka Paalanen, Arnaldo Carvalho de Melo, Jason Baron,
	Martin Bligh, Mathieu Desnoyers, Frank Ch. Eigler,
	KOSAKI Motohiro, Jens Axboe, Masami Hiramatsu, Steven Rostedt

On Tue, 24 Feb 2009 21:56:10 -0500 Steven Rostedt <rostedt@goodmis.org> wrote:

> From: Steven Rostedt <srostedt@redhat.com>
> 
> This patch creates the event tracing infrastructure of ftrace.
> It will create the files:
> 
>  /debug/tracing/available_events
>  /debug/tracing/set_event
> 
> The available_events will list the trace points that have been
> registered with the event tracer.
> 
> set_events will allow the user to enable or disable an event hook.
> 
> example:
> 
>  # echo sched_wakeup > /debug/tracing/set_event
> 
> Will enable the sched_wakeup event (if it is registered).
> 
>  # echo "!sched_wakeup" >> /debug/tracing/set_event
> 
> Will disable the sched_wakeup event (and only that event).

Why not

echo sched_wakeup > /debug/tracing/set_event
echo sched_wakeup > /debug/tracing/clear_event

?

>  # echo > /debug/tracing/set_event
> 
> Will disable all events (notice the '>')

echo 1 > /debug/tracing/clear_all_events

?

>  # cat /debug/tracing/available_events > /debug/tracing/set_event
> 
> Will enable all registered event hooks.
> 
> Signed-off-by: Steven Rostedt <srostedt@redhat.com>
> ---
>  include/asm-generic/vmlinux.lds.h |   11 ++-
>  kernel/trace/Kconfig              |    9 ++
>  kernel/trace/Makefile             |    1 +
>  kernel/trace/trace_events.c       |  280 +++++++++++++++++++++++++++++++++++++
>  kernel/trace/trace_events.h       |   52 +++++++
>  5 files changed, 352 insertions(+), 1 deletions(-)
>  create mode 100644 kernel/trace/trace_events.c
>  create mode 100644 kernel/trace/trace_events.h
> 
> diff --git a/include/asm-generic/vmlinux.lds.h b/include/asm-generic/vmlinux.lds.h
> index c61fab1..0add6b2 100644
> --- a/include/asm-generic/vmlinux.lds.h
> +++ b/include/asm-generic/vmlinux.lds.h
> @@ -61,6 +61,14 @@
>  #define BRANCH_PROFILE()
>  #endif
>  
> +#ifdef CONFIG_EVENT_TRACER
> +#define FTRACE_EVENTS()	VMLINUX_SYMBOL(__start_ftrace_events) = .;	\
> +			*(_ftrace_events)				\
> +			VMLINUX_SYMBOL(__stop_ftrace_events) = .;
> +#else
> +#define FTRACE_EVENTS()
> +#endif
> +
>  /* .data section */
>  #define DATA_DATA							\
>  	*(.data)							\
> @@ -81,7 +89,8 @@
>  	*(__tracepoints)						\
>  	VMLINUX_SYMBOL(__stop___tracepoints) = .;			\
>  	LIKELY_PROFILE()		       				\
> -	BRANCH_PROFILE()
> +	BRANCH_PROFILE()						\
> +	FTRACE_EVENTS()
>  
>  #define RO_DATA(align)							\
>  	. = ALIGN((align));						\
> diff --git a/kernel/trace/Kconfig b/kernel/trace/Kconfig
> index 07877f4..999c6a2 100644
> --- a/kernel/trace/Kconfig
> +++ b/kernel/trace/Kconfig
> @@ -159,6 +159,15 @@ config CONTEXT_SWITCH_TRACER
>  	  This tracer gets called from the context switch and records
>  	  all switching of tasks.
>  
> +config EVENT_TRACER
> +	bool "Trace various events in the kernel"
> +	depends on DEBUG_KERNEL
> +	select TRACING
> +	help
> +	  This tracer hooks to various trace points in the kernel
> +	  allowing the user to pick and choose which trace point they
> +	  want to trace.
> +
>  config BOOT_TRACER
>  	bool "Trace boot initcalls"
>  	depends on DEBUG_KERNEL
> diff --git a/kernel/trace/Makefile b/kernel/trace/Makefile
> index 627090b..c736356 100644
> --- a/kernel/trace/Makefile
> +++ b/kernel/trace/Makefile
> @@ -38,5 +38,6 @@ obj-$(CONFIG_POWER_TRACER) += trace_power.o
>  obj-$(CONFIG_KMEMTRACE) += kmemtrace.o
>  obj-$(CONFIG_WORKQUEUE_TRACER) += trace_workqueue.o
>  obj-$(CONFIG_BLK_DEV_IO_TRACE)	+= blktrace.o
> +obj-$(CONFIG_EVENT_TRACER) += trace_events.o
>  
>  libftrace-y := ftrace.o
> diff --git a/kernel/trace/trace_events.c b/kernel/trace/trace_events.c
> new file mode 100644
> index 0000000..05bc80e
> --- /dev/null
> +++ b/kernel/trace/trace_events.c
> @@ -0,0 +1,280 @@
> +/*
> + * event tracer
> + *
> + * Copyright (C) 2008 Red Hat Inc, Steven Rostedt <srostedt@redhat.com>
> + *
> + */
> +
> +#include <linux/debugfs.h>
> +#include <linux/uaccess.h>
> +#include <linux/module.h>
> +#include <linux/ctype.h>
> +
> +#include "trace_events.h"
> +
> +void event_trace_printk(unsigned long ip, const char *fmt, ...)
> +{
> +	va_list ap;
> +
> +	va_start(ap, fmt);
> +	tracing_record_cmdline(current);
> +	trace_vprintk(ip, task_curr_ret_stack(current), fmt, ap);
> +	va_end(ap);
> +}
> +
> +static void ftrace_clear_events(void)
> +{
> +	struct ftrace_event_call *call = (void *)__start_ftrace_events;

__start_ftrace_events has type `unsigned long'.  It is instantiated by
the linker.  There's a very old unix convention tha tthese things have
type `int'.  Doesn't matter.

It's a bit strange to do the double-cast like this - one explicit, one
implicit.  Doesn't matter much.

> +
> +

The patch adds various random \n\n's

> +	while ((unsigned long)call < (unsigned long)__stop_ftrace_events) {
> +

and various random \n's

> +		if (call->enabled) {
> +			call->enabled = 0;
> +			call->unregfunc();
> +		}
> +		call++;
> +	}
> +}
> +
> +static int ftrace_set_clr_event(char *buf, int set)
> +{
> +	struct ftrace_event_call *call = (void *)__start_ftrace_events;
> +
> +
> +	while ((unsigned long)call < (unsigned long)__stop_ftrace_events) {
> +
> +		if (strcmp(buf, call->name) != 0) {
> +			call++;
> +			continue;
> +		}
> +
> +		if (set) {
> +			/* Already set? */
> +			if (call->enabled)
> +				return 0;
> +			call->enabled = 1;
> +			call->regfunc();
> +		} else {
> +			/* Already cleared? */
> +			if (!call->enabled)
> +				return 0;
> +			call->enabled = 0;
> +			call->unregfunc();
> +		}
> +		return 0;
> +	}
> +	return -EINVAL;
> +}

I spose if I looked at this and surrounding code enough, I could work
out what it's for.

> +/* 128 should be much more than enough */
> +#define EVENT_BUF_SIZE		127
> +
> +static ssize_t
> +ftrace_event_write(struct file *file, const char __user *ubuf,
> +		   size_t cnt, loff_t *ppos)
> +{
> +	size_t read = 0;
> +	int i, set = 1;
> +	ssize_t ret;
> +	char *buf;
> +	char ch;
> +
> +	if (!cnt || cnt < 0)
> +		return 0;

cnt is unsigned, and I don't think the VFS lets through either zero or
-ve counts (I always forget).

> +	ret = get_user(ch, ubuf++);
> +	if (ret)
> +		return ret;
> +	read++;
> +	cnt--;
> +
> +	/* skip white space */
> +	while (cnt && isspace(ch)) {
> +		ret = get_user(ch, ubuf++);
> +		if (ret)
> +			return ret;
> +		read++;
> +		cnt--;
> +	}
> +
> +	/* Only white space found? */
> +	if (isspace(ch)) {
> +		file->f_pos += read;
> +		ret = read;
> +		return ret;
> +	}
> +
> +	buf = kmalloc(EVENT_BUF_SIZE+1, GFP_KERNEL);
> +	if (!buf)
> +		return -ENOMEM;
> +
> +	if (cnt > EVENT_BUF_SIZE)
> +		cnt = EVENT_BUF_SIZE;
> +
> +	i = 0;
> +	while (cnt && !isspace(ch)) {
> +		if (!i && ch == '!')
> +			set = 0;
> +		else
> +			buf[i++] = ch;
> +
> +		ret = get_user(ch, ubuf++);
> +		if (ret)
> +			goto out_free;
> +		read++;
> +		cnt--;
> +	}
> +	buf[i] = 0;

Gad, what a lot of stuff.

Use strncpy_from_user()?

Use strstrip()?

Why do we care about leading and trailing whitespace - user error!

Then we can use sysfs_streq().

If we really really need to do all this, how's about positioning it as
a general copy_string_from_userspace_then_trim_the_ends() for other
callsites to use?

> +	file->f_pos += read;
> +
> +	ret = ftrace_set_clr_event(buf, set);
> +	if (ret)
> +		goto out_free;
> +
> +	ret = read;
> +
> + out_free:
> +	kfree(buf);
> +
> +	return ret;
> +}
>
> ..
>
>
> +static int
> +ftrace_event_seq_open(struct inode *inode, struct file *file)
> +{
> +	int ret;
> +	const struct seq_operations *seq_ops;
> +
> +	if ((file->f_mode & FMODE_WRITE) &&
> +	    !(file->f_flags & O_APPEND))
> +		ftrace_clear_events();
> +
> +	seq_ops = inode->i_private;
> +	ret = seq_open(file, seq_ops);
> +	if (!ret) {
> +		struct seq_file *m = file->private_data;
> +
> +		m->private = __start_ftrace_events;
> +	}
> +	return ret;
> +}

It would be nice if the code were to somewhere document the userspace
interface which it is attempting to implement.  It's a bit hard to
follow if the reader doesn't already know this.



^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH 2/4] tracing: add event trace infrastructure
  2009-02-25  3:45   ` Andrew Morton
@ 2009-02-25  4:08     ` Steven Rostedt
  2009-02-25  4:24       ` Nick Piggin
  2009-02-25  4:33       ` Andrew Morton
  2009-02-25 13:37     ` Theodore Tso
  1 sibling, 2 replies; 51+ messages in thread
From: Steven Rostedt @ 2009-02-25  4:08 UTC (permalink / raw)
  To: Andrew Morton
  Cc: LKML, Ingo Molnar, Thomas Gleixner, Peter Zijlstra,
	Frederic Weisbecker, Theodore Tso, Arjan van de Ven,
	Pekka Paalanen, Arnaldo Carvalho de Melo, Jason Baron,
	Martin Bligh, Mathieu Desnoyers, Frank Ch. Eigler,
	KOSAKI Motohiro, Jens Axboe, Masami Hiramatsu, Steven Rostedt



On Tue, 24 Feb 2009, Andrew Morton wrote:

> On Tue, 24 Feb 2009 21:56:10 -0500 Steven Rostedt <rostedt@goodmis.org> wrote:
> 
> > From: Steven Rostedt <srostedt@redhat.com>
> > 
> > This patch creates the event tracing infrastructure of ftrace.
> > It will create the files:
> > 
> >  /debug/tracing/available_events
> >  /debug/tracing/set_event
> > 
> > The available_events will list the trace points that have been
> > registered with the event tracer.
> > 
> > set_events will allow the user to enable or disable an event hook.
> > 
> > example:
> > 
> >  # echo sched_wakeup > /debug/tracing/set_event
> > 
> > Will enable the sched_wakeup event (if it is registered).
> > 
> >  # echo "!sched_wakeup" >> /debug/tracing/set_event
> > 
> > Will disable the sched_wakeup event (and only that event).
> 
> Why not
> 
> echo sched_wakeup > /debug/tracing/set_event
> echo sched_wakeup > /debug/tracing/clear_event

I'm trying to keep the number of files in /debug/tracing down.

> 
> ?
> 
> >  # echo > /debug/tracing/set_event
> > 
> > Will disable all events (notice the '>')
> 
> echo 1 > /debug/tracing/clear_all_events
> 
> ?
> 
> >  # cat /debug/tracing/available_events > /debug/tracing/set_event
> > 
> > Will enable all registered event hooks.
> > 
> > Signed-off-by: Steven Rostedt <srostedt@redhat.com>
> > ---
> >  include/asm-generic/vmlinux.lds.h |   11 ++-
> >  kernel/trace/Kconfig              |    9 ++
> >  kernel/trace/Makefile             |    1 +
> >  kernel/trace/trace_events.c       |  280 +++++++++++++++++++++++++++++++++++++
> >  kernel/trace/trace_events.h       |   52 +++++++
> >  5 files changed, 352 insertions(+), 1 deletions(-)
> >  create mode 100644 kernel/trace/trace_events.c
> >  create mode 100644 kernel/trace/trace_events.h
> > 
> > diff --git a/include/asm-generic/vmlinux.lds.h b/include/asm-generic/vmlinux.lds.h
> > index c61fab1..0add6b2 100644
> > --- a/include/asm-generic/vmlinux.lds.h
> > +++ b/include/asm-generic/vmlinux.lds.h
> > @@ -61,6 +61,14 @@
> >  #define BRANCH_PROFILE()
> >  #endif
> >  
> > +#ifdef CONFIG_EVENT_TRACER
> > +#define FTRACE_EVENTS()	VMLINUX_SYMBOL(__start_ftrace_events) = .;	\
> > +			*(_ftrace_events)				\
> > +			VMLINUX_SYMBOL(__stop_ftrace_events) = .;
> > +#else
> > +#define FTRACE_EVENTS()
> > +#endif
> > +
> >  /* .data section */
> >  #define DATA_DATA							\
> >  	*(.data)							\
> > @@ -81,7 +89,8 @@
> >  	*(__tracepoints)						\
> >  	VMLINUX_SYMBOL(__stop___tracepoints) = .;			\
> >  	LIKELY_PROFILE()		       				\
> > -	BRANCH_PROFILE()
> > +	BRANCH_PROFILE()						\
> > +	FTRACE_EVENTS()
> >  
> >  #define RO_DATA(align)							\
> >  	. = ALIGN((align));						\
> > diff --git a/kernel/trace/Kconfig b/kernel/trace/Kconfig
> > index 07877f4..999c6a2 100644
> > --- a/kernel/trace/Kconfig
> > +++ b/kernel/trace/Kconfig
> > @@ -159,6 +159,15 @@ config CONTEXT_SWITCH_TRACER
> >  	  This tracer gets called from the context switch and records
> >  	  all switching of tasks.
> >  
> > +config EVENT_TRACER
> > +	bool "Trace various events in the kernel"
> > +	depends on DEBUG_KERNEL
> > +	select TRACING
> > +	help
> > +	  This tracer hooks to various trace points in the kernel
> > +	  allowing the user to pick and choose which trace point they
> > +	  want to trace.
> > +
> >  config BOOT_TRACER
> >  	bool "Trace boot initcalls"
> >  	depends on DEBUG_KERNEL
> > diff --git a/kernel/trace/Makefile b/kernel/trace/Makefile
> > index 627090b..c736356 100644
> > --- a/kernel/trace/Makefile
> > +++ b/kernel/trace/Makefile
> > @@ -38,5 +38,6 @@ obj-$(CONFIG_POWER_TRACER) += trace_power.o
> >  obj-$(CONFIG_KMEMTRACE) += kmemtrace.o
> >  obj-$(CONFIG_WORKQUEUE_TRACER) += trace_workqueue.o
> >  obj-$(CONFIG_BLK_DEV_IO_TRACE)	+= blktrace.o
> > +obj-$(CONFIG_EVENT_TRACER) += trace_events.o
> >  
> >  libftrace-y := ftrace.o
> > diff --git a/kernel/trace/trace_events.c b/kernel/trace/trace_events.c
> > new file mode 100644
> > index 0000000..05bc80e
> > --- /dev/null
> > +++ b/kernel/trace/trace_events.c
> > @@ -0,0 +1,280 @@
> > +/*
> > + * event tracer
> > + *
> > + * Copyright (C) 2008 Red Hat Inc, Steven Rostedt <srostedt@redhat.com>
> > + *
> > + */
> > +
> > +#include <linux/debugfs.h>
> > +#include <linux/uaccess.h>
> > +#include <linux/module.h>
> > +#include <linux/ctype.h>
> > +
> > +#include "trace_events.h"
> > +
> > +void event_trace_printk(unsigned long ip, const char *fmt, ...)
> > +{
> > +	va_list ap;
> > +
> > +	va_start(ap, fmt);
> > +	tracing_record_cmdline(current);
> > +	trace_vprintk(ip, task_curr_ret_stack(current), fmt, ap);
> > +	va_end(ap);
> > +}
> > +
> > +static void ftrace_clear_events(void)
> > +{
> > +	struct ftrace_event_call *call = (void *)__start_ftrace_events;
> 
> __start_ftrace_events has type `unsigned long'.  It is instantiated by
> the linker.  There's a very old unix convention tha tthese things have
> type `int'.  Doesn't matter.
> 
> It's a bit strange to do the double-cast like this - one explicit, one
> implicit.  Doesn't matter much.

Heh, I fix this in a later patch. Well, I make __start_ftrace_events into 
an array of struct ftrace_event_call's.

> 
> > +
> > +
> 
> The patch adds various random \n\n's
> 
> > +	while ((unsigned long)call < (unsigned long)__stop_ftrace_events) {
> > +
> 
> and various random \n's
> 

Hmm, I cut a bunch out of here in development. I must not have noticed
that I left extra \n's laying around. And now I feel guilty for yelling
at my daughter for leaving her \n's on the kitchen floor.

> > +		if (call->enabled) {
> > +			call->enabled = 0;
> > +			call->unregfunc();
> > +		}
> > +		call++;
> > +	}
> > +}
> > +
> > +static int ftrace_set_clr_event(char *buf, int set)
> > +{
> > +	struct ftrace_event_call *call = (void *)__start_ftrace_events;
> > +
> > +
> > +	while ((unsigned long)call < (unsigned long)__stop_ftrace_events) {
> > +
> > +		if (strcmp(buf, call->name) != 0) {
> > +			call++;
> > +			continue;
> > +		}
> > +
> > +		if (set) {
> > +			/* Already set? */
> > +			if (call->enabled)
> > +				return 0;
> > +			call->enabled = 1;
> > +			call->regfunc();
> > +		} else {
> > +			/* Already cleared? */
> > +			if (!call->enabled)
> > +				return 0;
> > +			call->enabled = 0;
> > +			call->unregfunc();
> > +		}
> > +		return 0;
> > +	}
> > +	return -EINVAL;
> > +}
> 
> I spose if I looked at this and surrounding code enough, I could work
> out what it's for.

Um, er, ah, yyyyeaah, you would. /me quickly goes to write a bunch of 
comments.

> 
> > +/* 128 should be much more than enough */
> > +#define EVENT_BUF_SIZE		127
> > +
> > +static ssize_t
> > +ftrace_event_write(struct file *file, const char __user *ubuf,
> > +		   size_t cnt, loff_t *ppos)
> > +{
> > +	size_t read = 0;
> > +	int i, set = 1;
> > +	ssize_t ret;
> > +	char *buf;
> > +	char ch;
> > +
> > +	if (!cnt || cnt < 0)
> > +		return 0;
> 
> cnt is unsigned, and I don't think the VFS lets through either zero or
> -ve counts (I always forget).

gag, this was cut and pasted from another function. I guess once I have 
some sloppiness around, I constantly reuse it :-/  A clean up patch is 
soon to come.


> 
> > +	ret = get_user(ch, ubuf++);
> > +	if (ret)
> > +		return ret;
> > +	read++;
> > +	cnt--;
> > +
> > +	/* skip white space */
> > +	while (cnt && isspace(ch)) {
> > +		ret = get_user(ch, ubuf++);
> > +		if (ret)
> > +			return ret;
> > +		read++;
> > +		cnt--;
> > +	}
> > +
> > +	/* Only white space found? */
> > +	if (isspace(ch)) {
> > +		file->f_pos += read;
> > +		ret = read;
> > +		return ret;
> > +	}
> > +
> > +	buf = kmalloc(EVENT_BUF_SIZE+1, GFP_KERNEL);
> > +	if (!buf)
> > +		return -ENOMEM;
> > +
> > +	if (cnt > EVENT_BUF_SIZE)
> > +		cnt = EVENT_BUF_SIZE;
> > +
> > +	i = 0;
> > +	while (cnt && !isspace(ch)) {
> > +		if (!i && ch == '!')
> > +			set = 0;
> > +		else
> > +			buf[i++] = ch;
> > +
> > +		ret = get_user(ch, ubuf++);
> > +		if (ret)
> > +			goto out_free;
> > +		read++;
> > +		cnt--;
> > +	}
> > +	buf[i] = 0;
> 
> Gad, what a lot of stuff.
> 
> Use strncpy_from_user()?
> 
> Use strstrip()?
> 
> Why do we care about leading and trailing whitespace - user error!

This is because i want:

 cat available_events > set_event

to work. It probably is already broken. The code I copied this from had an 
iterator to catch when we are in a middle of a name, and it would reuse 
the buffer on the next call. This code has the seq_read part shared with 
other code, so passing an iterator is a bit more complex.

> 
> Then we can use sysfs_streq().
> 
> If we really really need to do all this, how's about positioning it as
> a general copy_string_from_userspace_then_trim_the_ends() for other
> callsites to use?

Or more about, copy_word_from_userspace() that gives me the next word.
Space delimited. I only want to process one word at a time.

> 
> > +	file->f_pos += read;
> > +
> > +	ret = ftrace_set_clr_event(buf, set);
> > +	if (ret)
> > +		goto out_free;
> > +
> > +	ret = read;
> > +
> > + out_free:
> > +	kfree(buf);
> > +
> > +	return ret;
> > +}
> >
> > ..
> >
> >
> > +static int
> > +ftrace_event_seq_open(struct inode *inode, struct file *file)
> > +{
> > +	int ret;
> > +	const struct seq_operations *seq_ops;
> > +
> > +	if ((file->f_mode & FMODE_WRITE) &&
> > +	    !(file->f_flags & O_APPEND))
> > +		ftrace_clear_events();
> > +
> > +	seq_ops = inode->i_private;
> > +	ret = seq_open(file, seq_ops);
> > +	if (!ret) {
> > +		struct seq_file *m = file->private_data;
> > +
> > +		m->private = __start_ftrace_events;
> > +	}
> > +	return ret;
> > +}
> 
> It would be nice if the code were to somewhere document the userspace
> interface which it is attempting to implement.  It's a bit hard to
> follow if the reader doesn't already know this.

Ingo, I told you I should comment ;-)

Yeah, I need to go back and add a bunch of comments. I actually did plan 
to do that. 

Thanks,

-- Steve


^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH 2/4] tracing: add event trace infrastructure
  2009-02-25  4:08     ` Steven Rostedt
@ 2009-02-25  4:24       ` Nick Piggin
  2009-02-25  4:33       ` Andrew Morton
  1 sibling, 0 replies; 51+ messages in thread
From: Nick Piggin @ 2009-02-25  4:24 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: Andrew Morton, LKML, Ingo Molnar, Thomas Gleixner,
	Peter Zijlstra, Frederic Weisbecker, Theodore Tso,
	Arjan van de Ven, Pekka Paalanen, Arnaldo Carvalho de Melo,
	Jason Baron, Martin Bligh, Mathieu Desnoyers, Frank Ch. Eigler,
	KOSAKI Motohiro, Jens Axboe, Masami Hiramatsu, Steven Rostedt

On Wednesday 25 February 2009 15:08:56 Steven Rostedt wrote:
> On Tue, 24 Feb 2009, Andrew Morton wrote:
> > On Tue, 24 Feb 2009 21:56:10 -0500 Steven Rostedt <rostedt@goodmis.org> 
wrote:
> > > From: Steven Rostedt <srostedt@redhat.com>
> > >
> > > This patch creates the event tracing infrastructure of ftrace.
> > > It will create the files:
> > >
> > >  /debug/tracing/available_events
> > >  /debug/tracing/set_event
> > >
> > > The available_events will list the trace points that have been
> > > registered with the event tracer.
> > >
> > > set_events will allow the user to enable or disable an event hook.
> > >
> > > example:
> > >
> > >  # echo sched_wakeup > /debug/tracing/set_event
> > >
> > > Will enable the sched_wakeup event (if it is registered).
> > >
> > >  # echo "!sched_wakeup" >> /debug/tracing/set_event
> > >
> > > Will disable the sched_wakeup event (and only that event).
> >
> > Why not
> >
> > echo sched_wakeup > /debug/tracing/set_event
> > echo sched_wakeup > /debug/tracing/clear_event
>
> I'm trying to keep the number of files in /debug/tracing down.

Why? Andrew's proposal is much cleaner. (what is the !sched_wakeup
event? An event you get when the scheduler does not wake up a task? ;)



^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH 2/4] tracing: add event trace infrastructure
  2009-02-25  4:08     ` Steven Rostedt
  2009-02-25  4:24       ` Nick Piggin
@ 2009-02-25  4:33       ` Andrew Morton
  2009-02-25  5:16         ` Mathieu Desnoyers
  2009-02-25  8:11         ` Ingo Molnar
  1 sibling, 2 replies; 51+ messages in thread
From: Andrew Morton @ 2009-02-25  4:33 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: LKML, Ingo Molnar, Thomas Gleixner, Peter Zijlstra,
	Frederic Weisbecker, Theodore Tso, Arjan van de Ven,
	Pekka Paalanen, Arnaldo Carvalho de Melo, Jason Baron,
	Martin Bligh, Mathieu Desnoyers, Frank Ch. Eigler,
	KOSAKI Motohiro, Jens Axboe, Masami Hiramatsu, Steven Rostedt

On Tue, 24 Feb 2009 23:08:56 -0500 (EST) Steven Rostedt <rostedt@goodmis.org> wrote:

> > Gad, what a lot of stuff.
> > 
> > Use strncpy_from_user()?
> > 
> > Use strstrip()?
> > 
> > Why do we care about leading and trailing whitespace - user error!
> 
> This is because i want:
> 
>  cat available_events > set_event
> 
> to work.

:(

Why on earth do we keep on putting all these pretty-printers and
pretty-parsers into the kernel?  I mean, how hard is it for userspace
to read a text file, do some basic substitutions and print it out again?


^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH 2/4] tracing: add event trace infrastructure
  2009-02-25  4:33       ` Andrew Morton
@ 2009-02-25  5:16         ` Mathieu Desnoyers
  2009-02-25  8:11         ` Ingo Molnar
  1 sibling, 0 replies; 51+ messages in thread
From: Mathieu Desnoyers @ 2009-02-25  5:16 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Steven Rostedt, LKML, Ingo Molnar, Thomas Gleixner,
	Peter Zijlstra, Frederic Weisbecker, Theodore Tso,
	Arjan van de Ven, Pekka Paalanen, Arnaldo Carvalho de Melo,
	Jason Baron, Martin Bligh, Frank Ch. Eigler, KOSAKI Motohiro,
	Jens Axboe, Masami Hiramatsu, Steven Rostedt

* Andrew Morton (akpm@linux-foundation.org) wrote:
> On Tue, 24 Feb 2009 23:08:56 -0500 (EST) Steven Rostedt <rostedt@goodmis.org> wrote:
> 
> > > Gad, what a lot of stuff.
> > > 
> > > Use strncpy_from_user()?
> > > 
> > > Use strstrip()?
> > > 
> > > Why do we care about leading and trailing whitespace - user error!
> > 
> > This is because i want:
> > 
> >  cat available_events > set_event
> > 
> > to work.
> 
> :(
> 
> Why on earth do we keep on putting all these pretty-printers and
> pretty-parsers into the kernel?  I mean, how hard is it for userspace
> to read a text file, do some basic substitutions and print it out again?
> 

... or to read a binary buffer and format it in text, which is much more
efficient... :) If that's the kind of feature you are looking for, you
will probably like this aspect of the LTTng patchset. I should really,
really post it soon.

Mathieu


-- 
Mathieu Desnoyers
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F  BA06 3F25 A8FE 3BAE 9A68

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH 1/4] tracing: add DEFINE_TRACE_FMT to tracepoint.h
  2009-02-25  2:56 ` [PATCH 1/4] tracing: add DEFINE_TRACE_FMT to tracepoint.h Steven Rostedt
@ 2009-02-25  6:27   ` Peter Zijlstra
  2009-02-25 13:01     ` Steven Rostedt
  2009-02-25 16:13   ` Mathieu Desnoyers
  1 sibling, 1 reply; 51+ messages in thread
From: Peter Zijlstra @ 2009-02-25  6:27 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: linux-kernel, Ingo Molnar, Andrew Morton, Thomas Gleixner,
	Frederic Weisbecker, Theodore Tso, Arjan van de Ven,
	Pekka Paalanen, Arnaldo Carvalho de Melo, Jason Baron,
	Martin Bligh, Mathieu Desnoyers, Frank Ch. Eigler,
	KOSAKI Motohiro, Jens Axboe, Masami Hiramatsu, Steven Rostedt

On Tue, 2009-02-24 at 21:56 -0500, Steven Rostedt wrote:

> +#define DEFINE_TRACE_FMT(name, proto, args, fmt)		\
> +	DECLARE_TRACE(name, TPPROTO(proto), TPARGS(args))
> +

If TPPROTO and TPARGS were anything but empty stubs, the above would add
them twice, no?


^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH 3/4] tracing: add schedule events to event trace
  2009-02-25  2:56 ` [PATCH 3/4] tracing: add schedule events to event trace Steven Rostedt
@ 2009-02-25  6:29   ` Peter Zijlstra
  0 siblings, 0 replies; 51+ messages in thread
From: Peter Zijlstra @ 2009-02-25  6:29 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: linux-kernel, Ingo Molnar, Andrew Morton, Thomas Gleixner,
	Frederic Weisbecker, Theodore Tso, Arjan van de Ven,
	Pekka Paalanen, Arnaldo Carvalho de Melo, Jason Baron,
	Martin Bligh, Mathieu Desnoyers, Frank Ch. Eigler,
	KOSAKI Motohiro, Jens Axboe, Masami Hiramatsu, Steven Rostedt

On Tue, 2009-02-24 at 21:56 -0500, Steven Rostedt wrote:
> +DEFINE_TRACE_FMT(sched_switch,
> +       TPPROTO(struct rq *rq, struct task_struct *prev,
> +               struct task_struct *next),
> +       TPARGS(rq, prev, next),
> +       TPFMT("task %s:%d ==> %s:%d",
> +             prev->comm, prev->pid, next->comm, next->pid));

Why add tehe "task" string to the left hand task, but not the right hand
one?


^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH 4/4] tracing: make event directory structure
  2009-02-25  2:56 ` [PATCH 4/4] tracing: make event directory structure Steven Rostedt
@ 2009-02-25  6:59   ` Frederic Weisbecker
  2009-02-25 13:07     ` Steven Rostedt
  0 siblings, 1 reply; 51+ messages in thread
From: Frederic Weisbecker @ 2009-02-25  6:59 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: linux-kernel, Ingo Molnar, Andrew Morton, Thomas Gleixner,
	Peter Zijlstra, Theodore Tso, Arjan van de Ven, Pekka Paalanen,
	Arnaldo Carvalho de Melo, Jason Baron, Martin Bligh,
	Mathieu Desnoyers, Frank Ch. Eigler, KOSAKI Motohiro, Jens Axboe,
	Masami Hiramatsu, Steven Rostedt

On Tue, Feb 24, 2009 at 09:56:12PM -0500, Steven Rostedt wrote:
> From: Steven Rostedt <srostedt@redhat.com>
> 
> This patch adds the directory /debug/tracing/events/ that will contain
> all the registered trace points.
> 
>  # ls /debug/tracing/events/
> sched_kthread_stop      sched_process_fork  sched_switch
> sched_kthread_stop_ret  sched_process_free  sched_wait_task
> sched_migrate_task      sched_process_wait  sched_wakeup
> sched_process_exit      sched_signal_send   sched_wakeup_new
> 
>  # ls /debug/tracing/events/sched_switch/
> enable
> 
>  # cat /debug/tracing/events/sched_switch/enable
> 1
> 
>  # cat /debug/tracing/set_event
> sched_switch


Do you plan to add other things inside these events directory?
I mean, if you plan to only put an enable file on each of them,
it would be better to have just one file:

cat /debug/tracing/events/sched_switch
1

 
> Signed-off-by: Steven Rostedt <srostedt@redhat.com>
> ---
>  kernel/trace/trace_events.c |  137 +++++++++++++++++++++++++++++++++++++++++--
>  kernel/trace/trace_events.h |    7 ++-
>  2 files changed, 137 insertions(+), 7 deletions(-)
> 
> diff --git a/kernel/trace/trace_events.c b/kernel/trace/trace_events.c
> index 05bc80e..3bcb9df 100644
> --- a/kernel/trace/trace_events.c
> +++ b/kernel/trace/trace_events.c
> @@ -12,6 +12,11 @@
>  
>  #include "trace_events.h"
>  
> +#define events_for_each(event)						\
> +	for (event = __start_ftrace_events;				\
> +	     (unsigned long)event < (unsigned long)__stop_ftrace_events; \
> +	     event++)
> +


The name for_each_events seems to me more intuitive, but it's just
a matter of taste.


Frederic.

>  void event_trace_printk(unsigned long ip, const char *fmt, ...)
>  {
>  	va_list ap;
> @@ -39,15 +44,16 @@ static void ftrace_clear_events(void)
>  
>  static int ftrace_set_clr_event(char *buf, int set)
>  {
> -	struct ftrace_event_call *call = (void *)__start_ftrace_events;
> +	struct ftrace_event_call *call = __start_ftrace_events;
>  
>  
> -	while ((unsigned long)call < (unsigned long)__stop_ftrace_events) {
> +	events_for_each(call) {
>  
> -		if (strcmp(buf, call->name) != 0) {
> -			call++;
> +		if (!call->name)
> +			continue;
> +
> +		if (strcmp(buf, call->name) != 0)
>  			continue;
> -		}
>  
>  		if (set) {
>  			/* Already set? */
> @@ -223,6 +229,67 @@ ftrace_event_seq_open(struct inode *inode, struct file *file)
>  	return ret;
>  }
>  
> +static ssize_t
> +event_enable_read(struct file *filp, char __user *ubuf, size_t cnt,
> +		  loff_t *ppos)
> +{
> +	struct ftrace_event_call *call = filp->private_data;
> +	char *buf;
> +
> +	if (call->enabled)
> +		buf = "1\n";
> +	else
> +		buf = "0\n";
> +
> +	return simple_read_from_buffer(ubuf, cnt, ppos, buf, 2);
> +}
> +
> +static ssize_t
> +event_enable_write(struct file *filp, const char __user *ubuf, size_t cnt,
> +		   loff_t *ppos)
> +{
> +	struct ftrace_event_call *call = filp->private_data;
> +	char buf[64];
> +	unsigned long val;
> +	int ret;
> +
> +	if (cnt >= sizeof(buf))
> +		return -EINVAL;
> +
> +	if (copy_from_user(&buf, ubuf, cnt))
> +		return -EFAULT;
> +
> +	buf[cnt] = 0;
> +
> +	ret = strict_strtoul(buf, 10, &val);
> +	if (ret < 0)
> +		return ret;
> +
> +	switch (val) {
> +	case 0:
> +		if (!call->enabled)
> +			break;
> +
> +		call->enabled = 0;
> +		call->unregfunc();
> +		break;
> +	case 1:
> +		if (call->enabled)
> +			break;
> +
> +		call->enabled = 1;
> +		call->regfunc();
> +		break;
> +
> +	default:
> +		return -EINVAL;
> +	}
> +
> +	*ppos += cnt;
> +
> +	return cnt;
> +}
> +
>  static const struct seq_operations show_event_seq_ops = {
>  	.start = t_start,
>  	.next = t_next,
> @@ -252,10 +319,59 @@ static const struct file_operations ftrace_set_event_fops = {
>  	.release = seq_release,
>  };
>  
> +static const struct file_operations ftrace_enable_fops = {
> +	.open = tracing_open_generic,
> +	.read = event_enable_read,
> +	.write = event_enable_write,
> +};
> +
> +static struct dentry *event_trace_events_dir(void)
> +{
> +	static struct dentry *d_tracer;
> +	static struct dentry *d_events;
> +
> +	if (d_events)
> +		return d_events;
> +
> +	d_tracer = tracing_init_dentry();
> +	if (!d_tracer)
> +		return NULL;
> +
> +	d_events = debugfs_create_dir("events", d_tracer);
> +	if (!d_events)
> +		pr_warning("Could not create debugfs "
> +			   "'events' directory\n");
> +
> +	return d_events;
> +}
> +
> +static int
> +event_create_dir(struct ftrace_event_call *call, struct dentry *d_events)
> +{
> +	struct dentry *entry;
> +
> +	call->dir = debugfs_create_dir(call->name, d_events);
> +	if (!call->dir) {
> +		pr_warning("Could not create debugfs "
> +			   "'%s' directory\n", call->name);
> +		return -1;
> +	}
> +
> +	entry = debugfs_create_file("enable", 0644, call->dir, call,
> +				    &ftrace_enable_fops);
> +	if (!entry)
> +		pr_warning("Could not create debugfs "
> +			   "'%s/enable' entry\n", call->name);
> +
> +	return 0;
> +}
> +
>  static __init int event_trace_init(void)
>  {
> +	struct ftrace_event_call *call = __start_ftrace_events;
>  	struct dentry *d_tracer;
>  	struct dentry *entry;
> +	struct dentry *d_events;
>  
>  	d_tracer = tracing_init_dentry();
>  	if (!d_tracer)
> @@ -275,6 +391,17 @@ static __init int event_trace_init(void)
>  		pr_warning("Could not create debugfs "
>  			   "'set_event' entry\n");
>  
> +	d_events = event_trace_events_dir();
> +	if (!d_events)
> +		return 0;
> +
> +	events_for_each(call) {
> +		/* The linker may leave blanks */
> +		if (!call->name)
> +			continue;
> +		event_create_dir(call, d_events);
> +	}
> +
>  	return 0;
>  }
>  fs_initcall(event_trace_init);
> diff --git a/kernel/trace/trace_events.h b/kernel/trace/trace_events.h
> index 39342f8..cb8455b 100644
> --- a/kernel/trace/trace_events.h
> +++ b/kernel/trace/trace_events.h
> @@ -1,11 +1,13 @@
>  #ifndef _LINUX_KERNEL_TRACE_EVENTS_H
>  #define _LINUX_KERNEL_TRACE_EVENTS_H
>  
> +#include <linux/debugfs.h>
>  #include <linux/ftrace.h>
>  #include "trace.h"
>  
>  struct ftrace_event_call {
>  	char		*name;
> +	struct dentry	*dir;
>  	int		enabled;
>  	int		(*regfunc)(void);
>  	void		(*unregfunc)(void);
> @@ -39,6 +41,7 @@ static void ftrace_unreg_event_##call(void)				\
>  }									\
>  									\
>  static struct ftrace_event_call __used					\
> +__attribute__((__aligned__(4)))						\
>  __attribute__((section("_ftrace_events"))) event_##call = {		\
>  	.name 			= #call,				\
>  	.regfunc		= ftrace_reg_event_##call,		\
> @@ -46,7 +49,7 @@ __attribute__((section("_ftrace_events"))) event_##call = {		\
>  }
>  
>  void event_trace_printk(unsigned long ip, const char *fmt, ...);
> -extern unsigned long __start_ftrace_events[];
> -extern unsigned long __stop_ftrace_events[];
> +extern struct ftrace_event_call __start_ftrace_events[];
> +extern struct ftrace_event_call __stop_ftrace_events[];
>  
>  #endif /* _LINUX_KERNEL_TRACE_EVENTS_H */
> -- 
> 1.5.6.5
> 
> -- 


^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH 2/4] tracing: add event trace infrastructure
  2009-02-25  4:33       ` Andrew Morton
  2009-02-25  5:16         ` Mathieu Desnoyers
@ 2009-02-25  8:11         ` Ingo Molnar
  2009-02-25  8:28           ` Andrew Morton
  1 sibling, 1 reply; 51+ messages in thread
From: Ingo Molnar @ 2009-02-25  8:11 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Steven Rostedt, LKML, Thomas Gleixner, Peter Zijlstra,
	Frederic Weisbecker, Theodore Tso, Arjan van de Ven,
	Pekka Paalanen, Arnaldo Carvalho de Melo, Jason Baron,
	Martin Bligh, Mathieu Desnoyers, Frank Ch. Eigler,
	KOSAKI Motohiro, Jens Axboe, Masami Hiramatsu, Steven Rostedt


* Andrew Morton <akpm@linux-foundation.org> wrote:

> On Tue, 24 Feb 2009 23:08:56 -0500 (EST) Steven Rostedt <rostedt@goodmis.org> wrote:
> 
> > > Gad, what a lot of stuff.
> > > 
> > > Use strncpy_from_user()?
> > > 
> > > Use strstrip()?
> > > 
> > > Why do we care about leading and trailing whitespace - user error!
> > 
> > This is because i want:
> > 
> >  cat available_events > set_event
> > 
> > to work.
> 
> :(
> 
> Why on earth do we keep on putting all these pretty-printers 
> and pretty-parsers into the kernel?  I mean, how hard is it 
> for userspace to read a text file, do some basic substitutions 
> and print it out again?

Note that there's no mandatory user-space component here - the 
final destination is the kernel developer's eyes in 90% of the 
cases. These traces get pasted into email, etc. etc.

So leading spaces, meaningful formatting and general hands-on 
usability is important. [ I know, it's a strange concept in the 
kernel, we tend to have a perversion for the most unusable and 
most inconsistent user interfaces ;-) ]

It's also a balancing act. We dont want to put all of TeX into 
the kernel obviously. Nor do we want the default to be the 
opposite end of the spectrum: to output raw binary records. So 
we find some middle ground. That middle ground is inluenced by 
the developers using this stuff.

	Ingo

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH 2/4] tracing: add event trace infrastructure
  2009-02-25  8:11         ` Ingo Molnar
@ 2009-02-25  8:28           ` Andrew Morton
  2009-02-25  8:40             ` Ingo Molnar
                               ` (3 more replies)
  0 siblings, 4 replies; 51+ messages in thread
From: Andrew Morton @ 2009-02-25  8:28 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Steven Rostedt, LKML, Thomas Gleixner, Peter Zijlstra,
	Frederic Weisbecker, Theodore Tso, Arjan van de Ven,
	Pekka Paalanen, Arnaldo Carvalho de Melo, Jason Baron,
	Martin Bligh, Mathieu Desnoyers, Frank Ch. Eigler,
	KOSAKI Motohiro, Jens Axboe, Masami Hiramatsu, Steven Rostedt

On Wed, 25 Feb 2009 09:11:18 +0100 Ingo Molnar <mingo@elte.hu> wrote:

> 
> * Andrew Morton <akpm@linux-foundation.org> wrote:
> 
> > On Tue, 24 Feb 2009 23:08:56 -0500 (EST) Steven Rostedt <rostedt@goodmis.org> wrote:
> > 
> > > > Gad, what a lot of stuff.
> > > > 
> > > > Use strncpy_from_user()?
> > > > 
> > > > Use strstrip()?
> > > > 
> > > > Why do we care about leading and trailing whitespace - user error!
> > > 
> > > This is because i want:
> > > 
> > >  cat available_events > set_event
> > > 
> > > to work.
> > 
> > :(
> > 
> > Why on earth do we keep on putting all these pretty-printers 
> > and pretty-parsers into the kernel?  I mean, how hard is it 
> > for userspace to read a text file, do some basic substitutions 
> > and print it out again?
> 
> Note that there's no mandatory user-space component here - the 
> final destination is the kernel developer's eyes in 90% of the 
> cases. These traces get pasted into email, etc. etc.
> 
> So leading spaces, meaningful formatting and general hands-on 
> usability is important. [ I know, it's a strange concept in the 
> kernel, we tend to have a perversion for the most unusable and 
> most inconsistent user interfaces ;-) ]
> 
> It's also a balancing act. We dont want to put all of TeX into 
> the kernel obviously. Nor do we want the default to be the 
> opposite end of the spectrum: to output raw binary records. So 
> we find some middle ground. That middle ground is inluenced by 
> the developers using this stuff.
> 

<For the enty enth pissing-in-the-wind time>

A better approach would be to design simple, robust kernel interfaces
which make sense and which aren't made all complex by putting the user
interface in kernel space.  And to maintain corresponding userspace
tools which manipulate and present the IO from those kernel interfaces.

But we don't do that, because userspace is hard, because we don't have
a delivery process.  But nobody has even tried!  We can do this - it
starts with `mkdir -p userspace/ktrace'.

Probably it's already too late for ktrace though - that hole is already
dug.


Last time I pissed in this wind I got fobbed off with some stupid "put
it in util-linux" answer.  But of course we won't do that, because it's
MUCH harder and slower and more complex than just patching the kernel
tree.  So nothing happened.  Again.

And please let's not all sit here busily thinking up improbable reasons
why it can't possibly work.  We're clever!  We can do this sort of
thing!  If problems arise, we fix them!

The only extant example I can think of is getdelays.c, and that was a
totally half-assed effort with no infrastructural support at all.  But
quite a lot of people use it, and patches occasionally come in for it,
no problems.


^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH 2/4] tracing: add event trace infrastructure
  2009-02-25  8:28           ` Andrew Morton
@ 2009-02-25  8:40             ` Ingo Molnar
  2009-02-25  9:15               ` Andrew Morton
  2009-02-25  9:00             ` Pekka Enberg
                               ` (2 subsequent siblings)
  3 siblings, 1 reply; 51+ messages in thread
From: Ingo Molnar @ 2009-02-25  8:40 UTC (permalink / raw)
  To: Andrew Morton, H. Peter Anvin
  Cc: Steven Rostedt, LKML, Thomas Gleixner, Peter Zijlstra,
	Frederic Weisbecker, Theodore Tso, Arjan van de Ven,
	Pekka Paalanen, Arnaldo Carvalho de Melo, Jason Baron,
	Martin Bligh, Mathieu Desnoyers, Frank Ch. Eigler,
	KOSAKI Motohiro, Jens Axboe, Masami Hiramatsu, Steven Rostedt


* Andrew Morton <akpm@linux-foundation.org> wrote:

> But we don't do that, because userspace is hard, because we 
> don't have a delivery process.  But nobody has even tried! 
> [...]

Actually, you can do it right now, with the current model:

  echo raw > /debug/tracing/trace_options
  echo bin > /debug/tracing/trace_options

Reality is that rarely does anyone use it because obviously 
self-sufficient kernel instrumentation is very convenient to 
kernel developers, to testers and to bugreporters alike. 
(Processing kernel details in user-space is also often slower 
and duplicates code that the kernel already has like symbol 
lookups.)

_Maybe_ some of it (but certainly not all) could be done via 
klibc - but that got nacked. If you can convince Linus to take 
klibc and to allow a userspace 'halo' of utilities that are 
intimate with kernel details into the kernel proper then maybe 
we could start providing such tools in the kernel proper.

	Ingo

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH 2/4] tracing: add event trace infrastructure
  2009-02-25  8:28           ` Andrew Morton
  2009-02-25  8:40             ` Ingo Molnar
@ 2009-02-25  9:00             ` Pekka Enberg
  2009-02-25  9:10               ` Ingo Molnar
  2009-02-25  9:22               ` Andrew Morton
  2009-02-25 13:54             ` Theodore Tso
  2009-03-01 10:37             ` KOSAKI Motohiro
  3 siblings, 2 replies; 51+ messages in thread
From: Pekka Enberg @ 2009-02-25  9:00 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Ingo Molnar, Steven Rostedt, LKML, Thomas Gleixner,
	Peter Zijlstra, Frederic Weisbecker, Theodore Tso,
	Arjan van de Ven, Pekka Paalanen, Arnaldo Carvalho de Melo,
	Jason Baron, Martin Bligh, Mathieu Desnoyers, Frank Ch. Eigler,
	KOSAKI Motohiro, Jens Axboe, Masami Hiramatsu, Steven Rostedt

Hi Andrew,

On Wed, Feb 25, 2009 at 10:28 AM, Andrew Morton
<akpm@linux-foundation.org> wrote:
> A better approach would be to design simple, robust kernel interfaces
> which make sense and which aren't made all complex by putting the user
> interface in kernel space.  And to maintain corresponding userspace
> tools which manipulate and present the IO from those kernel interfaces.

We did this for kmemtrace and quite frankly, I usually end up spending
more time figuring out how to export the data from a virtual machine
than actually analyzing the results. What makes ftrace so cool is the
fact that it has almost zero-overhead for setup and that the
text-based data format plays well with simple scripting tools
available everywhere.

                                     Pekka

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH 2/4] tracing: add event trace infrastructure
  2009-02-25  2:56 ` [PATCH 2/4] tracing: add event trace infrastructure Steven Rostedt
  2009-02-25  3:45   ` Andrew Morton
@ 2009-02-25  9:07   ` Lai Jiangshan
  2009-02-25 13:50     ` Steven Rostedt
  2009-02-25  9:21   ` Lai Jiangshan
  2 siblings, 1 reply; 51+ messages in thread
From: Lai Jiangshan @ 2009-02-25  9:07 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: linux-kernel, Ingo Molnar, Andrew Morton, Thomas Gleixner,
	Peter Zijlstra, Frederic Weisbecker, Theodore Tso,
	Arjan van de Ven, Pekka Paalanen, Arnaldo Carvalho de Melo,
	Jason Baron, Martin Bligh, Mathieu Desnoyers, Frank Ch. Eigler,
	KOSAKI Motohiro, Jens Axboe, Masami Hiramatsu, Steven Rostedt

a very nice infrastructure!

Two comments about format:

Steven Rostedt 写道:
> From: Steven Rostedt <srostedt@redhat.com>
> +#undef TPFMT
> +#define TPFMT(fmt, args...)	fmt "\n", ##args

Where is other TPFMT's definition?
I think TPFMT should be defined in tracepoint.h, because trace/<type>.h
uses it.

> +#define DEFINE_TRACE_FMT(call, proto, args, fmt)			\
> +static void ftrace_event_##call(proto)					\
> +{									\
> +	event_trace_printk(_RET_IP_, "(" #call ") " fmt);		\
> +}									\
> +									\
> +static int ftrace_reg_event_##call(void)				\
> +{									\
> +	int ret;							\
> +									\
> +	ret = register_trace_##call(ftrace_event_##call);		\
> +	if (!ret)							\
> +		pr_info("event trace: Could not activate trace point "	\
> +			"probe to " #call);				\
> +	return ret;							\
> +}									\
> +									\
> +static void ftrace_unreg_event_##call(void)				\
> +{									\
> +	unregister_trace_##call(ftrace_event_##call);			\
> +}									\
> +									\
> +static struct ftrace_event_call __used					\
> +__attribute__((section("_ftrace_events"))) event_##call = {		\
> +	.name 			= #call,				\
> +	.regfunc		= ftrace_reg_event_##call,		\
> +	.unregfunc		= ftrace_unreg_event_##call,		\
> +}
> +
> +void event_trace_printk(unsigned long ip, const char *fmt, ...);

add format checking here:
__attribute__ ((format (printf, 2, 3)))

If user does not use this infrastructure(he uses trace/<type>.h only),
this checking is not work, so I think we can add format checking
in DEFINE_TRACE_FMT in tracepoint.h

--------example:

static inline void  __tp_check_format(const char *fmt, ...)
		__attribute__ ((format (printf, 1, 2)));
static inline void __tp_check_format(const char *fmt, ...) {}

#define DECLARE_TRACE_FORMAT_CHECK(name, proto, fmt, fmt_args...)	\
static inline void ftrace_event_check_format_##name(proto)		\
{									\
	__tp_check_format(fmt, ##fmt_args);				\
}

#define DEFINE_TRACE_FMT(name, proto, args, fmt)			\
	DECLARE_TRACE_FORMAT_CHECK(name, TPPROTO(proto), fmt)		\
	DECLARE_TRACE(name, TPPROTO(proto), TPARGS(args))

---------

Thanks, Lai


^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH 2/4] tracing: add event trace infrastructure
  2009-02-25  9:00             ` Pekka Enberg
@ 2009-02-25  9:10               ` Ingo Molnar
  2009-02-25  9:22               ` Andrew Morton
  1 sibling, 0 replies; 51+ messages in thread
From: Ingo Molnar @ 2009-02-25  9:10 UTC (permalink / raw)
  To: Pekka Enberg
  Cc: Andrew Morton, Steven Rostedt, LKML, Thomas Gleixner,
	Peter Zijlstra, Frederic Weisbecker, Theodore Tso,
	Arjan van de Ven, Pekka Paalanen, Arnaldo Carvalho de Melo,
	Jason Baron, Martin Bligh, Mathieu Desnoyers, Frank Ch. Eigler,
	KOSAKI Motohiro, Jens Axboe, Masami Hiramatsu, Steven Rostedt


* Pekka Enberg <penberg@cs.helsinki.fi> wrote:

> Hi Andrew,
> 
> On Wed, Feb 25, 2009 at 10:28 AM, Andrew Morton
> <akpm@linux-foundation.org> wrote:
> > A better approach would be to design simple, robust kernel interfaces
> > which make sense and which aren't made all complex by putting the user
> > interface in kernel space.  And to maintain corresponding userspace
> > tools which manipulate and present the IO from those kernel interfaces.
> 
> We did this for kmemtrace and quite frankly, I usually end up 
> spending more time figuring out how to export the data from a 
> virtual machine than actually analyzing the results. What 
> makes ftrace so cool is the fact that it has almost 
> zero-overhead for setup and that the text-based data format 
> plays well with simple scripting tools available everywhere.

I just tried to write a similar reply - and ended up with a long 
5 page mail which was just repeating points others and me made 
before.

Instead i'll now print out your reply and will hang it on the 
wall ... ;-)

	Ingo

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH 2/4] tracing: add event trace infrastructure
  2009-02-25  8:40             ` Ingo Molnar
@ 2009-02-25  9:15               ` Andrew Morton
  0 siblings, 0 replies; 51+ messages in thread
From: Andrew Morton @ 2009-02-25  9:15 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: H. Peter Anvin, Steven Rostedt, LKML, Thomas Gleixner,
	Peter Zijlstra, Frederic Weisbecker, Theodore Tso,
	Arjan van de Ven, Pekka Paalanen, Arnaldo Carvalho de Melo,
	Jason Baron, Martin Bligh, Mathieu Desnoyers, Frank Ch. Eigler,
	KOSAKI Motohiro, Jens Axboe, Masami Hiramatsu, Steven Rostedt

On Wed, 25 Feb 2009 09:40:00 +0100 Ingo Molnar <mingo@elte.hu> wrote:

> _Maybe_ some of it (but certainly not all) could be done via 
> klibc - but that got nacked. If you can convince Linus to take 
> klibc and to allow a userspace 'halo' of utilities that are 
> intimate with kernel details into the kernel proper then maybe 
> we could start providing such tools in the kernel proper.

see, there you go!

No, we don't need to merge klibc to be able to develop and deliver
simple, kernel-developer-only userspace tools within the kernel tree. 
For heaven's sake.

Sam had a patchset which scoops up all the compileable .c files in
Documentation/ and actually puts them into a separate directory along
with a makefile.  It worked!  It didn't quite get merged though.


^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH 2/4] tracing: add event trace infrastructure
  2009-02-25  2:56 ` [PATCH 2/4] tracing: add event trace infrastructure Steven Rostedt
  2009-02-25  3:45   ` Andrew Morton
  2009-02-25  9:07   ` Lai Jiangshan
@ 2009-02-25  9:21   ` Lai Jiangshan
  2009-02-25 13:54     ` Steven Rostedt
  2 siblings, 1 reply; 51+ messages in thread
From: Lai Jiangshan @ 2009-02-25  9:21 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: linux-kernel, Ingo Molnar, Andrew Morton, Thomas Gleixner,
	Peter Zijlstra, Frederic Weisbecker, Theodore Tso,
	Arjan van de Ven, Pekka Paalanen, Arnaldo Carvalho de Melo,
	Jason Baron, Martin Bligh, Mathieu Desnoyers, Frank Ch. Eigler,
	KOSAKI Motohiro, Jens Axboe, Masami Hiramatsu, Steven Rostedt

Steven Rostedt 写道:
> 
> Signed-off-by: Steven Rostedt <srostedt@redhat.com>
> ---
>  include/asm-generic/vmlinux.lds.h |   11 ++-
>  kernel/trace/Kconfig              |    9 ++
>  kernel/trace/Makefile             |    1 +
>  kernel/trace/trace_events.c       |  280 +++++++++++++++++++++++++++++++++++++
>  kernel/trace/trace_events.h       |   52 +++++++
>  5 files changed, 352 insertions(+), 1 deletions(-)
>  create mode 100644 kernel/trace/trace_events.c
>  create mode 100644 kernel/trace/trace_events.h
> 

Modules can not use this infrastructure,
will it be implemented in future?

Lai


^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH 2/4] tracing: add event trace infrastructure
  2009-02-25  9:00             ` Pekka Enberg
  2009-02-25  9:10               ` Ingo Molnar
@ 2009-02-25  9:22               ` Andrew Morton
  2009-02-25  9:26                 ` Peter Zijlstra
                                   ` (2 more replies)
  1 sibling, 3 replies; 51+ messages in thread
From: Andrew Morton @ 2009-02-25  9:22 UTC (permalink / raw)
  To: Pekka Enberg
  Cc: Ingo Molnar, Steven Rostedt, LKML, Thomas Gleixner,
	Peter Zijlstra, Frederic Weisbecker, Theodore Tso,
	Arjan van de Ven, Pekka Paalanen, Arnaldo Carvalho de Melo,
	Jason Baron, Martin Bligh, Mathieu Desnoyers, Frank Ch. Eigler,
	KOSAKI Motohiro, Jens Axboe, Masami Hiramatsu, Steven Rostedt

On Wed, 25 Feb 2009 11:00:53 +0200 Pekka Enberg <penberg@cs.helsinki.fi> wrote:

> Hi Andrew,
> 
> On Wed, Feb 25, 2009 at 10:28 AM, Andrew Morton
> <akpm@linux-foundation.org> wrote:
> > A better approach would be to design simple, robust kernel interfaces
> > which make sense and which aren't made all complex by putting the user
> > interface in kernel space. __And to maintain corresponding userspace
> > tools which manipulate and present the IO from those kernel interfaces.
> 
> We did this for kmemtrace and quite frankly, I usually end up spending
> more time figuring out how to export the data from a virtual machine
> than actually analyzing the results.

Simple text files would suit.  I'm not saying use massive binary blobs.
Here.

> What makes ftrace so cool is the
> fact that it has almost zero-overhead for setup and that the
> text-based data format plays well with simple scripting tools
> available everywhere.

Parse this:

irqsoff latency trace v1.1.5 on 2.6.26-rc8
--------------------------------------------------------------------
 latency: 97 us, #3/3, CPU#0 | (M:preempt VP:0, KP:0, SP:0 HP:0 #P:2)
    -----------------
    | task: swapper-0 (uid:0 nice:0 policy:0 rt_prio:0)
    -----------------
 => started at: apic_timer_interrupt
 => ended at:   do_softirq

#                _------=> CPU#
#               / _-----=> irqs-off
#              | / _----=> need-resched
#              || / _---=> hardirq/softirq
#              ||| / _--=> preempt-depth
#              |||| /
#              |||||     delay
#  cmd     pid ||||| time  |   caller
#     \   /    |||||   \   |   /
  <idle>-0     0d..1    0us+: trace_hardirqs_off_thunk (apic_timer_interrupt)
  <idle>-0     0d.s.   97us : __do_softirq (do_softirq)
  <idle>-0     0d.s1   98us : trace_hardirqs_on (do_softirq)

your time starts now.

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH 2/4] tracing: add event trace infrastructure
  2009-02-25  9:22               ` Andrew Morton
@ 2009-02-25  9:26                 ` Peter Zijlstra
  2009-02-25 10:31                   ` Ingo Molnar
  2009-02-25  9:33                 ` Pekka Enberg
  2009-02-25 14:41                 ` [PATCH 2/4] tracing: add event trace infrastructure Steven Rostedt
  2 siblings, 1 reply; 51+ messages in thread
From: Peter Zijlstra @ 2009-02-25  9:26 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Pekka Enberg, Ingo Molnar, Steven Rostedt, LKML, Thomas Gleixner,
	Frederic Weisbecker, Theodore Tso, Arjan van de Ven,
	Pekka Paalanen, Arnaldo Carvalho de Melo, Jason Baron,
	Martin Bligh, Mathieu Desnoyers, Frank Ch. Eigler,
	KOSAKI Motohiro, Jens Axboe, Masami Hiramatsu, Steven Rostedt

On Wed, 2009-02-25 at 01:22 -0800, Andrew Morton wrote:

> Parse this:
> 
> irqsoff latency trace v1.1.5 on 2.6.26-rc8
> --------------------------------------------------------------------
>  latency: 97 us, #3/3, CPU#0 | (M:preempt VP:0, KP:0, SP:0 HP:0 #P:2)
>     -----------------
>     | task: swapper-0 (uid:0 nice:0 policy:0 rt_prio:0)
>     -----------------
>  => started at: apic_timer_interrupt
>  => ended at:   do_softirq
> 
> #                _------=> CPU#
> #               / _-----=> irqs-off
> #              | / _----=> need-resched
> #              || / _---=> hardirq/softirq
> #              ||| / _--=> preempt-depth
> #              |||| /
> #              |||||     delay
> #  cmd     pid ||||| time  |   caller
> #     \   /    |||||   \   |   /
>   <idle>-0     0d..1    0us+: trace_hardirqs_off_thunk (apic_timer_interrupt)
>   <idle>-0     0d.s.   97us : __do_softirq (do_softirq)
>   <idle>-0     0d.s1   98us : trace_hardirqs_on (do_softirq)
> 
> your time starts now.

awk '/^#/ { start=1; next } { if (!start) next; .... }'


^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH 2/4] tracing: add event trace infrastructure
  2009-02-25  9:22               ` Andrew Morton
  2009-02-25  9:26                 ` Peter Zijlstra
@ 2009-02-25  9:33                 ` Pekka Enberg
  2009-02-25  9:44                   ` Andrew Morton
  2009-02-25 10:07                   ` [PATCH] tracing: remove /debug/tracing/latency_trace Ingo Molnar
  2009-02-25 14:41                 ` [PATCH 2/4] tracing: add event trace infrastructure Steven Rostedt
  2 siblings, 2 replies; 51+ messages in thread
From: Pekka Enberg @ 2009-02-25  9:33 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Ingo Molnar, Steven Rostedt, LKML, Thomas Gleixner,
	Peter Zijlstra, Frederic Weisbecker, Theodore Tso,
	Arjan van de Ven, Pekka Paalanen, Arnaldo Carvalho de Melo,
	Jason Baron, Martin Bligh, Mathieu Desnoyers, Frank Ch. Eigler,
	KOSAKI Motohiro, Jens Axboe, Masami Hiramatsu, Steven Rostedt

Hi Andrew,

On Wed, 2009-02-25 at 01:22 -0800, Andrew Morton wrote:
> irqsoff latency trace v1.1.5 on 2.6.26-rc8
> --------------------------------------------------------------------
>  latency: 97 us, #3/3, CPU#0 | (M:preempt VP:0, KP:0, SP:0 HP:0 #P:2)
>     -----------------
>     | task: swapper-0 (uid:0 nice:0 policy:0 rt_prio:0)
>     -----------------
>  => started at: apic_timer_interrupt
>  => ended at:   do_softirq

<the header seems broken here, should have # at the beginning of the line>

> 
> #                _------=> CPU#
> #               / _-----=> irqs-off
> #              | / _----=> need-resched
> #              || / _---=> hardirq/softirq
> #              ||| / _--=> preempt-depth
> #              |||| /
> #              |||||     delay
> #  cmd     pid ||||| time  |   caller
> #     \   /    |||||   \   |   /
>   <idle>-0     0d..1    0us+: trace_hardirqs_off_thunk (apic_timer_interrupt)
>   <idle>-0     0d.s.   97us : __do_softirq (do_softirq)
>   <idle>-0     0d.s1   98us : trace_hardirqs_on (do_softirq)
> 
> your time starts now.

Well, what do you want to know? The first thing to do here is to:

$ grep -v "#" trace 
  <idle>-0     0d..1    0us+: trace_hardirqs_off_thunk (apic_timer_interrupt)
  <idle>-0     0d.s.   97us : __do_softirq (do_softirq)
  <idle>-0     0d.s1   98us : trace_hardirqs_on (do_softirq)

after which you have access to the raw data. This particular trace seems
to be somewhat hard to parse (because not all fields are whitespace
delimited) but I can assure you that any format I rely on is not.

So if you're arguing against specific ftrace plugins, go ahead (you
probably have a fair point there). But please don't dismiss the while
_concept_ of ftrace because of them.

			Pekka


^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH 2/4] tracing: add event trace infrastructure
  2009-02-25  9:33                 ` Pekka Enberg
@ 2009-02-25  9:44                   ` Andrew Morton
  2009-02-25  9:56                     ` Ingo Molnar
  2009-02-25  9:57                     ` Pekka Enberg
  2009-02-25 10:07                   ` [PATCH] tracing: remove /debug/tracing/latency_trace Ingo Molnar
  1 sibling, 2 replies; 51+ messages in thread
From: Andrew Morton @ 2009-02-25  9:44 UTC (permalink / raw)
  To: Pekka Enberg
  Cc: Ingo Molnar, Steven Rostedt, LKML, Thomas Gleixner,
	Peter Zijlstra, Frederic Weisbecker, Theodore Tso,
	Arjan van de Ven, Pekka Paalanen, Arnaldo Carvalho de Melo,
	Jason Baron, Martin Bligh, Mathieu Desnoyers, Frank Ch. Eigler,
	KOSAKI Motohiro, Jens Axboe, Masami Hiramatsu, Steven Rostedt

On Wed, 25 Feb 2009 11:33:07 +0200 Pekka Enberg <penberg@cs.helsinki.fi> wrote:

> Hi Andrew,
> 
> On Wed, 2009-02-25 at 01:22 -0800, Andrew Morton wrote:
> > irqsoff latency trace v1.1.5 on 2.6.26-rc8
> > --------------------------------------------------------------------
> >  latency: 97 us, #3/3, CPU#0 | (M:preempt VP:0, KP:0, SP:0 HP:0 #P:2)
> >     -----------------
> >     | task: swapper-0 (uid:0 nice:0 policy:0 rt_prio:0)
> >     -----------------
> >  => started at: apic_timer_interrupt
> >  => ended at:   do_softirq
> 
> <the header seems broken here, should have # at the beginning of the line>
> 
> > 
> > #                _------=> CPU#
> > #               / _-----=> irqs-off
> > #              | / _----=> need-resched
> > #              || / _---=> hardirq/softirq
> > #              ||| / _--=> preempt-depth
> > #              |||| /
> > #              |||||     delay
> > #  cmd     pid ||||| time  |   caller
> > #     \   /    |||||   \   |   /
> >   <idle>-0     0d..1    0us+: trace_hardirqs_off_thunk (apic_timer_interrupt)
> >   <idle>-0     0d.s.   97us : __do_softirq (do_softirq)
> >   <idle>-0     0d.s1   98us : trace_hardirqs_on (do_softirq)
> > 
> > your time starts now.
> 
> Well, what do you want to know? The first thing to do here is to:
> 
> $ grep -v "#" trace 
>   <idle>-0     0d..1    0us+: trace_hardirqs_off_thunk (apic_timer_interrupt)
>   <idle>-0     0d.s.   97us : __do_softirq (do_softirq)
>   <idle>-0     0d.s1   98us : trace_hardirqs_on (do_softirq)
> 
> after which you have access to the raw data. This particular trace seems
> to be somewhat hard to parse (because not all fields are whitespace
> delimited) but I can assure you that any format I rely on is not.

yes, but now you need to think about how this interface would have been
designed if we'd decided to access it with something smarter than
`cat'.

I mean, look at it.  All the multi-space column lining upping, the
unnecessary "us" annotation, the strange symbol(symbol) thing, etc. 
Plus it would have been more self-describing.  Right now, your parser
has to either assume that the second character of "0d..1" is
"irqs-off", or it has to learn how to follow ascii art lines.

Plus...  it's all English-only.

> So if you're arguing against specific ftrace plugins, go ahead (you
> probably have a fair point there). But please don't dismiss the while
> _concept_ of ftrace because of them.

Where on earth did that come from?

What I'm arguing against is putting English-only pretty-printers and
pretty-parsers on wrong side of int 80.  That's all.


^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH 2/4] tracing: add event trace infrastructure
  2009-02-25  9:44                   ` Andrew Morton
@ 2009-02-25  9:56                     ` Ingo Molnar
  2009-02-25 10:02                       ` Andrew Morton
  2009-02-25 16:21                       ` Frederic Weisbecker
  2009-02-25  9:57                     ` Pekka Enberg
  1 sibling, 2 replies; 51+ messages in thread
From: Ingo Molnar @ 2009-02-25  9:56 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Pekka Enberg, Steven Rostedt, LKML, Thomas Gleixner,
	Peter Zijlstra, Frederic Weisbecker, Theodore Tso,
	Arjan van de Ven, Pekka Paalanen, Arnaldo Carvalho de Melo,
	Jason Baron, Martin Bligh, Mathieu Desnoyers, Frank Ch. Eigler,
	KOSAKI Motohiro, Jens Axboe, Masami Hiramatsu, Steven Rostedt


* Andrew Morton <akpm@linux-foundation.org> wrote:

> Plus...  it's all English-only.

Note that with bin+raw output you can already internationalize 
tracepoints, if you want to.

I havent seen much interest in that, and the default tracing 
output is in English indeed, and the reason is rather 
fundamental: currently we've got 60,000+ kernel function 
symbols, 99% of which are in English.

Do you argue for them to be converted to some i8n format so that 
the trace output becomes readable in other languages as well? 

I.e. do you suggest that this:

>  3)               |  handle_mm_fault() {
>  3)               |    count_vm_event() {
>  3)   0.243 us    |      test_ti_thread_flag();
>  3)   0.754 us    |    }
>  3)   0.249 us    |    pud_alloc();
>  3)   0.251 us    |    pmd_alloc();
>  3)               |    __do_fault() {
>  3)               |      filemap_fault() {
>  3)               |        find_lock_page() {
>  3)               |          find_get_page() {
>  3)   0.248 us    |            test_ti_thread_flag();
>  3)   0.844 us    |          }
>  3)   1.341 us    |        }
>  3)   1.837 us    |      }
>  3)   0.275 us    |	   _spin_lock();
>  3)   0.257 us    |	   page_add_file_rmap();
>  3)   0.233 us    |      native_set_pte_at();

and /proc/kallsysms to be internationalized? Should all oopses 
and warnings that show up in the kernel log be translated as 
well?

I dont think it's realistic - and arguing for anything less and 
singling out tracing would be a double standard.

Currently being able to understand and hack the kernel means 
being able to read some English - and the same holds for trace 
output as well.

The default output of traces is just a mirror image of what is 
the kernel status quo. If the kernel gets internationalized so 
will ftrace be internationalized too.

> > So if you're arguing against specific ftrace plugins, go 
> > ahead (you probably have a fair point there). But please 
> > don't dismiss the while _concept_ of ftrace because of them.
> 
> Where on earth did that come from?
> 
> What I'm arguing against is putting English-only 
> pretty-printers and pretty-parsers on wrong side of int 80.  
> That's all.

Since the concept of a kernel tracing facility being 
self-sufficient and being easy to use is an integral and key 
concept to ftrace, dont you see why people take your suggestions 
as a dismissal of the ftrace concept?

	Ingo

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH 2/4] tracing: add event trace infrastructure
  2009-02-25  9:44                   ` Andrew Morton
  2009-02-25  9:56                     ` Ingo Molnar
@ 2009-02-25  9:57                     ` Pekka Enberg
  1 sibling, 0 replies; 51+ messages in thread
From: Pekka Enberg @ 2009-02-25  9:57 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Ingo Molnar, Steven Rostedt, LKML, Thomas Gleixner,
	Peter Zijlstra, Frederic Weisbecker, Theodore Tso,
	Arjan van de Ven, Pekka Paalanen, Arnaldo Carvalho de Melo,
	Jason Baron, Martin Bligh, Mathieu Desnoyers, Frank Ch. Eigler,
	KOSAKI Motohiro, Jens Axboe, Masami Hiramatsu, Steven Rostedt

Hi Andrew,

On Wed, 2009-02-25 at 01:44 -0800, Andrew Morton wrote:
> > $ grep -v "#" trace 
> >   <idle>-0     0d..1    0us+: trace_hardirqs_off_thunk (apic_timer_interrupt)
> >   <idle>-0     0d.s.   97us : __do_softirq (do_softirq)
> >   <idle>-0     0d.s1   98us : trace_hardirqs_on (do_softirq)
> > 
> > after which you have access to the raw data. This particular trace seems
> > to be somewhat hard to parse (because not all fields are whitespace
> > delimited) but I can assure you that any format I rely on is not.
> 
> yes, but now you need to think about how this interface would have been
> designed if we'd decided to access it with something smarter than
> `cat'.
> 
> I mean, look at it.  All the multi-space column lining upping, the
> unnecessary "us" annotation, the strange symbol(symbol) thing, etc. 
> Plus it would have been more self-describing.  Right now, your parser
> has to either assume that the second character of "0d..1" is
> "irqs-off", or it has to learn how to follow ascii art lines.

Multi-space columns are probably not a big problem but sure, it's better
to keep the raw data as simple as possible and put things like units in
the header.

But anyway, I'm the wrong person to talk to if you want someone to
defend that particular format. If you find similar problems with the
kmemtrace output, then sure, by all means let me know about it and we'll
fix it up.

Note: it still make sense to have a specific kind of "pretty printing"
in the kernel. For things like kmemtrace, the amount of data gets pretty
big so it's very convenient to have "summarizing formatters" like the
histogram formatter thing that's being cooked up in ftrace tree
somewhere.

			Pekka


^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH 2/4] tracing: add event trace infrastructure
  2009-02-25  9:56                     ` Ingo Molnar
@ 2009-02-25 10:02                       ` Andrew Morton
  2009-02-25 10:24                         ` Pekka Enberg
  2009-02-25 10:27                         ` Ingo Molnar
  2009-02-25 16:21                       ` Frederic Weisbecker
  1 sibling, 2 replies; 51+ messages in thread
From: Andrew Morton @ 2009-02-25 10:02 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Pekka Enberg, Steven Rostedt, LKML, Thomas Gleixner,
	Peter Zijlstra, Frederic Weisbecker, Theodore Tso,
	Arjan van de Ven, Pekka Paalanen, Arnaldo Carvalho de Melo,
	Jason Baron, Martin Bligh, Mathieu Desnoyers, Frank Ch. Eigler,
	KOSAKI Motohiro, Jens Axboe, Masami Hiramatsu, Steven Rostedt

On Wed, 25 Feb 2009 10:56:23 +0100 Ingo Molnar <mingo@elte.hu> wrote:

> Since the concept of a kernel tracing facility being 
> self-sufficient and being easy to use is an integral and key 
> concept to ftrace, dont you see why people take your suggestions 
> as a dismissal of the ftrace concept?

Nothing I've suggested in any way makes ftrace hard to use.

If you guys had gone this way, you would not have screwed it up as much
as you're suggesting.

You're being unserious, and this is going nowhere.

^ permalink raw reply	[flat|nested] 51+ messages in thread

* [PATCH] tracing: remove /debug/tracing/latency_trace
  2009-02-25  9:33                 ` Pekka Enberg
  2009-02-25  9:44                   ` Andrew Morton
@ 2009-02-25 10:07                   ` Ingo Molnar
  1 sibling, 0 replies; 51+ messages in thread
From: Ingo Molnar @ 2009-02-25 10:07 UTC (permalink / raw)
  To: Pekka Enberg
  Cc: Andrew Morton, Steven Rostedt, LKML, Thomas Gleixner,
	Peter Zijlstra, Frederic Weisbecker, Theodore Tso,
	Arjan van de Ven, Pekka Paalanen, Arnaldo Carvalho de Melo,
	Jason Baron, Martin Bligh, Mathieu Desnoyers, Frank Ch. Eigler,
	KOSAKI Motohiro, Jens Axboe, Masami Hiramatsu, Steven Rostedt


* Pekka Enberg <penberg@cs.helsinki.fi> wrote:

> Hi Andrew,
> 
> On Wed, 2009-02-25 at 01:22 -0800, Andrew Morton wrote:
> > irqsoff latency trace v1.1.5 on 2.6.26-rc8
> > --------------------------------------------------------------------
> >  latency: 97 us, #3/3, CPU#0 | (M:preempt VP:0, KP:0, SP:0 HP:0 #P:2)
> >     -----------------
> >     | task: swapper-0 (uid:0 nice:0 policy:0 rt_prio:0)
> >     -----------------
> >  => started at: apic_timer_interrupt
> >  => ended at:   do_softirq
> 
> <the header seems broken here, should have # at the beginning 
> of the line>

yep.

Furthermore note that latency_trace is a legacy format which i 
think we'll deprecate - in fact see the patch below. The primary 
format that people use is /debug/tracing/trace.

All the pretty-printing patches that Andrew has seen and 
complained about affect /debug/tracing/trace, not 
/debug/tracing/latency_trace.

	Ingo

----------------->
>From 886b5b73d71e4027d7dc6c14f5f7ab102201ea6b Mon Sep 17 00:00:00 2001
From: Ingo Molnar <mingo@elte.hu>
Date: Wed, 25 Feb 2009 11:03:44 +0100
Subject: [PATCH] tracing: remove /debug/tracing/latency_trace

Impact: remove old debug/tracing API

/debug/tracing/latency_trace is an old legacy format we kept from
the old latency tracer. Remove the file for now. If there's any
useful bit missing then we'll propagate any useful output bits into
the /debug/tracing/trace output.

Reported-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
---
 kernel/trace/trace.c |    5 -----
 1 files changed, 0 insertions(+), 5 deletions(-)

diff --git a/kernel/trace/trace.c b/kernel/trace/trace.c
index e1f3b99..11ba100 100644
--- a/kernel/trace/trace.c
+++ b/kernel/trace/trace.c
@@ -2938,11 +2938,6 @@ static __init int tracer_init_debugfs(void)
 	if (!entry)
 		pr_warning("Could not create debugfs 'tracing_cpumask' entry\n");
 
-	entry = debugfs_create_file("latency_trace", 0444, d_tracer,
-				    &global_trace, &tracing_lt_fops);
-	if (!entry)
-		pr_warning("Could not create debugfs 'latency_trace' entry\n");
-
 	entry = debugfs_create_file("trace", 0444, d_tracer,
 				    &global_trace, &tracing_fops);
 	if (!entry)

^ permalink raw reply related	[flat|nested] 51+ messages in thread

* Re: [PATCH 2/4] tracing: add event trace infrastructure
  2009-02-25 10:02                       ` Andrew Morton
@ 2009-02-25 10:24                         ` Pekka Enberg
  2009-02-25 10:27                         ` Ingo Molnar
  1 sibling, 0 replies; 51+ messages in thread
From: Pekka Enberg @ 2009-02-25 10:24 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Ingo Molnar, Steven Rostedt, LKML, Thomas Gleixner,
	Peter Zijlstra, Frederic Weisbecker, Theodore Tso,
	Arjan van de Ven, Pekka Paalanen, Arnaldo Carvalho de Melo,
	Jason Baron, Martin Bligh, Mathieu Desnoyers, Frank Ch. Eigler,
	KOSAKI Motohiro, Jens Axboe, Masami Hiramatsu, Steven Rostedt

Hi Andrew,

On Wed, 25 Feb 2009 10:56:23 +0100 Ingo Molnar <mingo@elte.hu> wrote:
>> Since the concept of a kernel tracing facility being
>> self-sufficient and being easy to use is an integral and key
>> concept to ftrace, dont you see why people take your suggestions
>> as a dismissal of the ftrace concept?

On Wed, Feb 25, 2009 at 12:02 PM, Andrew Morton
<akpm@linux-foundation.org> wrote:
> Nothing I've suggested in any way makes ftrace hard to use.
>
> If you guys had gone this way, you would not have screwed it up as much
> as you're suggesting.

I find it difficult to understand what is it exactly that you're
suggesting. AFAICT, the core of the argument is how much
post-processing should we do in the kernel. Now keeping all
post-processing out of the kernel will make (some) tracers less
user-friendly but we obviously don't want to do a full TeX in the
kernel either.

Maybe identifying the specific plug-ins you have problems with would
be useful? I only know some of the ftrace core and the kmemtrace
plug-in and I suspect the situation is similar for other plug-in
developers as well. So the problem here could be just that you're not
getting the message across to the right people.

                                      Pekka

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH 2/4] tracing: add event trace infrastructure
  2009-02-25 10:02                       ` Andrew Morton
  2009-02-25 10:24                         ` Pekka Enberg
@ 2009-02-25 10:27                         ` Ingo Molnar
  1 sibling, 0 replies; 51+ messages in thread
From: Ingo Molnar @ 2009-02-25 10:27 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Pekka Enberg, Steven Rostedt, LKML, Thomas Gleixner,
	Peter Zijlstra, Frederic Weisbecker, Theodore Tso,
	Arjan van de Ven, Pekka Paalanen, Arnaldo Carvalho de Melo,
	Jason Baron, Martin Bligh, Mathieu Desnoyers, Frank Ch. Eigler,
	KOSAKI Motohiro, Jens Axboe, Masami Hiramatsu, Steven Rostedt


* Andrew Morton <akpm@linux-foundation.org> wrote:

> On Wed, 25 Feb 2009 10:56:23 +0100 Ingo Molnar <mingo@elte.hu> wrote:
> 
> > Since the concept of a kernel tracing facility being 
> > self-sufficient and being easy to use is an integral and key 
> > concept to ftrace, dont you see why people take your 
> > suggestions as a dismissal of the ftrace concept?
> 
> Nothing I've suggested in any way makes ftrace hard to use.

Note that everything you suggest is _already possible_, if you 
switch on the 'make simple output' option. It's just that the 
human-readable format is the default one.

So i think your suggestion does make ftrace harder to use, in a 
number of ways:

 - There's one more extra tool to be installed on the target
   machine. That target machine might not even have any build
   environment - but tracing on it can still be very useful to 
   pin down bugs and regressions.

   Self-sufficient kernel instrumentation is a key concept to
   usability.

 - That tool will break the current intuitive shell-commands and 
   scripting flow that is based on human readable text output.

   In ftrace we try to strike a good balance between easy
   scriptability and pretty-printing. The two go hand in hand 
   usually so it's not a big task. If you look at the current 
   /debug/tracing/trace output you'll find a lot of small 
   details that we've put in there to make it easy to script - 
   while still being nice to read. [suggestions to improve it 
   are welcome.]

   Your suggestion pushes that completely to the 'minimal 
   output' direction, with no tangible benefit to scripting, and 
   with a lot of damage to readability and usability. A separate
   pretty-printing tool would just add an extra unnecessary step
   and would make the workflow more clumsy. This too is a clear
   step backwards.

 - Plus the extra effort of defining APIs and ABIs to it and the
   extra effort to make it work on all kernel versions. It all
   takes away resources from making tracing more useful in
   practice so it indirectly hurts instrumentation usability. 

   Tracing development manpower is a zero-sum game. If you force 
   people to maintain silly APIs you take away time from other, 
   more important areas. It might even be a negative-sum game: 
   developing something that looks ugly and is hard to use 
   attracts less developers. Tracing under Linux was in such a
   zombie state for a decade and ftrace clearly changed that 
   picture. I dont subscribe to the view that tracing has to be
   stupid and boring.

I.e. i dont see many upsides of your suggestion, and i see a lot 
of downsides.

IMO pretty-printing in the kernel should not be made a religious 
argument but a technical argument. Pretty-printing is a tool, 
and as a tool it sometimes can be used for silly purposes, and 
sometimes it can be used for helpful purposes. You seem to argue 
that doing it in the kernel is always silly - and i strongly 
disagree with that and consider it an extreme-end position - for 
the reasons outlined above.

	Ingo

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH 2/4] tracing: add event trace infrastructure
  2009-02-25  9:26                 ` Peter Zijlstra
@ 2009-02-25 10:31                   ` Ingo Molnar
  0 siblings, 0 replies; 51+ messages in thread
From: Ingo Molnar @ 2009-02-25 10:31 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Andrew Morton, Pekka Enberg, Steven Rostedt, LKML,
	Thomas Gleixner, Frederic Weisbecker, Theodore Tso,
	Arjan van de Ven, Pekka Paalanen, Arnaldo Carvalho de Melo,
	Jason Baron, Martin Bligh, Mathieu Desnoyers, Frank Ch. Eigler,
	KOSAKI Motohiro, Jens Axboe, Masami Hiramatsu, Steven Rostedt


* Peter Zijlstra <peterz@infradead.org> wrote:

> On Wed, 2009-02-25 at 01:22 -0800, Andrew Morton wrote:
> 
> > Parse this:
> > 
> > irqsoff latency trace v1.1.5 on 2.6.26-rc8
> > --------------------------------------------------------------------
> >  latency: 97 us, #3/3, CPU#0 | (M:preempt VP:0, KP:0, SP:0 HP:0 #P:2)
> >     -----------------
> >     | task: swapper-0 (uid:0 nice:0 policy:0 rt_prio:0)
> >     -----------------
> >  => started at: apic_timer_interrupt
> >  => ended at:   do_softirq
> > 
> > #                _------=> CPU#
> > #               / _-----=> irqs-off
> > #              | / _----=> need-resched
> > #              || / _---=> hardirq/softirq
> > #              ||| / _--=> preempt-depth
> > #              |||| /
> > #              |||||     delay
> > #  cmd     pid ||||| time  |   caller
> > #     \   /    |||||   \   |   /
> >   <idle>-0     0d..1    0us+: trace_hardirqs_off_thunk (apic_timer_interrupt)
> >   <idle>-0     0d.s.   97us : __do_softirq (do_softirq)
> >   <idle>-0     0d.s1   98us : trace_hardirqs_on (do_softirq)
> > 
> > your time starts now.
> 
> awk '/^#/ { start=1; next } { if (!start) next; .... }'

yeah.

For /debug/tracing/trace the convention is that a simple:

  grep -v '^#'

pattern will filter out the header as "comments".

	Ingo

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH 1/4] tracing: add DEFINE_TRACE_FMT to tracepoint.h
  2009-02-25  6:27   ` Peter Zijlstra
@ 2009-02-25 13:01     ` Steven Rostedt
  2009-02-25 16:09       ` Mathieu Desnoyers
  0 siblings, 1 reply; 51+ messages in thread
From: Steven Rostedt @ 2009-02-25 13:01 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: linux-kernel, Ingo Molnar, Andrew Morton, Thomas Gleixner,
	Frederic Weisbecker, Theodore Tso, Arjan van de Ven,
	Pekka Paalanen, Arnaldo Carvalho de Melo, Jason Baron,
	Martin Bligh, Mathieu Desnoyers, Frank Ch. Eigler,
	KOSAKI Motohiro, Jens Axboe, Masami Hiramatsu, Steven Rostedt


On Wed, 25 Feb 2009, Peter Zijlstra wrote:

> On Tue, 2009-02-24 at 21:56 -0500, Steven Rostedt wrote:
> 
> > +#define DEFINE_TRACE_FMT(name, proto, args, fmt)		\
> > +	DECLARE_TRACE(name, TPPROTO(proto), TPARGS(args))
> > +
> 
> If TPPROTO and TPARGS were anything but empty stubs, the above would add
> them twice, no?

I guess I can rename them too.

The problem comes from passing in more than one argument. If a proto or 
args has a comma, you get a nasty error about unmatched number of 
parameters.

I believe that they were purposely made to be empty stubs to allow for 
multiple parameters. But you are right, if that changes we need to fix it.

Since the above is only to protect against the substitution into multiple 
parameters, we could probably create a new macro name to just be

 #define PARAMS(args...) args
 #define DEFINE_TRACE_FMT(name, proto, args, fmt) \
	DECLARE_TRACE(name, PARAMS(proto), PARAMS(args))

-- Steve


^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH 4/4] tracing: make event directory structure
  2009-02-25  6:59   ` Frederic Weisbecker
@ 2009-02-25 13:07     ` Steven Rostedt
  0 siblings, 0 replies; 51+ messages in thread
From: Steven Rostedt @ 2009-02-25 13:07 UTC (permalink / raw)
  To: Frederic Weisbecker
  Cc: linux-kernel, Ingo Molnar, Andrew Morton, Thomas Gleixner,
	Peter Zijlstra, Theodore Tso, Arjan van de Ven, Pekka Paalanen,
	Arnaldo Carvalho de Melo, Jason Baron, Martin Bligh,
	Mathieu Desnoyers, Frank Ch. Eigler, KOSAKI Motohiro, Jens Axboe,
	Masami Hiramatsu, Steven Rostedt



On Wed, 25 Feb 2009, Frederic Weisbecker wrote:

> On Tue, Feb 24, 2009 at 09:56:12PM -0500, Steven Rostedt wrote:
> > From: Steven Rostedt <srostedt@redhat.com>
> > 
> > This patch adds the directory /debug/tracing/events/ that will contain
> > all the registered trace points.
> > 
> >  # ls /debug/tracing/events/
> > sched_kthread_stop      sched_process_fork  sched_switch
> > sched_kthread_stop_ret  sched_process_free  sched_wait_task
> > sched_migrate_task      sched_process_wait  sched_wakeup
> > sched_process_exit      sched_signal_send   sched_wakeup_new
> > 
> >  # ls /debug/tracing/events/sched_switch/
> > enable
> > 
> >  # cat /debug/tracing/events/sched_switch/enable
> > 1
> > 
> >  # cat /debug/tracing/set_event
> > sched_switch
> 
> 
> Do you plan to add other things inside these events directory?

That's actually the plan. We might want to trigger a stack trace or 
something on that event. The idea will be to add attributes to each 
individual event, and it should be pretty easy to implement since each 
event has its own structure.

> I mean, if you plan to only put an enable file on each of 
them, > it would be better to have just one file:
> 
> cat /debug/tracing/events/sched_switch
> 1
> 
>  
> > Signed-off-by: Steven Rostedt <srostedt@redhat.com>
> > ---
> >  kernel/trace/trace_events.c |  137 +++++++++++++++++++++++++++++++++++++++++--
> >  kernel/trace/trace_events.h |    7 ++-
> >  2 files changed, 137 insertions(+), 7 deletions(-)
> > 
> > diff --git a/kernel/trace/trace_events.c b/kernel/trace/trace_events.c
> > index 05bc80e..3bcb9df 100644
> > --- a/kernel/trace/trace_events.c
> > +++ b/kernel/trace/trace_events.c
> > @@ -12,6 +12,11 @@
> >  
> >  #include "trace_events.h"
> >  
> > +#define events_for_each(event)						\
> > +	for (event = __start_ftrace_events;				\
> > +	     (unsigned long)event < (unsigned long)__stop_ftrace_events; \
> > +	     event++)
> > +
> 
> 
> The name for_each_events seems to me more intuitive, but it's just
> a matter of taste.

I like that better too. But I was trying to be more consistent with the 
kernel. "list_for_each", "hlist_for_each".

But come to think about it, we do have "for_each_cpu_mask". Hmm?

But it is local to the file. Not something to worry about now ;-)

-- Steve


^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH 2/4] tracing: add event trace infrastructure
  2009-02-25  3:45   ` Andrew Morton
  2009-02-25  4:08     ` Steven Rostedt
@ 2009-02-25 13:37     ` Theodore Tso
  2009-02-25 14:10       ` Steven Rostedt
  1 sibling, 1 reply; 51+ messages in thread
From: Theodore Tso @ 2009-02-25 13:37 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Steven Rostedt, linux-kernel, Ingo Molnar, Thomas Gleixner,
	Peter Zijlstra, Frederic Weisbecker, Arjan van de Ven,
	Pekka Paalanen, Arnaldo Carvalho de Melo, Jason Baron,
	Martin Bligh, Mathieu Desnoyers, Frank Ch. Eigler,
	KOSAKI Motohiro, Jens Axboe, Masami Hiramatsu, Steven Rostedt

On Tue, Feb 24, 2009 at 07:45:48PM -0800, Andrew Morton wrote:
> echo sched_wakeup > /debug/tracing/set_event
> echo sched_wakeup > /debug/tracing/clear_event
> echo 1 > /debug/tracing/clear_all_events

>From a usability point of view, I'd prefer this as well.  It's only
two extra files, but it means that filenames in /deb/tracing is
self-documenting, which is a good thing.  Otherwise, it's really not
obvious that "echo !event > /debug/tracing/set_event" will disable the
event, and it's *especially* not obvious "echo > /debug/tracing/set_event"
will disable all events.

It's also a tad bit too easy to typo "echo > /debug/tracing/set_event"
by editing command history, and hitting return a bit too soon after
clearing out the previous event name but before you manage to type the
new event name.

						- Ted

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH 2/4] tracing: add event trace infrastructure
  2009-02-25  9:07   ` Lai Jiangshan
@ 2009-02-25 13:50     ` Steven Rostedt
  0 siblings, 0 replies; 51+ messages in thread
From: Steven Rostedt @ 2009-02-25 13:50 UTC (permalink / raw)
  To: Lai Jiangshan
  Cc: linux-kernel, Ingo Molnar, Andrew Morton, Thomas Gleixner,
	Peter Zijlstra, Frederic Weisbecker, Theodore Tso,
	Arjan van de Ven, Pekka Paalanen, Arnaldo Carvalho de Melo,
	Jason Baron, Martin Bligh, Mathieu Desnoyers, Frank Ch. Eigler,
	KOSAKI Motohiro, Jens Axboe, Masami Hiramatsu, Steven Rostedt


On Wed, 25 Feb 2009, Lai Jiangshan wrote:

> a very nice infrastructure!
> 
> Two comments about format:
> 
> Steven Rostedt ??:
> > From: Steven Rostedt <srostedt@redhat.com>
> > +#undef TPFMT
> > +#define TPFMT(fmt, args...)	fmt "\n", ##args
> 
> Where is other TPFMT's definition?
> I think TPFMT should be defined in tracepoint.h, because trace/<type>.h
> uses it.

I was thinking that. But in may just end up being a nop.

#define TPFMT(fmt, args)	/* define me */


> 
> > +#define DEFINE_TRACE_FMT(call, proto, args, fmt)			\
> > +static void ftrace_event_##call(proto)					\
> > +{									\
> > +	event_trace_printk(_RET_IP_, "(" #call ") " fmt);		\
> > +}									\
> > +									\
> > +static int ftrace_reg_event_##call(void)				\
> > +{									\
> > +	int ret;							\
> > +									\
> > +	ret = register_trace_##call(ftrace_event_##call);		\
> > +	if (!ret)							\
> > +		pr_info("event trace: Could not activate trace point "	\
> > +			"probe to " #call);				\
> > +	return ret;							\
> > +}									\
> > +									\
> > +static void ftrace_unreg_event_##call(void)				\
> > +{									\
> > +	unregister_trace_##call(ftrace_event_##call);			\
> > +}									\
> > +									\
> > +static struct ftrace_event_call __used					\
> > +__attribute__((section("_ftrace_events"))) event_##call = {		\
> > +	.name 			= #call,				\
> > +	.regfunc		= ftrace_reg_event_##call,		\
> > +	.unregfunc		= ftrace_unreg_event_##call,		\
> > +}
> > +
> > +void event_trace_printk(unsigned long ip, const char *fmt, ...);
> 
> add format checking here:
> __attribute__ ((format (printf, 2, 3)))
> 
> If user does not use this infrastructure(he uses trace/<type>.h only),
> this checking is not work, so I think we can add format checking
> in DEFINE_TRACE_FMT in tracepoint.h

Darn! You are correct. That was what I forgot to fix since the RFC
announcement. I had that on my TODO list and simply forgot.

Thanks!

> 
> --------example:
> 
> static inline void  __tp_check_format(const char *fmt, ...)
> 		__attribute__ ((format (printf, 1, 2)));
> static inline void __tp_check_format(const char *fmt, ...) {}
> 
> #define DECLARE_TRACE_FORMAT_CHECK(name, proto, fmt, fmt_args...)	\
> static inline void ftrace_event_check_format_##name(proto)		\
> {									\
> 	__tp_check_format(fmt, ##fmt_args);				\

Cool, thanks, I'll add that.

-- Steve

> }
> 
> #define DEFINE_TRACE_FMT(name, proto, args, fmt)			\
> 	DECLARE_TRACE_FORMAT_CHECK(name, TPPROTO(proto), fmt)		\
> 	DECLARE_TRACE(name, TPPROTO(proto), TPARGS(args))
> 
> ---------
> 
> Thanks, Lai
> 
> 
> 

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH 2/4] tracing: add event trace infrastructure
  2009-02-25  8:28           ` Andrew Morton
  2009-02-25  8:40             ` Ingo Molnar
  2009-02-25  9:00             ` Pekka Enberg
@ 2009-02-25 13:54             ` Theodore Tso
  2009-02-26 21:08               ` Frank Ch. Eigler
  2009-03-01 10:37             ` KOSAKI Motohiro
  3 siblings, 1 reply; 51+ messages in thread
From: Theodore Tso @ 2009-02-25 13:54 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Ingo Molnar, Steven Rostedt, LKML, Thomas Gleixner,
	Peter Zijlstra, Frederic Weisbecker, Arjan van de Ven,
	Pekka Paalanen, Arnaldo Carvalho de Melo, Jason Baron,
	Martin Bligh, Mathieu Desnoyers, Frank Ch. Eigler,
	KOSAKI Motohiro, Jens Axboe, Masami Hiramatsu, Steven Rostedt

On Wed, Feb 25, 2009 at 12:28:52AM -0800, Andrew Morton wrote:
> A better approach would be to design simple, robust kernel interfaces
> which make sense and which aren't made all complex by putting the user
> interface in kernel space.  And to maintain corresponding userspace
> tools which manipulate and present the IO from those kernel interfaces.
> 
> But we don't do that, because userspace is hard, because we don't have
> a delivery process.  But nobody has even tried!  We can do this - it
> starts with `mkdir -p userspace/ktrace'.

I don't think it's just because we don't have a delivery process.  I
think it's because we've just been burned so many times with complex
user space tools that are built in C++, have impossible-to-understand
error handling ("try again with -vvvvvvv"), is hard to build because
they have huge dependencies on external libraries, or is filled with
distro dependencies and might not build on various distributions, or
is dependent on the RPM macros, etc.

Sure, if we do it ourselves, maybe we won't fall into the same set of
traps.  But after so many years of failure in creating a simple,
usable, tracing infrasructure, why not have something which is
self-contained in the kernel?  We can have a binary interface for the
people who make the uber-complex userspace tools, but 99% of the time
we don't need the complexity.

Also, I suspect the delivery problem is a lot more complicated than
you make it out to be.  It's not enough just to solve the kbuild
issues; there are the distribution packaging issues as well, and to
the extent that we have kernel-version-dependent userspace tools, how
do we standardize a system so that when the 2.6.29-rc6-gc457db2 kernel
is botted, we are using the tools associated with that kernel, and
when we have the 2.6.28.2 kernel booted, we use those tools instead?

If someone wants to work on putting together that infrastructure, I'm
sure it could be useful for more than just tracing.  I'll note though
that not all distributions have a reasonable system for dealing with
kernel-version dependent firmware packages yet, either.  (And whatever
happened to packaging firmware as a separate package from the kernel;
I haven't seen much progress on that front either.)

Sometimes, it's just much simpler and easier if we keep things all
bundled together in the kernel image.

						- Ted

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH 2/4] tracing: add event trace infrastructure
  2009-02-25  9:21   ` Lai Jiangshan
@ 2009-02-25 13:54     ` Steven Rostedt
  0 siblings, 0 replies; 51+ messages in thread
From: Steven Rostedt @ 2009-02-25 13:54 UTC (permalink / raw)
  To: Lai Jiangshan
  Cc: linux-kernel, Ingo Molnar, Andrew Morton, Thomas Gleixner,
	Peter Zijlstra, Frederic Weisbecker, Theodore Tso,
	Arjan van de Ven, Pekka Paalanen, Arnaldo Carvalho de Melo,
	Jason Baron, Martin Bligh, Mathieu Desnoyers, Frank Ch. Eigler,
	KOSAKI Motohiro, Jens Axboe, Masami Hiramatsu, Steven Rostedt


On Wed, 25 Feb 2009, Lai Jiangshan wrote:

> Steven Rostedt ??:
> > 
> > Signed-off-by: Steven Rostedt <srostedt@redhat.com>
> > ---
> >  include/asm-generic/vmlinux.lds.h |   11 ++-
> >  kernel/trace/Kconfig              |    9 ++
> >  kernel/trace/Makefile             |    1 +
> >  kernel/trace/trace_events.c       |  280 +++++++++++++++++++++++++++++++++++++
> >  kernel/trace/trace_events.h       |   52 +++++++
> >  5 files changed, 352 insertions(+), 1 deletions(-)
> >  create mode 100644 kernel/trace/trace_events.c
> >  create mode 100644 kernel/trace/trace_events.h
> > 
> 
> Modules can not use this infrastructure,
> will it be implemented in future?

"Give that man a cigar!"  ;-)  (old saying about someone that answers the 
                                game show question correctly).

I was waiting for this question to come up.  The answer is, "Yes", I have 
plans to add implementation to make it work with modules.

-- Steve


^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH 2/4] tracing: add event trace infrastructure
  2009-02-25 13:37     ` Theodore Tso
@ 2009-02-25 14:10       ` Steven Rostedt
  0 siblings, 0 replies; 51+ messages in thread
From: Steven Rostedt @ 2009-02-25 14:10 UTC (permalink / raw)
  To: Theodore Tso
  Cc: Andrew Morton, linux-kernel, Ingo Molnar, Thomas Gleixner,
	Peter Zijlstra, Frederic Weisbecker, Arjan van de Ven,
	Pekka Paalanen, Arnaldo Carvalho de Melo, Jason Baron,
	Martin Bligh, Mathieu Desnoyers, Frank Ch. Eigler,
	KOSAKI Motohiro, Jens Axboe, Masami Hiramatsu, Steven Rostedt


On Wed, 25 Feb 2009, Theodore Tso wrote:

> On Tue, Feb 24, 2009 at 07:45:48PM -0800, Andrew Morton wrote:
> > echo sched_wakeup > /debug/tracing/set_event
> > echo sched_wakeup > /debug/tracing/clear_event
> > echo 1 > /debug/tracing/clear_all_events
> 
> >From a usability point of view, I'd prefer this as well.  It's only
> two extra files, but it means that filenames in /deb/tracing is
> self-documenting, which is a good thing.  Otherwise, it's really not
> obvious that "echo !event > /debug/tracing/set_event" will disable the
> event, and it's *especially* not obvious "echo > /debug/tracing/set_event"
> will disable all events.
> 
> It's also a tad bit too easy to typo "echo > /debug/tracing/set_event"
> by editing command history, and hitting return a bit too soon after
> clearing out the previous event name but before you manage to type the
> new event name.

OK, then I guess I can add a clear_event file. And to clear all events 
simply do:

 # cat set_event > clear_event

-- Steve


^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH 2/4] tracing: add event trace infrastructure
  2009-02-25  9:22               ` Andrew Morton
  2009-02-25  9:26                 ` Peter Zijlstra
  2009-02-25  9:33                 ` Pekka Enberg
@ 2009-02-25 14:41                 ` Steven Rostedt
  2009-02-25 15:57                   ` Ingo Molnar
  2 siblings, 1 reply; 51+ messages in thread
From: Steven Rostedt @ 2009-02-25 14:41 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Pekka Enberg, Ingo Molnar, LKML, Thomas Gleixner, Peter Zijlstra,
	Frederic Weisbecker, Theodore Tso, Arjan van de Ven,
	Pekka Paalanen, Arnaldo Carvalho de Melo, Jason Baron,
	Martin Bligh, Mathieu Desnoyers, Frank Ch. Eigler,
	KOSAKI Motohiro, Jens Axboe, Masami Hiramatsu, Steven Rostedt


On Wed, 25 Feb 2009, Andrew Morton wrote:

> On Wed, 25 Feb 2009 11:00:53 +0200 Pekka Enberg <penberg@cs.helsinki.fi> wrote:
> Parse this:
> 
> irqsoff latency trace v1.1.5 on 2.6.26-rc8
> --------------------------------------------------------------------
>  latency: 97 us, #3/3, CPU#0 | (M:preempt VP:0, KP:0, SP:0 HP:0 #P:2)
>     -----------------
>     | task: swapper-0 (uid:0 nice:0 policy:0 rt_prio:0)
>     -----------------
>  => started at: apic_timer_interrupt
>  => ended at:   do_softirq
> 
> #                _------=> CPU#
> #               / _-----=> irqs-off
> #              | / _----=> need-resched
> #              || / _---=> hardirq/softirq
> #              ||| / _--=> preempt-depth
> #              |||| /
> #              |||||     delay
> #  cmd     pid ||||| time  |   caller
> #     \   /    |||||   \   |   /
>   <idle>-0     0d..1    0us+: trace_hardirqs_off_thunk (apic_timer_interrupt)
>   <idle>-0     0d.s.   97us : __do_softirq (do_softirq)
>   <idle>-0     0d.s1   98us : trace_hardirqs_on (do_softirq)
> 
> your time starts now.

Note, that is a legacy format, that works great for the latency tracers. 
Those include (the example you used) irqsoff, the preempt off, wake up 
latencies. This gives us a nice listing of why we are hitting a latency.
All other plugins generally use the trace file, although they could also 
use that file too.

As for your English comment. I showed the header to my wife, and she had 
no idea what irqs-off means, nor need-resched, nor preempt-depth ;-)   
They are not quite English as they are technical terms. Most of the names 
are hard coded into the kernel too.

Yeah, maybe that file (latency_trace) is a bit too much. I for one love 
it. It is great to load a kernel on some remote box, and run the irqs off 
latency tracer to see where the interrupts are disabled for the longest 
time. This format is really nice because it shows me when we are in an 
interrupt, or interrupts are disabled, and when the task should have been 
rescheduled.

This has help find places that we miss a preemption check too.

If I had to also include a user space program, I would need to tell the 
customer how to install it, and run it. Just telling someone to mount a 
directory, run your test, cat a file to some temp file and send me the 
temp file is much easier. Because most admins know how to do such things. 
No learning curve, except what to mount and what to cat.

I've given a few talks on ftrace. And when I'm approached afterward by 
people (usually the ones not paying attention to the speech ;-) asking 
what utils they need to install. You should see the smile on their faces 
when I tell them, "All you need is what is in busybox".  Then they say, 
"No, really? What do I need to use ftrace?" I again tell them, a kernel 
with ftrace and a minimum requirement of what is in busybox: cat, grep, 
echo. That is what makes ftrace powerful and robust.

Once I break that "busybox" rule. ftrace will be used by only a quarter of
those that use it now (and probably less).

-- Steve





^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH 2/4] tracing: add event trace infrastructure
  2009-02-25 14:41                 ` [PATCH 2/4] tracing: add event trace infrastructure Steven Rostedt
@ 2009-02-25 15:57                   ` Ingo Molnar
  2009-02-25 16:09                     ` Steven Rostedt
  2009-02-25 22:48                     ` Steven Rostedt
  0 siblings, 2 replies; 51+ messages in thread
From: Ingo Molnar @ 2009-02-25 15:57 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: Andrew Morton, Pekka Enberg, LKML, Thomas Gleixner,
	Peter Zijlstra, Frederic Weisbecker, Theodore Tso,
	Arjan van de Ven, Pekka Paalanen, Arnaldo Carvalho de Melo,
	Jason Baron, Martin Bligh, Mathieu Desnoyers, Frank Ch. Eigler,
	KOSAKI Motohiro, Jens Axboe, Masami Hiramatsu, Steven Rostedt


* Steven Rostedt <rostedt@goodmis.org> wrote:

> > #                _------=> CPU#
> > #               / _-----=> irqs-off
> > #              | / _----=> need-resched
> > #              || / _---=> hardirq/softirq
> > #              ||| / _--=> preempt-depth
> > #              |||| /
> > #              |||||     delay
> > #  cmd     pid ||||| time  |   caller
> > #     \   /    |||||   \   |   /
> >   <idle>-0     0d..1    0us+: trace_hardirqs_off_thunk (apic_timer_interrupt)
> >   <idle>-0     0d.s.   97us : __do_softirq (do_softirq)
> >   <idle>-0     0d.s1   98us : trace_hardirqs_on (do_softirq)
[...]
> > 
> > your time starts now.
> 
> Note, that is a legacy format, that works great for the 
> latency tracers. Those include (the example you used) irqsoff, 
> the preempt off, wake up latencies. This gives us a nice 
> listing of why we are hitting a latency. All other plugins 
> generally use the trace file, although they could also use 
> that file too.
> 
> As for your English comment. I showed the header to my wife, 
> and she had no idea what irqs-off means, nor need-resched, nor 
> preempt-depth ;-) They are not quite English as they are 
> technical terms. Most of the names are hard coded into the 
> kernel too.
> 
> Yeah, maybe that file (latency_trace) is a bit too much. I for 
> one love it. It is great to load a kernel on some remote box, 
> and run the irqs off latency tracer to see where the 
> interrupts are disabled for the longest time. This format is 
> really nice because it shows me when we are in an interrupt, 
> or interrupts are disabled, and when the task should have been 
> rescheduled.
> 
> This has help find places that we miss a preemption check too.

Could we get that, as PeterZ has suggested, as a trace_option 
column in the 'trace' file? It would be default off for 
non-latency tracers, with latency tracing plugins turning it on 
by default. Would that work?

	Ingo

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH 1/4] tracing: add DEFINE_TRACE_FMT to tracepoint.h
  2009-02-25 13:01     ` Steven Rostedt
@ 2009-02-25 16:09       ` Mathieu Desnoyers
  0 siblings, 0 replies; 51+ messages in thread
From: Mathieu Desnoyers @ 2009-02-25 16:09 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: Peter Zijlstra, linux-kernel, Ingo Molnar, Andrew Morton,
	Thomas Gleixner, Frederic Weisbecker, Theodore Tso,
	Arjan van de Ven, Pekka Paalanen, Arnaldo Carvalho de Melo,
	Jason Baron, Martin Bligh, Frank Ch. Eigler, KOSAKI Motohiro,
	Jens Axboe, Masami Hiramatsu, Steven Rostedt

* Steven Rostedt (rostedt@goodmis.org) wrote:
> 
> On Wed, 25 Feb 2009, Peter Zijlstra wrote:
> 
> > On Tue, 2009-02-24 at 21:56 -0500, Steven Rostedt wrote:
> > 
> > > +#define DEFINE_TRACE_FMT(name, proto, args, fmt)		\
> > > +	DECLARE_TRACE(name, TPPROTO(proto), TPARGS(args))
> > > +
> > 
> > If TPPROTO and TPARGS were anything but empty stubs, the above would add
> > them twice, no?
> 
> I guess I can rename them too.
> 
> The problem comes from passing in more than one argument. If a proto or 
> args has a comma, you get a nasty error about unmatched number of 
> parameters.
> 
> I believe that they were purposely made to be empty stubs to allow for 
> multiple parameters. But you are right, if that changes we need to fix it.
> 
> Since the above is only to protect against the substitution into multiple 
> parameters, we could probably create a new macro name to just be
> 
>  #define PARAMS(args...) args
>  #define DEFINE_TRACE_FMT(name, proto, args, fmt) \
> 	DECLARE_TRACE(name, PARAMS(proto), PARAMS(args))
> 

Yes, I think this should work. Just be careful to test passing a "void"
function parameter to these when you make any changes. This is a
subtle corner-case.

Mathieu

> -- Steve
> 

-- 
Mathieu Desnoyers
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F  BA06 3F25 A8FE 3BAE 9A68

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH 2/4] tracing: add event trace infrastructure
  2009-02-25 15:57                   ` Ingo Molnar
@ 2009-02-25 16:09                     ` Steven Rostedt
  2009-02-25 22:48                     ` Steven Rostedt
  1 sibling, 0 replies; 51+ messages in thread
From: Steven Rostedt @ 2009-02-25 16:09 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Andrew Morton, Pekka Enberg, LKML, Thomas Gleixner,
	Peter Zijlstra, Frederic Weisbecker, Theodore Tso,
	Arjan van de Ven, Pekka Paalanen, Arnaldo Carvalho de Melo,
	Jason Baron, Martin Bligh, Mathieu Desnoyers, Frank Ch. Eigler,
	KOSAKI Motohiro, Jens Axboe, Masami Hiramatsu, Steven Rostedt


On Wed, 25 Feb 2009, Ingo Molnar wrote:
> > Yeah, maybe that file (latency_trace) is a bit too much. I for 
> > one love it. It is great to load a kernel on some remote box, 
> > and run the irqs off latency tracer to see where the 
> > interrupts are disabled for the longest time. This format is 
> > really nice because it shows me when we are in an interrupt, 
> > or interrupts are disabled, and when the task should have been 
> > rescheduled.
> > 
> > This has help find places that we miss a preemption check too.
> 
> Could we get that, as PeterZ has suggested, as a trace_option 
> column in the 'trace' file? It would be default off for 
> non-latency tracers, with latency tracing plugins turning it on 
> by default. Would that work?

Sure, I'd be fine with it as an option. I just don't want to 
completely lose the ability to retrieve that information.

-- Steve


^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH 1/4] tracing: add DEFINE_TRACE_FMT to tracepoint.h
  2009-02-25  2:56 ` [PATCH 1/4] tracing: add DEFINE_TRACE_FMT to tracepoint.h Steven Rostedt
  2009-02-25  6:27   ` Peter Zijlstra
@ 2009-02-25 16:13   ` Mathieu Desnoyers
  2009-02-25 16:28     ` Steven Rostedt
  1 sibling, 1 reply; 51+ messages in thread
From: Mathieu Desnoyers @ 2009-02-25 16:13 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: linux-kernel, Ingo Molnar, Andrew Morton, Thomas Gleixner,
	Peter Zijlstra, Frederic Weisbecker, Theodore Tso,
	Arjan van de Ven, Pekka Paalanen, Arnaldo Carvalho de Melo,
	Jason Baron, Martin Bligh, Frank Ch. Eigler, KOSAKI Motohiro,
	Jens Axboe, Masami Hiramatsu, Steven Rostedt

* Steven Rostedt (rostedt@goodmis.org) wrote:
> From: Steven Rostedt <srostedt@redhat.com>
> 
> This patch creates a DEFINE_TRACE_FMT to map to DECLARE_TRACE.
> This allows for the developers to place format strings and
> args in with their tracepoint declaration. A tracer may now
> override the DEFINE_TRACE_FMT macro and use it to record
> a default format.
> 

Hi Steven,

How comes does a DEFINE_* maps to a DECLARE_* ?

Usually, DEFINE_* are meant to be put in .c files. DECLARE_* are in
headers. I don't see how the mapping you propose here can be
semantically correct ?

Mathieu

> Signed-off-by: Steven Rostedt <srostedt@redhat.com>
> ---
>  include/linux/tracepoint.h |    3 +++
>  1 files changed, 3 insertions(+), 0 deletions(-)
> 
> diff --git a/include/linux/tracepoint.h b/include/linux/tracepoint.h
> index 7570054..34ae464 100644
> --- a/include/linux/tracepoint.h
> +++ b/include/linux/tracepoint.h
> @@ -153,4 +153,7 @@ static inline void tracepoint_synchronize_unregister(void)
>  	synchronize_sched();
>  }
>  
> +#define DEFINE_TRACE_FMT(name, proto, args, fmt)		\
> +	DECLARE_TRACE(name, TPPROTO(proto), TPARGS(args))
> +
>  #endif
> -- 
> 1.5.6.5
> 
> -- 
> 

-- 
Mathieu Desnoyers
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F  BA06 3F25 A8FE 3BAE 9A68

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH 2/4] tracing: add event trace infrastructure
  2009-02-25  9:56                     ` Ingo Molnar
  2009-02-25 10:02                       ` Andrew Morton
@ 2009-02-25 16:21                       ` Frederic Weisbecker
  1 sibling, 0 replies; 51+ messages in thread
From: Frederic Weisbecker @ 2009-02-25 16:21 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Andrew Morton, Pekka Enberg, Steven Rostedt, LKML,
	Thomas Gleixner, Peter Zijlstra, Theodore Tso, Arjan van de Ven,
	Pekka Paalanen, Arnaldo Carvalho de Melo, Jason Baron,
	Martin Bligh, Mathieu Desnoyers, Frank Ch. Eigler,
	KOSAKI Motohiro, Jens Axboe, Masami Hiramatsu, Steven Rostedt

On Wed, Feb 25, 2009 at 10:56:23AM +0100, Ingo Molnar wrote:
> 
> * Andrew Morton <akpm@linux-foundation.org> wrote:
> 
> > Plus...  it's all English-only.
> 
> Note that with bin+raw output you can already internationalize 
> tracepoints, if you want to.
> 
> I havent seen much interest in that, and the default tracing 
> output is in English indeed, and the reason is rather 
> fundamental: currently we've got 60,000+ kernel function 
> symbols, 99% of which are in English.


Neither do I use the raw/bin formats from ftrace personally, but
they are used by automated tools such as userspace side sysprof,
blktrace, ...

Frederic.

 
> Do you argue for them to be converted to some i8n format so that 
> the trace output becomes readable in other languages as well? 
> 
> I.e. do you suggest that this:
> 
> >  3)               |  handle_mm_fault() {
> >  3)               |    count_vm_event() {
> >  3)   0.243 us    |      test_ti_thread_flag();
> >  3)   0.754 us    |    }
> >  3)   0.249 us    |    pud_alloc();
> >  3)   0.251 us    |    pmd_alloc();
> >  3)               |    __do_fault() {
> >  3)               |      filemap_fault() {
> >  3)               |        find_lock_page() {
> >  3)               |          find_get_page() {
> >  3)   0.248 us    |            test_ti_thread_flag();
> >  3)   0.844 us    |          }
> >  3)   1.341 us    |        }
> >  3)   1.837 us    |      }
> >  3)   0.275 us    |	   _spin_lock();
> >  3)   0.257 us    |	   page_add_file_rmap();
> >  3)   0.233 us    |      native_set_pte_at();
> 
> and /proc/kallsysms to be internationalized? Should all oopses 
> and warnings that show up in the kernel log be translated as 
> well?
> 
> I dont think it's realistic - and arguing for anything less and 
> singling out tracing would be a double standard.
> 
> Currently being able to understand and hack the kernel means 
> being able to read some English - and the same holds for trace 
> output as well.
> 
> The default output of traces is just a mirror image of what is 
> the kernel status quo. If the kernel gets internationalized so 
> will ftrace be internationalized too.
> 
> > > So if you're arguing against specific ftrace plugins, go 
> > > ahead (you probably have a fair point there). But please 
> > > don't dismiss the while _concept_ of ftrace because of them.
> > 
> > Where on earth did that come from?
> > 
> > What I'm arguing against is putting English-only 
> > pretty-printers and pretty-parsers on wrong side of int 80.  
> > That's all.
> 
> Since the concept of a kernel tracing facility being 
> self-sufficient and being easy to use is an integral and key 
> concept to ftrace, dont you see why people take your suggestions 
> as a dismissal of the ftrace concept?
> 
> 	Ingo


^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH 1/4] tracing: add DEFINE_TRACE_FMT to tracepoint.h
  2009-02-25 16:13   ` Mathieu Desnoyers
@ 2009-02-25 16:28     ` Steven Rostedt
  2009-02-25 16:33       ` Ingo Molnar
  0 siblings, 1 reply; 51+ messages in thread
From: Steven Rostedt @ 2009-02-25 16:28 UTC (permalink / raw)
  To: Mathieu Desnoyers
  Cc: linux-kernel, Ingo Molnar, Andrew Morton, Thomas Gleixner,
	Peter Zijlstra, Frederic Weisbecker, Theodore Tso,
	Arjan van de Ven, Pekka Paalanen, Arnaldo Carvalho de Melo,
	Jason Baron, Martin Bligh, Frank Ch. Eigler, KOSAKI Motohiro,
	Jens Axboe, Masami Hiramatsu, Steven Rostedt


On Wed, 25 Feb 2009, Mathieu Desnoyers wrote:

> * Steven Rostedt (rostedt@goodmis.org) wrote:
> > From: Steven Rostedt <srostedt@redhat.com>
> > 
> > This patch creates a DEFINE_TRACE_FMT to map to DECLARE_TRACE.
> > This allows for the developers to place format strings and
> > args in with their tracepoint declaration. A tracer may now
> > override the DEFINE_TRACE_FMT macro and use it to record
> > a default format.
> > 
> 
> Hi Steven,
> 
> How comes does a DEFINE_* maps to a DECLARE_* ?
> 
> Usually, DEFINE_* are meant to be put in .c files. DECLARE_* are in
> headers. I don't see how the mapping you propose here can be
> semantically correct ?

Because it can be either a define or a declare ;-)

Gag, I can't win! First Andrew points out that I use DECLARE_TRACE_FMT 
when it should be a DEFINE. Now I go change it and you state that it 
should be a DECLARE not a DEFINE!


Well, any tracer that remaps it, it should be a DEFINE. Just in the 
"default" case of trace points, is it used as a DECLARE. Because, it 
ignores the fmt parameter that would be used in the define.

-- Steve


^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH 1/4] tracing: add DEFINE_TRACE_FMT to tracepoint.h
  2009-02-25 16:28     ` Steven Rostedt
@ 2009-02-25 16:33       ` Ingo Molnar
  0 siblings, 0 replies; 51+ messages in thread
From: Ingo Molnar @ 2009-02-25 16:33 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: Mathieu Desnoyers, linux-kernel, Andrew Morton, Thomas Gleixner,
	Peter Zijlstra, Frederic Weisbecker, Theodore Tso,
	Arjan van de Ven, Pekka Paalanen, Arnaldo Carvalho de Melo,
	Jason Baron, Martin Bligh, Frank Ch. Eigler, KOSAKI Motohiro,
	Jens Axboe, Masami Hiramatsu, Steven Rostedt


* Steven Rostedt <rostedt@goodmis.org> wrote:

> On Wed, 25 Feb 2009, Mathieu Desnoyers wrote:
> 
> > * Steven Rostedt (rostedt@goodmis.org) wrote:
> > > From: Steven Rostedt <srostedt@redhat.com>
> > > 
> > > This patch creates a DEFINE_TRACE_FMT to map to DECLARE_TRACE.
> > > This allows for the developers to place format strings and
> > > args in with their tracepoint declaration. A tracer may now
> > > override the DEFINE_TRACE_FMT macro and use it to record
> > > a default format.
> > > 
> > 
> > Hi Steven,
> > 
> > How comes does a DEFINE_* maps to a DECLARE_* ?
> > 
> > Usually, DEFINE_* are meant to be put in .c files. DECLARE_* are in
> > headers. I don't see how the mapping you propose here can be
> > semantically correct ?
> 
> Because it can be either a define or a declare ;-)
> 
> Gag, I can't win! First Andrew points out that I use 
> DECLARE_TRACE_FMT when it should be a DEFINE. Now I go change 
> it and you state that it should be a DECLARE not a DEFINE!
> 
> Well, any tracer that remaps it, it should be a DEFINE. Just 
> in the "default" case of trace points, is it used as a 
> DECLARE. Because, it ignores the fmt parameter that would be 
> used in the define.

Add either both or none of them. I'd suggest the latter ;-)

TRACE_FORMAT() would be perfect.

	Ingo

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH 2/4] tracing: add event trace infrastructure
  2009-02-25 15:57                   ` Ingo Molnar
  2009-02-25 16:09                     ` Steven Rostedt
@ 2009-02-25 22:48                     ` Steven Rostedt
  2009-02-26  3:19                       ` Ingo Molnar
  1 sibling, 1 reply; 51+ messages in thread
From: Steven Rostedt @ 2009-02-25 22:48 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Andrew Morton, Pekka Enberg, LKML, Thomas Gleixner,
	Peter Zijlstra, Frederic Weisbecker, Theodore Tso,
	Arjan van de Ven, Pekka Paalanen, Arnaldo Carvalho de Melo,
	Jason Baron, Martin Bligh, Mathieu Desnoyers, Frank Ch. Eigler,
	KOSAKI Motohiro, Jens Axboe, Masami Hiramatsu, Steven Rostedt



On Wed, 25 Feb 2009, Ingo Molnar wrote:

> 
> * Steven Rostedt <rostedt@goodmis.org> wrote:
> 
> > > #                _------=> CPU#
> > > #               / _-----=> irqs-off
> > > #              | / _----=> need-resched
> > > #              || / _---=> hardirq/softirq
> > > #              ||| / _--=> preempt-depth
> > > #              |||| /
> > > #              |||||     delay
> > > #  cmd     pid ||||| time  |   caller
> > > #     \   /    |||||   \   |   /
> > >   <idle>-0     0d..1    0us+: trace_hardirqs_off_thunk (apic_timer_interrupt)
> > >   <idle>-0     0d.s.   97us : __do_softirq (do_softirq)
> > >   <idle>-0     0d.s1   98us : trace_hardirqs_on (do_softirq)
> [...]
> > > 
> > > your time starts now.
> > 
> > Note, that is a legacy format, that works great for the 
> > latency tracers. Those include (the example you used) irqsoff, 
> > the preempt off, wake up latencies. This gives us a nice 
> > listing of why we are hitting a latency. All other plugins 
> > generally use the trace file, although they could also use 
> > that file too.
> > 
> > As for your English comment. I showed the header to my wife, 
> > and she had no idea what irqs-off means, nor need-resched, nor 
> > preempt-depth ;-) They are not quite English as they are 
> > technical terms. Most of the names are hard coded into the 
> > kernel too.
> > 
> > Yeah, maybe that file (latency_trace) is a bit too much. I for 
> > one love it. It is great to load a kernel on some remote box, 
> > and run the irqs off latency tracer to see where the 
> > interrupts are disabled for the longest time. This format is 
> > really nice because it shows me when we are in an interrupt, 
> > or interrupts are disabled, and when the task should have been 
> > rescheduled.
> > 
> > This has help find places that we miss a preemption check too.
> 
> Could we get that, as PeterZ has suggested, as a trace_option 
> column in the 'trace' file? It would be default off for 
> non-latency tracers, with latency tracing plugins turning it on 
> by default. Would that work?

I guess adding options will be a high priority for me now. I can't parse 
irqsoff output anymore :-(

Besides the information of the ..... area, the times were all based off of 
the start of the trace, not global. This allowed you to see the latency in 
the trace.

-- Steve


^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH 2/4] tracing: add event trace infrastructure
  2009-02-25 22:48                     ` Steven Rostedt
@ 2009-02-26  3:19                       ` Ingo Molnar
  0 siblings, 0 replies; 51+ messages in thread
From: Ingo Molnar @ 2009-02-26  3:19 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: Andrew Morton, Pekka Enberg, LKML, Thomas Gleixner,
	Peter Zijlstra, Frederic Weisbecker, Theodore Tso,
	Arjan van de Ven, Pekka Paalanen, Arnaldo Carvalho de Melo,
	Jason Baron, Martin Bligh, Mathieu Desnoyers, Frank Ch. Eigler,
	KOSAKI Motohiro, Jens Axboe, Masami Hiramatsu, Steven Rostedt


* Steven Rostedt <rostedt@goodmis.org> wrote:

> 
> 
> On Wed, 25 Feb 2009, Ingo Molnar wrote:
> 
> > 
> > * Steven Rostedt <rostedt@goodmis.org> wrote:
> > 
> > > > #                _------=> CPU#
> > > > #               / _-----=> irqs-off
> > > > #              | / _----=> need-resched
> > > > #              || / _---=> hardirq/softirq
> > > > #              ||| / _--=> preempt-depth
> > > > #              |||| /
> > > > #              |||||     delay
> > > > #  cmd     pid ||||| time  |   caller
> > > > #     \   /    |||||   \   |   /
> > > >   <idle>-0     0d..1    0us+: trace_hardirqs_off_thunk (apic_timer_interrupt)
> > > >   <idle>-0     0d.s.   97us : __do_softirq (do_softirq)
> > > >   <idle>-0     0d.s1   98us : trace_hardirqs_on (do_softirq)
> > [...]
> > > > 
> > > > your time starts now.
> > > 
> > > Note, that is a legacy format, that works great for the 
> > > latency tracers. Those include (the example you used) irqsoff, 
> > > the preempt off, wake up latencies. This gives us a nice 
> > > listing of why we are hitting a latency. All other plugins 
> > > generally use the trace file, although they could also use 
> > > that file too.
> > > 
> > > As for your English comment. I showed the header to my wife, 
> > > and she had no idea what irqs-off means, nor need-resched, nor 
> > > preempt-depth ;-) They are not quite English as they are 
> > > technical terms. Most of the names are hard coded into the 
> > > kernel too.
> > > 
> > > Yeah, maybe that file (latency_trace) is a bit too much. I for 
> > > one love it. It is great to load a kernel on some remote box, 
> > > and run the irqs off latency tracer to see where the 
> > > interrupts are disabled for the longest time. This format is 
> > > really nice because it shows me when we are in an interrupt, 
> > > or interrupts are disabled, and when the task should have been 
> > > rescheduled.
> > > 
> > > This has help find places that we miss a preemption check too.
> > 
> > Could we get that, as PeterZ has suggested, as a trace_option 
> > column in the 'trace' file? It would be default off for 
> > non-latency tracers, with latency tracing plugins turning it on 
> > by default. Would that work?
> 
> I guess adding options will be a high priority for me now. I 
> can't parse irqsoff output anymore :-(
> 
> Besides the information of the ..... area, the times were all 
> based off of the start of the trace, not global. This allowed 
> you to see the latency in the trace.

That is something that is sane to have as a default anyway. It 
always annoys me that i have to start looking at a trace by 
subtracting the first timestamp from the last timestamp.

	Ingo

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH 2/4] tracing: add event trace infrastructure
  2009-02-25 13:54             ` Theodore Tso
@ 2009-02-26 21:08               ` Frank Ch. Eigler
  0 siblings, 0 replies; 51+ messages in thread
From: Frank Ch. Eigler @ 2009-02-26 21:08 UTC (permalink / raw)
  To: Theodore Tso
  Cc: Andrew Morton, Ingo Molnar, Steven Rostedt, LKML,
	Thomas Gleixner, Peter Zijlstra, Frederic Weisbecker,
	Arjan van de Ven, Pekka Paalanen, Arnaldo Carvalho de Melo,
	Jason Baron, Martin Bligh, Mathieu Desnoyers, KOSAKI Motohiro,
	Jens Axboe, Masami Hiramatsu, Steven Rostedt

Theodore Tso <tytso@mit.edu> writes:

> [...]  I think it's because we've just been burned so many times
> with complex user space tools that are built in C++, have
> impossible-to-understand error handling ("try again with -vvvvvvv"),
> is hard to build because they have huge dependencies on external
> libraries, or is filled with distro dependencies and might not build
> on various distributions, or is dependent on the RPM macros, etc.
> [...]

How many of these concerns have a basis in recent reality / experience?

- FChE

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH 2/4] tracing: add event trace infrastructure
  2009-02-25  8:28           ` Andrew Morton
                               ` (2 preceding siblings ...)
  2009-02-25 13:54             ` Theodore Tso
@ 2009-03-01 10:37             ` KOSAKI Motohiro
  3 siblings, 0 replies; 51+ messages in thread
From: KOSAKI Motohiro @ 2009-03-01 10:37 UTC (permalink / raw)
  To: Andrew Morton
  Cc: kosaki.motohiro, Ingo Molnar, Steven Rostedt, LKML,
	Thomas Gleixner, Peter Zijlstra, Frederic Weisbecker,
	Theodore Tso, Arjan van de Ven, Pekka Paalanen,
	Arnaldo Carvalho de Melo, Jason Baron, Martin Bligh,
	Mathieu Desnoyers, Frank Ch. Eigler, Jens Axboe,
	Masami Hiramatsu, Steven Rostedt

> On Wed, 25 Feb 2009 09:11:18 +0100 Ingo Molnar <mingo@elte.hu> wrote:
> 
> > 
> > * Andrew Morton <akpm@linux-foundation.org> wrote:
> > 
> > > On Tue, 24 Feb 2009 23:08:56 -0500 (EST) Steven Rostedt <rostedt@goodmis.org> wrote:
> > > 
> > > > > Gad, what a lot of stuff.
> > > > > 
> > > > > Use strncpy_from_user()?
> > > > > 
> > > > > Use strstrip()?
> > > > > 
> > > > > Why do we care about leading and trailing whitespace - user error!
> > > > 
> > > > This is because i want:
> > > > 
> > > >  cat available_events > set_event
> > > > 
> > > > to work.
> > > 
> > > :(
> > > 
> > > Why on earth do we keep on putting all these pretty-printers 
> > > and pretty-parsers into the kernel?  I mean, how hard is it 
> > > for userspace to read a text file, do some basic substitutions 
> > > and print it out again?
> > 
> > Note that there's no mandatory user-space component here - the 
> > final destination is the kernel developer's eyes in 90% of the 
> > cases. These traces get pasted into email, etc. etc.
> > 
> > So leading spaces, meaningful formatting and general hands-on 
> > usability is important. [ I know, it's a strange concept in the 
> > kernel, we tend to have a perversion for the most unusable and 
> > most inconsistent user interfaces ;-) ]
> > 
> > It's also a balancing act. We dont want to put all of TeX into 
> > the kernel obviously. Nor do we want the default to be the 
> > opposite end of the spectrum: to output raw binary records. So 
> > we find some middle ground. That middle ground is inluenced by 
> > the developers using this stuff.
> > 
> 
> <For the enty enth pissing-in-the-wind time>
> 
> A better approach would be to design simple, robust kernel interfaces
> which make sense and which aren't made all complex by putting the user
> interface in kernel space.  And to maintain corresponding userspace
> tools which manipulate and present the IO from those kernel interfaces.
> 
> But we don't do that, because userspace is hard, because we don't have
> a delivery process.  But nobody has even tried!  We can do this - it
> starts with `mkdir -p userspace/ktrace'.
> 
> Probably it's already too late for ktrace though - that hole is already
> dug.
> 
> 
> Last time I pissed in this wind I got fobbed off with some stupid "put
> it in util-linux" answer.  But of course we won't do that, because it's
> MUCH harder and slower and more complex than just patching the kernel
> tree.  So nothing happened.  Again.
> 
> And please let's not all sit here busily thinking up improbable reasons
> why it can't possibly work.  We're clever!  We can do this sort of
> thing!  If problems arise, we fix them!
> 


> The only extant example I can think of is getdelays.c, and that was a
> totally half-assed effort with no infrastructural support at all.  But
> quite a lot of people use it, and patches occasionally come in for it,
> no problems.

<just offtopic>

I think getdelays.c should move to "userspace/delayacct".
it's definitly not document.
(slabinfo too)

Documentation directory should only have documentation and example.




^ permalink raw reply	[flat|nested] 51+ messages in thread

end of thread, other threads:[~2009-03-01 10:40 UTC | newest]

Thread overview: 51+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2009-02-25  2:56 [PATCH 0/4] [git pull] tip/tracing/ftrace Steven Rostedt
2009-02-25  2:56 ` [PATCH 1/4] tracing: add DEFINE_TRACE_FMT to tracepoint.h Steven Rostedt
2009-02-25  6:27   ` Peter Zijlstra
2009-02-25 13:01     ` Steven Rostedt
2009-02-25 16:09       ` Mathieu Desnoyers
2009-02-25 16:13   ` Mathieu Desnoyers
2009-02-25 16:28     ` Steven Rostedt
2009-02-25 16:33       ` Ingo Molnar
2009-02-25  2:56 ` [PATCH 2/4] tracing: add event trace infrastructure Steven Rostedt
2009-02-25  3:45   ` Andrew Morton
2009-02-25  4:08     ` Steven Rostedt
2009-02-25  4:24       ` Nick Piggin
2009-02-25  4:33       ` Andrew Morton
2009-02-25  5:16         ` Mathieu Desnoyers
2009-02-25  8:11         ` Ingo Molnar
2009-02-25  8:28           ` Andrew Morton
2009-02-25  8:40             ` Ingo Molnar
2009-02-25  9:15               ` Andrew Morton
2009-02-25  9:00             ` Pekka Enberg
2009-02-25  9:10               ` Ingo Molnar
2009-02-25  9:22               ` Andrew Morton
2009-02-25  9:26                 ` Peter Zijlstra
2009-02-25 10:31                   ` Ingo Molnar
2009-02-25  9:33                 ` Pekka Enberg
2009-02-25  9:44                   ` Andrew Morton
2009-02-25  9:56                     ` Ingo Molnar
2009-02-25 10:02                       ` Andrew Morton
2009-02-25 10:24                         ` Pekka Enberg
2009-02-25 10:27                         ` Ingo Molnar
2009-02-25 16:21                       ` Frederic Weisbecker
2009-02-25  9:57                     ` Pekka Enberg
2009-02-25 10:07                   ` [PATCH] tracing: remove /debug/tracing/latency_trace Ingo Molnar
2009-02-25 14:41                 ` [PATCH 2/4] tracing: add event trace infrastructure Steven Rostedt
2009-02-25 15:57                   ` Ingo Molnar
2009-02-25 16:09                     ` Steven Rostedt
2009-02-25 22:48                     ` Steven Rostedt
2009-02-26  3:19                       ` Ingo Molnar
2009-02-25 13:54             ` Theodore Tso
2009-02-26 21:08               ` Frank Ch. Eigler
2009-03-01 10:37             ` KOSAKI Motohiro
2009-02-25 13:37     ` Theodore Tso
2009-02-25 14:10       ` Steven Rostedt
2009-02-25  9:07   ` Lai Jiangshan
2009-02-25 13:50     ` Steven Rostedt
2009-02-25  9:21   ` Lai Jiangshan
2009-02-25 13:54     ` Steven Rostedt
2009-02-25  2:56 ` [PATCH 3/4] tracing: add schedule events to event trace Steven Rostedt
2009-02-25  6:29   ` Peter Zijlstra
2009-02-25  2:56 ` [PATCH 4/4] tracing: make event directory structure Steven Rostedt
2009-02-25  6:59   ` Frederic Weisbecker
2009-02-25 13:07     ` Steven Rostedt

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).