* [GIT PULL] tracing: Syscalls trace events + perf support
@ 2009-08-11 18:48 Frederic Weisbecker
  2009-08-11 18:48 ` [PATCH 01/16] tracing: Rename set_tracer_flags()'s local variable trace_flags Frederic Weisbecker
                   ` (17 more replies)
  0 siblings, 18 replies; 54+ messages in thread
From: Frederic Weisbecker @ 2009-08-11 18:48 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: LKML, Frederic Weisbecker, Lai Jiangshan, Steven Rostedt,
	Peter Zijlstra, Mathieu Desnoyers, Jiaying Zhang, Martin Bligh,
	Li Zefan, Jason Baron, Masami Hiramatsu

Hi Ingo,

This pull request integrates one cleanup/fix for ftrace and
an update for syscall tracing: the migration from the old-style tracer
to individual tracepoints/trace_events, and support for perf counters.

I've tested it successfully both with ftrace (every syscall tracepoint
enabled at the same time without problems) and with perfcounter.

Maybe one drawback: it creates so many trace events that the ftrace
selftests can take some time :-)

Thanks,
Frederic.

The following changes since commit 89034bc2c7b839702c00a704e79d112737f98be0:
  Ingo Molnar (1):
        Merge branch 'linus' into tracing/core

are available in the git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/frederic/random-tracing.git \
	tracing/core

Frederic Weisbecker (3):
      tracing: Add ftrace event call parameter to its field descriptor handler
      tracing: Add fields format definition for syscall events
      tracing: Support for syscall events raw records in perfcounters

Jason Baron (12):
      tracing: Map syscall name to number
      tracing: Call arch_init_ftrace_syscalls at boot
      tracing: Add DECLARE_TRACE_WITH_CALLBACK() macro
      tracing: Add syscall tracepoints
      tracing: Update FTRACE_SYSCALL_MAX
      tracing: Raw_init() bailout in trace event register fail case
      tracing: Add ftrace_event_call void * 'data' field
      tracing: Add trace events for each syscall entry/exit
      tracing: Add individual syscalls tracepoint id support
      tracing: Add perf counter support for syscalls tracing
      tracing: Add more namespace area to 'perf list' output
      tracing: Convert x86_64 mmap and uname to use DEFINE_SYSCALL

Zhaolei (1):
      tracing: Rename set_tracer_flags()'s local variable trace_flags

 arch/x86/include/asm/ftrace.h  |    4 +-
 arch/x86/kernel/ftrace.c       |   41 ++++--
 arch/x86/kernel/ptrace.c       |    7 +-
 arch/x86/kernel/sys_x86_64.c   |    8 +-
 include/linux/ftrace_event.h   |    8 +-
 include/linux/perf_counter.h   |    2 +
 include/linux/syscalls.h       |  126 +++++++++++++-
 include/linux/tracepoint.h     |   31 +++-
 include/trace/ftrace.h         |    7 +-
 include/trace/syscall.h        |   56 +++++-
 kernel/trace/trace.c           |   14 +-
 kernel/trace/trace.h           |    6 -
 kernel/trace/trace_events.c    |   35 +++--
 kernel/trace/trace_export.c    |    6 +-
 kernel/trace/trace_syscalls.c  |  372 +++++++++++++++++++++++++++++++---------
 kernel/tracepoint.c            |   38 ++++
 tools/perf/util/parse-events.c |    8 +-
 17 files changed, 613 insertions(+), 156 deletions(-)


* [PATCH 01/16] tracing: Rename set_tracer_flags()'s local variable trace_flags
  2009-08-11 18:48 [GIT PULL] tracing: Syscalls trace events + perf support Frederic Weisbecker
@ 2009-08-11 18:48 ` Frederic Weisbecker
  2009-08-11 18:48 ` [PATCH 02/16] tracing: Map syscall name to number Frederic Weisbecker
                   ` (16 subsequent siblings)
  17 siblings, 0 replies; 54+ messages in thread
From: Frederic Weisbecker @ 2009-08-11 18:48 UTC (permalink / raw)
  To: Ingo Molnar; +Cc: LKML, Zhaolei, Steven Rostedt, Li Zefan, Frederic Weisbecker

From: Zhaolei <zhaolei@cn.fujitsu.com>

set_tracer_flags() has a local variable named trace_flags, which has
the same name as a global one visible in the same scope.
This leads to confusion; the name tracer_flags better reflects its
meaning.

Changelog:
v1->v2: Simplified another patch in this patchset, no change in this
        patch.

Signed-off-by: Zhao Lei <zhaolei@cn.fujitsu.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Li Zefan <lizf@cn.fujitsu.com>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
---
 kernel/trace/trace.c |   14 +++++++-------
 1 files changed, 7 insertions(+), 7 deletions(-)

diff --git a/kernel/trace/trace.c b/kernel/trace/trace.c
index e793cda..8ac2043 100644
--- a/kernel/trace/trace.c
+++ b/kernel/trace/trace.c
@@ -2118,23 +2118,23 @@ tracing_trace_options_read(struct file *filp, char __user *ubuf,
 /* Try to assign a tracer specific option */
 static int set_tracer_option(struct tracer *trace, char *cmp, int neg)
 {
-	struct tracer_flags *trace_flags = trace->flags;
+	struct tracer_flags *tracer_flags = trace->flags;
 	struct tracer_opt *opts = NULL;
 	int ret = 0, i = 0;
 	int len;
 
-	for (i = 0; trace_flags->opts[i].name; i++) {
-		opts = &trace_flags->opts[i];
+	for (i = 0; tracer_flags->opts[i].name; i++) {
+		opts = &tracer_flags->opts[i];
 		len = strlen(opts->name);
 
 		if (strncmp(cmp, opts->name, len) == 0) {
-			ret = trace->set_flag(trace_flags->val,
+			ret = trace->set_flag(tracer_flags->val,
 				opts->bit, !neg);
 			break;
 		}
 	}
 	/* Not found */
-	if (!trace_flags->opts[i].name)
+	if (!tracer_flags->opts[i].name)
 		return -EINVAL;
 
 	/* Refused to handle */
@@ -2142,9 +2142,9 @@ static int set_tracer_option(struct tracer *trace, char *cmp, int neg)
 		return ret;
 
 	if (neg)
-		trace_flags->val &= ~opts->bit;
+		tracer_flags->val &= ~opts->bit;
 	else
-		trace_flags->val |= opts->bit;
+		tracer_flags->val |= opts->bit;
 
 	return 0;
 }
-- 
1.6.2.3



* [PATCH 02/16] tracing: Map syscall name to number
  2009-08-11 18:48 [GIT PULL] tracing: Syscalls trace events + perf support Frederic Weisbecker
  2009-08-11 18:48 ` [PATCH 01/16] tracing: Rename set_tracer_flags()'s local variable trace_flags Frederic Weisbecker
@ 2009-08-11 18:48 ` Frederic Weisbecker
  2009-08-11 18:48 ` [PATCH 03/16] tracing: Call arch_init_ftrace_syscalls at boot Frederic Weisbecker
                   ` (15 subsequent siblings)
  17 siblings, 0 replies; 54+ messages in thread
From: Frederic Weisbecker @ 2009-08-11 18:48 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: LKML, Jason Baron, Lai Jiangshan, Steven Rostedt, Peter Zijlstra,
	Mathieu Desnoyers, Jiaying Zhang, Martin Bligh, Li Zefan,
	Masami Hiramatsu, Frederic Weisbecker

From: Jason Baron <jbaron@redhat.com>

Add a new function to support translating a syscall name to its number
at runtime.
This allows the syscall event tracer to map syscall names to numbers.

Signed-off-by: Jason Baron <jbaron@redhat.com>
Cc: Lai Jiangshan <laijs@cn.fujitsu.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
Cc: Jiaying Zhang <jiayingz@google.com>
Cc: Martin Bligh <mbligh@google.com>
Cc: Li Zefan <lizf@cn.fujitsu.com>
Cc: Masami Hiramatsu <mhiramat@redhat.com>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
---
 arch/x86/kernel/ftrace.c |   16 ++++++++++++++++
 1 files changed, 16 insertions(+), 0 deletions(-)

diff --git a/arch/x86/kernel/ftrace.c b/arch/x86/kernel/ftrace.c
index 8e96634..afb31d7 100644
--- a/arch/x86/kernel/ftrace.c
+++ b/arch/x86/kernel/ftrace.c
@@ -500,6 +500,22 @@ struct syscall_metadata *syscall_nr_to_meta(int nr)
 	return syscalls_metadata[nr];
 }
 
+int syscall_name_to_nr(char *name)
+{
+	int i;
+
+	if (!syscalls_metadata)
+		return -1;
+
+	for (i = 0; i < FTRACE_SYSCALL_MAX; i++) {
+		if (syscalls_metadata[i]) {
+			if (!strcmp(syscalls_metadata[i]->name, name))
+				return i;
+		}
+	}
+	return -1;
+}
+
 void arch_init_ftrace_syscalls(void)
 {
 	int i;
-- 
1.6.2.3
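
For readers who want to try the lookup outside the kernel, here is a
minimal userspace sketch of the same linear scan over a sparse table.
Everything prefixed demo_ or DEMO_ is a hypothetical stand-in (the table
contents and its size are assumptions made for illustration); only the
control flow mirrors syscall_name_to_nr() from the patch above.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the kernel's struct syscall_metadata. */
struct demo_syscall_meta {
	const char *name;
};

#define DEMO_SYSCALL_MAX 4	/* assumed table size, illustration only */

static struct demo_syscall_meta demo_read  = { "sys_read" };
static struct demo_syscall_meta demo_write = { "sys_write" };

/* Sparse table: slots without metadata stay NULL, as in the patch. */
static struct demo_syscall_meta *demo_table[DEMO_SYSCALL_MAX] = {
	&demo_read, NULL, &demo_write, NULL,
};

/* Linear scan that skips empty slots; returns -1 when nothing matches. */
static int demo_name_to_nr(const char *name)
{
	int i;

	assert(name != NULL);

	for (i = 0; i < DEMO_SYSCALL_MAX; i++) {
		if (demo_table[i] && !strcmp(demo_table[i]->name, name))
			return i;
	}
	return -1;
}
```

The scan is O(n) per lookup, which seems acceptable here because the
patch series only calls it while an event is being initialized, not on
the syscall path.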



* [PATCH 03/16] tracing: Call arch_init_ftrace_syscalls at boot
  2009-08-11 18:48 [GIT PULL] tracing: Syscalls trace events + perf support Frederic Weisbecker
  2009-08-11 18:48 ` [PATCH 01/16] tracing: Rename set_tracer_flags()'s local variable trace_flags Frederic Weisbecker
  2009-08-11 18:48 ` [PATCH 02/16] tracing: Map syscall name to number Frederic Weisbecker
@ 2009-08-11 18:48 ` Frederic Weisbecker
  2009-08-11 18:48 ` [PATCH 04/16] tracing: Add DECLARE_TRACE_WITH_CALLBACK() macro Frederic Weisbecker
                   ` (14 subsequent siblings)
  17 siblings, 0 replies; 54+ messages in thread
From: Frederic Weisbecker @ 2009-08-11 18:48 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: LKML, Jason Baron, Lai Jiangshan, Steven Rostedt, Peter Zijlstra,
	Mathieu Desnoyers, Jiaying Zhang, Martin Bligh, Li Zefan,
	Masami Hiramatsu, Frederic Weisbecker

From: Jason Baron <jbaron@redhat.com>

Call arch_init_ftrace_syscalls at boot, so we can determine early the
set of syscalls for the syscall trace events.

Signed-off-by: Jason Baron <jbaron@redhat.com>
Cc: Lai Jiangshan <laijs@cn.fujitsu.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
Cc: Jiaying Zhang <jiayingz@google.com>
Cc: Martin Bligh <mbligh@google.com>
Cc: Li Zefan <lizf@cn.fujitsu.com>
Cc: Masami Hiramatsu <mhiramat@redhat.com>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
---
 arch/x86/kernel/ftrace.c      |   15 ++++-----------
 include/trace/syscall.h       |    1 -
 kernel/trace/trace_syscalls.c |    1 -
 3 files changed, 4 insertions(+), 13 deletions(-)

diff --git a/arch/x86/kernel/ftrace.c b/arch/x86/kernel/ftrace.c
index afb31d7..0d93d40 100644
--- a/arch/x86/kernel/ftrace.c
+++ b/arch/x86/kernel/ftrace.c
@@ -516,31 +516,24 @@ int syscall_name_to_nr(char *name)
 	return -1;
 }
 
-void arch_init_ftrace_syscalls(void)
+static int __init arch_init_ftrace_syscalls(void)
 {
 	int i;
 	struct syscall_metadata *meta;
 	unsigned long **psys_syscall_table = &sys_call_table;
-	static atomic_t refs;
-
-	if (atomic_inc_return(&refs) != 1)
-		goto end;
 
 	syscalls_metadata = kzalloc(sizeof(*syscalls_metadata) *
 					FTRACE_SYSCALL_MAX, GFP_KERNEL);
 	if (!syscalls_metadata) {
 		WARN_ON(1);
-		return;
+		return -ENOMEM;
 	}
 
 	for (i = 0; i < FTRACE_SYSCALL_MAX; i++) {
 		meta = find_syscall_meta(psys_syscall_table[i]);
 		syscalls_metadata[i] = meta;
 	}
-	return;
-
-	/* Paranoid: avoid overflow */
-end:
-	atomic_dec(&refs);
+	return 0;
 }
+arch_initcall(arch_init_ftrace_syscalls);
 #endif
diff --git a/include/trace/syscall.h b/include/trace/syscall.h
index 8cfe515..c55fcce 100644
--- a/include/trace/syscall.h
+++ b/include/trace/syscall.h
@@ -19,7 +19,6 @@ struct syscall_metadata {
 };
 
 #ifdef CONFIG_FTRACE_SYSCALLS
-extern void arch_init_ftrace_syscalls(void);
 extern struct syscall_metadata *syscall_nr_to_meta(int nr);
 extern void start_ftrace_syscalls(void);
 extern void stop_ftrace_syscalls(void);
diff --git a/kernel/trace/trace_syscalls.c b/kernel/trace/trace_syscalls.c
index 5e57964..08aed43 100644
--- a/kernel/trace/trace_syscalls.c
+++ b/kernel/trace/trace_syscalls.c
@@ -106,7 +106,6 @@ void start_ftrace_syscalls(void)
 	if (++refcount != 1)
 		goto unlock;
 
-	arch_init_ftrace_syscalls();
 	read_lock_irqsave(&tasklist_lock, flags);
 
 	do_each_thread(g, t) {
-- 
1.6.2.3



* [PATCH 04/16] tracing: Add DECLARE_TRACE_WITH_CALLBACK() macro
  2009-08-11 18:48 [GIT PULL] tracing: Syscalls trace events + perf support Frederic Weisbecker
                   ` (2 preceding siblings ...)
  2009-08-11 18:48 ` [PATCH 03/16] tracing: Call arch_init_ftrace_syscalls at boot Frederic Weisbecker
@ 2009-08-11 18:48 ` Frederic Weisbecker
  2009-08-11 18:48 ` [PATCH 05/16] tracing: Add syscall tracepoints Frederic Weisbecker
                   ` (13 subsequent siblings)
  17 siblings, 0 replies; 54+ messages in thread
From: Frederic Weisbecker @ 2009-08-11 18:48 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: LKML, Jason Baron, Lai Jiangshan, Steven Rostedt, Peter Zijlstra,
	Mathieu Desnoyers, Jiaying Zhang, Martin Bligh, Li Zefan,
	Masami Hiramatsu, Frederic Weisbecker

From: Jason Baron <jbaron@redhat.com>

Introduce a new 'DECLARE_TRACE_WITH_CALLBACK()' macro, so that
tracepoints can associate an external register/unregister function.

This prepares for the conversion of the syscalls tracer to trace events.
We will need to perform arch-level operations once a syscall event is
turned on/off, such as setting TIF flags, hence the need for such
specific callbacks.

Signed-off-by: Jason Baron <jbaron@redhat.com>
Cc: Lai Jiangshan <laijs@cn.fujitsu.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
Cc: Jiaying Zhang <jiayingz@google.com>
Cc: Martin Bligh <mbligh@google.com>
Cc: Li Zefan <lizf@cn.fujitsu.com>
Cc: Masami Hiramatsu <mhiramat@redhat.com>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
---
 include/linux/tracepoint.h |   31 +++++++++++++++++++++++++++----
 1 files changed, 27 insertions(+), 4 deletions(-)

diff --git a/include/linux/tracepoint.h b/include/linux/tracepoint.h
index b9dc4ca..5984ed0 100644
--- a/include/linux/tracepoint.h
+++ b/include/linux/tracepoint.h
@@ -60,8 +60,10 @@ struct tracepoint {
  * Make sure the alignment of the structure in the __tracepoints section will
  * not add unwanted padding between the beginning of the section and the
  * structure. Force alignment to the same alignment as the section start.
+ * An optional set of (un)registration functions can be passed to perform any
+ * additional (un)registration work.
  */
-#define DECLARE_TRACE(name, proto, args)				\
+#define DECLARE_TRACE_WITH_CALLBACK(name, proto, args, reg, unreg)	\
 	extern struct tracepoint __tracepoint_##name;			\
 	static inline void trace_##name(proto)				\
 	{								\
@@ -71,13 +73,30 @@ struct tracepoint {
 	}								\
 	static inline int register_trace_##name(void (*probe)(proto))	\
 	{								\
-		return tracepoint_probe_register(#name, (void *)probe);	\
+		int ret;						\
+		void (*func)(void) = reg;				\
+									\
+		ret = tracepoint_probe_register(#name, (void *)probe);	\
+		if (func && !ret)					\
+			func();						\
+		return ret;						\
 	}								\
 	static inline int unregister_trace_##name(void (*probe)(proto))	\
 	{								\
-		return tracepoint_probe_unregister(#name, (void *)probe);\
+		int ret;						\
+		void (*func)(void) = unreg;				\
+									\
+		ret = tracepoint_probe_unregister(#name, (void *)probe);\
+		if (func && !ret)					\
+			func();						\
+		return ret;						\
 	}
 
+
+#define DECLARE_TRACE(name, proto, args)				 \
+	DECLARE_TRACE_WITH_CALLBACK(name, TP_PROTO(proto), TP_ARGS(args),\
+					NULL, NULL);
+
 #define DEFINE_TRACE(name)						\
 	static const char __tpstrtab_##name[]				\
 	__attribute__((section("__tracepoints_strings"))) = #name;	\
@@ -94,7 +113,7 @@ extern void tracepoint_update_probe_range(struct tracepoint *begin,
 	struct tracepoint *end);
 
 #else /* !CONFIG_TRACEPOINTS */
-#define DECLARE_TRACE(name, proto, args)				\
+#define DECLARE_TRACE_WITH_CALLBACK(name, proto, args, reg, unreg)	\
 	static inline void _do_trace_##name(struct tracepoint *tp, proto) \
 	{ }								\
 	static inline void trace_##name(proto)				\
@@ -108,6 +127,10 @@ extern void tracepoint_update_probe_range(struct tracepoint *begin,
 		return -ENOSYS;						\
 	}
 
+#define DECLARE_TRACE(name, proto, args)				 \
+	DECLARE_TRACE_WITH_CALLBACK(name, TP_PROTO(proto), TP_ARGS(args),\
+					NULL, NULL);
+
 #define DEFINE_TRACE(name)
 #define EXPORT_TRACEPOINT_SYMBOL_GPL(name)
 #define EXPORT_TRACEPOINT_SYMBOL(name)
-- 
1.6.2.3
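
The wrapper logic that the macro generates can be sketched in plain C as
follows. This is a userspace illustration only: demo_probe_register()
is a hypothetical stand-in for tracepoint_probe_register(), and the
counter exists just to make the behaviour observable. The point it shows
is that the optional hook runs only when one was supplied and the core
registration returned 0.

```c
#include <assert.h>
#include <stddef.h>

static int demo_hook_calls;	/* counts how often the extra hook ran */

static void demo_reg_hook(void)
{
	demo_hook_calls++;
}

/* Hypothetical core registration: fails when no probe is given. */
static int demo_probe_register(void (*probe)(void))
{
	return probe ? 0 : -1;
}

/*
 * Same shape as the generated register_trace_##name(): run the optional
 * hook only if one was supplied and the core registration succeeded.
 */
static int demo_register(void (*probe)(void), void (*hook)(void))
{
	int ret = demo_probe_register(probe);

	if (hook && !ret)
		hook();
	return ret;
}

/* Empty probe, used only so there is something to register. */
static void demo_probe(void)
{
}
```

Passing NULL for the hook reproduces what plain DECLARE_TRACE() does in
the patch: registration proceeds and no extra work is performed.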



* [PATCH 05/16] tracing: Add syscall tracepoints
  2009-08-11 18:48 [GIT PULL] tracing: Syscalls trace events + perf support Frederic Weisbecker
                   ` (3 preceding siblings ...)
  2009-08-11 18:48 ` [PATCH 04/16] tracing: Add DECLARE_TRACE_WITH_CALLBACK() macro Frederic Weisbecker
@ 2009-08-11 18:48 ` Frederic Weisbecker
  2009-08-11 18:48 ` [PATCH 06/16] tracing: Update FTRACE_SYSCALL_MAX Frederic Weisbecker
                   ` (12 subsequent siblings)
  17 siblings, 0 replies; 54+ messages in thread
From: Frederic Weisbecker @ 2009-08-11 18:48 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: LKML, Jason Baron, Lai Jiangshan, Steven Rostedt, Peter Zijlstra,
	Mathieu Desnoyers, Jiaying Zhang, Martin Bligh, Li Zefan,
	Masami Hiramatsu, Frederic Weisbecker

From: Jason Baron <jbaron@redhat.com>

Add two tracepoints in the syscall entry and exit paths, conditioned on
TIF_SYSCALL_FTRACE. These support the syscall trace event code.

Signed-off-by: Jason Baron <jbaron@redhat.com>
Cc: Lai Jiangshan <laijs@cn.fujitsu.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
Cc: Jiaying Zhang <jiayingz@google.com>
Cc: Martin Bligh <mbligh@google.com>
Cc: Li Zefan <lizf@cn.fujitsu.com>
Cc: Masami Hiramatsu <mhiramat@redhat.com>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
---
 arch/x86/kernel/ptrace.c |    7 +++++--
 include/trace/syscall.h  |   20 ++++++++++++++++++++
 kernel/tracepoint.c      |   38 ++++++++++++++++++++++++++++++++++++++
 3 files changed, 63 insertions(+), 2 deletions(-)

diff --git a/arch/x86/kernel/ptrace.c b/arch/x86/kernel/ptrace.c
index 09ecbde..34dd6f1 100644
--- a/arch/x86/kernel/ptrace.c
+++ b/arch/x86/kernel/ptrace.c
@@ -37,6 +37,9 @@
 
 #include <trace/syscall.h>
 
+DEFINE_TRACE(syscall_enter);
+DEFINE_TRACE(syscall_exit);
+
 #include "tls.h"
 
 enum x86_regset {
@@ -1498,7 +1501,7 @@ asmregparm long syscall_trace_enter(struct pt_regs *regs)
 		ret = -1L;
 
 	if (unlikely(test_thread_flag(TIF_SYSCALL_FTRACE)))
-		ftrace_syscall_enter(regs);
+		trace_syscall_enter(regs, regs->orig_ax);
 
 	if (unlikely(current->audit_context)) {
 		if (IS_IA32)
@@ -1524,7 +1527,7 @@ asmregparm void syscall_trace_leave(struct pt_regs *regs)
 		audit_syscall_exit(AUDITSC_RESULT(regs->ax), regs->ax);
 
 	if (unlikely(test_thread_flag(TIF_SYSCALL_FTRACE)))
-		ftrace_syscall_exit(regs);
+		trace_syscall_exit(regs, regs->ax);
 
 	if (test_thread_flag(TIF_SYSCALL_TRACE))
 		tracehook_report_syscall_exit(regs, 0);
diff --git a/include/trace/syscall.h b/include/trace/syscall.h
index c55fcce..3951d77 100644
--- a/include/trace/syscall.h
+++ b/include/trace/syscall.h
@@ -1,8 +1,28 @@
 #ifndef _TRACE_SYSCALL_H
 #define _TRACE_SYSCALL_H
 
+#include <linux/tracepoint.h>
+
 #include <asm/ptrace.h>
 
+
+extern void syscall_regfunc(void);
+extern void syscall_unregfunc(void);
+
+DECLARE_TRACE_WITH_CALLBACK(syscall_enter,
+	TP_PROTO(struct pt_regs *regs, long id),
+	TP_ARGS(regs, id),
+	syscall_regfunc,
+	syscall_unregfunc
+);
+
+DECLARE_TRACE_WITH_CALLBACK(syscall_exit,
+	TP_PROTO(struct pt_regs *regs, long ret),
+	TP_ARGS(regs, ret),
+	syscall_regfunc,
+	syscall_unregfunc
+);
+
 /*
  * A syscall entry in the ftrace syscalls array.
  *
diff --git a/kernel/tracepoint.c b/kernel/tracepoint.c
index 1ef5d3a..070a42b 100644
--- a/kernel/tracepoint.c
+++ b/kernel/tracepoint.c
@@ -24,6 +24,7 @@
 #include <linux/tracepoint.h>
 #include <linux/err.h>
 #include <linux/slab.h>
+#include <linux/sched.h>
 
 extern struct tracepoint __start___tracepoints[];
 extern struct tracepoint __stop___tracepoints[];
@@ -577,3 +578,40 @@ static int init_tracepoints(void)
 __initcall(init_tracepoints);
 
 #endif /* CONFIG_MODULES */
+
+static DEFINE_MUTEX(regfunc_mutex);
+static int sys_tracepoint_refcount;
+
+void syscall_regfunc(void)
+{
+	unsigned long flags;
+	struct task_struct *g, *t;
+
+	mutex_lock(&regfunc_mutex);
+	if (!sys_tracepoint_refcount) {
+		read_lock_irqsave(&tasklist_lock, flags);
+		do_each_thread(g, t) {
+			set_tsk_thread_flag(t, TIF_SYSCALL_FTRACE);
+		} while_each_thread(g, t);
+		read_unlock_irqrestore(&tasklist_lock, flags);
+	}
+	sys_tracepoint_refcount++;
+	mutex_unlock(&regfunc_mutex);
+}
+
+void syscall_unregfunc(void)
+{
+	unsigned long flags;
+	struct task_struct *g, *t;
+
+	mutex_lock(&regfunc_mutex);
+	sys_tracepoint_refcount--;
+	if (!sys_tracepoint_refcount) {
+		read_lock_irqsave(&tasklist_lock, flags);
+		do_each_thread(g, t) {
+			clear_tsk_thread_flag(t, TIF_SYSCALL_FTRACE);
+		} while_each_thread(g, t);
+		read_unlock_irqrestore(&tasklist_lock, flags);
+	}
+	mutex_unlock(&regfunc_mutex);
+}
-- 
1.6.2.3
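
The first-user/last-user pattern in syscall_regfunc() and
syscall_unregfunc() can be sketched in userspace as below. The array of
booleans is a hypothetical stand-in for the per-task TIF_SYSCALL_FTRACE
flag, and the locking done in the patch (regfunc_mutex and
tasklist_lock) is deliberately left out, so this sketch is not safe for
concurrent callers.

```c
#include <assert.h>
#include <stdbool.h>

#define DEMO_NTASKS 3		/* assumed number of tasks, illustration only */

/* Stands in for the TIF_SYSCALL_FTRACE bit of each task. */
static bool demo_task_flag[DEMO_NTASKS];
static int demo_refcount;

static void demo_set_all(bool on)
{
	int i;

	for (i = 0; i < DEMO_NTASKS; i++)
		demo_task_flag[i] = on;
}

/* The first user switches the flag on for every task. */
static void demo_regfunc(void)
{
	if (!demo_refcount)
		demo_set_all(true);
	demo_refcount++;
}

/* The last user switches it off again. */
static void demo_unregfunc(void)
{
	assert(demo_refcount > 0);
	demo_refcount--;
	if (!demo_refcount)
		demo_set_all(false);
}
```

With two registrations, the flags stay set after the first
unregistration and are cleared only by the second one.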



* [PATCH 06/16] tracing: Update FTRACE_SYSCALL_MAX
  2009-08-11 18:48 [GIT PULL] tracing: Syscalls trace events + perf support Frederic Weisbecker
                   ` (4 preceding siblings ...)
  2009-08-11 18:48 ` [PATCH 05/16] tracing: Add syscall tracepoints Frederic Weisbecker
@ 2009-08-11 18:48 ` Frederic Weisbecker
  2009-08-11 18:48 ` [PATCH 07/16] tracing: Raw_init() bailout in trace event register fail case Frederic Weisbecker
                   ` (11 subsequent siblings)
  17 siblings, 0 replies; 54+ messages in thread
From: Frederic Weisbecker @ 2009-08-11 18:48 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: LKML, Jason Baron, Lai Jiangshan, Steven Rostedt, Peter Zijlstra,
	Mathieu Desnoyers, Jiaying Zhang, Martin Bligh, Li Zefan,
	Masami Hiramatsu, Frederic Weisbecker

From: Jason Baron <jbaron@redhat.com>

Update FTRACE_SYSCALL_MAX to the current number of syscalls.

FTRACE_SYSCALL_MAX is a temporary solution to get the number of
syscalls supported by the arch until we find a more dynamic way
to get this number.

Signed-off-by: Jason Baron <jbaron@redhat.com>
Cc: Lai Jiangshan <laijs@cn.fujitsu.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
Cc: Jiaying Zhang <jiayingz@google.com>
Cc: Martin Bligh <mbligh@google.com>
Cc: Li Zefan <lizf@cn.fujitsu.com>
Cc: Masami Hiramatsu <mhiramat@redhat.com>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
---
 arch/x86/include/asm/ftrace.h |    4 ++--
 1 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/x86/include/asm/ftrace.h b/arch/x86/include/asm/ftrace.h
index bd2c651..7113654 100644
--- a/arch/x86/include/asm/ftrace.h
+++ b/arch/x86/include/asm/ftrace.h
@@ -30,9 +30,9 @@
 
 /* FIXME: I don't want to stay hardcoded */
 #ifdef CONFIG_X86_64
-# define FTRACE_SYSCALL_MAX     296
+# define FTRACE_SYSCALL_MAX     299
 #else
-# define FTRACE_SYSCALL_MAX     333
+# define FTRACE_SYSCALL_MAX     337
 #endif
 
 #ifdef CONFIG_FUNCTION_TRACER
-- 
1.6.2.3



* [PATCH 07/16] tracing: Raw_init() bailout in trace event register fail case
  2009-08-11 18:48 [GIT PULL] tracing: Syscalls trace events + perf support Frederic Weisbecker
                   ` (5 preceding siblings ...)
  2009-08-11 18:48 ` [PATCH 06/16] tracing: Update FTRACE_SYSCALL_MAX Frederic Weisbecker
@ 2009-08-11 18:48 ` Frederic Weisbecker
  2009-08-11 18:48 ` [PATCH 08/16] tracing: Add ftrace_event_call void * 'data' field Frederic Weisbecker
                   ` (10 subsequent siblings)
  17 siblings, 0 replies; 54+ messages in thread
From: Frederic Weisbecker @ 2009-08-11 18:48 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: LKML, Jason Baron, Lai Jiangshan, Steven Rostedt, Peter Zijlstra,
	Mathieu Desnoyers, Jiaying Zhang, Martin Bligh, Li Zefan,
	Masami Hiramatsu, Frederic Weisbecker

From: Jason Baron <jbaron@redhat.com>

Allow the return value of the raw_init() trace event callback to bail us
out of creating a trace event file in case we fail to register our
event.

Also, we plan to return -ENOSYS for syscall events that don't match any
syscall listed in our arch tracing syscall table. We don't want to warn
in that case; we just want the event to be ignored and invisible in
debugfs.

Signed-off-by: Jason Baron <jbaron@redhat.com>
Cc: Lai Jiangshan <laijs@cn.fujitsu.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
Cc: Jiaying Zhang <jiayingz@google.com>
Cc: Martin Bligh <mbligh@google.com>
Cc: Li Zefan <lizf@cn.fujitsu.com>
Cc: Masami Hiramatsu <mhiramat@redhat.com>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
---
 kernel/trace/trace_events.c |   29 +++++++++++++++++++----------
 1 files changed, 19 insertions(+), 10 deletions(-)

diff --git a/kernel/trace/trace_events.c b/kernel/trace/trace_events.c
index e0cbede..f95f847 100644
--- a/kernel/trace/trace_events.c
+++ b/kernel/trace/trace_events.c
@@ -925,15 +925,6 @@ event_create_dir(struct ftrace_event_call *call, struct dentry *d_events,
 	if (strcmp(call->system, TRACE_SYSTEM) != 0)
 		d_events = event_subsystem_dir(call->system, d_events);
 
-	if (call->raw_init) {
-		ret = call->raw_init();
-		if (ret < 0) {
-			pr_warning("Could not initialize trace point"
-				   " events/%s\n", call->name);
-			return ret;
-		}
-	}
-
 	call->dir = debugfs_create_dir(call->name, d_events);
 	if (!call->dir) {
 		pr_warning("Could not create debugfs "
@@ -1058,6 +1049,7 @@ static void trace_module_add_events(struct module *mod)
 	struct ftrace_module_file_ops *file_ops = NULL;
 	struct ftrace_event_call *call, *start, *end;
 	struct dentry *d_events;
+	int ret;
 
 	start = mod->trace_events;
 	end = mod->trace_events + mod->num_trace_events;
@@ -1073,7 +1065,15 @@ static void trace_module_add_events(struct module *mod)
 		/* The linker may leave blanks */
 		if (!call->name)
 			continue;
-
+		if (call->raw_init) {
+			ret = call->raw_init();
+			if (ret < 0) {
+				if (ret != -ENOSYS)
+					pr_warning("Could not initialize trace "
+					"point events/%s\n", call->name);
+				continue;
+			}
+		}
 		/*
 		 * This module has events, create file ops for this module
 		 * if not already done.
@@ -1225,6 +1225,15 @@ static __init int event_trace_init(void)
 		/* The linker may leave blanks */
 		if (!call->name)
 			continue;
+		if (call->raw_init) {
+			ret = call->raw_init();
+			if (ret < 0) {
+				if (ret != -ENOSYS)
+					pr_warning("Could not initialize trace "
+					"point events/%s\n", call->name);
+				continue;
+			}
+		}
 		list_add(&call->list, &ftrace_events);
 		event_create_dir(call, d_events, &ftrace_event_id_fops,
 				 &ftrace_enable_fops, &ftrace_event_filter_fops,
-- 
1.6.2.3
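
The bailout rule added to both loops can be sketched in userspace like
this. The struct, the three init callbacks and the warning counter are
hypothetical names chosen for the illustration; the counter replaces
pr_warning() so the behaviour can be checked. What the sketch shows is
that any negative return skips the event, while only returns other than
-ENOSYS count as a warning.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

struct demo_event {
	const char *name;
	int (*raw_init)(void);	/* optional, may be NULL */
};

static int demo_init_ok(void)     { return 0; }
static int demo_init_nosys(void)  { return -ENOSYS; }
static int demo_init_broken(void) { return -EINVAL; }

static int demo_warnings;	/* counts warnings instead of printing them */

/* Returns the number of events that would be kept (get a directory). */
static int demo_add_events(const struct demo_event *ev, size_t n)
{
	size_t i;
	int added = 0;

	for (i = 0; i < n; i++) {
		if (ev[i].raw_init) {
			int ret = ev[i].raw_init();

			if (ret < 0) {
				/* -ENOSYS: no such syscall here, skip quietly. */
				if (ret != -ENOSYS)
					demo_warnings++;
				continue;
			}
		}
		added++;
	}
	return added;
}
```

Given one event of each kind plus one without raw_init, two events are
kept and a single warning is recorded for the -EINVAL case.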



* [PATCH 08/16] tracing: Add ftrace_event_call void * 'data' field
  2009-08-11 18:48 [GIT PULL] tracing: Syscalls trace events + perf support Frederic Weisbecker
                   ` (6 preceding siblings ...)
  2009-08-11 18:48 ` [PATCH 07/16] tracing: Raw_init() bailout in trace event register fail case Frederic Weisbecker
@ 2009-08-11 18:48 ` Frederic Weisbecker
  2009-08-11 18:48 ` [PATCH 09/16] tracing: Add trace events for each syscall entry/exit Frederic Weisbecker
                   ` (9 subsequent siblings)
  17 siblings, 0 replies; 54+ messages in thread
From: Frederic Weisbecker @ 2009-08-11 18:48 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: LKML, Jason Baron, Lai Jiangshan, Steven Rostedt, Peter Zijlstra,
	Mathieu Desnoyers, Jiaying Zhang, Martin Bligh, Li Zefan,
	Masami Hiramatsu, Frederic Weisbecker

From: Jason Baron <jbaron@redhat.com>

Add an optional void * pointer to 'ftrace_event_call' that is passed
to regfunc and unregfunc.

This prepares for the creation of syscall tracepoints: we pass the name
of the syscall we want to trace and then retrieve its number through
our arch syscall table.

Signed-off-by: Jason Baron <jbaron@redhat.com>
Cc: Lai Jiangshan <laijs@cn.fujitsu.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
Cc: Jiaying Zhang <jiayingz@google.com>
Cc: Martin Bligh <mbligh@google.com>
Cc: Li Zefan <lizf@cn.fujitsu.com>
Cc: Masami Hiramatsu <mhiramat@redhat.com>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
---
 include/linux/ftrace_event.h |    5 +++--
 include/trace/ftrace.h       |    4 ++--
 kernel/trace/trace_events.c  |    4 ++--
 3 files changed, 7 insertions(+), 6 deletions(-)

diff --git a/include/linux/ftrace_event.h b/include/linux/ftrace_event.h
index ac8c6f8..8544f12 100644
--- a/include/linux/ftrace_event.h
+++ b/include/linux/ftrace_event.h
@@ -112,8 +112,8 @@ struct ftrace_event_call {
 	struct dentry		*dir;
 	struct trace_event	*event;
 	int			enabled;
-	int			(*regfunc)(void);
-	void			(*unregfunc)(void);
+	int			(*regfunc)(void *);
+	void			(*unregfunc)(void *);
 	int			id;
 	int			(*raw_init)(void);
 	int			(*show_format)(struct trace_seq *s);
@@ -122,6 +122,7 @@ struct ftrace_event_call {
 	int			filter_active;
 	struct event_filter	*filter;
 	void			*mod;
+	void			*data;
 
 	atomic_t		profile_count;
 	int			(*profile_enable)(struct ftrace_event_call *);
diff --git a/include/trace/ftrace.h b/include/trace/ftrace.h
index 25d3b02..46d81b5 100644
--- a/include/trace/ftrace.h
+++ b/include/trace/ftrace.h
@@ -568,7 +568,7 @@ static void ftrace_raw_event_##call(proto)				\
 		trace_nowake_buffer_unlock_commit(event, irq_flags, pc); \
 }									\
 									\
-static int ftrace_raw_reg_event_##call(void)				\
+static int ftrace_raw_reg_event_##call(void *ptr)			\
 {									\
 	int ret;							\
 									\
@@ -579,7 +579,7 @@ static int ftrace_raw_reg_event_##call(void)				\
 	return ret;							\
 }									\
 									\
-static void ftrace_raw_unreg_event_##call(void)				\
+static void ftrace_raw_unreg_event_##call(void *ptr)			\
 {									\
 	unregister_trace_##call(ftrace_raw_event_##call);		\
 }									\
diff --git a/kernel/trace/trace_events.c b/kernel/trace/trace_events.c
index f95f847..1d289e2 100644
--- a/kernel/trace/trace_events.c
+++ b/kernel/trace/trace_events.c
@@ -86,14 +86,14 @@ static void ftrace_event_enable_disable(struct ftrace_event_call *call,
 		if (call->enabled) {
 			call->enabled = 0;
 			tracing_stop_cmdline_record();
-			call->unregfunc();
+			call->unregfunc(call->data);
 		}
 		break;
 	case 1:
 		if (!call->enabled) {
 			call->enabled = 1;
 			tracing_start_cmdline_record();
-			call->regfunc();
+			call->regfunc(call->data);
 		}
 		break;
 	}
-- 
1.6.2.3
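
A small userspace sketch of how the new 'data' field travels from the
event descriptor into the callbacks. The struct below is a hypothetical
reduction of ftrace_event_call to the four members that matter here, and
demo_last_name exists only to record what the callback received.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, reduced version of struct ftrace_event_call. */
struct demo_event_call {
	int enabled;
	int (*regfunc)(void *data);
	void (*unregfunc)(void *data);
	void *data;
};

static const char *demo_last_name;	/* what the callback last received */

static int demo_reg(void *data)
{
	demo_last_name = data;
	return 0;
}

static void demo_unreg(void *data)
{
	(void)data;
	demo_last_name = NULL;
}

/* Mirrors the enable/disable switch: hand call->data to the callback. */
static void demo_enable_disable(struct demo_event_call *call, int enable)
{
	if (enable && !call->enabled) {
		call->enabled = 1;
		call->regfunc(call->data);
	} else if (!enable && call->enabled) {
		call->enabled = 0;
		call->unregfunc(call->data);
	}
}
```

Storing a syscall name string in 'data' lets one shared pair of
callbacks serve many events, which is how the later patches in this
series use the field.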



* [PATCH 09/16] tracing: Add trace events for each syscall entry/exit
  2009-08-11 18:48 [GIT PULL] tracing: Syscalls trace events + perf support Frederic Weisbecker
                   ` (7 preceding siblings ...)
  2009-08-11 18:48 ` [PATCH 08/16] tracing: Add ftrace_event_call void * 'data' field Frederic Weisbecker
@ 2009-08-11 18:48 ` Frederic Weisbecker
  2009-08-11 18:48 ` [PATCH 10/16] tracing: Add individual syscalls tracepoint id support Frederic Weisbecker
                   ` (8 subsequent siblings)
  17 siblings, 0 replies; 54+ messages in thread
From: Frederic Weisbecker @ 2009-08-11 18:48 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: LKML, Jason Baron, Lai Jiangshan, Steven Rostedt, Peter Zijlstra,
	Mathieu Desnoyers, Jiaying Zhang, Martin Bligh, Li Zefan,
	Masami Hiramatsu, Frederic Weisbecker

From: Jason Baron <jbaron@redhat.com>

Layer Frederic's syscall tracer on tracepoints. We create trace events
by hooking into the SYSCALL_DEFINE macros. This allows us to
individually toggle syscall entry and exit points on/off.

Signed-off-by: Jason Baron <jbaron@redhat.com>
Cc: Lai Jiangshan <laijs@cn.fujitsu.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
Cc: Jiaying Zhang <jiayingz@google.com>
Cc: Martin Bligh <mbligh@google.com>
Cc: Li Zefan <lizf@cn.fujitsu.com>
Cc: Masami Hiramatsu <mhiramat@redhat.com>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
---
 include/linux/syscalls.h      |   61 +++++++++++++-
 include/trace/syscall.h       |   18 ++--
 kernel/trace/trace_syscalls.c |  183 ++++++++++++++++++++---------------------
 3 files changed, 159 insertions(+), 103 deletions(-)

diff --git a/include/linux/syscalls.h b/include/linux/syscalls.h
index 80de700..5e5b4d3 100644
--- a/include/linux/syscalls.h
+++ b/include/linux/syscalls.h
@@ -64,6 +64,7 @@ struct perf_counter_attr;
 #include <linux/sem.h>
 #include <asm/siginfo.h>
 #include <asm/signal.h>
+#include <linux/unistd.h>
 #include <linux/quota.h>
 #include <linux/key.h>
 #include <trace/syscall.h>
@@ -112,6 +113,59 @@ struct perf_counter_attr;
 #define __SC_STR_TDECL5(t, a, ...)	#t, __SC_STR_TDECL4(__VA_ARGS__)
 #define __SC_STR_TDECL6(t, a, ...)	#t, __SC_STR_TDECL5(__VA_ARGS__)
 
+
+#define SYSCALL_TRACE_ENTER_EVENT(sname)				\
+	static struct ftrace_event_call event_enter_##sname;		\
+	static int init_enter_##sname(void)				\
+	{								\
+		int num;						\
+		num = syscall_name_to_nr("sys"#sname);			\
+		if (num < 0)						\
+			return -ENOSYS;					\
+		register_ftrace_event(&event_syscall_enter);		\
+		INIT_LIST_HEAD(&event_enter_##sname.fields);		\
+		init_preds(&event_enter_##sname);			\
+		return 0;						\
+	}								\
+	static struct ftrace_event_call __used				\
+	  __attribute__((__aligned__(4)))				\
+	  __attribute__((section("_ftrace_events")))			\
+	  event_enter_##sname = {					\
+		.name                   = "sys_enter"#sname,		\
+		.system                 = "syscalls",			\
+		.event                  = &event_syscall_enter,		\
+		.raw_init		= init_enter_##sname,		\
+		.regfunc		= reg_event_syscall_enter,	\
+		.unregfunc		= unreg_event_syscall_enter,	\
+		.data			= "sys"#sname,			\
+	}
+
+#define SYSCALL_TRACE_EXIT_EVENT(sname)					\
+	static struct ftrace_event_call event_exit_##sname;		\
+	static int init_exit_##sname(void)				\
+	{								\
+		int num;						\
+		num = syscall_name_to_nr("sys"#sname);			\
+		if (num < 0)						\
+			return -ENOSYS;					\
+		register_ftrace_event(&event_syscall_exit);		\
+		INIT_LIST_HEAD(&event_exit_##sname.fields);		\
+		init_preds(&event_exit_##sname);			\
+		return 0;						\
+	}								\
+	static struct ftrace_event_call __used				\
+	  __attribute__((__aligned__(4)))				\
+	  __attribute__((section("_ftrace_events")))			\
+	  event_exit_##sname = {					\
+		.name                   = "sys_exit"#sname,		\
+		.system                 = "syscalls",			\
+		.event                  = &event_syscall_exit,		\
+		.raw_init		= init_exit_##sname,		\
+		.regfunc		= reg_event_syscall_exit,	\
+		.unregfunc		= unreg_event_syscall_exit,	\
+		.data			= "sys"#sname,			\
+	}
+
 #define SYSCALL_METADATA(sname, nb)				\
 	static const struct syscall_metadata __used		\
 	  __attribute__((__aligned__(4)))			\
@@ -121,7 +175,9 @@ struct perf_counter_attr;
 		.nb_args 	= nb,				\
 		.types		= types_##sname,		\
 		.args		= args_##sname,			\
-	}
+	};							\
+	SYSCALL_TRACE_ENTER_EVENT(sname);			\
+	SYSCALL_TRACE_EXIT_EVENT(sname);
 
 #define SYSCALL_DEFINE0(sname)					\
 	static const struct syscall_metadata __used		\
@@ -131,8 +187,9 @@ struct perf_counter_attr;
 		.name 		= "sys_"#sname,			\
 		.nb_args 	= 0,				\
 	};							\
+	SYSCALL_TRACE_ENTER_EVENT(_##sname);			\
+	SYSCALL_TRACE_EXIT_EVENT(_##sname);			\
 	asmlinkage long sys_##sname(void)
-
 #else
 #define SYSCALL_DEFINE0(name)	   asmlinkage long sys_##name(void)
 #endif
diff --git a/include/trace/syscall.h b/include/trace/syscall.h
index 3951d77..73fb8b4 100644
--- a/include/trace/syscall.h
+++ b/include/trace/syscall.h
@@ -2,6 +2,8 @@
 #define _TRACE_SYSCALL_H
 
 #include <linux/tracepoint.h>
+#include <linux/unistd.h>
+#include <linux/ftrace_event.h>
 
 #include <asm/ptrace.h>
 
@@ -40,15 +42,13 @@ struct syscall_metadata {
 
 #ifdef CONFIG_FTRACE_SYSCALLS
 extern struct syscall_metadata *syscall_nr_to_meta(int nr);
-extern void start_ftrace_syscalls(void);
-extern void stop_ftrace_syscalls(void);
-extern void ftrace_syscall_enter(struct pt_regs *regs);
-extern void ftrace_syscall_exit(struct pt_regs *regs);
-#else
-static inline void start_ftrace_syscalls(void)			{ }
-static inline void stop_ftrace_syscalls(void)			{ }
-static inline void ftrace_syscall_enter(struct pt_regs *regs)	{ }
-static inline void ftrace_syscall_exit(struct pt_regs *regs)	{ }
+extern int syscall_name_to_nr(char *name);
+extern struct trace_event event_syscall_enter;
+extern struct trace_event event_syscall_exit;
+extern int reg_event_syscall_enter(void *ptr);
+extern void unreg_event_syscall_enter(void *ptr);
+extern int reg_event_syscall_exit(void *ptr);
+extern void unreg_event_syscall_exit(void *ptr);
 #endif
 
 #endif /* _TRACE_SYSCALL_H */
diff --git a/kernel/trace/trace_syscalls.c b/kernel/trace/trace_syscalls.c
index 08aed43..c7ae25e 100644
--- a/kernel/trace/trace_syscalls.c
+++ b/kernel/trace/trace_syscalls.c
@@ -1,15 +1,16 @@
 #include <trace/syscall.h>
 #include <linux/kernel.h>
+#include <linux/ftrace.h>
 #include <asm/syscall.h>
 
 #include "trace_output.h"
 #include "trace.h"
 
-/* Keep a counter of the syscall tracing users */
-static int refcount;
-
-/* Prevent from races on thread flags toggling */
 static DEFINE_MUTEX(syscall_trace_lock);
+static int sys_refcount_enter;
+static int sys_refcount_exit;
+static DECLARE_BITMAP(enabled_enter_syscalls, FTRACE_SYSCALL_MAX);
+static DECLARE_BITMAP(enabled_exit_syscalls, FTRACE_SYSCALL_MAX);
 
 /* Option to display the parameters types */
 enum {
@@ -95,53 +96,7 @@ print_syscall_exit(struct trace_iterator *iter, int flags)
 	return TRACE_TYPE_HANDLED;
 }
 
-void start_ftrace_syscalls(void)
-{
-	unsigned long flags;
-	struct task_struct *g, *t;
-
-	mutex_lock(&syscall_trace_lock);
-
-	/* Don't enable the flag on the tasks twice */
-	if (++refcount != 1)
-		goto unlock;
-
-	read_lock_irqsave(&tasklist_lock, flags);
-
-	do_each_thread(g, t) {
-		set_tsk_thread_flag(t, TIF_SYSCALL_FTRACE);
-	} while_each_thread(g, t);
-
-	read_unlock_irqrestore(&tasklist_lock, flags);
-
-unlock:
-	mutex_unlock(&syscall_trace_lock);
-}
-
-void stop_ftrace_syscalls(void)
-{
-	unsigned long flags;
-	struct task_struct *g, *t;
-
-	mutex_lock(&syscall_trace_lock);
-
-	/* There are perhaps still some users */
-	if (--refcount)
-		goto unlock;
-
-	read_lock_irqsave(&tasklist_lock, flags);
-
-	do_each_thread(g, t) {
-		clear_tsk_thread_flag(t, TIF_SYSCALL_FTRACE);
-	} while_each_thread(g, t);
-
-	read_unlock_irqrestore(&tasklist_lock, flags);
-
-unlock:
-	mutex_unlock(&syscall_trace_lock);
-}
-
-void ftrace_syscall_enter(struct pt_regs *regs)
+void ftrace_syscall_enter(struct pt_regs *regs, long id)
 {
 	struct syscall_trace_enter *entry;
 	struct syscall_metadata *sys_data;
@@ -150,6 +105,8 @@ void ftrace_syscall_enter(struct pt_regs *regs)
 	int syscall_nr;
 
 	syscall_nr = syscall_get_nr(current, regs);
+	if (!test_bit(syscall_nr, enabled_enter_syscalls))
+		return;
 
 	sys_data = syscall_nr_to_meta(syscall_nr);
 	if (!sys_data)
@@ -170,7 +127,7 @@ void ftrace_syscall_enter(struct pt_regs *regs)
 	trace_wake_up();
 }
 
-void ftrace_syscall_exit(struct pt_regs *regs)
+void ftrace_syscall_exit(struct pt_regs *regs, long ret)
 {
 	struct syscall_trace_exit *entry;
 	struct syscall_metadata *sys_data;
@@ -178,6 +135,8 @@ void ftrace_syscall_exit(struct pt_regs *regs)
 	int syscall_nr;
 
 	syscall_nr = syscall_get_nr(current, regs);
+	if (!test_bit(syscall_nr, enabled_exit_syscalls))
+		return;
 
 	sys_data = syscall_nr_to_meta(syscall_nr);
 	if (!sys_data)
@@ -196,54 +155,94 @@ void ftrace_syscall_exit(struct pt_regs *regs)
 	trace_wake_up();
 }
 
-static int init_syscall_tracer(struct trace_array *tr)
+int reg_event_syscall_enter(void *ptr)
 {
-	start_ftrace_syscalls();
-
-	return 0;
+	int ret = 0;
+	int num;
+	char *name;
+
+	name = (char *)ptr;
+	num = syscall_name_to_nr(name);
+	if (num < 0 || num >= FTRACE_SYSCALL_MAX)
+		return -ENOSYS;
+	mutex_lock(&syscall_trace_lock);
+	if (!sys_refcount_enter)
+		ret = register_trace_syscall_enter(ftrace_syscall_enter);
+	if (ret) {
+		pr_info("event trace: Could not activate "
+				"syscall entry trace point\n");
+	} else {
+		set_bit(num, enabled_enter_syscalls);
+		sys_refcount_enter++;
+	}
+	mutex_unlock(&syscall_trace_lock);
+	return ret;
 }
 
-static void reset_syscall_tracer(struct trace_array *tr)
+void unreg_event_syscall_enter(void *ptr)
 {
-	stop_ftrace_syscalls();
-	tracing_reset_online_cpus(tr);
-}
-
-static struct trace_event syscall_enter_event = {
-	.type	 	= TRACE_SYSCALL_ENTER,
-	.trace		= print_syscall_enter,
-};
-
-static struct trace_event syscall_exit_event = {
-	.type	 	= TRACE_SYSCALL_EXIT,
-	.trace		= print_syscall_exit,
-};
+	int num;
+	char *name;
 
-static struct tracer syscall_tracer __read_mostly = {
-	.name	     	= "syscall",
-	.init		= init_syscall_tracer,
-	.reset		= reset_syscall_tracer,
-	.flags		= &syscalls_flags,
-};
+	name = (char *)ptr;
+	num = syscall_name_to_nr(name);
+	if (num < 0 || num >= FTRACE_SYSCALL_MAX)
+		return;
+	mutex_lock(&syscall_trace_lock);
+	sys_refcount_enter--;
+	clear_bit(num, enabled_enter_syscalls);
+	if (!sys_refcount_enter)
+		unregister_trace_syscall_enter(ftrace_syscall_enter);
+	mutex_unlock(&syscall_trace_lock);
+}
 
-__init int register_ftrace_syscalls(void)
+int reg_event_syscall_exit(void *ptr)
 {
-	int ret;
-
-	ret = register_ftrace_event(&syscall_enter_event);
-	if (!ret) {
-		printk(KERN_WARNING "event %d failed to register\n",
-		       syscall_enter_event.type);
-		WARN_ON_ONCE(1);
+	int ret = 0;
+	int num;
+	char *name;
+
+	name = (char *)ptr;
+	num = syscall_name_to_nr(name);
+	if (num < 0 || num >= FTRACE_SYSCALL_MAX)
+		return -ENOSYS;
+	mutex_lock(&syscall_trace_lock);
+	if (!sys_refcount_exit)
+		ret = register_trace_syscall_exit(ftrace_syscall_exit);
+	if (ret) {
+		pr_info("event trace: Could not activate "
+				"syscall exit trace point\n");
+	} else {
+		set_bit(num, enabled_exit_syscalls);
+		sys_refcount_exit++;
 	}
+	mutex_unlock(&syscall_trace_lock);
+	return ret;
+}
 
-	ret = register_ftrace_event(&syscall_exit_event);
-	if (!ret) {
-		printk(KERN_WARNING "event %d failed to register\n",
-		       syscall_exit_event.type);
-		WARN_ON_ONCE(1);
-	}
+void unreg_event_syscall_exit(void *ptr)
+{
+	int num;
+	char *name;
 
-	return register_tracer(&syscall_tracer);
+	name = (char *)ptr;
+	num = syscall_name_to_nr(name);
+	if (num < 0 || num >= FTRACE_SYSCALL_MAX)
+		return;
+	mutex_lock(&syscall_trace_lock);
+	sys_refcount_exit--;
+	clear_bit(num, enabled_exit_syscalls);
+	if (!sys_refcount_exit)
+		unregister_trace_syscall_exit(ftrace_syscall_exit);
+	mutex_unlock(&syscall_trace_lock);
 }
-device_initcall(register_ftrace_syscalls);
+
+struct trace_event event_syscall_enter = {
+	.trace			= print_syscall_enter,
+	.type			= TRACE_SYSCALL_ENTER
+};
+
+struct trace_event event_syscall_exit = {
+	.trace			= print_syscall_exit,
+	.type			= TRACE_SYSCALL_EXIT
+};
-- 
1.6.2.3



* [PATCH 10/16] tracing: Add individual syscalls tracepoint id support
  2009-08-11 18:48 [GIT PULL] tracing: Syscalls trace events + perf support Frederic Weisbecker
                   ` (8 preceding siblings ...)
  2009-08-11 18:48 ` [PATCH 09/16] tracing: Add trace events for each syscall entry/exit Frederic Weisbecker
@ 2009-08-11 18:48 ` Frederic Weisbecker
  2009-08-11 18:49 ` [PATCH 11/16] tracing: Add perf counter support for syscalls tracing Frederic Weisbecker
                   ` (7 subsequent siblings)
  17 siblings, 0 replies; 54+ messages in thread
From: Frederic Weisbecker @ 2009-08-11 18:48 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: LKML, Jason Baron, Lai Jiangshan, Steven Rostedt, Peter Zijlstra,
	Mathieu Desnoyers, Jiaying Zhang, Martin Bligh, Li Zefan,
	Masami Hiramatsu, Frederic Weisbecker

From: Jason Baron <jbaron@redhat.com>

The syscall tracepoints currently generate a single event id that is
shared by all syscall events.

This patch associates an id with each syscall trace event, so that we
can identify each syscall trace event using the 'perf' tool.
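
As a rough userspace sketch of that idea: each syscall's metadata keeps
the ids handed out for its enter and exit events, and the print path
refuses a record whose type does not match the stored id. All sketch_*
names and the two-entry table are hypothetical; the id allocator only
mimics the convention visible in the diff that register_ftrace_event()
returns 0 on failure.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, trimmed-down counterpart of struct syscall_metadata. */
struct sketch_meta {
	const char *name;
	int enter_id;
	int exit_id;
};

static struct sketch_meta sketch_table[] = {
	{ "sys_read", 0, 0 },
	{ "sys_write", 0, 0 },
};

#define SKETCH_NR ((int)(sizeof(sketch_table) / sizeof(sketch_table[0])))

/* Ids start at 1 so that 0 can mean "registration failed". */
static int sketch_next_id = 1;

static int sketch_register_event(void)
{
	return sketch_next_id++;
}

/* Give syscall nr its own enter and exit ids, as the init hooks do. */
int sketch_init_ids(int nr)
{
	int enter_id, exit_id;

	if (nr < 0 || nr >= SKETCH_NR)
		return -1;
	enter_id = sketch_register_event();
	exit_id = sketch_register_event();
	if (!enter_id || !exit_id)
		return -1;
	assert(enter_id != exit_id);
	sketch_table[nr].enter_id = enter_id;
	sketch_table[nr].exit_id = exit_id;
	return 0;
}

/*
 * Print-side check: return the syscall name only if the record's type
 * matches the enter id stored for that syscall, NULL otherwise.
 */
const char *sketch_enter_name(int nr, int record_type)
{
	if (nr < 0 || nr >= SKETCH_NR)
		return NULL;
	if (sketch_table[nr].enter_id != record_type)
		return NULL;
	return sketch_table[nr].name;
}
```

With distinct ids, a consumer such as perf can select one syscall event
by id instead of receiving every syscall under a shared type.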

Signed-off-by: Jason Baron <jbaron@redhat.com>
Cc: Lai Jiangshan <laijs@cn.fujitsu.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
Cc: Jiaying Zhang <jiayingz@google.com>
Cc: Martin Bligh <mbligh@google.com>
Cc: Li Zefan <lizf@cn.fujitsu.com>
Cc: Masami Hiramatsu <mhiramat@redhat.com>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
---
 arch/x86/kernel/ftrace.c      |   10 ++++++++++
 include/linux/syscalls.h      |   22 ++++++++++++++++++----
 include/trace/syscall.h       |    8 ++++++++
 kernel/trace/trace.h          |    6 ------
 kernel/trace/trace_syscalls.c |   26 ++++++++++++++++----------
 5 files changed, 52 insertions(+), 20 deletions(-)

diff --git a/arch/x86/kernel/ftrace.c b/arch/x86/kernel/ftrace.c
index 0d93d40..3cff121 100644
--- a/arch/x86/kernel/ftrace.c
+++ b/arch/x86/kernel/ftrace.c
@@ -516,6 +516,16 @@ int syscall_name_to_nr(char *name)
 	return -1;
 }
 
+void set_syscall_enter_id(int num, int id)
+{
+	syscalls_metadata[num]->enter_id = id;
+}
+
+void set_syscall_exit_id(int num, int id)
+{
+	syscalls_metadata[num]->exit_id = id;
+}
+
 static int __init arch_init_ftrace_syscalls(void)
 {
 	int i;
diff --git a/include/linux/syscalls.h b/include/linux/syscalls.h
index 5e5b4d3..ce4b01c 100644
--- a/include/linux/syscalls.h
+++ b/include/linux/syscalls.h
@@ -116,13 +116,20 @@ struct perf_counter_attr;
 
 #define SYSCALL_TRACE_ENTER_EVENT(sname)				\
 	static struct ftrace_event_call event_enter_##sname;		\
+	struct trace_event enter_syscall_print_##sname = {		\
+		.trace                  = print_syscall_enter,		\
+	};								\
 	static int init_enter_##sname(void)				\
 	{								\
-		int num;						\
+		int num, id;						\
 		num = syscall_name_to_nr("sys"#sname);			\
 		if (num < 0)						\
 			return -ENOSYS;					\
-		register_ftrace_event(&event_syscall_enter);		\
+		id = register_ftrace_event(&enter_syscall_print_##sname);\
+		if (!id)						\
+			return -ENODEV;					\
+		event_enter_##sname.id = id;				\
+		set_syscall_enter_id(num, id);				\
 		INIT_LIST_HEAD(&event_enter_##sname.fields);		\
 		init_preds(&event_enter_##sname);			\
 		return 0;						\
@@ -142,13 +149,20 @@ struct perf_counter_attr;
 
 #define SYSCALL_TRACE_EXIT_EVENT(sname)					\
 	static struct ftrace_event_call event_exit_##sname;		\
+	struct trace_event exit_syscall_print_##sname = {		\
+		.trace                  = print_syscall_exit,		\
+	};								\
 	static int init_exit_##sname(void)				\
 	{								\
-		int num;						\
+		int num, id;						\
 		num = syscall_name_to_nr("sys"#sname);			\
 		if (num < 0)						\
 			return -ENOSYS;					\
-		register_ftrace_event(&event_syscall_exit);		\
+		id = register_ftrace_event(&exit_syscall_print_##sname);\
+		if (!id)						\
+			return -ENODEV;					\
+		event_exit_##sname.id = id;				\
+		set_syscall_exit_id(num, id);				\
 		INIT_LIST_HEAD(&event_exit_##sname.fields);		\
 		init_preds(&event_exit_##sname);			\
 		return 0;						\
diff --git a/include/trace/syscall.h b/include/trace/syscall.h
index 73fb8b4..df62840 100644
--- a/include/trace/syscall.h
+++ b/include/trace/syscall.h
@@ -32,23 +32,31 @@ DECLARE_TRACE_WITH_CALLBACK(syscall_exit,
  * @nb_args: number of parameters it takes
  * @types: list of types as strings
  * @args: list of args as strings (args[i] matches types[i])
+ * @enter_id: associated ftrace enter event id
+ * @exit_id: associated ftrace exit event id
  */
 struct syscall_metadata {
 	const char	*name;
 	int		nb_args;
 	const char	**types;
 	const char	**args;
+	int		enter_id;
+	int		exit_id;
 };
 
 #ifdef CONFIG_FTRACE_SYSCALLS
 extern struct syscall_metadata *syscall_nr_to_meta(int nr);
 extern int syscall_name_to_nr(char *name);
+void set_syscall_enter_id(int num, int id);
+void set_syscall_exit_id(int num, int id);
 extern struct trace_event event_syscall_enter;
 extern struct trace_event event_syscall_exit;
 extern int reg_event_syscall_enter(void *ptr);
 extern void unreg_event_syscall_enter(void *ptr);
 extern int reg_event_syscall_exit(void *ptr);
 extern void unreg_event_syscall_exit(void *ptr);
+enum print_line_t print_syscall_enter(struct trace_iterator *iter, int flags);
+enum print_line_t print_syscall_exit(struct trace_iterator *iter, int flags);
 #endif
 
 #endif /* _TRACE_SYSCALL_H */
diff --git a/kernel/trace/trace.h b/kernel/trace/trace.h
index d682357..300ef78 100644
--- a/kernel/trace/trace.h
+++ b/kernel/trace/trace.h
@@ -34,8 +34,6 @@ enum trace_type {
 	TRACE_GRAPH_ENT,
 	TRACE_USER_STACK,
 	TRACE_HW_BRANCHES,
-	TRACE_SYSCALL_ENTER,
-	TRACE_SYSCALL_EXIT,
 	TRACE_KMEM_ALLOC,
 	TRACE_KMEM_FREE,
 	TRACE_POWER,
@@ -319,10 +317,6 @@ extern void __ftrace_bad_type(void);
 			  TRACE_KMEM_ALLOC);	\
 		IF_ASSIGN(var, ent, struct kmemtrace_free_entry,	\
 			  TRACE_KMEM_FREE);	\
-		IF_ASSIGN(var, ent, struct syscall_trace_enter,		\
-			  TRACE_SYSCALL_ENTER);				\
-		IF_ASSIGN(var, ent, struct syscall_trace_exit,		\
-			  TRACE_SYSCALL_EXIT);				\
 		__ftrace_bad_type();					\
 	} while (0)
 
diff --git a/kernel/trace/trace_syscalls.c b/kernel/trace/trace_syscalls.c
index c7ae25e..e58a9c1 100644
--- a/kernel/trace/trace_syscalls.c
+++ b/kernel/trace/trace_syscalls.c
@@ -36,14 +36,18 @@ print_syscall_enter(struct trace_iterator *iter, int flags)
 	struct syscall_metadata *entry;
 	int i, ret, syscall;
 
-	trace_assign_type(trace, ent);
-
+	trace = (typeof(trace))ent;
 	syscall = trace->nr;
-
 	entry = syscall_nr_to_meta(syscall);
+
 	if (!entry)
 		goto end;
 
+	if (entry->enter_id != ent->type) {
+		WARN_ON_ONCE(1);
+		goto end;
+	}
+
 	ret = trace_seq_printf(s, "%s(", entry->name);
 	if (!ret)
 		return TRACE_TYPE_PARTIAL_LINE;
@@ -78,16 +82,20 @@ print_syscall_exit(struct trace_iterator *iter, int flags)
 	struct syscall_metadata *entry;
 	int ret;
 
-	trace_assign_type(trace, ent);
-
+	trace = (typeof(trace))ent;
 	syscall = trace->nr;
-
 	entry = syscall_nr_to_meta(syscall);
+
 	if (!entry) {
 		trace_seq_printf(s, "\n");
 		return TRACE_TYPE_HANDLED;
 	}
 
+	if (entry->exit_id != ent->type) {
+		WARN_ON_ONCE(1);
+		return TRACE_TYPE_UNHANDLED;
+	}
+
 	ret = trace_seq_printf(s, "%s -> 0x%lx\n", entry->name,
 				trace->ret);
 	if (!ret)
@@ -114,7 +122,7 @@ void ftrace_syscall_enter(struct pt_regs *regs, long id)
 
 	size = sizeof(*entry) + sizeof(unsigned long) * sys_data->nb_args;
 
-	event = trace_current_buffer_lock_reserve(TRACE_SYSCALL_ENTER, size,
+	event = trace_current_buffer_lock_reserve(sys_data->enter_id, size,
 							0, 0);
 	if (!event)
 		return;
@@ -142,7 +150,7 @@ void ftrace_syscall_exit(struct pt_regs *regs, long ret)
 	if (!sys_data)
 		return;
 
-	event = trace_current_buffer_lock_reserve(TRACE_SYSCALL_EXIT,
+	event = trace_current_buffer_lock_reserve(sys_data->exit_id,
 				sizeof(*entry), 0, 0);
 	if (!event)
 		return;
@@ -239,10 +247,8 @@ void unreg_event_syscall_exit(void *ptr)
 
 struct trace_event event_syscall_enter = {
 	.trace			= print_syscall_enter,
-	.type			= TRACE_SYSCALL_ENTER
 };
 
 struct trace_event event_syscall_exit = {
 	.trace			= print_syscall_exit,
-	.type			= TRACE_SYSCALL_EXIT
 };
-- 
1.6.2.3



* [PATCH 11/16] tracing: Add perf counter support for syscalls tracing
  2009-08-11 18:48 [GIT PULL] tracing: Syscalls trace events + perf support Frederic Weisbecker
                   ` (9 preceding siblings ...)
  2009-08-11 18:48 ` [PATCH 10/16] tracing: Add individual syscalls tracepoint id support Frederic Weisbecker
@ 2009-08-11 18:49 ` Frederic Weisbecker
  2009-08-11 18:49 ` [PATCH 12/16] tracing: Add more namespace area to 'perf list' output Frederic Weisbecker
                   ` (6 subsequent siblings)
  17 siblings, 0 replies; 54+ messages in thread
From: Frederic Weisbecker @ 2009-08-11 18:49 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: LKML, Jason Baron, Lai Jiangshan, Steven Rostedt, Peter Zijlstra,
	Mathieu Desnoyers, Jiaying Zhang, Martin Bligh, Li Zefan,
	Masami Hiramatsu, Frederic Weisbecker

From: Jason Baron <jbaron@redhat.com>

Perf counter support is generated automatically for ordinary trace
events, but syscall trace events need dedicated callbacks to hook into
it.

Make 'perf stat -e syscalls:sys_enter_blah' work with syscall style
tracepoints.
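
The enable/disable callbacks in the diff below use a counter that starts
at -1, so that "first user" and "last user" can each be detected with a
single atomic operation. A minimal C11 sketch of that idiom, assuming
hypothetical sketch_* names and a plain integer in place of the real
probe registration:

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Starts at -1, like .profile_count = ATOMIC_INIT(-1) in the patch. */
static atomic_int sketch_profile_count = -1;
/* Hypothetical stand-in for "the profiling probe is registered". */
static int sketch_registrations;

/* Counterpart of !atomic_inc_return(v): true only for the first user. */
static bool sketch_inc_reaches_zero(atomic_int *v)
{
	return atomic_fetch_add(v, 1) + 1 == 0;
}

/* Counterpart of atomic_add_negative(-1, v): true only for the last user. */
static bool sketch_dec_goes_negative(atomic_int *v)
{
	return atomic_fetch_sub(v, 1) - 1 < 0;
}

int sketch_profile_enable(void)
{
	if (sketch_inc_reaches_zero(&sketch_profile_count))
		sketch_registrations++;
	return 0;
}

void sketch_profile_disable(void)
{
	if (sketch_dec_goes_negative(&sketch_profile_count)) {
		assert(sketch_registrations > 0);
		sketch_registrations--;
	}
}
```

Two enables followed by two disables register once and unregister once,
leaving the counter back at -1.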

Signed-off-by: Jason Baron <jbaron@redhat.com>
Cc: Lai Jiangshan <laijs@cn.fujitsu.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
Cc: Jiaying Zhang <jiayingz@google.com>
Cc: Martin Bligh <mbligh@google.com>
Cc: Li Zefan <lizf@cn.fujitsu.com>
Cc: Masami Hiramatsu <mhiramat@redhat.com>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
---
 include/linux/perf_counter.h  |    2 +
 include/linux/syscalls.h      |   52 +++++++++++++++++-
 include/trace/syscall.h       |    7 +++
 kernel/trace/trace_syscalls.c |  121 +++++++++++++++++++++++++++++++++++++++++
 4 files changed, 181 insertions(+), 1 deletions(-)

diff --git a/include/linux/perf_counter.h b/include/linux/perf_counter.h
index a9d823a..8e6460f 100644
--- a/include/linux/perf_counter.h
+++ b/include/linux/perf_counter.h
@@ -734,6 +734,8 @@ extern int sysctl_perf_counter_mlock;
 extern int sysctl_perf_counter_sample_rate;
 
 extern void perf_counter_init(void);
+extern void perf_tpcounter_event(int event_id, u64 addr, u64 count,
+				 void *record, int entry_size);
 
 #ifndef perf_misc_flags
 #define perf_misc_flags(regs)	(user_mode(regs) ? PERF_EVENT_MISC_USER : \
diff --git a/include/linux/syscalls.h b/include/linux/syscalls.h
index ce4b01c..5541e75 100644
--- a/include/linux/syscalls.h
+++ b/include/linux/syscalls.h
@@ -98,6 +98,53 @@ struct perf_counter_attr;
 #define __SC_TEST5(t5, a5, ...)	__SC_TEST(t5); __SC_TEST4(__VA_ARGS__)
 #define __SC_TEST6(t6, a6, ...)	__SC_TEST(t6); __SC_TEST5(__VA_ARGS__)
 
+#ifdef CONFIG_EVENT_PROFILE
+#define TRACE_SYS_ENTER_PROFILE(sname)					       \
+static int prof_sysenter_enable_##sname(struct ftrace_event_call *event_call)  \
+{									       \
+	int ret = 0;							       \
+	if (!atomic_inc_return(&event_enter_##sname.profile_count))	       \
+		ret = reg_prof_syscall_enter("sys"#sname);		       \
+	return ret;							       \
+}									       \
+									       \
+static void prof_sysenter_disable_##sname(struct ftrace_event_call *event_call)\
+{									       \
+	if (atomic_add_negative(-1, &event_enter_##sname.profile_count))       \
+		unreg_prof_syscall_enter("sys"#sname);			       \
+}
+
+#define TRACE_SYS_EXIT_PROFILE(sname)					       \
+static int prof_sysexit_enable_##sname(struct ftrace_event_call *event_call)   \
+{									       \
+	int ret = 0;							       \
+	if (!atomic_inc_return(&event_exit_##sname.profile_count))	       \
+		ret = reg_prof_syscall_exit("sys"#sname);		       \
+	return ret;							       \
+}									       \
+									       \
+static void prof_sysexit_disable_##sname(struct ftrace_event_call *event_call) \
+{                                                                              \
+	if (atomic_add_negative(-1, &event_exit_##sname.profile_count))	       \
+		unreg_prof_syscall_exit("sys"#sname);			       \
+}
+
+#define TRACE_SYS_ENTER_PROFILE_INIT(sname)				       \
+	.profile_count = ATOMIC_INIT(-1),				       \
+	.profile_enable = prof_sysenter_enable_##sname,			       \
+	.profile_disable = prof_sysenter_disable_##sname,
+
+#define TRACE_SYS_EXIT_PROFILE_INIT(sname)				       \
+	.profile_count = ATOMIC_INIT(-1),				       \
+	.profile_enable = prof_sysexit_enable_##sname,			       \
+	.profile_disable = prof_sysexit_disable_##sname,
+#else
+#define TRACE_SYS_ENTER_PROFILE(sname)
+#define TRACE_SYS_ENTER_PROFILE_INIT(sname)
+#define TRACE_SYS_EXIT_PROFILE(sname)
+#define TRACE_SYS_EXIT_PROFILE_INIT(sname)
+#endif
+
 #ifdef CONFIG_FTRACE_SYSCALLS
 #define __SC_STR_ADECL1(t, a)		#a
 #define __SC_STR_ADECL2(t, a, ...)	#a, __SC_STR_ADECL1(__VA_ARGS__)
@@ -113,7 +160,6 @@ struct perf_counter_attr;
 #define __SC_STR_TDECL5(t, a, ...)	#t, __SC_STR_TDECL4(__VA_ARGS__)
 #define __SC_STR_TDECL6(t, a, ...)	#t, __SC_STR_TDECL5(__VA_ARGS__)
 
-
 #define SYSCALL_TRACE_ENTER_EVENT(sname)				\
 	static struct ftrace_event_call event_enter_##sname;		\
 	struct trace_event enter_syscall_print_##sname = {		\
@@ -134,6 +180,7 @@ struct perf_counter_attr;
 		init_preds(&event_enter_##sname);			\
 		return 0;						\
 	}								\
+	TRACE_SYS_ENTER_PROFILE(sname);					\
 	static struct ftrace_event_call __used				\
 	  __attribute__((__aligned__(4)))				\
 	  __attribute__((section("_ftrace_events")))			\
@@ -145,6 +192,7 @@ struct perf_counter_attr;
 		.regfunc		= reg_event_syscall_enter,	\
 		.unregfunc		= unreg_event_syscall_enter,	\
 		.data			= "sys"#sname,			\
+		TRACE_SYS_ENTER_PROFILE_INIT(sname)			\
 	}
 
 #define SYSCALL_TRACE_EXIT_EVENT(sname)					\
@@ -167,6 +215,7 @@ struct perf_counter_attr;
 		init_preds(&event_exit_##sname);			\
 		return 0;						\
 	}								\
+	TRACE_SYS_EXIT_PROFILE(sname);					\
 	static struct ftrace_event_call __used				\
 	  __attribute__((__aligned__(4)))				\
 	  __attribute__((section("_ftrace_events")))			\
@@ -178,6 +227,7 @@ struct perf_counter_attr;
 		.regfunc		= reg_event_syscall_exit,	\
 		.unregfunc		= unreg_event_syscall_exit,	\
 		.data			= "sys"#sname,			\
+		TRACE_SYS_EXIT_PROFILE_INIT(sname)			\
 	}
 
 #define SYSCALL_METADATA(sname, nb)				\
diff --git a/include/trace/syscall.h b/include/trace/syscall.h
index df62840..3ab6dd1 100644
--- a/include/trace/syscall.h
+++ b/include/trace/syscall.h
@@ -58,5 +58,12 @@ extern void unreg_event_syscall_exit(void *ptr);
 enum print_line_t print_syscall_enter(struct trace_iterator *iter, int flags);
 enum print_line_t print_syscall_exit(struct trace_iterator *iter, int flags);
 #endif
+#ifdef CONFIG_EVENT_PROFILE
+int reg_prof_syscall_enter(char *name);
+void unreg_prof_syscall_enter(char *name);
+int reg_prof_syscall_exit(char *name);
+void unreg_prof_syscall_exit(char *name);
+
+#endif
 
 #endif /* _TRACE_SYSCALL_H */
diff --git a/kernel/trace/trace_syscalls.c b/kernel/trace/trace_syscalls.c
index e58a9c1..f4eaec3 100644
--- a/kernel/trace/trace_syscalls.c
+++ b/kernel/trace/trace_syscalls.c
@@ -1,6 +1,7 @@
 #include <trace/syscall.h>
 #include <linux/kernel.h>
 #include <linux/ftrace.h>
+#include <linux/perf_counter.h>
 #include <asm/syscall.h>
 
 #include "trace_output.h"
@@ -252,3 +253,123 @@ struct trace_event event_syscall_enter = {
 struct trace_event event_syscall_exit = {
 	.trace			= print_syscall_exit,
 };
+
+#ifdef CONFIG_EVENT_PROFILE
+static DECLARE_BITMAP(enabled_prof_enter_syscalls, FTRACE_SYSCALL_MAX);
+static DECLARE_BITMAP(enabled_prof_exit_syscalls, FTRACE_SYSCALL_MAX);
+static int sys_prof_refcount_enter;
+static int sys_prof_refcount_exit;
+
+static void prof_syscall_enter(struct pt_regs *regs, long id)
+{
+	struct syscall_metadata *sys_data;
+	int syscall_nr;
+
+	syscall_nr = syscall_get_nr(current, regs);
+	if (!test_bit(syscall_nr, enabled_prof_enter_syscalls))
+		return;
+
+	sys_data = syscall_nr_to_meta(syscall_nr);
+	if (!sys_data)
+		return;
+
+	perf_tpcounter_event(sys_data->enter_id, 0, 1, NULL, 0);
+}
+
+int reg_prof_syscall_enter(char *name)
+{
+	int ret = 0;
+	int num;
+
+	num = syscall_name_to_nr(name);
+	if (num < 0 || num >= FTRACE_SYSCALL_MAX)
+		return -ENOSYS;
+
+	mutex_lock(&syscall_trace_lock);
+	if (!sys_prof_refcount_enter)
+		ret = register_trace_syscall_enter(prof_syscall_enter);
+	if (ret) {
+		pr_info("event trace: Could not activate "
+				"syscall entry trace point\n");
+	} else {
+		set_bit(num, enabled_prof_enter_syscalls);
+		sys_prof_refcount_enter++;
+	}
+	mutex_unlock(&syscall_trace_lock);
+	return ret;
+}
+
+void unreg_prof_syscall_enter(char *name)
+{
+	int num;
+
+	num = syscall_name_to_nr(name);
+	if (num < 0 || num >= FTRACE_SYSCALL_MAX)
+		return;
+
+	mutex_lock(&syscall_trace_lock);
+	sys_prof_refcount_enter--;
+	clear_bit(num, enabled_prof_enter_syscalls);
+	if (!sys_prof_refcount_enter)
+		unregister_trace_syscall_enter(prof_syscall_enter);
+	mutex_unlock(&syscall_trace_lock);
+}
+
+static void prof_syscall_exit(struct pt_regs *regs, long ret)
+{
+	struct syscall_metadata *sys_data;
+	int syscall_nr;
+
+	syscall_nr = syscall_get_nr(current, regs);
+	if (!test_bit(syscall_nr, enabled_prof_exit_syscalls))
+		return;
+
+	sys_data = syscall_nr_to_meta(syscall_nr);
+	if (!sys_data)
+		return;
+
+	perf_tpcounter_event(sys_data->exit_id, 0, 1, NULL, 0);
+}
+
+int reg_prof_syscall_exit(char *name)
+{
+	int ret = 0;
+	int num;
+
+	num = syscall_name_to_nr(name);
+	if (num < 0 || num >= FTRACE_SYSCALL_MAX)
+		return -ENOSYS;
+
+	mutex_lock(&syscall_trace_lock);
+	if (!sys_prof_refcount_exit)
+		ret = register_trace_syscall_exit(prof_syscall_exit);
+	if (ret) {
+		pr_info("event trace: Could not activate "
+				"syscall exit trace point\n");
+	} else {
+		set_bit(num, enabled_prof_exit_syscalls);
+		sys_prof_refcount_exit++;
+	}
+	mutex_unlock(&syscall_trace_lock);
+	return ret;
+}
+
+void unreg_prof_syscall_exit(char *name)
+{
+	int num;
+
+	num = syscall_name_to_nr(name);
+	if (num < 0 || num >= FTRACE_SYSCALL_MAX)
+		return;
+
+	mutex_lock(&syscall_trace_lock);
+	sys_prof_refcount_exit--;
+	clear_bit(num, enabled_prof_exit_syscalls);
+	if (!sys_prof_refcount_exit)
+		unregister_trace_syscall_exit(prof_syscall_exit);
+	mutex_unlock(&syscall_trace_lock);
+}
+
+#endif
+
+
-- 
1.6.2.3



* [PATCH 12/16] tracing: Add more namespace area to 'perf list' output
  2009-08-11 18:48 [GIT PULL] tracing: Syscalls trace events + perf support Frederic Weisbecker
                   ` (10 preceding siblings ...)
  2009-08-11 18:49 ` [PATCH 11/16] tracing: Add perf counter support for syscalls tracing Frederic Weisbecker
@ 2009-08-11 18:49 ` Frederic Weisbecker
  2009-08-11 18:49 ` [PATCH 13/16] tracing: Convert x86_64 mmap and uname to use DEFINE_SYSCALL Frederic Weisbecker
                   ` (5 subsequent siblings)
  17 siblings, 0 replies; 54+ messages in thread
From: Frederic Weisbecker @ 2009-08-11 18:49 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: LKML, Jason Baron, Lai Jiangshan, Steven Rostedt, Peter Zijlstra,
	Mathieu Desnoyers, Jiaying Zhang, Martin Bligh, Li Zefan,
	Masami Hiramatsu, Frederic Weisbecker

From: Jason Baron <jbaron@redhat.com>

The new syscall tracepoint names can be too long for the name column
of the 'perf list' output. Widen that column from 40 to 42 characters.
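
For illustration, here is how the "%-42s" conversion behaves: the name
is left-justified and padded to 42 columns, so the bracketed type lands
at the same offset for any name of up to 42 characters. The helper name
sketch_format_event_line is hypothetical, and the event name and type
string in the usage note are only examples.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Format one 'perf list'-style line into buf: two leading spaces, the
 * name padded to 42 columns, then the type in brackets. Returns what
 * snprintf() returns.
 */
int sketch_format_event_line(char *buf, size_t len,
			     const char *name, const char *type)
{
	assert(buf != NULL && len > 0);
	return snprintf(buf, len, "  %-42s [%s]\n", name, type);
}
```

Calling it with "syscalls:sys_enter_sched_get_priority_max" and with a
short name such as "sched:sched_switch" puts the '[' at the same column
in both lines, whereas a 40-column field would push the longer name's
type one column to the right.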

Signed-off-by: Jason Baron <jbaron@redhat.com>
Cc: Lai Jiangshan <laijs@cn.fujitsu.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
Cc: Jiaying Zhang <jiayingz@google.com>
Cc: Martin Bligh <mbligh@google.com>
Cc: Li Zefan <lizf@cn.fujitsu.com>
Cc: Masami Hiramatsu <mhiramat@redhat.com>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
---
 tools/perf/util/parse-events.c |    8 ++++----
 1 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/tools/perf/util/parse-events.c b/tools/perf/util/parse-events.c
index 4858d83..a5d661b 100644
--- a/tools/perf/util/parse-events.c
+++ b/tools/perf/util/parse-events.c
@@ -606,7 +606,7 @@ static void print_tracepoint_events(void)
 								evt_path, st) {
 			snprintf(evt_path, MAXPATHLEN, "%s:%s",
 				 sys_dirent.d_name, evt_dirent.d_name);
-			fprintf(stderr, "  %-40s [%s]\n", evt_path,
+			fprintf(stderr, "  %-42s [%s]\n", evt_path,
 				event_type_descriptors[PERF_TYPE_TRACEPOINT+1]);
 		}
 		closedir(evt_dir);
@@ -640,7 +640,7 @@ void print_events(void)
 			sprintf(name, "%s OR %s", syms->symbol, syms->alias);
 		else
 			strcpy(name, syms->symbol);
-		fprintf(stderr, "  %-40s [%s]\n", name,
+		fprintf(stderr, "  %-42s [%s]\n", name,
 			event_type_descriptors[type]);
 
 		prev_type = type;
@@ -654,7 +654,7 @@ void print_events(void)
 				continue;
 
 			for (i = 0; i < PERF_COUNT_HW_CACHE_RESULT_MAX; i++) {
-				fprintf(stderr, "  %-40s [%s]\n",
+				fprintf(stderr, "  %-42s [%s]\n",
 					event_cache_name(type, op, i),
 					event_type_descriptors[4]);
 			}
@@ -662,7 +662,7 @@ void print_events(void)
 	}
 
 	fprintf(stderr, "\n");
-	fprintf(stderr, "  %-40s [raw hardware event descriptor]\n",
+	fprintf(stderr, "  %-42s [raw hardware event descriptor]\n",
 		"rNNN");
 	fprintf(stderr, "\n");
 
-- 
1.6.2.3



* [PATCH 13/16] tracing: Convert x86_64 mmap and uname to use DEFINE_SYSCALL
  2009-08-11 18:48 [GIT PULL] tracing: Syscalls trace events + perf support Frederic Weisbecker
                   ` (11 preceding siblings ...)
  2009-08-11 18:49 ` [PATCH 12/16] tracing: Add more namespace area to 'perf list' output Frederic Weisbecker
@ 2009-08-11 18:49 ` Frederic Weisbecker
  2009-08-11 18:49 ` [PATCH 14/16] tracing: Add ftrace event call parameter to its field descriptor handler Frederic Weisbecker
                   ` (4 subsequent siblings)
  17 siblings, 0 replies; 54+ messages in thread
From: Frederic Weisbecker @ 2009-08-11 18:49 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: LKML, Jason Baron, Lai Jiangshan, Steven Rostedt, Peter Zijlstra,
	Mathieu Desnoyers, Jiaying Zhang, Martin Bligh, Li Zefan,
	Masami Hiramatsu, Frederic Weisbecker

From: Jason Baron <jbaron@redhat.com>

A number of syscalls are not defined through the SYSCALL_DEFINEx()
macros. I'm not sure why.
Convert x86_64 uname and mmap to use SYSCALL_DEFINE1() and
SYSCALL_DEFINE6() respectively.

Signed-off-by: Jason Baron <jbaron@redhat.com>
Cc: Lai Jiangshan <laijs@cn.fujitsu.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
Cc: Jiaying Zhang <jiayingz@google.com>
Cc: Martin Bligh <mbligh@google.com>
Cc: Li Zefan <lizf@cn.fujitsu.com>
Cc: Masami Hiramatsu <mhiramat@redhat.com>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
---
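For illustration, a simplified userspace sketch of why the macro form helps
tracing. DEMO_SYSCALL_DEFINE1() and every demo_* name are hypothetical; this
is not the kernel's SYSCALL_DEFINE1(), only the general idea of passing types
and names as separate macro arguments so they can be recorded as metadata.

```c
#include <assert.h>
#include <string.h>

/*
 * Simplified, hypothetical stand-in for a one-argument syscall
 * definition macro. Because the type and the name arrive as separate
 * macro arguments, the macro can stringify them into metadata tables
 * and still emit an ordinary function definition.
 */
#define DEMO_SYSCALL_DEFINE1(name, t1, a1)				\
	static const char *const demo_types_##name[] = { #t1 };	\
	static const char *const demo_args_##name[] = { #a1 };		\
	long demo_sys_##name(t1 a1)

/* The body follows the macro just as it follows a plain prototype. */
DEMO_SYSCALL_DEFINE1(twice, long, value)
{
	return 2 * value;
}

/* Returns 1 if the recorded metadata matches the given type and name. */
int demo_twice_matches(const char *type, const char *arg)
{
	assert(type != NULL && arg != NULL);
	return strcmp(demo_types_twice[0], type) == 0 &&
	       strcmp(demo_args_twice[0], arg) == 0;
}
```

A syscall written as a plain "asmlinkage long sys_foo(...)" prototype gives
the preprocessor nothing to stringify, so no such table can be generated for
it.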
 arch/x86/kernel/sys_x86_64.c |    8 ++++----
 1 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/arch/x86/kernel/sys_x86_64.c b/arch/x86/kernel/sys_x86_64.c
index 6bc211a..45e00eb 100644
--- a/arch/x86/kernel/sys_x86_64.c
+++ b/arch/x86/kernel/sys_x86_64.c
@@ -18,9 +18,9 @@
 #include <asm/ia32.h>
 #include <asm/syscalls.h>
 
-asmlinkage long sys_mmap(unsigned long addr, unsigned long len,
-		unsigned long prot, unsigned long flags,
-		unsigned long fd, unsigned long off)
+SYSCALL_DEFINE6(mmap, unsigned long, addr, unsigned long, len,
+		unsigned long, prot, unsigned long, flags,
+		unsigned long, fd, unsigned long, off)
 {
 	long error;
 	struct file *file;
@@ -226,7 +226,7 @@ bottomup:
 }
 
 
-asmlinkage long sys_uname(struct new_utsname __user *name)
+SYSCALL_DEFINE1(uname, struct new_utsname __user *, name)
 {
 	int err;
 	down_read(&uts_sem);
-- 
1.6.2.3



* [PATCH 14/16] tracing: Add ftrace event call parameter to its field descriptor handler
  2009-08-11 18:48 [GIT PULL] tracing: Syscalls trace events + perf support Frederic Weisbecker
                   ` (12 preceding siblings ...)
  2009-08-11 18:49 ` [PATCH 13/16] tracing: Convert x86_64 mmap and uname to use DEFINE_SYSCALL Frederic Weisbecker
@ 2009-08-11 18:49 ` Frederic Weisbecker
  2009-08-11 18:49 ` [PATCH 15/16] tracing: Add fields format definition for syscall events Frederic Weisbecker
                   ` (3 subsequent siblings)
  17 siblings, 0 replies; 54+ messages in thread
From: Frederic Weisbecker @ 2009-08-11 18:49 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: LKML, Frederic Weisbecker, Lai Jiangshan, Steven Rostedt,
	Peter Zijlstra, Mathieu Desnoyers, Jiaying Zhang, Martin Bligh,
	Li Zefan, Masami Hiramatsu, Jason Baron

Add the struct ftrace_event_call as a parameter of its show_format()
callback. This way we can use it from the syscall trace events to
retrieve the syscall name from the ftrace event call parameter and
describe its fields using the syscalls metadata.

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Lai Jiangshan <laijs@cn.fujitsu.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
Cc: Jiaying Zhang <jiayingz@google.com>
Cc: Martin Bligh <mbligh@google.com>
Cc: Li Zefan <lizf@cn.fujitsu.com>
Cc: Masami Hiramatsu <mhiramat@redhat.com>
Cc: Jason Baron <jbaron@redhat.com>
---
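For illustration, a minimal userspace sketch of the callback shape this patch
introduces. struct demo_event_call and the demo_* functions are hypothetical,
cut-down analogues; the point is only that a handler shared by many events
needs the event passed in to find its per-event data.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical, cut-down analogue of struct ftrace_event_call. */
struct demo_event_call {
	const char *name;
	void *data;	/* per-event payload, here a syscall name string */
	int (*show_format)(struct demo_event_call *call,
			   char *buf, size_t len);
};

/*
 * One handler shared by many events: it can only tell them apart
 * because the event itself is passed in as the first argument.
 */
int demo_format_syscall(struct demo_event_call *call, char *buf, size_t len)
{
	assert(call != NULL && call->data != NULL);
	return snprintf(buf, len, "format for %s\n",
			(const char *)call->data);
}

/* Mirrors the updated call site: r = call->show_format(call, s); */
int demo_show(struct demo_event_call *call, char *buf, size_t len)
{
	return call->show_format(call, buf, len);
}
```

Handlers generated per event by a macro can ignore the extra argument, which
is why the TRACE_EVENT variants simply name it "unused".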
 include/linux/ftrace_event.h |    3 ++-
 include/trace/ftrace.h       |    3 ++-
 kernel/trace/trace_events.c  |    2 +-
 kernel/trace/trace_export.c  |    6 ++++--
 4 files changed, 9 insertions(+), 5 deletions(-)

diff --git a/include/linux/ftrace_event.h b/include/linux/ftrace_event.h
index 8544f12..189806b 100644
--- a/include/linux/ftrace_event.h
+++ b/include/linux/ftrace_event.h
@@ -116,7 +116,8 @@ struct ftrace_event_call {
 	void			(*unregfunc)(void *);
 	int			id;
 	int			(*raw_init)(void);
-	int			(*show_format)(struct trace_seq *s);
+	int			(*show_format)(struct ftrace_event_call *call,
+					       struct trace_seq *s);
 	int			(*define_fields)(void);
 	struct list_head	fields;
 	int			filter_active;
diff --git a/include/trace/ftrace.h b/include/trace/ftrace.h
index 46d81b5..b250b06 100644
--- a/include/trace/ftrace.h
+++ b/include/trace/ftrace.h
@@ -151,7 +151,8 @@
 #undef TRACE_EVENT
 #define TRACE_EVENT(call, proto, args, tstruct, func, print)		\
 static int								\
-ftrace_format_##call(struct trace_seq *s)				\
+ftrace_format_##call(struct ftrace_event_call *unused,			\
+		      struct trace_seq *s)				\
 {									\
 	struct ftrace_raw_##call field __attribute__((unused));		\
 	int ret = 0;							\
diff --git a/kernel/trace/trace_events.c b/kernel/trace/trace_events.c
index 1d289e2..b568ade 100644
--- a/kernel/trace/trace_events.c
+++ b/kernel/trace/trace_events.c
@@ -576,7 +576,7 @@ event_format_read(struct file *filp, char __user *ubuf, size_t cnt,
 	trace_seq_printf(s, "format:\n");
 	trace_write_header(s);
 
-	r = call->show_format(s);
+	r = call->show_format(call, s);
 	if (!r) {
 		/*
 		 * ug!  The format output is bigger than a PAGE!!
diff --git a/kernel/trace/trace_export.c b/kernel/trace/trace_export.c
index d06cf89..956d4bc 100644
--- a/kernel/trace/trace_export.c
+++ b/kernel/trace/trace_export.c
@@ -60,7 +60,8 @@ extern void __bad_type_size(void);
 #undef TRACE_EVENT_FORMAT
 #define TRACE_EVENT_FORMAT(call, proto, args, fmt, tstruct, tpfmt)	\
 static int								\
-ftrace_format_##call(struct trace_seq *s)				\
+ftrace_format_##call(struct ftrace_event_call *unused,			\
+		      struct trace_seq *s)				\
 {									\
 	struct args field;						\
 	int ret;							\
@@ -76,7 +77,8 @@ ftrace_format_##call(struct trace_seq *s)				\
 #define TRACE_EVENT_FORMAT_NOFILTER(call, proto, args, fmt, tstruct,	\
 				    tpfmt)				\
 static int								\
-ftrace_format_##call(struct trace_seq *s)				\
+ftrace_format_##call(struct ftrace_event_call *unused,			\
+		      struct trace_seq *s)				\
 {									\
 	struct args field;						\
 	int ret;							\
-- 
1.6.2.3



* [PATCH 15/16] tracing: Add fields format definition for syscall events
  2009-08-11 18:48 [GIT PULL] tracing: Syscalls trace events + perf support Frederic Weisbecker
                   ` (13 preceding siblings ...)
  2009-08-11 18:49 ` [PATCH 14/16] tracing: Add ftrace event call parameter to its field descriptor handler Frederic Weisbecker
@ 2009-08-11 18:49 ` Frederic Weisbecker
  2009-08-19 17:12   ` Masami Hiramatsu
  2009-08-11 18:49 ` [PATCH 16/16] tracing: Support for syscall events raw records in perfcounters Frederic Weisbecker
                   ` (2 subsequent siblings)
  17 siblings, 1 reply; 54+ messages in thread
From: Frederic Weisbecker @ 2009-08-11 18:49 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: LKML, Frederic Weisbecker, Lai Jiangshan, Steven Rostedt,
	Peter Zijlstra, Mathieu Desnoyers, Jiaying Zhang, Martin Bligh,
	Li Zefan, Masami Hiramatsu, Jason Baron

Define the format of the syscall trace fields to parse the binary
values from a raw trace using the syscall events "format" file.

This is defined dynamically using the syscalls metadata.
It prepares the export of syscall event raw records to perf
counters.

Example:

$ cat /debug/tracing/events/syscalls/sys_enter_sched_getparam/format
name: sys_enter_sched_getparam
ID: 39
format:
	field:unsigned short common_type;	offset:0;	size:2;
	field:unsigned char common_flags;	offset:2;	size:1;
	field:unsigned char common_preempt_count;	offset:3;	size:1;
	field:int common_pid;	offset:4;	size:4;
	field:int common_tgid;	offset:8;	size:4;

	field:pid_t pid;	offset:12;	size:8;
	field:struct sched_param * param;	offset:20;	size:8;

print fmt: "pid: 0x%08lx, param: 0x%08lx", ((unsigned long)(REC->pid)), ((unsigned long)(REC->param))

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Lai Jiangshan <laijs@cn.fujitsu.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
Cc: Jiaying Zhang <jiayingz@google.com>
Cc: Martin Bligh <mbligh@google.com>
Cc: Li Zefan <lizf@cn.fujitsu.com>
Cc: Masami Hiramatsu <mhiramat@redhat.com>
Cc: Jason Baron <jbaron@redhat.com>
---
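For illustration, a userspace sketch of how the per-argument "field:" lines
are derived from the metadata. struct demo_syscall_meta and
demo_format_fields() are hypothetical; the header size and word size are
passed in explicitly (an assumption made here so the output does not depend
on the machine), whereas the kernel code uses sizeof(struct trace_entry) and
sizeof(unsigned long).

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical, cut-down analogue of struct syscall_metadata. */
struct demo_syscall_meta {
	int nb_args;
	const char *const *types;
	const char *const *args;
};

/*
 * Append one "field:" line per argument to buf. The first argument
 * starts at header_size and each following one word_size later.
 * Returns the number of bytes written, or -1 if buf is too small.
 */
int demo_format_fields(const struct demo_syscall_meta *m, int header_size,
		       int word_size, char *buf, size_t len)
{
	size_t used = 0;
	int offset = header_size;
	int i;

	assert(m != NULL && buf != NULL);
	if (len == 0)
		return -1;
	buf[0] = '\0';

	for (i = 0; i < m->nb_args; i++) {
		char line[128];
		int n = snprintf(line, sizeof(line),
				 "\tfield:%s %s;\toffset:%d;\tsize:%d;\n",
				 m->types[i], m->args[i], offset, word_size);

		/* Only append complete lines, terminator included. */
		if (n < 0 || (size_t)n >= sizeof(line) ||
		    used + (size_t)n >= len)
			return -1;
		memcpy(buf + used, line, (size_t)n + 1);
		used += (size_t)n;
		offset += word_size;
	}
	return (int)used;
}
```

Called with a header size of 12 and a word size of 8 for the two
sched_getparam arguments, this yields the offset:12 and offset:20 lines shown
in the example output of the changelog.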
 include/linux/syscalls.h      |    1 +
 include/trace/syscall.h       |    2 +
 kernel/trace/trace_syscalls.c |   46 +++++++++++++++++++++++++++++++++++++++++
 3 files changed, 49 insertions(+), 0 deletions(-)

diff --git a/include/linux/syscalls.h b/include/linux/syscalls.h
index 5541e75..87d06c1 100644
--- a/include/linux/syscalls.h
+++ b/include/linux/syscalls.h
@@ -189,6 +189,7 @@ static void prof_sysexit_disable_##sname(struct ftrace_event_call *event_call) \
 		.system                 = "syscalls",			\
 		.event                  = &event_syscall_enter,		\
 		.raw_init		= init_enter_##sname,		\
+		.show_format		= ftrace_format_syscall,	\
 		.regfunc		= reg_event_syscall_enter,	\
 		.unregfunc		= unreg_event_syscall_enter,	\
 		.data			= "sys"#sname,			\
diff --git a/include/trace/syscall.h b/include/trace/syscall.h
index 3ab6dd1..0cb0362 100644
--- a/include/trace/syscall.h
+++ b/include/trace/syscall.h
@@ -55,6 +55,8 @@ extern int reg_event_syscall_enter(void *ptr);
 extern void unreg_event_syscall_enter(void *ptr);
 extern int reg_event_syscall_exit(void *ptr);
 extern void unreg_event_syscall_exit(void *ptr);
+extern int
+ftrace_format_syscall(struct ftrace_event_call *call, struct trace_seq *s);
 enum print_line_t print_syscall_enter(struct trace_iterator *iter, int flags);
 enum print_line_t print_syscall_exit(struct trace_iterator *iter, int flags);
 #endif
diff --git a/kernel/trace/trace_syscalls.c b/kernel/trace/trace_syscalls.c
index f4eaec3..9ee6386 100644
--- a/kernel/trace/trace_syscalls.c
+++ b/kernel/trace/trace_syscalls.c
@@ -105,6 +105,52 @@ print_syscall_exit(struct trace_iterator *iter, int flags)
 	return TRACE_TYPE_HANDLED;
 }
 
+int ftrace_format_syscall(struct ftrace_event_call *call, struct trace_seq *s)
+{
+	int i;
+	int nr;
+	int ret = 0;
+	struct syscall_metadata *entry;
+	int offset = sizeof(struct trace_entry);
+
+	nr = syscall_name_to_nr((char *)call->data);
+	entry = syscall_nr_to_meta(nr);
+
+	if (!entry)
+		return ret;
+
+	for (i = 0; i < entry->nb_args; i++) {
+		ret = trace_seq_printf(s, "\tfield:%s %s;", entry->types[i],
+				        entry->args[i]);
+		if (!ret)
+			return 0;
+		ret = trace_seq_printf(s, "\toffset:%d;\tsize:%lu;\n", offset,
+				       sizeof(unsigned long));
+		if (!ret)
+			return 0;
+		offset += sizeof(unsigned long);
+	}
+
+	trace_seq_printf(s, "\nprint fmt: \"");
+	for (i = 0; i < entry->nb_args; i++) {
+		ret = trace_seq_printf(s, "%s: 0x%%0%lulx%s", entry->args[i],
+				        sizeof(unsigned long),
+					i == entry->nb_args - 1 ? "\", " : ", ");
+		if (!ret)
+			return 0;
+	}
+
+	for (i = 0; i < entry->nb_args; i++) {
+		ret = trace_seq_printf(s, "((unsigned long)(REC->%s))%s",
+				        entry->args[i],
+					i == entry->nb_args - 1 ? "\n" : ", ");
+		if (!ret)
+			return 0;
+	}
+
+	return ret;
+}
+
 void ftrace_syscall_enter(struct pt_regs *regs, long id)
 {
 	struct syscall_trace_enter *entry;
-- 
1.6.2.3



* [PATCH 16/16] tracing: Support for syscall events raw records in perfcounters
  2009-08-11 18:48 [GIT PULL] tracing: Syscalls trace events + perf support Frederic Weisbecker
                   ` (14 preceding siblings ...)
  2009-08-11 18:49 ` [PATCH 15/16] tracing: Add fields format definition for syscall events Frederic Weisbecker
@ 2009-08-11 18:49 ` Frederic Weisbecker
  2009-08-12  9:11 ` [GIT PULL] tracing: Syscalls trace events + perf support Ingo Molnar
  2009-08-12 16:33 ` Masami Hiramatsu
  17 siblings, 0 replies; 54+ messages in thread
From: Frederic Weisbecker @ 2009-08-11 18:49 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: LKML, Frederic Weisbecker, Lai Jiangshan, Steven Rostedt,
	Peter Zijlstra, Mathieu Desnoyers, Jiaying Zhang, Martin Bligh,
	Li Zefan, Jason Baron, Masami Hiramatsu

This brings support for raw syscall events to perfcounters.
The arguments or exit value are saved as a raw sample using
the PERF_SAMPLE_RAW attribute in a perf counter.

Example (for now you must explicitly set the PERF_SAMPLE_RAW flag
in perf record):

perf record -e syscalls:sys_enter_open -f -F 1 -a
perf report -D

	0x2cbb8 [0x50]: event: 9
	.
	. ... raw event: size 80 bytes
	.  0000:  09 00 00 00 02 00 50 00 20 e9 39 ab 0a 7f 00 00  ......P. .9....
	.  0010:  bc 14 00 00 bc 14 00 00 01 00 00 00 00 00 00 00  ...............
	.  0020:  2c 00 00 00 15 01 01 00 bc 14 00 00 bc 14 00 00  ,..............
                  ^  ^  ^  ^  ^  ^  ^  ..........................
                  Event Size  struct trace_entry

	.  0030:  00 00 00 00 46 98 43 02 00 00 00 00 80 08 00 00  ....F.C........
                  ^  ^  ^  ^  ^  ^  ^  ^  ^  ^  ^  ^  ^  ^  ^  ^
                  ptr to file name        open flags

	.  0040:  00 00 00 00 02 00 00 00 00 00 00 00 00 00 00 00  ...............
                  ^  ^  ^  ^  ^  ^  ^  ^  ^  ^  ^  ^  ^  ^  ^  ^
	.         open mode               padding

	0x2cbb8 [0x50]: PERF_EVENT_SAMPLE (IP, 2): 5308: 0x7f0aab39e920 period: 1

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Lai Jiangshan <laijs@cn.fujitsu.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
Cc: Jiaying Zhang <jiayingz@google.com>
Cc: Martin Bligh <mbligh@google.com>
Cc: Li Zefan <lizf@cn.fujitsu.com>
Cc: Jason Baron <jbaron@redhat.com>
Cc: Masami Hiramatsu <mhiramat@redhat.com>
---
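For illustration, a userspace sketch of the size computation used for the
entry record. DEMO_ALIGN() and demo_raw_size() are hypothetical names;
header_size stands in for sizeof(struct syscall_enter_record) and word_size
for sizeof(unsigned long), passed as parameters here so the result does not
depend on the machine running the sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Round x up to a multiple of a; a must be a power of two. */
#define DEMO_ALIGN(x, a) (((x) + ((a) - 1)) & ~((size_t)(a) - 1))

/*
 * Size of the raw payload for a syscall-entry sample.
 * The payload is stored right after a u32 length word, so the sum of
 * both is rounded up to a u64 boundary and the length word is then
 * subtracted again. Any bytes gained by the rounding are padding and
 * must be zeroed before the record is handed to userspace.
 */
size_t demo_raw_size(size_t header_size, size_t word_size, unsigned nb_args)
{
	size_t size = header_size + word_size * nb_args;

	assert(header_size > 0 && word_size > 0);
	size = DEMO_ALIGN(size + sizeof(uint32_t), sizeof(uint64_t));
	return size - sizeof(uint32_t);
}
```

Assuming a 16-byte record header and three 8-byte arguments (as for open on
x86_64), demo_raw_size(16, 8, 3) gives 44, which appears to match the
"2c 00 00 00" length word in the dump in the changelog.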
 kernel/trace/trace_syscalls.c |   39 +++++++++++++++++++++++++++++++++++++--
 1 files changed, 37 insertions(+), 2 deletions(-)

diff --git a/kernel/trace/trace_syscalls.c b/kernel/trace/trace_syscalls.c
index 9ee6386..f837ccc 100644
--- a/kernel/trace/trace_syscalls.c
+++ b/kernel/trace/trace_syscalls.c
@@ -301,6 +301,17 @@ struct trace_event event_syscall_exit = {
 };
 
 #ifdef CONFIG_EVENT_PROFILE
+
+struct syscall_enter_record {
+	struct trace_entry	entry;
+	unsigned long		args[0];
+};
+
+struct syscall_exit_record {
+	struct trace_entry	entry;
+	unsigned long		ret;
+};
+
 static DECLARE_BITMAP(enabled_prof_enter_syscalls, FTRACE_SYSCALL_MAX);
 static DECLARE_BITMAP(enabled_prof_exit_syscalls, FTRACE_SYSCALL_MAX);
 static int sys_prof_refcount_enter;
@@ -308,8 +319,10 @@ static int sys_prof_refcount_exit;
 
 static void prof_syscall_enter(struct pt_regs *regs, long id)
 {
+	struct syscall_enter_record *rec;
 	struct syscall_metadata *sys_data;
 	int syscall_nr;
+	int size;
 
 	syscall_nr = syscall_get_nr(current, regs);
 	if (!test_bit(syscall_nr, enabled_prof_enter_syscalls))
@@ -319,7 +332,24 @@ static void prof_syscall_enter(struct pt_regs *regs, long id)
 	if (!sys_data)
 		return;
 
-	perf_tpcounter_event(sys_data->enter_id, 0, 1, NULL, 0);
+	/* get the size after alignment with the u32 buffer size field */
+	size = sizeof(unsigned long) * sys_data->nb_args + sizeof(*rec);
+	size = ALIGN(size + sizeof(u32), sizeof(u64));
+	size -= sizeof(u32);
+
+	do {
+		char raw_data[size];
+
+		/* zero the dead bytes from align to not leak stack to user */
+		*(u64 *)(&raw_data[size - sizeof(u64)]) = 0ULL;
+
+		rec = (struct syscall_enter_record *) raw_data;
+		tracing_generic_entry_update(&rec->entry, 0, 0);
+		rec->entry.type = sys_data->enter_id;
+		syscall_get_arguments(current, regs, 0, sys_data->nb_args,
+				       (unsigned long *)&rec->args);
+		perf_tpcounter_event(sys_data->enter_id, 0, 1, rec, size);
+	} while(0);
 }
 
 int reg_prof_syscall_enter(char *name)
@@ -364,6 +394,7 @@ void unreg_prof_syscall_enter(char *name)
 static void prof_syscall_exit(struct pt_regs *regs, long ret)
 {
 	struct syscall_metadata *sys_data;
+	struct syscall_exit_record rec;
 	int syscall_nr;
 
 	syscall_nr = syscall_get_nr(current, regs);
@@ -374,7 +405,11 @@ static void prof_syscall_exit(struct pt_regs *regs, long ret)
 	if (!sys_data)
 		return;
 
-	perf_tpcounter_event(sys_data->exit_id, 0, 1, NULL, 0);
+	tracing_generic_entry_update(&rec.entry, 0, 0);
+	rec.entry.type = sys_data->exit_id;
+	rec.ret = syscall_get_return_value(current, regs);
+
+	perf_tpcounter_event(sys_data->exit_id, 0, 1, &rec, sizeof(rec));
 }
 
 int reg_prof_syscall_exit(char *name)
-- 
1.6.2.3



* Re: [GIT PULL] tracing: Syscalls trace events + perf support
  2009-08-11 18:48 [GIT PULL] tracing: Syscalls trace events + perf support Frederic Weisbecker
                   ` (15 preceding siblings ...)
  2009-08-11 18:49 ` [PATCH 16/16] tracing: Support for syscall events raw records in perfcounters Frederic Weisbecker
@ 2009-08-12  9:11 ` Ingo Molnar
  2009-08-12 11:03   ` Ingo Molnar
  2009-08-18  0:46   ` Paul Mundt
  2009-08-12 16:33 ` Masami Hiramatsu
  17 siblings, 2 replies; 54+ messages in thread
From: Ingo Molnar @ 2009-08-12  9:11 UTC (permalink / raw)
  To: Frederic Weisbecker
  Cc: LKML, Lai Jiangshan, Steven Rostedt, Peter Zijlstra,
	Mathieu Desnoyers, Jiaying Zhang, Martin Bligh, Li Zefan,
	Jason Baron, Masami Hiramatsu


* Frederic Weisbecker <fweisbec@gmail.com> wrote:

> Hi Ingo,
> 
> This pull request integrate one cleanup/fix for ftrace and an 
> update for syscall tracing: the migration from old-style tracer to 
> individual tracepoints/trace_events and the support for perf 
> counter.
> 
> I've tested it with success either with ftrace (every syscall 
> tracepoints enabled at the same time without problems) and with 
> perfcounter.
> 
> May be one drawback: it creates so much trace events that the 
> ftrace selftests can take some time :-)

Pulled, thanks a lot!

	Ingo


* Re: [GIT PULL] tracing: Syscalls trace events + perf support
  2009-08-12  9:11 ` [GIT PULL] tracing: Syscalls trace events + perf support Ingo Molnar
@ 2009-08-12 11:03   ` Ingo Molnar
  2009-08-12 11:14     ` Ingo Molnar
  2009-08-12 11:33     ` Frederic Weisbecker
  2009-08-18  0:46   ` Paul Mundt
  1 sibling, 2 replies; 54+ messages in thread
From: Ingo Molnar @ 2009-08-12 11:03 UTC (permalink / raw)
  To: Frederic Weisbecker
  Cc: LKML, Lai Jiangshan, Steven Rostedt, Peter Zijlstra,
	Mathieu Desnoyers, Jiaying Zhang, Martin Bligh, Li Zefan,
	Jason Baron, Masami Hiramatsu


one thing i noticed is that we dont seem to be catching 
compat syscalls properly:

phoenix:~/linux/linux/tools/perf> perf stat -e 
syscalls:sys_enter_read -e syscalls:sys_enter_write -e 
syscalls:sys_enter_mmap ~/hackbench 10
Time: 0.236

 Performance counter stats for '/home/mingo/hackbench 10':

              0  syscalls:sys_enter_read 
              0  syscalls:sys_enter_write
              0  syscalls:sys_enter_mmap 

    0.270062020  seconds time elapsed

phoenix:~/linux/linux/tools/perf> file ~/hackbench 
/home/mingo/hackbench: ELF 32-bit LSB executable, Intel 
80386, version 1 (SYSV), dynamically linked (uses shared 
libs), for GNU/Linux 2.2.5, not stripped
phoenix:~/linux/linux/tools/perf> uname -a
Linux phoenix 2.6.31-rc5-tip #3 SMP Wed Aug 12 12:50:38 CEST 
2009 x86_64 x86_64 x86_64 GNU/Linux


phoenix:~> perf stat -e syscalls:sys_enter_read -e 
syscalls:sys_enter_write -e syscalls:sys_enter_mmap 
~/hackbench64 10
Running with 10*40 (== 400) tasks.
Time: 0.199

 Performance counter stats for '/home/mingo/hackbench64 10':

         400402  syscalls:sys_enter_read 
         400403  syscalls:sys_enter_write
             12  syscalls:sys_enter_mmap 

    0.247919263  seconds time elapsed

phoenix:~> file hackbench64 
hackbench64: ELF 64-bit LSB executable, x86-64, version 1 
(SYSV), dynamically linked (uses shared libs), for GNU/Linux 
2.6.9, not stripped

	Ingo


* Re: [GIT PULL] tracing: Syscalls trace events + perf support
  2009-08-12 11:03   ` Ingo Molnar
@ 2009-08-12 11:14     ` Ingo Molnar
  2009-08-12 14:25       ` Jason Baron
  2009-08-12 11:33     ` Frederic Weisbecker
  1 sibling, 1 reply; 54+ messages in thread
From: Ingo Molnar @ 2009-08-12 11:14 UTC (permalink / raw)
  To: Frederic Weisbecker
  Cc: LKML, Lai Jiangshan, Steven Rostedt, Peter Zijlstra,
	Mathieu Desnoyers, Jiaying Zhang, Martin Bligh, Li Zefan,
	Jason Baron, Masami Hiramatsu


another thing: could we please also have a generic, highlevel 
tracepoint (in addition to the specific tracepoints) that
enumerates the raw syscall Nr and the parameters it gets - in a
single tracepoint?

That would allow 'all' syscalls to be traced, at the cost of no 
argument type/name/etc expansion.

	Ingo


* Re: [GIT PULL] tracing: Syscalls trace events + perf support
  2009-08-12 11:03   ` Ingo Molnar
  2009-08-12 11:14     ` Ingo Molnar
@ 2009-08-12 11:33     ` Frederic Weisbecker
  2009-08-12 13:59       ` Jason Baron
  1 sibling, 1 reply; 54+ messages in thread
From: Frederic Weisbecker @ 2009-08-12 11:33 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: LKML, Lai Jiangshan, Steven Rostedt, Peter Zijlstra,
	Mathieu Desnoyers, Jiaying Zhang, Martin Bligh, Li Zefan,
	Jason Baron, Masami Hiramatsu

On Wed, Aug 12, 2009 at 01:03:07PM +0200, Ingo Molnar wrote:
> 
> one thing i noticed is that we dont seem to be catching 
> compat syscalls properly:


Oh right. While looking at fs/compat.c, it seems most of the
compat syscalls haven't been defined through the SYSCALL_DEFINE
macro.

Grr...

Frederic.


 
> phoenix:~/linux/linux/tools/perf> perf stat -e 
> syscalls:sys_enter_read -e syscalls:sys_enter_write -e 
> syscalls:sys_enter_mmap ~/hackbench 10
> Time: 0.236
> 
>  Performance counter stats for '/home/mingo/hackbench 10':
> 
>               0  syscalls:sys_enter_read 
>               0  syscalls:sys_enter_write
>               0  syscalls:sys_enter_mmap 
> 
>     0.270062020  seconds time elapsed
> 
> phoenix:~/linux/linux/tools/perf> file ~/hackbench 
> /home/mingo/hackbench: ELF 32-bit LSB executable, Intel 
> 80386, version 1 (SYSV), dynamically linked (uses shared 
> libs), for GNU/Linux 2.2.5, not stripped
> phoenix:~/linux/linux/tools/perf> uname -a
> Linux phoenix 2.6.31-rc5-tip #3 SMP Wed Aug 12 12:50:38 CEST 
> 2009 x86_64 x86_64 x86_64 GNU/Linux
> 
> 
> phoenix:~> perf stat -e syscalls:sys_enter_read -e 
> syscalls:sys_enter_write -e syscalls:sys_enter_mmap 
> ~/hackbench64 10
> Running with 10*40 (== 400) tasks.
> Time: 0.199
> 
>  Performance counter stats for '/home/mingo/hackbench64 10':
> 
>          400402  syscalls:sys_enter_read 
>          400403  syscalls:sys_enter_write
>              12  syscalls:sys_enter_mmap 
> 
>     0.247919263  seconds time elapsed
> 
> phoenix:~> file hackbench64 
> hackbench64: ELF 64-bit LSB executable, x86-64, version 1 
> (SYSV), dynamically linked (uses shared libs), for GNU/Linux 
> 2.6.9, not stripped
> 
> 	Ingo



* Re: [GIT PULL] tracing: Syscalls trace events + perf support
  2009-08-12 11:33     ` Frederic Weisbecker
@ 2009-08-12 13:59       ` Jason Baron
  2009-08-12 14:30         ` Ingo Molnar
  0 siblings, 1 reply; 54+ messages in thread
From: Jason Baron @ 2009-08-12 13:59 UTC (permalink / raw)
  To: Frederic Weisbecker
  Cc: Ingo Molnar, LKML, Lai Jiangshan, Steven Rostedt, Peter Zijlstra,
	Mathieu Desnoyers, Jiaying Zhang, Martin Bligh, Li Zefan,
	Masami Hiramatsu

On Wed, Aug 12, 2009 at 01:33:09PM +0200, Frederic Weisbecker wrote:
> On Wed, Aug 12, 2009 at 01:03:07PM +0200, Ingo Molnar wrote:
> > 
> > one thing i noticed is that we dont seem to be catching 
> > compat syscalls properly:
> 
> 
> Oh right. While looking at fs/compat.c, it seems most of the
> compat syscalls haven't been defined through the SYSCALL_DEFINE
> macro.
> 
> Grr...
> 

right. I mentioned in a previous mail that we weren't handling
compat syscalls. That said, the previous syscall tracer also had the
same issue, so we haven't taken a step back. I guess we need to convert
some of the compat layer to use the SYSCALL_DEFINE() macros. Also, there
are a number of 64-bit syscalls that don't use SYSCALL_DEFINE either.
Those need to be converted as well. The patchset I submitted converted a
couple of them....

thanks,

-Jason


* Re: [GIT PULL] tracing: Syscalls trace events + perf support
  2009-08-12 11:14     ` Ingo Molnar
@ 2009-08-12 14:25       ` Jason Baron
  2009-08-12 14:29         ` Ingo Molnar
  0 siblings, 1 reply; 54+ messages in thread
From: Jason Baron @ 2009-08-12 14:25 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Frederic Weisbecker, LKML, Lai Jiangshan, Steven Rostedt,
	Peter Zijlstra, Mathieu Desnoyers, Jiaying Zhang, Martin Bligh,
	Li Zefan, Masami Hiramatsu

On Wed, Aug 12, 2009 at 01:14:36PM +0200, Ingo Molnar wrote:
> another thing: could we please also have a generic, highlevel 
> tracepoint (in addition to the specific tracepoints) that
> enumerates the raw syscall Nr and the parameters it gets - in a
> single tracepoint?
> 

The specific tracepoints, are all layered on 2 (entry, exit) generic
tracepoints already. So this shouldn't be too hard...The parameters it
gets might be tricky, since there are a variable number b/w different
syscalls? 

thanks,

-Jason


* Re: [GIT PULL] tracing: Syscalls trace events + perf support
  2009-08-12 14:25       ` Jason Baron
@ 2009-08-12 14:29         ` Ingo Molnar
  2009-08-12 14:37           ` Jason Baron
  0 siblings, 1 reply; 54+ messages in thread
From: Ingo Molnar @ 2009-08-12 14:29 UTC (permalink / raw)
  To: Jason Baron
  Cc: Frederic Weisbecker, LKML, Lai Jiangshan, Steven Rostedt,
	Peter Zijlstra, Mathieu Desnoyers, Jiaying Zhang, Martin Bligh,
	Li Zefan, Masami Hiramatsu


* Jason Baron <jbaron@redhat.com> wrote:

> On Wed, Aug 12, 2009 at 01:14:36PM +0200, Ingo Molnar wrote:

> > another thing: could we please also have a generic, highlevel 
> > tracepoint (in addition to the specific tracepoints) that 
> > enumerates the raw syscall Nr and the parameters it gets - in a 
> > single tracepoint?
> 
> The specific tracepoints, are all layered on 2 (entry, exit) 
> generic tracepoints already. So this shouldn't be too hard...The 
> parameters it gets might be tricky, since there are a variable 
> number b/w different syscalls?

We should just list all ~6 of them. It's up to the sampling entity 
to decide which ones (if any) are relevant.

At least on x86 the syscall arguments will always have a value when 
the syscall entry code is called. (they are in GP registers)

	Ingo


* Re: [GIT PULL] tracing: Syscalls trace events + perf support
  2009-08-12 13:59       ` Jason Baron
@ 2009-08-12 14:30         ` Ingo Molnar
  0 siblings, 0 replies; 54+ messages in thread
From: Ingo Molnar @ 2009-08-12 14:30 UTC (permalink / raw)
  To: Jason Baron
  Cc: Frederic Weisbecker, LKML, Lai Jiangshan, Steven Rostedt,
	Peter Zijlstra, Mathieu Desnoyers, Jiaying Zhang, Martin Bligh,
	Li Zefan, Masami Hiramatsu


* Jason Baron <jbaron@redhat.com> wrote:

> On Wed, Aug 12, 2009 at 01:33:09PM +0200, Frederic Weisbecker wrote:
> > On Wed, Aug 12, 2009 at 01:03:07PM +0200, Ingo Molnar wrote:
> > > 
> > > one thing i noticed is that we dont seem to be catching 
> > > compat syscalls properly:
> > 
> > 
> > Oh right. While looking at fs/compat.c, it seems most of the
> > compat syscalls haven't been defined through the SYSCALL_DEFINE
> > macro.
> > 
> > Grr...
> > 
> 
> right. I mentioned in a previous mail that we weren't 
> handling compat syscalls. That said, the previous syscall tracer 
> also had the same issue, so we haven't taken a step back. I guess 
> we need to convert some of the compat layer to use the 
> SYSCALL_DEFINE() macros. Also, there are a number of 64-bit 
> syscalls that don't use SYSCALL_DEFINE either. Those need to be 
> converted as well. The patchset I submitted converted a couple of 
> them....

Those should come in separate patches as SYSCALL_DEFINE() coverage 
is useful to upstream as-is, even outside of syscall-tracing's 
scope.

	Ingo


* Re: [GIT PULL] tracing: Syscalls trace events + perf support
  2009-08-12 14:29         ` Ingo Molnar
@ 2009-08-12 14:37           ` Jason Baron
  0 siblings, 0 replies; 54+ messages in thread
From: Jason Baron @ 2009-08-12 14:37 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Frederic Weisbecker, LKML, Lai Jiangshan, Steven Rostedt,
	Peter Zijlstra, Mathieu Desnoyers, Jiaying Zhang, Martin Bligh,
	Li Zefan, Masami Hiramatsu

On Wed, Aug 12, 2009 at 04:29:17PM +0200, Ingo Molnar wrote:
> * Jason Baron <jbaron@redhat.com> wrote:
> 
> > On Wed, Aug 12, 2009 at 01:14:36PM +0200, Ingo Molnar wrote:
> 
> > > another thing: could we please also have a generic, highlevel 
> > > tracepoint (in addition to the specific tracepoints) that 
> > > enumerates the raw syscall Nr and the parameters it gets - in a 
> > > single tracepoint?
> > 
> > The specific tracepoints, are all layered on 2 (entry, exit) 
> > generic tracepoints already. So this shouldn't be too hard...The 
> > parameters it gets might be tricky, since there are a variable 
> > number b/w different syscalls?
> 
> We should just list all ~6 of them. It's up to the sampling entity 
> to decide which ones (if any) are relevant.
> 
> At least on x86 the syscall arguments will always have a value when 
> the syscall entry code is called. (they are in GP registers)
> 
> 	Ingo

ok, makes sense. I'll add to the todo list.

-Jason




* Re: [GIT PULL] tracing: Syscalls trace events + perf support
  2009-08-11 18:48 [GIT PULL] tracing: Syscalls trace events + perf support Frederic Weisbecker
                   ` (16 preceding siblings ...)
  2009-08-12  9:11 ` [GIT PULL] tracing: Syscalls trace events + perf support Ingo Molnar
@ 2009-08-12 16:33 ` Masami Hiramatsu
  2009-08-12 17:02   ` Masami Hiramatsu
  17 siblings, 1 reply; 54+ messages in thread
From: Masami Hiramatsu @ 2009-08-12 16:33 UTC (permalink / raw)
  To: Frederic Weisbecker
  Cc: Ingo Molnar, LKML, Lai Jiangshan, Steven Rostedt, Peter Zijlstra,
	Mathieu Desnoyers, Jiaying Zhang, Martin Bligh, Li Zefan,
	Jason Baron

Hi Frederic and Jason,

Frederic Weisbecker wrote:
> Frederic Weisbecker (3):
>        tracing: Add ftrace event call parameter to its field descriptor handler

> Jason Baron (12):
>        tracing: Add ftrace_event_call void * 'data' field

Both of you added a parameter to ftrace_event_call for passing the
syscall name (call->data) to handlers, but one passes 'ftrace_event_call *'
and the other passes 'void *'. That does not seem unified enough.

And also, I'm now updating my patch for 'dynamic ftrace_event_call'
http://lkml.org/lkml/2009/7/24/234
which adds 'ftrace_event_call *' for all handlers.

I think passing 'ftrace_event_call *' is the more generic way
to do that. What do you think?

Thank you,

-- 
Masami Hiramatsu

Software Engineer
Hitachi Computer Products (America), Inc.
Software Solutions Division

e-mail: mhiramat@redhat.com


^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [GIT PULL] tracing: Syscalls trace events + perf support
  2009-08-12 16:33 ` Masami Hiramatsu
@ 2009-08-12 17:02   ` Masami Hiramatsu
  2009-08-12 19:13     ` [RFD] Kprobes/Kretprobes " Frederic Weisbecker
  0 siblings, 1 reply; 54+ messages in thread
From: Masami Hiramatsu @ 2009-08-12 17:02 UTC (permalink / raw)
  To: Frederic Weisbecker
  Cc: Ingo Molnar, LKML, Lai Jiangshan, Steven Rostedt, Peter Zijlstra,
	Mathieu Desnoyers, Jiaying Zhang, Martin Bligh, Li Zefan,
	Jason Baron



Masami Hiramatsu wrote:
> Hi Frederic and Jason,
> 
> Frederic Weisbecker wrote:
>> Frederic Weisbecker (3):
>>         tracing: Add ftrace event call parameter to its field descriptor handler
> 
>> Jason Baron (12):
>>         tracing: Add ftrace_event_call void * 'data' field
> 
> Both of you added a parameter to ftrace_event_call for passing
> syscall name (call->data) to handlers, but one passes 'ftrace_event_call *'
> and another passes 'void *'. It seems not enough unified.
> 
> And also, I'm now updating my patch for 'dynamic ftrace_event_call'
> http://lkml.org/lkml/2009/7/24/234
> which adds 'ftrace_event_call *' for all handlers.
> 
> I think passing 'ftrace_event_call *' is more generic way
> to do that. What would you think about that?

Hmm, I changed my mind: passing 'void *' is enough, since
all other fields of ftrace_event_call will be handled in
trace_events.c.

Thank you,


-- 
Masami Hiramatsu

Software Engineer
Hitachi Computer Products (America), Inc.
Software Solutions Division

e-mail: mhiramat@redhat.com


^ permalink raw reply	[flat|nested] 54+ messages in thread

* [RFD] Kprobes/Kretprobes perf support
  2009-08-12 17:02   ` Masami Hiramatsu
@ 2009-08-12 19:13     ` Frederic Weisbecker
  2009-08-12 20:20       ` Masami Hiramatsu
                         ` (2 more replies)
  0 siblings, 3 replies; 54+ messages in thread
From: Frederic Weisbecker @ 2009-08-12 19:13 UTC (permalink / raw)
  To: Masami Hiramatsu
  Cc: Ingo Molnar, LKML, Lai Jiangshan, Steven Rostedt, Peter Zijlstra,
	Mathieu Desnoyers, Jiaying Zhang, Martin Bligh, Li Zefan,
	Jason Baron

On Wed, Aug 12, 2009 at 01:02:24PM -0400, Masami Hiramatsu wrote:
> 
> 
> Masami Hiramatsu wrote:
> > Hi Frederic and Jason,
> > 
> > Frederic Weisbecker wrote:
> >> Frederic Weisbecker (3):
> >>         tracing: Add ftrace event call parameter to its field descriptor handler
> > 
> >> Jason Baron (12):
> >>         tracing: Add ftrace_event_call void * 'data' field
> > 
> > Both of you added a parameter to ftrace_event_call for passing
> > syscall name (call->data) to handlers, but one passes 'ftrace_event_call *'
> > and another passes 'void *'. It seems not enough unified.
> > 
> > And also, I'm now updating my patch for 'dynamic ftrace_event_call'
> > http://lkml.org/lkml/2009/7/24/234
> > which adds 'ftrace_event_call *' for all handlers.
> > 
> > I think passing 'ftrace_event_call *' is more generic way
> > to do that. What would you think about that?
> 
> Hmm, I changed my mind that passing 'void *' is enough, since
> all other fields of ftrace_event_call will be handled in
> trace_events.c.
> 
> Thank you,



Well, actually I agree with you because:

- struct ftrace_event_call * is typed, which lets the compiler
  perform basic type checks.
  (Even though that only delays the use of a void * type through
  call->data)

- Future dynamic trace events might need other fields of struct ftrace_event_call

While adding the struct ftrace_event_call * as parameter in the show_format
callback yesterday, I first thought about applying your "dynamic ftrace
event creation" patch.

But it was just too much for what I needed.

Speaking of your patches: you said recently that you would be willing
to implement perf support for kprobes, right? :-)

I've thought about how to do that.
Ftrace events are supported by perfcounter currently but Kprobes
dynamic ftrace events are of a different nature: we must create them
before any toggling.

So a large part is already done through the ftrace events and the fact
that you create one dynamically for each kprobes (we'll just need
a little callback for perf sample submission but that's a small
point).

The largest piece of work that remains is to port the current powerful
interface for creating these k{ret}probes (with requested arguments, etc.)
from ftrace to the perf open syscall.

And I imagine it won't be trivial.

Ingo, Peter, do you have an idea of how we could do that?
We should be able to choose between a kprobe and a kretprobe (these can
be two separate counters). One must also be able to request the dump
of arbitrary parameters (or return values in the case of a kretprobe)
or registers...

Maybe we should use the perf attr by passing a __user address to a buffer
that contains all these options?
Once we get that to the kernel, it can be passed to ftrace-kprobe, which
can parse it, create the desired trace event and rely on perf to create
a counter for it.

I guess that won't require many additions to Masami's patchset. Most of
the work is in the perf tools (parsing the user request).

./perf kprobes -e (func|addr):(c|r):(a1,a2,a3,... | rax,rbx,rcx,...)
                              ^  ^
                           c = call = kprobe
                           r = return = kretprobe
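
To make the proposed event string concrete, here is a minimal user-space
sketch of how such a specifier could be split into its parts. It only
illustrates the syntax suggested above and is not perf code: the names
ProbeSpec and parse_probe_spec, and the rule that addresses carry a 0x
prefix, are assumptions made for this example.

```python
from dataclasses import dataclass
from typing import List, Optional

# Assumption: 'c' and 'r' are the only probe kinds, as in the proposal above.
KINDS = {"c": "kprobe", "r": "kretprobe"}


@dataclass
class ProbeSpec:
    target: str              # function name or address, as typed by the user
    address: Optional[int]   # set only when the target looks like 0x...
    kind: str                # "kprobe" or "kretprobe"
    args: List[str]          # requested arguments or registers, in order


def parse_probe_spec(spec: str) -> ProbeSpec:
    """Parse '(func|addr):(c|r)[:arg1,arg2,...]' into a ProbeSpec."""
    parts = spec.split(":")
    if len(parts) not in (2, 3):
        raise ValueError(f"expected target:kind[:args], got {spec!r}")

    target, kind_char = parts[0], parts[1]
    if not target:
        raise ValueError("missing function name or address")
    if kind_char not in KINDS:
        raise ValueError(f"unknown probe kind {kind_char!r} (want 'c' or 'r')")

    # The argument list is optional; empty entries are dropped.
    args = [a for a in parts[2].split(",") if a] if len(parts) == 3 else []

    # Hypothetical convention: a target starting with 0x is a raw address.
    address = int(target, 16) if target.lower().startswith("0x") else None

    return ProbeSpec(target=target, address=address,
                     kind=KINDS[kind_char], args=args)
```

With this sketch, a string such as "do_fork:c:a1,a2" would describe a kprobe
on do_fork that dumps two arguments, and a target written as a 0x address
would populate the address field instead of being treated as a symbol.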


^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [RFD] Kprobes/Kretprobes perf support
  2009-08-12 19:13     ` [RFD] Kprobes/Kretprobes " Frederic Weisbecker
@ 2009-08-12 20:20       ` Masami Hiramatsu
  2009-08-13  8:02         ` Ingo Molnar
  2009-08-12 21:09       ` Peter Zijlstra
  2009-08-14 15:05       ` Masami Hiramatsu
  2 siblings, 1 reply; 54+ messages in thread
From: Masami Hiramatsu @ 2009-08-12 20:20 UTC (permalink / raw)
  To: Frederic Weisbecker
  Cc: Ingo Molnar, LKML, Lai Jiangshan, Steven Rostedt, Peter Zijlstra,
	Mathieu Desnoyers, Jiaying Zhang, Martin Bligh, Li Zefan,
	Jason Baron



Frederic Weisbecker wrote:
> On Wed, Aug 12, 2009 at 01:02:24PM -0400, Masami Hiramatsu wrote:
>>
>> Masami Hiramatsu wrote:
>>> Hi Frederic and Jason,
>>>
>>> Frederic Weisbecker wrote:
>>>> Frederic Weisbecker (3):
>>>>          tracing: Add ftrace event call parameter to its field descriptor handler
>>>> Jason Baron (12):
>>>>          tracing: Add ftrace_event_call void * 'data' field
>>> Both of you added a parameter to ftrace_event_call for passing
>>> syscall name (call->data) to handlers, but one passes 'ftrace_event_call *'
>>> and another passes 'void *'. It seems not enough unified.
>>>
>>> And also, I'm now updating my patch for 'dynamic ftrace_event_call'
>>> http://lkml.org/lkml/2009/7/24/234
>>> which adds 'ftrace_event_call *' for all handlers.
>>>
>>> I think passing 'ftrace_event_call *' is more generic way
>>> to do that. What would you think about that?
>> Hmm, I changed my mind that passing 'void *' is enough, since
>> all other fields of ftrace_event_call will be handled in
>> trace_events.c.
>>
>> Thank you,
>
>
>
> Well, actually I agree with you because:
>
> - struct ftrace_event_call * is typed and let the compiler
>    be able to perform basic type checks.
>    (Even though that only delays the use of a void * type through
>    call->data)
>
> - Further dynamic trace events might need other fields of struct ftrace_event_call *

Hmm, so do you think passing 'struct ftrace_event_call *' is better?

> While adding the struct ftrace_event_call * as parameter in the show_format
> callback yesterday, I first thought about applying your "dynamic ftrace
> event creation" patch.
>
> But it was just too much for what I needed.

Sure, syscall events can be defined at build time.

> Speaking about your patches. You told recently you would be willing
> to implement a perf support for kprobes, right? :-)

Hmm, perhaps I meant a profiling interface (http://lkml.org/lkml/2009/7/24/240).
However, that is an interesting idea too.

> I've thought about how to do that.
> Ftrace events are supported by perfcounter currently but Kprobes
> dynamic ftrace events are of a different nature: we must create them
> before any toggling.
>
> So a large part is already done through the ftrace events and the fact
> that you create one dynamically for each kprobes (we'll just need
> a little callback for perf sample submission but that's a small
> point).

Sure, though even the current implementation differs somewhat from
tracepoint events... (currently, all of those kprobes events share the
same event id, and each event can be identified by the event ip address)


> The largest work that remains is to port the current powerful interface
> to create these k{ret}probes (with requested  arguments, etc...) through
> ftrace but using perf open syscall.
>
> And I imagine it won't be trivial.
>
> Ingo, Peter do you have an idea on how we could do that?
> We should be able to choose between a kprobe and kretprobe (these can
> be two separate counters). And also one must be able to request the dump
> of random desired parameters (or return values in case of kretprobe)
> or registers...
>
> May be we should use the perf attr by passing a __user address to a buffer
> that contains all these options?
> Once we get that to the kernel, that can be passed to ftrace-kprobe that
> can parse it, create the desired trace event and rely on perf to create
> a counter for it.
>
> I guess that won't imply so much adds to Masami's patchset. Most of
> the work is on the perf tools (parsing the user request).
>
> ./perf kprobes -e (func|addr):(c|r):(a1,a2,a3,... | rax,rbx,rcx,...)
>                                ^  ^
>                             c = call = kprobe
>                             r = return = kretprobe
>

It could work. Can it support a dereference format, such as +4(%sp)?

Thank you,

-- 
Masami Hiramatsu

Software Engineer
Hitachi Computer Products (America), Inc.
Software Solutions Division

e-mail: mhiramat@redhat.com


^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [RFD] Kprobes/Kretprobes perf support
  2009-08-12 19:13     ` [RFD] Kprobes/Kretprobes " Frederic Weisbecker
  2009-08-12 20:20       ` Masami Hiramatsu
@ 2009-08-12 21:09       ` Peter Zijlstra
  2009-08-12 21:27         ` Masami Hiramatsu
  2009-08-12 21:35         ` Frederic Weisbecker
  2009-08-14 15:05       ` Masami Hiramatsu
  2 siblings, 2 replies; 54+ messages in thread
From: Peter Zijlstra @ 2009-08-12 21:09 UTC (permalink / raw)
  To: Frederic Weisbecker
  Cc: Masami Hiramatsu, Ingo Molnar, LKML, Lai Jiangshan,
	Steven Rostedt, Mathieu Desnoyers, Jiaying Zhang, Martin Bligh,
	Li Zefan, Jason Baron

On Wed, 2009-08-12 at 21:13 +0200, Frederic Weisbecker wrote:

> Ingo, Peter do you have an idea on how we could do that?

Wouldn't it be easiest to use ftrace to create dynamic tracepoints in
the ftrace way such that they become available in
debugfs://tracing/events/kprobes/*/ and then have them interfaced the
same way as all other tracepoints?




^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [RFD] Kprobes/Kretprobes perf support
  2009-08-12 21:09       ` Peter Zijlstra
@ 2009-08-12 21:27         ` Masami Hiramatsu
  2009-08-12 21:37           ` Frederic Weisbecker
  2009-08-12 21:35         ` Frederic Weisbecker
  1 sibling, 1 reply; 54+ messages in thread
From: Masami Hiramatsu @ 2009-08-12 21:27 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Frederic Weisbecker, Ingo Molnar, LKML, Lai Jiangshan,
	Steven Rostedt, Mathieu Desnoyers, Jiaying Zhang, Martin Bligh,
	Li Zefan, Jason Baron

Peter Zijlstra wrote:
> On Wed, 2009-08-12 at 21:13 +0200, Frederic Weisbecker wrote:
> 
>> Ingo, Peter do you have an idea on how we could do that?
> 
> Wouldn't it be easiest to use ftrace to create dynamic tracepoints in
> the ftrace way such that they become available in
> debugfs://tracing/events/kprobes/*/ and then have them interfaced the
> same way as all other tracepoints?

Yes, it is almost the same. One big difference is that they share
the same event ids (TRACE_KPROBE and TRACE_KRETPROBE).
But I can give each kprobe event a different id, if you need that.

Thank you,

-- 
Masami Hiramatsu

Software Engineer
Hitachi Computer Products (America), Inc.
Software Solutions Division

e-mail: mhiramat@redhat.com


^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [RFD] Kprobes/Kretprobes perf support
  2009-08-12 21:09       ` Peter Zijlstra
  2009-08-12 21:27         ` Masami Hiramatsu
@ 2009-08-12 21:35         ` Frederic Weisbecker
  1 sibling, 0 replies; 54+ messages in thread
From: Frederic Weisbecker @ 2009-08-12 21:35 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Masami Hiramatsu, Ingo Molnar, LKML, Lai Jiangshan,
	Steven Rostedt, Mathieu Desnoyers, Jiaying Zhang, Martin Bligh,
	Li Zefan, Jason Baron

On Wed, Aug 12, 2009 at 11:09:57PM +0200, Peter Zijlstra wrote:
> On Wed, 2009-08-12 at 21:13 +0200, Frederic Weisbecker wrote:
> 
> > Ingo, Peter do you have an idea on how we could do that?
> 
> Wouldn't it be easiest to use ftrace to create dynamic tracepoints in
> the ftrace way such that they become available in
> debugfs://tracing/events/kprobes/*/ and then have them interfaced the
> same way as all other tracepoints?
> 


Oh... yeah, that would definitely be simpler.


^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [RFD] Kprobes/Kretprobes perf support
  2009-08-12 21:27         ` Masami Hiramatsu
@ 2009-08-12 21:37           ` Frederic Weisbecker
  0 siblings, 0 replies; 54+ messages in thread
From: Frederic Weisbecker @ 2009-08-12 21:37 UTC (permalink / raw)
  To: Masami Hiramatsu
  Cc: Peter Zijlstra, Ingo Molnar, LKML, Lai Jiangshan, Steven Rostedt,
	Mathieu Desnoyers, Jiaying Zhang, Martin Bligh, Li Zefan,
	Jason Baron

On Wed, Aug 12, 2009 at 05:27:16PM -0400, Masami Hiramatsu wrote:
> Peter Zijlstra wrote:
> > On Wed, 2009-08-12 at 21:13 +0200, Frederic Weisbecker wrote:
> > 
> >> Ingo, Peter do you have an idea on how we could do that?
> > 
> > Wouldn't it be easiest to use ftrace to create dynamic tracepoints in
> > the ftrace way such that they become available in
> > debugfs://tracing/events/kprobes/*/ and then have them interfaced the
> > same way as all other tracepoints?
> 
> Yes, almost same. One big difference is that they are sharing
> same event-id(TRACE_KPROBE and TRACE_KRETPROBE).
> But I can make each kprobe events to have different ids, if you need.
> 
> Thank you,

Yeah that would be better. That's also what Jason did with the syscall events.


^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [RFD] Kprobes/Kretprobes perf support
  2009-08-12 20:20       ` Masami Hiramatsu
@ 2009-08-13  8:02         ` Ingo Molnar
  0 siblings, 0 replies; 54+ messages in thread
From: Ingo Molnar @ 2009-08-13  8:02 UTC (permalink / raw)
  To: Masami Hiramatsu
  Cc: Frederic Weisbecker, LKML, Lai Jiangshan, Steven Rostedt,
	Peter Zijlstra, Mathieu Desnoyers, Jiaying Zhang, Martin Bligh,
	Li Zefan, Jason Baron


* Masami Hiramatsu <mhiramat@redhat.com> wrote:

>> Speaking about your patches. You told recently you would be 
>> willing to implement a perf support for kprobes, right? :-)
>
> Hmm, perhaps, I meant a profiling 
> interface(http://lkml.org/lkml/2009/7/24/240). However, that is 
> interesting idea too.

Note that profiling via perf and perfcounters is a very young 
project, but already far more capable:

 - It is a generic framework. If you provide an event source, the
   full framework will understand and support your events and will
   expose them to users: 'perf stat', 'perf record', 'perf report'
   and 'perf top' all work, and the upcoming 'perf trace' and the
   upcoming 'perf view' GUI will all understand them. These
   are all different modes of analysis, from the high-level
   statistics bits, through profiling, down to lowlevel tracing -
   based on the same stream of data.

 - It's very flexible: there's per cpu, per task or per workload 
   hierarchy stats/profiling/tracing.

 - It can do call-chain graph recording/reporting
   (try "perf record -g -f -a sleep 1" + "perf report")

 - There's a standard syscall interface to all this, making it 
   readily accessible and pushing it into apps and tools.

 - Your events can and will mix with all the other events. So there 
   can be hardware PMU events, software counters, tracepoints, etc. 
   - all in the same data file.

So while there's still a lot of work to do all across the 
perfcounters spectrum, it generally would be nice to expose kprobes 
events via perfcounters.

	Ingo

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [RFD] Kprobes/Kretprobes perf support
  2009-08-12 19:13     ` [RFD] Kprobes/Kretprobes " Frederic Weisbecker
  2009-08-12 20:20       ` Masami Hiramatsu
  2009-08-12 21:09       ` Peter Zijlstra
@ 2009-08-14 15:05       ` Masami Hiramatsu
  2009-08-15 14:33         ` Ingo Molnar
  2 siblings, 1 reply; 54+ messages in thread
From: Masami Hiramatsu @ 2009-08-14 15:05 UTC (permalink / raw)
  To: Frederic Weisbecker
  Cc: Ingo Molnar, LKML, Lai Jiangshan, Steven Rostedt, Peter Zijlstra,
	Mathieu Desnoyers, Jiaying Zhang, Martin Bligh, Li Zefan,
	Jason Baron

Frederic Weisbecker wrote:
> Ftrace events are supported by perfcounter currently but Kprobes
> dynamic ftrace events are of a different nature: we must create them
> before any toggling.
>
> So a large part is already done through the ftrace events and the fact
> that you create one dynamically for each kprobes (we'll just need
> a little callback for perf sample submission but that's a small
> point).
>
> The largest work that remains is to port the current powerful interface
> to create these k{ret}probes (with requested  arguments, etc...) through
> ftrace but using perf open syscall.
>
> And I imagine it won't be trivial.
>
> Ingo, Peter do you have an idea on how we could do that?
> We should be able to choose between a kprobe and kretprobe (these can
> be two separate counters). And also one must be able to request the dump
> of random desired parameters (or return values in case of kretprobe)
> or registers...
>
> May be we should use the perf attr by passing a __user address to a buffer
> that contains all these options?
> Once we get that to the kernel, that can be passed to ftrace-kprobe that
> can parse it, create the desired trace event and rely on perf to create
> a counter for it.
>
> I guess that won't imply so much adds to Masami's patchset. Most of
> the work is on the perf tools (parsing the user request).
>
> ./perf kprobes -e (func|addr):(c|r):(a1,a2,a3,... | rax,rbx,rcx,...)
>                                ^  ^
>                             c = call = kprobe
>                             r = return = kretprobe

If libdwarf can be linked into the perf tool, I think it would be
better to support the 'C source line/local variable' style too,
because the basic dwarf decoding logic has already been done in c2kpe,
which I posted yesterday :-).

Thank you,

-- 
Masami Hiramatsu

Software Engineer
Hitachi Computer Products (America), Inc.
Software Solutions Division

e-mail: mhiramat@redhat.com


^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [RFD] Kprobes/Kretprobes perf support
  2009-08-14 15:05       ` Masami Hiramatsu
@ 2009-08-15 14:33         ` Ingo Molnar
  2009-08-17 21:58           ` Masami Hiramatsu
  0 siblings, 1 reply; 54+ messages in thread
From: Ingo Molnar @ 2009-08-15 14:33 UTC (permalink / raw)
  To: Masami Hiramatsu
  Cc: Frederic Weisbecker, LKML, Lai Jiangshan, Steven Rostedt,
	Peter Zijlstra, Mathieu Desnoyers, Jiaying Zhang, Martin Bligh,
	Li Zefan, Jason Baron


* Masami Hiramatsu <mhiramat@redhat.com> wrote:

> Frederic Weisbecker wrote:
>> Ftrace events are supported by perfcounter currently but Kprobes
>> dynamic ftrace events are of a different nature: we must create them
>> before any toggling.
>>
>> So a large part is already done through the ftrace events and the fact
>> that you create one dynamically for each kprobes (we'll just need
>> a little callback for perf sample submission but that's a small
>> point).
>>
>> The largest work that remains is to port the current powerful interface
>> to create these k{ret}probes (with requested  arguments, etc...) through
>> ftrace but using perf open syscall.
>>
>> And I imagine it won't be trivial.
>>
>> Ingo, Peter do you have an idea on how we could do that?
>> We should be able to choose between a kprobe and kretprobe (these can
>> be two separate counters). And also one must be able to request the dump
>> of random desired parameters (or return values in case of kretprobe)
>> or registers...
>>
>> May be we should use the perf attr by passing a __user address to a buffer
>> that contains all these options?
>> Once we get that to the kernel, that can be passed to ftrace-kprobe that
>> can parse it, create the desired trace event and rely on perf to create
>> a counter for it.
>>
>> I guess that won't imply so much adds to Masami's patchset. Most of
>> the work is on the perf tools (parsing the user request).
>>
>> ./perf kprobes -e (func|addr):(c|r):(a1,a2,a3,... | rax,rbx,rcx,...)
>>                                ^  ^
>>                             c = call = kprobe
>>                             r = return = kretprobe
>
> If it is possible that libdwarf can be linked to the perf tool, I 
> think it might be better to support 'C source line/local variable' 
> style too, because basic dwarf decoding logic has already been 
> done in c2kpe which I posted yesterday :-).

Sure - we can link it - and C/source syntax beats everything else, 
hands down. We can also do the kind of automatic 'conditional 
linking' we do for C++ symbol demangling - i.e. if libdwarf is not 
installed we just emit a warning and don't build that functionality
but otherwise perf will still be built fine.

Thus there will be no dependency on libdwarf.

One thing that occurred to me is that it would be nice to have sanity
checks of all sorts. For example we could expose the md5sum of the 
kernel image and perf would double check against that when looking 
around in the debuginfo - or something like that.

	Ingo

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [RFD] Kprobes/Kretprobes perf support
  2009-08-15 14:33         ` Ingo Molnar
@ 2009-08-17 21:58           ` Masami Hiramatsu
  0 siblings, 0 replies; 54+ messages in thread
From: Masami Hiramatsu @ 2009-08-17 21:58 UTC (permalink / raw)
  To: Ingo Molnar, Frederic Weisbecker
  Cc: LKML, Lai Jiangshan, Steven Rostedt, Peter Zijlstra,
	Mathieu Desnoyers, Jiaying Zhang, Martin Bligh, Li Zefan,
	Jason Baron

Ingo Molnar wrote:
> 
> * Masami Hiramatsu <mhiramat@redhat.com> wrote:
> 
>> Frederic Weisbecker wrote:
>>> Ftrace events are supported by perfcounter currently but Kprobes
>>> dynamic ftrace events are of a different nature: we must create them
>>> before any toggling.
>>>
>>> So a large part is already done through the ftrace events and the fact
>>> that you create one dynamically for each kprobes (we'll just need
>>> a little callback for perf sample submission but that's a small
>>> point).
>>>
>>> The largest work that remains is to port the current powerful interface
>>> to create these k{ret}probes (with requested  arguments, etc...) through
>>> ftrace but using perf open syscall.
>>>
>>> And I imagine it won't be trivial.
>>>
>>> Ingo, Peter do you have an idea on how we could do that?
>>> We should be able to choose between a kprobe and kretprobe (these can
>>> be two separate counters). And also one must be able to request the dump
>>> of random desired parameters (or return values in case of kretprobe)
>>> or registers...
>>>
>>> May be we should use the perf attr by passing a __user address to a buffer
>>> that contains all these options?
>>> Once we get that to the kernel, that can be passed to ftrace-kprobe that
>>> can parse it, create the desired trace event and rely on perf to create
>>> a counter for it.
>>>
>>> I guess that won't imply so much adds to Masami's patchset. Most of
>>> the work is on the perf tools (parsing the user request).
>>>
>>> ./perf kprobes -e (func|addr):(c|r):(a1,a2,a3,... | rax,rbx,rcx,...)
>>>                                ^  ^
>>>                             c = call = kprobe
>>>                             r = return = kretprobe

It would be better to support C/source syntax too.

./perf kprobes [-m kmod] [-k vmlinux] -e event-definition [-a arg-definition]
 or
./perf kprobes [-m kmod] [-k vmlinux] -f definition-file
 or
./perf kprobes [-m kmod] [-k vmlinux] -

event-definition:
 (p|r):[event-name]:probepoint

p = kprobe
r = kretprobe

probepoint (with debuginfo):
 function[+offs][@file]
 or
 @file:line

probepoint (without debuginfo):
 function[+offs]
 or
 address

arg-definition:
 a1,a2,a3,... | %ax,%bx,%cx,...| $var1,$var2,...

$var1,... are converted to register or memory address
by using debuginfo.

Thus, you can use perf like this.

./perf kprobes -e p::@mm/filemap.c:339 -a $inode,$pos
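
As a rough illustration of the grammar above, the sketch below parses an
event-definition and an arg-definition in user space. It is not part of any
posted patch: EventDef, parse_event_definition and classify_args are
hypothetical names, addresses are assumed to carry a 0x prefix, and the
debuginfo lookup that would turn $var names into registers or memory
addresses is left out entirely.

```python
import re
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class EventDef:
    kind: str                       # "kprobe" or "kretprobe"
    name: Optional[str]             # optional event name, None if omitted
    function: Optional[str] = None  # function[+offs][@file] form
    offset: int = 0
    file: Optional[str] = None      # source file, when one is given
    line: Optional[int] = None      # @file:line form
    address: Optional[int] = None   # raw address form


# function[+offs][@file]
_FUNC_RE = re.compile(
    r"^([A-Za-z_][A-Za-z0-9_.]*)(?:\+(0x[0-9a-fA-F]+|\d+))?(?:@(.+))?$")
# @file:line
_LINE_RE = re.compile(r"^@(.+):(\d+)$")


def parse_event_definition(text: str) -> EventDef:
    """Parse '(p|r):[event-name]:probepoint'."""
    # Split at most twice: the probepoint itself may contain ':' (file:line).
    fields = text.split(":", 2)
    if len(fields) != 3:
        raise ValueError(f"expected (p|r):[name]:probepoint, got {text!r}")
    kind_char, name, point = fields
    if kind_char not in ("p", "r"):
        raise ValueError(f"unknown probe kind {kind_char!r}")

    ev = EventDef(kind="kprobe" if kind_char == "p" else "kretprobe",
                  name=name or None)

    m = _LINE_RE.match(point)
    if m:                                    # @file:line
        ev.file, ev.line = m.group(1), int(m.group(2))
        return ev
    if point.lower().startswith("0x"):       # assumed address notation
        ev.address = int(point, 16)
        return ev
    m = _FUNC_RE.match(point)
    if not m:
        raise ValueError(f"cannot parse probepoint {point!r}")
    ev.function = m.group(1)
    offs = m.group(2)
    if offs:
        ev.offset = int(offs, 16) if offs.lower().startswith("0x") else int(offs)
    ev.file = m.group(3)
    return ev


def classify_args(text: str) -> List[Tuple[str, str]]:
    """Split 'a1,%ax,$var' into (class, name) pairs."""
    out = []
    for arg in filter(None, (a.strip() for a in text.split(","))):
        if arg.startswith("%"):
            out.append(("register", arg[1:]))
        elif arg.startswith("$"):
            out.append(("variable", arg[1:]))   # needs debuginfo to resolve
        else:
            out.append(("positional", arg))
    return out
```

Splitting on at most two colons is the one design point worth noting: it lets
the @file:line form keep its own colon, so the example from this mail,
p::@mm/filemap.c:339, parses as a kprobe with no event name at line 339 of
mm/filemap.c.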


>>
>> If it is possible that libdwarf can be linked to the perf tool, I 
>> think it might be better to support 'C source line/local variable' 
>> style too, because basic dwarf decoding logic has already been 
>> done in c2kpe which I posted yesterday :-).
> 
> Sure - we can link it - and C/source syntax beats everything else, 
> hands down. We can also do the kind of automatic 'conditional 
> linking' we do for C++ symbol demangling - i.e. if libdwarf is not 
> installed we just emit a warning and don't build that functionality 
> but otherwise perf will still be built fine.
> 
> Thus there will be no dependency on libdwarf.

That's fine with me. If there is no libdwarf or CONFIG_DEBUG_INFO=n (or
perf just can't find vmlinux), 'perf kprobes' simply doesn't accept
C/source syntax (and falls back to symbol/address syntax).

> 
> One thing that occurred to me is that it would be nice to have sanity 
> checks of all sorts. For example we could expose the md5sum of the 
> kernel image and perf would double check against that when looking 
> around in the debuginfo - or something like that.

Systemtap already does this with build-id:
http://sources.redhat.com/bugzilla/show_bug.cgi?id=4886

Perhaps, we can also use build-id to check the vmlinux version.
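
A build-id comparison of this kind could look roughly like the sketch below.
It walks a blob of ELF notes and pulls out the GNU build-id, so that the
value from the running kernel can be compared with the one in a vmlinux
file. The function names are hypothetical, and how the note bytes are
obtained is an assumption left outside the sketch (for instance, the notes
the kernel exports through sysfs on one side and the note section of vmlinux
on the other).

```python
import struct
from typing import Optional

NT_GNU_BUILD_ID = 3        # note type used for the GNU build-id
GNU_NAME = b"GNU\0"        # note owner name, including the trailing NUL


def _align4(n: int) -> int:
    """Round n up to the next multiple of 4 (ELF note padding)."""
    return (n + 3) & ~3


def find_build_id(notes: bytes, little_endian: bool = True) -> Optional[str]:
    """Return the build-id as a hex string, or None if no such note exists."""
    header = "<III" if little_endian else ">III"
    pos = 0
    while pos + 12 <= len(notes):
        # Each note starts with three 32-bit words: namesz, descsz, type.
        namesz, descsz, ntype = struct.unpack_from(header, notes, pos)
        pos += 12
        name = notes[pos:pos + namesz]
        pos += _align4(namesz)
        desc = notes[pos:pos + descsz]
        if len(desc) < descsz:
            break                      # truncated note, stop scanning
        pos += _align4(descsz)
        if ntype == NT_GNU_BUILD_ID and name == GNU_NAME:
            return desc.hex()
    return None


def same_kernel(running_notes: bytes, vmlinux_notes: bytes) -> bool:
    """True only when both blobs carry the same GNU build-id."""
    running = find_build_id(running_notes)
    return running is not None and running == find_build_id(vmlinux_notes)
```

Treating a missing build-id as a mismatch is a deliberate choice in this
sketch: a tool that cannot verify the vmlinux image would then refuse the
C/source syntax rather than silently use stale debuginfo.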

Thank you,

-- 
Masami Hiramatsu

Software Engineer
Hitachi Computer Products (America), Inc.
Software Solutions Division

e-mail: mhiramat@redhat.com


^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [GIT PULL] tracing: Syscalls trace events + perf support
  2009-08-12  9:11 ` [GIT PULL] tracing: Syscalls trace events + perf support Ingo Molnar
  2009-08-12 11:03   ` Ingo Molnar
@ 2009-08-18  0:46   ` Paul Mundt
  2009-08-18  7:32       ` Ingo Molnar
  2009-08-18 10:25       ` Frederic Weisbecker
  1 sibling, 2 replies; 54+ messages in thread
From: Paul Mundt @ 2009-08-18  0:46 UTC (permalink / raw)
  To: Ingo Molnar, Stephen Rothwell, Jason Baron
  Cc: Frederic Weisbecker, LKML, Lai Jiangshan, Steven Rostedt,
	Peter Zijlstra, Mathieu Desnoyers, Jiaying Zhang, Martin Bligh,
	Li Zefan, Masami Hiramatsu, Martin Schwidefsky, Wu Zhangjin,
	linux-next

[ Adding to Cc everyone that now has a broken tree thanks to this .. ]

On Wed, Aug 12, 2009 at 11:11:33AM +0200, Ingo Molnar wrote:
> * Frederic Weisbecker <fweisbec@gmail.com> wrote:
> > This pull request integrate one cleanup/fix for ftrace and an 
> > update for syscall tracing: the migration from old-style tracer to 
> > individual tracepoints/trace_events and the support for perf 
> > counter.
> > 
> > I've tested it with success either with ftrace (every syscall 
> > tracepoints enabled at the same time without problems) and with 
> > perfcounter.
> > 
> > May be one drawback: it creates so much trace events that the 
> > ftrace selftests can take some time :-)
> 
> Pulled, thanks a lot!
> 
And this has now subsequently broken every single SH and S390
configuration, and anyone else unfortunate enough to be supporting ftrace
syscall tracing that isn't x86, without so much as a Cc, well done!

The s390 case can be fixed up in-tree as support has already been merged,
but in the SH case we had ftrace syscall tracing queued up for 2.6.32, so
it doesn't show up in -tip, but the end result in -next is now completely
broken.

I'm not sure how we should handle this. If tracing/core in -tip isn't
rebased, should I just pull the topic branch into my tree, fix up the sh
support on top of that, and push the end result out? This seems like the
easiest option at least, but I don't know what other dependencies exist
for tracing/core. Alternative suggestions welcome.

This happens again and again with ftrace and -tip, where people just
randomly change existing interfaces, break all of the existing users, and
then fail to tell anyone about it until it shows up in -next. Even if we
had pushed all of the sh ftrace bits to the -tip tree early on it would
not have changed anything, as is evident from the fact that s390 and all of
the non ftrace syscall architectures were broken by this change as well (the
latter case was at least caught and corrected, although not by the
original authors of this patch series). Is it really that much to ask
that people who are running around breaking ftrace interfaces actually
bother to Cc the architectures that are using it?

If -tip is going to perpetuate this sort of half-assed development
methodology, it has no place in -next.

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [GIT PULL] tracing: Syscalls trace events + perf support
  2009-08-18  0:46   ` Paul Mundt
@ 2009-08-18  7:32       ` Ingo Molnar
  2009-08-18 10:25       ` Frederic Weisbecker
  1 sibling, 0 replies; 54+ messages in thread
From: Ingo Molnar @ 2009-08-18  7:32 UTC (permalink / raw)
  To: Paul Mundt, Stephen Rothwell, Jason Baron, Frederic Weisbecker,
	LKML, Lai Jiangshan, Steven Rostedt, Peter Zijlstra,
	Mathieu Desnoyers, Jiaying Zhang, Martin Bligh, Li Zefan,
	Masami Hiramatsu, Martin Schwidefsky, Wu Zhangjin, linux-next


* Paul Mundt <lethal@linux-sh.org> wrote:

> [ Adding to Cc everyone that now has a broken tree thanks to this .. ]
> 
> On Wed, Aug 12, 2009 at 11:11:33AM +0200, Ingo Molnar wrote:
> > * Frederic Weisbecker <fweisbec@gmail.com> wrote:
> > > This pull request integrate one cleanup/fix for ftrace and an 
> > > update for syscall tracing: the migration from old-style tracer to 
> > > individual tracepoints/trace_events and the support for perf 
> > > counter.
> > > 
> > > I've tested it with success either with ftrace (every syscall 
> > > tracepoints enabled at the same time without problems) and with 
> > > perfcounter.
> > > 
> > > May be one drawback: it creates so much trace events that the 
> > > ftrace selftests can take some time :-)
> > 
> > Pulled, thanks a lot!
> 
> And this has now subsequently broken every single SH and S390 
> configuration, [...]

I test SH cross-builds regularly. I just checked the SH defconfig 
and it builds just fine here:

$ make -j32 CROSS_COMPILE=sh3-linux- ARCH=sh vmlinux

...
  CC      init/version.o
  LD      init/built-in.o
  LD      .tmp_vmlinux1
  KSYM    .tmp_kallsyms1.S
  AS      .tmp_kallsyms1.o
  LD      .tmp_vmlinux2
  KSYM    .tmp_kallsyms2.S
  AS      .tmp_kallsyms2.o
  LD      vmlinux
  SYSMAP  System.map
  SYSMAP  .tmp_System.map

 phoenix:~/linux/linux> head .config 
 #
 # Automatically generated make config: don't edit
 # Linux kernel version: 2.6.31-rc6
 # Tue Aug 18 09:24:28 2009
 #
 CONFIG_SUPERH=y
 CONFIG_SUPERH32=y
 # CONFIG_SUPERH64 is not set
 CONFIG_ARCH_DEFCONFIG="arch/sh/configs/shx3_defconfig"

AFAICS SH does not even have any syscall tracing added upstream. 
Apparently you added them in the SH tree and then they got 
integrated in linux-next, and the integrated end result broke?

Mind putting those bits into a separate Git branch and sending them 
to the tracing tree too so that we can make sure it's properly 
integrated and tested and that any changes to the generic facility 
are propagated to SH too?

Thanks,

	Ingo

^ permalink raw reply	[flat|nested] 54+ messages in thread

* [S390] ftrace: update system call tracer support
  2009-08-18  7:32       ` Ingo Molnar
  (?)
  (?)
@ 2009-08-18  8:51       ` Ingo Molnar
  2009-08-18  8:59         ` Martin Schwidefsky
  2009-08-18 14:56         ` Jason Baron
  -1 siblings, 2 replies; 54+ messages in thread
From: Ingo Molnar @ 2009-08-18  8:51 UTC (permalink / raw)
  To: Paul Mundt, Stephen Rothwell, Jason Baron, Frederic Weisbecker,
	LKML, Lai Jiangshan, Steven Rostedt, Peter Zijlstra,
	Mathieu Desnoyers, Jiaying Zhang, Martin Bligh, Li Zefan,
	Masami Hiramatsu, Martin Schwidefsky, Wu Zhangjin, linux-next,
	Heiko Carstens


* Ingo Molnar <mingo@elte.hu> wrote:

> * Paul Mundt <lethal@linux-sh.org> wrote:
> 
> > [ Adding to Cc everyone that now has a broken tree thanks to this .. ]
> > 
> > On Wed, Aug 12, 2009 at 11:11:33AM +0200, Ingo Molnar wrote:
> > > * Frederic Weisbecker <fweisbec@gmail.com> wrote:
> > > > This pull request integrate one cleanup/fix for ftrace and an 
> > > > update for syscall tracing: the migration from old-style tracer to 
> > > > individual tracepoints/trace_events and the support for perf 
> > > > counter.
> > > > 
> > > > I've tested it with success either with ftrace (every syscall 
> > > > tracepoints enabled at the same time without problems) and with 
> > > > perfcounter.
> > > > 
> > > > May be one drawback: it creates so much trace events that the 
> > > > ftrace selftests can take some time :-)
> > > 
> > > Pulled, thanks a lot!
> > 
> > And this has now subsequently broken every single SH and S390 
> > configuration, [...]
> 
> I test SH cross-builds regularly. I just checked the SH defconfig 
> and it builds just fine here:
> 
> $ make -j32 CROSS_COMPILE=sh3-linux- ARCH=sh vmlinux

The s390 build indeed broke. (This got masked by the s390 toolchain 
i'm using not having been able to build Linus's tree - i fixed 
that.)

Could you try the fix below? It does the trick here.

Martin, Heiko - does the fix look good to you? regs->gprs[2] seems 
to be the register used for both the syscall number (enter 
callback) and for the return code (exit callback).
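
To illustrate the point about regs->gprs[2], here is a minimal user-space sketch, not kernel code. The struct pt_regs below is a hypothetical stand-in that models only gprs[], the two trace hooks are mocks that merely record their second argument, and last_enter_id / last_exit_ret are names invented for this illustration. It only shows the two-argument calling shape used in the patch below, assuming gprs[2] holds the syscall number on entry and the return code on exit:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the s390 struct pt_regs; only gprs[] is modelled. */
struct pt_regs {
	unsigned long gprs[16];
};

/* Illustration-only globals recording what the mock hooks received. */
static long last_enter_id;
static long last_exit_ret;

/* Mock of the new-style entry hook: takes the syscall number explicitly. */
static void trace_syscall_enter(struct pt_regs *regs, long id)
{
	assert(regs != NULL);
	last_enter_id = id;
}

/* Mock of the new-style exit hook: takes the return code explicitly. */
static void trace_syscall_exit(struct pt_regs *regs, long ret)
{
	assert(regs != NULL);
	last_exit_ret = ret;
}

/* On entry, gprs[2] is assumed to carry the syscall number. */
static void do_syscall_trace_enter(struct pt_regs *regs)
{
	trace_syscall_enter(regs, (long)regs->gprs[2]);
}

/* On exit, the same slot is assumed to carry the return code. */
static void do_syscall_trace_exit(struct pt_regs *regs)
{
	trace_syscall_exit(regs, (long)regs->gprs[2]);
}
```

The old single-argument hooks had to derive these values from regs themselves; in this sketch the arch code passes them in, which is why each call site changes by one line.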

Regarding SH, the fixup should be similarly trivial. Since SH's 
FTRACE_SYSCALLS code is not upstream yet it can (and should) be 
carried in the tree that integrates the SH tree and the tracing 
tree - linux-next in this case.

Thanks,

	Ingo

------------------------------>
>From a9008fd42b1c3c89f684d90bdfb9c2d05c7af119 Mon Sep 17 00:00:00 2001
From: Ingo Molnar <mingo@elte.hu>
Date: Tue, 18 Aug 2009 10:41:57 +0200
Subject: [PATCH] [S390] ftrace: update system call tracer support

Commit fb34a08c3 ("tracing: Add trace events for each syscall
entry/exit") changed the lowlevel API to ftrace syscall tracing
but did not update s390 which started making use of it recently.

This broke the s390 build, as reported by Paul Mundt.

Update the callbacks with the syscall number and the syscall
return code values. This allows per syscall tracepoints,
syscall argument enumeration /debug/tracing/events/syscalls/
and perfcounters support and integration on s390 too.

Reported-by: Paul Mundt <lethal@linux-sh.org>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Jason Baron <jbaron@redhat.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
LKML-Reference: <tip-fb34a08c3469b2be9eae626ccb96476b4687b810@git.kernel.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
---
 arch/s390/kernel/ptrace.c |    4 ++--
 1 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/s390/kernel/ptrace.c b/arch/s390/kernel/ptrace.c
index 43acd73..05f57cd 100644
--- a/arch/s390/kernel/ptrace.c
+++ b/arch/s390/kernel/ptrace.c
@@ -662,7 +662,7 @@ asmlinkage long do_syscall_trace_enter(struct pt_regs *regs)
 	}
 
 	if (unlikely(test_thread_flag(TIF_SYSCALL_FTRACE)))
-		ftrace_syscall_enter(regs);
+		trace_syscall_enter(regs, regs->gprs[2]);
 
 	if (unlikely(current->audit_context))
 		audit_syscall_entry(is_compat_task() ?
@@ -680,7 +680,7 @@ asmlinkage void do_syscall_trace_exit(struct pt_regs *regs)
 				   regs->gprs[2]);
 
 	if (unlikely(test_thread_flag(TIF_SYSCALL_FTRACE)))
-		ftrace_syscall_exit(regs);
+		trace_syscall_exit(regs, regs->gprs[2]);
 
 	if (test_thread_flag(TIF_SYSCALL_TRACE))
 		tracehook_report_syscall_exit(regs, 0);

^ permalink raw reply related	[flat|nested] 54+ messages in thread

* Re: [S390] ftrace: update system call tracer support
  2009-08-18  8:51       ` Ingo Molnar
@ 2009-08-18  8:59         ` Martin Schwidefsky
  2009-08-18 10:05           ` Ingo Molnar
  2009-08-18 14:56         ` Jason Baron
  1 sibling, 1 reply; 54+ messages in thread
From: Martin Schwidefsky @ 2009-08-18  8:59 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Paul Mundt, Stephen Rothwell, Jason Baron, Frederic Weisbecker,
	LKML, Lai Jiangshan, Steven Rostedt, Peter Zijlstra,
	Mathieu Desnoyers, Jiaying Zhang, Martin Bligh, Li Zefan,
	Masami Hiramatsu, Wu Zhangjin, linux-next, Heiko Carstens

On Tue, 18 Aug 2009 10:51:10 +0200
Ingo Molnar <mingo@elte.hu> wrote:

> 
> * Ingo Molnar <mingo@elte.hu> wrote:
> 
> > * Paul Mundt <lethal@linux-sh.org> wrote:
> > 
> > > [ Adding to Cc everyone that now has a broken tree thanks to this .. ]
> > > 
> > > On Wed, Aug 12, 2009 at 11:11:33AM +0200, Ingo Molnar wrote:
> > > > * Frederic Weisbecker <fweisbec@gmail.com> wrote:
> > > > > This pull request integrate one cleanup/fix for ftrace and an 
> > > > > update for syscall tracing: the migration from old-style tracer to 
> > > > > individual tracepoints/trace_events and the support for perf 
> > > > > counter.
> > > > > 
> > > > > I've tested it with success either with ftrace (every syscall 
> > > > > tracepoints enabled at the same time without problems) and with 
> > > > > perfcounter.
> > > > > 
> > > > > May be one drawback: it creates so much trace events that the 
> > > > > ftrace selftests can take some time :-)
> > > > 
> > > > Pulled, thanks a lot!
> > > 
> > > And this has now subsequently broken every single SH and S390 
> > > configuration, [...]
> > 
> > I test SH cross-builds regularly. I just checked the SH defconfig 
> > and it builds just fine here:
> > 
> > $ make -j32 CROSS_COMPILE=sh3-linux- ARCH=sh vmlinux
> 
> The s390 build indeed broke. (This got masked by the s390 toolchain 
> i'm using not having been able to build Linus's tree - i fixed 
> that.)
> 
> Could you try the fix below? It does the trick here.
> 
> Martin, Heiko - does the fix look good to you? regs->gprs[2] seems 
> to be the register used for both the syscall number (enter 
> callback) and for the return code (exit callback).

Correct, for do_syscall_trace_{enter,exit} the code in entry.S stores
the system call number in regs->gprs[2]. The fix is fine.
Thanks Ingo.

-- 
blue skies,
   Martin.

"Reality continues to ruin my life." - Calvin.


^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [S390] ftrace: update system call tracer support
  2009-08-18  8:59         ` Martin Schwidefsky
@ 2009-08-18 10:05           ` Ingo Molnar
  2009-08-18 10:22             ` Martin Schwidefsky
  0 siblings, 1 reply; 54+ messages in thread
From: Ingo Molnar @ 2009-08-18 10:05 UTC (permalink / raw)
  To: Martin Schwidefsky
  Cc: Paul Mundt, Stephen Rothwell, Jason Baron, Frederic Weisbecker,
	LKML, Lai Jiangshan, Steven Rostedt, Peter Zijlstra,
	Mathieu Desnoyers, Jiaying Zhang, Martin Bligh, Li Zefan,
	Masami Hiramatsu, Wu Zhangjin, linux-next, Heiko Carstens


* Martin Schwidefsky <schwidefsky@de.ibm.com> wrote:

> On Tue, 18 Aug 2009 10:51:10 +0200
> Ingo Molnar <mingo@elte.hu> wrote:
> 
> > 
> > * Ingo Molnar <mingo@elte.hu> wrote:
> > 
> > > * Paul Mundt <lethal@linux-sh.org> wrote:
> > > 
> > > > [ Adding to Cc everyone that now has a broken tree thanks to this .. ]
> > > > 
> > > > On Wed, Aug 12, 2009 at 11:11:33AM +0200, Ingo Molnar wrote:
> > > > > * Frederic Weisbecker <fweisbec@gmail.com> wrote:
> > > > > > This pull request integrate one cleanup/fix for ftrace and an 
> > > > > > update for syscall tracing: the migration from old-style tracer to 
> > > > > > individual tracepoints/trace_events and the support for perf 
> > > > > > counter.
> > > > > > 
> > > > > > I've tested it with success either with ftrace (every syscall 
> > > > > > tracepoints enabled at the same time without problems) and with 
> > > > > > perfcounter.
> > > > > > 
> > > > > > May be one drawback: it creates so much trace events that the 
> > > > > > ftrace selftests can take some time :-)
> > > > > 
> > > > > Pulled, thanks a lot!
> > > > 
> > > > And this has now subsequently broken every single SH and S390 
> > > > configuration, [...]
> > > 
> > > I test SH cross-builds regularly. I just checked the SH defconfig 
> > > and it builds just fine here:
> > > 
> > > $ make -j32 CROSS_COMPILE=sh3-linux- ARCH=sh vmlinux
> > 
> > The s390 build indeed broke. (This got masked by the s390 toolchain 
> > i'm using not having been able to build Linus's tree - i fixed 
> > that.)
> > 
> > Could you try the fix below? It does the trick here.
> > 
> > Martin, Heiko - does the fix look good to you? regs->gprs[2] seems 
> > to be the register used for both the syscall number (enter 
> > callback) and for the return code (exit callback).
> 
> Correct, for do_syscall_trace_{enter,exit} the code in entry.S 
> stores the system call number in regs->gprs[2]. The fix is fine. 
> Thanks Ingo.

Thanks, i've added your Acked-by to the commit. I suspect you dont 
want to pull tracing infrastructure changes into the S390 tree, so 
keeping this fix in the tracing tree would be the best option?

	Ingo

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [S390] ftrace: update system call tracer support
  2009-08-18 10:05           ` Ingo Molnar
@ 2009-08-18 10:22             ` Martin Schwidefsky
  0 siblings, 0 replies; 54+ messages in thread
From: Martin Schwidefsky @ 2009-08-18 10:22 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Paul Mundt, Stephen Rothwell, Jason Baron, Frederic Weisbecker,
	LKML, Lai Jiangshan, Steven Rostedt, Peter Zijlstra,
	Mathieu Desnoyers, Jiaying Zhang, Martin Bligh, Li Zefan,
	Masami Hiramatsu, Wu Zhangjin, linux-next, Heiko Carstens

On Tue, 18 Aug 2009 12:05:26 +0200
Ingo Molnar <mingo@elte.hu> wrote:

> 
> * Martin Schwidefsky <schwidefsky@de.ibm.com> wrote:
> 
> > On Tue, 18 Aug 2009 10:51:10 +0200
> > Ingo Molnar <mingo@elte.hu> wrote:
> > 
> > > 
> > > * Ingo Molnar <mingo@elte.hu> wrote:
> > > 
> > > > * Paul Mundt <lethal@linux-sh.org> wrote:
> > > > 
> > > > > [ Adding to Cc everyone that now has a broken tree thanks to this .. ]
> > > > > 
> > > > > On Wed, Aug 12, 2009 at 11:11:33AM +0200, Ingo Molnar wrote:
> > > > > > * Frederic Weisbecker <fweisbec@gmail.com> wrote:
> > > > > > > This pull request integrate one cleanup/fix for ftrace and an 
> > > > > > > update for syscall tracing: the migration from old-style tracer to 
> > > > > > > individual tracepoints/trace_events and the support for perf 
> > > > > > > counter.
> > > > > > > 
> > > > > > > I've tested it with success either with ftrace (every syscall 
> > > > > > > tracepoints enabled at the same time without problems) and with 
> > > > > > > perfcounter.
> > > > > > > 
> > > > > > > May be one drawback: it creates so much trace events that the 
> > > > > > > ftrace selftests can take some time :-)
> > > > > > 
> > > > > > Pulled, thanks a lot!
> > > > > 
> > > > > And this has now subsequently broken every single SH and S390 
> > > > > configuration, [...]
> > > > 
> > > > I test SH cross-builds regularly. I just checked the SH defconfig 
> > > > and it builds just fine here:
> > > > 
> > > > $ make -j32 CROSS_COMPILE=sh3-linux- ARCH=sh vmlinux
> > > 
> > > The s390 build indeed broke. (This got masked by the s390 toolchain 
> > > i'm using not having been able to build Linus's tree - i fixed 
> > > that.)
> > > 
> > > Could you try the fix below? It does the trick here.
> > > 
> > > Martin, Heiko - does the fix look good to you? regs->gprs[2] seems 
> > > to be the register used for both the syscall number (enter 
> > > callback) and for the return code (exit callback).
> > 
> > Correct, for do_syscall_trace_{enter,exit} the code in entry.S 
> > stores the system call number in regs->gprs[2]. The fix is fine. 
> > Thanks Ingo.
> 
> Thanks, i've added your Acked-by to the commit. I suspect you dont 
> want to pull tracing infrastructure changes into the S390 tree, so 
> keeping this fix in the tracing tree would be the best option?

Indeed, the patch that introduced the api change is in the tracing tree
so it makes sense to put the s390 adaptions there too.

-- 
blue skies,
   Martin.

"Reality continues to ruin my life." - Calvin.


^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [GIT PULL] tracing: Syscalls trace events + perf support
  2009-08-18  0:46   ` Paul Mundt
@ 2009-08-18 10:25       ` Frederic Weisbecker
  2009-08-18 10:25       ` Frederic Weisbecker
  1 sibling, 0 replies; 54+ messages in thread
From: Frederic Weisbecker @ 2009-08-18 10:25 UTC (permalink / raw)
  To: Paul Mundt, Ingo Molnar, Stephen Rothwell, Jason Baron, LKML,
	Lai Jiangshan, Steven Rostedt, Peter Zijlstra, Mathieu Desnoyers,
	Jiaying Zhang, Martin Bligh, Li Zefan, Masami Hiramatsu,
	Martin Schwidefsky, Wu Zhangjin, linux-next

On Tue, Aug 18, 2009 at 09:46:55AM +0900, Paul Mundt wrote:
> [ Adding to Cc everyone that now has a broken tree thanks to this .. ]
> 
> On Wed, Aug 12, 2009 at 11:11:33AM +0200, Ingo Molnar wrote:
> > * Frederic Weisbecker <fweisbec@gmail.com> wrote:
> > > This pull request integrate one cleanup/fix for ftrace and an 
> > > update for syscall tracing: the migration from old-style tracer to 
> > > individual tracepoints/trace_events and the support for perf 
> > > counter.
> > > 
> > > I've tested it with success either with ftrace (every syscall 
> > > tracepoints enabled at the same time without problems) and with 
> > > perfcounter.
> > > 
> > > May be one drawback: it creates so much trace events that the 
> > > ftrace selftests can take some time :-)
> > 
> > Pulled, thanks a lot!
> > 
> And this has now subsequently broken every single SH and S390
> configuration, and anyone else unfortunate enough to be supporting ftrace
> syscall tracing that isn't x86, without so much as a Cc, well done!
> 
> The s390 case can be fixed up in-tree as support has already been merged,
> but in the SH case we had ftrace syscall tracing queued up for 2.6.32, so
> it doesn't show up in -tip, but the end result in -next is now completely
> broken.
> 
> I'm not sure how we should handle this, if tracing/core in -tip isn't
> rebased, should I just pull the topic-branch in to my tree, fix up the sh
> support on top of that, and push the end result out? This seems like the
> easiest option at least, but I don't know what other dependencies exist
> for tracing/core. Alternative suggestions welcome.
> 
> This happens again and again with ftrace and -tip, where people just
> randomly change existing interfaces, break all of the existing users, and
> then fail to tell anyone about it until it shows up in -next. Even if we
> had pushed all of the sh ftrace bits to the -tip tree early on it would
> not have changed anything, evident by the fact that s390 and all of the
> non ftrace syscall architectures were broken by this change as well (the
> latter case was at least caught and corrected, although not by the
> original authors of this patch series). Is it really that much to task
> that people who are running around breaking ftrace interfaces actually
> bother to Cc the architectures that are using it?



I've just retrieved the concerned commit in the sh tree:

sh: Add ftrace syscall tracing support (c652d780c9cf7f860141de232b37160fe013feca)

Was I cc'ed on this one? I can't find it in my inbox. Unless I'm wrong
and I missed it, how could I guess I had to cc you and how am I supposed
to fix something I'm even not aware of?


I can't find the s390 patch in my inbox either (was I cc'ed ?)
([S390] ftrace: add system call tracer support) but we should have fixed
this one because it was already upstream and a git-grep ftrace_syscall_enter
would have warned us about that.

I didn't know another arch was supporting syscall tracing (except mips because
I was cc'ed, but it doesn't seem upstream nor in the mips tree).


> 
> If -tip is going to perpetuate this sort of half-assed development
> methodology, it has no place in -next.


^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [GIT PULL] tracing: Syscalls trace events + perf support
  2009-08-18 10:25       ` Frederic Weisbecker
  (?)
@ 2009-08-18 11:06       ` Ingo Molnar
  2009-08-18 11:56         ` Paul Mundt
  -1 siblings, 1 reply; 54+ messages in thread
From: Ingo Molnar @ 2009-08-18 11:06 UTC (permalink / raw)
  To: Frederic Weisbecker
  Cc: Paul Mundt, Stephen Rothwell, Jason Baron, LKML, Lai Jiangshan,
	Steven Rostedt, Peter Zijlstra, Mathieu Desnoyers, Jiaying Zhang,
	Martin Bligh, Li Zefan, Masami Hiramatsu, Martin Schwidefsky,
	Wu Zhangjin, linux-next


* Frederic Weisbecker <fweisbec@gmail.com> wrote:

> On Tue, Aug 18, 2009 at 09:46:55AM +0900, Paul Mundt wrote:
> > [ Adding to Cc everyone that now has a broken tree thanks to this .. ]
> > 
> > On Wed, Aug 12, 2009 at 11:11:33AM +0200, Ingo Molnar wrote:
> > > * Frederic Weisbecker <fweisbec@gmail.com> wrote:
> > > > This pull request integrate one cleanup/fix for ftrace and an 
> > > > update for syscall tracing: the migration from old-style tracer to 
> > > > individual tracepoints/trace_events and the support for perf 
> > > > counter.
> > > > 
> > > > I've tested it with success either with ftrace (every syscall 
> > > > tracepoints enabled at the same time without problems) and with 
> > > > perfcounter.
> > > > 
> > > > May be one drawback: it creates so much trace events that the 
> > > > ftrace selftests can take some time :-)
> > > 
> > > Pulled, thanks a lot!
> > 
> > And this has now subsequently broken every single SH and S390 
> > configuration, and anyone else unfortunate enough to be 
> > supporting ftrace syscall tracing that isn't x86, without so 
> > much as a Cc, well done!
> > 
> > The s390 case can be fixed up in-tree as support has already 
> > been merged, but in the SH case we had ftrace syscall tracing 
> > queued up for 2.6.32, so it doesn't show up in -tip, but the 
> > end result in -next is now completely broken.
> > 
> > I'm not sure how we should handle this, if tracing/core in -tip 
> > isn't rebased, should I just pull the topic-branch in to my 
> > tree, fix up the sh support on top of that, and push the end 
> > result out? This seems like the easiest option at least, but I 
> > don't know what other dependencies exist for tracing/core. 
> > Alternative suggestions welcome.
> > 
> > This happens again and again with ftrace and -tip, where people 
> > just randomly change existing interfaces, break all of the 
> > existing users, and then fail to tell anyone about it until it 
> > shows up in -next. Even if we had pushed all of the sh ftrace 
> > bits to the -tip tree early on it would not have changed 
> > anything, evident by the fact that s390 and all of the non 
> > ftrace syscall architectures were broken by this change as well 
> > (the latter case was at least caught and corrected, although 
> > not by the original authors of this patch series). Is it really 
> > that much to task that people who are running around breaking 
> > ftrace interfaces actually bother to Cc the architectures that 
> > are using it?
> 
> I've just retrieved the concerned commit in the sh tree:
> 
> sh: Add ftrace syscall tracing support 
> (c652d780c9cf7f860141de232b37160fe013feca)
> 
> Was I cc'ed on this one? I can't find it in my inbox. Unless I'm 
> wrong and I missed it, how could I guess I had to cc you and how 
> am I supposed to fix something I'm even not aware of?
> 
> I can't find the s390 patch in my inbox either (was I cc'ed ?) 
> ([S390] ftrace: add system call tracer support) but we should 
> have fixed this one because it was already upstream and a 
> git-grep ftrace_syscall_enter would have warned us about that.
> 
> I didn't know another arch was supporting syscall tracing (except 
> mips because I was cc'ed, but it doesn't seem upstream nor in the 
> mips tree).

Yes, and note that Paul Mundt has not even done the minimal 
courtesy of describing/pasting the build failure he was seeing. But 
he had time for a 4-paragraph rant about how others are supposed to 
do their work... And then he complains about us not having 
considered a change he never Cc:-ed to us. How nice.

( I have meanwhile fixed the bug in s390 and have posted that fix -
  the SH fix should be an analogous twoliner. )

This reply from Paul Mundt shows an _incredibly_ arrogant attitude 
towards core kernel facilities: he almost never contributes to them 
(i just checked the logs from v2.6.12 to v2.6.31-rc6) but _THEY_ 
must do everything to keep SH running smoothly.

And he expects that work to benefit SH to happen regardless of how 
many actual Linux users use, develop on and report bugs from SH. 
Where's the many SH crash reports on kerneloops.org? I see _not a 
single SH report_, out of more than 200,000 kernel oopses reported 
and categorized ... Why?

The thing is, as things stand today SH is basically freeloading on 
the hard work, testing and core kernel work of other architectures 
- which is fine in itself - what is not fine is to then 
complain in an unacceptable tone about bugs that affect SH.

Of course SH never causes core kernel bugs because it essentially 
does no core kernel work at all!

Paul seems to think that freeloading upon 10 million lines of code 
coupled with loud, self-sure demands for instant fixes is fine, and 
dare anyone trying to advance Linux break the build of a single 
architecture out of 22!

Other architectures, powerpc, s390, sparc, x86 etc. all do lots of 
core kernel work and make sure it's all nicely reciprocal and 
friendly. SH needs to stop the whining and needs to start helping 
out for real - or it can get unmerged and go out of tree if it does 
not want to participate in the development process.

Thanks,

	Ingo

* Re: [GIT PULL] tracing: Syscalls trace events + perf support
  2009-08-18 11:06       ` Ingo Molnar
@ 2009-08-18 11:56         ` Paul Mundt
  0 siblings, 0 replies; 54+ messages in thread
From: Paul Mundt @ 2009-08-18 11:56 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Frederic Weisbecker, Stephen Rothwell, Jason Baron, LKML,
	Lai Jiangshan, Steven Rostedt, Peter Zijlstra, Mathieu Desnoyers,
	Jiaying Zhang, Martin Bligh, Li Zefan, Masami Hiramatsu,
	Martin Schwidefsky, Wu Zhangjin, linux-next

On Tue, Aug 18, 2009 at 01:06:16PM +0200, Ingo Molnar wrote:
> * Frederic Weisbecker <fweisbec@gmail.com> wrote:
> > I've just retrieved the concerned commit in the sh tree:
> > 
> > sh: Add ftrace syscall tracing support 
> > (c652d780c9cf7f860141de232b37160fe013feca)
> > 
> > Was I cc'ed on this one? I can't find it in my inbox. Unless I'm 
> > wrong and I missed it, how could I guess I had to cc you and how 
> > am I supposed to fix something I'm even not aware of?
> > 
> > I can't find the s390 patch in my inbox either (was I cc'ed ?) 
> > ([S390] ftrace: add system call tracer support) but we should 
> > have fixed this one because it was already upstream and a 
> > git-grep ftrace_syscall_enter would have warned us about that.
> > 
> > I didn't know another arch was supporting syscall tracing (except 
> > mips because I was cc'ed, but it doesn't seem upstream nor in the 
> > mips tree).
> 
> Yes, and note that Paul Mundt has not even done the minimal 
> courtesy of describing/pasting the build failure he was seeing. But 
> he had time for a 4-paragraph rant about how others are supposed to 
> do their work... And then he complains about us not having 
> considered a change he never Cc:-ed to us. How nice.
> 
I explained that all -next builds had broken due to that tree being
pulled in, which I assumed was sufficient. None of my initial post was
suggesting that there was a problem with making changes to the interfaces
or that I expected anyone else to fix them up for me, it is more just
general frustration at how often people blindly push things out to -next
without checking what impact that is going to have on the already merged
trees. This is not just -tip specific, and indeed most of my posts to the
-next list are fixing up these sorts of build bugs caused by other
people's trees.

> This reply from Paul Mundt shows an _incredibly_ arrogant attitude 
> towards core kernel facilities: he almost never contributes to them 
> (i just checked the logs from v2.6.12 to v2.6.31-rc6) but _THEY_ 
> must do everything to keep SH running smoothly.
> 
I don't think that's a fair generalization. Naturally the vast majority
of my work is in my own architecture, and I send patches for core bits
when we run in to trouble, have to extend certain parts of
infrastructure, discover something is broken whilst wiring up support for
a new feature, etc, etc. And this actually happens quite regularly, so
I'm unsure as to where you get your statistics from.

> And he expects that work to benefit SH to happen regardless of how 
> many actual Linux users use, develop on and report bugs from SH. 
> Where are the many SH crash reports on kerneloops.org? I see _not a 
> single SH report_, out of more than 200,000 kernel oopses reported 
> and categorized ... Why?
> 
I have enough trouble getting people to submit oopses at all, and the
ones that we do get are most often in the architecture code, so there's
little interest to kerneloops.org. Likewise, issues that we hit with
common code during development we try to debug and send patches for long
before it becomes anyone else's problem.

> The thing is, as things stand today SH is basically freeloading on 
> the hard work, testing and core kernel work of other architectures 
> - which is fine in itself - what is not fine is to then 
> complain in an unacceptable tone about bugs that affect SH.
> 
This is complete and utter nonsense. SH is one of the only embedded
architectures that aggressively implements and tests new kernel features,
and we send out patches for any issues we run in to on a pretty much
constant basis. Beyond that, most of the patches I send to -next are
fixing up issues introduced by other people that have negative
implications for architectures, and I try to subsequently take care of
those as quickly as possible. The core kernel work we do focus on happens
within that particular subsystem itself, so perhaps there is not so much
visibility outside of that context.

Regarding the ftrace syscall stuff, you did not just break one
architecture, you effectively broke all of them that weren't x86, and
this tends to be unfortunately more of a common trend than an odd
exception. Of course that is a natural part of development, and as long
as things are gradually fixed up it's really not that big of a deal. No
one expects instant fixes all around, the issue is more that no one is
paying attention to what impact these changes will have on others. In
this case we could have Cced you on the ftrace changes we made in the
sh tree so you had been aware of it, but this largely ignores the point.
I don't Cc people when we add new features on the architecture side, for
the most part, and I expect this is the case for most architecture
maintainers. How many times are architecture people told to grovel
around some -tip topic branch before making any changes in their own
trees? If architecture people have to poke around -tip to try and keep
things moving along, I don't think it's unrealistic to expect -tip people
to pay attention to the things that are already on-going in linux-next
before merging things.

My "hostility" towards core kernel features tends to be inversely
proportional to how embedded architectures are treated by core kernel
people. We effectively have to fight for every minor change, and are
either dismissed out of hand or regarded with contempt when attempting to
push core changes through. One hardly has to look very long to find your
postings that routinely throw this 95% figure around for example. Folks
complain about embedded architectures not contributing, and then when
they do, this is the end result. As an embedded architecture maintainer I
resigned myself years ago to the fact that I would spend most of my time
outside of my own architecture directory fixing up other people's code,
and it's only the rare times when this results in opposition -- largely
involving the -tip tree as of late. I even asked you directly how we can
fix this workflow issue in my initial mail, which you completely ignored.

In any event, if all of this constitutes freeloading, then there seems to
be very little point in carrying on dialogue. I'll just pull in the
tracing/core bits in to a topic branch and fix my platform up, as usual.

* Re: [S390] ftrace: update system call tracer support
  2009-08-18  8:51       ` Ingo Molnar
  2009-08-18  8:59         ` Martin Schwidefsky
@ 2009-08-18 14:56         ` Jason Baron
  1 sibling, 0 replies; 54+ messages in thread
From: Jason Baron @ 2009-08-18 14:56 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Paul Mundt, Stephen Rothwell, Frederic Weisbecker, LKML,
	Lai Jiangshan, Steven Rostedt, Peter Zijlstra, Mathieu Desnoyers,
	Jiaying Zhang, Martin Bligh, Li Zefan, Masami Hiramatsu,
	Martin Schwidefsky, Wu Zhangjin, linux-next, Heiko Carstens

On Tue, Aug 18, 2009 at 10:51:10AM +0200, Ingo Molnar wrote:
> * Ingo Molnar <mingo@elte.hu> wrote:
> 
> > * Paul Mundt <lethal@linux-sh.org> wrote:
> > 
> > > [ Adding to Cc everyone that now has a broken tree thanks to this .. ]
> > > 
> > > On Wed, Aug 12, 2009 at 11:11:33AM +0200, Ingo Molnar wrote:
> > > > * Frederic Weisbecker <fweisbec@gmail.com> wrote:
> > > > > This pull request integrate one cleanup/fix for ftrace and an 
> > > > > update for syscall tracing: the migration from old-style tracer to 
> > > > > individual tracepoints/trace_events and the support for perf 
> > > > > counter.
> > > > > 
> > > > > I've tested it with success either with ftrace (every syscall 
> > > > > tracepoints enabled at the same time without problems) and with 
> > > > > perfcounter.
> > > > > 
> > > > > May be one drawback: it creates so much trace events that the 
> > > > > ftrace selftests can take some time :-)
> > > > 
> > > > Pulled, thanks a lot!
> > > 
> > > And this has now subsequently broken every single SH and S390 
> > > configuration, [...]
> > 
> > I test SH cross-builds regularly. I just checked the SH defconfig 
> > and it builds just fine here:
> > 
> > $ make -j32 CROSS_COMPILE=sh3-linux- ARCH=sh vmlinux
> 
> The s390 build indeed broke. (This got masked by the s390 toolchain 
> i'm using not having been able to build Linus's tree - i fixed 
> that.)
> 
> Could you try the fix below? It does the trick here.
> 
> Martin, Heiko - does the fix look good to you? regs->gprs[2] seems 
> to be the register used for both the syscall number (enter 
> callback) and for the return code (exit callback).
> 
> Regarding SH, the fixup should be similarly trivial. Since SH's 
> FTRACE_SYSCALLS code is not upstream yet it can (and should) be 
> carried in the tree that integrates the SH tree and the tracing 
> tree - linux-next in this case.
> 
> Thanks,
> 
> 	Ingo
> 
> ------------------------------>
> From a9008fd42b1c3c89f684d90bdfb9c2d05c7af119 Mon Sep 17 00:00:00 2001
> From: Ingo Molnar <mingo@elte.hu>
> Date: Tue, 18 Aug 2009 10:41:57 +0200
> Subject: [PATCH] [S390] ftrace: update system call tracer support
> 
> Commit fb34a08c3 ("tracing: Add trace events for each syscall
> entry/exit") changed the lowlevel API to ftrace syscall tracing
> but did not update s390 which started making use of it recently.
> 
> This broke the s390 build, as reported by Paul Mundt.
> 
> Update the callbacks with the syscall number and the syscall
> return code values. This allows per syscall tracepoints,
> syscall argument enumeration /debug/tracing/events/syscalls/
> and perfcounters support and integration on s390 too.
> 
> Reported-by: Paul Mundt <lethal@linux-sh.org>
> Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
> Cc: Jason Baron <jbaron@redhat.com>
> Cc: Steven Rostedt <rostedt@goodmis.org>
> Cc: Frederic Weisbecker <fweisbec@gmail.com>
> LKML-Reference: <tip-fb34a08c3469b2be9eae626ccb96476b4687b810@git.kernel.org>
> Signed-off-by: Ingo Molnar <mingo@elte.hu>
> ---
>  arch/s390/kernel/ptrace.c |    4 ++--
>  1 files changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/arch/s390/kernel/ptrace.c b/arch/s390/kernel/ptrace.c
> index 43acd73..05f57cd 100644
> --- a/arch/s390/kernel/ptrace.c
> +++ b/arch/s390/kernel/ptrace.c
> @@ -662,7 +662,7 @@ asmlinkage long do_syscall_trace_enter(struct pt_regs *regs)
>  	}
>  
>  	if (unlikely(test_thread_flag(TIF_SYSCALL_FTRACE)))
> -		ftrace_syscall_enter(regs);
> +		trace_syscall_enter(regs, regs->gprs[2]);
>  
>  	if (unlikely(current->audit_context))
>  		audit_syscall_entry(is_compat_task() ?
> @@ -680,7 +680,7 @@ asmlinkage void do_syscall_trace_exit(struct pt_regs *regs)
>  				   regs->gprs[2]);
>  
>  	if (unlikely(test_thread_flag(TIF_SYSCALL_FTRACE)))
> -		ftrace_syscall_exit(regs);
> +		trace_syscall_exit(regs, regs->gprs[2]);
>  
>  	if (test_thread_flag(TIF_SYSCALL_TRACE))
>  		tracehook_report_syscall_exit(regs, 0);

thanks for fixing this up Ingo! Sorry for all the trouble.

perhaps, we should just reduce the entry/exit routines to just pass
'regs' since we already have the following abstractions for callbacks:

syscall_get_nr(current, regs);
syscall_get_return_value(current, regs);

Thus, we really only need to be passing 'regs'. I was passing more information
here, to make it easier for other users of the tracepoints...
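
A minimal userspace sketch of that idea, not kernel code. The `struct pt_regs` below is a mock, the `mock_syscall_get_*()` helpers are hypothetical stand-ins for the accessors named above (the real ones also take the task), and reading both values from `gprs[2]` only mirrors the s390 patch quoted above. The `sketch_trace_*()` names are made up for illustration.

```c
#include <stdio.h>

/* Mock register block; only the field this sketch needs. */
struct pt_regs {
	unsigned long gprs[16];
};

/*
 * Hypothetical stand-ins for syscall_get_nr() and
 * syscall_get_return_value(). Assumption: both values are read from
 * gprs[2], as in the s390 patch quoted above.
 */
static long mock_syscall_get_nr(const struct pt_regs *regs)
{
	return (long)regs->gprs[2];
}

static long mock_syscall_get_return_value(const struct pt_regs *regs)
{
	return (long)regs->gprs[2];
}

/* Entry hook that takes only 'regs' and derives the syscall number itself. */
static long sketch_trace_syscall_enter(const struct pt_regs *regs)
{
	long nr = mock_syscall_get_nr(regs);

	printf("sys_enter: nr=%ld\n", nr);
	return nr;
}

/* Exit hook that takes only 'regs' and derives the return value itself. */
static long sketch_trace_syscall_exit(const struct pt_regs *regs)
{
	long ret = mock_syscall_get_return_value(regs);

	printf("sys_exit: ret=%ld\n", ret);
	return ret;
}
```

With this shape an arch call site would only need to hand over `regs`, so a later change to what the hooks extract would not touch arch code. The cost is that every tracepoint user repeats the accessor calls.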

thanks,

-Jason




* Re: [PATCH 15/16] tracing: Add fields format definition for syscall events
  2009-08-11 18:49 ` [PATCH 15/16] tracing: Add fields format definition for syscall events Frederic Weisbecker
@ 2009-08-19 17:12   ` Masami Hiramatsu
  2009-08-19 17:37     ` Frederic Weisbecker
  0 siblings, 1 reply; 54+ messages in thread
From: Masami Hiramatsu @ 2009-08-19 17:12 UTC (permalink / raw)
  To: Frederic Weisbecker
  Cc: Ingo Molnar, LKML, Lai Jiangshan, Steven Rostedt, Peter Zijlstra,
	Mathieu Desnoyers, Jiaying Zhang, Martin Bligh, Li Zefan,
	Jason Baron

Frederic Weisbecker wrote:
> Define the format of the syscall trace fields to parse the binary
> values from a raw trace using the syscall events "format" file.
> 
> This is defined dynamically using the syscalls metadata.
> It prepares the export of syscall event raw records to perf
> counters.
> 
> Example:
> 
> $ cat /debug/tracing/events/syscalls/sys_enter_sched_getparam/format
> name: sys_enter_sched_getparam
> ID: 39
> format:
> 	field:unsigned short common_type;	offset:0;	size:2;
> 	field:unsigned char common_flags;	offset:2;	size:1;
> 	field:unsigned char common_preempt_count;	offset:3;	size:1;
> 	field:int common_pid;	offset:4;	size:4;
> 	field:int common_tgid;	offset:8;	size:4;
> 
> 	field:pid_t pid;	offset:12;	size:8;
> 	field:struct sched_param * param;	offset:20;	size:8;
> 
> print fmt: "pid: 0x%08lx, param: 0x%08lx", ((unsigned long)(REC->pid)), ((unsigned long)(REC->param))

Hi Frederic,

I've found that the formats of some syscall events were too big.

---
$ for i in sys_enter* ;do grep name $i/format > /dev/null || echo $i has broken format. ; done
sys_enter_getegid has broken format.
sys_enter_geteuid has broken format.
sys_enter_getgid has broken format.
sys_enter_getpgrp has broken format.
sys_enter_getpid has broken format.
sys_enter_getppid has broken format.
sys_enter_gettid has broken format.
sys_enter_getuid has broken format.
sys_enter_inotify_init has broken format.
sys_enter_munlockall has broken format.
sys_enter_pause has broken format.
sys_enter_restart_syscall has broken format.
sys_enter_sched_yield has broken format.
sys_enter_setsid has broken format.
sys_enter_sync has broken format.
sys_enter_vhangup has broken format.

$ cat sys_enter_getegid/format
FORMAT TOO BIG
---

And it causes an error on ./perf trace.

---
$ ./perf record -R -e syscalls:sys_enter_read -a -f  cat libperf.a > /dev/null

$ ./perf trace
  Fatal: Error: expected 'name' but read 'FORMAT'
version = 0.5
---
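
The same check can be sketched in C. This is a stricter userspace variant of the grep above: it assumes a healthy format file begins with the "name:" line, as in the sched_getparam example. `format_looks_valid()` is a hypothetical helper, not part of perf.

```c
#include <string.h>

/*
 * Returns 1 if the text of a "format" file begins with "name:",
 * 0 otherwise (for example when it reads "FORMAT TOO BIG").
 */
static int format_looks_valid(const char *text)
{
	return strncmp(text, "name:", 5) == 0;
}
```

A reader that runs such a check before parsing could skip or report the broken event instead of aborting on the first unexpected token.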

Thank you,

-- 
Masami Hiramatsu

Software Engineer
Hitachi Computer Products (America), Inc.
Software Solutions Division

e-mail: mhiramat@redhat.com


* Re: [PATCH 15/16] tracing: Add fields format definition for syscall events
  2009-08-19 17:12   ` Masami Hiramatsu
@ 2009-08-19 17:37     ` Frederic Weisbecker
  2009-08-20  1:07       ` Li Zefan
  0 siblings, 1 reply; 54+ messages in thread
From: Frederic Weisbecker @ 2009-08-19 17:37 UTC (permalink / raw)
  To: Masami Hiramatsu
  Cc: Ingo Molnar, LKML, Lai Jiangshan, Steven Rostedt, Peter Zijlstra,
	Mathieu Desnoyers, Jiaying Zhang, Martin Bligh, Li Zefan,
	Jason Baron

On Wed, Aug 19, 2009 at 01:12:48PM -0400, Masami Hiramatsu wrote:
> Frederic Weisbecker wrote:
> > Define the format of the syscall trace fields to parse the binary
> > values from a raw trace using the syscall events "format" file.
> > 
> > This is defined dynamically using the syscalls metadata.
> > It prepares the export of syscall event raw records to perf
> > counters.
> > 
> > Example:
> > 
> > $ cat /debug/tracing/events/syscalls/sys_enter_sched_getparam/format
> > name: sys_enter_sched_getparam
> > ID: 39
> > format:
> > 	field:unsigned short common_type;	offset:0;	size:2;
> > 	field:unsigned char common_flags;	offset:2;	size:1;
> > 	field:unsigned char common_preempt_count;	offset:3;	size:1;
> > 	field:int common_pid;	offset:4;	size:4;
> > 	field:int common_tgid;	offset:8;	size:4;
> > 
> > 	field:pid_t pid;	offset:12;	size:8;
> > 	field:struct sched_param * param;	offset:20;	size:8;
> > 
> > print fmt: "pid: 0x%08lx, param: 0x%08lx", ((unsigned long)(REC->pid)), ((unsigned long)(REC->param))
> 
> Hi Frederic,
> 
> I've found that the formats of some syscall events were too big.
> 
> ---
> $ for i in sys_enter* ;do grep name $i/format > /dev/null || echo $i has broken format. ; done
> sys_enter_getegid has broken format.
> sys_enter_geteuid has broken format.
> sys_enter_getgid has broken format.
> sys_enter_getpgrp has broken format.
> sys_enter_getpid has broken format.
> sys_enter_getppid has broken format.
> sys_enter_gettid has broken format.
> sys_enter_getuid has broken format.
> sys_enter_inotify_init has broken format.
> sys_enter_munlockall has broken format.
> sys_enter_pause has broken format.
> sys_enter_restart_syscall has broken format.
> sys_enter_sched_yield has broken format.
> sys_enter_setsid has broken format.
> sys_enter_sync has broken format.
> sys_enter_vhangup has broken format.
> 
> $ cat sys_enter_getegid/format
> FORMAT TOO BIG
> ---
> 
> And it causes an error on ./perf trace.
> 
> ---
> $ ./perf record -R -e syscalls:sys_enter_read -a -f  cat libperf.a > /dev/null
> 
> $ ./perf trace
>   Fatal: Error: expected 'name' but read 'FORMAT'
> version = 0.5
> ---
> 
> Thank you,


Yeah, I have yet to fix this. That's because syscalls that have no parameters
trigger a small bug in the return value of trace_seq_printf() while printing
their format: it returns 0, as if the buffer were full and had lost some bits.
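
A userspace sketch of that ambiguity, under the assumption that the helper reports the number of bytes it appended. It is not the kernel's trace_seq code: `struct sketch_seq` and the `sketch_*()` functions are hypothetical names used only for illustration.

```c
#include <stdarg.h>
#include <stdio.h>

#define SKETCH_BUF_SIZE 64

struct sketch_seq {
	char buf[SKETCH_BUF_SIZE];
	size_t len;
};

/* Appends formatted text; returns bytes appended, or -1 if it did not fit. */
static int sketch_vprintf(struct sketch_seq *s, const char *fmt, va_list ap)
{
	size_t room = SKETCH_BUF_SIZE - 1 - s->len;
	int n = vsnprintf(s->buf + s->len, room + 1, fmt, ap);

	if (n < 0 || (size_t)n > room) {
		s->buf[s->len] = '\0';	/* drop the partial write */
		return -1;
	}
	s->len += (size_t)n;
	return n;
}

/*
 * Ambiguous variant: returns bytes appended, 0 on overflow.
 * A successful write of an empty string also returns 0.
 */
static int sketch_printf_len(struct sketch_seq *s, const char *fmt, ...)
{
	va_list ap;
	int n;

	va_start(ap, fmt);
	n = sketch_vprintf(s, fmt, ap);
	va_end(ap);
	return n < 0 ? 0 : n;
}

/* Unambiguous variant: 1 on success (even for empty output), 0 on overflow. */
static int sketch_printf_ok(struct sketch_seq *s, const char *fmt, ...)
{
	va_list ap;
	int n;

	va_start(ap, fmt);
	n = sketch_vprintf(s, fmt, ap);
	va_end(ap);
	return n >= 0;
}
```

A caller that treats 0 from the first variant as "buffer full" will reject an empty argument list even though nothing was lost. The second variant keeps "wrote nothing" and "did not fit" apart.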

However, it's possible that the last patches from Li fix this, since he did
a total/better refactoring of the format definition for syscall events.

I'll check this,
Thanks!


* Re: [PATCH 15/16] tracing: Add fields format definition for syscall events
  2009-08-19 17:37     ` Frederic Weisbecker
@ 2009-08-20  1:07       ` Li Zefan
  0 siblings, 0 replies; 54+ messages in thread
From: Li Zefan @ 2009-08-20  1:07 UTC (permalink / raw)
  To: Frederic Weisbecker
  Cc: Masami Hiramatsu, Ingo Molnar, LKML, Lai Jiangshan,
	Steven Rostedt, Peter Zijlstra, Mathieu Desnoyers, Jiaying Zhang,
	Martin Bligh, Jason Baron

> Yeah, I have yet to fix this. That's because syscalls that have no parameters
> trigger a small bug in the return value of trace_seq_printf() while printing
> their format: it returns 0, as if the buffer were full and had lost some bits.
> 
> However, it's possible that the last patches from Li fix this, since he did
> a total/better refactoring of the format definition for syscall events.
> 
> I'll check this,

I was not aware of this bug, and the bug is still there, but it's
easy to fix and I've fixed it. Will send out the patch soon.


Thread overview: 54+ messages
2009-08-11 18:48 [GIT PULL] tracing: Syscalls trace events + perf support Frederic Weisbecker
2009-08-11 18:48 ` [PATCH 01/16] tracing: Rename set_tracer_flags()'s local variable trace_flags Frederic Weisbecker
2009-08-11 18:48 ` [PATCH 02/16] tracing: Map syscall name to number Frederic Weisbecker
2009-08-11 18:48 ` [PATCH 03/16] tracing: Call arch_init_ftrace_syscalls at boot Frederic Weisbecker
2009-08-11 18:48 ` [PATCH 04/16] tracing: Add DECLARE_TRACE_WITH_CALLBACK() macro Frederic Weisbecker
2009-08-11 18:48 ` [PATCH 05/16] tracing: Add syscall tracepoints Frederic Weisbecker
2009-08-11 18:48 ` [PATCH 06/16] tracing: Update FTRACE_SYSCALL_MAX Frederic Weisbecker
2009-08-11 18:48 ` [PATCH 07/16] tracing: Raw_init() bailout in trace event register fail case Frederic Weisbecker
2009-08-11 18:48 ` [PATCH 08/16] tracing: Add ftrace_event_call void * 'data' field Frederic Weisbecker
2009-08-11 18:48 ` [PATCH 09/16] tracing: Add trace events for each syscall entry/exit Frederic Weisbecker
2009-08-11 18:48 ` [PATCH 10/16] tracing: Add individual syscalls tracepoint id support Frederic Weisbecker
2009-08-11 18:49 ` [PATCH 11/16] tracing: Add perf counter support for syscalls tracing Frederic Weisbecker
2009-08-11 18:49 ` [PATCH 12/16] tracing: Add more namespace area to 'perf list' output Frederic Weisbecker
2009-08-11 18:49 ` [PATCH 13/16] tracing: Convert x86_64 mmap and uname to use DEFINE_SYSCALL Frederic Weisbecker
2009-08-11 18:49 ` [PATCH 14/16] tracing: Add ftrace event call parameter to its field descriptor handler Frederic Weisbecker
2009-08-11 18:49 ` [PATCH 15/16] tracing: Add fields format definition for syscall events Frederic Weisbecker
2009-08-19 17:12   ` Masami Hiramatsu
2009-08-19 17:37     ` Frederic Weisbecker
2009-08-20  1:07       ` Li Zefan
2009-08-11 18:49 ` [PATCH 16/16] tracing: Support for syscall events raw records in perfcounters Frederic Weisbecker
2009-08-12  9:11 ` [GIT PULL] tracing: Syscalls trace events + perf support Ingo Molnar
2009-08-12 11:03   ` Ingo Molnar
2009-08-12 11:14     ` Ingo Molnar
2009-08-12 14:25       ` Jason Baron
2009-08-12 14:29         ` Ingo Molnar
2009-08-12 14:37           ` Jason Baron
2009-08-12 11:33     ` Frederic Weisbecker
2009-08-12 13:59       ` Jason Baron
2009-08-12 14:30         ` Ingo Molnar
2009-08-18  0:46   ` Paul Mundt
2009-08-18  7:32     ` Ingo Molnar
2009-08-18  7:32       ` Ingo Molnar
2009-08-18  8:51       ` [S390] ftrace: update system call tracer support Ingo Molnar
2009-08-18  8:51       ` Ingo Molnar
2009-08-18  8:59         ` Martin Schwidefsky
2009-08-18 10:05           ` Ingo Molnar
2009-08-18 10:22             ` Martin Schwidefsky
2009-08-18 14:56         ` Jason Baron
2009-08-18 10:25     ` [GIT PULL] tracing: Syscalls trace events + perf support Frederic Weisbecker
2009-08-18 10:25       ` Frederic Weisbecker
2009-08-18 11:06       ` Ingo Molnar
2009-08-18 11:56         ` Paul Mundt
2009-08-12 16:33 ` Masami Hiramatsu
2009-08-12 17:02   ` Masami Hiramatsu
2009-08-12 19:13     ` [RFD] Kprobes/Kretprobes " Frederic Weisbecker
2009-08-12 20:20       ` Masami Hiramatsu
2009-08-13  8:02         ` Ingo Molnar
2009-08-12 21:09       ` Peter Zijlstra
2009-08-12 21:27         ` Masami Hiramatsu
2009-08-12 21:37           ` Frederic Weisbecker
2009-08-12 21:35         ` Frederic Weisbecker
2009-08-14 15:05       ` Masami Hiramatsu
2009-08-15 14:33         ` Ingo Molnar
2009-08-17 21:58           ` Masami Hiramatsu
