linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 0/2 v2] Syscalls tracing
@ 2009-03-13 14:42 Frederic Weisbecker
  2009-03-13 14:42 ` [PATCH 1/2 v2] tracing/syscalls: core infrastructure for syscalls tracing Frederic Weisbecker
                   ` (2 more replies)
  0 siblings, 3 replies; 35+ messages in thread
From: Frederic Weisbecker @ 2009-03-13 14:42 UTC (permalink / raw)
  To: Ingo Molnar, Ingo Molnar
  Cc: Frederic Weisbecker, Linux Kernel Mailing List, Peter Zijlstra,
	Steven Rostedt, tglx, Jason Baron, Frank Ch. Eigler,
	Mathieu Desnoyers, KOSAKI Motohiro, Lai Jiangshan, Jiaying Zhang,
	Michael Rubin, Martin Bligh, Michael Davidson

tracing/syscalls: core infrastructure to trace syscalls

This new iteration addresses a good part of the previous reviews.

As suggested by Ingo Molnar and Peter Zijlstra, the syscalls
prototypes probing is done by abusing the SYSCALL_DEFINE family
macros.
We now store automatically the arguments names, their types, their number
and the name of the syscall.

Also some fixes on output newlines and dangerous exporting of global_trace
are provided.

An example of the trace:

echo syscall > /debugfs/tracing/current_tracer


# tracer: syscall
#
#           TASK-PID    CPU#    TIMESTAMP  FUNCTION
#              | |       |          |         |
           <...>-5080  [001]   132.192228: sys_dup2(oldfd: a, newfd: 1) 
           <...>-5080  [001]   132.192239: sys_dup2 -> 0x1
           <...>-5080  [001]   132.192242: sys_fcntl(fd: a, cmd: 1, arg: 0) 
           <...>-5080  [001]   132.192245: sys_fcntl -> 0x1
           <...>-5080  [001]   132.192248: sys_close(fd: a) 
           <...>-5080  [001]   132.192250: sys_close -> 0x0
           <...>-5080  [001]   132.192265: sys_rt_sigprocmask(how: 0, set: 0, oset: 6cf808, sigsetsize: 8) 
           <...>-5080  [001]   132.192267: sys_rt_sigprocmask -> 0x0
           <...>-5080  [001]   132.192271: sys_rt_sigaction(sig: 2, act: 7fffff4338f0, oact: 7fffff433850, sigsetsize: 8) 
           <...>-5080  [001]   132.192273: sys_rt_sigaction -> 0x0
           <...>-5080  [001]   132.192285: sys_rt_sigprocmask(how: 0, set: 0, oset: 6cf808, sigsetsize: 8) 
           <...>-5080  [001]   132.192287: sys_rt_sigprocmask -> 0x0
           <...>-5080  [001]   132.192415: sys_write(fd: 1, buf: 15dfc08, count: 21) 
           <...>-5080  [001]   132.192436: sys_write -> 0x21
           <...>-4754  [000]   132.192478: sys_read(fd: 8, buf: 2a9340e, count: 1fee) 
           <...>-5080  [001]   132.192487: sys_rt_sigprocmask(how: 0, set: 7fffff432a70, oset: 7fffff4329f0, sigsetsize: 8) 

And if you ask for the parameters types:

echo syscall_arg_type > trace_options

# tracer: syscall
#
#           TASK-PID    CPU#    TIMESTAMP  FUNCTION
#              | |       |          |         |
           <...>-5080  [001]   132.192228: sys_dup2(unsigned int oldfd: a, unsigned int newfd: 1) 
           <...>-5080  [001]   132.192239: sys_dup2 -> 0x1
           <...>-5080  [001]   132.192242: sys_fcntl(unsigned int fd: a, unsigned int cmd: 1, unsigned long arg: 0) 
           <...>-5080  [001]   132.192245: sys_fcntl -> 0x1
           <...>-5080  [001]   132.192248: sys_close(unsigned int fd: a) 
           <...>-5080  [001]   132.192250: sys_close -> 0x0
           <...>-5080  [001]   132.192265: sys_rt_sigprocmask(int how: 0, sigset_t * set: 0, sigset_t * oset: 6cf808, size_t sigsetsize: 8) 
           <...>-5080  [001]   132.192267: sys_rt_sigprocmask -> 0x0
           <...>-5080  [001]   132.192271: sys_rt_sigaction(int sig: 2, const struct sigaction * act: 7fffff4338f0, struct sigaction * oact: 7fffff433850, size_t sigsetsize: 8) 
           <...>-5080  [001]   132.192273: sys_rt_sigaction -> 0x0
           <...>-5080  [001]   132.192285: sys_rt_sigprocmask(int how: 0, sigset_t * set: 0, sigset_t * oset: 6cf808, size_t sigsetsize: 8) 
           <...>-5080  [001]   132.192287: sys_rt_sigprocmask -> 0x0
           <...>-5080  [001]   132.192415: sys_write(unsigned int fd: 1, const char * buf: 15dfc08, size_t count: 21) 
           <...>-5080  [001]   132.192436: sys_write -> 0x21

TODO:

- add a single mask on the struct syscall_metadata to provide quickly which arguments is
  a pointer (usually type __user *p) so that the user can decide if he wants to save them of tracing time

- now that we have each parameter type as strings, add a new field on struct syscall_metadata to have the parameter types
  encoded as single enum values (for quick checks) so that we can use specific callbacks for each parameter type
  to be displayed.

NOTE: this is still not overlapping with a potential future merge of utrace, since the low-level hooks on
the syscalls remain somewhat basic.

NOTE2: I've only tested it on x86-64 for now, so only x86-64 support is provided.
--

Frederic Weisbecker (2):
  tracing/syscalls: core infrastructure for syscalls tracing
  tracing/syscalls: support for syscalls tracing on x86-64

 arch/x86/Kconfig                   |    1 +
 arch/x86/include/asm/ftrace.h      |    7 +
 arch/x86/include/asm/thread_info.h |    9 +-
 arch/x86/kernel/ftrace.c           |   63 +++++++++
 arch/x86/kernel/ptrace.c           |    7 +
 include/asm-generic/vmlinux.lds.h  |   11 ++-
 include/linux/ftrace.h             |   29 +++++
 include/linux/syscalls.h           |   60 +++++++++-
 kernel/trace/Kconfig               |   10 ++
 kernel/trace/Makefile              |    1 +
 kernel/trace/trace.h               |   19 +++
 kernel/trace/trace_syscalls.c      |  243 ++++++++++++++++++++++++++++++++++++
 12 files changed, 454 insertions(+), 6 deletions(-)
 create mode 100644 kernel/trace/trace_syscalls.c


^ permalink raw reply	[flat|nested] 35+ messages in thread

* [PATCH 1/2 v2] tracing/syscalls: core infrastructure for syscalls tracing
  2009-03-13 14:42 [PATCH 0/2 v2] Syscalls tracing Frederic Weisbecker
@ 2009-03-13 14:42 ` Frederic Weisbecker
  2009-03-13 16:09   ` [tip:tracing/syscalls] tracing/syscalls: core infrastructure for syscalls tracing, enhancements Frederic Weisbecker
                     ` (2 more replies)
  2009-03-13 14:42 ` [PATCH 2/2 v2] tracing/syscalls: support for syscalls tracing on x86-64 Frederic Weisbecker
  2009-03-13 15:16 ` [PATCH 0/2 v2] Syscalls tracing Frederic Weisbecker
  2 siblings, 3 replies; 35+ messages in thread
From: Frederic Weisbecker @ 2009-03-13 14:42 UTC (permalink / raw)
  To: Ingo Molnar, Ingo Molnar
  Cc: Frederic Weisbecker, Linux Kernel Mailing List, Peter Zijlstra,
	Steven Rostedt, tglx, Jason Baron, Frank Ch. Eigler,
	Mathieu Desnoyers, KOSAKI Motohiro, Lai Jiangshan, Jiaying Zhang,
	Michael Rubin, Martin Bligh, Michael Davidson

Impact: new feature

This adds the generic support for syscalls tracing.
This is currently exploited through a devoted tracer but other tracing
engines can use it. They just have to play with {start,stop}_ftrace_syscalls()
and use the display callbacks unless they want to override them.

The syscalls prototypes definitions are abused here to steal some metadata
informations:

- syscall name, param types, param names, number of params

The syscall addr is not directly saved during this definition because we don't
know if its prototype is available in the namespace. But we don't really need it.
The arch has just to build a function able to resolve the syscall
number to its metadata struct.

The current tracer prints the syscall names, parameters names and values (and their
types optionally). Currently the value is a raw hex but higher level values diplaying
is on my TODO list.
---
 include/asm-generic/vmlinux.lds.h |   11 ++-
 include/linux/ftrace.h            |   29 +++++
 include/linux/syscalls.h          |   60 +++++++++-
 kernel/trace/Kconfig              |   10 ++
 kernel/trace/Makefile             |    1 +
 kernel/trace/trace.h              |   19 +++
 kernel/trace/trace_syscalls.c     |  243 +++++++++++++++++++++++++++++++++++++
 7 files changed, 370 insertions(+), 3 deletions(-)
 create mode 100644 kernel/trace/trace_syscalls.c

diff --git a/include/asm-generic/vmlinux.lds.h b/include/asm-generic/vmlinux.lds.h
index 0e0f39b..d3bc3c8 100644
--- a/include/asm-generic/vmlinux.lds.h
+++ b/include/asm-generic/vmlinux.lds.h
@@ -77,6 +77,14 @@
 #define TRACE_PRINTKS()
 #endif
 
+#ifdef CONFIG_FTRACE_SYSCALLS
+#define TRACE_SYSCALLS() VMLINUX_SYMBOL(__start_syscalls_metadata) = .;	\
+			 *(__syscalls_metadata)				\
+			 VMLINUX_SYMBOL(__stop_syscalls_metadata) = .;
+#else
+#define TRACE_SYSCALLS()
+#endif
+
 /* .data section */
 #define DATA_DATA							\
 	*(.data)							\
@@ -99,7 +107,8 @@
 	LIKELY_PROFILE()		       				\
 	BRANCH_PROFILE()						\
 	TRACE_PRINTKS()							\
-	FTRACE_EVENTS()
+	FTRACE_EVENTS()							\
+	TRACE_SYSCALLS()
 
 #define RO_DATA(align)							\
 	. = ALIGN((align));						\
diff --git a/include/linux/ftrace.h b/include/linux/ftrace.h
index e1583f2..6dc1c65 100644
--- a/include/linux/ftrace.h
+++ b/include/linux/ftrace.h
@@ -503,4 +503,33 @@ static inline void trace_hw_branch_oops(void) {}
 
 #endif /* CONFIG_HW_BRANCH_TRACER */
 
+/*
+ * A syscall entry in the ftrace syscalls array.
+ *
+ * @name: name of the syscall
+ * @nb_args: number of parameters it takes
+ * @types: list of types as strings
+ * @args: list of args as strings (args[i] matches types[i])
+ */
+struct syscall_metadata {
+	const char	*name;
+	int		nb_args;
+	const char	**types;
+	const char	**args;
+};
+
+#ifdef CONFIG_FTRACE_SYSCALLS
+extern void arch_init_ftrace_syscalls(void);
+extern struct syscall_metadata *syscall_nr_to_meta(int nr);
+extern void start_ftrace_syscalls(void);
+extern void stop_ftrace_syscalls(void);
+extern void ftrace_syscall_enter(struct pt_regs *regs);
+extern void ftrace_syscall_exit(struct pt_regs *regs);
+#else
+static inline void start_ftrace_syscalls(void) { }
+static inline void stop_ftrace_syscalls(void) { }
+static inline void ftrace_syscall_enter(struct pt_regs *regs) { }
+static inline void ftrace_syscall_exit(struct pt_regs *regs) { }
+#endif
+
 #endif /* _LINUX_FTRACE_H */
diff --git a/include/linux/syscalls.h b/include/linux/syscalls.h
index ab1d772..dfe2a44 100644
--- a/include/linux/syscalls.h
+++ b/include/linux/syscalls.h
@@ -66,6 +66,7 @@ struct perf_counter_hw_event;
 #include <asm/signal.h>
 #include <linux/quota.h>
 #include <linux/key.h>
+#include <linux/ftrace.h>
 
 #define __SC_DECL1(t1, a1)	t1 a1
 #define __SC_DECL2(t2, a2, ...) t2 a2, __SC_DECL1(__VA_ARGS__)
@@ -96,7 +97,46 @@ struct perf_counter_hw_event;
 #define __SC_TEST5(t5, a5, ...)	__SC_TEST(t5); __SC_TEST4(__VA_ARGS__)
 #define __SC_TEST6(t6, a6, ...)	__SC_TEST(t6); __SC_TEST5(__VA_ARGS__)
 
+#ifdef CONFIG_FTRACE_SYSCALLS
+#define __SC_STR_ADECL1(t, a)		#a
+#define __SC_STR_ADECL2(t, a, ...)	#a, __SC_STR_ADECL1(__VA_ARGS__)
+#define __SC_STR_ADECL3(t, a, ...)	#a, __SC_STR_ADECL2(__VA_ARGS__)
+#define __SC_STR_ADECL4(t, a, ...)	#a, __SC_STR_ADECL3(__VA_ARGS__)
+#define __SC_STR_ADECL5(t, a, ...)	#a, __SC_STR_ADECL4(__VA_ARGS__)
+#define __SC_STR_ADECL6(t, a, ...)	#a, __SC_STR_ADECL5(__VA_ARGS__)
+
+#define __SC_STR_TDECL1(t, a)		#t
+#define __SC_STR_TDECL2(t, a, ...)	#t, __SC_STR_TDECL1(__VA_ARGS__)
+#define __SC_STR_TDECL3(t, a, ...)	#t, __SC_STR_TDECL2(__VA_ARGS__)
+#define __SC_STR_TDECL4(t, a, ...)	#t, __SC_STR_TDECL3(__VA_ARGS__)
+#define __SC_STR_TDECL5(t, a, ...)	#t, __SC_STR_TDECL4(__VA_ARGS__)
+#define __SC_STR_TDECL6(t, a, ...)	#t, __SC_STR_TDECL5(__VA_ARGS__)
+
+#define SYSCALL_METADATA(sname, nb)				\
+	static const struct syscall_metadata __used		\
+	  __attribute__((__aligned__(4)))			\
+	  __attribute__((section("__syscalls_metadata")))	\
+	  __syscall_meta_##sname = {				\
+		.name 		= "sys"#sname,			\
+		.nb_args 	= nb,				\
+		.types		= types_##sname,		\
+		.args		= args_##sname,			\
+	}
+
+#define SYSCALL_DEFINE0(sname)					\
+	static const struct syscall_metadata __used		\
+	  __attribute__((__aligned__(4)))			\
+	  __attribute__((section("__syscalls_metadata")))	\
+	  __syscall_meta_##sname = {				\
+		.name 		= "sys_"#sname,			\
+		.nb_args 	= 0,				\
+	};							\
+	asmlinkage long sys_##sname(void)
+
+#else
 #define SYSCALL_DEFINE0(name)	   asmlinkage long sys_##name(void)
+#endif
+
 #define SYSCALL_DEFINE1(name, ...) SYSCALL_DEFINEx(1, _##name, __VA_ARGS__)
 #define SYSCALL_DEFINE2(name, ...) SYSCALL_DEFINEx(2, _##name, __VA_ARGS__)
 #define SYSCALL_DEFINE3(name, ...) SYSCALL_DEFINEx(3, _##name, __VA_ARGS__)
@@ -118,10 +158,26 @@ struct perf_counter_hw_event;
 #endif
 #endif
 
+#ifdef CONFIG_FTRACE_SYSCALLS
+#define SYSCALL_DEFINEx(x, sname, ...)				\
+	static const char *types_##sname[] = {			\
+		__SC_STR_TDECL##x(__VA_ARGS__)			\
+	};							\
+	static const char *args_##sname[] = {			\
+		__SC_STR_ADECL##x(__VA_ARGS__)			\
+	};							\
+	SYSCALL_METADATA(sname, x);				\
+	__SYSCALL_DEFINEx(x, sname, __VA_ARGS__)
+#else
+#define SYSCALL_DEFINEx(x, sname, ...)				\
+	__SYSCALL_DEFINEx(x, sname, __VA_ARGS__)
+#endif
+
 #ifdef CONFIG_HAVE_SYSCALL_WRAPPERS
 
 #define SYSCALL_DEFINE(name) static inline long SYSC_##name
-#define SYSCALL_DEFINEx(x, name, ...)					\
+
+#define __SYSCALL_DEFINEx(x, name, ...)					\
 	asmlinkage long sys##name(__SC_DECL##x(__VA_ARGS__));		\
 	static inline long SYSC##name(__SC_DECL##x(__VA_ARGS__));	\
 	asmlinkage long SyS##name(__SC_LONG##x(__VA_ARGS__))		\
@@ -135,7 +191,7 @@ struct perf_counter_hw_event;
 #else /* CONFIG_HAVE_SYSCALL_WRAPPERS */
 
 #define SYSCALL_DEFINE(name) asmlinkage long sys_##name
-#define SYSCALL_DEFINEx(x, name, ...)					\
+#define __SYSCALL_DEFINEx(x, name, ...)					\
 	asmlinkage long sys##name(__SC_DECL##x(__VA_ARGS__))
 
 #endif /* CONFIG_HAVE_SYSCALL_WRAPPERS */
diff --git a/kernel/trace/Kconfig b/kernel/trace/Kconfig
index 8e4a2a6..95a0ad1 100644
--- a/kernel/trace/Kconfig
+++ b/kernel/trace/Kconfig
@@ -34,6 +34,9 @@ config HAVE_FTRACE_MCOUNT_RECORD
 config HAVE_HW_BRANCH_TRACER
 	bool
 
+config HAVE_FTRACE_SYSCALLS
+	bool
+
 config TRACER_MAX_TRACE
 	bool
 
@@ -175,6 +178,13 @@ config EVENT_TRACER
 	  allowing the user to pick and choose which trace point they
 	  want to trace.
 
+config FTRACE_SYSCALLS
+	bool "Trace syscalls"
+	depends on HAVE_FTRACE_SYSCALLS
+	select TRACING
+	help
+	  Basic tracer to catch the syscall entry and exit events.
+
 config BOOT_TRACER
 	bool "Trace boot initcalls"
 	select TRACING
diff --git a/kernel/trace/Makefile b/kernel/trace/Makefile
index c7a2943..c3feea0 100644
--- a/kernel/trace/Makefile
+++ b/kernel/trace/Makefile
@@ -43,5 +43,6 @@ obj-$(CONFIG_BLK_DEV_IO_TRACE)	+= blktrace.o
 obj-$(CONFIG_EVENT_TRACER) += trace_events.o
 obj-$(CONFIG_EVENT_TRACER) += events.o
 obj-$(CONFIG_EVENT_TRACER) += trace_export.o
+obj-$(CONFIG_FTRACE_SYSCALLS) += trace_syscalls.o
 
 libftrace-y := ftrace.o
diff --git a/kernel/trace/trace.h b/kernel/trace/trace.h
index c5e1d88..d80ca0d 100644
--- a/kernel/trace/trace.h
+++ b/kernel/trace/trace.h
@@ -30,6 +30,8 @@ enum trace_type {
 	TRACE_GRAPH_ENT,
 	TRACE_USER_STACK,
 	TRACE_HW_BRANCHES,
+	TRACE_SYSCALL_ENTER,
+	TRACE_SYSCALL_EXIT,
 	TRACE_KMEM_ALLOC,
 	TRACE_KMEM_FREE,
 	TRACE_POWER,
@@ -192,6 +194,19 @@ struct kmemtrace_free_entry {
 	const void *ptr;
 };
 
+struct syscall_trace_enter {
+	struct trace_entry	ent;
+	int			nr;
+	unsigned long		args[];
+};
+
+struct syscall_trace_exit {
+	struct trace_entry	ent;
+	int			nr;
+	unsigned long		ret;
+};
+
+
 /*
  * trace_flag_type is an enumeration that holds different
  * states when a trace occurs. These are:
@@ -304,6 +319,10 @@ extern void __ftrace_bad_type(void);
 			  TRACE_KMEM_ALLOC);	\
 		IF_ASSIGN(var, ent, struct kmemtrace_free_entry,	\
 			  TRACE_KMEM_FREE);	\
+		IF_ASSIGN(var, ent, struct syscall_trace_enter,		\
+			  TRACE_SYSCALL_ENTER);				\
+		IF_ASSIGN(var, ent, struct syscall_trace_exit,		\
+			  TRACE_SYSCALL_EXIT);				\
 		__ftrace_bad_type();					\
 	} while (0)
 
diff --git a/kernel/trace/trace_syscalls.c b/kernel/trace/trace_syscalls.c
new file mode 100644
index 0000000..c72e599
--- /dev/null
+++ b/kernel/trace/trace_syscalls.c
@@ -0,0 +1,243 @@
+#include <linux/kernel.h>
+#include <linux/ftrace.h>
+#include <asm/syscall.h>
+
+#include "trace_output.h"
+#include "trace.h"
+
+static atomic_t refcount;
+
+/* Our two options */
+enum {
+	TRACE_SYSCALLS_OPT_TYPES = 0x1,
+};
+
+static struct tracer_opt syscalls_opts[] = {
+	{ TRACER_OPT(syscall_arg_type, TRACE_SYSCALLS_OPT_TYPES) },
+	{ }
+};
+
+static struct tracer_flags syscalls_flags = {
+	.val = 0, /* By default: no args types */
+	.opts = syscalls_opts
+};
+
+enum print_line_t
+print_syscall_enter(struct trace_iterator *iter, int flags)
+{
+	struct trace_seq *s = &iter->seq;
+	struct trace_entry *ent = iter->ent;
+	struct syscall_trace_enter *trace;
+	struct syscall_metadata *entry;
+	int i, ret, syscall;
+
+	trace_assign_type(trace, ent);
+
+	syscall = trace->nr;
+
+	entry = syscall_nr_to_meta(syscall);
+	if (!entry)
+		goto end;
+
+	ret = trace_seq_printf(s, "%s(", entry->name);
+	if (!ret)
+		return TRACE_TYPE_PARTIAL_LINE;
+
+	for (i = 0; i < entry->nb_args; i++) {
+		/* parameter types */
+		if (syscalls_flags.val & TRACE_SYSCALLS_OPT_TYPES) {
+			ret = trace_seq_printf(s, "%s ", entry->types[i]);
+			if (!ret)
+				return TRACE_TYPE_PARTIAL_LINE;
+		}
+		/* parameter values */
+		ret = trace_seq_printf(s, "%s: %lx%s ", entry->args[i],
+				       trace->args[i],
+				       i == entry->nb_args - 1 ? ")" : ",");
+		if (!ret)
+			return TRACE_TYPE_PARTIAL_LINE;
+	}
+
+end:
+	trace_seq_printf(s, "\n");
+	return TRACE_TYPE_HANDLED;
+}
+
+enum print_line_t
+print_syscall_exit(struct trace_iterator *iter, int flags)
+{
+	struct trace_seq *s = &iter->seq;
+	struct trace_entry *ent = iter->ent;
+	struct syscall_trace_exit *trace;
+	int syscall;
+	struct syscall_metadata *entry;
+	int ret;
+
+	trace_assign_type(trace, ent);
+
+	syscall = trace->nr;
+
+	entry = syscall_nr_to_meta(syscall);
+	if (!entry) {
+		trace_seq_printf(s, "\n");
+		return TRACE_TYPE_HANDLED;
+	}
+
+	ret = trace_seq_printf(s, "%s -> 0x%lx\n", entry->name,
+				trace->ret);
+	if (!ret)
+		return TRACE_TYPE_PARTIAL_LINE;
+
+	return TRACE_TYPE_HANDLED;
+}
+
+void start_ftrace_syscalls(void)
+{
+	unsigned long flags;
+	struct task_struct *g, *t;
+
+	if (atomic_inc_return(&refcount) != 1)
+		goto out;
+
+	arch_init_ftrace_syscalls();
+	read_lock_irqsave(&tasklist_lock, flags);
+
+	do_each_thread(g, t) {
+		set_tsk_thread_flag(t, TIF_SYSCALL_FTRACE);
+	} while_each_thread(g, t);
+
+	read_unlock_irqrestore(&tasklist_lock, flags);
+out:
+	atomic_dec(&refcount);
+}
+
+void stop_ftrace_syscalls(void)
+{
+	unsigned long flags;
+	struct task_struct *g, *t;
+
+	if (atomic_dec_return(&refcount))
+		goto out;
+
+	read_lock_irqsave(&tasklist_lock, flags);
+
+	do_each_thread(g, t) {
+		clear_tsk_thread_flag(t, TIF_SYSCALL_FTRACE);
+	} while_each_thread(g, t);
+
+	read_unlock_irqrestore(&tasklist_lock, flags);
+out:
+	atomic_inc(&refcount);
+}
+
+void ftrace_syscall_enter(struct pt_regs *regs)
+{
+	struct syscall_trace_enter *entry;
+	struct syscall_metadata *sys_data;
+	struct ring_buffer_event *event;
+	int size;
+	int syscall_nr;
+	int cpu;
+
+	syscall_nr = syscall_get_nr(current, regs);
+
+	cpu = raw_smp_processor_id();
+
+	sys_data = syscall_nr_to_meta(syscall_nr);
+	if (!sys_data)
+		return;
+
+	size = sizeof(*entry) + sizeof(unsigned long) * sys_data->nb_args;
+
+	event = trace_current_buffer_lock_reserve(TRACE_SYSCALL_ENTER, size,
+							0, 0);
+	if (!event)
+		return;
+
+	entry = ring_buffer_event_data(event);
+	entry->nr = syscall_nr;
+	syscall_get_arguments(current, regs, 0, sys_data->nb_args, entry->args);
+
+	trace_current_buffer_unlock_commit(event, 0, 0);
+	trace_wake_up();
+}
+
+void ftrace_syscall_exit(struct pt_regs *regs)
+{
+	struct syscall_trace_exit *entry;
+	struct syscall_metadata *sys_data;
+	struct ring_buffer_event *event;
+	int syscall_nr;
+	int cpu;
+
+	syscall_nr = syscall_get_nr(current, regs);
+
+	cpu = raw_smp_processor_id();
+
+	sys_data = syscall_nr_to_meta(syscall_nr);
+	if (!sys_data)
+		return;
+
+	event = trace_current_buffer_lock_reserve(TRACE_SYSCALL_EXIT,
+				sizeof(*entry), 0, 0);
+	if (!event)
+		return;
+
+	entry = ring_buffer_event_data(event);
+	entry->nr = syscall_nr;
+	entry->ret = syscall_get_return_value(current, regs);
+
+	trace_current_buffer_unlock_commit(event, 0, 0);
+	trace_wake_up();
+}
+
+static int init_syscall_tracer(struct trace_array *tr)
+{
+	start_ftrace_syscalls();
+
+	return 0;
+}
+
+static void reset_syscall_tracer(struct trace_array *tr)
+{
+	stop_ftrace_syscalls();
+}
+
+static struct trace_event syscall_enter_event = {
+	.type	 	= TRACE_SYSCALL_ENTER,
+	.trace		= print_syscall_enter,
+};
+
+static struct trace_event syscall_exit_event = {
+	.type	 	= TRACE_SYSCALL_EXIT,
+	.trace		= print_syscall_exit,
+};
+
+static struct tracer syscall_tracer __read_mostly = {
+	.name	     	= "syscall",
+	.init		= init_syscall_tracer,
+	.reset		= reset_syscall_tracer,
+	.flags		= &syscalls_flags,
+};
+
+__init int register_ftrace_syscalls(void)
+{
+	int ret;
+
+	ret = register_ftrace_event(&syscall_enter_event);
+	if (!ret) {
+		printk(KERN_WARNING "event %d failed to register\n",
+		       syscall_enter_event.type);
+		WARN_ON_ONCE(1);
+	}
+
+	ret = register_ftrace_event(&syscall_exit_event);
+	if (!ret) {
+		printk(KERN_WARNING "event %d failed to register\n",
+		       syscall_exit_event.type);
+		WARN_ON_ONCE(1);
+	}
+
+	return register_tracer(&syscall_tracer);
+}
+device_initcall(register_ftrace_syscalls);
-- 
1.6.1


^ permalink raw reply related	[flat|nested] 35+ messages in thread

* [PATCH 2/2 v2] tracing/syscalls: support for syscalls tracing on x86-64
  2009-03-13 14:42 [PATCH 0/2 v2] Syscalls tracing Frederic Weisbecker
  2009-03-13 14:42 ` [PATCH 1/2 v2] tracing/syscalls: core infrastructure for syscalls tracing Frederic Weisbecker
@ 2009-03-13 14:42 ` Frederic Weisbecker
  2009-03-13 16:09   ` [tip:tracing/syscalls] tracing/syscalls: support for syscalls tracing on x86 Frederic Weisbecker
  2009-03-15  3:53   ` [PATCH 2/2 v2] tracing/syscalls: support for syscalls tracing on x86-64 Andrew Morton
  2009-03-13 15:16 ` [PATCH 0/2 v2] Syscalls tracing Frederic Weisbecker
  2 siblings, 2 replies; 35+ messages in thread
From: Frederic Weisbecker @ 2009-03-13 14:42 UTC (permalink / raw)
  To: Ingo Molnar, Ingo Molnar
  Cc: Frederic Weisbecker, Linux Kernel Mailing List, Peter Zijlstra,
	Steven Rostedt, tglx, Jason Baron, Frank Ch. Eigler,
	Mathieu Desnoyers, KOSAKI Motohiro, Lai Jiangshan, Jiaying Zhang,
	Michael Rubin, Martin Bligh, Michael Davidson

Some bits for syscalls tracing remain arch dependent, including:

- tasks trapping on syscalls entry/exit by using TIF_FLAFS
- call ftrace_syscall_{enter,exit}()
- implement syscall_nr_to_meta() which provides the syscalls metadata
  giving the syscall number

The latter callback is fast on x86-64, a table is built once the syscall
tracing is first launched. This table contains the address of each syscalls
metadata having the syscall number as an index on the table.

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
---
 arch/x86/Kconfig                   |    1 +
 arch/x86/include/asm/ftrace.h      |    7 ++++
 arch/x86/include/asm/thread_info.h |    9 +++--
 arch/x86/kernel/ftrace.c           |   63 ++++++++++++++++++++++++++++++++++++
 arch/x86/kernel/ptrace.c           |    7 ++++
 5 files changed, 84 insertions(+), 3 deletions(-)

diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index 15ec8a2..4c6e422 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -35,6 +35,7 @@ config X86
 	select HAVE_FUNCTION_GRAPH_TRACER
 	select HAVE_FUNCTION_TRACE_MCOUNT_TEST
 	select HAVE_FTRACE_NMI_ENTER if DYNAMIC_FTRACE
+	select HAVE_FTRACE_SYSCALLS if X86_64
 	select HAVE_KVM
 	select HAVE_ARCH_KGDB
 	select HAVE_ARCH_TRACEHOOK
diff --git a/arch/x86/include/asm/ftrace.h b/arch/x86/include/asm/ftrace.h
index db24c22..d0954f7 100644
--- a/arch/x86/include/asm/ftrace.h
+++ b/arch/x86/include/asm/ftrace.h
@@ -28,6 +28,13 @@
 
 #endif
 
+/* FIXME: I don't want to stay hardcoded */
+#ifdef CONFIG_X86_64
+#define FTRACE_SYSCALL_MAX     296
+#else
+#define FTRACE_SYSCALL_MAX     333
+#endif
+
 #ifdef CONFIG_FUNCTION_TRACER
 #define MCOUNT_ADDR		((long)(mcount))
 #define MCOUNT_INSN_SIZE	5 /* sizeof mcount call */
diff --git a/arch/x86/include/asm/thread_info.h b/arch/x86/include/asm/thread_info.h
index f6ba6f8..83d2b73 100644
--- a/arch/x86/include/asm/thread_info.h
+++ b/arch/x86/include/asm/thread_info.h
@@ -95,6 +95,7 @@ struct thread_info {
 #define TIF_FORCED_TF		24	/* true if TF in eflags artificially */
 #define TIF_DEBUGCTLMSR		25	/* uses thread_struct.debugctlmsr */
 #define TIF_DS_AREA_MSR		26      /* uses thread_struct.ds_area_msr */
+#define TIF_SYSCALL_FTRACE	27	/* for ftrace syscall instrumentation */
 
 #define _TIF_SYSCALL_TRACE	(1 << TIF_SYSCALL_TRACE)
 #define _TIF_NOTIFY_RESUME	(1 << TIF_NOTIFY_RESUME)
@@ -117,15 +118,17 @@ struct thread_info {
 #define _TIF_FORCED_TF		(1 << TIF_FORCED_TF)
 #define _TIF_DEBUGCTLMSR	(1 << TIF_DEBUGCTLMSR)
 #define _TIF_DS_AREA_MSR	(1 << TIF_DS_AREA_MSR)
+#define _TIF_SYSCALL_FTRACE	(1 << TIF_SYSCALL_FTRACE)
 
 /* work to do in syscall_trace_enter() */
 #define _TIF_WORK_SYSCALL_ENTRY	\
-	(_TIF_SYSCALL_TRACE | _TIF_SYSCALL_EMU | \
+	(_TIF_SYSCALL_TRACE | _TIF_SYSCALL_EMU | _TIF_SYSCALL_FTRACE |	\
 	 _TIF_SYSCALL_AUDIT | _TIF_SECCOMP | _TIF_SINGLESTEP)
 
 /* work to do in syscall_trace_leave() */
 #define _TIF_WORK_SYSCALL_EXIT	\
-	(_TIF_SYSCALL_TRACE | _TIF_SYSCALL_AUDIT | _TIF_SINGLESTEP)
+	(_TIF_SYSCALL_TRACE | _TIF_SYSCALL_AUDIT | _TIF_SINGLESTEP |	\
+	 _TIF_SYSCALL_FTRACE)
 
 /* work to do on interrupt/exception return */
 #define _TIF_WORK_MASK							\
@@ -134,7 +137,7 @@ struct thread_info {
 	   _TIF_SINGLESTEP|_TIF_SECCOMP|_TIF_SYSCALL_EMU))
 
 /* work to do on any return to user space */
-#define _TIF_ALLWORK_MASK (0x0000FFFF & ~_TIF_SECCOMP)
+#define _TIF_ALLWORK_MASK ((0x0000FFFF & ~_TIF_SECCOMP) | _TIF_SYSCALL_FTRACE)
 
 /* Only used for 64 bit */
 #define _TIF_DO_NOTIFY_MASK						\
diff --git a/arch/x86/kernel/ftrace.c b/arch/x86/kernel/ftrace.c
index a85da17..1d0d7f4 100644
--- a/arch/x86/kernel/ftrace.c
+++ b/arch/x86/kernel/ftrace.c
@@ -453,3 +453,66 @@ void prepare_ftrace_return(unsigned long *parent, unsigned long self_addr)
 	}
 }
 #endif /* CONFIG_FUNCTION_GRAPH_TRACER */
+
+#ifdef CONFIG_FTRACE_SYSCALLS
+
+extern unsigned long __start_syscalls_metadata[];
+extern unsigned long __stop_syscalls_metadata[];
+extern unsigned long *sys_call_table;
+
+static struct syscall_metadata **syscalls_metadata;
+
+static struct syscall_metadata *find_syscall_meta(unsigned long *syscall)
+{
+	struct syscall_metadata *start;
+	struct syscall_metadata *stop;
+	char str[KSYM_SYMBOL_LEN];
+
+
+	start = (struct syscall_metadata *)__start_syscalls_metadata;
+	stop = (struct syscall_metadata *)__stop_syscalls_metadata;
+	kallsyms_lookup((unsigned long) syscall, NULL, NULL, NULL, str);
+
+	for ( ; start < stop; start++) {
+		if (start->name && !strcmp(start->name, str))
+			return start;
+	}
+	return NULL;
+}
+
+struct syscall_metadata *syscall_nr_to_meta(int nr)
+{
+	if (!syscalls_metadata || nr >= FTRACE_SYSCALL_MAX || nr < 0)
+		return NULL;
+
+	return syscalls_metadata[nr];
+}
+
+void arch_init_ftrace_syscalls(void)
+{
+	int i;
+	struct syscall_metadata *meta;
+	unsigned long **psys_syscall_table = &sys_call_table;
+	static atomic_t refs;
+
+	if (atomic_inc_return(&refs) != 1)
+		goto end;
+
+	syscalls_metadata = kzalloc(sizeof(*syscalls_metadata) *
+					FTRACE_SYSCALL_MAX, GFP_KERNEL);
+	if (!syscalls_metadata) {
+		WARN_ON(1);
+		return;
+	}
+
+	for (i = 0; i < FTRACE_SYSCALL_MAX; i++) {
+		meta = find_syscall_meta(psys_syscall_table[i]);
+		syscalls_metadata[i] = meta;
+	}
+	return;
+
+	/* Paranoid: avoid overflow */
+end:
+	atomic_dec(&refs);
+}
+#endif
diff --git a/arch/x86/kernel/ptrace.c b/arch/x86/kernel/ptrace.c
index 3d9672e..99749d6 100644
--- a/arch/x86/kernel/ptrace.c
+++ b/arch/x86/kernel/ptrace.c
@@ -21,6 +21,7 @@
 #include <linux/audit.h>
 #include <linux/seccomp.h>
 #include <linux/signal.h>
+#include <linux/ftrace.h>
 
 #include <asm/uaccess.h>
 #include <asm/pgtable.h>
@@ -1416,6 +1417,9 @@ asmregparm long syscall_trace_enter(struct pt_regs *regs)
 	    tracehook_report_syscall_entry(regs))
 		ret = -1L;
 
+	if (unlikely(test_thread_flag(TIF_SYSCALL_FTRACE)))
+		ftrace_syscall_enter(regs);
+
 	if (unlikely(current->audit_context)) {
 		if (IS_IA32)
 			audit_syscall_entry(AUDIT_ARCH_I386,
@@ -1439,6 +1443,9 @@ asmregparm void syscall_trace_leave(struct pt_regs *regs)
 	if (unlikely(current->audit_context))
 		audit_syscall_exit(AUDITSC_RESULT(regs->ax), regs->ax);
 
+	if (unlikely(test_thread_flag(TIF_SYSCALL_FTRACE)))
+		ftrace_syscall_exit(regs);
+
 	if (test_thread_flag(TIF_SYSCALL_TRACE))
 		tracehook_report_syscall_exit(regs, 0);
 
-- 
1.6.1


^ permalink raw reply related	[flat|nested] 35+ messages in thread

* Re: [PATCH 0/2 v2] Syscalls tracing
  2009-03-13 14:42 [PATCH 0/2 v2] Syscalls tracing Frederic Weisbecker
  2009-03-13 14:42 ` [PATCH 1/2 v2] tracing/syscalls: core infrastructure for syscalls tracing Frederic Weisbecker
  2009-03-13 14:42 ` [PATCH 2/2 v2] tracing/syscalls: support for syscalls tracing on x86-64 Frederic Weisbecker
@ 2009-03-13 15:16 ` Frederic Weisbecker
  2009-03-13 15:55   ` Ingo Molnar
  2009-03-13 16:47   ` Mathieu Desnoyers
  2 siblings, 2 replies; 35+ messages in thread
From: Frederic Weisbecker @ 2009-03-13 15:16 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Linux Kernel Mailing List, Peter Zijlstra, Steven Rostedt, tglx,
	Jason Baron, Frank Ch. Eigler, Mathieu Desnoyers,
	KOSAKI Motohiro, Lai Jiangshan, Jiaying Zhang, Michael Rubin,
	Martin Bligh, Michael Davidson

On Fri, Mar 13, 2009 at 03:42:10PM +0100, Frederic Weisbecker wrote:
> tracing/syscalls: core infrastructure to trace syscalls
> 
> This new iteration addresses a good part of the previous reviews.


Ah I just discovered that you applied the previous version today.
But the v2 is not a delta :-s

I can rebase them but not until Sunday.

 
> As suggested by Ingo Molnar and Peter Zijlstra, the syscalls
> prototypes probing is done by abusing the SYSCALL_DEFINE family
> macros.
> We now store automatically the arguments names, their types, their number
> and the name of the syscall.
> 
> Also some fixes on output newlines and dangerous exporting of global_trace
> are provided.
> 
> An example of the trace:
> 
> echo syscall > /debugfs/tracing/current_tracer
> 
> 
> # tracer: syscall
> #
> #           TASK-PID    CPU#    TIMESTAMP  FUNCTION
> #              | |       |          |         |
>            <...>-5080  [001]   132.192228: sys_dup2(oldfd: a, newfd: 1) 
>            <...>-5080  [001]   132.192239: sys_dup2 -> 0x1
>            <...>-5080  [001]   132.192242: sys_fcntl(fd: a, cmd: 1, arg: 0) 
>            <...>-5080  [001]   132.192245: sys_fcntl -> 0x1
>            <...>-5080  [001]   132.192248: sys_close(fd: a) 
>            <...>-5080  [001]   132.192250: sys_close -> 0x0
>            <...>-5080  [001]   132.192265: sys_rt_sigprocmask(how: 0, set: 0, oset: 6cf808, sigsetsize: 8) 
>            <...>-5080  [001]   132.192267: sys_rt_sigprocmask -> 0x0
>            <...>-5080  [001]   132.192271: sys_rt_sigaction(sig: 2, act: 7fffff4338f0, oact: 7fffff433850, sigsetsize: 8) 
>            <...>-5080  [001]   132.192273: sys_rt_sigaction -> 0x0
>            <...>-5080  [001]   132.192285: sys_rt_sigprocmask(how: 0, set: 0, oset: 6cf808, sigsetsize: 8) 
>            <...>-5080  [001]   132.192287: sys_rt_sigprocmask -> 0x0
>            <...>-5080  [001]   132.192415: sys_write(fd: 1, buf: 15dfc08, count: 21) 
>            <...>-5080  [001]   132.192436: sys_write -> 0x21
>            <...>-4754  [000]   132.192478: sys_read(fd: 8, buf: 2a9340e, count: 1fee) 
>            <...>-5080  [001]   132.192487: sys_rt_sigprocmask(how: 0, set: 7fffff432a70, oset: 7fffff4329f0, sigsetsize: 8) 
> 
> And if you ask for the parameters types:
> 
> echo syscall_arg_type > trace_options
> 
> # tracer: syscall
> #
> #           TASK-PID    CPU#    TIMESTAMP  FUNCTION
> #              | |       |          |         |
>            <...>-5080  [001]   132.192228: sys_dup2(unsigned int oldfd: a, unsigned int newfd: 1) 
>            <...>-5080  [001]   132.192239: sys_dup2 -> 0x1
>            <...>-5080  [001]   132.192242: sys_fcntl(unsigned int fd: a, unsigned int cmd: 1, unsigned long arg: 0) 
>            <...>-5080  [001]   132.192245: sys_fcntl -> 0x1
>            <...>-5080  [001]   132.192248: sys_close(unsigned int fd: a) 
>            <...>-5080  [001]   132.192250: sys_close -> 0x0
>            <...>-5080  [001]   132.192265: sys_rt_sigprocmask(int how: 0, sigset_t * set: 0, sigset_t * oset: 6cf808, size_t sigsetsize: 8) 
>            <...>-5080  [001]   132.192267: sys_rt_sigprocmask -> 0x0
>            <...>-5080  [001]   132.192271: sys_rt_sigaction(int sig: 2, const struct sigaction * act: 7fffff4338f0, struct sigaction * oact: 7fffff433850, size_t sigsetsize: 8) 
>            <...>-5080  [001]   132.192273: sys_rt_sigaction -> 0x0
>            <...>-5080  [001]   132.192285: sys_rt_sigprocmask(int how: 0, sigset_t * set: 0, sigset_t * oset: 6cf808, size_t sigsetsize: 8) 
>            <...>-5080  [001]   132.192287: sys_rt_sigprocmask -> 0x0
>            <...>-5080  [001]   132.192415: sys_write(unsigned int fd: 1, const char * buf: 15dfc08, size_t count: 21) 
>            <...>-5080  [001]   132.192436: sys_write -> 0x21
> 
> TODO:
> 
> - add a single mask on the struct syscall_metadata to provide quickly which arguments is
>   a pointer (usually type __user *p) so that the user can decide if he wants to save them of tracing time
> 
> - now that we have each parameter type as strings, add a new field on struct syscall_metadata to have the parameter types
>   encoded as single enum values (for quick checks) so that we can use specific callbacks for each parameter type
>   to be displayed.
> 
> NOTE: this is still not overlapping with a potential future merge of utrace, since the low-level hooks on
> the syscalls remain somewhat basic.
> 
> NOTE2: I've only tested it on x86-64 for now, so only x86-64 support is provided.
> --
> 
> Frederic Weisbecker (2):
>   tracing/syscalls: core infrastructure for syscalls tracing
>   tracing/syscalls: support for syscalls tracing on x86-64
> 
>  arch/x86/Kconfig                   |    1 +
>  arch/x86/include/asm/ftrace.h      |    7 +
>  arch/x86/include/asm/thread_info.h |    9 +-
>  arch/x86/kernel/ftrace.c           |   63 +++++++++
>  arch/x86/kernel/ptrace.c           |    7 +
>  include/asm-generic/vmlinux.lds.h  |   11 ++-
>  include/linux/ftrace.h             |   29 +++++
>  include/linux/syscalls.h           |   60 +++++++++-
>  kernel/trace/Kconfig               |   10 ++
>  kernel/trace/Makefile              |    1 +
>  kernel/trace/trace.h               |   19 +++
>  kernel/trace/trace_syscalls.c      |  243 ++++++++++++++++++++++++++++++++++++
>  12 files changed, 454 insertions(+), 6 deletions(-)
>  create mode 100644 kernel/trace/trace_syscalls.c
> 


^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH 0/2 v2] Syscalls tracing
  2009-03-13 15:16 ` [PATCH 0/2 v2] Syscalls tracing Frederic Weisbecker
@ 2009-03-13 15:55   ` Ingo Molnar
  2009-03-13 16:17     ` Ingo Molnar
  2009-03-13 16:47   ` Mathieu Desnoyers
  1 sibling, 1 reply; 35+ messages in thread
From: Ingo Molnar @ 2009-03-13 15:55 UTC (permalink / raw)
  To: Frederic Weisbecker
  Cc: Linux Kernel Mailing List, Peter Zijlstra, Steven Rostedt, tglx,
	Jason Baron, Frank Ch. Eigler, Mathieu Desnoyers,
	KOSAKI Motohiro, Lai Jiangshan, Jiaying Zhang, Michael Rubin,
	Martin Bligh, Michael Davidson


* Frederic Weisbecker <fweisbec@gmail.com> wrote:

> On Fri, Mar 13, 2009 at 03:42:10PM +0100, Frederic Weisbecker wrote:
> > tracing/syscalls: core infrastructure to trace syscalls
> > 
> > This new iteration addresses a good part of the previous reviews.
> 
> 
> Ah I just discovered that you applied the previous version 
> today. But the v2 is not a delta :-s
> 
> I can rebase them but not until Sunday.

No problem, i'll deltify them and will have a look.

	Ingo

^ permalink raw reply	[flat|nested] 35+ messages in thread

* [tip:tracing/syscalls] tracing/syscalls: support for syscalls tracing on x86
  2009-03-13 14:42 ` [PATCH 2/2 v2] tracing/syscalls: support for syscalls tracing on x86-64 Frederic Weisbecker
@ 2009-03-13 16:09   ` Frederic Weisbecker
  2009-03-15  3:53   ` [PATCH 2/2 v2] tracing/syscalls: support for syscalls tracing on x86-64 Andrew Morton
  1 sibling, 0 replies; 35+ messages in thread
From: Frederic Weisbecker @ 2009-03-13 16:09 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: linux-kernel, hpa, mingo, fweisbec, rostedt, tglx, mingo

Commit-ID:  f58ba100678f421bdcb000a3c71793f432dfab93
Gitweb:     http://git.kernel.org/tip/f58ba100678f421bdcb000a3c71793f432dfab93
Author:     Frederic Weisbecker <fweisbec@gmail.com>
AuthorDate: Fri, 13 Mar 2009 15:42:12 +0100
Commit:     Ingo Molnar <mingo@elte.hu>
CommitDate: Fri, 13 Mar 2009 16:57:42 +0100

tracing/syscalls: support for syscalls tracing on x86

Extend x86 architecture syscall tracing support with syscall
metadata table details.

(The upcoming core syscall tracing modifications rely on this.)

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
LKML-Reference: <1236955332-10133-3-git-send-email-fweisbec@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>


---
 arch/x86/include/asm/ftrace.h |    7 ++++
 arch/x86/kernel/ftrace.c      |   63 +++++++++++++++++++++++++++++++++++++++++
 2 files changed, 70 insertions(+), 0 deletions(-)

diff --git a/arch/x86/include/asm/ftrace.h b/arch/x86/include/asm/ftrace.h
index db24c22..bd2c651 100644
--- a/arch/x86/include/asm/ftrace.h
+++ b/arch/x86/include/asm/ftrace.h
@@ -28,6 +28,13 @@
 
 #endif
 
+/* FIXME: I don't want to stay hardcoded */
+#ifdef CONFIG_X86_64
+# define FTRACE_SYSCALL_MAX     296
+#else
+# define FTRACE_SYSCALL_MAX     333
+#endif
+
 #ifdef CONFIG_FUNCTION_TRACER
 #define MCOUNT_ADDR		((long)(mcount))
 #define MCOUNT_INSN_SIZE	5 /* sizeof mcount call */
diff --git a/arch/x86/kernel/ftrace.c b/arch/x86/kernel/ftrace.c
index a85da17..1d0d7f4 100644
--- a/arch/x86/kernel/ftrace.c
+++ b/arch/x86/kernel/ftrace.c
@@ -453,3 +453,66 @@ void prepare_ftrace_return(unsigned long *parent, unsigned long self_addr)
 	}
 }
 #endif /* CONFIG_FUNCTION_GRAPH_TRACER */
+
+#ifdef CONFIG_FTRACE_SYSCALLS
+
+extern unsigned long __start_syscalls_metadata[];
+extern unsigned long __stop_syscalls_metadata[];
+extern unsigned long *sys_call_table;
+
+static struct syscall_metadata **syscalls_metadata;
+
+static struct syscall_metadata *find_syscall_meta(unsigned long *syscall)
+{
+	struct syscall_metadata *start;
+	struct syscall_metadata *stop;
+	char str[KSYM_SYMBOL_LEN];
+
+
+	start = (struct syscall_metadata *)__start_syscalls_metadata;
+	stop = (struct syscall_metadata *)__stop_syscalls_metadata;
+	kallsyms_lookup((unsigned long) syscall, NULL, NULL, NULL, str);
+
+	for ( ; start < stop; start++) {
+		if (start->name && !strcmp(start->name, str))
+			return start;
+	}
+	return NULL;
+}
+
+struct syscall_metadata *syscall_nr_to_meta(int nr)
+{
+	if (!syscalls_metadata || nr >= FTRACE_SYSCALL_MAX || nr < 0)
+		return NULL;
+
+	return syscalls_metadata[nr];
+}
+
+void arch_init_ftrace_syscalls(void)
+{
+	int i;
+	struct syscall_metadata *meta;
+	unsigned long **psys_syscall_table = &sys_call_table;
+	static atomic_t refs;
+
+	if (atomic_inc_return(&refs) != 1)
+		goto end;
+
+	syscalls_metadata = kzalloc(sizeof(*syscalls_metadata) *
+					FTRACE_SYSCALL_MAX, GFP_KERNEL);
+	if (!syscalls_metadata) {
+		WARN_ON(1);
+		return;
+	}
+
+	for (i = 0; i < FTRACE_SYSCALL_MAX; i++) {
+		meta = find_syscall_meta(psys_syscall_table[i]);
+		syscalls_metadata[i] = meta;
+	}
+	return;
+
+	/* Paranoid: avoid overflow */
+end:
+	atomic_dec(&refs);
+}
+#endif

^ permalink raw reply related	[flat|nested] 35+ messages in thread

* [tip:tracing/syscalls] tracing/syscalls: core infrastructure for syscalls tracing, enhancements
  2009-03-13 14:42 ` [PATCH 1/2 v2] tracing/syscalls: core infrastructure for syscalls tracing Frederic Weisbecker
@ 2009-03-13 16:09   ` Frederic Weisbecker
  2009-03-15  3:51   ` [PATCH 1/2 v2] tracing/syscalls: core infrastructure for syscalls tracing Andrew Morton
  2009-04-06 21:55   ` Tony Luck
  2 siblings, 0 replies; 35+ messages in thread
From: Frederic Weisbecker @ 2009-03-13 16:09 UTC (permalink / raw)
  To: linux-tip-commits; +Cc: linux-kernel, hpa, mingo, fweisbec, tglx, mingo

Commit-ID:  bed1ffca022cc876fb83161d26670e9b5d3cf36b
Gitweb:     http://git.kernel.org/tip/bed1ffca022cc876fb83161d26670e9b5d3cf36b
Author:     Frederic Weisbecker <fweisbec@gmail.com>
AuthorDate: Fri, 13 Mar 2009 15:42:11 +0100
Commit:     Ingo Molnar <mingo@elte.hu>
CommitDate: Fri, 13 Mar 2009 16:57:42 +0100

tracing/syscalls: core infrastructure for syscalls tracing, enhancements

Impact: new feature

This adds the generic support for syscalls tracing. This is
currently exploited through a devoted tracer but other tracing
engines can use it. (They just have to play with
{start,stop}_ftrace_syscalls() and use the display callbacks
unless they want to override them.)

The syscalls prototypes definitions are abused here to steal
some metadata informations:

- syscall name, param types, param names, number of params

The syscall addr is not directly saved during this definition
because we don't know if its prototype is available in the
namespace. But we don't really need it. The arch has just to
build a function able to resolve the syscall number to its
metadata struct.

The current tracer prints the syscall names, parameters names
and values (and their types optionally). Currently the value is
a raw hex but higher level values diplaying is on my TODO list.

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
LKML-Reference: <1236955332-10133-2-git-send-email-fweisbec@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>


---
 include/asm-generic/vmlinux.lds.h |   11 +++-
 include/linux/ftrace.h            |   14 +++-
 include/linux/syscalls.h          |   60 +++++++++++++++-
 kernel/trace/trace.h              |   17 ++++
 kernel/trace/trace_syscalls.c     |  146 +++++++++++++++++++++++++++++++++++--
 5 files changed, 234 insertions(+), 14 deletions(-)

diff --git a/include/asm-generic/vmlinux.lds.h b/include/asm-generic/vmlinux.lds.h
index 0e0f39b..d3bc3c8 100644
--- a/include/asm-generic/vmlinux.lds.h
+++ b/include/asm-generic/vmlinux.lds.h
@@ -77,6 +77,14 @@
 #define TRACE_PRINTKS()
 #endif
 
+#ifdef CONFIG_FTRACE_SYSCALLS
+#define TRACE_SYSCALLS() VMLINUX_SYMBOL(__start_syscalls_metadata) = .;	\
+			 *(__syscalls_metadata)				\
+			 VMLINUX_SYMBOL(__stop_syscalls_metadata) = .;
+#else
+#define TRACE_SYSCALLS()
+#endif
+
 /* .data section */
 #define DATA_DATA							\
 	*(.data)							\
@@ -99,7 +107,8 @@
 	LIKELY_PROFILE()		       				\
 	BRANCH_PROFILE()						\
 	TRACE_PRINTKS()							\
-	FTRACE_EVENTS()
+	FTRACE_EVENTS()							\
+	TRACE_SYSCALLS()
 
 #define RO_DATA(align)							\
 	. = ALIGN((align));						\
diff --git a/include/linux/ftrace.h b/include/linux/ftrace.h
index c146c10..6dc1c65 100644
--- a/include/linux/ftrace.h
+++ b/include/linux/ftrace.h
@@ -506,13 +506,21 @@ static inline void trace_hw_branch_oops(void) {}
 /*
  * A syscall entry in the ftrace syscalls array.
  *
- * @syscall_nr: syscall number
+ * @name: name of the syscall
+ * @nb_args: number of parameters it takes
+ * @types: list of types as strings
+ * @args: list of args as strings (args[i] matches types[i])
  */
-struct syscall_trace_entry {
-	int		syscall_nr;
+struct syscall_metadata {
+	const char	*name;
+	int		nb_args;
+	const char	**types;
+	const char	**args;
 };
 
 #ifdef CONFIG_FTRACE_SYSCALLS
+extern void arch_init_ftrace_syscalls(void);
+extern struct syscall_metadata *syscall_nr_to_meta(int nr);
 extern void start_ftrace_syscalls(void);
 extern void stop_ftrace_syscalls(void);
 extern void ftrace_syscall_enter(struct pt_regs *regs);
diff --git a/include/linux/syscalls.h b/include/linux/syscalls.h
index f9f900c..0cff9bb 100644
--- a/include/linux/syscalls.h
+++ b/include/linux/syscalls.h
@@ -65,6 +65,7 @@ struct old_linux_dirent;
 #include <asm/signal.h>
 #include <linux/quota.h>
 #include <linux/key.h>
+#include <linux/ftrace.h>
 
 #define __SC_DECL1(t1, a1)	t1 a1
 #define __SC_DECL2(t2, a2, ...) t2 a2, __SC_DECL1(__VA_ARGS__)
@@ -95,7 +96,46 @@ struct old_linux_dirent;
 #define __SC_TEST5(t5, a5, ...)	__SC_TEST(t5); __SC_TEST4(__VA_ARGS__)
 #define __SC_TEST6(t6, a6, ...)	__SC_TEST(t6); __SC_TEST5(__VA_ARGS__)
 
+#ifdef CONFIG_FTRACE_SYSCALLS
+#define __SC_STR_ADECL1(t, a)		#a
+#define __SC_STR_ADECL2(t, a, ...)	#a, __SC_STR_ADECL1(__VA_ARGS__)
+#define __SC_STR_ADECL3(t, a, ...)	#a, __SC_STR_ADECL2(__VA_ARGS__)
+#define __SC_STR_ADECL4(t, a, ...)	#a, __SC_STR_ADECL3(__VA_ARGS__)
+#define __SC_STR_ADECL5(t, a, ...)	#a, __SC_STR_ADECL4(__VA_ARGS__)
+#define __SC_STR_ADECL6(t, a, ...)	#a, __SC_STR_ADECL5(__VA_ARGS__)
+
+#define __SC_STR_TDECL1(t, a)		#t
+#define __SC_STR_TDECL2(t, a, ...)	#t, __SC_STR_TDECL1(__VA_ARGS__)
+#define __SC_STR_TDECL3(t, a, ...)	#t, __SC_STR_TDECL2(__VA_ARGS__)
+#define __SC_STR_TDECL4(t, a, ...)	#t, __SC_STR_TDECL3(__VA_ARGS__)
+#define __SC_STR_TDECL5(t, a, ...)	#t, __SC_STR_TDECL4(__VA_ARGS__)
+#define __SC_STR_TDECL6(t, a, ...)	#t, __SC_STR_TDECL5(__VA_ARGS__)
+
+#define SYSCALL_METADATA(sname, nb)				\
+	static const struct syscall_metadata __used		\
+	  __attribute__((__aligned__(4)))			\
+	  __attribute__((section("__syscalls_metadata")))	\
+	  __syscall_meta_##sname = {				\
+		.name 		= "sys"#sname,			\
+		.nb_args 	= nb,				\
+		.types		= types_##sname,		\
+		.args		= args_##sname,			\
+	}
+
+#define SYSCALL_DEFINE0(sname)					\
+	static const struct syscall_metadata __used		\
+	  __attribute__((__aligned__(4)))			\
+	  __attribute__((section("__syscalls_metadata")))	\
+	  __syscall_meta_##sname = {				\
+		.name 		= "sys_"#sname,			\
+		.nb_args 	= 0,				\
+	};							\
+	asmlinkage long sys_##sname(void)
+
+#else
 #define SYSCALL_DEFINE0(name)	   asmlinkage long sys_##name(void)
+#endif
+
 #define SYSCALL_DEFINE1(name, ...) SYSCALL_DEFINEx(1, _##name, __VA_ARGS__)
 #define SYSCALL_DEFINE2(name, ...) SYSCALL_DEFINEx(2, _##name, __VA_ARGS__)
 #define SYSCALL_DEFINE3(name, ...) SYSCALL_DEFINEx(3, _##name, __VA_ARGS__)
@@ -117,10 +157,26 @@ struct old_linux_dirent;
 #endif
 #endif
 
+#ifdef CONFIG_FTRACE_SYSCALLS
+#define SYSCALL_DEFINEx(x, sname, ...)				\
+	static const char *types_##sname[] = {			\
+		__SC_STR_TDECL##x(__VA_ARGS__)			\
+	};							\
+	static const char *args_##sname[] = {			\
+		__SC_STR_ADECL##x(__VA_ARGS__)			\
+	};							\
+	SYSCALL_METADATA(sname, x);				\
+	__SYSCALL_DEFINEx(x, sname, __VA_ARGS__)
+#else
+#define SYSCALL_DEFINEx(x, sname, ...)				\
+	__SYSCALL_DEFINEx(x, sname, __VA_ARGS__)
+#endif
+
 #ifdef CONFIG_HAVE_SYSCALL_WRAPPERS
 
 #define SYSCALL_DEFINE(name) static inline long SYSC_##name
-#define SYSCALL_DEFINEx(x, name, ...)					\
+
+#define __SYSCALL_DEFINEx(x, name, ...)					\
 	asmlinkage long sys##name(__SC_DECL##x(__VA_ARGS__));		\
 	static inline long SYSC##name(__SC_DECL##x(__VA_ARGS__));	\
 	asmlinkage long SyS##name(__SC_LONG##x(__VA_ARGS__))		\
@@ -134,7 +190,7 @@ struct old_linux_dirent;
 #else /* CONFIG_HAVE_SYSCALL_WRAPPERS */
 
 #define SYSCALL_DEFINE(name) asmlinkage long sys_##name
-#define SYSCALL_DEFINEx(x, name, ...)					\
+#define __SYSCALL_DEFINEx(x, name, ...)					\
 	asmlinkage long sys##name(__SC_DECL##x(__VA_ARGS__))
 
 #endif /* CONFIG_HAVE_SYSCALL_WRAPPERS */
diff --git a/kernel/trace/trace.h b/kernel/trace/trace.h
index 3d49daa..d80ca0d 100644
--- a/kernel/trace/trace.h
+++ b/kernel/trace/trace.h
@@ -194,6 +194,19 @@ struct kmemtrace_free_entry {
 	const void *ptr;
 };
 
+struct syscall_trace_enter {
+	struct trace_entry	ent;
+	int			nr;
+	unsigned long		args[];
+};
+
+struct syscall_trace_exit {
+	struct trace_entry	ent;
+	int			nr;
+	unsigned long		ret;
+};
+
+
 /*
  * trace_flag_type is an enumeration that holds different
  * states when a trace occurs. These are:
@@ -306,6 +319,10 @@ extern void __ftrace_bad_type(void);
 			  TRACE_KMEM_ALLOC);	\
 		IF_ASSIGN(var, ent, struct kmemtrace_free_entry,	\
 			  TRACE_KMEM_FREE);	\
+		IF_ASSIGN(var, ent, struct syscall_trace_enter,		\
+			  TRACE_SYSCALL_ENTER);				\
+		IF_ASSIGN(var, ent, struct syscall_trace_exit,		\
+			  TRACE_SYSCALL_EXIT);				\
 		__ftrace_bad_type();					\
 	} while (0)
 
diff --git a/kernel/trace/trace_syscalls.c b/kernel/trace/trace_syscalls.c
index 66cf974..c72e599 100644
--- a/kernel/trace/trace_syscalls.c
+++ b/kernel/trace/trace_syscalls.c
@@ -1,6 +1,5 @@
-#include <linux/ftrace.h>
 #include <linux/kernel.h>
-
+#include <linux/ftrace.h>
 #include <asm/syscall.h>
 
 #include "trace_output.h"
@@ -8,6 +7,90 @@
 
 static atomic_t refcount;
 
+/* Our two options */
+enum {
+	TRACE_SYSCALLS_OPT_TYPES = 0x1,
+};
+
+static struct tracer_opt syscalls_opts[] = {
+	{ TRACER_OPT(syscall_arg_type, TRACE_SYSCALLS_OPT_TYPES) },
+	{ }
+};
+
+static struct tracer_flags syscalls_flags = {
+	.val = 0, /* By default: no args types */
+	.opts = syscalls_opts
+};
+
+enum print_line_t
+print_syscall_enter(struct trace_iterator *iter, int flags)
+{
+	struct trace_seq *s = &iter->seq;
+	struct trace_entry *ent = iter->ent;
+	struct syscall_trace_enter *trace;
+	struct syscall_metadata *entry;
+	int i, ret, syscall;
+
+	trace_assign_type(trace, ent);
+
+	syscall = trace->nr;
+
+	entry = syscall_nr_to_meta(syscall);
+	if (!entry)
+		goto end;
+
+	ret = trace_seq_printf(s, "%s(", entry->name);
+	if (!ret)
+		return TRACE_TYPE_PARTIAL_LINE;
+
+	for (i = 0; i < entry->nb_args; i++) {
+		/* parameter types */
+		if (syscalls_flags.val & TRACE_SYSCALLS_OPT_TYPES) {
+			ret = trace_seq_printf(s, "%s ", entry->types[i]);
+			if (!ret)
+				return TRACE_TYPE_PARTIAL_LINE;
+		}
+		/* parameter values */
+		ret = trace_seq_printf(s, "%s: %lx%s ", entry->args[i],
+				       trace->args[i],
+				       i == entry->nb_args - 1 ? ")" : ",");
+		if (!ret)
+			return TRACE_TYPE_PARTIAL_LINE;
+	}
+
+end:
+	trace_seq_printf(s, "\n");
+	return TRACE_TYPE_HANDLED;
+}
+
+enum print_line_t
+print_syscall_exit(struct trace_iterator *iter, int flags)
+{
+	struct trace_seq *s = &iter->seq;
+	struct trace_entry *ent = iter->ent;
+	struct syscall_trace_exit *trace;
+	int syscall;
+	struct syscall_metadata *entry;
+	int ret;
+
+	trace_assign_type(trace, ent);
+
+	syscall = trace->nr;
+
+	entry = syscall_nr_to_meta(syscall);
+	if (!entry) {
+		trace_seq_printf(s, "\n");
+		return TRACE_TYPE_HANDLED;
+	}
+
+	ret = trace_seq_printf(s, "%s -> 0x%lx\n", entry->name,
+				trace->ret);
+	if (!ret)
+		return TRACE_TYPE_PARTIAL_LINE;
+
+	return TRACE_TYPE_HANDLED;
+}
+
 void start_ftrace_syscalls(void)
 {
 	unsigned long flags;
@@ -16,6 +99,7 @@ void start_ftrace_syscalls(void)
 	if (atomic_inc_return(&refcount) != 1)
 		goto out;
 
+	arch_init_ftrace_syscalls();
 	read_lock_irqsave(&tasklist_lock, flags);
 
 	do_each_thread(g, t) {
@@ -48,20 +132,63 @@ out:
 
 void ftrace_syscall_enter(struct pt_regs *regs)
 {
+	struct syscall_trace_enter *entry;
+	struct syscall_metadata *sys_data;
+	struct ring_buffer_event *event;
+	int size;
 	int syscall_nr;
+	int cpu;
 
 	syscall_nr = syscall_get_nr(current, regs);
 
-	trace_printk("syscall %d enter\n", syscall_nr);
+	cpu = raw_smp_processor_id();
+
+	sys_data = syscall_nr_to_meta(syscall_nr);
+	if (!sys_data)
+		return;
+
+	size = sizeof(*entry) + sizeof(unsigned long) * sys_data->nb_args;
+
+	event = trace_current_buffer_lock_reserve(TRACE_SYSCALL_ENTER, size,
+							0, 0);
+	if (!event)
+		return;
+
+	entry = ring_buffer_event_data(event);
+	entry->nr = syscall_nr;
+	syscall_get_arguments(current, regs, 0, sys_data->nb_args, entry->args);
+
+	trace_current_buffer_unlock_commit(event, 0, 0);
+	trace_wake_up();
 }
 
 void ftrace_syscall_exit(struct pt_regs *regs)
 {
+	struct syscall_trace_exit *entry;
+	struct syscall_metadata *sys_data;
+	struct ring_buffer_event *event;
 	int syscall_nr;
+	int cpu;
 
 	syscall_nr = syscall_get_nr(current, regs);
 
-	trace_printk("syscall %d exit\n", syscall_nr);
+	cpu = raw_smp_processor_id();
+
+	sys_data = syscall_nr_to_meta(syscall_nr);
+	if (!sys_data)
+		return;
+
+	event = trace_current_buffer_lock_reserve(TRACE_SYSCALL_EXIT,
+				sizeof(*entry), 0, 0);
+	if (!event)
+		return;
+
+	entry = ring_buffer_event_data(event);
+	entry->nr = syscall_nr;
+	entry->ret = syscall_get_return_value(current, regs);
+
+	trace_current_buffer_unlock_commit(event, 0, 0);
+	trace_wake_up();
 }
 
 static int init_syscall_tracer(struct trace_array *tr)
@@ -77,17 +204,20 @@ static void reset_syscall_tracer(struct trace_array *tr)
 }
 
 static struct trace_event syscall_enter_event = {
-	.type		= TRACE_SYSCALL_ENTER,
+	.type	 	= TRACE_SYSCALL_ENTER,
+	.trace		= print_syscall_enter,
 };
 
 static struct trace_event syscall_exit_event = {
-	.type		= TRACE_SYSCALL_EXIT,
+	.type	 	= TRACE_SYSCALL_EXIT,
+	.trace		= print_syscall_exit,
 };
 
 static struct tracer syscall_tracer __read_mostly = {
-	.name		= "syscall",
+	.name	     	= "syscall",
 	.init		= init_syscall_tracer,
-	.reset		= reset_syscall_tracer
+	.reset		= reset_syscall_tracer,
+	.flags		= &syscalls_flags,
 };
 
 __init int register_ftrace_syscalls(void)

^ permalink raw reply related	[flat|nested] 35+ messages in thread

* Re: [PATCH 0/2 v2] Syscalls tracing
  2009-03-13 15:55   ` Ingo Molnar
@ 2009-03-13 16:17     ` Ingo Molnar
  2009-03-15 15:25       ` Frederic Weisbecker
  0 siblings, 1 reply; 35+ messages in thread
From: Ingo Molnar @ 2009-03-13 16:17 UTC (permalink / raw)
  To: Frederic Weisbecker
  Cc: Linux Kernel Mailing List, Peter Zijlstra, Steven Rostedt, tglx,
	Jason Baron, Frank Ch. Eigler, Mathieu Desnoyers,
	KOSAKI Motohiro, Lai Jiangshan, Jiaying Zhang, Michael Rubin,
	Martin Bligh, Michael Davidson


* Ingo Molnar <mingo@elte.hu> wrote:

> 
> * Frederic Weisbecker <fweisbec@gmail.com> wrote:
> 
> > On Fri, Mar 13, 2009 at 03:42:10PM +0100, Frederic Weisbecker wrote:
> > > tracing/syscalls: core infrastructure to trace syscalls
> > > 
> > > This new iteration addresses a good part of the previous reviews.
> > 
> > 
> > Ah I just discovered that you applied the previous version 
> > today. But the v2 is not a delta :-s
> > 
> > I can rebase them but not until Sunday.
> 
> No problem, i'll deltify them and will have a look.

Ok, i did the deltas, tidied them up and put them into 
tip:tracing/syscalls.

Nice stuff! Here's some sample output:

aldebaran:/debug/tracing> head trace
# tracer: syscall
#
#           TASK-PID    CPU#    TIMESTAMP  FUNCTION
#              | |       |          |         |
           <...>-4405  [003]   188.452934: sys_dup2(oldfd: a, newfd: 1) 
           <...>-4405  [003]   188.452939: sys_dup2 -> 0x1
           <...>-4405  [003]   188.452940: sys_fcntl(fd: a, cmd: 1, arg: 0) 
           <...>-4405  [003]   188.452941: sys_fcntl -> 0x1
           <...>-4405  [003]   188.452942: sys_close(fd: a) 
           <...>-4405  [003]   188.452943: sys_close -> 0x0

A suggestion:

 - Would be nice for all the registered syscalls to show up
   under /debug/events/syscalls/, one directory per syscall,
   with an 'enable' and a 'format' file as well.

And a bugreport:

 - when using function_graph (after having used the syscall 
   tracer) i dont see graph traces anymore - only the syscall 
   trace entries:

# tracer: function_graph
#
# CPU  DURATION                  FUNCTION CALLS
# |     |   |                     |   |   |   |
##### CPU 9 buffer started ####
 at-spi-registry-3063  [009]   322.915058: sys_read -> 0x5a8
 at-spi-registry-3063  [009]   322.915059: sys_write(fd: 6, buf: 632840, count: 5a8) 
 at-spi-registry-3063  [009]   322.915062: sys_write -> 0x5a8
 at-spi-registry-3063  [009]   322.915062: sys_read(fd: 4, buf: 632840, count: 40000) 

That's not intended, right?

	Ingo

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH 0/2 v2] Syscalls tracing
  2009-03-13 15:16 ` [PATCH 0/2 v2] Syscalls tracing Frederic Weisbecker
  2009-03-13 15:55   ` Ingo Molnar
@ 2009-03-13 16:47   ` Mathieu Desnoyers
  2009-03-15 16:01     ` Frederic Weisbecker
  1 sibling, 1 reply; 35+ messages in thread
From: Mathieu Desnoyers @ 2009-03-13 16:47 UTC (permalink / raw)
  To: Frederic Weisbecker
  Cc: Ingo Molnar, Linux Kernel Mailing List, Peter Zijlstra,
	Steven Rostedt, tglx, Jason Baron, Frank Ch. Eigler,
	KOSAKI Motohiro, Lai Jiangshan, Jiaying Zhang, Michael Rubin,
	Martin Bligh, Michael Davidson

* Frederic Weisbecker (fweisbec@gmail.com) wrote:
> On Fri, Mar 13, 2009 at 03:42:10PM +0100, Frederic Weisbecker wrote:
> > tracing/syscalls: core infrastructure to trace syscalls
> > 
> > This new iteration addresses a good part of the previous reviews.
> 
> 
> Ah I just discovered that you applied the previous version today.
> But the v2 is not a delta :-s
> 
> I can rebase them but not until Sunday.
> 

Hi Frederic,

I like the approach you are taking for syscall tracing. As I read your
code, I see that your infrastructure basically exports the information
under ascii format. How hard do you think that would be to get this to
work for binary output with LTTng ? Ascii output is not exactly a
high-performance output for a hot path such as system call entry/exit.
We could :

1 - Iterate on the table at trace start to dump the syscall names, and
their arguments, into a syscall-table data channel.

2 - We might have to create one event type per syscall argument types.
Or one event per syscall, this would be great because we would benefit
from the event identification infrastructure. And actually, the syscall
declaration macros could declare tiny callbacks that would declare a
marker name (syscall, <syscall_name>) along with the code that saves the
system call pr_regs parameters into the event payload.

Any thoughts ?

Mathieu

>  
> > As suggested by Ingo Molnar and Peter Zijlstra, the syscalls
> > prototypes probing is done by abusing the SYSCALL_DEFINE family
> > macros.
> > We now store automatically the arguments names, their types, their number
> > and the name of the syscall.
> > 
> > Also some fixes on output newlines and dangerous exporting of global_trace
> > are provided.
> > 
> > An example of the trace:
> > 
> > echo syscall > /debugfs/tracing/current_tracer
> > 
> > 
> > # tracer: syscall
> > #
> > #           TASK-PID    CPU#    TIMESTAMP  FUNCTION
> > #              | |       |          |         |
> >            <...>-5080  [001]   132.192228: sys_dup2(oldfd: a, newfd: 1) 
> >            <...>-5080  [001]   132.192239: sys_dup2 -> 0x1
> >            <...>-5080  [001]   132.192242: sys_fcntl(fd: a, cmd: 1, arg: 0) 
> >            <...>-5080  [001]   132.192245: sys_fcntl -> 0x1
> >            <...>-5080  [001]   132.192248: sys_close(fd: a) 
> >            <...>-5080  [001]   132.192250: sys_close -> 0x0
> >            <...>-5080  [001]   132.192265: sys_rt_sigprocmask(how: 0, set: 0, oset: 6cf808, sigsetsize: 8) 
> >            <...>-5080  [001]   132.192267: sys_rt_sigprocmask -> 0x0
> >            <...>-5080  [001]   132.192271: sys_rt_sigaction(sig: 2, act: 7fffff4338f0, oact: 7fffff433850, sigsetsize: 8) 
> >            <...>-5080  [001]   132.192273: sys_rt_sigaction -> 0x0
> >            <...>-5080  [001]   132.192285: sys_rt_sigprocmask(how: 0, set: 0, oset: 6cf808, sigsetsize: 8) 
> >            <...>-5080  [001]   132.192287: sys_rt_sigprocmask -> 0x0
> >            <...>-5080  [001]   132.192415: sys_write(fd: 1, buf: 15dfc08, count: 21) 
> >            <...>-5080  [001]   132.192436: sys_write -> 0x21
> >            <...>-4754  [000]   132.192478: sys_read(fd: 8, buf: 2a9340e, count: 1fee) 
> >            <...>-5080  [001]   132.192487: sys_rt_sigprocmask(how: 0, set: 7fffff432a70, oset: 7fffff4329f0, sigsetsize: 8) 
> > 
> > And if you ask for the parameters types:
> > 
> > echo syscall_arg_type > trace_options
> > 
> > # tracer: syscall
> > #
> > #           TASK-PID    CPU#    TIMESTAMP  FUNCTION
> > #              | |       |          |         |
> >            <...>-5080  [001]   132.192228: sys_dup2(unsigned int oldfd: a, unsigned int newfd: 1) 
> >            <...>-5080  [001]   132.192239: sys_dup2 -> 0x1
> >            <...>-5080  [001]   132.192242: sys_fcntl(unsigned int fd: a, unsigned int cmd: 1, unsigned long arg: 0) 
> >            <...>-5080  [001]   132.192245: sys_fcntl -> 0x1
> >            <...>-5080  [001]   132.192248: sys_close(unsigned int fd: a) 
> >            <...>-5080  [001]   132.192250: sys_close -> 0x0
> >            <...>-5080  [001]   132.192265: sys_rt_sigprocmask(int how: 0, sigset_t * set: 0, sigset_t * oset: 6cf808, size_t sigsetsize: 8) 
> >            <...>-5080  [001]   132.192267: sys_rt_sigprocmask -> 0x0
> >            <...>-5080  [001]   132.192271: sys_rt_sigaction(int sig: 2, const struct sigaction * act: 7fffff4338f0, struct sigaction * oact: 7fffff433850, size_t sigsetsize: 8) 
> >            <...>-5080  [001]   132.192273: sys_rt_sigaction -> 0x0
> >            <...>-5080  [001]   132.192285: sys_rt_sigprocmask(int how: 0, sigset_t * set: 0, sigset_t * oset: 6cf808, size_t sigsetsize: 8) 
> >            <...>-5080  [001]   132.192287: sys_rt_sigprocmask -> 0x0
> >            <...>-5080  [001]   132.192415: sys_write(unsigned int fd: 1, const char * buf: 15dfc08, size_t count: 21) 
> >            <...>-5080  [001]   132.192436: sys_write -> 0x21
> > 
> > TODO:
> > 
> > - add a single mask on the struct syscall_metadata to provide quickly which arguments is
> >   a pointer (usually type __user *p) so that the user can decide if he wants to save them of tracing time
> > 
> > - now that we have each parameter type as strings, add a new field on struct syscall_metadata to have the parameter types
> >   encoded as single enum values (for quick checks) so that we can use specific callbacks for each parameter type
> >   to be displayed.
> > 
> > NOTE: this is still not overlapping with a potential future merge of utrace, since the low-level hooks on
> > the syscalls remain somewhat basic.
> > 
> > NOTE2: I've only tested it on x86-64 for now, so only x86-64 support is provided.
> > --
> > 
> > Frederic Weisbecker (2):
> >   tracing/syscalls: core infrastructure for syscalls tracing
> >   tracing/syscalls: support for syscalls tracing on x86-64
> > 
> >  arch/x86/Kconfig                   |    1 +
> >  arch/x86/include/asm/ftrace.h      |    7 +
> >  arch/x86/include/asm/thread_info.h |    9 +-
> >  arch/x86/kernel/ftrace.c           |   63 +++++++++
> >  arch/x86/kernel/ptrace.c           |    7 +
> >  include/asm-generic/vmlinux.lds.h  |   11 ++-
> >  include/linux/ftrace.h             |   29 +++++
> >  include/linux/syscalls.h           |   60 +++++++++-
> >  kernel/trace/Kconfig               |   10 ++
> >  kernel/trace/Makefile              |    1 +
> >  kernel/trace/trace.h               |   19 +++
> >  kernel/trace/trace_syscalls.c      |  243 ++++++++++++++++++++++++++++++++++++
> >  12 files changed, 454 insertions(+), 6 deletions(-)
> >  create mode 100644 kernel/trace/trace_syscalls.c
> > 
> 

-- 
Mathieu Desnoyers
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F  BA06 3F25 A8FE 3BAE 9A68

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH 1/2 v2] tracing/syscalls: core infrastructure for syscalls tracing
  2009-03-13 14:42 ` [PATCH 1/2 v2] tracing/syscalls: core infrastructure for syscalls tracing Frederic Weisbecker
  2009-03-13 16:09   ` [tip:tracing/syscalls] tracing/syscalls: core infrastructure for syscalls tracing, enhancements Frederic Weisbecker
@ 2009-03-15  3:51   ` Andrew Morton
  2009-03-15  4:59     ` Ingo Molnar
  2009-04-06 21:55   ` Tony Luck
  2 siblings, 1 reply; 35+ messages in thread
From: Andrew Morton @ 2009-03-15  3:51 UTC (permalink / raw)
  To: Frederic Weisbecker
  Cc: Ingo Molnar, Linux Kernel Mailing List, Peter Zijlstra,
	Steven Rostedt, tglx, Jason Baron, Frank Ch. Eigler,
	Mathieu Desnoyers, KOSAKI Motohiro, Lai Jiangshan, Jiaying Zhang,
	Michael Rubin, Martin Bligh, Michael Davidson

On Fri, 13 Mar 2009 15:42:11 +0100 Frederic Weisbecker <fweisbec@gmail.com> wrote:

> +void start_ftrace_syscalls(void)
> +{
> +	unsigned long flags;
> +	struct task_struct *g, *t;
> +
> +	if (atomic_inc_return(&refcount) != 1)
> +		goto out;
> +
> +	arch_init_ftrace_syscalls();
> +	read_lock_irqsave(&tasklist_lock, flags);
> +
> +	do_each_thread(g, t) {
> +		set_tsk_thread_flag(t, TIF_SYSCALL_FTRACE);
> +	} while_each_thread(g, t);
> +
> +	read_unlock_irqrestore(&tasklist_lock, flags);
> +out:
> +	atomic_dec(&refcount);
> +}
> +
> +void stop_ftrace_syscalls(void)
> +{
> +	unsigned long flags;
> +	struct task_struct *g, *t;
> +
> +	if (atomic_dec_return(&refcount))
> +		goto out;
> +
> +	read_lock_irqsave(&tasklist_lock, flags);
> +
> +	do_each_thread(g, t) {
> +		clear_tsk_thread_flag(t, TIF_SYSCALL_FTRACE);
> +	} while_each_thread(g, t);
> +
> +	read_unlock_irqrestore(&tasklist_lock, flags);
> +out:
> +	atomic_inc(&refcount);
> +}

What is this `refcount' thing trying to do?  afacit it does not prevent the
two loops from running concurrently and making a mess.

If it _is_ trying to prevent that from happening, then why not use plain old
mutex_lock()?

If the code is or some reason correct then it certainly is not sufficiently
obvious to be let uncommented.


^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH 2/2 v2] tracing/syscalls: support for syscalls tracing on x86-64
  2009-03-13 14:42 ` [PATCH 2/2 v2] tracing/syscalls: support for syscalls tracing on x86-64 Frederic Weisbecker
  2009-03-13 16:09   ` [tip:tracing/syscalls] tracing/syscalls: support for syscalls tracing on x86 Frederic Weisbecker
@ 2009-03-15  3:53   ` Andrew Morton
  2009-03-15  5:30     ` Ingo Molnar
  1 sibling, 1 reply; 35+ messages in thread
From: Andrew Morton @ 2009-03-15  3:53 UTC (permalink / raw)
  To: Frederic Weisbecker
  Cc: Ingo Molnar, Linux Kernel Mailing List, Peter Zijlstra,
	Steven Rostedt, tglx, Jason Baron, Frank Ch. Eigler,
	Mathieu Desnoyers, KOSAKI Motohiro, Lai Jiangshan, Jiaying Zhang,
	Michael Rubin, Martin Bligh, Michael Davidson

On Fri, 13 Mar 2009 15:42:12 +0100 Frederic Weisbecker <fweisbec@gmail.com> wrote:

> +static struct syscall_metadata *find_syscall_meta(unsigned long *syscall)
> +{
> +	struct syscall_metadata *start;
> +	struct syscall_metadata *stop;
> +	char str[KSYM_SYMBOL_LEN];
> +
> +
> +	start = (struct syscall_metadata *)__start_syscalls_metadata;
> +	stop = (struct syscall_metadata *)__stop_syscalls_metadata;
> +	kallsyms_lookup((unsigned long) syscall, NULL, NULL, NULL, str);
> +
> +	for ( ; start < stop; start++) {
> +		if (start->name && !strcmp(start->name, str))
> +			return start;
> +	}
> +	return NULL;
> +}

afacit this feature can be enabled when CONFIG_KALLSYMS=n.  Does that make
sense?


^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH 1/2 v2] tracing/syscalls: core infrastructure for syscalls tracing
  2009-03-15  3:51   ` [PATCH 1/2 v2] tracing/syscalls: core infrastructure for syscalls tracing Andrew Morton
@ 2009-03-15  4:59     ` Ingo Molnar
  2009-03-15  8:02       ` Andrew Morton
  0 siblings, 1 reply; 35+ messages in thread
From: Ingo Molnar @ 2009-03-15  4:59 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Frederic Weisbecker, Linux Kernel Mailing List, Peter Zijlstra,
	Steven Rostedt, tglx, Jason Baron, Frank Ch. Eigler,
	Mathieu Desnoyers, KOSAKI Motohiro, Lai Jiangshan, Jiaying Zhang,
	Michael Rubin, Martin Bligh, Michael Davidson


* Andrew Morton <akpm@linux-foundation.org> wrote:

> On Fri, 13 Mar 2009 15:42:11 +0100 Frederic Weisbecker <fweisbec@gmail.com> wrote:
> 
> > +void start_ftrace_syscalls(void)
> > +{
> > +	unsigned long flags;
> > +	struct task_struct *g, *t;
> > +
> > +	if (atomic_inc_return(&refcount) != 1)
> > +		goto out;
> > +
> > +	arch_init_ftrace_syscalls();
> > +	read_lock_irqsave(&tasklist_lock, flags);
> > +
> > +	do_each_thread(g, t) {
> > +		set_tsk_thread_flag(t, TIF_SYSCALL_FTRACE);
> > +	} while_each_thread(g, t);
> > +
> > +	read_unlock_irqrestore(&tasklist_lock, flags);
> > +out:
> > +	atomic_dec(&refcount);
> > +}
> > +
> > +void stop_ftrace_syscalls(void)
> > +{
> > +	unsigned long flags;
> > +	struct task_struct *g, *t;
> > +
> > +	if (atomic_dec_return(&refcount))
> > +		goto out;
> > +
> > +	read_lock_irqsave(&tasklist_lock, flags);
> > +
> > +	do_each_thread(g, t) {
> > +		clear_tsk_thread_flag(t, TIF_SYSCALL_FTRACE);
> > +	} while_each_thread(g, t);
> > +
> > +	read_unlock_irqrestore(&tasklist_lock, flags);
> > +out:
> > +	atomic_inc(&refcount);
> > +}
> 
> What is this `refcount' thing trying to do?  afacit it does 
> not prevent the two loops from running concurrently and making 
> a mess.
> 
> If it _is_ trying to prevent that from happening, then why not 
> use plain old mutex_lock()?

yeah - already commented about that to Frederic over IRC. A 
plain flag, checked inside the tasklist lock is more than 
enough.

	Ingo

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH 2/2 v2] tracing/syscalls: support for syscalls tracing on x86-64
  2009-03-15  3:53   ` [PATCH 2/2 v2] tracing/syscalls: support for syscalls tracing on x86-64 Andrew Morton
@ 2009-03-15  5:30     ` Ingo Molnar
  2009-03-15 16:05       ` Frederic Weisbecker
  0 siblings, 1 reply; 35+ messages in thread
From: Ingo Molnar @ 2009-03-15  5:30 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Frederic Weisbecker, Linux Kernel Mailing List, Peter Zijlstra,
	Steven Rostedt, tglx, Jason Baron, Frank Ch. Eigler,
	Mathieu Desnoyers, KOSAKI Motohiro, Lai Jiangshan, Jiaying Zhang,
	Michael Rubin, Martin Bligh, Michael Davidson


* Andrew Morton <akpm@linux-foundation.org> wrote:

> On Fri, 13 Mar 2009 15:42:12 +0100 Frederic Weisbecker <fweisbec@gmail.com> wrote:
> 
> > +static struct syscall_metadata *find_syscall_meta(unsigned long *syscall)
> > +{
> > +	struct syscall_metadata *start;
> > +	struct syscall_metadata *stop;
> > +	char str[KSYM_SYMBOL_LEN];
> > +
> > +
> > +	start = (struct syscall_metadata *)__start_syscalls_metadata;
> > +	stop = (struct syscall_metadata *)__stop_syscalls_metadata;
> > +	kallsyms_lookup((unsigned long) syscall, NULL, NULL, NULL, str);
> > +
> > +	for ( ; start < stop; start++) {
> > +		if (start->name && !strcmp(start->name, str))
> > +			return start;
> > +	}
> > +	return NULL;
> > +}
> 
> afacit this feature can be enabled when CONFIG_KALLSYMS=n.  
> Does that make sense?

It does not make much sense - the function will return NULL and 
we wont do any tracing. Frederic: FTRACE_SYSCALLS should either 
select KALLSYMS in kernel/trace/Kconfig, like the STACK_TRACER 
and FUNCTION_TRACER already does. Or perhaps, for really 
puritane tracing, we should emit simplified, non-symbolic trace 
data in the !KALLSYMS case.

	Ingo

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH 1/2 v2] tracing/syscalls: core infrastructure for syscalls tracing
  2009-03-15  4:59     ` Ingo Molnar
@ 2009-03-15  8:02       ` Andrew Morton
  2009-03-15 16:17         ` Frederic Weisbecker
  0 siblings, 1 reply; 35+ messages in thread
From: Andrew Morton @ 2009-03-15  8:02 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Frederic Weisbecker, Linux Kernel Mailing List, Peter Zijlstra,
	Steven Rostedt, tglx, Jason Baron, Frank Ch. Eigler,
	Mathieu Desnoyers, KOSAKI Motohiro, Lai Jiangshan, Jiaying Zhang,
	Michael Rubin, Martin Bligh, Michael Davidson

On Sun, 15 Mar 2009 05:59:04 +0100 Ingo Molnar <mingo@elte.hu> wrote:

> 
> * Andrew Morton <akpm@linux-foundation.org> wrote:
> 
> > On Fri, 13 Mar 2009 15:42:11 +0100 Frederic Weisbecker <fweisbec@gmail.com> wrote:
> > 
> > > +void start_ftrace_syscalls(void)
> > > +{
> > > +	unsigned long flags;
> > > +	struct task_struct *g, *t;
> > > +
> > > +	if (atomic_inc_return(&refcount) != 1)
> > > +		goto out;
> > > +
> > > +	arch_init_ftrace_syscalls();
> > > +	read_lock_irqsave(&tasklist_lock, flags);
> > > +
> > > +	do_each_thread(g, t) {
> > > +		set_tsk_thread_flag(t, TIF_SYSCALL_FTRACE);
> > > +	} while_each_thread(g, t);
> > > +
> > > +	read_unlock_irqrestore(&tasklist_lock, flags);
> > > +out:
> > > +	atomic_dec(&refcount);
> > > +}
> > > +
> > > +void stop_ftrace_syscalls(void)
> > > +{
> > > +	unsigned long flags;
> > > +	struct task_struct *g, *t;
> > > +
> > > +	if (atomic_dec_return(&refcount))
> > > +		goto out;
> > > +
> > > +	read_lock_irqsave(&tasklist_lock, flags);
> > > +
> > > +	do_each_thread(g, t) {
> > > +		clear_tsk_thread_flag(t, TIF_SYSCALL_FTRACE);
> > > +	} while_each_thread(g, t);
> > > +
> > > +	read_unlock_irqrestore(&tasklist_lock, flags);
> > > +out:
> > > +	atomic_inc(&refcount);
> > > +}
> > 
> > What is this `refcount' thing trying to do?  afacit it does 
> > not prevent the two loops from running concurrently and making 
> > a mess.
> > 
> > If it _is_ trying to prevent that from happening, then why not 
> > use plain old mutex_lock()?
> 
> yeah - already commented about that to Frederic over IRC. A 
> plain flag, checked inside the tasklist lock is more than 
> enough.
> 

That would require write_lock().

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH 0/2 v2] Syscalls tracing
  2009-03-13 16:17     ` Ingo Molnar
@ 2009-03-15 15:25       ` Frederic Weisbecker
  0 siblings, 0 replies; 35+ messages in thread
From: Frederic Weisbecker @ 2009-03-15 15:25 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Linux Kernel Mailing List, Peter Zijlstra, Steven Rostedt, tglx,
	Jason Baron, Frank Ch. Eigler, Mathieu Desnoyers,
	KOSAKI Motohiro, Lai Jiangshan, Jiaying Zhang, Michael Rubin,
	Martin Bligh, Michael Davidson

On Fri, Mar 13, 2009 at 05:17:49PM +0100, Ingo Molnar wrote:
> 
> * Ingo Molnar <mingo@elte.hu> wrote:
> 
> > 
> > * Frederic Weisbecker <fweisbec@gmail.com> wrote:
> > 
> > > On Fri, Mar 13, 2009 at 03:42:10PM +0100, Frederic Weisbecker wrote:
> > > > tracing/syscalls: core infrastructure to trace syscalls
> > > > 
> > > > This new iteration addresses a good part of the previous reviews.
> > > 
> > > 
> > > Ah I just discovered that you applied the previous version 
> > > today. But the v2 is not a delta :-s
> > > 
> > > I can rebase them but not until Sunday.
> > 
> > No problem, i'll deltify them and will have a look.
> 
> Ok, i did the deltas, tidied them up and put them into 
> tip:tracing/syscalls.
> 
> Nice stuff! Here's some sample output:
> 
> aldebaran:/debug/tracing> head trace
> # tracer: syscall
> #
> #           TASK-PID    CPU#    TIMESTAMP  FUNCTION
> #              | |       |          |         |
>            <...>-4405  [003]   188.452934: sys_dup2(oldfd: a, newfd: 1) 
>            <...>-4405  [003]   188.452939: sys_dup2 -> 0x1
>            <...>-4405  [003]   188.452940: sys_fcntl(fd: a, cmd: 1, arg: 0) 
>            <...>-4405  [003]   188.452941: sys_fcntl -> 0x1
>            <...>-4405  [003]   188.452942: sys_close(fd: a) 
>            <...>-4405  [003]   188.452943: sys_close -> 0x0
> 
> A suggestion:
> 
>  - Would be nice for all the registered syscalls to show up
>    under /debug/events/syscalls/, one directory per syscall,
>    with an 'enable' and a 'format' file as well.


Nice idea.
It is somehow planned since I want to let the user export the traces
through their binary values (so they will need the format).

 
> And a bugreport:
> 
>  - when using function_graph (after having used the syscall 
>    tracer) i dont see graph traces anymore - only the syscall 
>    trace entries:


Ah exact, it seems I forgot to reset the buffer while switching the tracer.


> # tracer: function_graph
> #
> # CPU  DURATION                  FUNCTION CALLS
> # |     |   |                     |   |   |   |
> ##### CPU 9 buffer started ####
>  at-spi-registry-3063  [009]   322.915058: sys_read -> 0x5a8
>  at-spi-registry-3063  [009]   322.915059: sys_write(fd: 6, buf: 632840, count: 5a8) 
>  at-spi-registry-3063  [009]   322.915062: sys_write -> 0x5a8
>  at-spi-registry-3063  [009]   322.915062: sys_read(fd: 4, buf: 632840, count: 40000) 
> 
> That's not intended, right?


Indeed it's not. I'm cooking the fix.

Thanks.


> 	Ingo


^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH 0/2 v2] Syscalls tracing
  2009-03-13 16:47   ` Mathieu Desnoyers
@ 2009-03-15 16:01     ` Frederic Weisbecker
  2009-03-23 16:32       ` Mathieu Desnoyers
  0 siblings, 1 reply; 35+ messages in thread
From: Frederic Weisbecker @ 2009-03-15 16:01 UTC (permalink / raw)
  To: Mathieu Desnoyers
  Cc: Ingo Molnar, Linux Kernel Mailing List, Peter Zijlstra,
	Steven Rostedt, tglx, Jason Baron, Frank Ch. Eigler,
	KOSAKI Motohiro, Lai Jiangshan, Jiaying Zhang, Michael Rubin,
	Martin Bligh, Michael Davidson

On Fri, Mar 13, 2009 at 12:47:43PM -0400, Mathieu Desnoyers wrote:
> * Frederic Weisbecker (fweisbec@gmail.com) wrote:
> > On Fri, Mar 13, 2009 at 03:42:10PM +0100, Frederic Weisbecker wrote:
> > > tracing/syscalls: core infrastructure to trace syscalls
> > > 
> > > This new iteration addresses a good part of the previous reviews.
> > 
> > 
> > Ah I just discovered that you applied the previous version today.
> > But the v2 is not a delta :-s
> > 
> > I can rebase them but not until Sunday.
> > 
> 
> Hi Frederic,
> 
> I like the approach you are taking for syscall tracing. As I read your
> code, I see that your infrastructure basically exports the information
> under ascii format. How hard do you think that would be to get this to
> work for binary output with LTTng ? Ascii output is not exactly a
> high-performance output for a hot path such as system call entry/exit.
> We could :


Do you mean the syscalls metadata? (syscall name, argument types, number
of arguments) or the syscall events?

I'm not sure it would be valuable to export the metadata as binary because
they are not very large as a whole. But a kind of enumeration of types
will be added as I explain below.

But I plan to export the events as binary values, though the usual tracing
files.
 
> 1 - Iterate on the table at trace start to dump the syscall names, and
> their arguments, into a syscall-table data channel.


Yeah.
For now the arguments parameters types are saved as strings as you know.
I would like to add a new field on the syscall_metadata struct which will
contain the arg types as enum values so that the tracer will be able to:

- Define custom callback for custom types to display better the arguments
  I think I could provide a default output for each parameter type and
  let the ability to override this default output with custom callback.
  For example, it would be great to have the open flag as O_RDONLY instead
  of the numerical value.

- probably create a raw format string value for each syscall so that we can
  export the binary values directly to userspace and the user can format all
  that with the pattern we give to him.


> 2 - We might have to create one event type per syscall argument types.
> Or one event per syscall, this would be great because we would benefit
> from the event identification infrastructure. And actually, the syscall
> declaration macros could declare tiny callbacks that would declare a
> marker name (syscall, <syscall_name>) along with the code that saves the
> system call pr_regs parameters into the event payload.
> 
> Any thoughts ?


I will surely have to provide a way to define custom callbacks for several
syscalls (as shown with the open() flag example).
But the problem is that I will not be able to do that statically like the
metadata is currently built.
Such a thing should be built when we enable the syscall tracing once I guess
(iterate over the syscalls metadata, assign the callback that matches a syscall,
etc...).

But I think I will need sometimes complex callbacks to handle the output of
some complex arguments, so I don't know if it is possible to handled all that
through a marker or TRACE_EVENT style.

Thanks!

Frederic.


> 
> Mathieu
> 
> >  
> > > As suggested by Ingo Molnar and Peter Zijlstra, the syscalls
> > > prototypes probing is done by abusing the SYSCALL_DEFINE family
> > > macros.
> > > We now store automatically the arguments names, their types, their number
> > > and the name of the syscall.
> > > 
> > > Also some fixes on output newlines and dangerous exporting of global_trace
> > > are provided.
> > > 
> > > An example of the trace:
> > > 
> > > echo syscall > /debugfs/tracing/current_tracer
> > > 
> > > 
> > > # tracer: syscall
> > > #
> > > #           TASK-PID    CPU#    TIMESTAMP  FUNCTION
> > > #              | |       |          |         |
> > >            <...>-5080  [001]   132.192228: sys_dup2(oldfd: a, newfd: 1) 
> > >            <...>-5080  [001]   132.192239: sys_dup2 -> 0x1
> > >            <...>-5080  [001]   132.192242: sys_fcntl(fd: a, cmd: 1, arg: 0) 
> > >            <...>-5080  [001]   132.192245: sys_fcntl -> 0x1
> > >            <...>-5080  [001]   132.192248: sys_close(fd: a) 
> > >            <...>-5080  [001]   132.192250: sys_close -> 0x0
> > >            <...>-5080  [001]   132.192265: sys_rt_sigprocmask(how: 0, set: 0, oset: 6cf808, sigsetsize: 8) 
> > >            <...>-5080  [001]   132.192267: sys_rt_sigprocmask -> 0x0
> > >            <...>-5080  [001]   132.192271: sys_rt_sigaction(sig: 2, act: 7fffff4338f0, oact: 7fffff433850, sigsetsize: 8) 
> > >            <...>-5080  [001]   132.192273: sys_rt_sigaction -> 0x0
> > >            <...>-5080  [001]   132.192285: sys_rt_sigprocmask(how: 0, set: 0, oset: 6cf808, sigsetsize: 8) 
> > >            <...>-5080  [001]   132.192287: sys_rt_sigprocmask -> 0x0
> > >            <...>-5080  [001]   132.192415: sys_write(fd: 1, buf: 15dfc08, count: 21) 
> > >            <...>-5080  [001]   132.192436: sys_write -> 0x21
> > >            <...>-4754  [000]   132.192478: sys_read(fd: 8, buf: 2a9340e, count: 1fee) 
> > >            <...>-5080  [001]   132.192487: sys_rt_sigprocmask(how: 0, set: 7fffff432a70, oset: 7fffff4329f0, sigsetsize: 8) 
> > > 
> > > And if you ask for the parameters types:
> > > 
> > > echo syscall_arg_type > trace_options
> > > 
> > > # tracer: syscall
> > > #
> > > #           TASK-PID    CPU#    TIMESTAMP  FUNCTION
> > > #              | |       |          |         |
> > >            <...>-5080  [001]   132.192228: sys_dup2(unsigned int oldfd: a, unsigned int newfd: 1) 
> > >            <...>-5080  [001]   132.192239: sys_dup2 -> 0x1
> > >            <...>-5080  [001]   132.192242: sys_fcntl(unsigned int fd: a, unsigned int cmd: 1, unsigned long arg: 0) 
> > >            <...>-5080  [001]   132.192245: sys_fcntl -> 0x1
> > >            <...>-5080  [001]   132.192248: sys_close(unsigned int fd: a) 
> > >            <...>-5080  [001]   132.192250: sys_close -> 0x0
> > >            <...>-5080  [001]   132.192265: sys_rt_sigprocmask(int how: 0, sigset_t * set: 0, sigset_t * oset: 6cf808, size_t sigsetsize: 8) 
> > >            <...>-5080  [001]   132.192267: sys_rt_sigprocmask -> 0x0
> > >            <...>-5080  [001]   132.192271: sys_rt_sigaction(int sig: 2, const struct sigaction * act: 7fffff4338f0, struct sigaction * oact: 7fffff433850, size_t sigsetsize: 8) 
> > >            <...>-5080  [001]   132.192273: sys_rt_sigaction -> 0x0
> > >            <...>-5080  [001]   132.192285: sys_rt_sigprocmask(int how: 0, sigset_t * set: 0, sigset_t * oset: 6cf808, size_t sigsetsize: 8) 
> > >            <...>-5080  [001]   132.192287: sys_rt_sigprocmask -> 0x0
> > >            <...>-5080  [001]   132.192415: sys_write(unsigned int fd: 1, const char * buf: 15dfc08, size_t count: 21) 
> > >            <...>-5080  [001]   132.192436: sys_write -> 0x21
> > > 
> > > TODO:
> > > 
> > > - add a single mask on the struct syscall_metadata to provide quickly which arguments is
> > >   a pointer (usually type __user *p) so that the user can decide if he wants to save them of tracing time
> > > 
> > > - now that we have each parameter type as strings, add a new field on struct syscall_metadata to have the parameter types
> > >   encoded as single enum values (for quick checks) so that we can use specific callbacks for each parameter type
> > >   to be displayed.
> > > 
> > > NOTE: this is still not overlapping with a potential future merge of utrace, since the low-level hooks on
> > > the syscalls remain somewhat basic.
> > > 
> > > NOTE2: I've only tested it on x86-64 for now, so only x86-64 support is provided.
> > > --
> > > 
> > > Frederic Weisbecker (2):
> > >   tracing/syscalls: core infrastructure for syscalls tracing
> > >   tracing/syscalls: support for syscalls tracing on x86-64
> > > 
> > >  arch/x86/Kconfig                   |    1 +
> > >  arch/x86/include/asm/ftrace.h      |    7 +
> > >  arch/x86/include/asm/thread_info.h |    9 +-
> > >  arch/x86/kernel/ftrace.c           |   63 +++++++++
> > >  arch/x86/kernel/ptrace.c           |    7 +
> > >  include/asm-generic/vmlinux.lds.h  |   11 ++-
> > >  include/linux/ftrace.h             |   29 +++++
> > >  include/linux/syscalls.h           |   60 +++++++++-
> > >  kernel/trace/Kconfig               |   10 ++
> > >  kernel/trace/Makefile              |    1 +
> > >  kernel/trace/trace.h               |   19 +++
> > >  kernel/trace/trace_syscalls.c      |  243 ++++++++++++++++++++++++++++++++++++
> > >  12 files changed, 454 insertions(+), 6 deletions(-)
> > >  create mode 100644 kernel/trace/trace_syscalls.c
> > > 
> > 
> 
> -- 
> Mathieu Desnoyers
> OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F  BA06 3F25 A8FE 3BAE 9A68


^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH 2/2 v2] tracing/syscalls: support for syscalls tracing on x86-64
  2009-03-15  5:30     ` Ingo Molnar
@ 2009-03-15 16:05       ` Frederic Weisbecker
  0 siblings, 0 replies; 35+ messages in thread
From: Frederic Weisbecker @ 2009-03-15 16:05 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Andrew Morton, Linux Kernel Mailing List, Peter Zijlstra,
	Steven Rostedt, tglx, Jason Baron, Frank Ch. Eigler,
	Mathieu Desnoyers, KOSAKI Motohiro, Lai Jiangshan, Jiaying Zhang,
	Michael Rubin, Martin Bligh, Michael Davidson

On Sun, Mar 15, 2009 at 06:30:18AM +0100, Ingo Molnar wrote:
> 
> * Andrew Morton <akpm@linux-foundation.org> wrote:
> 
> > On Fri, 13 Mar 2009 15:42:12 +0100 Frederic Weisbecker <fweisbec@gmail.com> wrote:
> > 
> > > +static struct syscall_metadata *find_syscall_meta(unsigned long *syscall)
> > > +{
> > > +	struct syscall_metadata *start;
> > > +	struct syscall_metadata *stop;
> > > +	char str[KSYM_SYMBOL_LEN];
> > > +
> > > +
> > > +	start = (struct syscall_metadata *)__start_syscalls_metadata;
> > > +	stop = (struct syscall_metadata *)__stop_syscalls_metadata;
> > > +	kallsyms_lookup((unsigned long) syscall, NULL, NULL, NULL, str);
> > > +
> > > +	for ( ; start < stop; start++) {
> > > +		if (start->name && !strcmp(start->name, str))
> > > +			return start;
> > > +	}
> > > +	return NULL;
> > > +}
> > 
> > afacit this feature can be enabled when CONFIG_KALLSYMS=n.  
> > Does that make sense?
> 
> It does not make much sense - the function will return NULL and 
> we wont do any tracing. Frederic: FTRACE_SYSCALLS should either 
> select KALLSYMS in kernel/trace/Kconfig, like the STACK_TRACER 
> and FUNCTION_TRACER already does. Or perhaps, for really 
> puritane tracing, we should emit simplified, non-symbolic trace 
> data in the !KALLSYMS case.
> 
> 	Ingo

You're right. But it would be somewhat pointless to trace the syscalls
without their names.
I will make it select KALLSYMS, it seems to me a sane way to fix it.

Thanks.


^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH 1/2 v2] tracing/syscalls: core infrastructure for syscalls tracing
  2009-03-15  8:02       ` Andrew Morton
@ 2009-03-15 16:17         ` Frederic Weisbecker
  0 siblings, 0 replies; 35+ messages in thread
From: Frederic Weisbecker @ 2009-03-15 16:17 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Ingo Molnar, Linux Kernel Mailing List, Peter Zijlstra,
	Steven Rostedt, tglx, Jason Baron, Frank Ch. Eigler,
	Mathieu Desnoyers, KOSAKI Motohiro, Lai Jiangshan, Jiaying Zhang,
	Michael Rubin, Martin Bligh, Michael Davidson

On Sun, Mar 15, 2009 at 01:02:39AM -0700, Andrew Morton wrote:
> On Sun, 15 Mar 2009 05:59:04 +0100 Ingo Molnar <mingo@elte.hu> wrote:
> 
> > 
> > * Andrew Morton <akpm@linux-foundation.org> wrote:
> > 
> > > On Fri, 13 Mar 2009 15:42:11 +0100 Frederic Weisbecker <fweisbec@gmail.com> wrote:
> > > 
> > > > +void start_ftrace_syscalls(void)
> > > > +{
> > > > +	unsigned long flags;
> > > > +	struct task_struct *g, *t;
> > > > +
> > > > +	if (atomic_inc_return(&refcount) != 1)
> > > > +		goto out;
> > > > +
> > > > +	arch_init_ftrace_syscalls();
> > > > +	read_lock_irqsave(&tasklist_lock, flags);
> > > > +
> > > > +	do_each_thread(g, t) {
> > > > +		set_tsk_thread_flag(t, TIF_SYSCALL_FTRACE);
> > > > +	} while_each_thread(g, t);
> > > > +
> > > > +	read_unlock_irqrestore(&tasklist_lock, flags);
> > > > +out:
> > > > +	atomic_dec(&refcount);
> > > > +}
> > > > +
> > > > +void stop_ftrace_syscalls(void)
> > > > +{
> > > > +	unsigned long flags;
> > > > +	struct task_struct *g, *t;
> > > > +
> > > > +	if (atomic_dec_return(&refcount))
> > > > +		goto out;
> > > > +
> > > > +	read_lock_irqsave(&tasklist_lock, flags);
> > > > +
> > > > +	do_each_thread(g, t) {
> > > > +		clear_tsk_thread_flag(t, TIF_SYSCALL_FTRACE);
> > > > +	} while_each_thread(g, t);
> > > > +
> > > > +	read_unlock_irqrestore(&tasklist_lock, flags);
> > > > +out:
> > > > +	atomic_inc(&refcount);
> > > > +}
> > > 
> > > What is this `refcount' thing trying to do?  afacit it does 
> > > not prevent the two loops from running concurrently and making 
> > > a mess.
> > > 
> > > If it _is_ trying to prevent that from happening, then why not 
> > > use plain old mutex_lock()?
> > 
> > yeah - already commented about that to Frederic over IRC. A 
> > plain flag, checked inside the tasklist lock is more than 
> > enough.
> > 
> 
> That would require write_lock().


Hm, I made a mistake in this code.
What I wanted to do is to allow several tracing callsites to ask
for syscalls tracing and to only disable it when the last user releases
it (and avoid the iterate the tasklist one more time while the flag
is already set).

So I shouldn't increment it back when there is already one user,
I should only exit without doing anything.

But I still need a lock to prevent from concurrent clear/set flag
on the tasks when the last user is disabling syscall tracing and a new one
requests it.

I need a mutex here. I'm preparing it.

Thanks.


^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH 0/2 v2] Syscalls tracing
  2009-03-15 16:01     ` Frederic Weisbecker
@ 2009-03-23 16:32       ` Mathieu Desnoyers
  2009-03-23 19:27         ` Frederic Weisbecker
  0 siblings, 1 reply; 35+ messages in thread
From: Mathieu Desnoyers @ 2009-03-23 16:32 UTC (permalink / raw)
  To: Frederic Weisbecker
  Cc: Ingo Molnar, Linux Kernel Mailing List, Peter Zijlstra,
	Steven Rostedt, tglx, Jason Baron, Frank Ch. Eigler,
	KOSAKI Motohiro, Lai Jiangshan, Jiaying Zhang, Michael Rubin,
	Martin Bligh, Michael Davidson

* Frederic Weisbecker (fweisbec@gmail.com) wrote:
> On Fri, Mar 13, 2009 at 12:47:43PM -0400, Mathieu Desnoyers wrote:
> > * Frederic Weisbecker (fweisbec@gmail.com) wrote:
> > > On Fri, Mar 13, 2009 at 03:42:10PM +0100, Frederic Weisbecker wrote:
> > > > tracing/syscalls: core infrastructure to trace syscalls
> > > > 
> > > > This new iteration addresses a good part of the previous reviews.
> > > 
> > > 
> > > Ah I just discovered that you applied the previous version today.
> > > But the v2 is not a delta :-s
> > > 
> > > I can rebase them but not until Sunday.
> > > 
> > 
> > Hi Frederic,
> > 
> > I like the approach you are taking for syscall tracing. As I read your
> > code, I see that your infrastructure basically exports the information
> > under ascii format. How hard do you think that would be to get this to
> > work for binary output with LTTng ? Ascii output is not exactly a
> > high-performance output for a hot path such as system call entry/exit.
> > We could :
> 
> 
> Do you mean the syscalls metadata? (syscall name, argument types, number
> of arguments) or the syscall events?
> 
> I'm not sure it would be valuable to export the metadata as binary because
> they are not very large as a whole. But a kind of enumeration of types
> will be added as I explain below.
> 
> But I plan to export the events as binary values, though the usual tracing
> files.
>  

Syscall metadata could be exported in ascii. Syscall events could be
exported as binary, because they generate a high trace data throughput.

> > 1 - Iterate on the table at trace start to dump the syscall names, and
> > their arguments, into a syscall-table data channel.
> 
> 
> Yeah.
> For now the arguments parameters types are saved as strings as you know.
> I would like to add a new field on the syscall_metadata struct which will
> contain the arg types as enum values so that the tracer will be able to:
> 
> - Define custom callback for custom types to display better the arguments
>   I think I could provide a default output for each parameter type and
>   let the ability to override this default output with custom callback.
>   For example, it would be great to have the open flag as O_RDONLY instead
>   of the numerical value.

Yes, and we should make sure we save the information in the trace
buffers in the same way it is presented in the kernel (e.g. a flag
within a unsigned int), and use a type table to map the trace data to
some human-readable information. This type table could be in text
format or some fixed, easy to parse, format.

Basically, the O_RDONLY flag would simply be written as a "int" into
the trace buffers (no special callback required), but a mappping to
O_RDONLY, OR_WRONLY, etc.. would be registered in the type table for
this specific event types.

> 
> - probably create a raw format string value for each syscall so that we can
>   export the binary values directly to userspace and the user can format all
>   that with the pattern we give to him.
> 

LTTng now provides the ability to access the event type table table from
within the kernel, so we could save the information in binary form into
the trace buffers and then, only when the text output is required,
format it into text format. This would unify the binary and text output
mechanisms called at the tracing site, which is the most critical
tracing execution site.

I also wonder about the impact of the syscall tracing approach you are
taking compared to adding instrumentation directly into the kernel code
(e.g. directly in sys_open), which would have the following advantages :

- The data has already been copied to kernel-space, so there is no need
  to duplicate copy_from_user/to_user() calls.
- This second approach would let us trace syscall code called directly
  from within the kernel (e.g. from kernel threads).

What do you think ?

> 
> > 2 - We might have to create one event type per syscall argument types.
> > Or one event per syscall, this would be great because we would benefit
> > from the event identification infrastructure. And actually, the syscall
> > declaration macros could declare tiny callbacks that would declare a
> > marker name (syscall, <syscall_name>) along with the code that saves the
> > system call pr_regs parameters into the event payload.
> > 
> > Any thoughts ?
> 
> 
> I will surely have to provide a way to define custom callbacks for several
> syscalls (as shown with the open() flag example).
> But the problem is that I will not be able to do that statically like the
> metadata is currently built.
> Such a thing should be built when we enable the syscall tracing once I guess
> (iterate over the syscalls metadata, assign the callback that matches a syscall,
> etc...).

In LTTng, the metadata is already generated dynamically. For instance,
when we want to list the connected interrupt handlers, we have to
support the fact that drivers can be loaded/unloaded. Therefore, we list
the interrupt handlers at trace start, but we also instrument the
insertion/removal of handlers while the trace runs, so we know at each
point in time which handlers are connected. This mechanism would fit
well with your system call type dump.

> 
> But I think I will need sometimes complex callbacks to handle the output of
> some complex arguments, so I don't know if it is possible to handled all that
> through a marker or TRACE_EVENT style.
> 

In my lttng tree, ltt/probes/block-trace.c already has something like
what you need for the open() flags :


void probe_block_bio_complete(struct request_queue *q, struct bio *bio)
{
        trace_mark_tp(block, bio_complete, block_bio_complete,
                probe_block_bio_complete,
                "sector %llu size %u rw(FAILFAST_DRIVER,FAILFAST_TRANSPORT,"
                "FAILFAST_DEV,DISCARD,META,SYNC,BARRIER,AHEAD,RW) %lX "
                "not_uptodate #1u%d",
                (unsigned long long)bio->bi_sector, bio->bi_size,
                bio->bi_rw, !bio_flagged(bio, BIO_UPTODATE));
}

an event type parser could then map the data saved in the "rw" unsigned
long argument to the bitfield composed of :

FAILFAST_DRIVER
FAILFAST_TRANSPORT
FAILFAST_DEV
DISCARD
META
SYNC
BARRIER
AHEAD
RW

It's very basic, and I'm sure we could improve this, but it gives you
the idea of how I plan to deal with this.

Mathieu


> Thanks!
> 
> Frederic.
> 
> 
> > 
> > Mathieu
> > 
> > >  
> > > > As suggested by Ingo Molnar and Peter Zijlstra, the syscalls
> > > > prototypes probing is done by abusing the SYSCALL_DEFINE family
> > > > macros.
> > > > We now store automatically the arguments names, their types, their number
> > > > and the name of the syscall.
> > > > 
> > > > Also some fixes on output newlines and dangerous exporting of global_trace
> > > > are provided.
> > > > 
> > > > An example of the trace:
> > > > 
> > > > echo syscall > /debugfs/tracing/current_tracer
> > > > 
> > > > 
> > > > # tracer: syscall
> > > > #
> > > > #           TASK-PID    CPU#    TIMESTAMP  FUNCTION
> > > > #              | |       |          |         |
> > > >            <...>-5080  [001]   132.192228: sys_dup2(oldfd: a, newfd: 1) 
> > > >            <...>-5080  [001]   132.192239: sys_dup2 -> 0x1
> > > >            <...>-5080  [001]   132.192242: sys_fcntl(fd: a, cmd: 1, arg: 0) 
> > > >            <...>-5080  [001]   132.192245: sys_fcntl -> 0x1
> > > >            <...>-5080  [001]   132.192248: sys_close(fd: a) 
> > > >            <...>-5080  [001]   132.192250: sys_close -> 0x0
> > > >            <...>-5080  [001]   132.192265: sys_rt_sigprocmask(how: 0, set: 0, oset: 6cf808, sigsetsize: 8) 
> > > >            <...>-5080  [001]   132.192267: sys_rt_sigprocmask -> 0x0
> > > >            <...>-5080  [001]   132.192271: sys_rt_sigaction(sig: 2, act: 7fffff4338f0, oact: 7fffff433850, sigsetsize: 8) 
> > > >            <...>-5080  [001]   132.192273: sys_rt_sigaction -> 0x0
> > > >            <...>-5080  [001]   132.192285: sys_rt_sigprocmask(how: 0, set: 0, oset: 6cf808, sigsetsize: 8) 
> > > >            <...>-5080  [001]   132.192287: sys_rt_sigprocmask -> 0x0
> > > >            <...>-5080  [001]   132.192415: sys_write(fd: 1, buf: 15dfc08, count: 21) 
> > > >            <...>-5080  [001]   132.192436: sys_write -> 0x21
> > > >            <...>-4754  [000]   132.192478: sys_read(fd: 8, buf: 2a9340e, count: 1fee) 
> > > >            <...>-5080  [001]   132.192487: sys_rt_sigprocmask(how: 0, set: 7fffff432a70, oset: 7fffff4329f0, sigsetsize: 8) 
> > > > 
> > > > And if you ask for the parameters types:
> > > > 
> > > > echo syscall_arg_type > trace_options
> > > > 
> > > > # tracer: syscall
> > > > #
> > > > #           TASK-PID    CPU#    TIMESTAMP  FUNCTION
> > > > #              | |       |          |         |
> > > >            <...>-5080  [001]   132.192228: sys_dup2(unsigned int oldfd: a, unsigned int newfd: 1) 
> > > >            <...>-5080  [001]   132.192239: sys_dup2 -> 0x1
> > > >            <...>-5080  [001]   132.192242: sys_fcntl(unsigned int fd: a, unsigned int cmd: 1, unsigned long arg: 0) 
> > > >            <...>-5080  [001]   132.192245: sys_fcntl -> 0x1
> > > >            <...>-5080  [001]   132.192248: sys_close(unsigned int fd: a) 
> > > >            <...>-5080  [001]   132.192250: sys_close -> 0x0
> > > >            <...>-5080  [001]   132.192265: sys_rt_sigprocmask(int how: 0, sigset_t * set: 0, sigset_t * oset: 6cf808, size_t sigsetsize: 8) 
> > > >            <...>-5080  [001]   132.192267: sys_rt_sigprocmask -> 0x0
> > > >            <...>-5080  [001]   132.192271: sys_rt_sigaction(int sig: 2, const struct sigaction * act: 7fffff4338f0, struct sigaction * oact: 7fffff433850, size_t sigsetsize: 8) 
> > > >            <...>-5080  [001]   132.192273: sys_rt_sigaction -> 0x0
> > > >            <...>-5080  [001]   132.192285: sys_rt_sigprocmask(int how: 0, sigset_t * set: 0, sigset_t * oset: 6cf808, size_t sigsetsize: 8) 
> > > >            <...>-5080  [001]   132.192287: sys_rt_sigprocmask -> 0x0
> > > >            <...>-5080  [001]   132.192415: sys_write(unsigned int fd: 1, const char * buf: 15dfc08, size_t count: 21) 
> > > >            <...>-5080  [001]   132.192436: sys_write -> 0x21
> > > > 
> > > > TODO:
> > > > 
> > > > - add a single mask on the struct syscall_metadata to provide quickly which arguments is
> > > >   a pointer (usually type __user *p) so that the user can decide if he wants to save them of tracing time
> > > > 
> > > > - now that we have each parameter type as strings, add a new field on struct syscall_metadata to have the parameter types
> > > >   encoded as single enum values (for quick checks) so that we can use specific callbacks for each parameter type
> > > >   to be displayed.
> > > > 
> > > > NOTE: this is still not overlapping with a potential future merge of utrace, since the low-level hooks on
> > > > the syscalls remain somewhat basic.
> > > > 
> > > > NOTE2: I've only tested it on x86-64 for now, so only x86-64 support is provided.
> > > > --
> > > > 
> > > > Frederic Weisbecker (2):
> > > >   tracing/syscalls: core infrastructure for syscalls tracing
> > > >   tracing/syscalls: support for syscalls tracing on x86-64
> > > > 
> > > >  arch/x86/Kconfig                   |    1 +
> > > >  arch/x86/include/asm/ftrace.h      |    7 +
> > > >  arch/x86/include/asm/thread_info.h |    9 +-
> > > >  arch/x86/kernel/ftrace.c           |   63 +++++++++
> > > >  arch/x86/kernel/ptrace.c           |    7 +
> > > >  include/asm-generic/vmlinux.lds.h  |   11 ++-
> > > >  include/linux/ftrace.h             |   29 +++++
> > > >  include/linux/syscalls.h           |   60 +++++++++-
> > > >  kernel/trace/Kconfig               |   10 ++
> > > >  kernel/trace/Makefile              |    1 +
> > > >  kernel/trace/trace.h               |   19 +++
> > > >  kernel/trace/trace_syscalls.c      |  243 ++++++++++++++++++++++++++++++++++++
> > > >  12 files changed, 454 insertions(+), 6 deletions(-)
> > > >  create mode 100644 kernel/trace/trace_syscalls.c
> > > > 
> > > 
> > 
> > -- 
> > Mathieu Desnoyers
> > OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F  BA06 3F25 A8FE 3BAE 9A68
> 

-- 
Mathieu Desnoyers
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F  BA06 3F25 A8FE 3BAE 9A68

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH 0/2 v2] Syscalls tracing
  2009-03-23 16:32       ` Mathieu Desnoyers
@ 2009-03-23 19:27         ` Frederic Weisbecker
  2009-03-23 19:40           ` Ingo Molnar
  0 siblings, 1 reply; 35+ messages in thread
From: Frederic Weisbecker @ 2009-03-23 19:27 UTC (permalink / raw)
  To: Mathieu Desnoyers
  Cc: Ingo Molnar, Linux Kernel Mailing List, Peter Zijlstra,
	Steven Rostedt, tglx, Jason Baron, Frank Ch. Eigler,
	KOSAKI Motohiro, Lai Jiangshan, Jiaying Zhang, Michael Rubin,
	Martin Bligh, Michael Davidson

On Mon, Mar 23, 2009 at 12:32:35PM -0400, Mathieu Desnoyers wrote:
> * Frederic Weisbecker (fweisbec@gmail.com) wrote:
> > On Fri, Mar 13, 2009 at 12:47:43PM -0400, Mathieu Desnoyers wrote:
> > > * Frederic Weisbecker (fweisbec@gmail.com) wrote:
> > > > On Fri, Mar 13, 2009 at 03:42:10PM +0100, Frederic Weisbecker wrote:
> > > > > tracing/syscalls: core infrastructure to trace syscalls
> > > > > 
> > > > > This new iteration addresses a good part of the previous reviews.
> > > > 
> > > > 
> > > > Ah I just discovered that you applied the previous version today.
> > > > But the v2 is not a delta :-s
> > > > 
> > > > I can rebase them but not until Sunday.
> > > > 
> > > 
> > > Hi Frederic,
> > > 
> > > I like the approach you are taking for syscall tracing. As I read your
> > > code, I see that your infrastructure basically exports the information
> > > under ascii format. How hard do you think that would be to get this to
> > > work for binary output with LTTng ? Ascii output is not exactly a
> > > high-performance output for a hot path such as system call entry/exit.
> > > We could :
> > 
> > 
> > Do you mean the syscalls metadata? (syscall name, argument types, number
> > of arguments) or the syscall events?
> > 
> > I'm not sure it would be valuable to export the metadata as binary because
> > they are not very large as a whole. But a kind of enumeration of types
> > will be added as I explain below.
> > 
> > But I plan to export the events as binary values, though the usual tracing
> > files.
> >  
> 
> Syscall metadata could be exported in ascii. Syscall events could be
> exported as binary, because they generate a high trace data throughput.


Yeah, agreed, that what I meant.

 
> > > 1 - Iterate on the table at trace start to dump the syscall names, and
> > > their arguments, into a syscall-table data channel.
> > 
> > 
> > Yeah.
> > For now the arguments parameters types are saved as strings as you know.
> > I would like to add a new field on the syscall_metadata struct which will
> > contain the arg types as enum values so that the tracer will be able to:
> > 
> > - Define custom callback for custom types to display better the arguments
> >   I think I could provide a default output for each parameter type and
> >   let the ability to override this default output with custom callback.
> >   For example, it would be great to have the open flag as O_RDONLY instead
> >   of the numerical value.
> 
> Yes, and we should make sure we save the information in the trace
> buffers in the same way it is presented in the kernel (e.g. a flag
> within a unsigned int), and use a type table to map the trace data to
> some human-readable information. This type table could be in text
> format or some fixed, easy to parse, format.
> 
> Basically, the O_RDONLY flag would simply be written as a "int" into
> the trace buffers (no special callback required), but a mappping to
> O_RDONLY, OR_WRONLY, etc.. would be registered in the type table for
> this specific event types.


Indeed a general type table which could be overriden or completed by
a per event table, for those which require it.


> > 
> > - probably create a raw format string value for each syscall so that we can
> >   export the binary values directly to userspace and the user can format all
> >   that with the pattern we give to him.
> > 
> 
> LTTng now provides the ability to access the event type table table from
> within the kernel, so we could save the information in binary form into
> the trace buffers and then, only when the text output is required,
> format it into text format. This would unify the binary and text output
> mechanisms called at the tracing site, which is the most critical
> tracing execution site.


That's what is done currently. Nothing is stored in plain formatted
ascii in the ring buffer for the syscall tracing.
It's all raw binary. Ie, we dump the registers contents and the number
of the syscall.

Only on output time we perform the formatting, which later could be
a choice between a binary export and a formatted text.


> I also wonder about the impact of the syscall tracing approach you are
> taking compared to adding instrumentation directly into the kernel code
> (e.g. directly in sys_open), which would have the following advantages :
> 
> - The data has already been copied to kernel-space, so there is no need
>   to duplicate copy_from_user/to_user() calls.


Indeed, I haven't thought about it.


> - This second approach would let us trace syscall code called directly
>   from within the kernel (e.g. from kernel threads).
> 
> What do you think ?


Yes, but it means adding hundreds of trace{points,events} over the kernel.

Moreover, the current CPP probe let us very easily factorize it, making it very
scalable.

And actually I don't think two copy_from_user will really change a lot the
tracing throughput.
The only difference with your approach is that we'll have:

sys_call do the copy_from_user()
memcpy(ring_buffer, arg_copied, n)

instead of:

copy_from_user(ring_buffer, __user arg, n)
sys_call do the copy_from_user

With the second solution we risk a tlb miss on ring_buffer stage,
or worst: a page fault.
But once it is done, when the syscall performs its copy_from_user, there
are few chances that it encounters one more time a tlb miss or a page fault
for the same page since it will be called very soon after.

Of course this is all theory, and only practice could prove it.

Concerning the syscalls performed by the kernel threads, this is not
the same focus of tracing. When we trace a syscall, we mostly want to see
what happens when the kernel is servicing a user.
But well, it can be debated.

I don't know much about it. But when a kernel thread performs a syscall,
I guess it is often wrapped by an in-kernel helper. For example to create
a thread, a call to fork is done, but this is wrapped in a call to
kthread_create(). So a tracepoint can be placed here.
I think a lot of examples match this pattern concerning the kernel threads
and a tracepoint in such higher level helper for the kthreads seems to
me better to capture a kernel thread creation event.

Furthermore, AFAICS, only the kthreadd thread forks (and init, which has its
own helper for that too).


> 
> > 
> > > 2 - We might have to create one event type per syscall argument types.
> > > Or one event per syscall, this would be great because we would benefit
> > > from the event identification infrastructure. And actually, the syscall
> > > declaration macros could declare tiny callbacks that would declare a
> > > marker name (syscall, <syscall_name>) along with the code that saves the
> > > system call pr_regs parameters into the event payload.
> > > 
> > > Any thoughts ?
> > 
> > 
> > I will surely have to provide a way to define custom callbacks for several
> > syscalls (as shown with the open() flag example).
> > But the problem is that I will not be able to do that statically like the
> > metadata is currently built.
> > Such a thing should be built when we enable the syscall tracing once I guess
> > (iterate over the syscalls metadata, assign the callback that matches a syscall,
> > etc...).
> 
> In LTTng, the metadata is already generated dynamically. For instance,
> when we want to list the connected interrupt handlers, we have to
> support the fact that drivers can be loaded/unloaded. Therefore, we list
> the interrupt handlers at trace start, but we also instrument the
> insertion/removal of handlers while the trace runs, so we know at each
> point in time which handlers are connected. This mechanism would fit
> well with your system call type dump.


Cool, and how could it be used by the syscalls to define specific
callbacks for their types (for example with thee open() flag, or any generic
types like common int)?


> > 
> > But I think I will need sometimes complex callbacks to handle the output of
> > some complex arguments, so I don't know if it is possible to handled all that
> > through a marker or TRACE_EVENT style.
> > 
> 
> In my lttng tree, ltt/probes/block-trace.c already has something like
> what you need for the open() flags :
> 
> 
> void probe_block_bio_complete(struct request_queue *q, struct bio *bio)
> {
>         trace_mark_tp(block, bio_complete, block_bio_complete,
>                 probe_block_bio_complete,
>                 "sector %llu size %u rw(FAILFAST_DRIVER,FAILFAST_TRANSPORT,"
>                 "FAILFAST_DEV,DISCARD,META,SYNC,BARRIER,AHEAD,RW) %lX "
>                 "not_uptodate #1u%d",
>                 (unsigned long long)bio->bi_sector, bio->bi_size,
>                 bio->bi_rw, !bio_flagged(bio, BIO_UPTODATE));
> }
> 
> an event type parser could then map the data saved in the "rw" unsigned
> long argument to the bitfield composed of :
> 
> FAILFAST_DRIVER
> FAILFAST_TRANSPORT
> FAILFAST_DEV
> DISCARD
> META
> SYNC
> BARRIER
> AHEAD
> RW
>


Oh I see!


> It's very basic, and I'm sure we could improve this, but it gives you
> the idea of how I plan to deal with this.
> 
> Mathieu


I understand better now.
The idea would be now to join the syscalls metadata with such quick
handlers. We will have to think about how to join these in a proper
way.

Thanks!

Frederic.

 
> 
> > Thanks!
> > 
> > Frederic.
> > 
> > 
> > > 
> > > Mathieu
> > > 
> > > >  
> > > > > As suggested by Ingo Molnar and Peter Zijlstra, the syscalls
> > > > > prototypes probing is done by abusing the SYSCALL_DEFINE family
> > > > > macros.
> > > > > We now store automatically the arguments names, their types, their number
> > > > > and the name of the syscall.
> > > > > 
> > > > > Also some fixes on output newlines and dangerous exporting of global_trace
> > > > > are provided.
> > > > > 
> > > > > An example of the trace:
> > > > > 
> > > > > echo syscall > /debugfs/tracing/current_tracer
> > > > > 
> > > > > 
> > > > > # tracer: syscall
> > > > > #
> > > > > #           TASK-PID    CPU#    TIMESTAMP  FUNCTION
> > > > > #              | |       |          |         |
> > > > >            <...>-5080  [001]   132.192228: sys_dup2(oldfd: a, newfd: 1) 
> > > > >            <...>-5080  [001]   132.192239: sys_dup2 -> 0x1
> > > > >            <...>-5080  [001]   132.192242: sys_fcntl(fd: a, cmd: 1, arg: 0) 
> > > > >            <...>-5080  [001]   132.192245: sys_fcntl -> 0x1
> > > > >            <...>-5080  [001]   132.192248: sys_close(fd: a) 
> > > > >            <...>-5080  [001]   132.192250: sys_close -> 0x0
> > > > >            <...>-5080  [001]   132.192265: sys_rt_sigprocmask(how: 0, set: 0, oset: 6cf808, sigsetsize: 8) 
> > > > >            <...>-5080  [001]   132.192267: sys_rt_sigprocmask -> 0x0
> > > > >            <...>-5080  [001]   132.192271: sys_rt_sigaction(sig: 2, act: 7fffff4338f0, oact: 7fffff433850, sigsetsize: 8) 
> > > > >            <...>-5080  [001]   132.192273: sys_rt_sigaction -> 0x0
> > > > >            <...>-5080  [001]   132.192285: sys_rt_sigprocmask(how: 0, set: 0, oset: 6cf808, sigsetsize: 8) 
> > > > >            <...>-5080  [001]   132.192287: sys_rt_sigprocmask -> 0x0
> > > > >            <...>-5080  [001]   132.192415: sys_write(fd: 1, buf: 15dfc08, count: 21) 
> > > > >            <...>-5080  [001]   132.192436: sys_write -> 0x21
> > > > >            <...>-4754  [000]   132.192478: sys_read(fd: 8, buf: 2a9340e, count: 1fee) 
> > > > >            <...>-5080  [001]   132.192487: sys_rt_sigprocmask(how: 0, set: 7fffff432a70, oset: 7fffff4329f0, sigsetsize: 8) 
> > > > > 
> > > > > And if you ask for the parameters types:
> > > > > 
> > > > > echo syscall_arg_type > trace_options
> > > > > 
> > > > > # tracer: syscall
> > > > > #
> > > > > #           TASK-PID    CPU#    TIMESTAMP  FUNCTION
> > > > > #              | |       |          |         |
> > > > >            <...>-5080  [001]   132.192228: sys_dup2(unsigned int oldfd: a, unsigned int newfd: 1) 
> > > > >            <...>-5080  [001]   132.192239: sys_dup2 -> 0x1
> > > > >            <...>-5080  [001]   132.192242: sys_fcntl(unsigned int fd: a, unsigned int cmd: 1, unsigned long arg: 0) 
> > > > >            <...>-5080  [001]   132.192245: sys_fcntl -> 0x1
> > > > >            <...>-5080  [001]   132.192248: sys_close(unsigned int fd: a) 
> > > > >            <...>-5080  [001]   132.192250: sys_close -> 0x0
> > > > >            <...>-5080  [001]   132.192265: sys_rt_sigprocmask(int how: 0, sigset_t * set: 0, sigset_t * oset: 6cf808, size_t sigsetsize: 8) 
> > > > >            <...>-5080  [001]   132.192267: sys_rt_sigprocmask -> 0x0
> > > > >            <...>-5080  [001]   132.192271: sys_rt_sigaction(int sig: 2, const struct sigaction * act: 7fffff4338f0, struct sigaction * oact: 7fffff433850, size_t sigsetsize: 8) 
> > > > >            <...>-5080  [001]   132.192273: sys_rt_sigaction -> 0x0
> > > > >            <...>-5080  [001]   132.192285: sys_rt_sigprocmask(int how: 0, sigset_t * set: 0, sigset_t * oset: 6cf808, size_t sigsetsize: 8) 
> > > > >            <...>-5080  [001]   132.192287: sys_rt_sigprocmask -> 0x0
> > > > >            <...>-5080  [001]   132.192415: sys_write(unsigned int fd: 1, const char * buf: 15dfc08, size_t count: 21) 
> > > > >            <...>-5080  [001]   132.192436: sys_write -> 0x21
> > > > > 
> > > > > TODO:
> > > > > 
> > > > > - add a single mask on the struct syscall_metadata to provide quickly which arguments is
> > > > >   a pointer (usually type __user *p) so that the user can decide if he wants to save them of tracing time
> > > > > 
> > > > > - now that we have each parameter type as strings, add a new field on struct syscall_metadata to have the parameter types
> > > > >   encoded as single enum values (for quick checks) so that we can use specific callbacks for each parameter type
> > > > >   to be displayed.
> > > > > 
> > > > > NOTE: this is still not overlapping with a potential future merge of utrace, since the low-level hooks on
> > > > > the syscalls remain somewhat basic.
> > > > > 
> > > > > NOTE2: I've only tested it on x86-64 for now, so only x86-64 support is provided.
> > > > > --
> > > > > 
> > > > > Frederic Weisbecker (2):
> > > > >   tracing/syscalls: core infrastructure for syscalls tracing
> > > > >   tracing/syscalls: support for syscalls tracing on x86-64
> > > > > 
> > > > >  arch/x86/Kconfig                   |    1 +
> > > > >  arch/x86/include/asm/ftrace.h      |    7 +
> > > > >  arch/x86/include/asm/thread_info.h |    9 +-
> > > > >  arch/x86/kernel/ftrace.c           |   63 +++++++++
> > > > >  arch/x86/kernel/ptrace.c           |    7 +
> > > > >  include/asm-generic/vmlinux.lds.h  |   11 ++-
> > > > >  include/linux/ftrace.h             |   29 +++++
> > > > >  include/linux/syscalls.h           |   60 +++++++++-
> > > > >  kernel/trace/Kconfig               |   10 ++
> > > > >  kernel/trace/Makefile              |    1 +
> > > > >  kernel/trace/trace.h               |   19 +++
> > > > >  kernel/trace/trace_syscalls.c      |  243 ++++++++++++++++++++++++++++++++++++
> > > > >  12 files changed, 454 insertions(+), 6 deletions(-)
> > > > >  create mode 100644 kernel/trace/trace_syscalls.c
> > > > > 
> > > > 
> > > 
> > > -- 
> > > Mathieu Desnoyers
> > > OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F  BA06 3F25 A8FE 3BAE 9A68
> > 
> 
> -- 
> Mathieu Desnoyers
> OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F  BA06 3F25 A8FE 3BAE 9A68


^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH 0/2 v2] Syscalls tracing
  2009-03-23 19:27         ` Frederic Weisbecker
@ 2009-03-23 19:40           ` Ingo Molnar
  2009-03-23 20:37             ` Mathieu Desnoyers
  0 siblings, 1 reply; 35+ messages in thread
From: Ingo Molnar @ 2009-03-23 19:40 UTC (permalink / raw)
  To: Frederic Weisbecker
  Cc: Mathieu Desnoyers, Linux Kernel Mailing List, Peter Zijlstra,
	Steven Rostedt, tglx, Jason Baron, Frank Ch. Eigler,
	KOSAKI Motohiro, Lai Jiangshan, Jiaying Zhang, Michael Rubin,
	Martin Bligh, Michael Davidson


* Frederic Weisbecker <fweisbec@gmail.com> wrote:

> And actually I don't think two copy_from_user will really change a 
> lot the tracing throughput.

Correct. It's already in the CPU cache so it is a performance 
non-issue and essentially for free. Copy avoidance is only an issue 
when touchig cache-cold data.

( Yes, a few cycles could be shaven off but the beauty of 
  all-encompassing non-source-code-invasive syscall tracing covering 
  hundreds of syscalls straight away trumps those concerns. )

> The idea would be now to join the syscalls metadata with such 
> quick handlers. We will have to think about how to join these in a 
> proper way.

We could allow per syscall tracepoints via the attribute table. The 
call signature could be a standard:

   long sys_call(unsinged long arg1, unsigned long arg2,
		 unsigned long arg3, unsigned long arg4,
		 unsigned long arg5, unsigned long arg6);

This would allow interested plugins/tools to install a system call 
specific callback. (we might allow two tracepoints - one before and 
one after the syscall)

The registration API could be driven by the name or by the syscall 
index - NR_sys_open or so.

	Ingo

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH 0/2 v2] Syscalls tracing
  2009-03-23 19:40           ` Ingo Molnar
@ 2009-03-23 20:37             ` Mathieu Desnoyers
  0 siblings, 0 replies; 35+ messages in thread
From: Mathieu Desnoyers @ 2009-03-23 20:37 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Frederic Weisbecker, Linux Kernel Mailing List, Peter Zijlstra,
	Steven Rostedt, tglx, Jason Baron, Frank Ch. Eigler,
	KOSAKI Motohiro, Lai Jiangshan, Jiaying Zhang, Michael Rubin,
	Martin Bligh, Michael Davidson

* Ingo Molnar (mingo@elte.hu) wrote:
> 
> * Frederic Weisbecker <fweisbec@gmail.com> wrote:
> 
> > And actually I don't think two copy_from_user will really change a 
> > lot the tracing throughput.
> 
> Correct. It's already in the CPU cache so it is a performance 
> non-issue and essentially for free. Copy avoidance is only an issue 
> when touchig cache-cold data.
> 
> ( Yes, a few cycles could be shaven off but the beauty of 
>   all-encompassing non-source-code-invasive syscall tracing covering 
>   hundreds of syscalls straight away trumps those concerns. )
> 

I agree. I just wanted to make sure we agreed on the tradeoff here. I
also think hitting data already in cache-lines a second time with
copy_from_user should not be a big concern.

> > The idea would be now to join the syscalls metadata with such 
> > quick handlers. We will have to think about how to join these in a 
> > proper way.
> 
> We could allow per syscall tracepoints via the attribute table. The 
> call signature could be a standard:
> 
>    long sys_call(unsinged long arg1, unsigned long arg2,
> 		 unsigned long arg3, unsigned long arg4,
> 		 unsigned long arg5, unsigned long arg6);
> 
> This would allow interested plugins/tools to install a system call 
> specific callback. (we might allow two tracepoints - one before and 
> one after the syscall)
> 
> The registration API could be driven by the name or by the syscall 
> index - NR_sys_open or so.

Hrm, given the syscalls are defined with their number of arguments
with the SYSCALL_DEFINE* macros, then we could create, in syscalls.h
(example from open.c) :

SYSCALL_DECLARE2(statfs, const char __user *, pathname, struct statfs __user *, buf))

SYSCALL_DECLARE3(statfs64, const char __user *, pathname, size_t, sz, struct statfs64 __user *, buf)

creating SYSCALL_DECLARE0 to 6, which would map to a tracepoint
declaration _and_ a syscall prototype, e.g.

#define __SC_ARGS1(t1, a1)      a1
#define __SC_ARGS2(t2, a2, ...) a2, __SC_ARGS1(__VA_ARGS__)
#define __SC_ARGS3(t3, a3, ...) a3, __SC_ARGS2(__VA_ARGS__)
#define __SC_ARGS4(t4, a4, ...) a4, __SC_ARGS3(__VA_ARGS__)
#define __SC_ARGS5(t5, a5, ...) a5, __SC_ARGS4(__VA_ARGS__)
#define __SC_ARGS6(t6, a6, ...) a6, __SC_ARGS5(__VA_ARGS__)

#define SYSCALL_DECLARE2(name, ...) SYSCALL_DECLAREx(2, _##name, __VA_ARGS__) 

#define SYSCALL_DECLAREx(x, name, ...)				\
	long sys##name(__SC_DECL##x(__VA_ARGS__));		\
	DECLARE_TRACE(sys_##name,				\
			TP_PROTO(__SC_DECL##x(__VA_ARGS__)),	\
				TP_ARGS(__SC_ARGS##x(__VA_ARGS__)))

Those could be declared in a system-wide header (syscalls.h ?) which
would be included by each files using SYSCALL_DEFINE*. Those
declarations would declare the tracepoints and therefore make sure we
spot any SYSCALL_DEFINE* change at compile-time, and we could create a
tracing module which would contain the callbacks that would register on
those syscall tracepoint declarations. This would all be type-safe,
which is a very nice thing to have, even if we don't expect the system
calls to change often at all.

Mathieu

> 
> 	Ingo

-- 
Mathieu Desnoyers
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F  BA06 3F25 A8FE 3BAE 9A68

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH 1/2 v2] tracing/syscalls: core infrastructure for syscalls  tracing
  2009-03-13 14:42 ` [PATCH 1/2 v2] tracing/syscalls: core infrastructure for syscalls tracing Frederic Weisbecker
  2009-03-13 16:09   ` [tip:tracing/syscalls] tracing/syscalls: core infrastructure for syscalls tracing, enhancements Frederic Weisbecker
  2009-03-15  3:51   ` [PATCH 1/2 v2] tracing/syscalls: core infrastructure for syscalls tracing Andrew Morton
@ 2009-04-06 21:55   ` Tony Luck
  2009-04-06 22:12     ` Frederic Weisbecker
  2009-04-08 18:40     ` [PATCH] tracing/syscalls: use a dedicated file header Frederic Weisbecker
  2 siblings, 2 replies; 35+ messages in thread
From: Tony Luck @ 2009-04-06 21:55 UTC (permalink / raw)
  To: Frederic Weisbecker
  Cc: Ingo Molnar, Linux Kernel Mailing List, Peter Zijlstra,
	Steven Rostedt, tglx, Jason Baron, Frank Ch. Eigler,
	Mathieu Desnoyers, KOSAKI Motohiro, Lai Jiangshan, Jiaying Zhang,
	Michael Rubin, Martin Bligh, Michael Davidson

On Fri, Mar 13, 2009 at 7:42 AM, Frederic Weisbecker <fweisbec@gmail.com> wrote:
> diff --git a/include/linux/syscalls.h b/include/linux/syscalls.h
> index ab1d772..dfe2a44 100644
> --- a/include/linux/syscalls.h
> +++ b/include/linux/syscalls.h
> @@ -66,6 +66,7 @@ struct perf_counter_hw_event;
>  #include <asm/signal.h>
>  #include <linux/quota.h>
>  #include <linux/key.h>
> +#include <linux/ftrace.h>

This causes some unpleasantness in the ia64 build with CONFIG_IA32_SUPPORT=y

  CC      arch/ia64/ia32/sys_ia32.o
In file included from arch/ia64/ia32/sys_ia32.c:55:
arch/ia64/ia32/ia32priv.h:290:1: warning: "elf_check_arch" redefined
In file included from include/linux/elf.h:7,
                 from include/linux/module.h:14,
                 from include/linux/ftrace.h:8,
                 from include/linux/syscalls.h:68,
                 from arch/ia64/ia32/sys_ia32.c:18:
/home/aegl/generic-smp/arch/ia64/include/asm/elf.h:19:1: warning: this
is the location of the previous definition
In file included from arch/ia64/ia32/sys_ia32.c:55:
arch/ia64/ia32/ia32priv.h:295:1: warning: "ELF_CLASS" redefined
In file included from include/linux/elf.h:7,
                 from include/linux/module.h:14,
                 from include/linux/ftrace.h:8,
                 from include/linux/syscalls.h:68,
                 from arch/ia64/ia32/sys_ia32.c:18:

and lots more like this.  Suddenly having elf.h included in a lot of
the ia32 compat
files is the root problem.

What does syscalls.h really need to pull in at this point ... is there
a way for it to get it without pulling in so much extra stuff?

-Tony

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH 1/2 v2] tracing/syscalls: core infrastructure for syscalls tracing
  2009-04-06 21:55   ` Tony Luck
@ 2009-04-06 22:12     ` Frederic Weisbecker
  2009-04-07  0:30       ` Steven Rostedt
  2009-04-08 18:40     ` [PATCH] tracing/syscalls: use a dedicated file header Frederic Weisbecker
  1 sibling, 1 reply; 35+ messages in thread
From: Frederic Weisbecker @ 2009-04-06 22:12 UTC (permalink / raw)
  To: Tony Luck
  Cc: Ingo Molnar, Linux Kernel Mailing List, Peter Zijlstra,
	Steven Rostedt, tglx, Jason Baron, Frank Ch. Eigler,
	Mathieu Desnoyers, KOSAKI Motohiro, Lai Jiangshan, Jiaying Zhang,
	Michael Rubin, Martin Bligh, Michael Davidson

On Mon, Apr 06, 2009 at 02:55:55PM -0700, Tony Luck wrote:
> On Fri, Mar 13, 2009 at 7:42 AM, Frederic Weisbecker <fweisbec@gmail.com> wrote:
> > diff --git a/include/linux/syscalls.h b/include/linux/syscalls.h
> > index ab1d772..dfe2a44 100644
> > --- a/include/linux/syscalls.h
> > +++ b/include/linux/syscalls.h
> > @@ -66,6 +66,7 @@ struct perf_counter_hw_event;
> >  #include <asm/signal.h>
> >  #include <linux/quota.h>
> >  #include <linux/key.h>
> > +#include <linux/ftrace.h>
> 
> This causes some unpleasantness in the ia64 build with CONFIG_IA32_SUPPORT=y
> 
>   CC      arch/ia64/ia32/sys_ia32.o
> In file included from arch/ia64/ia32/sys_ia32.c:55:
> arch/ia64/ia32/ia32priv.h:290:1: warning: "elf_check_arch" redefined
> In file included from include/linux/elf.h:7,
>                  from include/linux/module.h:14,
>                  from include/linux/ftrace.h:8,
>                  from include/linux/syscalls.h:68,
>                  from arch/ia64/ia32/sys_ia32.c:18:
> /home/aegl/generic-smp/arch/ia64/include/asm/elf.h:19:1: warning: this
> is the location of the previous definition
> In file included from arch/ia64/ia32/sys_ia32.c:55:
> arch/ia64/ia32/ia32priv.h:295:1: warning: "ELF_CLASS" redefined
> In file included from include/linux/elf.h:7,
>                  from include/linux/module.h:14,
>                  from include/linux/ftrace.h:8,
>                  from include/linux/syscalls.h:68,
>                  from arch/ia64/ia32/sys_ia32.c:18:
> 
> and lots more like this.  Suddenly having elf.h included in a lot of
> the ia32 compat
> files is the root problem.
> 
> What does syscalls.h really need to pull in at this point ... is there
> a way for it to get it without pulling in so much extra stuff?


Ah, I see. I wondered somewhat when I included ftrace.h from syscall.h,
I didn't know why it stressed me but now I understand :-)

Though this is weird that both ia32 and ia64 definitions are allowed to be
included like that.

Anyway, I will put the syscalls tracing headers to a separate file to fix
that, thanks for reporting this!

Frederic.


> -Tony


^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH 1/2 v2] tracing/syscalls: core infrastructure for syscalls tracing
  2009-04-06 22:12     ` Frederic Weisbecker
@ 2009-04-07  0:30       ` Steven Rostedt
  0 siblings, 0 replies; 35+ messages in thread
From: Steven Rostedt @ 2009-04-07  0:30 UTC (permalink / raw)
  To: Frederic Weisbecker
  Cc: Tony Luck, Ingo Molnar, Linux Kernel Mailing List,
	Peter Zijlstra, tglx, Jason Baron, Frank Ch. Eigler,
	Mathieu Desnoyers, KOSAKI Motohiro, Lai Jiangshan, Jiaying Zhang,
	Michael Rubin, Martin Bligh, Michael Davidson


On Tue, 7 Apr 2009, Frederic Weisbecker wrote:
> > 
> > What does syscalls.h really need to pull in at this point ... is there
> > a way for it to get it without pulling in so much extra stuff?
> 
> 
> Ah, I see. I wondered somewhat when I included ftrace.h from syscall.h,
> I didn't know why it stressed me but now I understand :-)
> 
> Though this is weird that both ia32 and ia64 definitions are allowed to be
> included like that.
> 
> Anyway, I will put the syscalls tracing headers to a separate file to fix
> that, thanks for reporting this!

Yeah, probably just separate out what is needed into a file called 
syscall_trace.h ?

-- Steve


^ permalink raw reply	[flat|nested] 35+ messages in thread

* [PATCH] tracing/syscalls: use a dedicated file header
  2009-04-06 21:55   ` Tony Luck
  2009-04-06 22:12     ` Frederic Weisbecker
@ 2009-04-08 18:40     ` Frederic Weisbecker
  2009-04-08 19:36       ` Luck, Tony
  2009-04-09  4:36       ` [tip:tracing/urgent] " Frederic Weisbecker
  1 sibling, 2 replies; 35+ messages in thread
From: Frederic Weisbecker @ 2009-04-08 18:40 UTC (permalink / raw)
  To: Tony Luck, Ingo Molnar
  Cc: Linux Kernel Mailing List, Peter Zijlstra, Steven Rostedt, tglx,
	Jason Baron, Frank Ch. Eigler, Mathieu Desnoyers,
	KOSAKI Motohiro, Lai Jiangshan, Jiaying Zhang, Michael Rubin,
	Martin Bligh, Michael Davidson

On Mon, Apr 06, 2009 at 02:55:55PM -0700, Tony Luck wrote:
> On Fri, Mar 13, 2009 at 7:42 AM, Frederic Weisbecker <fweisbec@gmail.com> wrote:
> > diff --git a/include/linux/syscalls.h b/include/linux/syscalls.h
> > index ab1d772..dfe2a44 100644
> > --- a/include/linux/syscalls.h
> > +++ b/include/linux/syscalls.h
> > @@ -66,6 +66,7 @@ struct perf_counter_hw_event;
> >  #include <asm/signal.h>
> >  #include <linux/quota.h>
> >  #include <linux/key.h>
> > +#include <linux/ftrace.h>
> 
> This causes some unpleasantness in the ia64 build with CONFIG_IA32_SUPPORT=y
> 
>   CC      arch/ia64/ia32/sys_ia32.o
> In file included from arch/ia64/ia32/sys_ia32.c:55:
> arch/ia64/ia32/ia32priv.h:290:1: warning: "elf_check_arch" redefined
> In file included from include/linux/elf.h:7,
>                  from include/linux/module.h:14,
>                  from include/linux/ftrace.h:8,
>                  from include/linux/syscalls.h:68,
>                  from arch/ia64/ia32/sys_ia32.c:18:
> /home/aegl/generic-smp/arch/ia64/include/asm/elf.h:19:1: warning: this
> is the location of the previous definition
> In file included from arch/ia64/ia32/sys_ia32.c:55:
> arch/ia64/ia32/ia32priv.h:295:1: warning: "ELF_CLASS" redefined
> In file included from include/linux/elf.h:7,
>                  from include/linux/module.h:14,
>                  from include/linux/ftrace.h:8,
>                  from include/linux/syscalls.h:68,
>                  from arch/ia64/ia32/sys_ia32.c:18:
> 
> and lots more like this.  Suddenly having elf.h included in a lot of
> the ia32 compat
> files is the root problem.
> 
> What does syscalls.h really need to pull in at this point ... is there
> a way for it to get it without pulling in so much extra stuff?
> 
> -Tony


Tony, can you tell me if the patch below fixes your build?
Thanks!

--
>From fa3a401fde69822b32d04c7aebdef7493bc566a7 Mon Sep 17 00:00:00 2001
From: Frederic Weisbecker <fweisbec@gmail.com>
Date: Wed, 8 Apr 2009 20:26:11 +0200
Subject: [PATCH] tracing/syscalls: use a dedicated file header

Impact: fix a build error on IA64

(fix for 2.6.30)

Building a kernel on ia64 might trigger the following crash:

CC      arch/ia64/ia32/sys_ia32.o
In file included from arch/ia64/ia32/sys_ia32.c:55:
arch/ia64/ia32/ia32priv.h:290:1: warning: "elf_check_arch" redefined
In file included from include/linux/elf.h:7,
                 from include/linux/module.h:14,
                 from include/linux/ftrace.h:8,
                 from include/linux/syscalls.h:68,
                 from arch/ia64/ia32/sys_ia32.c:18:
/home/aegl/generic-smp/arch/ia64/include/asm/elf.h:19:1: warning: this
is the location of the previous definition
In file included from arch/ia64/ia32/sys_ia32.c:55:
arch/ia64/ia32/ia32priv.h:295:1: warning: "ELF_CLASS" redefined
In file included from include/linux/elf.h:7,
                 from include/linux/module.h:14,
                 from include/linux/ftrace.h:8,
                 from include/linux/syscalls.h:68,
                 from arch/ia64/ia32/sys_ia32.c:18:

sys_ia32.c includes linux/syscalls.h which in turn includes linux/ftrace.h
to import the syscalls tracing prototypes.
But including ftrace.h can pull too much things for a low level file,
especially on ia64 where the ia32 private headers conflict with higher level
headers.

Now we isolate the syscall tracing headers in their own lightweight file.

Reported-by: Tony Luck <tony.luck@intel.com>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
---
 arch/x86/kernel/ftrace.c      |    1 +
 arch/x86/kernel/ptrace.c      |    2 +-
 include/linux/ftrace.h        |   29 -----------------------------
 include/linux/syscalls.h      |    2 +-
 include/trace/syscall.h       |   35 +++++++++++++++++++++++++++++++++++
 kernel/trace/trace_syscalls.c |    2 +-
 6 files changed, 39 insertions(+), 32 deletions(-)
 create mode 100644 include/trace/syscall.h

diff --git a/arch/x86/kernel/ftrace.c b/arch/x86/kernel/ftrace.c
index 70a10ca..dc871e0 100644
--- a/arch/x86/kernel/ftrace.c
+++ b/arch/x86/kernel/ftrace.c
@@ -12,6 +12,7 @@
 #include <linux/spinlock.h>
 #include <linux/hardirq.h>
 #include <linux/uaccess.h>
+#include <trace/syscall.h>
 #include <linux/ftrace.h>
 #include <linux/percpu.h>
 #include <linux/sched.h>
diff --git a/arch/x86/kernel/ptrace.c b/arch/x86/kernel/ptrace.c
index b32a8ee..6318896 100644
--- a/arch/x86/kernel/ptrace.c
+++ b/arch/x86/kernel/ptrace.c
@@ -21,7 +21,7 @@
 #include <linux/audit.h>
 #include <linux/seccomp.h>
 #include <linux/signal.h>
-#include <linux/ftrace.h>
+#include <trace/syscall.h>
 #include <linux/workqueue.h>
 
 #include <asm/uaccess.h>
diff --git a/include/linux/ftrace.h b/include/linux/ftrace.h
index 33a2c20..53869be 100644
--- a/include/linux/ftrace.h
+++ b/include/linux/ftrace.h
@@ -510,33 +510,4 @@ static inline void trace_hw_branch_oops(void) {}
 
 #endif /* CONFIG_HW_BRANCH_TRACER */
 
-/*
- * A syscall entry in the ftrace syscalls array.
- *
- * @name: name of the syscall
- * @nb_args: number of parameters it takes
- * @types: list of types as strings
- * @args: list of args as strings (args[i] matches types[i])
- */
-struct syscall_metadata {
-	const char	*name;
-	int		nb_args;
-	const char	**types;
-	const char	**args;
-};
-
-#ifdef CONFIG_FTRACE_SYSCALLS
-extern void arch_init_ftrace_syscalls(void);
-extern struct syscall_metadata *syscall_nr_to_meta(int nr);
-extern void start_ftrace_syscalls(void);
-extern void stop_ftrace_syscalls(void);
-extern void ftrace_syscall_enter(struct pt_regs *regs);
-extern void ftrace_syscall_exit(struct pt_regs *regs);
-#else
-static inline void start_ftrace_syscalls(void) { }
-static inline void stop_ftrace_syscalls(void) { }
-static inline void ftrace_syscall_enter(struct pt_regs *regs) { }
-static inline void ftrace_syscall_exit(struct pt_regs *regs) { }
-#endif
-
 #endif /* _LINUX_FTRACE_H */
diff --git a/include/linux/syscalls.h b/include/linux/syscalls.h
index 471143b..35bbf67 100644
--- a/include/linux/syscalls.h
+++ b/include/linux/syscalls.h
@@ -66,7 +66,7 @@ struct perf_counter_hw_event;
 #include <asm/signal.h>
 #include <linux/quota.h>
 #include <linux/key.h>
-#include <linux/ftrace.h>
+#include <trace/syscall.h>
 
 #define __SC_DECL1(t1, a1)	t1 a1
 #define __SC_DECL2(t2, a2, ...) t2 a2, __SC_DECL1(__VA_ARGS__)
diff --git a/include/trace/syscall.h b/include/trace/syscall.h
new file mode 100644
index 0000000..27d1d31
--- /dev/null
+++ b/include/trace/syscall.h
@@ -0,0 +1,35 @@
+#ifndef _TRACE_SYSCALL_H
+#define _TRACE_SYSCALL_H
+
+#include <asm/ptrace.h>
+
+/*
+ * A syscall entry in the ftrace syscalls array.
+ *
+ * @name: name of the syscall
+ * @nb_args: number of parameters it takes
+ * @types: list of types as strings
+ * @args: list of args as strings (args[i] matches types[i])
+ */
+struct syscall_metadata {
+	const char	*name;
+	int		nb_args;
+	const char	**types;
+	const char	**args;
+};
+
+#ifdef CONFIG_FTRACE_SYSCALLS
+extern void arch_init_ftrace_syscalls(void);
+extern struct syscall_metadata *syscall_nr_to_meta(int nr);
+extern void start_ftrace_syscalls(void);
+extern void stop_ftrace_syscalls(void);
+extern void ftrace_syscall_enter(struct pt_regs *regs);
+extern void ftrace_syscall_exit(struct pt_regs *regs);
+#else
+static inline void start_ftrace_syscalls(void) { }
+static inline void stop_ftrace_syscalls(void) { }
+static inline void ftrace_syscall_enter(struct pt_regs *regs) { }
+static inline void ftrace_syscall_exit(struct pt_regs *regs) { }
+#endif
+
+#endif /* _TRACE_SYSCALL_H */
diff --git a/kernel/trace/trace_syscalls.c b/kernel/trace/trace_syscalls.c
index a2a3af2..5e57964 100644
--- a/kernel/trace/trace_syscalls.c
+++ b/kernel/trace/trace_syscalls.c
@@ -1,5 +1,5 @@
+#include <trace/syscall.h>
 #include <linux/kernel.h>
-#include <linux/ftrace.h>
 #include <asm/syscall.h>
 
 #include "trace_output.h"
-- 
1.6.1




^ permalink raw reply related	[flat|nested] 35+ messages in thread

* RE: [PATCH] tracing/syscalls: use a dedicated file header
  2009-04-08 18:40     ` [PATCH] tracing/syscalls: use a dedicated file header Frederic Weisbecker
@ 2009-04-08 19:36       ` Luck, Tony
  2009-04-08 22:44         ` Steven Rostedt
  2009-04-09  0:15         ` [GIT PULL][PATCH] " Steven Rostedt
  2009-04-09  4:36       ` [tip:tracing/urgent] " Frederic Weisbecker
  1 sibling, 2 replies; 35+ messages in thread
From: Luck, Tony @ 2009-04-08 19:36 UTC (permalink / raw)
  To: Frederic Weisbecker, Ingo Molnar
  Cc: Linux Kernel Mailing List, Peter Zijlstra, Steven Rostedt, tglx,
	Jason Baron, Frank Ch. Eigler, Mathieu Desnoyers,
	KOSAKI Motohiro, Lai Jiangshan, Jiaying Zhang, Michael Rubin,
	Martin Bligh, Michael Davidson

> Tony, can you tell me if the patch below fixes your build?
> Thanks!

Frederic,

Yes ... works great.  Builds are OK with and without CONFIG_IA32_SUPPORT.
Kernel with CONFIG_IA32_SUPPORT=y boots fine and can run x86 (32-bit) binaries.

Acked-by: Tony Luck <tony.luck@intel.com>


Thanks for fixing this so quickly.

-Tony

^ permalink raw reply	[flat|nested] 35+ messages in thread

* RE: [PATCH] tracing/syscalls: use a dedicated file header
  2009-04-08 19:36       ` Luck, Tony
@ 2009-04-08 22:44         ` Steven Rostedt
  2009-04-08 22:51           ` Luck, Tony
  2009-04-08 23:08           ` Luck, Tony
  2009-04-09  0:15         ` [GIT PULL][PATCH] " Steven Rostedt
  1 sibling, 2 replies; 35+ messages in thread
From: Steven Rostedt @ 2009-04-08 22:44 UTC (permalink / raw)
  To: Luck, Tony
  Cc: Frederic Weisbecker, Ingo Molnar, Linux Kernel Mailing List,
	Peter Zijlstra, tglx, Jason Baron, Frank Ch. Eigler,
	Mathieu Desnoyers, KOSAKI Motohiro, Lai Jiangshan, Jiaying Zhang,
	Michael Rubin, Martin Bligh, Michael Davidson



On Wed, 8 Apr 2009, Luck, Tony wrote:

> > Tony, can you tell me if the patch below fixes your build?
> > Thanks!
> 
> Frederic,
> 
> Yes ... works great.  Builds are OK with and without CONFIG_IA32_SUPPORT.
> Kernel with CONFIG_IA32_SUPPORT=y boots fine and can run x86 (32-bit) binaries.
> 
> Acked-by: Tony Luck <tony.luck@intel.com>

Great! I'll pull it into my repo and get it ready to send to tip.

Is it OK to add a:

Reported-Tested-and-Acked-by: Tony Luck <tony.luck@intel.com>

-- Steve


^ permalink raw reply	[flat|nested] 35+ messages in thread

* RE: [PATCH] tracing/syscalls: use a dedicated file header
  2009-04-08 22:44         ` Steven Rostedt
@ 2009-04-08 22:51           ` Luck, Tony
  2009-04-08 23:02             ` Steven Rostedt
  2009-04-08 23:08           ` Luck, Tony
  1 sibling, 1 reply; 35+ messages in thread
From: Luck, Tony @ 2009-04-08 22:51 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: Frederic Weisbecker, Ingo Molnar, Linux Kernel Mailing List,
	Peter Zijlstra, tglx, Jason Baron, Frank Ch. Eigler,
	Mathieu Desnoyers, KOSAKI Motohiro, Lai Jiangshan, Jiaying Zhang,
	Michael Rubin, Martin Bligh, Michael Davidson

> Great! I'll pull it into my repo and get it ready to send to tip.
>
> Is it OK to add a:
>
> Reported-Tested-and-Acked-by: Tony Luck <tony.luck@intel.com>

Maybe that will give the scripts that parse "git log" output
some grief as I don't see that tag ever used before.  I do
see a few (4) "Tested-and-Acked-by", so you could add:

Reported-by: Tony Luck <tony.luck@intel.com>
Tested-and-Acked-by: Tony Luck <tony.luck@intel.com>

-Tony

^ permalink raw reply	[flat|nested] 35+ messages in thread

* RE: [PATCH] tracing/syscalls: use a dedicated file header
  2009-04-08 22:51           ` Luck, Tony
@ 2009-04-08 23:02             ` Steven Rostedt
  0 siblings, 0 replies; 35+ messages in thread
From: Steven Rostedt @ 2009-04-08 23:02 UTC (permalink / raw)
  To: Luck, Tony
  Cc: Frederic Weisbecker, Ingo Molnar, Linux Kernel Mailing List,
	Peter Zijlstra, tglx, Jason Baron, Frank Ch. Eigler,
	Mathieu Desnoyers, KOSAKI Motohiro, Lai Jiangshan, Jiaying Zhang,
	Michael Rubin, Martin Bligh, Michael Davidson


On Wed, 8 Apr 2009, Luck, Tony wrote:

> > Great! I'll pull it into my repo and get it ready to send to tip.
> >
> > Is it OK to add a:
> >
> > Reported-Tested-and-Acked-by: Tony Luck <tony.luck@intel.com>
> 
> Maybe that will give the scripts that parse "git log" output
> some grief as I don't see that tag ever used before.  I do

I know, I wanted to be the first ;-)

> see a few (4) "Tested-and-Acked-by", so you could add:
> 
> Reported-by: Tony Luck <tony.luck@intel.com>
> Tested-and-Acked-by: Tony Luck <tony.luck@intel.com>

OK, I'll do it that way.

Thanks,

-- Steve


^ permalink raw reply	[flat|nested] 35+ messages in thread

* RE: [PATCH] tracing/syscalls: use a dedicated file header
  2009-04-08 22:44         ` Steven Rostedt
  2009-04-08 22:51           ` Luck, Tony
@ 2009-04-08 23:08           ` Luck, Tony
  2009-04-08 23:32             ` Steven Rostedt
  2009-04-08 23:32             ` Joe Perches
  1 sibling, 2 replies; 35+ messages in thread
From: Luck, Tony @ 2009-04-08 23:08 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: Frederic Weisbecker, Ingo Molnar, Linux Kernel Mailing List,
	Peter Zijlstra, tglx, Jason Baron, Frank Ch. Eigler,
	Mathieu Desnoyers, KOSAKI Motohiro, Lai Jiangshan, Jiaying Zhang,
	Michael Rubin, Martin Bligh, Michael Davidson

> Maybe that will give the scripts that parse "git log" output
> some grief as I don't see that tag ever used before.

It appears that any scripts that venture into the depths of
the git logs need to be quite flexible ... people have already
been quite creative about the tags used.  My favourite is

        "Acked-with-apologies-by:"

though the operatic possibilities of

        "Singed-off-by"

are also amusing.

Here's the current list with frequency counts:

 280098     Signed-off-by:
  13893     Acked-by:
   1234     Reported-by:
   1217     Reviewed-by:
   1135     Tested-by:
     74     Reported-and-tested-by:
     57     Signed-of-by:
     35     Suggested-by:
     30     Requested-by:
     30     Bisected-by:
     27     signed-off-by:
     24     Reported-and-bisected-by:
     22     Signed-Off-by:
     20     Pointed-out-by:
     18     Singed-off-by:
     15     Debugged-by:
     14     Noticed-by:
     13     Found-by:
     10     Spotted-by:
      9     Fixed-by:
      9     ACKed-by:
      8     Acked-off-by:
      7     Noted-by:
      7     Acked-and-tested-by:
      6     Submitted-by:
      6     Diagnosed-by:
      5     Inspired-by:
      4     Written-by:
      4     Tested-and-Acked-by:
      4     Sighed-off-by:
      4     Reported-and-Tested-by:
      4     Report-by:
      3     Tested-and-acked-by:
      3     SIgned-off-by:
      3     Signed-by:
      3     Pointed-to-by:
      3     Modified-by:
      3     Located-by:
      3     Ackde-by:
      2     Testted-by:
      2     Signed_off-by:
      2     Sigend-off-by:
      2     Screwed-up-by:
      2     Reviewd-by:
      2     Idea-by:
      2     Grudgingly-acked-by:
      2     Earlier-version-tested-by:
      2     Discovered-by:
      2     Cc: Acked-by:
      2     Ack'd-by:
      2     Ack-by:
      1     With comment fixes Signed-off-by:
      1     Verified-by:
      1     Triaged-by:
      1     This patch does kmalloc + memset conversion to kzalloc anSigned-off-by:
      1     their Signed-off-by:
      1     Tested-and-requested-by:
      1     Tested-and-reported-by:
      1     Test-by:
      1     Tentatively-acked-by:
      1     Suggestd-by:
      1     sSigned-off-by:
      1     Sogned-off-by:
      1     Sign-off-by:
      1     Signe-off-by:
      1     Signen-off-by:
      1     signef-off-by:
      1     Signed-pff-by:
      1     Signed-off-by-by:
      1     Signed-off-and-morning-tea-spilled-by:
      1     Signed-ff-by:
      1     Sight-catched-by:
      1     Siged-off-by:
      1     Serial-parts-Acked-by:
      1     Sent-by:
      1     Rewieved-by:
      1     Reviewed-off-by:
      1     Reveiewed-by:
      1     Requested-and-tested-by:
      1     Reproduced-by:
      1     Reported-tested-and-acked-by:
      1     Reported--by:
      1     Reported-Bisected-Tested-by:
      1     Reported- and tested-by:
      1     Reported-and-debugged-by:
      1     Reported-and-argued-for-by:
      1     Reported-and-acked-by:
      1     Repented-by:
      1     Reminded-by:
      1     Pointed-out-and-tested-by:
      1     Patch-dusted-off-by:
      1     Original-code-by:
      1     Original-by:
      1     ned-off-by:
      1     Narrowed-down-by:
      1     Mostly-acked-by:
      1     Most-Definitely-Acked-by:
      1     Most contributed and Signed-off-by:
      1     More-or-less-tested-by:
      1     Mentored-by:
      1     Lots of updates from http://lkml.org/lkml/2008/5/20/32 Reported-by:
      1     Laughed-at-by:
      1     Jffs2-bit-acked-by:
      1     igned-off-by:
      1     Identified-by:
      1     Heckled-for-on-IRC-by:
      1     From: Unichrome Project http://unichrome.sf.net, Erdi Chen, Thomas Hellstrom    Signed-off-by:
      1     Forwarded-by:
      1     Foreseen-by:
      1     Explain what we use Acked-by:
      1     Explained-by:
      1     Eventually-typed-in-by:
      1     echo Signed-off-by:
      1     Confirmed-by:
      1     Compile-tested-by:
      1     Checked-by:
      1     Caught-by:
      1     Bug-found-by:
      1     Bitten-by-and-tested-by:
      1     Bisected-and-tested-by:
      1     Bisected-and-requested-by:
      1     Bisected-and-reported-by:
      1     Based-on-original-patch-by:
      1     As the idea originated from GregKH, I leave his Signed-off-by:
      1     Approved-by:
      1     AOLed-by:
      1     Aked-by:
      1     actually use it.  Kill this dead code.      Signed-off-by:
      1     Ackey-by:
      1     Acked-with-apologies-by:
      1     Acked-the-tulip-bit-by:
      1     Acked-the-net-bits-by:
      1     Aced-by:

^ permalink raw reply	[flat|nested] 35+ messages in thread

* RE: [PATCH] tracing/syscalls: use a dedicated file header
  2009-04-08 23:08           ` Luck, Tony
@ 2009-04-08 23:32             ` Steven Rostedt
  2009-04-08 23:32             ` Joe Perches
  1 sibling, 0 replies; 35+ messages in thread
From: Steven Rostedt @ 2009-04-08 23:32 UTC (permalink / raw)
  To: Luck, Tony
  Cc: Frederic Weisbecker, Ingo Molnar, Linux Kernel Mailing List,
	Peter Zijlstra, tglx, Jason Baron, Frank Ch. Eigler,
	Mathieu Desnoyers, KOSAKI Motohiro, Lai Jiangshan, Jiaying Zhang,
	Michael Rubin, Martin Bligh, Michael Davidson



On Wed, 8 Apr 2009, Luck, Tony wrote:

> > Maybe that will give the scripts that parse "git log" output
> > some grief as I don't see that tag ever used before.
> 
> It appears that any scripts that venture into the depths of
> the git logs need to be quite flexible ... people have already
> been quite creative about the tags used.  My favourite is
> 
>         "Acked-with-apologies-by:"
> 
> though the operatic possibilities of
> 
>         "Singed-off-by"
> 
> are also amusing.
> 
> Here's the current list with frequency counts:
> 

Funny, OK, I'm going back to my original.

Thanks,

-- Steve

>  280098     Signed-off-by:
>   13893     Acked-by:
>    1234     Reported-by:
>    1217     Reviewed-by:
>    1135     Tested-by:
>      74     Reported-and-tested-by:
>      57     Signed-of-by:
>      35     Suggested-by:
>      30     Requested-by:
>      30     Bisected-by:
>      27     signed-off-by:
>      24     Reported-and-bisected-by:
>      22     Signed-Off-by:
>      20     Pointed-out-by:
>      18     Singed-off-by:
>      15     Debugged-by:
>      14     Noticed-by:
>      13     Found-by:
>      10     Spotted-by:
>       9     Fixed-by:
>       9     ACKed-by:
>       8     Acked-off-by:
>       7     Noted-by:
>       7     Acked-and-tested-by:
>       6     Submitted-by:
>       6     Diagnosed-by:
>       5     Inspired-by:
>       4     Written-by:
>       4     Tested-and-Acked-by:
>       4     Sighed-off-by:
>       4     Reported-and-Tested-by:
>       4     Report-by:
>       3     Tested-and-acked-by:
>       3     SIgned-off-by:
>       3     Signed-by:
>       3     Pointed-to-by:
>       3     Modified-by:
>       3     Located-by:
>       3     Ackde-by:
>       2     Testted-by:
>       2     Signed_off-by:
>       2     Sigend-off-by:
>       2     Screwed-up-by:
>       2     Reviewd-by:
>       2     Idea-by:
>       2     Grudgingly-acked-by:
>       2     Earlier-version-tested-by:
>       2     Discovered-by:
>       2     Cc: Acked-by:
>       2     Ack'd-by:
>       2     Ack-by:
>       1     With comment fixes Signed-off-by:
>       1     Verified-by:
>       1     Triaged-by:
>       1     This patch does kmalloc + memset conversion to kzalloc anSigned-off-by:
>       1     their Signed-off-by:
>       1     Tested-and-requested-by:
>       1     Tested-and-reported-by:
>       1     Test-by:
>       1     Tentatively-acked-by:
>       1     Suggestd-by:
>       1     sSigned-off-by:
>       1     Sogned-off-by:
>       1     Sign-off-by:
>       1     Signe-off-by:
>       1     Signen-off-by:
>       1     signef-off-by:
>       1     Signed-pff-by:
>       1     Signed-off-by-by:
>       1     Signed-off-and-morning-tea-spilled-by:
>       1     Signed-ff-by:
>       1     Sight-catched-by:
>       1     Siged-off-by:
>       1     Serial-parts-Acked-by:
>       1     Sent-by:
>       1     Rewieved-by:
>       1     Reviewed-off-by:
>       1     Reveiewed-by:
>       1     Requested-and-tested-by:
>       1     Reproduced-by:
>       1     Reported-tested-and-acked-by:
>       1     Reported--by:
>       1     Reported-Bisected-Tested-by:
>       1     Reported- and tested-by:
>       1     Reported-and-debugged-by:
>       1     Reported-and-argued-for-by:
>       1     Reported-and-acked-by:
>       1     Repented-by:
>       1     Reminded-by:
>       1     Pointed-out-and-tested-by:
>       1     Patch-dusted-off-by:
>       1     Original-code-by:
>       1     Original-by:
>       1     ned-off-by:
>       1     Narrowed-down-by:
>       1     Mostly-acked-by:
>       1     Most-Definitely-Acked-by:
>       1     Most contributed and Signed-off-by:
>       1     More-or-less-tested-by:
>       1     Mentored-by:
>       1     Lots of updates from http://lkml.org/lkml/2008/5/20/32 Reported-by:
>       1     Laughed-at-by:
>       1     Jffs2-bit-acked-by:
>       1     igned-off-by:
>       1     Identified-by:
>       1     Heckled-for-on-IRC-by:
>       1     From: Unichrome Project http://unichrome.sf.net, Erdi Chen, Thomas Hellstrom    Signed-off-by:
>       1     Forwarded-by:
>       1     Foreseen-by:
>       1     Explain what we use Acked-by:
>       1     Explained-by:
>       1     Eventually-typed-in-by:
>       1     echo Signed-off-by:
>       1     Confirmed-by:
>       1     Compile-tested-by:
>       1     Checked-by:
>       1     Caught-by:
>       1     Bug-found-by:
>       1     Bitten-by-and-tested-by:
>       1     Bisected-and-tested-by:
>       1     Bisected-and-requested-by:
>       1     Bisected-and-reported-by:
>       1     Based-on-original-patch-by:
>       1     As the idea originated from GregKH, I leave his Signed-off-by:
>       1     Approved-by:
>       1     AOLed-by:
>       1     Aked-by:
>       1     actually use it.  Kill this dead code.      Signed-off-by:
>       1     Ackey-by:
>       1     Acked-with-apologies-by:
>       1     Acked-the-tulip-bit-by:
>       1     Acked-the-net-bits-by:
>       1     Aced-by:
> 

^ permalink raw reply	[flat|nested] 35+ messages in thread

* RE: [PATCH] tracing/syscalls: use a dedicated file header
  2009-04-08 23:08           ` Luck, Tony
  2009-04-08 23:32             ` Steven Rostedt
@ 2009-04-08 23:32             ` Joe Perches
  1 sibling, 0 replies; 35+ messages in thread
From: Joe Perches @ 2009-04-08 23:32 UTC (permalink / raw)
  To: Luck, Tony
  Cc: Steven Rostedt, Frederic Weisbecker, Ingo Molnar,
	Linux Kernel Mailing List, Peter Zijlstra, tglx, Jason Baron,
	Frank Ch. Eigler, Mathieu Desnoyers, KOSAKI Motohiro,
	Lai Jiangshan, Jiaying Zhang, Michael Rubin, Martin Bligh,
	Michael Davidson

On Wed, 2009-04-08 at 16:08 -0700, Luck, Tony wrote:
> Here's the current list with frequency counts:

It's case sensitive.  Try the "-By:"'s too.

My favorite is:  Brown-Paper-Bag-Worn-By


^ permalink raw reply	[flat|nested] 35+ messages in thread

* [GIT PULL][PATCH] tracing/syscalls: use a dedicated file header
  2009-04-08 19:36       ` Luck, Tony
  2009-04-08 22:44         ` Steven Rostedt
@ 2009-04-09  0:15         ` Steven Rostedt
  1 sibling, 0 replies; 35+ messages in thread
From: Steven Rostedt @ 2009-04-09  0:15 UTC (permalink / raw)
  To: Linux Kernel Mailing List
  Cc: Luck, Tony, Frederic Weisbecker, Ingo Molnar, Peter Zijlstra,
	tglx, Jason Baron, Frank Ch. Eigler, Mathieu Desnoyers,
	KOSAKI Motohiro, Lai Jiangshan, Jiaying Zhang, Michael Rubin,
	Martin Bligh, Michael Davidson


Ingo,

Please pull the latest tip/tracing/core tree, which can be found at:

  git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-2.6-trace.git
tip/tracing/core


Frederic Weisbecker (1):
      tracing/syscalls: use a dedicated file header

----
 arch/x86/kernel/ftrace.c      |    2 ++
 arch/x86/kernel/ptrace.c      |    2 ++
 include/linux/ftrace.h        |   29 -----------------------------
 include/linux/syscalls.h      |    2 +-
 include/trace/syscall.h       |   35 +++++++++++++++++++++++++++++++++++
 kernel/trace/trace_syscalls.c |    2 +-
 6 files changed, 41 insertions(+), 31 deletions(-)
---------------------------
commit e733d7da21b91da7cf593ca16523b60ae38f32fb
Author: Frederic Weisbecker <fweisbec@gmail.com>
Date:   Wed Apr 8 20:26:11 2009 +0200

    tracing/syscalls: use a dedicated file header
    
    Impact: fix a build error on IA64
    
    Building a kernel on ia64 might trigger the following crash:
    
    CC      arch/ia64/ia32/sys_ia32.o
    In file included from arch/ia64/ia32/sys_ia32.c:55:
    arch/ia64/ia32/ia32priv.h:290:1: warning: "elf_check_arch" redefined
    In file included from include/linux/elf.h:7,
                     from include/linux/module.h:14,
                     from include/linux/ftrace.h:8,
                     from include/linux/syscalls.h:68,
                     from arch/ia64/ia32/sys_ia32.c:18:
    /home/aegl/generic-smp/arch/ia64/include/asm/elf.h:19:1: warning: this
    is the location of the previous definition
    In file included from arch/ia64/ia32/sys_ia32.c:55:
    arch/ia64/ia32/ia32priv.h:295:1: warning: "ELF_CLASS" redefined
    In file included from include/linux/elf.h:7,
                     from include/linux/module.h:14,
                     from include/linux/ftrace.h:8,
                     from include/linux/syscalls.h:68,
                     from arch/ia64/ia32/sys_ia32.c:18:
    
    sys_ia32.c includes linux/syscalls.h which in turn includes linux/ftrace.h
    to import the syscalls tracing prototypes.
    But including ftrace.h can pull too much things for a low level file,
    especially on ia64 where the ia32 private headers conflict with higher level
    headers.
    
    Now we isolate the syscall tracing headers in their own lightweight file.
    
    [ moved placement of #include <trace/syscall.h - Steven Rostedt ]
    
    Reported-Tested-and-Acked-by: Tony Luck <tony.luck@intel.com>
    Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
    LKML-Reference: <20090408184058.GB6017@nowhere>
    Signed-off-by: Steven Rostedt <rostedt@goodmis.org>

diff --git a/arch/x86/kernel/ftrace.c b/arch/x86/kernel/ftrace.c
index 70a10ca..18dfa30 100644
--- a/arch/x86/kernel/ftrace.c
+++ b/arch/x86/kernel/ftrace.c
@@ -18,6 +18,8 @@
 #include <linux/init.h>
 #include <linux/list.h>
 
+#include <trace/syscall.h>
+
 #include <asm/cacheflush.h>
 #include <asm/ftrace.h>
 #include <asm/nops.h>
diff --git a/arch/x86/kernel/ptrace.c b/arch/x86/kernel/ptrace.c
index fe9345c..46609cc 100644
--- a/arch/x86/kernel/ptrace.c
+++ b/arch/x86/kernel/ptrace.c
@@ -23,6 +23,8 @@
 #include <linux/signal.h>
 #include <linux/ftrace.h>
 
+#include <trace/syscall.h>
+
 #include <asm/uaccess.h>
 #include <asm/pgtable.h>
 #include <asm/system.h>
diff --git a/include/linux/ftrace.h b/include/linux/ftrace.h
index 6aea54d..a5fa36d 100644
--- a/include/linux/ftrace.h
+++ b/include/linux/ftrace.h
@@ -510,33 +510,4 @@ static inline void trace_hw_branch_oops(void) {}
 
 #endif /* CONFIG_HW_BRANCH_TRACER */
 
-/*
- * A syscall entry in the ftrace syscalls array.
- *
- * @name: name of the syscall
- * @nb_args: number of parameters it takes
- * @types: list of types as strings
- * @args: list of args as strings (args[i] matches types[i])
- */
-struct syscall_metadata {
-	const char	*name;
-	int		nb_args;
-	const char	**types;
-	const char	**args;
-};
-
-#ifdef CONFIG_FTRACE_SYSCALLS
-extern void arch_init_ftrace_syscalls(void);
-extern struct syscall_metadata *syscall_nr_to_meta(int nr);
-extern void start_ftrace_syscalls(void);
-extern void stop_ftrace_syscalls(void);
-extern void ftrace_syscall_enter(struct pt_regs *regs);
-extern void ftrace_syscall_exit(struct pt_regs *regs);
-#else
-static inline void start_ftrace_syscalls(void) { }
-static inline void stop_ftrace_syscalls(void) { }
-static inline void ftrace_syscall_enter(struct pt_regs *regs) { }
-static inline void ftrace_syscall_exit(struct pt_regs *regs) { }
-#endif
-
 #endif /* _LINUX_FTRACE_H */
diff --git a/include/linux/syscalls.h b/include/linux/syscalls.h
index 6470f74..dabe4ad 100644
--- a/include/linux/syscalls.h
+++ b/include/linux/syscalls.h
@@ -65,7 +65,7 @@ struct old_linux_dirent;
 #include <asm/signal.h>
 #include <linux/quota.h>
 #include <linux/key.h>
-#include <linux/ftrace.h>
+#include <trace/syscall.h>
 
 #define __SC_DECL1(t1, a1)	t1 a1
 #define __SC_DECL2(t2, a2, ...) t2 a2, __SC_DECL1(__VA_ARGS__)
diff --git a/include/trace/syscall.h b/include/trace/syscall.h
new file mode 100644
index 0000000..27d1d31
--- /dev/null
+++ b/include/trace/syscall.h
@@ -0,0 +1,35 @@
+#ifndef _TRACE_SYSCALL_H
+#define _TRACE_SYSCALL_H
+
+#include <asm/ptrace.h>
+
+/*
+ * A syscall entry in the ftrace syscalls array.
+ *
+ * @name: name of the syscall
+ * @nb_args: number of parameters it takes
+ * @types: list of types as strings
+ * @args: list of args as strings (args[i] matches types[i])
+ */
+struct syscall_metadata {
+	const char	*name;
+	int		nb_args;
+	const char	**types;
+	const char	**args;
+};
+
+#ifdef CONFIG_FTRACE_SYSCALLS
+extern void arch_init_ftrace_syscalls(void);
+extern struct syscall_metadata *syscall_nr_to_meta(int nr);
+extern void start_ftrace_syscalls(void);
+extern void stop_ftrace_syscalls(void);
+extern void ftrace_syscall_enter(struct pt_regs *regs);
+extern void ftrace_syscall_exit(struct pt_regs *regs);
+#else
+static inline void start_ftrace_syscalls(void) { }
+static inline void stop_ftrace_syscalls(void) { }
+static inline void ftrace_syscall_enter(struct pt_regs *regs) { }
+static inline void ftrace_syscall_exit(struct pt_regs *regs) { }
+#endif
+
+#endif /* _TRACE_SYSCALL_H */
diff --git a/kernel/trace/trace_syscalls.c b/kernel/trace/trace_syscalls.c
index a2a3af2..5e57964 100644
--- a/kernel/trace/trace_syscalls.c
+++ b/kernel/trace/trace_syscalls.c
@@ -1,5 +1,5 @@
+#include <trace/syscall.h>
 #include <linux/kernel.h>
-#include <linux/ftrace.h>
 #include <asm/syscall.h>
 
 #include "trace_output.h"



^ permalink raw reply related	[flat|nested] 35+ messages in thread

* [tip:tracing/urgent] tracing/syscalls: use a dedicated file header
  2009-04-08 18:40     ` [PATCH] tracing/syscalls: use a dedicated file header Frederic Weisbecker
  2009-04-08 19:36       ` Luck, Tony
@ 2009-04-09  4:36       ` Frederic Weisbecker
  1 sibling, 0 replies; 35+ messages in thread
From: Frederic Weisbecker @ 2009-04-09  4:36 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: mathieu.desnoyers, mingo, a.p.zijlstra, tony.luck, fweisbec,
	rostedt, md, tglx, jbaron, laijs, hpa, fche, jiayingz,
	linux-kernel, mrubin, kosaki.motohiro, mingo, mbligh

Commit-ID:  47788c58e66c050982241d9a05eb690daceb05a9
Gitweb:     http://git.kernel.org/tip/47788c58e66c050982241d9a05eb690daceb05a9
Author:     Frederic Weisbecker <fweisbec@gmail.com>
AuthorDate: Wed, 8 Apr 2009 20:40:59 +0200
Committer:  Ingo Molnar <mingo@elte.hu>
CommitDate: Thu, 9 Apr 2009 05:43:32 +0200

tracing/syscalls: use a dedicated file header

Impact: fix build warnings and possibe compat misbehavior on IA64

Building a kernel on ia64 might trigger these ugly build warnings:

CC      arch/ia64/ia32/sys_ia32.o
In file included from arch/ia64/ia32/sys_ia32.c:55:
arch/ia64/ia32/ia32priv.h:290:1: warning: "elf_check_arch" redefined
In file included from include/linux/elf.h:7,
                 from include/linux/module.h:14,
                 from include/linux/ftrace.h:8,
                 from include/linux/syscalls.h:68,
                 from arch/ia64/ia32/sys_ia32.c:18:
arch/ia64/include/asm/elf.h:19:1: warning: this is the location of the previous definition
[...]

sys_ia32.c includes linux/syscalls.h which in turn includes linux/ftrace.h
to import the syscalls tracing prototypes.

But including ftrace.h can pull too much things for a low level file,
especially on ia64 where the ia32 private headers conflict with higher
level headers.

Now we isolate the syscall tracing headers in their own lightweight file.

Reported-by: Tony Luck <tony.luck@intel.com>
Tested-by: Tony Luck <tony.luck@intel.com>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Acked-by: Tony Luck <tony.luck@intel.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Jason Baron <jbaron@redhat.com>
Cc: "Frank Ch. Eigler" <fche@redhat.com>
Cc: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Lai Jiangshan <laijs@cn.fujitsu.com>
Cc: Jiaying Zhang <jiayingz@google.com>
Cc: Michael Rubin <mrubin@google.com>
Cc: Martin Bligh <mbligh@google.com>
Cc: Michael Davidson <md@google.com>
LKML-Reference: <20090408184058.GB6017@nowhere>
Signed-off-by: Ingo Molnar <mingo@elte.hu>


---
 arch/x86/kernel/ftrace.c      |    2 ++
 arch/x86/kernel/ptrace.c      |    3 ++-
 include/linux/ftrace.h        |   29 -----------------------------
 include/linux/syscalls.h      |    2 +-
 include/trace/syscall.h       |   35 +++++++++++++++++++++++++++++++++++
 kernel/trace/trace_syscalls.c |    2 +-
 6 files changed, 41 insertions(+), 32 deletions(-)

diff --git a/arch/x86/kernel/ftrace.c b/arch/x86/kernel/ftrace.c
index 70a10ca..18dfa30 100644
--- a/arch/x86/kernel/ftrace.c
+++ b/arch/x86/kernel/ftrace.c
@@ -18,6 +18,8 @@
 #include <linux/init.h>
 #include <linux/list.h>
 
+#include <trace/syscall.h>
+
 #include <asm/cacheflush.h>
 #include <asm/ftrace.h>
 #include <asm/nops.h>
diff --git a/arch/x86/kernel/ptrace.c b/arch/x86/kernel/ptrace.c
index fe9345c..23b7c8f 100644
--- a/arch/x86/kernel/ptrace.c
+++ b/arch/x86/kernel/ptrace.c
@@ -21,7 +21,6 @@
 #include <linux/audit.h>
 #include <linux/seccomp.h>
 #include <linux/signal.h>
-#include <linux/ftrace.h>
 
 #include <asm/uaccess.h>
 #include <asm/pgtable.h>
@@ -35,6 +34,8 @@
 #include <asm/proto.h>
 #include <asm/ds.h>
 
+#include <trace/syscall.h>
+
 #include "tls.h"
 
 enum x86_regset {
diff --git a/include/linux/ftrace.h b/include/linux/ftrace.h
index ff112a8..8a0c2f2 100644
--- a/include/linux/ftrace.h
+++ b/include/linux/ftrace.h
@@ -511,33 +511,4 @@ static inline void trace_hw_branch_oops(void) {}
 
 #endif /* CONFIG_HW_BRANCH_TRACER */
 
-/*
- * A syscall entry in the ftrace syscalls array.
- *
- * @name: name of the syscall
- * @nb_args: number of parameters it takes
- * @types: list of types as strings
- * @args: list of args as strings (args[i] matches types[i])
- */
-struct syscall_metadata {
-	const char	*name;
-	int		nb_args;
-	const char	**types;
-	const char	**args;
-};
-
-#ifdef CONFIG_FTRACE_SYSCALLS
-extern void arch_init_ftrace_syscalls(void);
-extern struct syscall_metadata *syscall_nr_to_meta(int nr);
-extern void start_ftrace_syscalls(void);
-extern void stop_ftrace_syscalls(void);
-extern void ftrace_syscall_enter(struct pt_regs *regs);
-extern void ftrace_syscall_exit(struct pt_regs *regs);
-#else
-static inline void start_ftrace_syscalls(void) { }
-static inline void stop_ftrace_syscalls(void) { }
-static inline void ftrace_syscall_enter(struct pt_regs *regs) { }
-static inline void ftrace_syscall_exit(struct pt_regs *regs) { }
-#endif
-
 #endif /* _LINUX_FTRACE_H */
diff --git a/include/linux/syscalls.h b/include/linux/syscalls.h
index 6470f74..dabe4ad 100644
--- a/include/linux/syscalls.h
+++ b/include/linux/syscalls.h
@@ -65,7 +65,7 @@ struct old_linux_dirent;
 #include <asm/signal.h>
 #include <linux/quota.h>
 #include <linux/key.h>
-#include <linux/ftrace.h>
+#include <trace/syscall.h>
 
 #define __SC_DECL1(t1, a1)	t1 a1
 #define __SC_DECL2(t2, a2, ...) t2 a2, __SC_DECL1(__VA_ARGS__)
diff --git a/include/trace/syscall.h b/include/trace/syscall.h
new file mode 100644
index 0000000..8cfe515
--- /dev/null
+++ b/include/trace/syscall.h
@@ -0,0 +1,35 @@
+#ifndef _TRACE_SYSCALL_H
+#define _TRACE_SYSCALL_H
+
+#include <asm/ptrace.h>
+
+/*
+ * A syscall entry in the ftrace syscalls array.
+ *
+ * @name: name of the syscall
+ * @nb_args: number of parameters it takes
+ * @types: list of types as strings
+ * @args: list of args as strings (args[i] matches types[i])
+ */
+struct syscall_metadata {
+	const char	*name;
+	int		nb_args;
+	const char	**types;
+	const char	**args;
+};
+
+#ifdef CONFIG_FTRACE_SYSCALLS
+extern void arch_init_ftrace_syscalls(void);
+extern struct syscall_metadata *syscall_nr_to_meta(int nr);
+extern void start_ftrace_syscalls(void);
+extern void stop_ftrace_syscalls(void);
+extern void ftrace_syscall_enter(struct pt_regs *regs);
+extern void ftrace_syscall_exit(struct pt_regs *regs);
+#else
+static inline void start_ftrace_syscalls(void)			{ }
+static inline void stop_ftrace_syscalls(void)			{ }
+static inline void ftrace_syscall_enter(struct pt_regs *regs)	{ }
+static inline void ftrace_syscall_exit(struct pt_regs *regs)	{ }
+#endif
+
+#endif /* _TRACE_SYSCALL_H */
diff --git a/kernel/trace/trace_syscalls.c b/kernel/trace/trace_syscalls.c
index a2a3af2..5e57964 100644
--- a/kernel/trace/trace_syscalls.c
+++ b/kernel/trace/trace_syscalls.c
@@ -1,5 +1,5 @@
+#include <trace/syscall.h>
 #include <linux/kernel.h>
-#include <linux/ftrace.h>
 #include <asm/syscall.h>
 
 #include "trace_output.h"

^ permalink raw reply related	[flat|nested] 35+ messages in thread

end of thread, other threads:[~2009-04-09  4:38 UTC | newest]

Thread overview: 35+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2009-03-13 14:42 [PATCH 0/2 v2] Syscalls tracing Frederic Weisbecker
2009-03-13 14:42 ` [PATCH 1/2 v2] tracing/syscalls: core infrastructure for syscalls tracing Frederic Weisbecker
2009-03-13 16:09   ` [tip:tracing/syscalls] tracing/syscalls: core infrastructure for syscalls tracing, enhancements Frederic Weisbecker
2009-03-15  3:51   ` [PATCH 1/2 v2] tracing/syscalls: core infrastructure for syscalls tracing Andrew Morton
2009-03-15  4:59     ` Ingo Molnar
2009-03-15  8:02       ` Andrew Morton
2009-03-15 16:17         ` Frederic Weisbecker
2009-04-06 21:55   ` Tony Luck
2009-04-06 22:12     ` Frederic Weisbecker
2009-04-07  0:30       ` Steven Rostedt
2009-04-08 18:40     ` [PATCH] tracing/syscalls: use a dedicated file header Frederic Weisbecker
2009-04-08 19:36       ` Luck, Tony
2009-04-08 22:44         ` Steven Rostedt
2009-04-08 22:51           ` Luck, Tony
2009-04-08 23:02             ` Steven Rostedt
2009-04-08 23:08           ` Luck, Tony
2009-04-08 23:32             ` Steven Rostedt
2009-04-08 23:32             ` Joe Perches
2009-04-09  0:15         ` [GIT PULL][PATCH] " Steven Rostedt
2009-04-09  4:36       ` [tip:tracing/urgent] " Frederic Weisbecker
2009-03-13 14:42 ` [PATCH 2/2 v2] tracing/syscalls: support for syscalls tracing on x86-64 Frederic Weisbecker
2009-03-13 16:09   ` [tip:tracing/syscalls] tracing/syscalls: support for syscalls tracing on x86 Frederic Weisbecker
2009-03-15  3:53   ` [PATCH 2/2 v2] tracing/syscalls: support for syscalls tracing on x86-64 Andrew Morton
2009-03-15  5:30     ` Ingo Molnar
2009-03-15 16:05       ` Frederic Weisbecker
2009-03-13 15:16 ` [PATCH 0/2 v2] Syscalls tracing Frederic Weisbecker
2009-03-13 15:55   ` Ingo Molnar
2009-03-13 16:17     ` Ingo Molnar
2009-03-15 15:25       ` Frederic Weisbecker
2009-03-13 16:47   ` Mathieu Desnoyers
2009-03-15 16:01     ` Frederic Weisbecker
2009-03-23 16:32       ` Mathieu Desnoyers
2009-03-23 19:27         ` Frederic Weisbecker
2009-03-23 19:40           ` Ingo Molnar
2009-03-23 20:37             ` Mathieu Desnoyers

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).