[RFC 0/4] trace,perf: Enabling ftrace:function tracepoint for perf

linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed

* [RFC 0/4] trace,perf: Enabling ftrace:function tracepoint for perf
@ 2011-07-11 13:22 Jiri Olsa
  2011-07-11 13:22 ` [RFC 1/4] perf, ftrace: Add perf support to use function tracepoint Jiri Olsa
                   ` (3 more replies)
  0 siblings, 4 replies; 11+ messages in thread
From: Jiri Olsa @ 2011-07-11 13:22 UTC (permalink / raw)
  To: rostedt, fweisbec, a.p.zijlstra; +Cc: linux-kernel

hi,

attached patches add perf support for function trace.

- Patch 1 adds support to register ftrace:function tracepoint.

- Patch 2 adds a way to used filtering for ip and parent_ip fields.

- Patches 3 and 4 are adding new ioctl to specify the filter on the
  "struct ftrace" level.

  I know this is probably not a good idea, but I dont see a good way
  to combine ftrace_ops filtering with tracepoint filtering other than
  keep them separated.

  My other guess is instead of ioctl use some keyword to specify functions
  followed by standard filter.. something like:

    "FUNCTION_TRACE(sys_open sys_close sys_read) (ip == sys_open)"

  Please take this more as a question about interface than forcing
  new ioctl.. ;)


attached patches:
 - 1/4 perf, ftrace: Add perf support to use function tracepoint
 - 2/4 perf, ftrace: Add filter support
 - 3/4 perf, ftrace: Add new perf ioctl for function trace filter
 - 4/4 perf, ftrace: Add ftrace option for perf stat command


thanks for comments,
jirka
---
 include/linux/ftrace.h             |    7 +-
 include/linux/ftrace_event.h       |    9 +-
 include/linux/perf_event.h         |    2 +
 kernel/events/core.c               |   14 +++
 kernel/trace/ftrace.c              |   34 +++++-
 kernel/trace/trace.h               |    5 +
 kernel/trace/trace_event_perf.c    |  225 +++++++++++++++++++++++++++++-------
 kernel/trace/trace_events.c        |   10 +-
 kernel/trace/trace_events_filter.c |   32 ++++-
 kernel/trace/trace_export.c        |   21 ++++
 kernel/trace/trace_kprobe.c        |    6 +-
 kernel/trace/trace_syscalls.c      |   14 ++-
 tools/perf/builtin-stat.c          |    2 +
 tools/perf/util/evlist.c           |   26 ++++-
 tools/perf/util/evlist.h           |    2 +
 tools/perf/util/evsel.h            |    1 +
 tools/perf/util/parse-events.c     |   37 ++++++-
 tools/perf/util/parse-events.h     |    1 +
 18 files changed, 376 insertions(+), 72 deletions(-)

^ permalink raw reply	[flat|nested] 11+ messages in thread

* [RFC 1/4] perf, ftrace: Add perf support to use function tracepoint
  2011-07-11 13:22 [RFC 0/4] trace,perf: Enabling ftrace:function tracepoint for perf Jiri Olsa
@ 2011-07-11 13:22 ` Jiri Olsa
  2011-07-11 13:22 ` [RFC 2/4] perf, ftrace: Add filter support for ftrace:function tracepoint Jiri Olsa
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 11+ messages in thread
From: Jiri Olsa @ 2011-07-11 13:22 UTC (permalink / raw)
  To: rostedt, fweisbec, a.p.zijlstra; +Cc: linux-kernel

Adding perf support to use function tracepoint.

The ftrace:function tracepoint could be now open within the perf and
returns number of processed kernel functions like:

 # ./perf stat -e ftrace:function  sleep 1

 Performance counter stats for 'sleep 1':

             7,488 ftrace:function                                             

       1.002918678 seconds time elapsed

Details:
- added "struct ftrace_ops" into "struct perf_event", so each event
  could define its own filter

- added TRACE_REG_PERF_OPEN/TRACE_REG_PERF_CLOSE in addition to
  TRACE_REG_PERF_REGISTER/TRACE_REG_PERF_UNREGISTER cases.

  The OPEN/CLOSE cases are called within the event init/destroy
  right after the REG/UNREG cases.

  The reason is that REG/UNREG are processed only for for first and last
  events of the same type. While for function trace we need each event
  to get initialized and register the ftrace_ops struct.

- added "void *data" to the class::reg function to pass the "struct perf_event"
  for the ftrace:function register function

---
 include/linux/ftrace_event.h    |    6 +-
 include/linux/perf_event.h      |    1 +
 kernel/trace/trace.h            |    5 +
 kernel/trace/trace_event_perf.c |  183 ++++++++++++++++++++++++++++++---------
 kernel/trace/trace_events.c     |   10 ++-
 kernel/trace/trace_export.c     |   21 +++++
 kernel/trace/trace_kprobe.c     |    6 +-
 kernel/trace/trace_syscalls.c   |   14 ++-
 8 files changed, 195 insertions(+), 51 deletions(-)

diff --git a/include/linux/ftrace_event.h b/include/linux/ftrace_event.h
index b1e69ee..061c267 100644
--- a/include/linux/ftrace_event.h
+++ b/include/linux/ftrace_event.h
@@ -145,6 +145,8 @@ enum trace_reg {
 	TRACE_REG_UNREGISTER,
 	TRACE_REG_PERF_REGISTER,
 	TRACE_REG_PERF_UNREGISTER,
+	TRACE_REG_PERF_OPEN,
+	TRACE_REG_PERF_CLOSE,
 };
 
 struct ftrace_event_call;
@@ -156,7 +158,7 @@ struct ftrace_event_class {
 	void			*perf_probe;
 #endif
 	int			(*reg)(struct ftrace_event_call *event,
-				       enum trace_reg type);
+				       enum trace_reg type, void *data);
 	int			(*define_fields)(struct ftrace_event_call *);
 	struct list_head	*(*get_fields)(struct ftrace_event_call *);
 	struct list_head	fields;
@@ -164,7 +166,7 @@ struct ftrace_event_class {
 };
 
 extern int ftrace_event_reg(struct ftrace_event_call *event,
-			    enum trace_reg type);
+			    enum trace_reg type, void *data);
 
 enum {
 	TRACE_EVENT_FL_ENABLED_BIT,
diff --git a/include/linux/perf_event.h b/include/linux/perf_event.h
index 3f2711c..bb708d1 100644
--- a/include/linux/perf_event.h
+++ b/include/linux/perf_event.h
@@ -844,6 +844,7 @@ struct perf_event {
 #ifdef CONFIG_EVENT_TRACING
 	struct ftrace_event_call	*tp_event;
 	struct event_filter		*filter;
+	struct ftrace_ops               ftrace_ops;
 #endif
 
 #ifdef CONFIG_CGROUP_PERF
diff --git a/kernel/trace/trace.h b/kernel/trace/trace.h
index 30a94c2..9278d74 100644
--- a/kernel/trace/trace.h
+++ b/kernel/trace/trace.h
@@ -810,4 +810,9 @@ extern const char *__stop___trace_bprintk_fmt[];
 #define trace_recursion_clear(bit)	do { (current)->trace_recursion &= ~(bit); } while (0)
 #define trace_recursion_test(bit)	((current)->trace_recursion & (bit))
 
+#ifdef CONFIG_PERF_EVENTS
+int perf_ftrace_event_register(struct ftrace_event_call *call,
+			       enum trace_reg type, void *data);
+#endif
+
 #endif /* _LINUX_KERNEL_TRACE_H */
diff --git a/kernel/trace/trace_event_perf.c b/kernel/trace/trace_event_perf.c
index 19a359d..2b56b81 100644
--- a/kernel/trace/trace_event_perf.c
+++ b/kernel/trace/trace_event_perf.c
@@ -44,23 +44,17 @@ static int perf_trace_event_perm(struct ftrace_event_call *tp_event,
 	return 0;
 }
 
-static int perf_trace_event_init(struct ftrace_event_call *tp_event,
-				 struct perf_event *p_event)
+static int perf_trace_event_reg(struct ftrace_event_call *tp_event,
+				struct perf_event *p_event)
 {
 	struct hlist_head __percpu *list;
-	int ret;
+	int ret = -ENOMEM;
 	int cpu;
 
-	ret = perf_trace_event_perm(tp_event, p_event);
-	if (ret)
-		return ret;
-
 	p_event->tp_event = tp_event;
 	if (tp_event->perf_refcount++ > 0)
 		return 0;
 
-	ret = -ENOMEM;
-
 	list = alloc_percpu(struct hlist_head);
 	if (!list)
 		goto fail;
@@ -83,7 +77,7 @@ static int perf_trace_event_init(struct ftrace_event_call *tp_event,
 		}
 	}
 
-	ret = tp_event->class->reg(tp_event, TRACE_REG_PERF_REGISTER);
+	ret = tp_event->class->reg(tp_event, TRACE_REG_PERF_REGISTER, NULL);
 	if (ret)
 		goto fail;
 
@@ -108,6 +102,69 @@ fail:
 	return ret;
 }
 
+static void perf_trace_event_unreg(struct perf_event *p_event)
+{
+	struct ftrace_event_call *tp_event = p_event->tp_event;
+	int i;
+
+	if (--tp_event->perf_refcount > 0)
+		goto out;
+
+	tp_event->class->reg(tp_event, TRACE_REG_PERF_UNREGISTER, NULL);
+
+	/*
+	 * Ensure our callback won't be called anymore. The buffers
+	 * will be freed after that.
+	 */
+	tracepoint_synchronize_unregister();
+
+	free_percpu(tp_event->perf_events);
+	tp_event->perf_events = NULL;
+
+	if (!--total_ref_count) {
+		for (i = 0; i < PERF_NR_CONTEXTS; i++) {
+			free_percpu(perf_trace_buf[i]);
+			perf_trace_buf[i] = NULL;
+		}
+	}
+out:
+	module_put(tp_event->mod);
+}
+
+static int perf_trace_event_open(struct perf_event *p_event)
+{
+	struct ftrace_event_call *tp_event = p_event->tp_event;
+	return tp_event->class->reg(tp_event, TRACE_REG_PERF_OPEN, p_event);
+}
+
+static void perf_trace_event_close(struct perf_event *p_event)
+{
+	struct ftrace_event_call *tp_event = p_event->tp_event;
+	tp_event->class->reg(tp_event, TRACE_REG_PERF_CLOSE, p_event);
+}
+
+static int perf_trace_event_init(struct ftrace_event_call *tp_event,
+				 struct perf_event *p_event)
+{
+	int ret;
+
+	ret = perf_trace_event_perm(tp_event, p_event);
+	if (ret)
+		return ret;
+
+	ret = perf_trace_event_reg(tp_event, p_event);
+	if (ret)
+		return ret;
+
+	ret = perf_trace_event_open(p_event);
+	if (ret) {
+		perf_trace_event_unreg(p_event);
+		return ret;
+	}
+
+	return 0;
+}
+
 int perf_trace_init(struct perf_event *p_event)
 {
 	struct ftrace_event_call *tp_event;
@@ -130,6 +187,14 @@ int perf_trace_init(struct perf_event *p_event)
 	return ret;
 }
 
+void perf_trace_destroy(struct perf_event *p_event)
+{
+	mutex_lock(&event_mutex);
+	perf_trace_event_close(p_event);
+	perf_trace_event_unreg(p_event);
+	mutex_unlock(&event_mutex);
+}
+
 int perf_trace_add(struct perf_event *p_event, int flags)
 {
 	struct ftrace_event_call *tp_event = p_event->tp_event;
@@ -154,37 +219,6 @@ void perf_trace_del(struct perf_event *p_event, int flags)
 	hlist_del_rcu(&p_event->hlist_entry);
 }
 
-void perf_trace_destroy(struct perf_event *p_event)
-{
-	struct ftrace_event_call *tp_event = p_event->tp_event;
-	int i;
-
-	mutex_lock(&event_mutex);
-	if (--tp_event->perf_refcount > 0)
-		goto out;
-
-	tp_event->class->reg(tp_event, TRACE_REG_PERF_UNREGISTER);
-
-	/*
-	 * Ensure our callback won't be called anymore. The buffers
-	 * will be freed after that.
-	 */
-	tracepoint_synchronize_unregister();
-
-	free_percpu(tp_event->perf_events);
-	tp_event->perf_events = NULL;
-
-	if (!--total_ref_count) {
-		for (i = 0; i < PERF_NR_CONTEXTS; i++) {
-			free_percpu(perf_trace_buf[i]);
-			perf_trace_buf[i] = NULL;
-		}
-	}
-out:
-	module_put(tp_event->mod);
-	mutex_unlock(&event_mutex);
-}
-
 __kprobes void *perf_trace_buf_prepare(int size, unsigned short type,
 				       struct pt_regs *regs, int *rctxp)
 {
@@ -214,3 +248,70 @@ __kprobes void *perf_trace_buf_prepare(int size, unsigned short type,
 	return raw_data;
 }
 EXPORT_SYMBOL_GPL(perf_trace_buf_prepare);
+
+
+static void
+perf_ftrace_function_call(unsigned long ip, unsigned long parent_ip)
+{
+	struct ftrace_entry *entry;
+	struct hlist_head *head;
+	unsigned long flags;
+	int rctx;
+
+	local_irq_save(flags);
+
+	BUILD_BUG_ON(sizeof(*entry) > PERF_MAX_TRACE_SIZE);
+
+	entry = (struct ftrace_entry *) perf_trace_buf_prepare(sizeof(*entry),
+						TRACE_FN, NULL, &rctx);
+	if (!entry)
+		return;
+
+	entry->ip = ip;
+	entry->parent_ip = parent_ip;
+
+	head = this_cpu_ptr(event_function.perf_events);
+	perf_trace_buf_submit(entry, sizeof(*entry), rctx, 0,
+			      1, NULL, head);
+
+	local_irq_restore(flags);
+}
+
+static int perf_ftrace_function_register(struct ftrace_event_call *call,
+					 struct perf_event *event)
+{
+	struct ftrace_ops *ops = &event->ftrace_ops;
+	ops->func = perf_ftrace_function_call;
+	return register_ftrace_function(ops);
+}
+
+static int perf_ftrace_function_unregister(struct ftrace_event_call *call,
+					   struct perf_event *event)
+{
+	struct ftrace_ops *ops = &event->ftrace_ops;
+	return unregister_ftrace_function(ops);
+}
+
+int perf_ftrace_event_register(struct ftrace_event_call *call,
+			       enum trace_reg type, void *data)
+{
+	int etype = call->event.type;
+
+	if (etype != TRACE_FN)
+		return -EINVAL;
+
+	switch (type) {
+	case TRACE_REG_REGISTER:
+	case TRACE_REG_UNREGISTER:
+		break;
+	case TRACE_REG_PERF_REGISTER:
+	case TRACE_REG_PERF_UNREGISTER:
+		return 0;
+	case TRACE_REG_PERF_OPEN:
+		return perf_ftrace_function_register(call, data);
+	case TRACE_REG_PERF_CLOSE:
+		return perf_ftrace_function_unregister(call, data);
+	}
+
+	return -EINVAL;
+}
diff --git a/kernel/trace/trace_events.c b/kernel/trace/trace_events.c
index 581876f..ad4bf55 100644
--- a/kernel/trace/trace_events.c
+++ b/kernel/trace/trace_events.c
@@ -147,7 +147,8 @@ int trace_event_raw_init(struct ftrace_event_call *call)
 }
 EXPORT_SYMBOL_GPL(trace_event_raw_init);
 
-int ftrace_event_reg(struct ftrace_event_call *call, enum trace_reg type)
+int ftrace_event_reg(struct ftrace_event_call *call,
+		     enum trace_reg type, void *data)
 {
 	switch (type) {
 	case TRACE_REG_REGISTER:
@@ -170,6 +171,9 @@ int ftrace_event_reg(struct ftrace_event_call *call, enum trace_reg type)
 					    call->class->perf_probe,
 					    call);
 		return 0;
+	case TRACE_REG_PERF_OPEN:
+	case TRACE_REG_PERF_CLOSE:
+		return 0;
 #endif
 	}
 	return 0;
@@ -209,7 +213,7 @@ static int ftrace_event_enable_disable(struct ftrace_event_call *call,
 				tracing_stop_cmdline_record();
 				call->flags &= ~TRACE_EVENT_FL_RECORDED_CMD;
 			}
-			call->class->reg(call, TRACE_REG_UNREGISTER);
+			call->class->reg(call, TRACE_REG_UNREGISTER, NULL);
 		}
 		break;
 	case 1:
@@ -218,7 +222,7 @@ static int ftrace_event_enable_disable(struct ftrace_event_call *call,
 				tracing_start_cmdline_record();
 				call->flags |= TRACE_EVENT_FL_RECORDED_CMD;
 			}
-			ret = call->class->reg(call, TRACE_REG_REGISTER);
+			ret = call->class->reg(call, TRACE_REG_REGISTER, NULL);
 			if (ret) {
 				tracing_stop_cmdline_record();
 				pr_info("event trace: Could not enable event "
diff --git a/kernel/trace/trace_export.c b/kernel/trace/trace_export.c
index bbeec31..7b7193c 100644
--- a/kernel/trace/trace_export.c
+++ b/kernel/trace/trace_export.c
@@ -131,6 +131,26 @@ ftrace_define_fields_##name(struct ftrace_event_call *event_call)	\
 
 #include "trace_entries.h"
 
+static int ftrace_event_class_register(struct ftrace_event_call *call,
+				       enum trace_reg type, void *data)
+{
+	switch (type) {
+	case TRACE_REG_PERF_REGISTER:
+	case TRACE_REG_PERF_UNREGISTER:
+		return 0;
+	case TRACE_REG_PERF_OPEN:
+	case TRACE_REG_PERF_CLOSE:
+#ifdef CONFIG_PERF_EVENTS
+		return perf_ftrace_event_register(call, type, data);
+#endif
+	case TRACE_REG_REGISTER:
+	case TRACE_REG_UNREGISTER:
+		break;
+	}
+
+	return -EINVAL;
+}
+
 #undef __entry
 #define __entry REC
 
@@ -159,6 +179,7 @@ struct ftrace_event_class event_class_ftrace_##call = {			\
 	.system			= __stringify(TRACE_SYSTEM),		\
 	.define_fields		= ftrace_define_fields_##call,		\
 	.fields			= LIST_HEAD_INIT(event_class_ftrace_##call.fields),\
+	.reg			= ftrace_event_class_register,		\
 };									\
 									\
 struct ftrace_event_call __used event_##call = {			\
diff --git a/kernel/trace/trace_kprobe.c b/kernel/trace/trace_kprobe.c
index 7db7b68..bfcf609 100644
--- a/kernel/trace/trace_kprobe.c
+++ b/kernel/trace/trace_kprobe.c
@@ -1744,7 +1744,8 @@ static void probe_perf_disable(struct ftrace_event_call *call)
 #endif	/* CONFIG_PERF_EVENTS */
 
 static __kprobes
-int kprobe_register(struct ftrace_event_call *event, enum trace_reg type)
+int kprobe_register(struct ftrace_event_call *event,
+		    enum trace_reg type, void *data)
 {
 	switch (type) {
 	case TRACE_REG_REGISTER:
@@ -1759,6 +1760,9 @@ int kprobe_register(struct ftrace_event_call *event, enum trace_reg type)
 	case TRACE_REG_PERF_UNREGISTER:
 		probe_perf_disable(event);
 		return 0;
+	case TRACE_REG_PERF_OPEN:
+	case TRACE_REG_PERF_CLOSE:
+		return 0;
 #endif
 	}
 	return 0;
diff --git a/kernel/trace/trace_syscalls.c b/kernel/trace/trace_syscalls.c
index ee7b5a0..61ebe6a 100644
--- a/kernel/trace/trace_syscalls.c
+++ b/kernel/trace/trace_syscalls.c
@@ -16,9 +16,9 @@ static DECLARE_BITMAP(enabled_enter_syscalls, NR_syscalls);
 static DECLARE_BITMAP(enabled_exit_syscalls, NR_syscalls);
 
 static int syscall_enter_register(struct ftrace_event_call *event,
-				 enum trace_reg type);
+				 enum trace_reg type, void *data);
 static int syscall_exit_register(struct ftrace_event_call *event,
-				 enum trace_reg type);
+				 enum trace_reg type, void *data);
 
 static int syscall_enter_define_fields(struct ftrace_event_call *call);
 static int syscall_exit_define_fields(struct ftrace_event_call *call);
@@ -648,7 +648,7 @@ void perf_sysexit_disable(struct ftrace_event_call *call)
 #endif /* CONFIG_PERF_EVENTS */
 
 static int syscall_enter_register(struct ftrace_event_call *event,
-				 enum trace_reg type)
+				 enum trace_reg type, void *data)
 {
 	switch (type) {
 	case TRACE_REG_REGISTER:
@@ -663,13 +663,16 @@ static int syscall_enter_register(struct ftrace_event_call *event,
 	case TRACE_REG_PERF_UNREGISTER:
 		perf_sysenter_disable(event);
 		return 0;
+	case TRACE_REG_PERF_OPEN:
+	case TRACE_REG_PERF_CLOSE:
+		return 0;
 #endif
 	}
 	return 0;
 }
 
 static int syscall_exit_register(struct ftrace_event_call *event,
-				 enum trace_reg type)
+				 enum trace_reg type, void *data)
 {
 	switch (type) {
 	case TRACE_REG_REGISTER:
@@ -684,6 +687,9 @@ static int syscall_exit_register(struct ftrace_event_call *event,
 	case TRACE_REG_PERF_UNREGISTER:
 		perf_sysexit_disable(event);
 		return 0;
+	case TRACE_REG_PERF_OPEN:
+	case TRACE_REG_PERF_CLOSE:
+		return 0;
 #endif
 	}
 	return 0;
-- 
1.7.1


^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [RFC 2/4] perf, ftrace: Add filter support for ftrace:function tracepoint
  2011-07-11 13:22 [RFC 0/4] trace,perf: Enabling ftrace:function tracepoint for perf Jiri Olsa
  2011-07-11 13:22 ` [RFC 1/4] perf, ftrace: Add perf support to use function tracepoint Jiri Olsa
@ 2011-07-11 13:22 ` Jiri Olsa
  2011-08-10 19:34   ` Steven Rostedt
  2011-07-11 13:22 ` [RFC 3/4] perf, ftrace: Add new perf ioctl for function trace filter Jiri Olsa
  2011-07-11 13:22 ` [RFC 4/4] perf, ftrace: Add ftrace option for perf stat command Jiri Olsa
  3 siblings, 1 reply; 11+ messages in thread
From: Jiri Olsa @ 2011-07-11 13:22 UTC (permalink / raw)
  To: rostedt, fweisbec, a.p.zijlstra; +Cc: linux-kernel

Add support to specify perf filter for function trace.

Standard perf filter could be now used for ftrace:function tracepoint like:

 # ./perf stat -e ftrace:function --filter "ip == sys_open || ip == sys_close" sleep 1

 Performance counter stats for 'sleep 1':

                 9 ftrace:function                                             

       1.003097430 seconds time elapsed

Details:
- the value for ip and parent_ip fields gets translated to the
  function value (mcount return IP). The ip and parent_ip are then
  processed as integer values.

---
 include/linux/ftrace.h             |    2 ++
 kernel/trace/ftrace.c              |   23 +++++++++++++++++++++++
 kernel/trace/trace_events_filter.c |   32 ++++++++++++++++++++++++++------
 3 files changed, 51 insertions(+), 6 deletions(-)

diff --git a/include/linux/ftrace.h b/include/linux/ftrace.h
index ed0eb52..f858e97 100644
--- a/include/linux/ftrace.h
+++ b/include/linux/ftrace.h
@@ -157,6 +157,8 @@ extern void
 unregister_ftrace_function_probe_func(char *glob, struct ftrace_probe_ops *ops);
 extern void unregister_ftrace_function_probe_all(char *glob);
 
+extern int ftrace_function_rec_ip(char *func, unsigned long long *ip);
+
 extern int ftrace_text_reserved(void *start, void *end);
 
 enum {
diff --git a/kernel/trace/ftrace.c b/kernel/trace/ftrace.c
index 6d6e0d0..7325ae4 100644
--- a/kernel/trace/ftrace.c
+++ b/kernel/trace/ftrace.c
@@ -2599,6 +2599,29 @@ register_ftrace_function_probe(char *glob, struct ftrace_probe_ops *ops,
 	return count;
 }
 
+int ftrace_function_rec_ip(char *func, unsigned long long *ip)
+{
+	struct ftrace_page *pg;
+	struct dyn_ftrace *rec;
+	int len = strlen(func);
+	int found = 0;
+
+	mutex_lock(&ftrace_lock);
+
+	do_for_each_ftrace_rec(pg, rec) {
+		if (!ftrace_match_record(rec, NULL, func, len, MATCH_FULL))
+			continue;
+
+		*ip = rec->ip;
+		found = 1;
+		break;
+	} while_for_each_ftrace_rec();
+
+	mutex_unlock(&ftrace_lock);
+
+	return found ? 0 : -EINVAL;
+}
+
 enum {
 	PROBE_TEST_FUNC		= 1,
 	PROBE_TEST_DATA		= 2
diff --git a/kernel/trace/trace_events_filter.c b/kernel/trace/trace_events_filter.c
index 256764e..dd5c648 100644
--- a/kernel/trace/trace_events_filter.c
+++ b/kernel/trace/trace_events_filter.c
@@ -883,8 +883,12 @@ static bool is_string_field(struct ftrace_event_field *field)
 	       field->filter_type == FILTER_PTR_STRING;
 }
 
-static int is_legal_op(struct ftrace_event_field *field, int op)
+static int is_legal_op(struct ftrace_event_call *call,
+		       struct ftrace_event_field *field, int op)
 {
+	if ((call->event.type == TRACE_FN) &&
+	    (op != OP_EQ && op != OP_NE))
+		return 0;
 	if (is_string_field(field) &&
 	    (op != OP_EQ && op != OP_NE && op != OP_GLOB))
 		return 0;
@@ -937,6 +941,24 @@ static filter_pred_fn_t select_comparison_fn(int op, int field_size,
 	return fn;
 }
 
+static int pred_get_val(struct ftrace_event_call *call,
+			struct filter_pred *pred,
+			struct ftrace_event_field *field,
+			unsigned long long *val)
+{
+	int ret;
+
+	if (call->event.type == TRACE_FN)
+		return ftrace_function_rec_ip(pred->regex.pattern, val);
+
+	if (field->is_signed)
+		ret = strict_strtoll(pred->regex.pattern, 0, val);
+	else
+		ret = strict_strtoull(pred->regex.pattern, 0, val);
+
+	return ret;
+}
+
 static int filter_add_pred(struct filter_parse_state *ps,
 			   struct ftrace_event_call *call,
 			   struct event_filter *filter,
@@ -964,7 +986,7 @@ static int filter_add_pred(struct filter_parse_state *ps,
 
 	pred->offset = field->offset;
 
-	if (!is_legal_op(field, pred->op)) {
+	if (!is_legal_op(call, field, pred->op)) {
 		parse_error(ps, FILT_ERR_ILLEGAL_FIELD_OP, 0);
 		return -EINVAL;
 	}
@@ -980,14 +1002,12 @@ static int filter_add_pred(struct filter_parse_state *ps,
 		else
 			fn = filter_pred_pchar;
 	} else {
-		if (field->is_signed)
-			ret = strict_strtoll(pred->regex.pattern, 0, &val);
-		else
-			ret = strict_strtoull(pred->regex.pattern, 0, &val);
+		ret = pred_get_val(call, pred, field, &val);
 		if (ret) {
 			parse_error(ps, FILT_ERR_ILLEGAL_INTVAL, 0);
 			return -EINVAL;
 		}
+
 		pred->val = val;
 
 		fn = select_comparison_fn(pred->op, field->size,
-- 
1.7.1


^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [RFC 3/4] perf, ftrace: Add new perf ioctl for function trace filter
  2011-07-11 13:22 [RFC 0/4] trace,perf: Enabling ftrace:function tracepoint for perf Jiri Olsa
  2011-07-11 13:22 ` [RFC 1/4] perf, ftrace: Add perf support to use function tracepoint Jiri Olsa
  2011-07-11 13:22 ` [RFC 2/4] perf, ftrace: Add filter support for ftrace:function tracepoint Jiri Olsa
@ 2011-07-11 13:22 ` Jiri Olsa
  2011-07-18 15:12   ` Frederic Weisbecker
  2011-07-11 13:22 ` [RFC 4/4] perf, ftrace: Add ftrace option for perf stat command Jiri Olsa
  3 siblings, 1 reply; 11+ messages in thread
From: Jiri Olsa @ 2011-07-11 13:22 UTC (permalink / raw)
  To: rostedt, fweisbec, a.p.zijlstra; +Cc: linux-kernel

As the amount of kernel functions obtained by the ftrace:function tracepoint
is quite big, it's desirable to be able to set the filter on the ftrace
level.

Added PERF_EVENT_IOC_SET_FTRACE ioctl to be able to specify function filter
for perf event. The interface is the same as for the set_ftrace_filter file.

Also the same string parser is used as for the set_ftrace_filter file.

---
 include/linux/ftrace.h          |    5 ++-
 include/linux/ftrace_event.h    |    3 ++
 include/linux/perf_event.h      |    1 +
 kernel/events/core.c            |   14 +++++++++++++
 kernel/trace/ftrace.c           |   11 +++++----
 kernel/trace/trace_event_perf.c |   42 +++++++++++++++++++++++++++++++++++++++
 6 files changed, 69 insertions(+), 7 deletions(-)

diff --git a/include/linux/ftrace.h b/include/linux/ftrace.h
index f858e97..b7368d0 100644
--- a/include/linux/ftrace.h
+++ b/include/linux/ftrace.h
@@ -182,7 +182,7 @@ struct dyn_ftrace {
 };
 
 int ftrace_force_update(void);
-void ftrace_set_filter(struct ftrace_ops *ops, unsigned char *buf,
+int ftrace_set_filter(struct ftrace_ops *ops, unsigned char *buf,
 		       int len, int reset);
 void ftrace_set_notrace(struct ftrace_ops *ops, unsigned char *buf,
 			int len, int reset);
@@ -268,8 +268,9 @@ extern void ftrace_enable_daemon(void);
 #else
 static inline int skip_trace(unsigned long ip) { return 0; }
 static inline int ftrace_force_update(void) { return 0; }
-static inline void ftrace_set_filter(unsigned char *buf, int len, int reset)
+static inline int ftrace_set_filter(unsigned char *buf, int len, int reset)
 {
+	return -EINVAL;
 }
 static inline void ftrace_disable_daemon(void) { }
 static inline void ftrace_enable_daemon(void) { }
diff --git a/include/linux/ftrace_event.h b/include/linux/ftrace_event.h
index 061c267..df13bc6 100644
--- a/include/linux/ftrace_event.h
+++ b/include/linux/ftrace_event.h
@@ -291,6 +291,9 @@ extern void ftrace_profile_free_filter(struct perf_event *event);
 extern void *perf_trace_buf_prepare(int size, unsigned short type,
 				    struct pt_regs *regs, int *rctxp);
 
+extern int perf_ftrace_set_filter(struct perf_event *event,
+				  char __user *filter);
+
 static inline void
 perf_trace_buf_submit(void *raw_data, int size, int rctx, u64 addr,
 		       u64 count, struct pt_regs *regs, void *head)
diff --git a/include/linux/perf_event.h b/include/linux/perf_event.h
index bb708d1..20c216d 100644
--- a/include/linux/perf_event.h
+++ b/include/linux/perf_event.h
@@ -248,6 +248,7 @@ struct perf_event_attr {
 #define PERF_EVENT_IOC_PERIOD		_IOW('$', 4, __u64)
 #define PERF_EVENT_IOC_SET_OUTPUT	_IO ('$', 5)
 #define PERF_EVENT_IOC_SET_FILTER	_IOW('$', 6, char *)
+#define PERF_EVENT_IOC_SET_FTRACE	_IOW('$', 7, char *)
 
 enum perf_event_ioc_flags {
 	PERF_IOC_FLAG_GROUP		= 1U << 0,
diff --git a/kernel/events/core.c b/kernel/events/core.c
index 0567e32..8564c74 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -3255,6 +3255,7 @@ static struct perf_event *perf_fget_light(int fd, int *fput_needed)
 static int perf_event_set_output(struct perf_event *event,
 				 struct perf_event *output_event);
 static int perf_event_set_filter(struct perf_event *event, void __user *arg);
+static int perf_event_set_ftrace(struct perf_event *event, void __user *arg);
 
 static long perf_ioctl(struct file *file, unsigned int cmd, unsigned long arg)
 {
@@ -3301,6 +3302,9 @@ static long perf_ioctl(struct file *file, unsigned int cmd, unsigned long arg)
 	case PERF_EVENT_IOC_SET_FILTER:
 		return perf_event_set_filter(event, (void __user *)arg);
 
+	case PERF_EVENT_IOC_SET_FTRACE:
+		return perf_event_set_ftrace(event, (void __user *)arg);
+
 	default:
 		return -ENOTTY;
 	}
@@ -5185,6 +5189,11 @@ static void perf_event_free_filter(struct perf_event *event)
 	ftrace_profile_free_filter(event);
 }
 
+static int perf_event_set_ftrace(struct perf_event *event, void __user *arg)
+{
+	return perf_ftrace_set_filter(event, arg);
+}
+
 #else
 
 static inline void perf_tp_register(void)
@@ -5196,6 +5205,11 @@ static int perf_event_set_filter(struct perf_event *event, void __user *arg)
 	return -ENOENT;
 }
 
+static int perf_event_set_ftrace(struct perf_event *event, void __user *arg)
+{
+	return -ENOENT;
+}
+
 static void perf_event_free_filter(struct perf_event *event)
 {
 }
diff --git a/kernel/trace/ftrace.c b/kernel/trace/ftrace.c
index 7325ae4..477e050 100644
--- a/kernel/trace/ftrace.c
+++ b/kernel/trace/ftrace.c
@@ -2837,7 +2837,7 @@ ftrace_set_regex(struct ftrace_ops *ops, unsigned char *buf, int len,
 {
 	struct ftrace_hash **orig_hash;
 	struct ftrace_hash *hash;
-	int ret;
+	int ret = -EINVAL;
 
 	/* All global ops uses the global ops filters */
 	if (ops->flags & FTRACE_OPS_FL_GLOBAL)
@@ -2858,13 +2858,14 @@ ftrace_set_regex(struct ftrace_ops *ops, unsigned char *buf, int len,
 	mutex_lock(&ftrace_regex_lock);
 	if (reset)
 		ftrace_filter_reset(hash);
-	if (buf)
-		ftrace_match_records(hash, buf, len);
+	if (buf && !ftrace_match_records(hash, buf, len))
+		goto out_unlock;
 
 	mutex_lock(&ftrace_lock);
 	ret = ftrace_hash_move(orig_hash, hash);
 	mutex_unlock(&ftrace_lock);
 
+ out_unlock:
 	mutex_unlock(&ftrace_regex_lock);
 
 	free_ftrace_hash(hash);
@@ -2881,10 +2882,10 @@ ftrace_set_regex(struct ftrace_ops *ops, unsigned char *buf, int len,
  * Filters denote which functions should be enabled when tracing is enabled.
  * If @buf is NULL and reset is set, all functions will be enabled for tracing.
  */
-void ftrace_set_filter(struct ftrace_ops *ops, unsigned char *buf,
+int ftrace_set_filter(struct ftrace_ops *ops, unsigned char *buf,
 		       int len, int reset)
 {
-	ftrace_set_regex(ops, buf, len, reset, 1);
+	return ftrace_set_regex(ops, buf, len, reset, 1);
 }
 EXPORT_SYMBOL_GPL(ftrace_set_filter);
 
diff --git a/kernel/trace/trace_event_perf.c b/kernel/trace/trace_event_perf.c
index 2b56b81..7a1367d 100644
--- a/kernel/trace/trace_event_perf.c
+++ b/kernel/trace/trace_event_perf.c
@@ -315,3 +315,45 @@ int perf_ftrace_event_register(struct ftrace_event_call *call,
 
 	return -EINVAL;
 }
+
+static int __perf_ftrace_set_filter(struct perf_event *event,
+				   char __user *filter)
+{
+	struct trace_parser parser;
+	ssize_t cnt = strnlen_user(filter, PAGE_SIZE);
+	int reset = 1, ret = -EINVAL;
+	ssize_t read = 0;
+	loff_t pos = 0;
+
+	if (trace_parser_get_init(&parser, KSYM_SYMBOL_LEN + 4))
+		return -ENOMEM;
+
+	while (cnt) {
+		read = trace_get_user(&parser, filter, cnt, &pos);
+		if (!read || !trace_parser_loaded(&parser)) {
+			ret = -EINVAL;
+			break;
+		}
+
+		ret = ftrace_set_filter(&event->ftrace_ops,
+					parser.buffer, parser.idx,
+					reset);
+		if (ret)
+			break;
+
+		cnt    -= read;
+		filter += read;
+		reset = 0;
+	}
+
+	trace_parser_put(&parser);
+	return ret;
+}
+
+int perf_ftrace_set_filter(struct perf_event *event, char __user *filter)
+{
+	if (event->attr.config != TRACE_FN)
+		return -EINVAL;
+
+	return __perf_ftrace_set_filter(event, filter);
+}
-- 
1.7.1


^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [RFC 4/4] perf, ftrace: Add ftrace option for perf stat command
  2011-07-11 13:22 [RFC 0/4] trace,perf: Enabling ftrace:function tracepoint for perf Jiri Olsa
                   ` (2 preceding siblings ...)
  2011-07-11 13:22 ` [RFC 3/4] perf, ftrace: Add new perf ioctl for function trace filter Jiri Olsa
@ 2011-07-11 13:22 ` Jiri Olsa
  3 siblings, 0 replies; 11+ messages in thread
From: Jiri Olsa @ 2011-07-11 13:22 UTC (permalink / raw)
  To: rostedt, fweisbec, a.p.zijlstra; +Cc: linux-kernel

Adding ftrace option to specify the ftrace filter for last/current event.
The value format is same as for the set_ftrace_filter tracing file.

This could be used only for ftrace:function (TRACE_FN) tracepoint event
like:

 # ./perf stat -e ftrace:function --filter "ip == sys_open" \
   --ftrace "sys_open sys_read" sleep 1

  Performance counter stats for 'sleep 1':

                 3 ftrace:function                                             

       1.001460132 seconds time elapsed

---
 tools/perf/builtin-stat.c      |    2 ++
 tools/perf/util/evlist.c       |   26 ++++++++++++++++++++++----
 tools/perf/util/evlist.h       |    2 ++
 tools/perf/util/evsel.h        |    1 +
 tools/perf/util/parse-events.c |   37 +++++++++++++++++++++++++++++++++----
 tools/perf/util/parse-events.h |    1 +
 6 files changed, 61 insertions(+), 8 deletions(-)

diff --git a/tools/perf/builtin-stat.c b/tools/perf/builtin-stat.c
index 1d08c80..77a257d 100644
--- a/tools/perf/builtin-stat.c
+++ b/tools/perf/builtin-stat.c
@@ -1035,6 +1035,8 @@ static const struct option options[] = {
 		     parse_events),
 	OPT_CALLBACK(0, "filter", &evsel_list, "filter",
 		     "event filter", parse_filter),
+	OPT_CALLBACK(0, "ftrace", &evsel_list, "ftrace",
+		     "ftrace filter", parse_ftrace),
 	OPT_BOOLEAN('i', "no-inherit", &no_inherit,
 		    "child tasks do not inherit counters"),
 	OPT_INTEGER('p', "pid", &target_pid,
diff --git a/tools/perf/util/evlist.c b/tools/perf/util/evlist.c
index b021ea9..7627266 100644
--- a/tools/perf/util/evlist.c
+++ b/tools/perf/util/evlist.c
@@ -427,25 +427,43 @@ void perf_evlist__delete_maps(struct perf_evlist *evlist)
 	evlist->threads = NULL;
 }
 
+static int set_filter_fd(int fd, char *filter, char *ftrace)
+{
+	int err;
+
+	if (filter) {
+		err = ioctl(fd, PERF_EVENT_IOC_SET_FILTER, filter);
+		if (err)
+			return err;
+	}
+	if (ftrace) {
+		err = ioctl(fd, PERF_EVENT_IOC_SET_FTRACE, ftrace);
+		if (err)
+			return err;
+	}
+	return 0;
+}
+
 int perf_evlist__set_filters(struct perf_evlist *evlist)
 {
 	const struct thread_map *threads = evlist->threads;
 	const struct cpu_map *cpus = evlist->cpus;
 	struct perf_evsel *evsel;
 	char *filter;
+	char *ftrace;
 	int thread;
 	int cpu;
 	int err;
-	int fd;
 
 	list_for_each_entry(evsel, &evlist->entries, node) {
 		filter = evsel->filter;
-		if (!filter)
+		ftrace = evsel->ftrace;
+		if (!filter && !ftrace)
 			continue;
 		for (cpu = 0; cpu < cpus->nr; cpu++) {
 			for (thread = 0; thread < threads->nr; thread++) {
-				fd = FD(evsel, cpu, thread);
-				err = ioctl(fd, PERF_EVENT_IOC_SET_FILTER, filter);
+				err = set_filter_fd(FD(evsel, cpu, thread),
+						  filter, ftrace);
 				if (err)
 					return err;
 			}
diff --git a/tools/perf/util/evlist.h b/tools/perf/util/evlist.h
index b2b8623..25ccdf8 100644
--- a/tools/perf/util/evlist.h
+++ b/tools/perf/util/evlist.h
@@ -9,6 +9,8 @@ struct pollfd;
 struct thread_map;
 struct cpu_map;
 
+#define PERF_EVENT_IOC_SET_FTRACE       _IOW('$', 7, char *)
+
 #define PERF_EVLIST__HLIST_BITS 8
 #define PERF_EVLIST__HLIST_SIZE (1 << PERF_EVLIST__HLIST_BITS)
 
diff --git a/tools/perf/util/evsel.h b/tools/perf/util/evsel.h
index e9a3155..60d949c 100644
--- a/tools/perf/util/evsel.h
+++ b/tools/perf/util/evsel.h
@@ -48,6 +48,7 @@ struct perf_evsel {
 	struct list_head	node;
 	struct perf_event_attr	attr;
 	char			*filter;
+	char			*ftrace;
 	struct xyarray		*fd;
 	struct xyarray		*sample_id;
 	u64			*id;
diff --git a/tools/perf/util/parse-events.c b/tools/perf/util/parse-events.c
index 41982c3..dc58b0e 100644
--- a/tools/perf/util/parse-events.c
+++ b/tools/perf/util/parse-events.c
@@ -863,8 +863,7 @@ int parse_events(const struct option *opt, const char *str, int unset __used)
 	return 0;
 }
 
-int parse_filter(const struct option *opt, const char *str,
-		 int unset __used)
+static struct perf_evsel *get_last_evsel(const struct option *opt)
 {
 	struct perf_evlist *evlist = *(struct perf_evlist **)opt->value;
 	struct perf_evsel *last = NULL;
@@ -874,10 +873,22 @@ int parse_filter(const struct option *opt, const char *str,
 
 	if (last == NULL || last->attr.type != PERF_TYPE_TRACEPOINT) {
 		fprintf(stderr,
-			"-F option should follow a -e tracepoint option\n");
-		return -1;
+			"filter/ftrace options should follow a -e tracepoint option\n");
+		return NULL;
 	}
 
+	return last;
+}
+
+int parse_filter(const struct option *opt, const char *str,
+		 int unset __used)
+{
+	struct perf_evsel *last = NULL;
+
+	last = get_last_evsel(opt);
+	if (!last)
+		return -1;
+
 	last->filter = strdup(str);
 	if (last->filter == NULL) {
 		fprintf(stderr, "not enough memory to hold filter string\n");
@@ -887,6 +898,24 @@ int parse_filter(const struct option *opt, const char *str,
 	return 0;
 }
 
+int parse_ftrace(const struct option *opt, const char *str,
+		 int unset __used)
+{
+	struct perf_evsel *last = NULL;
+
+	last = get_last_evsel(opt);
+	if (!last)
+		return -1;
+
+	last->ftrace = strdup(str);
+	if (last->ftrace == NULL) {
+		fprintf(stderr, "not enough memory to hold ftrace string\n");
+		return -1;
+	}
+
+	return 0;
+}
+
 static const char * const event_type_descriptors[] = {
 	"Hardware event",
 	"Software event",
diff --git a/tools/perf/util/parse-events.h b/tools/perf/util/parse-events.h
index 746d3fc..9933649 100644
--- a/tools/perf/util/parse-events.h
+++ b/tools/perf/util/parse-events.h
@@ -26,6 +26,7 @@ extern const char *__event_name(int type, u64 config);
 
 extern int parse_events(const struct option *opt, const char *str, int unset);
 extern int parse_filter(const struct option *opt, const char *str, int unset);
+extern int parse_ftrace(const struct option *opt, const char *str, int unset);
 
 #define EVENTS_HELP_MAX (128*1024)
 
-- 
1.7.1


^ permalink raw reply related	[flat|nested] 11+ messages in thread

* Re: [RFC 3/4] perf, ftrace: Add new perf ioctl for function trace filter
  2011-07-11 13:22 ` [RFC 3/4] perf, ftrace: Add new perf ioctl for function trace filter Jiri Olsa
@ 2011-07-18 15:12   ` Frederic Weisbecker
  2011-07-19  9:57     ` Jiri Olsa
  0 siblings, 1 reply; 11+ messages in thread
From: Frederic Weisbecker @ 2011-07-18 15:12 UTC (permalink / raw)
  To: Jiri Olsa; +Cc: rostedt, a.p.zijlstra, linux-kernel, Ingo Molnar

On Mon, Jul 11, 2011 at 03:22:55PM +0200, Jiri Olsa wrote:
> As the amount of kernel functions obtained by the ftrace:function tracepoint
> is quite big, it's desirable to be able to set the filter on the ftrace
> level.
> 
> Added PERF_EVENT_IOC_SET_FTRACE ioctl to be able to specify function filter
> for perf event. The interface is the same as for the set_ftrace_filter file.
> 
> Also the same string parser is used as for the set_ftrace_filter file.

I'm not sure I understand why an ioctl is needed for that.
Why not using the ftrace filter?

The first idea was to do something like this:

	ip == func1 || ip == func2 || ....

But that makes a too long expression. So I thought we
could bring support for the "+" operator and have expressions
like:

	ip == func1 + func2 + func3 + ....

	ip == !func1

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [RFC 3/4] perf, ftrace: Add new perf ioctl for function trace filter
  2011-07-18 15:12   ` Frederic Weisbecker
@ 2011-07-19  9:57     ` Jiri Olsa
  2011-07-26 13:43       ` Frederic Weisbecker
  0 siblings, 1 reply; 11+ messages in thread
From: Jiri Olsa @ 2011-07-19  9:57 UTC (permalink / raw)
  To: Frederic Weisbecker; +Cc: rostedt, a.p.zijlstra, linux-kernel, Ingo Molnar

On Mon, Jul 18, 2011 at 05:12:33PM +0200, Frederic Weisbecker wrote:
> On Mon, Jul 11, 2011 at 03:22:55PM +0200, Jiri Olsa wrote:
> > As the amount of kernel functions obtained by the ftrace:function tracepoint
> > is quite big, it's desirable to be able to set the filter on the ftrace
> > level.
> > 
> > Added PERF_EVENT_IOC_SET_FTRACE ioctl to be able to specify function filter
> > for perf event. The interface is the same as for the set_ftrace_filter file.
> > 
> > Also the same string parser is used as for the set_ftrace_filter file.
> 
> I'm not sure I understand why an ioctl is needed for that.
> Why not using the ftrace filter?
> 
> The first idea was to do something like this:
> 
> 	ip == func1 || ip == func2 || ....

The part where I got stuck is when you start to combine the fields
in the filter like:

	ip == func1 || ip_parent != func2
	ip == func1 || (ip == func2 && ip_parent != func3)

Which is ok when all functions are enabled and ip/parent_ip fields
are treated as values. But I'd need enable proper set of functions
based on above filter parsing.. which might not be that easy :)

So I thought it'd be easier to have the filter as above,
plus you could specify the amount of traced functions with
ioctl command.

> 
> But that makes a too long expression. So I thought we
> could bring support for the "+" operator and have expressions
> like:
> 
> 	ip == func1 + func2 + func3 + ....
> 
> 	ip == !func1

So would you like to have just simple filter for function event?
(embedded into the current event filter..) like:

	ip == func1 + func2 + func3 + 

Support just one 'ip' field and whatever we got in the parameter
we put into the ftrace filter.. ?

thanks,
jirka

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [RFC 3/4] perf, ftrace: Add new perf ioctl for function trace filter
  2011-07-19  9:57     ` Jiri Olsa
@ 2011-07-26 13:43       ` Frederic Weisbecker
  2011-07-27 12:17         ` Jiri Olsa
  0 siblings, 1 reply; 11+ messages in thread
From: Frederic Weisbecker @ 2011-07-26 13:43 UTC (permalink / raw)
  To: Jiri Olsa; +Cc: rostedt, a.p.zijlstra, linux-kernel, Ingo Molnar

On Tue, Jul 19, 2011 at 11:57:27AM +0200, Jiri Olsa wrote:
> On Mon, Jul 18, 2011 at 05:12:33PM +0200, Frederic Weisbecker wrote:
> > On Mon, Jul 11, 2011 at 03:22:55PM +0200, Jiri Olsa wrote:
> > > As the amount of kernel functions obtained by the ftrace:function tracepoint
> > > is quite big, it's desirable to be able to set the filter on the ftrace
> > > level.
> > > 
> > > Added PERF_EVENT_IOC_SET_FTRACE ioctl to be able to specify function filter
> > > for perf event. The interface is the same as for the set_ftrace_filter file.
> > > 
> > > Also the same string parser is used as for the set_ftrace_filter file.
> > 
> > I'm not sure I understand why an ioctl is needed for that.
> > Why not using the ftrace filter?
> > 
> > The first idea was to do something like this:
> > 
> > 	ip == func1 || ip == func2 || ....
> 
> The part where I got stuck is when you start to combine the fields
> in the filter like:
> 
> 	ip == func1 || ip_parent != func2
> 	ip == func1 || (ip == func2 && ip_parent != func3)
> 
> Which is ok when all functions are enabled and ip/parent_ip fields
> are treated as values. But I'd need enable proper set of functions
> based on above filter parsing.. which might not be that easy :)

Yeah. I believe that when we have ip and ip_parent together in
a filter, we should just tell the user we are not (or we
don't want to be) smart enough to handle that :)

I guess this is a corner case we don't need to worry much
about.

> So I thought it'd be easier to have the filter as above,
> plus you could specify the amount of traced functions with
> ioctl command.
> 
> > 
> > But that makes a too long expression. So I thought we
> > could bring support for the "+" operator and have expressions
> > like:
> > 
> > 	ip == func1 + func2 + func3 + ....
> > 
> > 	ip == !func1
> 
> So would you like to have just simple filter for function event?
> (embedded into the current event filter..) like:
> 
> 	ip == func1 + func2 + func3 + 
> 
> Support just one 'ip' field and whatever we got in the parameter
> we put into the ftrace filter.. ?

Yeah.

Ideally, the filter engine should split expressions into a tree
of ops:

	ip == func1 + func2 + func3 || ip == func4

should be parsed into:

            ||
            /\
           /  \
          /    \
         /      \
        ==       ==
       /  \     /  \
      /    \   ip   \
     /      \       func4
    ip       +       
            / \
           /   \
          +     \
         /  \   func3
        /    \
    func1   func2

And then pass that to ftrace that interprets that tree
by building set of functions on top of each node joined to
the other.

But that can be complicated to do, and perhaps a bit of
an overkill even for daily use of it.

Having a simple "ip == func + func2 + func3" expression support
should be enough I think. And we can reject expressions that don't
fit that pattern. Then if it becomes necessary one day to support
real expressions there, we can still switch to a real tree.

Does that look sane? 


> thanks,
> jirka

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [RFC 3/4] perf, ftrace: Add new perf ioctl for function trace filter
  2011-07-26 13:43       ` Frederic Weisbecker
@ 2011-07-27 12:17         ` Jiri Olsa
  2011-08-10 20:25           ` Steven Rostedt
  0 siblings, 1 reply; 11+ messages in thread
From: Jiri Olsa @ 2011-07-27 12:17 UTC (permalink / raw)
  To: Frederic Weisbecker; +Cc: rostedt, a.p.zijlstra, linux-kernel, Ingo Molnar

On Tue, Jul 26, 2011 at 03:43:44PM +0200, Frederic Weisbecker wrote:

SNIP

> Ideally, the filter engine should split expressions into a tree
> of ops:
> 
> 	ip == func1 + func2 + func3 || ip == func4
> 
> should be parsed into:
> 
>             ||
>             /\
>            /  \
>           /    \
>          /      \
>         ==       ==
>        /  \     /  \
>       /    \   ip   \
>      /      \       func4
>     ip       +       
>             / \
>            /   \
>           +     \
>          /  \   func3
>         /    \
>     func1   func2
> 
> And then pass that to ftrace that interprets that tree
> by building set of functions on top of each node joined to
> the other.
> 
> But that can be complicated to do, and perhaps a bit of
> an overkill even for daily use of it.
> 
> Having a simple "ip == func + func2 + func3" expression support

so the '+' is just shortcut for '||' ... like:

"ip == x1 || ip == x1" AND "ip == x1 + x2" mean the same thing

if we omit the '+' and keep just the whitespace we could use the function
name parser as used in set_ftrace_filter interface with no change ;)
but adding new separator should not be that hard..

> should be enough I think. And we can reject expressions that don't
> fit that pattern. Then if it becomes necessary one day to support
> real expressions there, we can still switch to a real tree.
> 
> Does that look sane? 

I'l make the change and send new version

thanks,
jirka

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [RFC 2/4] perf, ftrace: Add filter support for ftrace:function tracepoint
  2011-07-11 13:22 ` [RFC 2/4] perf, ftrace: Add filter support for ftrace:function tracepoint Jiri Olsa
@ 2011-08-10 19:34   ` Steven Rostedt
  0 siblings, 0 replies; 11+ messages in thread
From: Steven Rostedt @ 2011-08-10 19:34 UTC (permalink / raw)
  To: Jiri Olsa; +Cc: fweisbec, a.p.zijlstra, linux-kernel

Hi Jiri,

Sorry for the late response, I've been catching up on lots of other
things.


On Mon, 2011-07-11 at 15:22 +0200, Jiri Olsa wrote:

> +int ftrace_function_rec_ip(char *func, unsigned long long *ip)
> +{
> +	struct ftrace_page *pg;
> +	struct dyn_ftrace *rec;
> +	int len = strlen(func);
> +	int found = 0;
> +
> +	mutex_lock(&ftrace_lock);
> +
> +	do_for_each_ftrace_rec(pg, rec) {
> +		if (!ftrace_match_record(rec, NULL, func, len, MATCH_FULL))
> +			continue;
> +
> +		*ip = rec->ip;
> +		found = 1;
> +		break;

You can't use break here. The do_for_each_ftrace_rec() { }
while_for_each_ftrace_rec(); is a double loop. The break exits the first
loop, but then continues to the next loop. You must use a goto.

-- Steve

> +	} while_for_each_ftrace_rec();
> +
> +	mutex_unlock(&ftrace_lock);
> +
> +	return found ? 0 : -EINVAL;
> +}
> +



^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [RFC 3/4] perf, ftrace: Add new perf ioctl for function trace filter
  2011-07-27 12:17         ` Jiri Olsa
@ 2011-08-10 20:25           ` Steven Rostedt
  0 siblings, 0 replies; 11+ messages in thread
From: Steven Rostedt @ 2011-08-10 20:25 UTC (permalink / raw)
  To: Jiri Olsa; +Cc: Frederic Weisbecker, a.p.zijlstra, linux-kernel, Ingo Molnar

Again, sorry for the late reply.

On Wed, 2011-07-27 at 14:17 +0200, Jiri Olsa wrote:
> On Tue, Jul 26, 2011 at 03:43:44PM +0200, Frederic Weisbecker wrote:
> 
> SNIP
> 
> > Ideally, the filter engine should split expressions into a tree
> > of ops:
> > 
> > 	ip == func1 + func2 + func3 || ip == func4
> > 
> > should be parsed into:
> > 
> >             ||
> >             /\
> >            /  \
> >           /    \
> >          /      \
> >         ==       ==
> >        /  \     /  \
> >       /    \   ip   \
> >      /      \       func4
> >     ip       +       
> >             / \
> >            /   \
> >           +     \
> >          /  \   func3
> >         /    \
> >     func1   func2
> > 
> > And then pass that to ftrace that interprets that tree
> > by building set of functions on top of each node joined to
> > the other.
> > 
> > But that can be complicated to do, and perhaps a bit of
> > an overkill even for daily use of it.
> > 
> > Having a simple "ip == func + func2 + func3" expression support
> 
> so the '+' is just shortcut for '||' ... like:

Please, lets use something other than '+', as it can also be construed
as addition, and will probably confuse things. Or we change the '==' to
'|=' or something, and then comma separate it?

ip |= func1, func2, func3

Where we can say the '|=' ',' pair is a short cut of multiple a == b
|| ..

-- Steve

> 
> "ip == x1 || ip == x1" AND "ip == x1 + x2" mean the same thing
> 
> if we omit the '+' and keep just the whitespace we could use the function
> name parser as used in set_ftrace_filter interface with no change ;)
> but adding new separator should not be that hard..
> 
> > should be enough I think. And we can reject expressions that don't
> > fit that pattern. Then if it becomes necessary one day to support
> > real expressions there, we can still switch to a real tree.
> > 
> > Does that look sane? 
> 
> I'l make the change and send new version
> 
> thanks,
> jirka



^ permalink raw reply	[flat|nested] 11+ messages in thread

end of thread, other threads:[~2011-08-10 20:25 UTC | newest]

Thread overview: 11+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2011-07-11 13:22 [RFC 0/4] trace,perf: Enabling ftrace:function tracepoint for perf Jiri Olsa
2011-07-11 13:22 ` [RFC 1/4] perf, ftrace: Add perf support to use function tracepoint Jiri Olsa
2011-07-11 13:22 ` [RFC 2/4] perf, ftrace: Add filter support for ftrace:function tracepoint Jiri Olsa
2011-08-10 19:34   ` Steven Rostedt
2011-07-11 13:22 ` [RFC 3/4] perf, ftrace: Add new perf ioctl for function trace filter Jiri Olsa
2011-07-18 15:12   ` Frederic Weisbecker
2011-07-19  9:57     ` Jiri Olsa
2011-07-26 13:43       ` Frederic Weisbecker
2011-07-27 12:17         ` Jiri Olsa
2011-08-10 20:25           ` Steven Rostedt
2011-07-11 13:22 ` [RFC 4/4] perf, ftrace: Add ftrace option for perf stat command Jiri Olsa

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).