All of lore.kernel.org
 help / color / mirror / Atom feed
* tracepoint filter problems
@ 2015-10-29 15:27 Arnaldo Carvalho de Melo
  2015-11-02  9:50 ` Peter Zijlstra
  0 siblings, 1 reply; 4+ messages in thread
From: Arnaldo Carvalho de Melo @ 2015-10-29 15:27 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Frédéric Weisbecker, Jiri Olsa, David Ahern,
	Adrian Hunter, Wang Nan, Steven Rostedt,
	Linux Kernel Mailing List

Hi,

	In 'perf trace' we have an equivalent to strace's -e option, i.e. a way
to state which syscalls one wants to see, so, when we do:

[root@felicio linux]# trace -e open
     0.348 ( 0.005 ms): ls/6887 open(filename: /etc/ld.so.cache, flags: CLOEXEC                       ) = 3
     0.363 ( 0.005 ms): ls/6887 open(filename: /lib64/libselinux.so.1, flags: CLOEXEC                 ) = 3
     0.400 ( 0.006 ms): ls/6887 open(filename: /lib64/libcap.so.2, flags: CLOEXEC                     ) = 3
     0.427 ( 0.006 ms): ls/6887 open(filename: /lib64/libacl.so.1, flags: CLOEXEC                     ) = 3
     0.454 ( 0.006 ms): ls/6887 open(filename: /lib64/libc.so.6, flags: CLOEXEC                       ) = 3
     0.497 ( 0.006 ms): ls/6887 open(filename: /lib64/libpcre.so.1, flags: CLOEXEC                    ) = 3
     0.533 ( 0.006 ms): ls/6887 open(filename: /lib64/liblzma.so.5, flags: CLOEXEC                    ) = 3
     0.561 ( 0.006 ms): ls/6887 open(filename: /lib64/libdl.so.2, flags: CLOEXEC                      ) = 3
     0.586 ( 0.006 ms): ls/6887 open(filename: /lib64/libattr.so.1, flags: CLOEXEC                    ) = 3
     0.611 ( 0.006 ms): ls/6887 open(filename: /lib64/libpthread.so.0, flags: CLOEXEC                 ) = 3
     0.859 ( 0.007 ms): ls/6887 open(filename: /usr/lib/locale/locale-archive, flags: CLOEXEC         ) = 3
^C[root@felicio linux]#

I.e. tracing system wide, looking for any program that calls the open syscall, it works, as it
used the PERF_EVENT_IOC_SET_FILTER to set the relevant 'id == 486' to filter it:

[root@felicio linux]# head -2 /sys/kernel/debug/tracing/events/syscalls/sys_enter_open/format 
name: sys_enter_open
ID: 486

Ok, this is how it should be, all works, but if I try instead calling a program
that in turn calls another, the filter will apply for the first in line, but
not for the calls made by another program called from it.

In the example below 'perf trace' will ask to see just the 'open' syscall, and
it works for the started workload, namely 'perf record', but then it'll call
'sleep 100000000000' and there the filter doesn't apply, bummer :-\

The inherit bit is set, sure, as we can see when adding -vv to the 'perf trace'
call, ideas?

I came accross this wanting to trace perf record with a bpf proggie that would
call another workload, and I wanted to see just the perf_event_open and bpf
syscalls, but ended up seeing all the syscalls in the leaf workload.

- Arnaldo

[root@felicio linux]# perf trace -e open perf record -e /tmp/foo.o sleep 100000000000
     0.091 ( 0.007 ms): perf/6875 open(filename: 0x96590bf0, flags: CLOEXEC                             ) = -1 ENOENT No such file or directory
     0.096 ( 0.002 ms): perf/6875 open(filename: 0x96590bf0, flags: CLOEXEC                             ) = -1 ENOENT No such file or directory
     0.100 ( 0.002 ms): perf/6875 open(filename: 0x96590bf0, flags: CLOEXEC                             ) = -1 ENOENT No such file or directory
     0.103 ( 0.002 ms): perf/6875 open(filename: 0x96590bf0, flags: CLOEXEC                             ) = -1 ENOENT No such file or directory
     0.110 ( 0.003 ms): perf/6875 open(filename: 0x7629da98, flags: CLOEXEC                             ) = 3
   386.185 ( 0.005 ms): perf/6875 open(filename: 0x965868a0, mode: 20                                   ) = 10
   386.800 ( 0.008 ms): perf/6875 open(filename: 0x96588ae0, mode: 20                                   ) = 8
<SNIP tons of open syscalls>
   387.426 ( 0.002 ms): :6876/6876 set_robust_list(head: 0x7faa76484a60, len: 24                         ) = 0
   387.446 ( 0.002 ms): :6876/6876 rt_sigaction(sig: TERM, act: 0x7ffe9658ebc0, oact: 0x7ffe9658ec60, sigsetsize: 8) = 0
   387.449 ( 0.001 ms): :6876/6876 close(fd: 5                                                           ) = 0
   387.450 ( 0.001 ms): :6876/6876 close(fd: 8                                                           ) = 0
<SNIP some syscalls for pid 6876>
   391.573 ( 0.002 ms): perf/6876 execve(arg0: 140731420831140, arg1: 140731420841520, arg2: 140731420841568, arg3: 140731420830688, arg4: 7307199665339051828, arg5: 140731420830816) = -2
   391.580 ( 0.006 ms): perf/6876 execve(arg0: 140731420831149, arg1: 140731420841520, arg2: 140731420841568, arg3: 140731420830688, arg4: 7307199665339051828, arg5: 140731420830816) = -2
   392.009 ( 0.428 ms): sleep/6876 execve(arg0: 140731420831150, arg1: 140731420841520, arg2: 140731420841568, arg3: 140731420830688, arg4: 7307199665339051828, arg5: 140731420830816) = 0
   392.030 ( 0.001 ms): sleep/6876 brk(                                                                  ) = 0x14f1000
   392.042 ( 0.003 ms): sleep/6876 mmap(len: 4096, prot: READ|WRITE, flags: PRIVATE|ANONYMOUS, fd: -1    ) = 0x7f9223117000
   392.050 ( 0.004 ms): sleep/6876 access(filename: 0x22f151a0, mode: R                                  ) = -1 ENOENT No such file or directory
   392.058 ( 0.004 ms): sleep/6876 open(filename: 0x22f13a98, flags: CLOEXEC                             ) = 3
   392.060 ( 0.001 ms): sleep/6876 fstat(fd: 3, statbuf: 0x7ffeb5f28660                                  ) = 0
   392.064 ( 0.003 ms): sleep/6876 mmap(len: 62420, prot: READ, flags: PRIVATE, fd: 3                    ) = 0x7f9223107000
   392.066 ( 0.001 ms): sleep/6876 close(fd: 3                                                           ) = 0
   392.076 ( 0.005 ms): sleep/6876 open(filename: 0x23114dcf, flags: CLOEXEC                             ) = 3
<SNIP> some more

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: tracepoint filter problems
  2015-10-29 15:27 tracepoint filter problems Arnaldo Carvalho de Melo
@ 2015-11-02  9:50 ` Peter Zijlstra
  2015-11-03  1:13   ` Arnaldo Carvalho de Melo
  2015-11-10  6:40   ` [tip:perf/urgent] perf: Fix inherited events vs. tracepoint filters tip-bot for Peter Zijlstra
  0 siblings, 2 replies; 4+ messages in thread
From: Peter Zijlstra @ 2015-11-02  9:50 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: Frédéric Weisbecker, Jiri Olsa, David Ahern,
	Adrian Hunter, Wang Nan, Steven Rostedt,
	Linux Kernel Mailing List

On Thu, Oct 29, 2015 at 12:27:42PM -0300, Arnaldo Carvalho de Melo wrote:
> Hi,
> 
> 	In 'perf trace' we have an equivalent to strace's -e option, i.e. a way
> to state which syscalls one wants to see, so, when we do:
> 
> [root@felicio linux]# trace -e open
>      0.348 ( 0.005 ms): ls/6887 open(filename: /etc/ld.so.cache, flags: CLOEXEC                       ) = 3
>      0.363 ( 0.005 ms): ls/6887 open(filename: /lib64/libselinux.so.1, flags: CLOEXEC                 ) = 3
>      0.400 ( 0.006 ms): ls/6887 open(filename: /lib64/libcap.so.2, flags: CLOEXEC                     ) = 3
>      0.427 ( 0.006 ms): ls/6887 open(filename: /lib64/libacl.so.1, flags: CLOEXEC                     ) = 3
>      0.454 ( 0.006 ms): ls/6887 open(filename: /lib64/libc.so.6, flags: CLOEXEC                       ) = 3
>      0.497 ( 0.006 ms): ls/6887 open(filename: /lib64/libpcre.so.1, flags: CLOEXEC                    ) = 3
>      0.533 ( 0.006 ms): ls/6887 open(filename: /lib64/liblzma.so.5, flags: CLOEXEC                    ) = 3
>      0.561 ( 0.006 ms): ls/6887 open(filename: /lib64/libdl.so.2, flags: CLOEXEC                      ) = 3
>      0.586 ( 0.006 ms): ls/6887 open(filename: /lib64/libattr.so.1, flags: CLOEXEC                    ) = 3
>      0.611 ( 0.006 ms): ls/6887 open(filename: /lib64/libpthread.so.0, flags: CLOEXEC                 ) = 3
>      0.859 ( 0.007 ms): ls/6887 open(filename: /usr/lib/locale/locale-archive, flags: CLOEXEC         ) = 3
> ^C[root@felicio linux]#
> 
> I.e. tracing system wide, looking for any program that calls the open syscall, it works, as it
> used the PERF_EVENT_IOC_SET_FILTER to set the relevant 'id == 486' to filter it:
> 
> [root@felicio linux]# head -2 /sys/kernel/debug/tracing/events/syscalls/sys_enter_open/format 
> name: sys_enter_open
> ID: 486
> 
> Ok, this is how it should be, all works, but if I try instead calling a program
> that in turn calls another, the filter will apply for the first in line, but
> not for the calls made by another program called from it.
> 
> In the example below 'perf trace' will ask to see just the 'open' syscall, and
> it works for the started workload, namely 'perf record', but then it'll call
> 'sleep 100000000000' and there the filter doesn't apply, bummer :-\
> 
> The inherit bit is set, sure, as we can see when adding -vv to the 'perf trace'
> call, ideas?

--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -6939,6 +6939,10 @@ static int perf_tp_filter_match(struct p
 {
 	void *record = data->raw->data;
 
+	/* only top level events have filters set */
+	if (event->parent)
+		event = event->parent;
+
 	if (likely(!event->filter) || filter_match_preds(event->filter, record))
 		return 1;
 	return 0;

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: tracepoint filter problems
  2015-11-02  9:50 ` Peter Zijlstra
@ 2015-11-03  1:13   ` Arnaldo Carvalho de Melo
  2015-11-10  6:40   ` [tip:perf/urgent] perf: Fix inherited events vs. tracepoint filters tip-bot for Peter Zijlstra
  1 sibling, 0 replies; 4+ messages in thread
From: Arnaldo Carvalho de Melo @ 2015-11-03  1:13 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Frédéric Weisbecker, Jiri Olsa, David Ahern,
	Adrian Hunter, Wang Nan, Steven Rostedt,
	Linux Kernel Mailing List

Em Mon, Nov 02, 2015 at 10:50:51AM +0100, Peter Zijlstra escreveu:
> On Thu, Oct 29, 2015 at 12:27:42PM -0300, Arnaldo Carvalho de Melo wrote:
> > In the example below 'perf trace' will ask to see just the 'open' syscall, and
> > it works for the started workload, namely 'perf record', but then it'll call
> > 'sleep 100000000000' and there the filter doesn't apply, bummer :-\

> > The inherit bit is set, sure, as we can see when adding -vv to the 'perf trace'
> > call, ideas?
 
> --- a/kernel/events/core.c
> +++ b/kernel/events/core.c
> @@ -6939,6 +6939,10 @@ static int perf_tp_filter_match(struct p
>  {
>  	void *record = data->raw->data;
>  
> +	/* only top level events have filters set */
> +	if (event->parent)
> +		event = event->parent;
> +
>  	if (likely(!event->filter) || filter_match_preds(event->filter, record))
>  		return 1;
>  	return 0;

Thank you!

Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>

  [root@felicio ~]# trace -e bpf,nanosleep perf record -e /tmp/foo.o sleep 1
     385.246 ( 0.062 ms): perf/13761 bpf(cmd: PROG_LOAD, uattr: 0x7ffd9900ac90, size: 48) = 3
    1424.278 (999.497 ms): sleep/13762 nanosleep(rqtp: 0x7ffd4583a6e0                   ) = 0
  [ perf record: Woken up 1 times to write data ]
  [ perf record: Captured and wrote 0.012 MB perf.data ]
  [root@felicio ~]#

- Arnaldo

^ permalink raw reply	[flat|nested] 4+ messages in thread

* [tip:perf/urgent] perf: Fix inherited events vs. tracepoint filters
  2015-11-02  9:50 ` Peter Zijlstra
  2015-11-03  1:13   ` Arnaldo Carvalho de Melo
@ 2015-11-10  6:40   ` tip-bot for Peter Zijlstra
  1 sibling, 0 replies; 4+ messages in thread
From: tip-bot for Peter Zijlstra @ 2015-11-10  6:40 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: jolsa, rostedt, fweisbec, hpa, torvalds, linux-kernel, tglx,
	wangnan0, peterz, dsahern, acme, adrian.hunter, jolsa, mingo,
	acme

Commit-ID:  b71b437eedaed985062492565d9d421d975ae845
Gitweb:     http://git.kernel.org/tip/b71b437eedaed985062492565d9d421d975ae845
Author:     Peter Zijlstra <peterz@infradead.org>
AuthorDate: Mon, 2 Nov 2015 10:50:51 +0100
Committer:  Ingo Molnar <mingo@kernel.org>
CommitDate: Mon, 9 Nov 2015 16:13:11 +0100

perf: Fix inherited events vs. tracepoint filters

Arnaldo reported that tracepoint filters seem to misbehave (ie. not
apply) on inherited events.

The fix is obvious; filters are only set on the actual (parent)
event, use the normal pattern of using this parent event for filters.
This is safe because each child event has a reference to it.

Reported-by: Arnaldo Carvalho de Melo <acme@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@kernel.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Wang Nan <wangnan0@huawei.com>
Cc: stable@vger.kernel.org
Link: http://lkml.kernel.org/r/20151102095051.GN17308@twins.programming.kicks-ass.net
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 kernel/events/core.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/kernel/events/core.c b/kernel/events/core.c
index f8e5c44..98a4b9d 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -6909,6 +6909,10 @@ static int perf_tp_filter_match(struct perf_event *event,
 {
 	void *record = data->raw->data;
 
+	/* only top level events have filters set */
+	if (event->parent)
+		event = event->parent;
+
 	if (likely(!event->filter) || filter_match_preds(event->filter, record))
 		return 1;
 	return 0;

^ permalink raw reply related	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2015-11-10  6:41 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2015-10-29 15:27 tracepoint filter problems Arnaldo Carvalho de Melo
2015-11-02  9:50 ` Peter Zijlstra
2015-11-03  1:13   ` Arnaldo Carvalho de Melo
2015-11-10  6:40   ` [tip:perf/urgent] perf: Fix inherited events vs. tracepoint filters tip-bot for Peter Zijlstra

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.