linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 1/2] perf, tools: Add PERF_PID
@ 2014-09-24 20:51 Andi Kleen
  2014-09-24 20:51 ` [PATCH 2/2] perf tools: apply -F filter to all previous events Andi Kleen
  2014-10-01 18:03 ` Adding a filter to events (instead of replacing one) was Re: [PATCH 1/2] perf, tools: Add PERF_PID Arnaldo Carvalho de Melo
  0 siblings, 2 replies; 5+ messages in thread
From: Andi Kleen @ 2014-09-24 20:51 UTC (permalink / raw)
  To: jolsa; +Cc: linux-kernel, namhyung, acme, Andi Kleen

From: Andi Kleen <ak@linux.intel.com>

It's currently difficult to filter out perf itself using a filter.
This can give cascading effects during IO tracing when the IO
perf does itself causes more trace output.

The best way to filter is to use the pid. But it's difficult to get the pid
of perf without using hacks.

Add a PERF_PID meta variable to the perf filter that contains the current pid.

With this patch the following works

% perf record -e syscalls:sys_enter_write -a --filter 'common_pid != PERF_PID' ...

This won't work for more complex perf pipe lines with multiple processes,
but at least solves the problem nicely for a single perf.

v2: Remove debug code. Don't use zero padding (Namhyung)
v3: Correct patch.
v4: Handle arbitary pid length.
v5: Fix a warning
Acked-by: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Andi Kleen <ak@linux.intel.com>
---
 tools/perf/Documentation/perf-record.txt |  2 +-
 tools/perf/util/parse-events.c           | 21 ++++++++++++++++++++-
 2 files changed, 21 insertions(+), 2 deletions(-)

diff --git a/tools/perf/Documentation/perf-record.txt b/tools/perf/Documentation/perf-record.txt
index d460049..b6c5e51 100644
--- a/tools/perf/Documentation/perf-record.txt
+++ b/tools/perf/Documentation/perf-record.txt
@@ -41,7 +41,7 @@ OPTIONS
           'mem:0x1000:rw'.
 
 --filter=<filter>::
-        Event filter.
+        Event filter. PERF_PID represents the perf pid.
 
 -a::
 --all-cpus::
diff --git a/tools/perf/util/parse-events.c b/tools/perf/util/parse-events.c
index 1e15df1..9c4bab8 100644
--- a/tools/perf/util/parse-events.c
+++ b/tools/perf/util/parse-events.c
@@ -967,6 +967,9 @@ int parse_filter(const struct option *opt, const char *str,
 {
 	struct perf_evlist *evlist = *(struct perf_evlist **)opt->value;
 	struct perf_evsel *last = NULL;
+	char *pid, buf[30], *o;
+	int len, plen;
+	const char *p;
 
 	if (evlist->nr_entries > 0)
 		last = perf_evlist__last(evlist);
@@ -977,12 +980,28 @@ int parse_filter(const struct option *opt, const char *str,
 		return -1;
 	}
 
-	last->filter = strdup(str);
+	plen = sprintf(buf, "%d", getpid());
+	len = strlen(str) + 1;
+	for (pid = strstr(str, "PERF_PID"); pid;
+	     pid = strstr(pid + 1, "PERF_PID")) {
+		len += plen - 8;
+	}
+
+	last->filter = malloc(len);
 	if (last->filter == NULL) {
 		fprintf(stderr, "not enough memory to hold filter string\n");
 		return -1;
 	}
 
+	o = last->filter;
+	for (p = str; *p; ) {
+		if (*p == 'P' && !strncmp(p, "PERF_PID", 8)) {
+			strcpy(o, buf);
+			o += plen;
+			p += sizeof("PERF_PID") - 1;
+		} else
+			*o++ = *p++;
+	}
 	return 0;
 }
 
-- 
1.9.3


^ permalink raw reply related	[flat|nested] 5+ messages in thread

* [PATCH 2/2] perf tools: apply -F filter to all previous events
  2014-09-24 20:51 [PATCH 1/2] perf, tools: Add PERF_PID Andi Kleen
@ 2014-09-24 20:51 ` Andi Kleen
  2014-10-01 18:03 ` Adding a filter to events (instead of replacing one) was Re: [PATCH 1/2] perf, tools: Add PERF_PID Arnaldo Carvalho de Melo
  1 sibling, 0 replies; 5+ messages in thread
From: Andi Kleen @ 2014-09-24 20:51 UTC (permalink / raw)
  To: jolsa; +Cc: linux-kernel, namhyung, acme, Andi Kleen, brendan.d.gregg

From: Andi Kleen <ak@linux.intel.com>

Right now you would need to duplicate the filter definition
for every trace point event. Instead apply it to each previous
one that did not have its own filter. If you have some tracepoint
events you don't want to give filter you can put them after the
last one.

Cc: brendan.d.gregg@gmail.com
Signed-off-by: Andi Kleen <ak@linux.intel.com>
---
 tools/perf/util/parse-events.c | 48 +++++++++++++++++++++++++++---------------
 1 file changed, 31 insertions(+), 17 deletions(-)

diff --git a/tools/perf/util/parse-events.c b/tools/perf/util/parse-events.c
index 9c4bab8..9bf4173 100644
--- a/tools/perf/util/parse-events.c
+++ b/tools/perf/util/parse-events.c
@@ -962,23 +962,12 @@ int parse_events_option(const struct option *opt, const char *str,
 	return ret;
 }
 
-int parse_filter(const struct option *opt, const char *str,
-		 int unset __maybe_unused)
+static char *gen_filter(const char *str)
 {
-	struct perf_evlist *evlist = *(struct perf_evlist **)opt->value;
-	struct perf_evsel *last = NULL;
 	char *pid, buf[30], *o;
 	int len, plen;
 	const char *p;
-
-	if (evlist->nr_entries > 0)
-		last = perf_evlist__last(evlist);
-
-	if (last == NULL || last->attr.type != PERF_TYPE_TRACEPOINT) {
-		fprintf(stderr,
-			"-F option should follow a -e tracepoint option\n");
-		return -1;
-	}
+	char *filter;
 
 	plen = sprintf(buf, "%d", getpid());
 	len = strlen(str) + 1;
@@ -987,13 +976,13 @@ int parse_filter(const struct option *opt, const char *str,
 		len += plen - 8;
 	}
 
-	last->filter = malloc(len);
-	if (last->filter == NULL) {
+	filter = malloc(len);
+	if (filter == NULL) {
 		fprintf(stderr, "not enough memory to hold filter string\n");
-		return -1;
+		return NULL;
 	}
 
-	o = last->filter;
+	o = filter;
 	for (p = str; *p; ) {
 		if (*p == 'P' && !strncmp(p, "PERF_PID", 8)) {
 			strcpy(o, buf);
@@ -1002,6 +991,31 @@ int parse_filter(const struct option *opt, const char *str,
 		} else
 			*o++ = *p++;
 	}
+	return filter;
+}
+
+int parse_filter(const struct option *opt, const char *str,
+		 int unset __maybe_unused)
+{
+	struct perf_evlist *evlist = *(struct perf_evlist **)opt->value;
+	struct perf_evsel *evsel;
+	int nr_tp = 0;
+
+	evlist__for_each(evlist, evsel) {
+		if (evsel->attr.type == PERF_TYPE_TRACEPOINT &&
+			!evsel->filter) {
+			evsel->filter = gen_filter(str);
+			if (!evsel->filter)
+				return -1;
+			nr_tp++;
+		}
+	}
+
+	if (nr_tp == 0) {
+		fprintf(stderr,
+			"-F option should follow a -e tracepoint option\n");
+		return -1;
+	}
 	return 0;
 }
 
-- 
1.9.3


^ permalink raw reply related	[flat|nested] 5+ messages in thread

* Adding a filter to events (instead of replacing one) was Re: [PATCH 1/2] perf, tools: Add PERF_PID
  2014-09-24 20:51 [PATCH 1/2] perf, tools: Add PERF_PID Andi Kleen
  2014-09-24 20:51 ` [PATCH 2/2] perf tools: apply -F filter to all previous events Andi Kleen
@ 2014-10-01 18:03 ` Arnaldo Carvalho de Melo
  2014-10-01 22:02   ` Andi Kleen
  1 sibling, 1 reply; 5+ messages in thread
From: Arnaldo Carvalho de Melo @ 2014-10-01 18:03 UTC (permalink / raw)
  To: Andi Kleen
  Cc: jolsa, linux-kernel, namhyung, Andi Kleen, Brendan Gregg, Steven Rostedt

[-- Attachment #1: Type: text/plain, Size: 4613 bytes --]

Em Wed, Sep 24, 2014 at 01:51:08PM -0700, Andi Kleen escreveu:

> It's currently difficult to filter out perf itself using a filter.
> This can give cascading effects during IO tracing when the IO perf
> does itself causes more trace output.
 
> The best way to filter is to use the pid. But it's difficult to get the pid
> of perf without using hacks.
 
> Add a PERF_PID meta variable to the perf filter that contains the current pid.
 
> With this patch the following works
 
> % perf record -e syscalls:sys_enter_write -a --filter 'common_pid != PERF_PID' ...

So I tried this one now and saw the other patch, that applies the
--filter to all events, while trying I got:

[root@zoo ~]# perf record --filter "common_pid != PERF_PID" -a
-F option should follow a -e tracepoint option

 usage: perf record [<options>] [<command>]
    or: perf record [<options>] -- <command> [<options>]

        --filter <filter>
                          event filter
[root@zoo ~]#

A bug there, as '-F' is for frequency, will fix, but the message tells
that --filter affects just the last event, i.e. whoever implemented this
wanted to apply filters to specific events, and those need to be
tracepoints (as we can't apply filters to other PERF_TYPE_ events (we
should!).

So, more surgery would be needed to keep the existing behaviour, i.e. to
allow per event filters, which is what the current does (or should,
judging by the error message above), and to provide a new one that would
apply the given filter to all suitable events, i.e. all tracepoint
events (nowadays, others in the future).

I give a shot at implementing it in 'perf trace', by introducing a
perf_evlist__filter_pid() that would then be used by default, but
stopped working as to really make it good by default I would have to
filter out activity on the gnome-terminal where 'perf trace' was
running, to avoid another feedback loop.

But now I think that perf_evlist__filter_pid() thing is what we want
here and then we should add a --filter-self option to 'perf record',
that would just do:

	perf_evlist__filter_pid(evlist, getpid());

Agreed?

The patch is attached, but it is still incomplete, as I haven't checked
if we set a filter we end up reseting whatever filter is in place, or if
we have to use some syntax (prefixing the filter with '+'?) to add a
filter instead of replacing what is there.

- Arnaldo
 
> This won't work for more complex perf pipe lines with multiple processes,
> but at least solves the problem nicely for a single perf.
> 
> v2: Remove debug code. Don't use zero padding (Namhyung)
> v3: Correct patch.
> v4: Handle arbitary pid length.
> v5: Fix a warning
> Acked-by: Namhyung Kim <namhyung@kernel.org>
> Signed-off-by: Andi Kleen <ak@linux.intel.com>
> ---
>  tools/perf/Documentation/perf-record.txt |  2 +-
>  tools/perf/util/parse-events.c           | 21 ++++++++++++++++++++-
>  2 files changed, 21 insertions(+), 2 deletions(-)
> 
> diff --git a/tools/perf/Documentation/perf-record.txt b/tools/perf/Documentation/perf-record.txt
> index d460049..b6c5e51 100644
> --- a/tools/perf/Documentation/perf-record.txt
> +++ b/tools/perf/Documentation/perf-record.txt
> @@ -41,7 +41,7 @@ OPTIONS
>            'mem:0x1000:rw'.
>  
>  --filter=<filter>::
> -        Event filter.
> +        Event filter. PERF_PID represents the perf pid.
>  
>  -a::
>  --all-cpus::
> diff --git a/tools/perf/util/parse-events.c b/tools/perf/util/parse-events.c
> index 1e15df1..9c4bab8 100644
> --- a/tools/perf/util/parse-events.c
> +++ b/tools/perf/util/parse-events.c
> @@ -967,6 +967,9 @@ int parse_filter(const struct option *opt, const char *str,
>  {
>  	struct perf_evlist *evlist = *(struct perf_evlist **)opt->value;
>  	struct perf_evsel *last = NULL;
> +	char *pid, buf[30], *o;
> +	int len, plen;
> +	const char *p;
>  
>  	if (evlist->nr_entries > 0)
>  		last = perf_evlist__last(evlist);
> @@ -977,12 +980,28 @@ int parse_filter(const struct option *opt, const char *str,
>  		return -1;
>  	}
>  
> -	last->filter = strdup(str);
> +	plen = sprintf(buf, "%d", getpid());
> +	len = strlen(str) + 1;
> +	for (pid = strstr(str, "PERF_PID"); pid;
> +	     pid = strstr(pid + 1, "PERF_PID")) {
> +		len += plen - 8;
> +	}
> +
> +	last->filter = malloc(len);
>  	if (last->filter == NULL) {
>  		fprintf(stderr, "not enough memory to hold filter string\n");
>  		return -1;
>  	}
>  
> +	o = last->filter;
> +	for (p = str; *p; ) {
> +		if (*p == 'P' && !strncmp(p, "PERF_PID", 8)) {
> +			strcpy(o, buf);
> +			o += plen;
> +			p += sizeof("PERF_PID") - 1;
> +		} else
> +			*o++ = *p++;
> +	}
>  	return 0;
>  }
>  
> -- 
> 1.9.3

[-- Attachment #2: trace_set_filter.patch --]
[-- Type: text/plain, Size: 2303 bytes --]

diff --git a/tools/perf/builtin-trace.c b/tools/perf/builtin-trace.c
index 09bcf2393910..d64e8c619ff6 100644
--- a/tools/perf/builtin-trace.c
+++ b/tools/perf/builtin-trace.c
@@ -2105,6 +2105,10 @@ static int trace__run(struct trace *trace, int argc, const char **argv)
 	if (err < 0)
 		goto out_error_open;
 
+	err = perf_evlist__set_filter_pid(evlist, getpid());
+	if (err < 0)
+		goto out_error_filter;
+
 	err = perf_evlist__mmap(evlist, trace->opts.mmap_pages, false);
 	if (err < 0) {
 		fprintf(trace->output, "Couldn't mmap the events: %s\n",
@@ -2216,6 +2220,11 @@ out_error_open:
 out_error:
 	fprintf(trace->output, "%s\n", errbuf);
 	goto out_delete_evlist;
+
+out_error_filter:
+	fprintf(trace->output, "Couldn't set 'perf trace' pid filter: %s\n",
+                        strerror_r(errno, errbuf, sizeof(errbuf)));
+	goto out_delete_evlist;
 }
 }
 
diff --git a/tools/perf/util/evlist.c b/tools/perf/util/evlist.c
index 3cebc9a8d52e..06e54304f63f 100644
--- a/tools/perf/util/evlist.c
+++ b/tools/perf/util/evlist.c
@@ -1041,6 +1041,27 @@ int perf_evlist__set_filter(struct perf_evlist *evlist, const char *filter)
 	return err;
 }
 
+int perf_evlist__set_filter_pid(struct perf_evlist *evlist, pid_t pid)
+{
+	struct perf_evsel *evsel;
+	int err = 0;
+	const int ncpus = cpu_map__nr(evlist->cpus),
+		  nthreads = thread_map__nr(evlist->threads);
+	char filter[64];
+
+	scnprintf(filter, sizeof(filter), "common_pid != %d", pid);
+
+	evlist__for_each(evlist, evsel) {
+		if (evsel->attr.type != PERF_TYPE_TRACEPOINT)
+			continue;
+		err = perf_evsel__set_filter(evsel, ncpus, nthreads, filter);
+		if (err)
+			break;
+	}
+
+	return err;
+}
+
 bool perf_evlist__valid_sample_type(struct perf_evlist *evlist)
 {
 	struct perf_evsel *pos;
diff --git a/tools/perf/util/evlist.h b/tools/perf/util/evlist.h
index bd312b01e876..9c4b9410d511 100644
--- a/tools/perf/util/evlist.h
+++ b/tools/perf/util/evlist.h
@@ -77,6 +77,7 @@ int perf_evlist__add_newtp(struct perf_evlist *evlist,
 			   const char *sys, const char *name, void *handler);
 
 int perf_evlist__set_filter(struct perf_evlist *evlist, const char *filter);
+int perf_evlist__set_filter_pid(struct perf_evlist *evlist, pid_t pid);
 
 struct perf_evsel *
 perf_evlist__find_tracepoint_by_id(struct perf_evlist *evlist, int id);

^ permalink raw reply related	[flat|nested] 5+ messages in thread

* Re: Adding a filter to events (instead of replacing one) was Re: [PATCH 1/2] perf, tools: Add PERF_PID
  2014-10-01 18:03 ` Adding a filter to events (instead of replacing one) was Re: [PATCH 1/2] perf, tools: Add PERF_PID Arnaldo Carvalho de Melo
@ 2014-10-01 22:02   ` Andi Kleen
  2014-10-01 22:13     ` Arnaldo Carvalho de Melo
  0 siblings, 1 reply; 5+ messages in thread
From: Andi Kleen @ 2014-10-01 22:02 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: Andi Kleen, jolsa, linux-kernel, namhyung, Brendan Gregg, Steven Rostedt

On Wed, Oct 01, 2014 at 03:03:16PM -0300, Arnaldo Carvalho de Melo wrote:
> Em Wed, Sep 24, 2014 at 01:51:08PM -0700, Andi Kleen escreveu:
> 
> > It's currently difficult to filter out perf itself using a filter.
> > This can give cascading effects during IO tracing when the IO perf
> > does itself causes more trace output.
>  
> > The best way to filter is to use the pid. But it's difficult to get the pid
> > of perf without using hacks.
>  
> > Add a PERF_PID meta variable to the perf filter that contains the current pid.
>  
> > With this patch the following works
>  
> > % perf record -e syscalls:sys_enter_write -a --filter 'common_pid != PERF_PID' ...
> 
> So I tried this one now and saw the other patch, that applies the
> --filter to all events, while trying I got:

Patch seems reasonable to me.

However adding PERF_PID and sanitizing --filter are really two
different things and should probably not be mixed in a patch.

-Andi

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: Adding a filter to events (instead of replacing one) was Re: [PATCH 1/2] perf, tools: Add PERF_PID
  2014-10-01 22:02   ` Andi Kleen
@ 2014-10-01 22:13     ` Arnaldo Carvalho de Melo
  0 siblings, 0 replies; 5+ messages in thread
From: Arnaldo Carvalho de Melo @ 2014-10-01 22:13 UTC (permalink / raw)
  To: Andi Kleen
  Cc: Andi Kleen, jolsa, linux-kernel, namhyung, Brendan Gregg, Steven Rostedt

Em Wed, Oct 01, 2014 at 03:02:18PM -0700, Andi Kleen escreveu:
> On Wed, Oct 01, 2014 at 03:03:16PM -0300, Arnaldo Carvalho de Melo wrote:
> > Em Wed, Sep 24, 2014 at 01:51:08PM -0700, Andi Kleen escreveu:

> > > It's currently difficult to filter out perf itself using a filter.
> > > This can give cascading effects during IO tracing when the IO perf
> > > does itself causes more trace output.

> > > The best way to filter is to use the pid. But it's difficult to get the pid
> > > of perf without using hacks.

> > > Add a PERF_PID meta variable to the perf filter that contains the current pid.

> > > With this patch the following works

> > > % perf record -e syscalls:sys_enter_write -a --filter 'common_pid != PERF_PID' ...

> > So I tried this one now and saw the other patch, that applies the
> > --filter to all events, while trying I got:

> Patch seems reasonable to me.

> However adding PERF_PID and sanitizing --filter are really two
> different things and should probably not be mixed in a patch.

Yes, there are two things, but what seems to be wanted first is a way to
exclude 'perf record' samples, for all events.

Being able to specify PERF_PID on a filter so that some events can
include 'perf record' samples and some not seems to be something not
really needed at this point, i.e. just doing:

  perf record --filter-self -e syscalls:sys_enter_write -a ...

Is shorter and doesn't breaks the current --filter semantic (apply just
to the last tracepoint informed in the cmdline, not to all tracepoints)
like in your second patch.

So probably what is best to do is to finish the patch I sent here, which
will cover the filter-out-perf-tooling-samples usecase, and then, if
people still think it is needed, introduce the PERF_PID meta filter
variable.

- Arnaldo

^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2014-10-01 22:13 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2014-09-24 20:51 [PATCH 1/2] perf, tools: Add PERF_PID Andi Kleen
2014-09-24 20:51 ` [PATCH 2/2] perf tools: apply -F filter to all previous events Andi Kleen
2014-10-01 18:03 ` Adding a filter to events (instead of replacing one) was Re: [PATCH 1/2] perf, tools: Add PERF_PID Arnaldo Carvalho de Melo
2014-10-01 22:02   ` Andi Kleen
2014-10-01 22:13     ` Arnaldo Carvalho de Melo

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).