[GIT PULL] perf/core improvements and fixes

linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed

* [GIT PULL] perf/core improvements and fixes
@ 2019-07-22 17:38 Arnaldo Carvalho de Melo
  2019-07-22 17:38 ` [PATCH 01/37] perf include bpf: Add bpf_tail_call() prototype Arnaldo Carvalho de Melo
                   ` (36 more replies)
  0 siblings, 37 replies; 38+ messages in thread
From: Arnaldo Carvalho de Melo @ 2019-07-22 17:38 UTC (permalink / raw)
  To: Ingo Molnar, Thomas Gleixner
  Cc: Jiri Olsa, Namhyung Kim, Clark Williams, linux-kernel,
	linux-perf-users, Arnaldo Carvalho de Melo, Alexey Budankov,
	Andi Kleen, Cong Wang, Denis Bakhvalov, Numfor Mbiziwo-Tiapo

Hi Ingo,

	Please consider pulling,

Best regards,

- Arnaldo


^ permalink raw reply	[flat|nested] 38+ messages in thread

* [PATCH 01/37] perf include bpf: Add bpf_tail_call() prototype
  2019-07-22 17:38 [GIT PULL] perf/core improvements and fixes Arnaldo Carvalho de Melo
@ 2019-07-22 17:38 ` Arnaldo Carvalho de Melo
  2019-07-22 17:38 ` [PATCH 02/37] perf bpf: Do not attach a BPF prog to a tracepoint if its name starts with ! Arnaldo Carvalho de Melo
                   ` (35 subsequent siblings)
  36 siblings, 0 replies; 38+ messages in thread
From: Arnaldo Carvalho de Melo @ 2019-07-22 17:38 UTC (permalink / raw)
  To: Ingo Molnar, Thomas Gleixner
  Cc: Jiri Olsa, Namhyung Kim, Clark Williams, linux-kernel,
	linux-perf-users, Arnaldo Carvalho de Melo, Adrian Hunter,
	Luis Cláudio Gonçalves

From: Arnaldo Carvalho de Melo <acme@redhat.com>

Will be used together with BPF_MAP_TYPE_PROG_ARRAY in
tools/perf/examples/bpf/augmented_raw_syscalls.c.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Luis Cláudio Gonçalves <lclaudio@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lkml.kernel.org/n/tip-pd1bpy8i31nta6jqwdex871g@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/include/bpf/bpf.h | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/tools/perf/include/bpf/bpf.h b/tools/perf/include/bpf/bpf.h
index 2eac6d804b2d..b422aeef5339 100644
--- a/tools/perf/include/bpf/bpf.h
+++ b/tools/perf/include/bpf/bpf.h
@@ -45,6 +45,8 @@ struct ____btf_map_##name __attribute__((section(".maps." #name), used)) \
 static int (*bpf_map_update_elem)(struct bpf_map *map, void *key, void *value, u64 flags) = (void *)BPF_FUNC_map_update_elem;
 static void *(*bpf_map_lookup_elem)(struct bpf_map *map, void *key) = (void *)BPF_FUNC_map_lookup_elem;
 
+static void (*bpf_tail_call)(void *ctx, void *map, int index) = (void *)BPF_FUNC_tail_call;
+
 #define SEC(NAME) __attribute__((section(NAME),  used))
 
 #define probe(function, vars) \
-- 
2.21.0


^ permalink raw reply related	[flat|nested] 38+ messages in thread

* [PATCH 02/37] perf bpf: Do not attach a BPF prog to a tracepoint if its name starts with !
  2019-07-22 17:38 [GIT PULL] perf/core improvements and fixes Arnaldo Carvalho de Melo
  2019-07-22 17:38 ` [PATCH 01/37] perf include bpf: Add bpf_tail_call() prototype Arnaldo Carvalho de Melo
@ 2019-07-22 17:38 ` Arnaldo Carvalho de Melo
  2019-07-22 17:38 ` [PATCH 03/37] perf evsel: Store backpointer to attached bpf_object Arnaldo Carvalho de Melo
                   ` (34 subsequent siblings)
  36 siblings, 0 replies; 38+ messages in thread
From: Arnaldo Carvalho de Melo @ 2019-07-22 17:38 UTC (permalink / raw)
  To: Ingo Molnar, Thomas Gleixner
  Cc: Jiri Olsa, Namhyung Kim, Clark Williams, linux-kernel,
	linux-perf-users, Arnaldo Carvalho de Melo, Adrian Hunter,
	Luis Cláudio Gonçalves

From: Arnaldo Carvalho de Melo <acme@redhat.com>

With BPF_MAP_TYPE_PROG_ARRAY + bpf_tail_call() we want to have BPF
programs, i.e. functions in a object file that perf's BPF loader
shouldn't try to attach to anything, i.e. "!syscalls:sys_enter_open"
should just stay there, not be attached to a tracepoint with that name,
it'll be used by, for instance, 'perf trace' to associate with syscalls
that copy, in addition to the syscall raw args, a filename pointed by
the first arg, i.e. multiple syscalls that need copying the same pointer
arg in the same way, as a filename, for instance, will share the same
BPF program/function.

Right now when perf's BPF loader sees a function with a name
"sys:name" it'll look for a tracepoint and will associate that BPF
program with it, say:

  SEC("raw_syscalls:sys_enter")
  int sys_enter(struct syscall_enter_args *args)
  {
     //SNIP
  }

Will crate a perf_evsel tracepoint event and then associate with it that
BPF program.

This convention at some point will switch to the one used by the BPF
loader in libbpf, but to experiment with BPF_MAP_TYPE_PROG_ARRAY in
'perf trace' lets do this, that will not require changing too much
stuff.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Luis Cláudio Gonçalves <lclaudio@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lkml.kernel.org/n/tip-lk6dasjr1yf9rtvl292b2hpc@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/util/parse-events.c | 9 +++++++++
 1 file changed, 9 insertions(+)

diff --git a/tools/perf/util/parse-events.c b/tools/perf/util/parse-events.c
index 371ff3aee769..fac6b32ef94a 100644
--- a/tools/perf/util/parse-events.c
+++ b/tools/perf/util/parse-events.c
@@ -639,6 +639,15 @@ static int add_bpf_event(const char *group, const char *event, int fd,
 	struct list_head *list = param->list;
 	struct perf_evsel *pos;
 	int err;
+	/*
+	 * Check if we should add the event, i.e. if it is a TP but starts with a '!',
+	 * then don't add the tracepoint, this will be used for something else, like
+	 * adding to a BPF_MAP_TYPE_PROG_ARRAY.
+	 *
+	 * See tools/perf/examples/bpf/augmented_raw_syscalls.c
+	 */
+	if (group[0] == '!')
+		return 0;
 
 	pr_debug("add bpf event %s:%s and attach bpf program %d\n",
 		 group, event, fd);
-- 
2.21.0


^ permalink raw reply related	[flat|nested] 38+ messages in thread

* [PATCH 03/37] perf evsel: Store backpointer to attached bpf_object
  2019-07-22 17:38 [GIT PULL] perf/core improvements and fixes Arnaldo Carvalho de Melo
  2019-07-22 17:38 ` [PATCH 01/37] perf include bpf: Add bpf_tail_call() prototype Arnaldo Carvalho de Melo
  2019-07-22 17:38 ` [PATCH 02/37] perf bpf: Do not attach a BPF prog to a tracepoint if its name starts with ! Arnaldo Carvalho de Melo
@ 2019-07-22 17:38 ` Arnaldo Carvalho de Melo
  2019-07-22 17:38 ` [PATCH 04/37] perf trace: Add pointer to BPF object containing __augmented_syscalls__ Arnaldo Carvalho de Melo
                   ` (33 subsequent siblings)
  36 siblings, 0 replies; 38+ messages in thread
From: Arnaldo Carvalho de Melo @ 2019-07-22 17:38 UTC (permalink / raw)
  To: Ingo Molnar, Thomas Gleixner
  Cc: Jiri Olsa, Namhyung Kim, Clark Williams, linux-kernel,
	linux-perf-users, Arnaldo Carvalho de Melo, Adrian Hunter,
	Luis Cláudio Gonçalves

From: Arnaldo Carvalho de Melo <acme@redhat.com>

We may want to get to this bpf_object, to search for other BPF programs,
etc.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Luis Cláudio Gonçalves <lclaudio@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lkml.kernel.org/n/tip-3y8hrb6lszjfi23vjlic3cib@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/util/bpf-loader.c   | 4 ++--
 tools/perf/util/bpf-loader.h   | 2 +-
 tools/perf/util/evsel.c        | 1 +
 tools/perf/util/evsel.h        | 3 +++
 tools/perf/util/parse-events.c | 3 ++-
 5 files changed, 9 insertions(+), 4 deletions(-)

diff --git a/tools/perf/util/bpf-loader.c b/tools/perf/util/bpf-loader.c
index c61974a50aa5..6d0dfb777a79 100644
--- a/tools/perf/util/bpf-loader.c
+++ b/tools/perf/util/bpf-loader.c
@@ -763,7 +763,7 @@ int bpf__foreach_event(struct bpf_object *obj,
 
 		if (priv->is_tp) {
 			fd = bpf_program__fd(prog);
-			err = (*func)(priv->sys_name, priv->evt_name, fd, arg);
+			err = (*func)(priv->sys_name, priv->evt_name, fd, obj, arg);
 			if (err) {
 				pr_debug("bpf: tracepoint call back failed, stop iterate\n");
 				return err;
@@ -788,7 +788,7 @@ int bpf__foreach_event(struct bpf_object *obj,
 				return fd;
 			}
 
-			err = (*func)(tev->group, tev->event, fd, arg);
+			err = (*func)(tev->group, tev->event, fd, obj, arg);
 			if (err) {
 				pr_debug("bpf: call back failed, stop iterate\n");
 				return err;
diff --git a/tools/perf/util/bpf-loader.h b/tools/perf/util/bpf-loader.h
index 3f46856e3330..8c3441a4b72c 100644
--- a/tools/perf/util/bpf-loader.h
+++ b/tools/perf/util/bpf-loader.h
@@ -46,7 +46,7 @@ struct parse_events_term;
 #define PERF_BPF_PROBE_GROUP "perf_bpf_probe"
 
 typedef int (*bpf_prog_iter_callback_t)(const char *group, const char *event,
-					int fd, void *arg);
+					int fd, struct bpf_object *obj, void *arg);
 
 #ifdef HAVE_LIBBPF_SUPPORT
 struct bpf_object *bpf__prepare_load(const char *filename, bool source);
diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
index ebb46da4dfe5..b787ec778049 100644
--- a/tools/perf/util/evsel.c
+++ b/tools/perf/util/evsel.c
@@ -234,6 +234,7 @@ void perf_evsel__init(struct perf_evsel *evsel,
 	evsel->scale	   = 1.0;
 	evsel->max_events  = ULONG_MAX;
 	evsel->evlist	   = NULL;
+	evsel->bpf_obj	   = NULL;
 	evsel->bpf_fd	   = -1;
 	INIT_LIST_HEAD(&evsel->node);
 	INIT_LIST_HEAD(&evsel->config_terms);
diff --git a/tools/perf/util/evsel.h b/tools/perf/util/evsel.h
index cad54e8ba522..b27935a6d36c 100644
--- a/tools/perf/util/evsel.h
+++ b/tools/perf/util/evsel.h
@@ -82,6 +82,8 @@ enum perf_tool_event {
 	PERF_TOOL_DURATION_TIME = 1,
 };
 
+struct bpf_object;
+
 /** struct perf_evsel - event selector
  *
  * @evlist - evlist this evsel is in, if it is in one.
@@ -152,6 +154,7 @@ struct perf_evsel {
 	char			*group_name;
 	bool			cmdline_group_boundary;
 	struct list_head	config_terms;
+	struct bpf_object	*bpf_obj;
 	int			bpf_fd;
 	bool			auto_merge_stats;
 	bool			merged_stat;
diff --git a/tools/perf/util/parse-events.c b/tools/perf/util/parse-events.c
index fac6b32ef94a..0540303e5e97 100644
--- a/tools/perf/util/parse-events.c
+++ b/tools/perf/util/parse-events.c
@@ -630,7 +630,7 @@ struct __add_bpf_event_param {
 	struct list_head *head_config;
 };
 
-static int add_bpf_event(const char *group, const char *event, int fd,
+static int add_bpf_event(const char *group, const char *event, int fd, struct bpf_object *obj,
 			 void *_param)
 {
 	LIST_HEAD(new_evsels);
@@ -672,6 +672,7 @@ static int add_bpf_event(const char *group, const char *event, int fd,
 		pr_debug("adding %s:%s to %p\n",
 			 group, event, pos);
 		pos->bpf_fd = fd;
+		pos->bpf_obj = obj;
 	}
 	list_splice(&new_evsels, list);
 	return 0;
-- 
2.21.0


^ permalink raw reply related	[flat|nested] 38+ messages in thread

* [PATCH 04/37] perf trace: Add pointer to BPF object containing __augmented_syscalls__
  2019-07-22 17:38 [GIT PULL] perf/core improvements and fixes Arnaldo Carvalho de Melo
                   ` (2 preceding siblings ...)
  2019-07-22 17:38 ` [PATCH 03/37] perf evsel: Store backpointer to attached bpf_object Arnaldo Carvalho de Melo
@ 2019-07-22 17:38 ` Arnaldo Carvalho de Melo
  2019-07-22 17:38 ` [PATCH 05/37] perf trace: Look up maps just on the __augmented_syscalls__ BPF object Arnaldo Carvalho de Melo
                   ` (32 subsequent siblings)
  36 siblings, 0 replies; 38+ messages in thread
From: Arnaldo Carvalho de Melo @ 2019-07-22 17:38 UTC (permalink / raw)
  To: Ingo Molnar, Thomas Gleixner
  Cc: Jiri Olsa, Namhyung Kim, Clark Williams, linux-kernel,
	linux-perf-users, Arnaldo Carvalho de Melo, Adrian Hunter,
	Luis Cláudio Gonçalves

From: Arnaldo Carvalho de Melo <acme@redhat.com>

So that we can use it when looking for other components of that object
file, such as other programs to add to the BPF_MAP_TYPE_PROG_ARRAY and
use with bpf_tail_call().

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Luis Cláudio Gonçalves <lclaudio@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lkml.kernel.org/n/tip-1ibmz7ouv6llqxajy7m8igtd@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/builtin-trace.c | 15 +++++++++++++++
 1 file changed, 15 insertions(+)

diff --git a/tools/perf/builtin-trace.c b/tools/perf/builtin-trace.c
index 4f0bbffee05f..6aa080845a84 100644
--- a/tools/perf/builtin-trace.c
+++ b/tools/perf/builtin-trace.c
@@ -96,6 +96,7 @@ struct trace {
 	struct perf_evlist	*evlist;
 	struct machine		*host;
 	struct thread		*current;
+	struct bpf_object	*bpf_obj;
 	struct cgroup		*cgroup;
 	u64			base_time;
 	FILE			*output;
@@ -3895,6 +3896,20 @@ int cmd_trace(int argc, const char **argv)
 
 	if (evsel) {
 		trace.syscalls.events.augmented = evsel;
+
+		evsel = perf_evlist__find_tracepoint_by_name(trace.evlist, "raw_syscalls:sys_enter");
+		if (evsel == NULL) {
+			pr_err("ERROR: raw_syscalls:sys_enter not found in the augmented BPF object\n");
+			goto out;
+		}
+
+		if (evsel->bpf_obj == NULL) {
+			pr_err("ERROR: raw_syscalls:sys_enter not associated to a BPF object\n");
+			goto out;
+		}
+
+		trace.bpf_obj = evsel->bpf_obj;
+
 		trace__set_bpf_map_filtered_pids(&trace);
 		trace__set_bpf_map_syscalls(&trace);
 	}
-- 
2.21.0


^ permalink raw reply related	[flat|nested] 38+ messages in thread

* [PATCH 05/37] perf trace: Look up maps just on the __augmented_syscalls__ BPF object
  2019-07-22 17:38 [GIT PULL] perf/core improvements and fixes Arnaldo Carvalho de Melo
                   ` (3 preceding siblings ...)
  2019-07-22 17:38 ` [PATCH 04/37] perf trace: Add pointer to BPF object containing __augmented_syscalls__ Arnaldo Carvalho de Melo
@ 2019-07-22 17:38 ` Arnaldo Carvalho de Melo
  2019-07-22 17:38 ` [PATCH 06/37] perf trace: Order -e syscalls table Arnaldo Carvalho de Melo
                   ` (31 subsequent siblings)
  36 siblings, 0 replies; 38+ messages in thread
From: Arnaldo Carvalho de Melo @ 2019-07-22 17:38 UTC (permalink / raw)
  To: Ingo Molnar, Thomas Gleixner
  Cc: Jiri Olsa, Namhyung Kim, Clark Williams, linux-kernel,
	linux-perf-users, Arnaldo Carvalho de Melo, Adrian Hunter,
	Brendan Gregg, Luis Cláudio Gonçalves

From: Arnaldo Carvalho de Melo <acme@redhat.com>

We can conceivably have multiple BPF object files for other purposes, so
better look just on the BPF object containing the __augmented_syscalls__
map for all things augmented_syscalls related.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Luis Cláudio Gonçalves <lclaudio@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lkml.kernel.org/n/tip-3jt8knkuae9lt705r1lns202@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/builtin-trace.c | 20 +++++++-------------
 1 file changed, 7 insertions(+), 13 deletions(-)

diff --git a/tools/perf/builtin-trace.c b/tools/perf/builtin-trace.c
index 6aa080845a84..bfd739a321d1 100644
--- a/tools/perf/builtin-trace.c
+++ b/tools/perf/builtin-trace.c
@@ -3669,28 +3669,22 @@ static int trace__parse_cgroups(const struct option *opt, const char *str, int u
 	return 0;
 }
 
-static struct bpf_map *bpf__find_map_by_name(const char *name)
+static struct bpf_map *trace__find_bpf_map_by_name(struct trace *trace, const char *name)
 {
-	struct bpf_object *obj, *tmp;
-
-	bpf_object__for_each_safe(obj, tmp) {
-		struct bpf_map *map = bpf_object__find_map_by_name(obj, name);
-		if (map)
-			return map;
-
-	}
+	if (trace->bpf_obj == NULL)
+		return NULL;
 
-	return NULL;
+	return bpf_object__find_map_by_name(trace->bpf_obj, name);
 }
 
 static void trace__set_bpf_map_filtered_pids(struct trace *trace)
 {
-	trace->filter_pids.map = bpf__find_map_by_name("pids_filtered");
+	trace->filter_pids.map = trace__find_bpf_map_by_name(trace, "pids_filtered");
 }
 
 static void trace__set_bpf_map_syscalls(struct trace *trace)
 {
-	trace->syscalls.map = bpf__find_map_by_name("syscalls");
+	trace->syscalls.map = trace__find_bpf_map_by_name(trace, "syscalls");
 }
 
 static int trace__config(const char *var, const char *value, void *arg)
@@ -3924,7 +3918,7 @@ int cmd_trace(int argc, const char **argv)
 	err = -1;
 
 	if (map_dump_str) {
-		trace.dump.map = bpf__find_map_by_name(map_dump_str);
+		trace.dump.map = trace__find_bpf_map_by_name(&trace, map_dump_str);
 		if (trace.dump.map == NULL) {
 			pr_err("ERROR: BPF map \"%s\" not found\n", map_dump_str);
 			goto out;
-- 
2.21.0


^ permalink raw reply related	[flat|nested] 38+ messages in thread

* [PATCH 06/37] perf trace: Order -e syscalls table
  2019-07-22 17:38 [GIT PULL] perf/core improvements and fixes Arnaldo Carvalho de Melo
                   ` (4 preceding siblings ...)
  2019-07-22 17:38 ` [PATCH 05/37] perf trace: Look up maps just on the __augmented_syscalls__ BPF object Arnaldo Carvalho de Melo
@ 2019-07-22 17:38 ` Arnaldo Carvalho de Melo
  2019-07-22 17:38 ` [PATCH 07/37] perf trace: Add BPF handler for unaugmented syscalls Arnaldo Carvalho de Melo
                   ` (30 subsequent siblings)
  36 siblings, 0 replies; 38+ messages in thread
From: Arnaldo Carvalho de Melo @ 2019-07-22 17:38 UTC (permalink / raw)
  To: Ingo Molnar, Thomas Gleixner
  Cc: Jiri Olsa, Namhyung Kim, Clark Williams, linux-kernel,
	linux-perf-users, Arnaldo Carvalho de Melo, Adrian Hunter,
	Luis Cláudio Gonçalves

From: Arnaldo Carvalho de Melo <acme@redhat.com>

The ev_qualifier is an array with the syscall ids passed via -e on the
command line, sort it as we'll search it when setting up the
BPF_MAP_TYPE_PROG_ARRAY.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Luis Cláudio Gonçalves <lclaudio@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lkml.kernel.org/n/tip-c8hprylp3ai6e0z9burn2r3s@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/builtin-trace.c | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/tools/perf/builtin-trace.c b/tools/perf/builtin-trace.c
index bfd739a321d1..9bd5ecd6a8dd 100644
--- a/tools/perf/builtin-trace.c
+++ b/tools/perf/builtin-trace.c
@@ -1528,6 +1528,13 @@ static int trace__read_syscall_info(struct trace *trace, int id)
 	return syscall__set_arg_fmts(sc);
 }
 
+static int intcmp(const void *a, const void *b)
+{
+	const int *one = a, *another = b;
+
+	return *one - *another;
+}
+
 static int trace__validate_ev_qualifier(struct trace *trace)
 {
 	int err = 0;
@@ -1591,6 +1598,7 @@ static int trace__validate_ev_qualifier(struct trace *trace)
 	}
 
 	trace->ev_qualifier_ids.nr = nr_used;
+	qsort(trace->ev_qualifier_ids.entries, nr_used, sizeof(int), intcmp);
 out:
 	if (printed_invalid_prefix)
 		pr_debug("\n");
-- 
2.21.0


^ permalink raw reply related	[flat|nested] 38+ messages in thread

* [PATCH 07/37] perf trace: Add BPF handler for unaugmented syscalls
  2019-07-22 17:38 [GIT PULL] perf/core improvements and fixes Arnaldo Carvalho de Melo
                   ` (5 preceding siblings ...)
  2019-07-22 17:38 ` [PATCH 06/37] perf trace: Order -e syscalls table Arnaldo Carvalho de Melo
@ 2019-07-22 17:38 ` Arnaldo Carvalho de Melo
  2019-07-22 17:38 ` [PATCH 08/37] perf trace: Allow specifying the bpf prog to augment specific syscalls Arnaldo Carvalho de Melo
                   ` (29 subsequent siblings)
  36 siblings, 0 replies; 38+ messages in thread
From: Arnaldo Carvalho de Melo @ 2019-07-22 17:38 UTC (permalink / raw)
  To: Ingo Molnar, Thomas Gleixner
  Cc: Jiri Olsa, Namhyung Kim, Clark Williams, linux-kernel,
	linux-perf-users, Arnaldo Carvalho de Melo, Adrian Hunter,
	Luis Cláudio Gonçalves

From: Arnaldo Carvalho de Melo <acme@redhat.com>

Will be used to assign to syscalls that don't need augmentation, i.e.
those with just integer args.

All syscalls will be in a BPF_MAP_TYPE_PROG_ARRAY, and the
bpf_tail_call() keyed by the syscall id will either find nothing in
place, which means the syscall is being filtered, or a function that
will either add things like filenames to the ring buffer, right after
the raw syscall args, or be this unaugmented handler that will just
return 1, meaning don't filter the original
raw_syscalls:sys_{enter,exit} tracepoint.

For now it is not really being used, this is just leg work to break the
patch into smaller pieces.

It introduces a trace__find_bpf_program_by_title() helper that in turn
uses libbpf's bpf_object__find_program_by_title() on the BPF object with
the __augmented_syscalls__ map. "title" is how libbpf calls the SEC()
argument for functions, i.e. the ELF section that follows a convention
to specify what BPF program (a function with this SEC() marking) should
be connected to which tracepoint, kprobes, etc.

In perf anything that is of the form SEC("sys:event_name") will be
connected to that tracepoint by perf's BPF loader.

In this case its something that will be bpf_tail_call()ed from either
the "raw_syscalls:sys_enter" or "raw_syscall:sys_exit" tracepoints, so
its named "!raw_syscalls:unaugmented" to convey that idea, i.e. its not
going to be directly attached to a tracepoint, thus it starts with a
"!".

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Luis Cláudio Gonçalves <lclaudio@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lkml.kernel.org/n/tip-meucpjx2u0slpkayx56lxqq6@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/builtin-trace.c                       | 16 ++++++++++++++++
 tools/perf/examples/bpf/augmented_raw_syscalls.c |  6 ++++++
 2 files changed, 22 insertions(+)

diff --git a/tools/perf/builtin-trace.c b/tools/perf/builtin-trace.c
index 9bd5ecd6a8dd..07df952a0d7f 100644
--- a/tools/perf/builtin-trace.c
+++ b/tools/perf/builtin-trace.c
@@ -88,6 +88,7 @@ struct trace {
 					  *sys_exit,
 					  *augmented;
 		}		events;
+		struct bpf_program *unaugmented_prog;
 	} syscalls;
 	struct {
 		struct bpf_map *map;
@@ -2733,6 +2734,14 @@ static int trace__set_ev_qualifier_tp_filter(struct trace *trace)
 }
 
 #ifdef HAVE_LIBBPF_SUPPORT
+static struct bpf_program *trace__find_bpf_program_by_title(struct trace *trace, const char *name)
+{
+	if (trace->bpf_obj == NULL)
+		return NULL;
+
+	return bpf_object__find_program_by_title(trace->bpf_obj, name);
+}
+
 static void trace__init_bpf_map_syscall_args(struct trace *trace, int id, struct bpf_map_syscall_entry *entry)
 {
 	struct syscall *sc = trace__syscall_info(trace, NULL, id);
@@ -2814,6 +2823,12 @@ static int trace__init_syscalls_bpf_map(struct trace *trace __maybe_unused)
 {
 	return 0;
 }
+
+static struct bpf_program *trace__find_bpf_program_by_title(struct trace *trace __maybe_unused,
+							    const char *name __maybe_unused)
+{
+	return NULL;
+}
 #endif // HAVE_LIBBPF_SUPPORT
 
 static int trace__set_ev_qualifier_filter(struct trace *trace)
@@ -3914,6 +3929,7 @@ int cmd_trace(int argc, const char **argv)
 
 		trace__set_bpf_map_filtered_pids(&trace);
 		trace__set_bpf_map_syscalls(&trace);
+		trace.syscalls.unaugmented_prog = trace__find_bpf_program_by_title(&trace, "!raw_syscalls:unaugmented");
 	}
 
 	err = bpf__setup_stdout(trace.evlist);
diff --git a/tools/perf/examples/bpf/augmented_raw_syscalls.c b/tools/perf/examples/bpf/augmented_raw_syscalls.c
index 2f822bb51717..48a536b1be6d 100644
--- a/tools/perf/examples/bpf/augmented_raw_syscalls.c
+++ b/tools/perf/examples/bpf/augmented_raw_syscalls.c
@@ -88,6 +88,12 @@ unsigned int augmented_filename__read(struct augmented_filename *augmented_filen
 	return len;
 }
 
+SEC("!raw_syscalls:unaugmented")
+int syscall_unaugmented(struct syscall_enter_args *args)
+{
+	return 1;
+}
+
 SEC("raw_syscalls:sys_enter")
 int sys_enter(struct syscall_enter_args *args)
 {
-- 
2.21.0


^ permalink raw reply related	[flat|nested] 38+ messages in thread

* [PATCH 08/37] perf trace: Allow specifying the bpf prog to augment specific syscalls
  2019-07-22 17:38 [GIT PULL] perf/core improvements and fixes Arnaldo Carvalho de Melo
                   ` (6 preceding siblings ...)
  2019-07-22 17:38 ` [PATCH 07/37] perf trace: Add BPF handler for unaugmented syscalls Arnaldo Carvalho de Melo
@ 2019-07-22 17:38 ` Arnaldo Carvalho de Melo
  2019-07-22 17:38 ` [PATCH 09/37] perf trace: Put the per-syscall entry/exit prog_array BPF map infrastructure in place Arnaldo Carvalho de Melo
                   ` (28 subsequent siblings)
  36 siblings, 0 replies; 38+ messages in thread
From: Arnaldo Carvalho de Melo @ 2019-07-22 17:38 UTC (permalink / raw)
  To: Ingo Molnar, Thomas Gleixner
  Cc: Jiri Olsa, Namhyung Kim, Clark Williams, linux-kernel,
	linux-perf-users, Arnaldo Carvalho de Melo, Adrian Hunter,
	Luis Cláudio Gonçalves

From: Arnaldo Carvalho de Melo <acme@redhat.com>

This is a step in the direction of being able to use a
BPF_MAP_TYPE_PROG_ARRAY to handle syscalls that need to copy pointer
payloads in addition to the raw tracepoint syscall args.

There is a first example in
tools/perf/examples/bpf/augmented_raw_syscalls.c for the 'open' syscall.

Next step is to introduce the prog array map and use this 'open'
augmenter, then use that augmenter in other syscalls that also only copy
the first arg as a string, and then show how to use with a syscall that
reads more than one filename, like 'rename', etc.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Luis Cláudio Gonçalves <lclaudio@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lkml.kernel.org/n/tip-pys4v57x5qqrybb4cery2mc8@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/builtin-trace.c                    | 50 ++++++++++++++++++-
 .../examples/bpf/augmented_raw_syscalls.c     | 23 +++++++++
 2 files changed, 71 insertions(+), 2 deletions(-)

diff --git a/tools/perf/builtin-trace.c b/tools/perf/builtin-trace.c
index 07df952a0d7f..6cc696edf24a 100644
--- a/tools/perf/builtin-trace.c
+++ b/tools/perf/builtin-trace.c
@@ -690,6 +690,10 @@ struct syscall_arg_fmt {
 static struct syscall_fmt {
 	const char *name;
 	const char *alias;
+	struct {
+		const char *sys_enter,
+			   *sys_exit;
+	}	   bpf_prog_name;
 	struct syscall_arg_fmt arg[6];
 	u8	   nr_args;
 	bool	   errpid;
@@ -823,6 +827,7 @@ static struct syscall_fmt {
 	{ .name	    = "newfstatat",
 	  .arg = { [0] = { .scnprintf = SCA_FDAT, /* dfd */ }, }, },
 	{ .name	    = "open",
+	  .bpf_prog_name = { .sys_enter = "!syscalls:sys_enter_open", },
 	  .arg = { [1] = { .scnprintf = SCA_OPEN_FLAGS, /* flags */ }, }, },
 	{ .name	    = "open_by_handle_at",
 	  .arg = { [0] = { .scnprintf = SCA_FDAT,	/* dfd */ },
@@ -967,6 +972,10 @@ struct syscall {
 	struct tep_event    *tp_format;
 	int		    nr_args;
 	int		    args_size;
+	struct {
+		struct bpf_program *sys_enter,
+				   *sys_exit;
+	}		    bpf_prog;
 	bool		    is_exit;
 	bool		    is_open;
 	struct tep_format_field *args;
@@ -2742,6 +2751,39 @@ static struct bpf_program *trace__find_bpf_program_by_title(struct trace *trace,
 	return bpf_object__find_program_by_title(trace->bpf_obj, name);
 }
 
+static struct bpf_program *trace__find_syscall_bpf_prog(struct trace *trace, struct syscall *sc,
+							const char *prog_name, const char *type)
+{
+	struct bpf_program *prog;
+
+	if (prog_name == NULL)
+		goto out_unaugmented;
+
+	prog = trace__find_bpf_program_by_title(trace, prog_name);
+	if (prog != NULL)
+		return prog;
+
+	pr_debug("Couldn't find BPF prog \"%s\" to associate with syscalls:sys_%s_%s, not augmenting it\n",
+		 prog_name, type, sc->name);
+out_unaugmented:
+	return trace->syscalls.unaugmented_prog;
+}
+
+static void trace__init_syscall_bpf_progs(struct trace *trace, int id)
+{
+	struct syscall *sc = trace__syscall_info(trace, NULL, id);
+
+	if (sc == NULL)
+		return;
+
+	if (sc->fmt != NULL) {
+		sc->bpf_prog.sys_enter = trace__find_syscall_bpf_prog(trace, sc, sc->fmt->bpf_prog_name.sys_enter, "enter");
+		sc->bpf_prog.sys_exit  = trace__find_syscall_bpf_prog(trace, sc, sc->fmt->bpf_prog_name.sys_exit,  "exit");
+	} else {
+		sc->bpf_prog.sys_enter = sc->bpf_prog.sys_exit = trace->syscalls.unaugmented_prog;
+	}
+}
+
 static void trace__init_bpf_map_syscall_args(struct trace *trace, int id, struct bpf_map_syscall_entry *entry)
 {
 	struct syscall *sc = trace__syscall_info(trace, NULL, id);
@@ -2773,8 +2815,10 @@ static int trace__set_ev_qualifier_bpf_filter(struct trace *trace)
 	for (i = 0; i < trace->ev_qualifier_ids.nr; ++i) {
 		int key = trace->ev_qualifier_ids.entries[i];
 
-		if (value.enabled)
+		if (value.enabled) {
 			trace__init_bpf_map_syscall_args(trace, key, &value);
+			trace__init_syscall_bpf_progs(trace, key);
+		}
 
 		err = bpf_map_update_elem(fd, &key, &value, BPF_EXIST);
 		if (err)
@@ -2793,8 +2837,10 @@ static int __trace__init_syscalls_bpf_map(struct trace *trace, bool enabled)
 	int err = 0, key;
 
 	for (key = 0; key < trace->sctbl->syscalls.nr_entries; ++key) {
-		if (enabled)
+		if (enabled) {
 			trace__init_bpf_map_syscall_args(trace, key, &value);
+			trace__init_syscall_bpf_progs(trace, key);
+		}
 
 		err = bpf_map_update_elem(fd, &key, &value, BPF_ANY);
 		if (err)
diff --git a/tools/perf/examples/bpf/augmented_raw_syscalls.c b/tools/perf/examples/bpf/augmented_raw_syscalls.c
index 48a536b1be6d..66b33b299349 100644
--- a/tools/perf/examples/bpf/augmented_raw_syscalls.c
+++ b/tools/perf/examples/bpf/augmented_raw_syscalls.c
@@ -94,6 +94,29 @@ int syscall_unaugmented(struct syscall_enter_args *args)
 	return 1;
 }
 
+/*
+ * This will be tail_called from SEC("raw_syscalls:sys_enter"), so will find in
+ * augmented_filename_map what was read by that raw_syscalls:sys_enter and go
+ * on from there, reading the first syscall arg as a string, i.e. open's
+ * filename.
+ */
+SEC("!syscalls:sys_enter_open")
+int sys_enter_open(struct syscall_enter_args *args)
+{
+	int key = 0;
+	struct augmented_args_filename *augmented_args = bpf_map_lookup_elem(&augmented_filename_map, &key);
+	const void *filename_arg = (const void *)args->args[0];
+	unsigned int len = sizeof(augmented_args->args);
+
+        if (augmented_args == NULL)
+                return 1; /* Failure: don't filter */
+
+	len += augmented_filename__read(&augmented_args->filename, filename_arg, sizeof(augmented_args->filename.value));
+
+	/* If perf_event_output fails, return non-zero so that it gets recorded unaugmented */
+	return perf_event_output(args, &__augmented_syscalls__, BPF_F_CURRENT_CPU, augmented_args, len);
+}
+
 SEC("raw_syscalls:sys_enter")
 int sys_enter(struct syscall_enter_args *args)
 {
-- 
2.21.0


^ permalink raw reply related	[flat|nested] 38+ messages in thread

* [PATCH 09/37] perf trace: Put the per-syscall entry/exit prog_array BPF map infrastructure in place
  2019-07-22 17:38 [GIT PULL] perf/core improvements and fixes Arnaldo Carvalho de Melo
                   ` (7 preceding siblings ...)
  2019-07-22 17:38 ` [PATCH 08/37] perf trace: Allow specifying the bpf prog to augment specific syscalls Arnaldo Carvalho de Melo
@ 2019-07-22 17:38 ` Arnaldo Carvalho de Melo
  2019-07-22 17:38 ` [PATCH 10/37] perf trace: Handle raw_syscalls:sys_enter just like the BPF_OUTPUT augmented event Arnaldo Carvalho de Melo
                   ` (27 subsequent siblings)
  36 siblings, 0 replies; 38+ messages in thread
From: Arnaldo Carvalho de Melo @ 2019-07-22 17:38 UTC (permalink / raw)
  To: Ingo Molnar, Thomas Gleixner
  Cc: Jiri Olsa, Namhyung Kim, Clark Williams, linux-kernel,
	linux-perf-users, Arnaldo Carvalho de Melo, Adrian Hunter,
	Luis Cláudio Gonçalves

From: Arnaldo Carvalho de Melo <acme@redhat.com>

I.e. look for "syscalls_sys_enter" and "syscalls_sys_exit" BPF maps of
type PROG_ARRAY and populate it with the handlers as specified per
syscall, for now only 'open' is wiring it to something, in time all
syscalls that need to copy arguments entering a syscall or returning
from one will set these to the right handlers, reusing when possible
pre-existing ones.

Next step is to use bpf_tail_call() into that.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Luis Cláudio Gonçalves <lclaudio@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lkml.kernel.org/n/tip-t0p4u43i9vbpzs1xtowna3gb@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/builtin-trace.c                    | 76 ++++++++++++++++++-
 .../examples/bpf/augmented_raw_syscalls.c     | 14 ++++
 2 files changed, 86 insertions(+), 4 deletions(-)

diff --git a/tools/perf/builtin-trace.c b/tools/perf/builtin-trace.c
index 6cc696edf24a..fb8b8e78d7b5 100644
--- a/tools/perf/builtin-trace.c
+++ b/tools/perf/builtin-trace.c
@@ -1,4 +1,3 @@
-// SPDX-License-Identifier: GPL-2.0-only
 /*
  * builtin-trace.c
  *
@@ -83,6 +82,10 @@ struct trace {
 		int		max;
 		struct syscall  *table;
 		struct bpf_map  *map;
+		struct { // per syscall BPF_MAP_TYPE_PROG_ARRAY
+			struct bpf_map  *sys_enter,
+					*sys_exit;
+		}		prog_array;
 		struct {
 			struct perf_evsel *sys_enter,
 					  *sys_exit,
@@ -1619,6 +1622,22 @@ static int trace__validate_ev_qualifier(struct trace *trace)
 	goto out;
 }
 
+static __maybe_unused bool trace__syscall_enabled(struct trace *trace, int id)
+{
+	bool in_ev_qualifier;
+
+	if (trace->ev_qualifier_ids.nr == 0)
+		return true;
+
+	in_ev_qualifier = bsearch(&id, trace->ev_qualifier_ids.entries,
+				  trace->ev_qualifier_ids.nr, sizeof(int), intcmp) != NULL;
+
+	if (in_ev_qualifier)
+	       return !trace->not_ev_qualifier;
+
+	return trace->not_ev_qualifier;
+}
+
 /*
  * args is to be interpreted as a series of longs but we need to handle
  * 8-byte unaligned accesses. args points to raw_data within the event
@@ -2784,6 +2803,18 @@ static void trace__init_syscall_bpf_progs(struct trace *trace, int id)
 	}
 }
 
+static int trace__bpf_prog_sys_enter_fd(struct trace *trace, int id)
+{
+	struct syscall *sc = trace__syscall_info(trace, NULL, id);
+	return sc ? bpf_program__fd(sc->bpf_prog.sys_enter) : bpf_program__fd(trace->syscalls.unaugmented_prog);
+}
+
+static int trace__bpf_prog_sys_exit_fd(struct trace *trace, int id)
+{
+	struct syscall *sc = trace__syscall_info(trace, NULL, id);
+	return sc ? bpf_program__fd(sc->bpf_prog.sys_exit) : bpf_program__fd(trace->syscalls.unaugmented_prog);
+}
+
 static void trace__init_bpf_map_syscall_args(struct trace *trace, int id, struct bpf_map_syscall_entry *entry)
 {
 	struct syscall *sc = trace__syscall_info(trace, NULL, id);
@@ -2837,10 +2868,8 @@ static int __trace__init_syscalls_bpf_map(struct trace *trace, bool enabled)
 	int err = 0, key;
 
 	for (key = 0; key < trace->sctbl->syscalls.nr_entries; ++key) {
-		if (enabled) {
+		if (enabled)
 			trace__init_bpf_map_syscall_args(trace, key, &value);
-			trace__init_syscall_bpf_progs(trace, key);
-		}
 
 		err = bpf_map_update_elem(fd, &key, &value, BPF_ANY);
 		if (err)
@@ -2859,6 +2888,34 @@ static int trace__init_syscalls_bpf_map(struct trace *trace)
 
 	return __trace__init_syscalls_bpf_map(trace, enabled);
 }
+
+static int trace__init_syscalls_bpf_prog_array_maps(struct trace *trace)
+{
+	int map_enter_fd = bpf_map__fd(trace->syscalls.prog_array.sys_enter),
+	    map_exit_fd  = bpf_map__fd(trace->syscalls.prog_array.sys_exit);
+	int err = 0, key;
+
+	for (key = 0; key < trace->sctbl->syscalls.nr_entries; ++key) {
+		int prog_fd;
+
+		if (!trace__syscall_enabled(trace, key))
+			continue;
+
+		trace__init_syscall_bpf_progs(trace, key);
+
+		// It'll get at least the "!raw_syscalls:unaugmented"
+		prog_fd = trace__bpf_prog_sys_enter_fd(trace, key);
+		err = bpf_map_update_elem(map_enter_fd, &key, &prog_fd, BPF_ANY);
+		if (err)
+			break;
+		prog_fd = trace__bpf_prog_sys_exit_fd(trace, key);
+		err = bpf_map_update_elem(map_exit_fd, &key, &prog_fd, BPF_ANY);
+		if (err)
+			break;
+	}
+
+	return err;
+}
 #else
 static int trace__set_ev_qualifier_bpf_filter(struct trace *trace __maybe_unused)
 {
@@ -2875,6 +2932,11 @@ static struct bpf_program *trace__find_bpf_program_by_title(struct trace *trace
 {
 	return NULL;
 }
+
+static int trace__init_syscalls_bpf_prog_array_maps(struct trace *trace __maybe_unused)
+{
+	return 0;
+}
 #endif // HAVE_LIBBPF_SUPPORT
 
 static int trace__set_ev_qualifier_filter(struct trace *trace)
@@ -3129,6 +3191,10 @@ static int trace__run(struct trace *trace, int argc, const char **argv)
 	if (trace->syscalls.map)
 		trace__init_syscalls_bpf_map(trace);
 
+	if (trace->syscalls.prog_array.sys_enter)
+		trace__init_syscalls_bpf_prog_array_maps(trace);
+
+
 	if (trace->ev_qualifier_ids.nr > 0) {
 		err = trace__set_ev_qualifier_filter(trace);
 		if (err < 0)
@@ -3754,6 +3820,8 @@ static void trace__set_bpf_map_filtered_pids(struct trace *trace)
 static void trace__set_bpf_map_syscalls(struct trace *trace)
 {
 	trace->syscalls.map = trace__find_bpf_map_by_name(trace, "syscalls");
+	trace->syscalls.prog_array.sys_enter = trace__find_bpf_map_by_name(trace, "syscalls_sys_enter");
+	trace->syscalls.prog_array.sys_exit  = trace__find_bpf_map_by_name(trace, "syscalls_sys_exit");
 }
 
 static int trace__config(const char *var, const char *value, void *arg)
diff --git a/tools/perf/examples/bpf/augmented_raw_syscalls.c b/tools/perf/examples/bpf/augmented_raw_syscalls.c
index 66b33b299349..c66474a6ccf4 100644
--- a/tools/perf/examples/bpf/augmented_raw_syscalls.c
+++ b/tools/perf/examples/bpf/augmented_raw_syscalls.c
@@ -33,6 +33,20 @@ struct syscall {
 
 bpf_map(syscalls, ARRAY, int, struct syscall, 512);
 
+/*
+ * What to augment at entry?
+ *
+ * Pointer arg payloads (filenames, etc) passed from userspace to the kernel
+ */
+bpf_map(syscalls_sys_enter, PROG_ARRAY, u32, u32, 512);
+
+/*
+ * What to augment at exit?
+ *
+ * Pointer arg payloads returned from the kernel (struct stat, etc) to userspace.
+ */
+bpf_map(syscalls_sys_exit, PROG_ARRAY, u32, u32, 512);
+
 struct syscall_enter_args {
 	unsigned long long common_tp_fields;
 	long		   syscall_nr;
-- 
2.21.0


^ permalink raw reply related	[flat|nested] 38+ messages in thread

* [PATCH 10/37] perf trace: Handle raw_syscalls:sys_enter just like the BPF_OUTPUT augmented event
  2019-07-22 17:38 [GIT PULL] perf/core improvements and fixes Arnaldo Carvalho de Melo
                   ` (8 preceding siblings ...)
  2019-07-22 17:38 ` [PATCH 09/37] perf trace: Put the per-syscall entry/exit prog_array BPF map infrastructure in place Arnaldo Carvalho de Melo
@ 2019-07-22 17:38 ` Arnaldo Carvalho de Melo
  2019-07-22 17:38 ` [PATCH 11/37] perf augmented_raw_syscalls: Add handler for "openat" Arnaldo Carvalho de Melo
                   ` (26 subsequent siblings)
  36 siblings, 0 replies; 38+ messages in thread
From: Arnaldo Carvalho de Melo @ 2019-07-22 17:38 UTC (permalink / raw)
  To: Ingo Molnar, Thomas Gleixner
  Cc: Jiri Olsa, Namhyung Kim, Clark Williams, linux-kernel,
	linux-perf-users, Arnaldo Carvalho de Melo, Adrian Hunter,
	Luis Cláudio Gonçalves

From: Arnaldo Carvalho de Melo <acme@redhat.com>

So, we use a PERF_COUNT_SW_BPF_OUTPUT to output the augmented sys_enter
payload, i.e. to output more than just the raw syscall args, and if
something goes wrong when handling an unfiltered syscall, we bail out
and just return 1 in the bpf program associated with
raw_syscalls:sys_enter, meaning, don't filter that tracepoint, in which
case what will appear in the perf ring buffer isn't the BPF_OUTPUT
event, but the original raw_syscalls:sys_enter event with its normal
payload.

Now that we're switching to using a bpf_tail_call +
BPF_MAP_TYPE_PROG_ARRAY we're going to use this in the common case, so a
bug where raw_syscalls:sys_enter wasn't being handled by
trace__sys_enter() surfaced and for  that case, instead of using the
strace-like augmenter (trace__sys_enter()), we continued to use the
normal generic tracepoint handler:

  (gdb) p evsel
  $2 = (struct perf_evsel *) 0xc03e40
  (gdb) p evsel->name
  $3 = 0xbc56c0 "raw_syscalls:sys_enter"
  (gdb) p ((struct perf_evsel *) 0xc03e40)->name
  $4 = 0xbc56c0 "raw_syscalls:sys_enter"
  (gdb) p ((struct perf_evsel *) 0xc03e40)->handler
  $5 = (void *) 0x495eb3 <trace__event_handler>

This resulted in this:

     0.027 raw_syscalls:sys_enter:NR 12 (0, 7fcfcac64c9b, 4d, 7fcfcac64c9b, 7fcfcac6ce00, 19)
     ... [continued]: brk())                = 0x563b88677000

I.e. only the sys_exit tracepoint was being properly handled, but since
the sys_enter went to the generic trace__event_handler() we printed it
using libtraceevent's formatter instead of 'perf trace's strace-like
one.

Fix it by setting trace__sys_enter() as the handler for
raw_syscalls:sys_enter and setup the tp_field tracepoint field
accessors.

Now, to test it we just make raw_syscalls:sys_enter return 1 right after
checking if the pid is filtered, making it not use
bpf_perf_output_event() but rather ask for the tracepoint not to be
filtered and the result is the expected one:

  brk(NULL)                               = 0x556f42d6e000

I.e. raw_syscalls:sys_enter returns 1, gets handled by
trace__sys_enter() and gets it combined with the raw_syscalls:sys_exit
in a strace-like way.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Luis Cláudio Gonçalves <lclaudio@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lkml.kernel.org/n/tip-0mkocgk31nmy0odknegcby4z@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/builtin-trace.c | 15 +++++++++++++++
 1 file changed, 15 insertions(+)

diff --git a/tools/perf/builtin-trace.c b/tools/perf/builtin-trace.c
index fb8b8e78d7b5..872c9cc982a5 100644
--- a/tools/perf/builtin-trace.c
+++ b/tools/perf/builtin-trace.c
@@ -4128,7 +4128,22 @@ int cmd_trace(int argc, const char **argv)
 				if (perf_evsel__init_augmented_syscall_tp(augmented, evsel) ||
 				    perf_evsel__init_augmented_syscall_tp_args(augmented))
 					goto out;
+				/*
+				 * Augmented is __augmented_syscalls__ BPF_OUTPUT event
+				 * Above we made sure we can get from the payload the tp fields
+				 * that we get from syscalls:sys_enter tracefs format file.
+				 */
 				augmented->handler = trace__sys_enter;
+				/*
+				 * Now we do the same for the *syscalls:sys_enter event so that
+				 * if we handle it directly, i.e. if the BPF prog returns 0 so
+				 * as not to filter it, then we'll handle it just like we would
+				 * for the BPF_OUTPUT one:
+				 */
+				if (perf_evsel__init_augmented_syscall_tp(evsel, evsel) ||
+				    perf_evsel__init_augmented_syscall_tp_args(evsel))
+					goto out;
+				evsel->handler = trace__sys_enter;
 			}
 
 			if (strstarts(perf_evsel__name(evsel), "syscalls:sys_exit_")) {
-- 
2.21.0


^ permalink raw reply related	[flat|nested] 38+ messages in thread

* [PATCH 11/37] perf augmented_raw_syscalls: Add handler for "openat"
  2019-07-22 17:38 [GIT PULL] perf/core improvements and fixes Arnaldo Carvalho de Melo
                   ` (9 preceding siblings ...)
  2019-07-22 17:38 ` [PATCH 10/37] perf trace: Handle raw_syscalls:sys_enter just like the BPF_OUTPUT augmented event Arnaldo Carvalho de Melo
@ 2019-07-22 17:38 ` Arnaldo Carvalho de Melo
  2019-07-22 17:38 ` [PATCH 12/37] perf augmented_raw_syscalls: Switch to using BPF_MAP_TYPE_PROG_ARRAY Arnaldo Carvalho de Melo
                   ` (25 subsequent siblings)
  36 siblings, 0 replies; 38+ messages in thread
From: Arnaldo Carvalho de Melo @ 2019-07-22 17:38 UTC (permalink / raw)
  To: Ingo Molnar, Thomas Gleixner
  Cc: Jiri Olsa, Namhyung Kim, Clark Williams, linux-kernel,
	linux-perf-users, Arnaldo Carvalho de Melo, Adrian Hunter,
	Luis Cláudio Gonçalves

From: Arnaldo Carvalho de Melo <acme@redhat.com>

I.e. for a syscall that has its second argument being a string, its
difficult these days to find 'open' being used in the wild :-)

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Luis Cláudio Gonçalves <lclaudio@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lkml.kernel.org/n/tip-yf3kbzirqrukd3fb2sp5qx4p@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/builtin-trace.c                      |  1 +
 .../perf/examples/bpf/augmented_raw_syscalls.c  | 17 +++++++++++++++++
 2 files changed, 18 insertions(+)

diff --git a/tools/perf/builtin-trace.c b/tools/perf/builtin-trace.c
index 872c9cc982a5..a681b8c2ee4e 100644
--- a/tools/perf/builtin-trace.c
+++ b/tools/perf/builtin-trace.c
@@ -836,6 +836,7 @@ static struct syscall_fmt {
 	  .arg = { [0] = { .scnprintf = SCA_FDAT,	/* dfd */ },
 		   [2] = { .scnprintf = SCA_OPEN_FLAGS, /* flags */ }, }, },
 	{ .name	    = "openat",
+	  .bpf_prog_name = { .sys_enter = "!syscalls:sys_enter_openat", },
 	  .arg = { [0] = { .scnprintf = SCA_FDAT,	/* dfd */ },
 		   [2] = { .scnprintf = SCA_OPEN_FLAGS, /* flags */ }, }, },
 	{ .name	    = "perf_event_open",
diff --git a/tools/perf/examples/bpf/augmented_raw_syscalls.c b/tools/perf/examples/bpf/augmented_raw_syscalls.c
index c66474a6ccf4..661936f90fe0 100644
--- a/tools/perf/examples/bpf/augmented_raw_syscalls.c
+++ b/tools/perf/examples/bpf/augmented_raw_syscalls.c
@@ -131,6 +131,23 @@ int sys_enter_open(struct syscall_enter_args *args)
 	return perf_event_output(args, &__augmented_syscalls__, BPF_F_CURRENT_CPU, augmented_args, len);
 }
 
+SEC("!syscalls:sys_enter_openat")
+int sys_enter_openat(struct syscall_enter_args *args)
+{
+	int key = 0;
+	struct augmented_args_filename *augmented_args = bpf_map_lookup_elem(&augmented_filename_map, &key);
+	const void *filename_arg = (const void *)args->args[1];
+	unsigned int len = sizeof(augmented_args->args);
+
+        if (augmented_args == NULL)
+                return 1; /* Failure: don't filter */
+
+	len += augmented_filename__read(&augmented_args->filename, filename_arg, sizeof(augmented_args->filename.value));
+
+	/* If perf_event_output fails, return non-zero so that it gets recorded unaugmented */
+	return perf_event_output(args, &__augmented_syscalls__, BPF_F_CURRENT_CPU, augmented_args, len);
+}
+
 SEC("raw_syscalls:sys_enter")
 int sys_enter(struct syscall_enter_args *args)
 {
-- 
2.21.0


^ permalink raw reply related	[flat|nested] 38+ messages in thread

* [PATCH 12/37] perf augmented_raw_syscalls: Switch to using BPF_MAP_TYPE_PROG_ARRAY
  2019-07-22 17:38 [GIT PULL] perf/core improvements and fixes Arnaldo Carvalho de Melo
                   ` (10 preceding siblings ...)
  2019-07-22 17:38 ` [PATCH 11/37] perf augmented_raw_syscalls: Add handler for "openat" Arnaldo Carvalho de Melo
@ 2019-07-22 17:38 ` Arnaldo Carvalho de Melo
  2019-07-22 17:38 ` [PATCH 13/37] perf augmented_raw_syscalls: Support copying two string syscall args Arnaldo Carvalho de Melo
                   ` (24 subsequent siblings)
  36 siblings, 0 replies; 38+ messages in thread
From: Arnaldo Carvalho de Melo @ 2019-07-22 17:38 UTC (permalink / raw)
  To: Ingo Molnar, Thomas Gleixner
  Cc: Jiri Olsa, Namhyung Kim, Clark Williams, linux-kernel,
	linux-perf-users, Arnaldo Carvalho de Melo, Adrian Hunter,
	Luis Cláudio Gonçalves

From: Arnaldo Carvalho de Melo <acme@redhat.com>

Trying to control what arguments to copy, which ones were strings, etc
all from userspace via maps went nowhere, lots of difficulties to get
the verifier satisfied, so use what the fine BPF guys designed for such
a syscall handling mechanism: bpf_tail_call + BPF_MAP_TYPE_PROG_ARRAY.

The series leading to this should have explained it thoroughly, but the
end result, explained via gdb should help understand this:

  Breakpoint 1, syscall_arg__scnprintf_filename (bf=0xc002b1 "", size=2031, arg=0x7fffffff7970) at builtin-trace.c:1268
  1268	{
  (gdb) n
  1269		unsigned long ptr = arg->val;
  (gdb) n
  1271		if (arg->augmented.args)
  (gdb) n
  1272			return syscall_arg__scnprintf_augmented_string(arg, bf, size);
  (gdb) s
  syscall_arg__scnprintf_augmented_string (arg=0x7fffffff7970, bf=0xc002b1 "", size=2031) at builtin-trace.c:1251
  1251	{
  (gdb) n
  1252		struct augmented_arg *augmented_arg = arg->augmented.args;
  (gdb) n
  1253		size_t printed = scnprintf(bf, size, "\"%.*s\"", augmented_arg->size, augmented_arg->value);
  (gdb) n
  1258		int consumed = sizeof(*augmented_arg) + augmented_arg->size;
  (gdb) p bf
  $1 = 0xc002b1 "\"/etc/ld.so.cache\""
  (gdb) bt
  #0  syscall_arg__scnprintf_augmented_string (arg=0x7fffffff7970, bf=0xc002b1 "\"/etc/ld.so.cache\"", size=2031) at builtin-trace.c:1258
  #1  0x0000000000492634 in syscall_arg__scnprintf_filename (bf=0xc002b1 "\"/etc/ld.so.cache\"", size=2031, arg=0x7fffffff7970) at builtin-trace.c:1272
  #2  0x0000000000493cd7 in syscall__scnprintf_val (sc=0xc0de68, bf=0xc002b1 "\"/etc/ld.so.cache\"", size=2031, arg=0x7fffffff7970, val=140737354091036) at builtin-trace.c:1689
  #3  0x000000000049404f in syscall__scnprintf_args (sc=0xc0de68, bf=0xc002a7 "AT_FDCWD, \"/etc/ld.so.cache\"", size=2041, args=0x7ffff6cbf1ec "\234\377\377\377", augmented_args=0x7ffff6cbf21c, augmented_args_size=28, trace=0x7fffffffa170,
      thread=0xbff940) at builtin-trace.c:1756
  #4  0x0000000000494a97 in trace__sys_enter (trace=0x7fffffffa170, evsel=0xbe1900, event=0x7ffff6cbf1a0, sample=0x7fffffff7b00) at builtin-trace.c:1975
  #5  0x0000000000496ff1 in trace__handle_event (trace=0x7fffffffa170, event=0x7ffff6cbf1a0, sample=0x7fffffff7b00) at builtin-trace.c:2685
  #6  0x0000000000497edb in __trace__deliver_event (trace=0x7fffffffa170, event=0x7ffff6cbf1a0) at builtin-trace.c:3029
  #7  0x000000000049801e in trace__deliver_event (trace=0x7fffffffa170, event=0x7ffff6cbf1a0) at builtin-trace.c:3056
  #8  0x00000000004988de in trace__run (trace=0x7fffffffa170, argc=2, argv=0x7fffffffd660) at builtin-trace.c:3258
  #9  0x000000000049c2d3 in cmd_trace (argc=2, argv=0x7fffffffd660) at builtin-trace.c:4220
  #10 0x00000000004dcb6c in run_builtin (p=0xa18e00 <commands+576>, argc=5, argv=0x7fffffffd660) at perf.c:304
  #11 0x00000000004dcdd9 in handle_internal_command (argc=5, argv=0x7fffffffd660) at perf.c:356
  #12 0x00000000004dcf20 in run_argv (argcp=0x7fffffffd4bc, argv=0x7fffffffd4b0) at perf.c:400
  #13 0x00000000004dd28c in main (argc=5, argv=0x7fffffffd660) at perf.c:522
  (gdb)
  (gdb) continue
  Continuing.
  openat(AT_FDCWD, "/etc/ld.so.cache", O_RDONLY|O_CLOEXEC) = 3

Now its a matter of automagically assigning the BPF programs copying
syscall arg pointers to functions that are "open"-like (i.e. that need
only the first syscall arg copied as a string), or "openat"-like (2nd
arg, etc).

End result in tool output:

  # perf trace -e open* ls /tmp/notthere
  LLVM: dumping /home/acme/git/perf/tools/perf/examples/bpf/augmented_raw_syscalls.o
  openat(AT_FDCWD, "/etc/ld.so.cache", O_RDONLY|O_CLOEXEC) = 3
  openat(AT_FDCWD, "/lib64/libselinux.so.1", O_RDONLY|O_CLOEXEC) = 3
  openat(AT_FDCWD, "/lib64/libcap.so.2", O_RDONLY|O_CLOEXEC) = 3
  openat(AT_FDCWD, "/lib64/libc.so.6", O_RDONLY|O_CLOEXEC) = 3
  openat(AT_FDCWD, "/lib64/libpcre2-8.so.0", O_RDONLY|O_CLOEXEC) = 3
  openat(AT_FDCWD, "/lib64/libdl.so.2", O_RDONLY|O_CLOEXEC) = 3
  openat(AT_FDCWD, "/lib64/libpthread.so.0", O_RDONLY|O_CLOEXEC) = 3
  openat(AT_FDCWD, "", O_RDONLY|O_CLOEXEC) = 3
  openat(AT_FDCWD, "/usr/share/locale/locale.alias", O_RDONLY|O_CLOEXEC) = 3
  openat(AT_FDCWD, "/usr/share/locale/en_US.UTF-8/LC_MESSAGES/coreutils.mo", O_RDONLY) = ls: cannot access '/tmp/notthere'-1 ENOENT (No such file or directory)
  openat(AT_FDCWD, "/usr/share/locale/en_US.utf8/LC_MESSAGES/coreutils.mo", O_RDONLY) = -1 ENOENT (No such file or directory)
  openat(AT_FDCWD, "/usr/share/locale/en_US/LC_MESSAGES/coreutils.mo", O_RDONLY) = -1 ENOENT (No such file or directory)
  openat(AT_FDCWD, "/usr/share/locale/en.UTF-8/LC_MESSAGES/coreutils.mo", O_RDONLY) = -1 ENOENT (No such file or directory)
  openat(AT_FDCWD, "/usr/share/locale/en.utf8/LC_MESSAGES/coreutils.mo", O_RDONLY) = -1 ENOENT (No such file or directory)
  openat(AT_FDCWD, "/usr/share/locale/en/LC_MESSAGES/coreutils.mo", O_RDONLY: No such file or directory) =
  -1 ENOENT (No such file or directory)
  openat(AT_FDCWD, "/usr/share/locale/en_US.UTF-8/LC_MESSAGES/libc.mo", O_RDONLY) = -1 ENOENT (No such file or directory)
  openat(AT_FDCWD, "/usr/share/locale/en_US.utf8/LC_MESSAGES/libc.mo", O_RDONLY) = -1 ENOENT (No such file or directory)
  openat(AT_FDCWD, "/usr/share/locale/en_US/LC_MESSAGES/libc.mo", O_RDONLY) = -1 ENOENT (No such file or directory)
  openat(AT_FDCWD, "/usr/share/locale/en.UTF-8/LC_MESSAGES/libc.mo", O_RDONLY) = -1 ENOENT (No such file or directory)
  openat(AT_FDCWD, "/usr/share/locale/en.utf8/LC_MESSAGES/libc.mo", O_RDONLY) = -1 ENOENT (No such file or directory)
  openat(AT_FDCWD, "/usr/share/locale/en/LC_MESSAGES/libc.mo", O_RDONLY) = -1 ENOENT (No such file or directory)
  #

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Luis Cláudio Gonçalves <lclaudio@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lkml.kernel.org/n/tip-snc7ry99cl6r0pqaspjim98x@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 .../examples/bpf/augmented_raw_syscalls.c     | 127 +++---------------
 1 file changed, 16 insertions(+), 111 deletions(-)

diff --git a/tools/perf/examples/bpf/augmented_raw_syscalls.c b/tools/perf/examples/bpf/augmented_raw_syscalls.c
index 661936f90fe0..ce308b9a317c 100644
--- a/tools/perf/examples/bpf/augmented_raw_syscalls.c
+++ b/tools/perf/examples/bpf/augmented_raw_syscalls.c
@@ -174,131 +174,36 @@ int sys_enter(struct syscall_enter_args *args)
 
 	probe_read(&augmented_args->args, sizeof(augmented_args->args), args);
 
-	syscall = bpf_map_lookup_elem(&syscalls, &augmented_args->args.syscall_nr);
-	if (syscall == NULL || !syscall->enabled)
-		return 0;
 	/*
-	 * Yonghong and Edward Cree sayz:
-	 *
-	 * https://www.spinics.net/lists/netdev/msg531645.html
-	 *
-	 * >>   R0=inv(id=0) R1=inv2 R6=ctx(id=0,off=0,imm=0) R7=inv64 R10=fp0,call_-1
-	 * >> 10: (bf) r1 = r6
-	 * >> 11: (07) r1 += 16
-	 * >> 12: (05) goto pc+2
-	 * >> 15: (79) r3 = *(u64 *)(r1 +0)
-	 * >> dereference of modified ctx ptr R1 off=16 disallowed
-	 * > Aha, we at least got a different error message this time.
-	 * > And indeed llvm has done that optimisation, rather than the more obvious
-	 * > 11: r3 = *(u64 *)(r1 +16)
-	 * > because it wants to have lots of reads share a single insn.  You may be able
-	 * > to defeat that optimisation by adding compiler barriers, idk.  Maybe someone
-	 * > with llvm knowledge can figure out how to stop it (ideally, llvm would know
-	 * > when it's generating for bpf backend and not do that).  -O0?  ¯\_(ツ)_/¯
-	 *
-	 * The optimization mostly likes below:
-	 *
-	 *	br1:
-	 * 	...
-	 *	r1 += 16
-	 *	goto merge
-	 *	br2:
-	 *	...
-	 *	r1 += 20
-	 *	goto merge
-	 *	merge:
-	 *	*(u64 *)(r1 + 0)
-	 *
-	 * The compiler tries to merge common loads. There is no easy way to
-	 * stop this compiler optimization without turning off a lot of other
-	 * optimizations. The easiest way is to add barriers:
-	 *
-	 * 	 __asm__ __volatile__("": : :"memory")
-	 *
-	 * 	 after the ctx memory access to prevent their down stream merging.
+	 * Jump to syscall specific augmenter, even if the default one,
+	 * "!raw_syscalls:unaugmented" that will just return 1 to return the
+	 * unagmented tracepoint payload.
 	 */
-	/*
-	 * For now copy just the first string arg, we need to improve the protocol
-	 * and have more than one.
-	 *
-	 * Using the unrolled loop is not working, only when we do it manually,
-	 * check this out later...
-
-	u8 arg;
-#pragma clang loop unroll(full)
-	for (arg = 0; arg < 6; ++arg) {
-		if (syscall->string_args_len[arg] != 0) {
-			filename_len = syscall->string_args_len[arg];
-			filename_arg = (const void *)args->args[arg];
-			__asm__ __volatile__("": : :"memory");
-			break;
-		}
-	}
-
-	verifier log:
-
-; if (syscall->string_args_len[arg] != 0) {
-37: (69) r3 = *(u16 *)(r0 +2)
- R0=map_value(id=0,off=0,ks=4,vs=14,imm=0) R1_w=inv0 R2_w=map_value(id=0,off=2,ks=4,vs=14,imm=0) R6=ctx(id=0,off=0,imm=0) R7=map_value(id=0,off=0,ks=4,vs=4168,imm=0) R10=fp0,call_-1 fp-8=mmmmmmmm
-; if (syscall->string_args_len[arg] != 0) {
-38: (55) if r3 != 0x0 goto pc+5
- R0=map_value(id=0,off=0,ks=4,vs=14,imm=0) R1=inv0 R2=map_value(id=0,off=2,ks=4,vs=14,imm=0) R3=inv0 R6=ctx(id=0,off=0,imm=0) R7=map_value(id=0,off=0,ks=4,vs=4168,imm=0) R10=fp0,call_-1 fp-8=mmmmmmmm
-39: (b7) r1 = 1
-; if (syscall->string_args_len[arg] != 0) {
-40: (bf) r2 = r0
-41: (07) r2 += 4
-42: (69) r3 = *(u16 *)(r0 +4)
- R0=map_value(id=0,off=0,ks=4,vs=14,imm=0) R1_w=inv1 R2_w=map_value(id=0,off=4,ks=4,vs=14,imm=0) R3_w=inv0 R6=ctx(id=0,off=0,imm=0) R7=map_value(id=0,off=0,ks=4,vs=4168,imm=0) R10=fp0,call_-1 fp-8=mmmmmmmm
-; if (syscall->string_args_len[arg] != 0) {
-43: (15) if r3 == 0x0 goto pc+32
- R0=map_value(id=0,off=0,ks=4,vs=14,imm=0) R1=inv1 R2=map_value(id=0,off=4,ks=4,vs=14,imm=0) R3=inv(id=0,umax_value=65535,var_off=(0x0; 0xffff)) R6=ctx(id=0,off=0,imm=0) R7=map_value(id=0,off=0,ks=4,vs=4168,imm=0) R10=fp0,call_-1 fp-8=mmmmmmmm
-; filename_arg = (const void *)args->args[arg];
-44: (67) r1 <<= 3
-45: (bf) r3 = r6
-46: (0f) r3 += r1
-47: (b7) r5 = 64
-48: (79) r3 = *(u64 *)(r3 +16)
-dereference of modified ctx ptr R3 off=8 disallowed
-processed 46 insns (limit 1000000) max_states_per_insn 0 total_states 12 peak_states 12 mark_read 7
-	*/
-
-#define __loop_iter(arg) \
-	if (syscall->string_args_len[arg] != 0) { \
-		unsigned int filename_len = syscall->string_args_len[arg]; \
-		const void *filename_arg = (const void *)args->args[arg]; \
-		if (filename_len <= sizeof(augmented_args->filename.value)) \
-			len += augmented_filename__read(&augmented_args->filename, filename_arg, filename_len);
-#define loop_iter_first() __loop_iter(0); }
-#define loop_iter(arg) else __loop_iter(arg); }
-#define loop_iter_last(arg) else __loop_iter(arg); __asm__ __volatile__("": : :"memory"); }
-
-	loop_iter_first()
-	loop_iter(1)
-	loop_iter(2)
-	loop_iter(3)
-	loop_iter(4)
-	loop_iter_last(5)
+	bpf_tail_call(args, &syscalls_sys_enter, augmented_args->args.syscall_nr);
 
-	/* If perf_event_output fails, return non-zero so that it gets recorded unaugmented */
-	return perf_event_output(args, &__augmented_syscalls__, BPF_F_CURRENT_CPU, augmented_args, len);
+	// If not found on the PROG_ARRAY syscalls map, then we're filtering it:
+	return 0;
 }
 
 SEC("raw_syscalls:sys_exit")
 int sys_exit(struct syscall_exit_args *args)
 {
 	struct syscall_exit_args exit_args;
-	struct syscall *syscall;
 
 	if (pid_filter__has(&pids_filtered, getpid()))
 		return 0;
 
 	probe_read(&exit_args, sizeof(exit_args), args);
-
-	syscall = bpf_map_lookup_elem(&syscalls, &exit_args.syscall_nr);
-	if (syscall == NULL || !syscall->enabled)
-		return 0;
-
-	return 1;
+	/*
+	 * Jump to syscall specific return augmenter, even if the default one,
+	 * "!raw_syscalls:unaugmented" that will just return 1 to return the
+	 * unagmented tracepoint payload.
+	 */
+	bpf_tail_call(args, &syscalls_sys_exit, exit_args.syscall_nr);
+	/*
+	 * If not found on the PROG_ARRAY syscalls map, then we're filtering it:
+	 */
+	return 0;
 }
 
 license(GPL);
-- 
2.21.0


^ permalink raw reply related	[flat|nested] 38+ messages in thread

* [PATCH 13/37] perf augmented_raw_syscalls: Support copying two string syscall args
  2019-07-22 17:38 [GIT PULL] perf/core improvements and fixes Arnaldo Carvalho de Melo
                   ` (11 preceding siblings ...)
  2019-07-22 17:38 ` [PATCH 12/37] perf augmented_raw_syscalls: Switch to using BPF_MAP_TYPE_PROG_ARRAY Arnaldo Carvalho de Melo
@ 2019-07-22 17:38 ` Arnaldo Carvalho de Melo
  2019-07-22 17:38 ` [PATCH 14/37] perf trace: Look for default name for entries in the syscalls prog array Arnaldo Carvalho de Melo
                   ` (23 subsequent siblings)
  36 siblings, 0 replies; 38+ messages in thread
From: Arnaldo Carvalho de Melo @ 2019-07-22 17:38 UTC (permalink / raw)
  To: Ingo Molnar, Thomas Gleixner
  Cc: Jiri Olsa, Namhyung Kim, Clark Williams, linux-kernel,
	linux-perf-users, Arnaldo Carvalho de Melo, Adrian Hunter,
	Luis Cláudio Gonçalves

From: Arnaldo Carvalho de Melo <acme@redhat.com>

Starting with the renameat and renameat2 syscall, that both receive as
second and fourth parameters a pathname:

  # perf trace -e rename* mv one ANOTHER
  LLVM: dumping /home/acme/git/perf/tools/perf/examples/bpf/augmented_raw_syscalls.o
  mv: cannot stat 'one': No such file or directory
  renameat2(AT_FDCWD, "one", AT_FDCWD, "ANOTHER", RENAME_NOREPLACE) = -1 ENOENT (No such file or directory)
  #

Since the per CPU scratch buffer map has space for two maximum sized
pathnames, the verifier is satisfied that there will be no overrun.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Luis Cláudio Gonçalves <lclaudio@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lkml.kernel.org/n/tip-x2uboyg5kx2wqeru288209b6@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/builtin-trace.c                    |  2 ++
 .../examples/bpf/augmented_raw_syscalls.c     | 20 +++++++++++++++++++
 2 files changed, 22 insertions(+)

diff --git a/tools/perf/builtin-trace.c b/tools/perf/builtin-trace.c
index a681b8c2ee4e..c64f7c99db15 100644
--- a/tools/perf/builtin-trace.c
+++ b/tools/perf/builtin-trace.c
@@ -873,9 +873,11 @@ static struct syscall_fmt {
 	{ .name	    = "recvmsg",
 	  .arg = { [2] = { .scnprintf = SCA_MSG_FLAGS, /* flags */ }, }, },
 	{ .name	    = "renameat",
+	  .bpf_prog_name = { .sys_enter = "!syscalls:sys_enter_renameat", },
 	  .arg = { [0] = { .scnprintf = SCA_FDAT, /* olddirfd */ },
 		   [2] = { .scnprintf = SCA_FDAT, /* newdirfd */ }, }, },
 	{ .name	    = "renameat2",
+	  .bpf_prog_name = { .sys_enter = "!syscalls:sys_enter_renameat", },
 	  .arg = { [0] = { .scnprintf = SCA_FDAT, /* olddirfd */ },
 		   [2] = { .scnprintf = SCA_FDAT, /* newdirfd */ },
 		   [4] = { .scnprintf = SCA_RENAMEAT2_FLAGS, /* flags */ }, }, },
diff --git a/tools/perf/examples/bpf/augmented_raw_syscalls.c b/tools/perf/examples/bpf/augmented_raw_syscalls.c
index ce308b9a317c..df52d92e1c69 100644
--- a/tools/perf/examples/bpf/augmented_raw_syscalls.c
+++ b/tools/perf/examples/bpf/augmented_raw_syscalls.c
@@ -70,6 +70,7 @@ pid_filter(pids_filtered);
 struct augmented_args_filename {
        struct syscall_enter_args args;
        struct augmented_filename filename;
+       struct augmented_filename filename2;
 };
 
 bpf_map(augmented_filename_map, PERCPU_ARRAY, int, struct augmented_args_filename, 1);
@@ -148,6 +149,25 @@ int sys_enter_openat(struct syscall_enter_args *args)
 	return perf_event_output(args, &__augmented_syscalls__, BPF_F_CURRENT_CPU, augmented_args, len);
 }
 
+SEC("!syscalls:sys_enter_renameat")
+int sys_enter_renameat(struct syscall_enter_args *args)
+{
+	int key = 0;
+	struct augmented_args_filename *augmented_args = bpf_map_lookup_elem(&augmented_filename_map, &key);
+	const void *oldpath_arg = (const void *)args->args[1],
+		   *newpath_arg = (const void *)args->args[3];
+	unsigned int len = sizeof(augmented_args->args), oldpath_len;
+
+        if (augmented_args == NULL)
+                return 1; /* Failure: don't filter */
+
+	oldpath_len = augmented_filename__read(&augmented_args->filename, oldpath_arg, sizeof(augmented_args->filename.value));
+	len += oldpath_len + augmented_filename__read((void *)(&augmented_args->filename) + oldpath_len, newpath_arg, sizeof(augmented_args->filename.value));
+
+	/* If perf_event_output fails, return non-zero so that it gets recorded unaugmented */
+	return perf_event_output(args, &__augmented_syscalls__, BPF_F_CURRENT_CPU, augmented_args, len);
+}
+
 SEC("raw_syscalls:sys_enter")
 int sys_enter(struct syscall_enter_args *args)
 {
-- 
2.21.0


^ permalink raw reply related	[flat|nested] 38+ messages in thread

* [PATCH 14/37] perf trace: Look for default name for entries in the syscalls prog array
  2019-07-22 17:38 [GIT PULL] perf/core improvements and fixes Arnaldo Carvalho de Melo
                   ` (12 preceding siblings ...)
  2019-07-22 17:38 ` [PATCH 13/37] perf augmented_raw_syscalls: Support copying two string syscall args Arnaldo Carvalho de Melo
@ 2019-07-22 17:38 ` Arnaldo Carvalho de Melo
  2019-07-22 17:38 ` [PATCH 15/37] perf augmented_raw_syscalls: Rename augmented_args_filename to augmented_args_payload Arnaldo Carvalho de Melo
                   ` (22 subsequent siblings)
  36 siblings, 0 replies; 38+ messages in thread
From: Arnaldo Carvalho de Melo @ 2019-07-22 17:38 UTC (permalink / raw)
  To: Ingo Molnar, Thomas Gleixner
  Cc: Jiri Olsa, Namhyung Kim, Clark Williams, linux-kernel,
	linux-perf-users, Arnaldo Carvalho de Melo, Adrian Hunter,
	Luis Cláudio Gonçalves

From: Arnaldo Carvalho de Melo <acme@redhat.com>

I.e. just look for "!syscalls:sys_enter_" or "exit_" plus the syscall
name, that way we need just to add entries to the
augmented_raw_syscalls.c BPF source to add handlers.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Luis Cláudio Gonçalves <lclaudio@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lkml.kernel.org/n/tip-6xavwddruokp6ohs7tf4qilb@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/builtin-trace.c | 30 +++++++++++++++++++-----------
 1 file changed, 19 insertions(+), 11 deletions(-)

diff --git a/tools/perf/builtin-trace.c b/tools/perf/builtin-trace.c
index c64f7c99db15..5258399a1c94 100644
--- a/tools/perf/builtin-trace.c
+++ b/tools/perf/builtin-trace.c
@@ -830,13 +830,11 @@ static struct syscall_fmt {
 	{ .name	    = "newfstatat",
 	  .arg = { [0] = { .scnprintf = SCA_FDAT, /* dfd */ }, }, },
 	{ .name	    = "open",
-	  .bpf_prog_name = { .sys_enter = "!syscalls:sys_enter_open", },
 	  .arg = { [1] = { .scnprintf = SCA_OPEN_FLAGS, /* flags */ }, }, },
 	{ .name	    = "open_by_handle_at",
 	  .arg = { [0] = { .scnprintf = SCA_FDAT,	/* dfd */ },
 		   [2] = { .scnprintf = SCA_OPEN_FLAGS, /* flags */ }, }, },
 	{ .name	    = "openat",
-	  .bpf_prog_name = { .sys_enter = "!syscalls:sys_enter_openat", },
 	  .arg = { [0] = { .scnprintf = SCA_FDAT,	/* dfd */ },
 		   [2] = { .scnprintf = SCA_OPEN_FLAGS, /* flags */ }, }, },
 	{ .name	    = "perf_event_open",
@@ -873,7 +871,6 @@ static struct syscall_fmt {
 	{ .name	    = "recvmsg",
 	  .arg = { [2] = { .scnprintf = SCA_MSG_FLAGS, /* flags */ }, }, },
 	{ .name	    = "renameat",
-	  .bpf_prog_name = { .sys_enter = "!syscalls:sys_enter_renameat", },
 	  .arg = { [0] = { .scnprintf = SCA_FDAT, /* olddirfd */ },
 		   [2] = { .scnprintf = SCA_FDAT, /* newdirfd */ }, }, },
 	{ .name	    = "renameat2",
@@ -2778,12 +2775,27 @@ static struct bpf_program *trace__find_syscall_bpf_prog(struct trace *trace, str
 {
 	struct bpf_program *prog;
 
-	if (prog_name == NULL)
+	if (prog_name == NULL) {
+		char default_prog_name[256];
+		scnprintf(default_prog_name, sizeof(default_prog_name), "!syscalls:sys_%s_%s", type, sc->name);
+		prog = trace__find_bpf_program_by_title(trace, default_prog_name);
+		if (prog != NULL)
+			goto out_found;
+		if (sc->fmt && sc->fmt->alias) {
+			scnprintf(default_prog_name, sizeof(default_prog_name), "!syscalls:sys_%s_%s", type, sc->fmt->alias);
+			prog = trace__find_bpf_program_by_title(trace, default_prog_name);
+			if (prog != NULL)
+				goto out_found;
+		}
 		goto out_unaugmented;
+	}
 
 	prog = trace__find_bpf_program_by_title(trace, prog_name);
-	if (prog != NULL)
+
+	if (prog != NULL) {
+out_found:
 		return prog;
+	}
 
 	pr_debug("Couldn't find BPF prog \"%s\" to associate with syscalls:sys_%s_%s, not augmenting it\n",
 		 prog_name, type, sc->name);
@@ -2798,12 +2810,8 @@ static void trace__init_syscall_bpf_progs(struct trace *trace, int id)
 	if (sc == NULL)
 		return;
 
-	if (sc->fmt != NULL) {
-		sc->bpf_prog.sys_enter = trace__find_syscall_bpf_prog(trace, sc, sc->fmt->bpf_prog_name.sys_enter, "enter");
-		sc->bpf_prog.sys_exit  = trace__find_syscall_bpf_prog(trace, sc, sc->fmt->bpf_prog_name.sys_exit,  "exit");
-	} else {
-		sc->bpf_prog.sys_enter = sc->bpf_prog.sys_exit = trace->syscalls.unaugmented_prog;
-	}
+	sc->bpf_prog.sys_enter = trace__find_syscall_bpf_prog(trace, sc, sc->fmt ? sc->fmt->bpf_prog_name.sys_enter : NULL, "enter");
+	sc->bpf_prog.sys_exit  = trace__find_syscall_bpf_prog(trace, sc, sc->fmt ? sc->fmt->bpf_prog_name.sys_exit  : NULL,  "exit");
 }
 
 static int trace__bpf_prog_sys_enter_fd(struct trace *trace, int id)
-- 
2.21.0


^ permalink raw reply related	[flat|nested] 38+ messages in thread

* [PATCH 15/37] perf augmented_raw_syscalls: Rename augmented_args_filename to augmented_args_payload
  2019-07-22 17:38 [GIT PULL] perf/core improvements and fixes Arnaldo Carvalho de Melo
                   ` (13 preceding siblings ...)
  2019-07-22 17:38 ` [PATCH 14/37] perf trace: Look for default name for entries in the syscalls prog array Arnaldo Carvalho de Melo
@ 2019-07-22 17:38 ` Arnaldo Carvalho de Melo
  2019-07-22 17:38 ` [PATCH 16/37] perf script: Fix --max-blocks man page description Arnaldo Carvalho de Melo
                   ` (21 subsequent siblings)
  36 siblings, 0 replies; 38+ messages in thread
From: Arnaldo Carvalho de Melo @ 2019-07-22 17:38 UTC (permalink / raw)
  To: Ingo Molnar, Thomas Gleixner
  Cc: Jiri Olsa, Namhyung Kim, Clark Williams, linux-kernel,
	linux-perf-users, Arnaldo Carvalho de Melo, Adrian Hunter,
	Luis Cláudio Gonçalves

From: Arnaldo Carvalho de Melo <acme@redhat.com>

It'll get other stuff in there than just filenames, starting with
sockaddr for 'connect' and 'bind'.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Luis Cláudio Gonçalves <lclaudio@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lkml.kernel.org/n/tip-bsexidtsn91ehdpzcd6n5fm9@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 .../examples/bpf/augmented_raw_syscalls.c     | 22 ++++++++++---------
 1 file changed, 12 insertions(+), 10 deletions(-)

diff --git a/tools/perf/examples/bpf/augmented_raw_syscalls.c b/tools/perf/examples/bpf/augmented_raw_syscalls.c
index df52d92e1c69..77bb6a0edce3 100644
--- a/tools/perf/examples/bpf/augmented_raw_syscalls.c
+++ b/tools/perf/examples/bpf/augmented_raw_syscalls.c
@@ -67,13 +67,15 @@ struct augmented_filename {
 
 pid_filter(pids_filtered);
 
-struct augmented_args_filename {
+struct augmented_args_payload {
        struct syscall_enter_args args;
-       struct augmented_filename filename;
-       struct augmented_filename filename2;
+       struct {
+	       struct augmented_filename filename;
+	       struct augmented_filename filename2;
+       };
 };
 
-bpf_map(augmented_filename_map, PERCPU_ARRAY, int, struct augmented_args_filename, 1);
+bpf_map(augmented_args_tmp, PERCPU_ARRAY, int, struct augmented_args_payload, 1);
 
 static inline
 unsigned int augmented_filename__read(struct augmented_filename *augmented_filename,
@@ -111,7 +113,7 @@ int syscall_unaugmented(struct syscall_enter_args *args)
 
 /*
  * This will be tail_called from SEC("raw_syscalls:sys_enter"), so will find in
- * augmented_filename_map what was read by that raw_syscalls:sys_enter and go
+ * augmented_args_tmp what was read by that raw_syscalls:sys_enter and go
  * on from there, reading the first syscall arg as a string, i.e. open's
  * filename.
  */
@@ -119,7 +121,7 @@ SEC("!syscalls:sys_enter_open")
 int sys_enter_open(struct syscall_enter_args *args)
 {
 	int key = 0;
-	struct augmented_args_filename *augmented_args = bpf_map_lookup_elem(&augmented_filename_map, &key);
+	struct augmented_args_payload *augmented_args = bpf_map_lookup_elem(&augmented_args_tmp, &key);
 	const void *filename_arg = (const void *)args->args[0];
 	unsigned int len = sizeof(augmented_args->args);
 
@@ -136,7 +138,7 @@ SEC("!syscalls:sys_enter_openat")
 int sys_enter_openat(struct syscall_enter_args *args)
 {
 	int key = 0;
-	struct augmented_args_filename *augmented_args = bpf_map_lookup_elem(&augmented_filename_map, &key);
+	struct augmented_args_payload *augmented_args = bpf_map_lookup_elem(&augmented_args_tmp, &key);
 	const void *filename_arg = (const void *)args->args[1];
 	unsigned int len = sizeof(augmented_args->args);
 
@@ -153,7 +155,7 @@ SEC("!syscalls:sys_enter_renameat")
 int sys_enter_renameat(struct syscall_enter_args *args)
 {
 	int key = 0;
-	struct augmented_args_filename *augmented_args = bpf_map_lookup_elem(&augmented_filename_map, &key);
+	struct augmented_args_payload *augmented_args = bpf_map_lookup_elem(&augmented_args_tmp, &key);
 	const void *oldpath_arg = (const void *)args->args[1],
 		   *newpath_arg = (const void *)args->args[3];
 	unsigned int len = sizeof(augmented_args->args), oldpath_len;
@@ -171,7 +173,7 @@ int sys_enter_renameat(struct syscall_enter_args *args)
 SEC("raw_syscalls:sys_enter")
 int sys_enter(struct syscall_enter_args *args)
 {
-	struct augmented_args_filename *augmented_args;
+	struct augmented_args_payload *augmented_args;
 	/*
 	 * We start len, the amount of data that will be in the perf ring
 	 * buffer, if this is not filtered out by one of pid_filter__has(),
@@ -185,7 +187,7 @@ int sys_enter(struct syscall_enter_args *args)
 	struct syscall *syscall;
 	int key = 0;
 
-        augmented_args = bpf_map_lookup_elem(&augmented_filename_map, &key);
+        augmented_args = bpf_map_lookup_elem(&augmented_args_tmp, &key);
         if (augmented_args == NULL)
                 return 1;
 
-- 
2.21.0


^ permalink raw reply related	[flat|nested] 38+ messages in thread

* [PATCH 16/37] perf script: Fix --max-blocks man page description
  2019-07-22 17:38 [GIT PULL] perf/core improvements and fixes Arnaldo Carvalho de Melo
                   ` (14 preceding siblings ...)
  2019-07-22 17:38 ` [PATCH 15/37] perf augmented_raw_syscalls: Rename augmented_args_filename to augmented_args_payload Arnaldo Carvalho de Melo
@ 2019-07-22 17:38 ` Arnaldo Carvalho de Melo
  2019-07-22 17:38 ` [PATCH 17/37] perf script: Improve man page description of metrics Arnaldo Carvalho de Melo
                   ` (20 subsequent siblings)
  36 siblings, 0 replies; 38+ messages in thread
From: Arnaldo Carvalho de Melo @ 2019-07-22 17:38 UTC (permalink / raw)
  To: Ingo Molnar, Thomas Gleixner
  Cc: Jiri Olsa, Namhyung Kim, Clark Williams, linux-kernel,
	linux-perf-users, Andi Kleen, Arnaldo Carvalho de Melo

From: Andi Kleen <ak@linux.intel.com>

The --max-blocks description was using the old name brstackasm.  Use
brstackinsn instead.

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Link: http://lkml.kernel.org/r/20190711181922.18765-1-andi@firstfloor.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/Documentation/perf-script.txt | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/tools/perf/Documentation/perf-script.txt b/tools/perf/Documentation/perf-script.txt
index d4e2e18a5881..042b9e5dcc32 100644
--- a/tools/perf/Documentation/perf-script.txt
+++ b/tools/perf/Documentation/perf-script.txt
@@ -384,7 +384,7 @@ include::itrace.txt[]
 	perf script --time 0%-10%,30%-40%
 
 --max-blocks::
-	Set the maximum number of program blocks to print with brstackasm for
+	Set the maximum number of program blocks to print with brstackinsn for
 	each sample.
 
 --reltime::
-- 
2.21.0


^ permalink raw reply related	[flat|nested] 38+ messages in thread

* [PATCH 17/37] perf script: Improve man page description of metrics
  2019-07-22 17:38 [GIT PULL] perf/core improvements and fixes Arnaldo Carvalho de Melo
                   ` (15 preceding siblings ...)
  2019-07-22 17:38 ` [PATCH 16/37] perf script: Fix --max-blocks man page description Arnaldo Carvalho de Melo
@ 2019-07-22 17:38 ` Arnaldo Carvalho de Melo
  2019-07-22 17:38 ` [PATCH 18/37] perf script: Fix off by one in brstackinsn IPC computation Arnaldo Carvalho de Melo
                   ` (19 subsequent siblings)
  36 siblings, 0 replies; 38+ messages in thread
From: Arnaldo Carvalho de Melo @ 2019-07-22 17:38 UTC (permalink / raw)
  To: Ingo Molnar, Thomas Gleixner
  Cc: Jiri Olsa, Namhyung Kim, Clark Williams, linux-kernel,
	linux-perf-users, Andi Kleen, Arnaldo Carvalho de Melo

From: Andi Kleen <ak@linux.intel.com>

Clarify that a metric is based on events, not referring to itself. Also
some improvements with the sentences.

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Link: http://lkml.kernel.org/r/20190711181922.18765-3-andi@firstfloor.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/Documentation/perf-script.txt | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/tools/perf/Documentation/perf-script.txt b/tools/perf/Documentation/perf-script.txt
index 042b9e5dcc32..caaab28f8400 100644
--- a/tools/perf/Documentation/perf-script.txt
+++ b/tools/perf/Documentation/perf-script.txt
@@ -228,11 +228,11 @@ OPTIONS
 
 	With the metric option perf script can compute metrics for
 	sampling periods, similar to perf stat. This requires
-	specifying a group with multiple metrics with the :S option
+	specifying a group with multiple events defining metrics with the :S option
 	for perf record. perf will sample on the first event, and
-	compute metrics for all the events in the group. Please note
+	print computed metrics for all the events in the group. Please note
 	that the metric computed is averaged over the whole sampling
-	period, not just for the sample point.
+	period (since the last sample), not just for the sample point.
 
 	For sample events it's possible to display misc field with -F +misc option,
 	following letters are displayed for each bit:
-- 
2.21.0


^ permalink raw reply related	[flat|nested] 38+ messages in thread

* [PATCH 18/37] perf script: Fix off by one in brstackinsn IPC computation
  2019-07-22 17:38 [GIT PULL] perf/core improvements and fixes Arnaldo Carvalho de Melo
                   ` (16 preceding siblings ...)
  2019-07-22 17:38 ` [PATCH 17/37] perf script: Improve man page description of metrics Arnaldo Carvalho de Melo
@ 2019-07-22 17:38 ` Arnaldo Carvalho de Melo
  2019-07-22 17:38 ` [PATCH 19/37] perf tools: Fix proper buffer size for feature processing Arnaldo Carvalho de Melo
                   ` (18 subsequent siblings)
  36 siblings, 0 replies; 38+ messages in thread
From: Arnaldo Carvalho de Melo @ 2019-07-22 17:38 UTC (permalink / raw)
  To: Ingo Molnar, Thomas Gleixner
  Cc: Jiri Olsa, Namhyung Kim, Clark Williams, linux-kernel,
	linux-perf-users, Andi Kleen, Denis Bakhvalov,
	Arnaldo Carvalho de Melo

From: Andi Kleen <ak@linux.intel.com>

When we hit the end of a program block, need to count the last
instruction too for the IPC computation. This caused large errors for
small blocks.

  % perf script -b ls / > /dev/null

Before:

  % perf script -F +brstackinsn --xed
  ...
        00007f94c9ac70d8                        jz 0x7f94c9ac70e3                       # PRED 3 cycles [36] 4.33 IPC
        00007f94c9ac70e3                        testb  $0x20, 0x31d(%rbx)
        00007f94c9ac70ea                        jnz 0x7f94c9ac70b0
        00007f94c9ac70ec                        testb  $0x8, 0x205ad(%rip)
        00007f94c9ac70f3                        jz 0x7f94c9ac6ff0               # PRED 1 cycles [37] 3.00 IPC

After:

  % perf script -F +brstackinsn --xed
  ...
        00007f94c9ac70d8                        jz 0x7f94c9ac70e3                       # PRED 3 cycles [15] 4.67 IPC
        00007f94c9ac70e3                        testb  $0x20, 0x31d(%rbx)
        00007f94c9ac70ea                        jnz 0x7f94c9ac70b0
        00007f94c9ac70ec                        testb  $0x8, 0x205ad(%rip)
        00007f94c9ac70f3                        jz 0x7f94c9ac6ff0               # PRED 1 cycles [16] 4.00 IPC

Suggested-by: Denis Bakhvalov <denis.bakhvalov@intel.com>
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Link: http://lkml.kernel.org/r/20190711181922.18765-2-andi@firstfloor.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/builtin-script.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/tools/perf/builtin-script.c b/tools/perf/builtin-script.c
index 8f24865596af..0140ddb8dd0b 100644
--- a/tools/perf/builtin-script.c
+++ b/tools/perf/builtin-script.c
@@ -1059,7 +1059,7 @@ static int perf_sample__fprintf_brstackinsn(struct perf_sample *sample,
 
 			printed += ip__fprintf_sym(ip, thread, x.cpumode, x.cpu, &lastsym, attr, fp);
 			if (ip == end) {
-				printed += ip__fprintf_jump(ip, &br->entries[i], &x, buffer + off, len - off, insn, fp,
+				printed += ip__fprintf_jump(ip, &br->entries[i], &x, buffer + off, len - off, ++insn, fp,
 							    &total_cycles);
 				if (PRINT_FIELD(SRCCODE))
 					printed += print_srccode(thread, x.cpumode, ip);
-- 
2.21.0


^ permalink raw reply related	[flat|nested] 38+ messages in thread

* [PATCH 19/37] perf tools: Fix proper buffer size for feature processing
  2019-07-22 17:38 [GIT PULL] perf/core improvements and fixes Arnaldo Carvalho de Melo
                   ` (17 preceding siblings ...)
  2019-07-22 17:38 ` [PATCH 18/37] perf script: Fix off by one in brstackinsn IPC computation Arnaldo Carvalho de Melo
@ 2019-07-22 17:38 ` Arnaldo Carvalho de Melo
  2019-07-22 17:38 ` [PATCH 20/37] perf stat: Fix segfault for event group in repeat mode Arnaldo Carvalho de Melo
                   ` (17 subsequent siblings)
  36 siblings, 0 replies; 38+ messages in thread
From: Arnaldo Carvalho de Melo @ 2019-07-22 17:38 UTC (permalink / raw)
  To: Ingo Molnar, Thomas Gleixner
  Cc: Jiri Olsa, Namhyung Kim, Clark Williams, linux-kernel,
	linux-perf-users, Arnaldo Carvalho de Melo, Alexander Shishkin,
	David Carrillo-Cisneros, Kan Liang, Peter Zijlstra, Song Liu

From: Jiri Olsa <jolsa@kernel.org>

After Song Liu's segfault fix for pipe mode, Arnaldo reported following
error:

  # perf record -o - | perf script
  0x514 [0x1ac]: failed to process type: 80

It's caused by wrong buffer size setup in feature processing, which
makes cpu topology feature fail, because it's using buffer size to
recognize its header version.

Reported-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: David Carrillo-Cisneros <davidcc@google.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Song Liu <songliubraving@fb.com>
Fixes: e9def1b2e74e ("perf tools: Add feature header record to pipe-mode")
Link: http://lkml.kernel.org/r/20190715140426.32509-1-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/util/header.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/tools/perf/util/header.c b/tools/perf/util/header.c
index c24db7f4909c..20111f8da5cb 100644
--- a/tools/perf/util/header.c
+++ b/tools/perf/util/header.c
@@ -3747,7 +3747,7 @@ int perf_event__process_feature(struct perf_session *session,
 		return 0;
 
 	ff.buf  = (void *)fe->data;
-	ff.size = event->header.size - sizeof(event->header);
+	ff.size = event->header.size - sizeof(*fe);
 	ff.ph = &session->header;
 
 	if (feat_ops[feat].process(&ff, NULL))
-- 
2.21.0


^ permalink raw reply related	[flat|nested] 38+ messages in thread

* [PATCH 20/37] perf stat: Fix segfault for event group in repeat mode
  2019-07-22 17:38 [GIT PULL] perf/core improvements and fixes Arnaldo Carvalho de Melo
                   ` (18 preceding siblings ...)
  2019-07-22 17:38 ` [PATCH 19/37] perf tools: Fix proper buffer size for feature processing Arnaldo Carvalho de Melo
@ 2019-07-22 17:38 ` Arnaldo Carvalho de Melo
  2019-07-22 17:38 ` [PATCH 21/37] perf augmented_raw_syscalls: Augment sockaddr arg in 'connect' Arnaldo Carvalho de Melo
                   ` (16 subsequent siblings)
  36 siblings, 0 replies; 38+ messages in thread
From: Arnaldo Carvalho de Melo @ 2019-07-22 17:38 UTC (permalink / raw)
  To: Ingo Molnar, Thomas Gleixner
  Cc: Jiri Olsa, Namhyung Kim, Clark Williams, linux-kernel,
	linux-perf-users, Jiri Olsa, Numfor Mbiziwo-Tiapo,
	Alexander Shishkin, Ian Rogers, Mark Drayton, Peter Zijlstra,
	Song Liu, Stephane Eranian, Arnaldo Carvalho de Melo

From: Jiri Olsa <jolsa@redhat.com>

Numfor Mbiziwo-Tiapo reported segfault on stat of event group in repeat
mode:

  # perf stat -e '{cycles,instructions}' -r 10 ls

It's caused by memory corruption due to not cleaned evsel's id array and
index, which needs to be rebuilt in every stat iteration. Currently the
ids index grows, while the array (which is also not freed) has the same
size.

Fixing this by releasing id array and zeroing ids index in
perf_evsel__close function.

We also need to keep the evsel_list alive for stat record (which is
disabled in repeat mode).

Reported-by: Numfor Mbiziwo-Tiapo <nums@google.com>
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Mark Drayton <mbd@fb.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Song Liu <songliubraving@fb.com>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/20190715142121.GC6032@krava
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/builtin-stat.c | 9 ++++++++-
 tools/perf/util/evsel.c   | 2 ++
 2 files changed, 10 insertions(+), 1 deletion(-)

diff --git a/tools/perf/builtin-stat.c b/tools/perf/builtin-stat.c
index b55a534b4de0..352cf39d7c2f 100644
--- a/tools/perf/builtin-stat.c
+++ b/tools/perf/builtin-stat.c
@@ -607,7 +607,13 @@ static int __run_perf_stat(int argc, const char **argv, int run_idx)
 	 * group leaders.
 	 */
 	read_counters(&(struct timespec) { .tv_nsec = t1-t0 });
-	perf_evlist__close(evsel_list);
+
+	/*
+	 * We need to keep evsel_list alive, because it's processed
+	 * later the evsel_list will be closed after.
+	 */
+	if (!STAT_RECORD)
+		perf_evlist__close(evsel_list);
 
 	return WEXITSTATUS(status);
 }
@@ -1997,6 +2003,7 @@ int cmd_stat(int argc, const char **argv)
 			perf_session__write_header(perf_stat.session, evsel_list, fd, true);
 		}
 
+		perf_evlist__close(evsel_list);
 		perf_session__delete(perf_stat.session);
 	}
 
diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
index b787ec778049..7d1757a2ec46 100644
--- a/tools/perf/util/evsel.c
+++ b/tools/perf/util/evsel.c
@@ -1292,6 +1292,7 @@ static void perf_evsel__free_id(struct perf_evsel *evsel)
 	xyarray__delete(evsel->sample_id);
 	evsel->sample_id = NULL;
 	zfree(&evsel->id);
+	evsel->ids = 0;
 }
 
 static void perf_evsel__free_config_terms(struct perf_evsel *evsel)
@@ -2078,6 +2079,7 @@ void perf_evsel__close(struct perf_evsel *evsel)
 
 	perf_evsel__close_fd(evsel);
 	perf_evsel__free_fd(evsel);
+	perf_evsel__free_id(evsel);
 }
 
 int perf_evsel__open_per_cpu(struct perf_evsel *evsel,
-- 
2.21.0


^ permalink raw reply related	[flat|nested] 38+ messages in thread

* [PATCH 21/37] perf augmented_raw_syscalls: Augment sockaddr arg in 'connect'
  2019-07-22 17:38 [GIT PULL] perf/core improvements and fixes Arnaldo Carvalho de Melo
                   ` (19 preceding siblings ...)
  2019-07-22 17:38 ` [PATCH 20/37] perf stat: Fix segfault for event group in repeat mode Arnaldo Carvalho de Melo
@ 2019-07-22 17:38 ` Arnaldo Carvalho de Melo
  2019-07-22 17:38 ` [PATCH 22/37] perf trace beauty: Make connect's addrlen be printed as an int, not hex Arnaldo Carvalho de Melo
                   ` (15 subsequent siblings)
  36 siblings, 0 replies; 38+ messages in thread
From: Arnaldo Carvalho de Melo @ 2019-07-22 17:38 UTC (permalink / raw)
  To: Ingo Molnar, Thomas Gleixner
  Cc: Jiri Olsa, Namhyung Kim, Clark Williams, linux-kernel,
	linux-perf-users, Arnaldo Carvalho de Melo, Adrian Hunter,
	Luis Cláudio Gonçalves

From: Arnaldo Carvalho de Melo <acme@redhat.com>

We already had a beautifier for an augmented sockaddr payload, but that
was when we were hooking on each syscalls:sys_enter_foo tracepoints,
since now we're almost doing that by doing a tail call from
raw_syscalls:sys_enter, its almost the same, we can reuse it straight
away.

  # perf trace -e connec* ssh www.bla.com
  connect(3</var/lib/sss/mc/passwd>, { .family: PF_LOCAL, path: /var/run/nscd/socket }, 0x6e) = -1 ENOENT (No such file or directory)
  connect(3</var/lib/sss/mc/passwd>, { .family: PF_LOCAL, path: /var/run/nscd/socket }, 0x6e) = -1 ENOENT (No such file or directory)
  connect(4<socket:[16604782]>, { .family: PF_LOCAL, path: /var/lib/sss/pipes/nss }, 0x6e) = 0
  connect(7, { .family: PF_LOCAL, path: /var/run/nscd/socket }, 0x6e) = -1 ENOENT (No such file or directory)
  connect(7, { .family: PF_LOCAL, path: /var/run/nscd/socket }, 0x6e) = -1 ENOENT (No such file or directory)
  connect(5</etc/hosts>, { .family: PF_LOCAL, path: /var/run/nscd/socket }, 0x6e) = -1 ENOENT (No such file or directory)
  connect(5</etc/hosts>, { .family: PF_LOCAL, path: /var/run/nscd/socket }, 0x6e) = -1 ENOENT (No such file or directory)
  connect(5</etc/hosts>, { .family: PF_INET, port: 53, addr: 192.168.44.1 }, 0x10) = 0
  connect(5</etc/hosts>, { .family: PF_INET, port: 22, addr: 146.112.61.108 }, 0x10) = 0
  connect(5</etc/hosts>, { .family: PF_INET6, port: 22, addr: ::ffff:146.112.61.108 }, 0x1c) = 0
  ^C#

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Luis Cláudio Gonçalves <lclaudio@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lkml.kernel.org/n/tip-5xkrbcpjsgnr3zt1aqdd7nvc@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 .../examples/bpf/augmented_raw_syscalls.c     | 35 ++++++++++++++++---
 1 file changed, 30 insertions(+), 5 deletions(-)

diff --git a/tools/perf/examples/bpf/augmented_raw_syscalls.c b/tools/perf/examples/bpf/augmented_raw_syscalls.c
index 77bb6a0edce3..d7a292d7ee2f 100644
--- a/tools/perf/examples/bpf/augmented_raw_syscalls.c
+++ b/tools/perf/examples/bpf/augmented_raw_syscalls.c
@@ -16,6 +16,7 @@
 
 #include <unistd.h>
 #include <linux/limits.h>
+#include <linux/socket.h>
 #include <pid_filter.h>
 
 /* bpf-output associated map */
@@ -69,10 +70,13 @@ pid_filter(pids_filtered);
 
 struct augmented_args_payload {
        struct syscall_enter_args args;
-       struct {
-	       struct augmented_filename filename;
-	       struct augmented_filename filename2;
-       };
+       union {
+		struct {
+			struct augmented_filename filename,
+						  filename2;
+		};
+		struct sockaddr_storage saddr;
+	};
 };
 
 bpf_map(augmented_args_tmp, PERCPU_ARRAY, int, struct augmented_args_payload, 1);
@@ -112,11 +116,32 @@ int syscall_unaugmented(struct syscall_enter_args *args)
 }
 
 /*
- * This will be tail_called from SEC("raw_syscalls:sys_enter"), so will find in
+ * These will be tail_called from SEC("raw_syscalls:sys_enter"), so will find in
  * augmented_args_tmp what was read by that raw_syscalls:sys_enter and go
  * on from there, reading the first syscall arg as a string, i.e. open's
  * filename.
  */
+SEC("!syscalls:sys_enter_connect")
+int sys_enter_connect(struct syscall_enter_args *args)
+{
+	int key = 0;
+	struct augmented_args_payload *augmented_args = bpf_map_lookup_elem(&augmented_args_tmp, &key);
+	const void *sockaddr_arg = (const void *)args->args[1];
+	unsigned int socklen = args->args[2];
+	unsigned int len = sizeof(augmented_args->args);
+
+        if (augmented_args == NULL)
+                return 1; /* Failure: don't filter */
+
+	if (socklen > sizeof(augmented_args->saddr))
+		socklen = sizeof(augmented_args->saddr);
+
+	probe_read(&augmented_args->saddr, socklen, sockaddr_arg);
+
+	/* If perf_event_output fails, return non-zero so that it gets recorded unaugmented */
+	return perf_event_output(args, &__augmented_syscalls__, BPF_F_CURRENT_CPU, augmented_args, len + socklen);
+}
+
 SEC("!syscalls:sys_enter_open")
 int sys_enter_open(struct syscall_enter_args *args)
 {
-- 
2.21.0


^ permalink raw reply related	[flat|nested] 38+ messages in thread

* [PATCH 22/37] perf trace beauty: Make connect's addrlen be printed as an int, not hex
  2019-07-22 17:38 [GIT PULL] perf/core improvements and fixes Arnaldo Carvalho de Melo
                   ` (20 preceding siblings ...)
  2019-07-22 17:38 ` [PATCH 21/37] perf augmented_raw_syscalls: Augment sockaddr arg in 'connect' Arnaldo Carvalho de Melo
@ 2019-07-22 17:38 ` Arnaldo Carvalho de Melo
  2019-07-22 17:38 ` [PATCH 23/37] perf trace beauty: Disable fd->pathname when close() not enabled Arnaldo Carvalho de Melo
                   ` (14 subsequent siblings)
  36 siblings, 0 replies; 38+ messages in thread
From: Arnaldo Carvalho de Melo @ 2019-07-22 17:38 UTC (permalink / raw)
  To: Ingo Molnar, Thomas Gleixner
  Cc: Jiri Olsa, Namhyung Kim, Clark Williams, linux-kernel,
	linux-perf-users, Arnaldo Carvalho de Melo, Adrian Hunter,
	Luis Cláudio Gonçalves

From: Arnaldo Carvalho de Melo <acme@redhat.com>

  # perf trace -e connec* ssh www.bla.com
  connect(3</var/lib/sss/mc/passwd>, { .family: PF_LOCAL, path: /var/run/nscd/socket }, 110) = -1 ENOENT (No such file or directory)
  connect(3</var/lib/sss/mc/passwd>, { .family: PF_LOCAL, path: /var/run/nscd/socket }, 110) = -1 ENOENT (No such file or directory)
  connect(4<socket:[16610959]>, { .family: PF_LOCAL, path: /var/lib/sss/pipes/nss }, 110) = 0
  connect(7, { .family: PF_LOCAL, path: /var/run/nscd/socket }, 110) = -1 ENOENT (No such file or directory)
  connect(7, { .family: PF_LOCAL, path: /var/run/nscd/socket }, 110) = -1 ENOENT (No such file or directory)
  connect(5, { .family: PF_LOCAL, path: /var/run/nscd/socket }, 110) = -1 ENOENT (No such file or directory)
  connect(5</usr/lib64/libnss_mdns4_minimal.so.2>, { .family: PF_LOCAL, path: /var/run/nscd/socket }, 110) = -1 ENOENT (No such file or directory)
  connect(5</usr/lib64/libnss_mdns4_minimal.so.2>, { .family: PF_INET, port: 53, addr: 192.168.44.1 }, 16) = 0
  connect(5</usr/lib64/libnss_mdns4_minimal.so.2>, { .family: PF_INET, port: 22, addr: 146.112.61.108 }, 16) = 0
  connect(5</usr/lib64/libnss_mdns4_minimal.so.2>, { .family: PF_INET6, port: 22, addr: ::ffff:146.112.61.108 }, 28) = 0
  ^Cconnect(5</usr/lib64/libnss_mdns4_minimal.so.2>, { .family: PF_INET, port: 22, addr: 146.112.61.108 }, 16) = -1 (unknown) (INTERNAL ERROR: strerror_r(512, [buf], 128)=22)
  #

Argh, the SCA_FD needs to invalidate its cache when close is done...

It works if the 'close' syscall is not filtered out ;-\

  # perf trace -e close,connec* ssh www.bla.com
  close(3)                                = 0
  close(3</usr/lib64/libpcre2-8.so.0.8.0>) = 0
  close(3)                                = 0
  close(3</usr/lib64/libkrb5.so.3.3>)     = 0
  close(3</usr/lib64/libkrb5.so.3.3>)     = 0
  close(3)                                = 0
  close(3</usr/lib64/libk5crypto.so.3.1>) = 0
  close(3</usr/lib64/libk5crypto.so.3.1>) = 0
  close(3</usr/lib64/libcom_err.so.2.1>)  = 0
  close(3</usr/lib64/libcom_err.so.2.1>)  = 0
  close(3)                                = 0
  close(3</usr/lib64/libkrb5support.so.0.1>) = 0
  close(3</usr/lib64/libkrb5support.so.0.1>) = 0
  close(3</usr/lib64/libkeyutils.so.1.8>) = 0
  close(3</usr/lib64/libkeyutils.so.1.8>) = 0
  close(3)                                = 0
  close(3)                                = 0
  close(3)                                = 0
  close(3)                                = 0
  close(4)                                = 0
  close(3)                                = 0
  close(3)                                = 0
  connect(3</etc/nsswitch.conf>, { .family: PF_LOCAL, path: /var/run/nscd/socket }, 110) = -1 ENOENT (No such file or directory)
  close(3</etc/nsswitch.conf>)            = 0
  connect(3</usr/lib64/libnss_sss.so.2>, { .family: PF_LOCAL, path: /var/run/nscd/socket }, 110) = -1 ENOENT (No such file or directory)
  close(3</usr/lib64/libnss_sss.so.2>)    = 0
  close(3</usr/lib64/libnss_sss.so.2>)    = 0
  close(3)                                = 0
  close(3)                                = 0
  connect(4<socket:[16616519]>, { .family: PF_LOCAL, path: /var/lib/sss/pipes/nss }, 110) = 0
  ^C
  #

Will disable this beautifier when 'close' is filtered out...

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Luis Cláudio Gonçalves <lclaudio@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lkml.kernel.org/n/tip-ekuiciyx4znchvy95c8p1yyi@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/builtin-trace.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/tools/perf/builtin-trace.c b/tools/perf/builtin-trace.c
index 5258399a1c94..123d7efc12e8 100644
--- a/tools/perf/builtin-trace.c
+++ b/tools/perf/builtin-trace.c
@@ -725,7 +725,8 @@ static struct syscall_fmt {
 	{ .name	    = "close",
 	  .arg = { [0] = { .scnprintf = SCA_CLOSE_FD, /* fd */ }, }, },
 	{ .name	    = "connect",
-	  .arg = { [1] = { .scnprintf = SCA_SOCKADDR, /* servaddr */ }, }, },
+	  .arg = { [1] = { .scnprintf = SCA_SOCKADDR, /* servaddr */ },
+		   [2] = { .scnprintf = SCA_INT, /* addrlen */ }, }, },
 	{ .name	    = "epoll_ctl",
 	  .arg = { [1] = STRARRAY(op, epoll_ctl_ops), }, },
 	{ .name	    = "eventfd2",
-- 
2.21.0


^ permalink raw reply related	[flat|nested] 38+ messages in thread

* [PATCH 23/37] perf trace beauty: Disable fd->pathname when close() not enabled
  2019-07-22 17:38 [GIT PULL] perf/core improvements and fixes Arnaldo Carvalho de Melo
                   ` (21 preceding siblings ...)
  2019-07-22 17:38 ` [PATCH 22/37] perf trace beauty: Make connect's addrlen be printed as an int, not hex Arnaldo Carvalho de Melo
@ 2019-07-22 17:38 ` Arnaldo Carvalho de Melo
  2019-07-22 17:38 ` [PATCH 24/37] perf trace beauty: Do not try to use the fd->pathname beautifier for bind/connect fd arg Arnaldo Carvalho de Melo
                   ` (13 subsequent siblings)
  36 siblings, 0 replies; 38+ messages in thread
From: Arnaldo Carvalho de Melo @ 2019-07-22 17:38 UTC (permalink / raw)
  To: Ingo Molnar, Thomas Gleixner
  Cc: Jiri Olsa, Namhyung Kim, Clark Williams, linux-kernel,
	linux-perf-users, Arnaldo Carvalho de Melo, Adrian Hunter,
	Luis Cláudio Gonçalves

From: Arnaldo Carvalho de Melo <acme@redhat.com>

As we invalidate the fd->pathname table in the SCA_CLOSE_FD beautifier,
if we don't have it we may end up keeping an fd->pathname association
that then gets misprinted.

The previous behaviour continues when the close() syscall is enabled,
which may still be a a problem if we lose records (i.e. we may lose a
'close' record and then get that fd reused by socket()) but then the
tool will notify that records are being lost and the user will be warned
that some of the heuristics will fall apart.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Luis Cláudio Gonçalves <lclaudio@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lkml.kernel.org/n/tip-b7t6h8sq9lebemvfy2zh3qq1@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/builtin-trace.c | 19 ++++++++++++++++---
 1 file changed, 16 insertions(+), 3 deletions(-)

diff --git a/tools/perf/builtin-trace.c b/tools/perf/builtin-trace.c
index 123d7efc12e8..94c33bb573c1 100644
--- a/tools/perf/builtin-trace.c
+++ b/tools/perf/builtin-trace.c
@@ -127,6 +127,7 @@ struct trace {
 	unsigned int		min_stack;
 	int			raw_augmented_syscalls_args_size;
 	bool			raw_augmented_syscalls;
+	bool			fd_path_disabled;
 	bool			sort_events;
 	bool			not_ev_qualifier;
 	bool			live;
@@ -1178,7 +1179,7 @@ static const char *thread__fd_path(struct thread *thread, int fd,
 {
 	struct thread_trace *ttrace = thread__priv(thread);
 
-	if (ttrace == NULL)
+	if (ttrace == NULL || trace->fd_path_disabled)
 		return NULL;
 
 	if (fd < 0)
@@ -2097,7 +2098,7 @@ static int trace__sys_exit(struct trace *trace, struct perf_evsel *evsel,
 
 	ret = perf_evsel__sc_tp_uint(evsel, ret, sample);
 
-	if (sc->is_open && ret >= 0 && ttrace->filename.pending_open) {
+	if (!trace->fd_path_disabled && sc->is_open && ret >= 0 && ttrace->filename.pending_open) {
 		trace__set_fd_pathname(thread, ret, ttrace->filename.name);
 		ttrace->filename.pending_open = false;
 		++trace->stats.vfs_getname;
@@ -3206,7 +3207,6 @@ static int trace__run(struct trace *trace, int argc, const char **argv)
 	if (trace->syscalls.prog_array.sys_enter)
 		trace__init_syscalls_bpf_prog_array_maps(trace);
 
-
 	if (trace->ev_qualifier_ids.nr > 0) {
 		err = trace__set_ev_qualifier_filter(trace);
 		if (err < 0)
@@ -3218,6 +3218,19 @@ static int trace__run(struct trace *trace, int argc, const char **argv)
 		}
 	}
 
+	/*
+	 * If the "close" syscall is not traced, then we will not have the
+	 * opportunity to, in syscall_arg__scnprintf_close_fd() invalidate the
+	 * fd->pathname table and were ending up showing the last value set by
+	 * syscalls opening a pathname and associating it with a descriptor or
+	 * reading it from /proc/pid/fd/ in cases where that doesn't make
+	 * sense.
+	 *
+	 *  So just disable this beautifier (SCA_FD, SCA_FDAT) when 'close' is
+	 *  not in use.
+	 */
+	trace->fd_path_disabled = !trace__syscall_enabled(trace, syscalltbl__id(trace->sctbl, "close"));
+
 	err = perf_evlist__apply_filters(evlist, &evsel);
 	if (err < 0)
 		goto out_error_apply_filters;
-- 
2.21.0


^ permalink raw reply related	[flat|nested] 38+ messages in thread

* [PATCH 24/37] perf trace beauty: Do not try to use the fd->pathname beautifier for bind/connect fd arg
  2019-07-22 17:38 [GIT PULL] perf/core improvements and fixes Arnaldo Carvalho de Melo
                   ` (22 preceding siblings ...)
  2019-07-22 17:38 ` [PATCH 23/37] perf trace beauty: Disable fd->pathname when close() not enabled Arnaldo Carvalho de Melo
@ 2019-07-22 17:38 ` Arnaldo Carvalho de Melo
  2019-07-22 17:38 ` [PATCH 25/37] perf trace beauty: Beautify 'sendto's sockaddr arg Arnaldo Carvalho de Melo
                   ` (12 subsequent siblings)
  36 siblings, 0 replies; 38+ messages in thread
From: Arnaldo Carvalho de Melo @ 2019-07-22 17:38 UTC (permalink / raw)
  To: Ingo Molnar, Thomas Gleixner
  Cc: Jiri Olsa, Namhyung Kim, Clark Williams, linux-kernel,
	linux-perf-users, Arnaldo Carvalho de Melo, Adrian Hunter,
	Luis Cláudio Gonçalves

From: Arnaldo Carvalho de Melo <acme@redhat.com>

Doesn't make sense and also we now beautify the sockaddr, which provides
enough info:

  # trace -e close,socket,connec* ssh www.bla.com
  <SNIP>
  close(5)                                = 0
  socket(PF_INET, SOCK_DGRAM|CLOEXEC|NONBLOCK, IPPROTO_IP) = 5
  connect(5, { .family: PF_INET, port: 53, addr: 192.168.44.1 }, 16) = 0
  close(5)                                = 0
  socket(PF_INET, SOCK_STREAM, IPPROTO_TCP) = 5
  ^C#

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Luis Cláudio Gonçalves <lclaudio@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lkml.kernel.org/n/tip-h9drpb7ail808d2mh4n7tla4@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/builtin-trace.c | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/tools/perf/builtin-trace.c b/tools/perf/builtin-trace.c
index 94c33bb573c1..010aa9e9a561 100644
--- a/tools/perf/builtin-trace.c
+++ b/tools/perf/builtin-trace.c
@@ -710,7 +710,8 @@ static struct syscall_fmt {
 	  .arg = { [0] = { .scnprintf = SCA_X86_ARCH_PRCTL_CODE, /* code */ },
 		   [1] = { .scnprintf = SCA_PTR, /* arg2 */ }, }, },
 	{ .name	    = "bind",
-	  .arg = { [1] = { .scnprintf = SCA_SOCKADDR, /* umyaddr */ }, }, },
+	  .arg = { [0] = { .scnprintf = SCA_INT, /* fd */ },
+		   [1] = { .scnprintf = SCA_SOCKADDR, /* umyaddr */ }, }, },
 	{ .name	    = "bpf",
 	  .arg = { [0] = STRARRAY(cmd, bpf_cmd), }, },
 	{ .name	    = "brk",	    .hexret = true,
@@ -726,7 +727,8 @@ static struct syscall_fmt {
 	{ .name	    = "close",
 	  .arg = { [0] = { .scnprintf = SCA_CLOSE_FD, /* fd */ }, }, },
 	{ .name	    = "connect",
-	  .arg = { [1] = { .scnprintf = SCA_SOCKADDR, /* servaddr */ },
+	  .arg = { [0] = { .scnprintf = SCA_INT, /* fd */ },
+		   [1] = { .scnprintf = SCA_SOCKADDR, /* servaddr */ },
 		   [2] = { .scnprintf = SCA_INT, /* addrlen */ }, }, },
 	{ .name	    = "epoll_ctl",
 	  .arg = { [1] = STRARRAY(op, epoll_ctl_ops), }, },
-- 
2.21.0


^ permalink raw reply related	[flat|nested] 38+ messages in thread

* [PATCH 25/37] perf trace beauty: Beautify 'sendto's sockaddr arg
  2019-07-22 17:38 [GIT PULL] perf/core improvements and fixes Arnaldo Carvalho de Melo
                   ` (23 preceding siblings ...)
  2019-07-22 17:38 ` [PATCH 24/37] perf trace beauty: Do not try to use the fd->pathname beautifier for bind/connect fd arg Arnaldo Carvalho de Melo
@ 2019-07-22 17:38 ` Arnaldo Carvalho de Melo
  2019-07-22 17:38 ` [PATCH 26/37] perf trace beauty: Beautify bind's " Arnaldo Carvalho de Melo
                   ` (11 subsequent siblings)
  36 siblings, 0 replies; 38+ messages in thread
From: Arnaldo Carvalho de Melo @ 2019-07-22 17:38 UTC (permalink / raw)
  To: Ingo Molnar, Thomas Gleixner
  Cc: Jiri Olsa, Namhyung Kim, Clark Williams, linux-kernel,
	linux-perf-users, Arnaldo Carvalho de Melo, Adrian Hunter,
	Luis Cláudio Gonçalves

From: Arnaldo Carvalho de Melo <acme@redhat.com>

By just writing the collector in the augmented_raw_syscalls.c BPF
program:

  # perf trace -e sendto
  <SNIP>
  ping/23492 sendto(3, 0x56253bbef700, 64, NONE, { .family: PF_INET, port: 0, addr: 10.10.161.32 }, 0x10) = 64
  ping/23492 sendto(3, 0x56253bbef700, 64, NONE, { .family: PF_INET, port: 0, addr: 10.10.161.32 }, 0x10) = 64
  ping/23492 sendto(3, 0x56253bbef700, 64, NONE, { .family: PF_INET, port: 0, addr: 10.10.161.32 }, 0x10) = 64
  ping/23492 sendto(3, 0x56253bbef700, 64, NONE, { .family: PF_INET, port: 0, addr: 10.10.161.32 }, 0x10) = 64
  Socket Thread/3573 sendto(247, 0x7fb32d49c000, 120, NONE, { .family: PF_UNSPEC }, NULL) = 120
  DNS Res~er #18/11374 sendto(242, 0x7fb342cfe420, 20, NONE, { .family: PF_NETLINK }, 0xc) = 20
  DNS Res~er #18/11374 sendto(242, 0x7fb342cfcca0, 42, MSG_NOSIGNAL, { .family: PF_UNSPEC }, NULL) = 42
  DNS Res~er #18/11374 sendto(242, 0x7fb342cfcccc, 42, MSG_NOSIGNAL, { .family: PF_UNSPEC }, NULL) = 42
  ping/23492 sendto(3, 0x56253bbef700, 64, NONE, { .family: PF_INET, port: 0, addr: 10.10.161.32 }, 0x10) = 64
  Socket Thread/3573 sendto(242, 0x7fb308bb1c08, 296, NONE, { .family: PF_UNSPEC }, NULL) = 296
  ping/23492 sendto(3, 0x56253bbef700, 64, NONE, { .family: PF_INET, port: 0, addr: 10.10.161.32 }, 0x10) = 64
  ping/23492 sendto(3, 0x56253bbef700, 64, NONE, { .family: PF_INET, port: 0, addr: 10.10.161.32 }, 0x10) = 64
  ping/23492 sendto(3, 0x56253bbef700, 64, NONE, { .family: PF_INET, port: 0, addr: 10.10.161.32 }, 0x10) = 64
  ping/23492 sendto(3, 0x56253bbef700, 64, NONE, { .family: PF_INET, port: 0, addr: 10.10.161.32 }, 0x10) = 64
  ^C
  #

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Luis Cláudio Gonçalves <lclaudio@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lkml.kernel.org/n/tip-p0l0rlvq19v5zf8qc2x2itow@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 .../examples/bpf/augmented_raw_syscalls.c     | 21 +++++++++++++++++++
 1 file changed, 21 insertions(+)

diff --git a/tools/perf/examples/bpf/augmented_raw_syscalls.c b/tools/perf/examples/bpf/augmented_raw_syscalls.c
index d7a292d7ee2f..9c2228b01ee6 100644
--- a/tools/perf/examples/bpf/augmented_raw_syscalls.c
+++ b/tools/perf/examples/bpf/augmented_raw_syscalls.c
@@ -142,6 +142,27 @@ int sys_enter_connect(struct syscall_enter_args *args)
 	return perf_event_output(args, &__augmented_syscalls__, BPF_F_CURRENT_CPU, augmented_args, len + socklen);
 }
 
+SEC("!syscalls:sys_enter_sendto")
+int sys_enter_sendto(struct syscall_enter_args *args)
+{
+	int key = 0;
+	struct augmented_args_payload *augmented_args = bpf_map_lookup_elem(&augmented_args_tmp, &key);
+	const void *sockaddr_arg = (const void *)args->args[4];
+	unsigned int socklen = args->args[5];
+	unsigned int len = sizeof(augmented_args->args);
+
+        if (augmented_args == NULL)
+                return 1; /* Failure: don't filter */
+
+	if (socklen > sizeof(augmented_args->saddr))
+		socklen = sizeof(augmented_args->saddr);
+
+	probe_read(&augmented_args->saddr, socklen, sockaddr_arg);
+
+	/* If perf_event_output fails, return non-zero so that it gets recorded unaugmented */
+	return perf_event_output(args, &__augmented_syscalls__, BPF_F_CURRENT_CPU, augmented_args, len + socklen);
+}
+
 SEC("!syscalls:sys_enter_open")
 int sys_enter_open(struct syscall_enter_args *args)
 {
-- 
2.21.0


^ permalink raw reply related	[flat|nested] 38+ messages in thread

* [PATCH 26/37] perf trace beauty: Beautify bind's sockaddr arg
  2019-07-22 17:38 [GIT PULL] perf/core improvements and fixes Arnaldo Carvalho de Melo
                   ` (24 preceding siblings ...)
  2019-07-22 17:38 ` [PATCH 25/37] perf trace beauty: Beautify 'sendto's sockaddr arg Arnaldo Carvalho de Melo
@ 2019-07-22 17:38 ` Arnaldo Carvalho de Melo
  2019-07-22 17:38 ` [PATCH 27/37] perf stat: Always separate stalled cycles per insn Arnaldo Carvalho de Melo
                   ` (10 subsequent siblings)
  36 siblings, 0 replies; 38+ messages in thread
From: Arnaldo Carvalho de Melo @ 2019-07-22 17:38 UTC (permalink / raw)
  To: Ingo Molnar, Thomas Gleixner
  Cc: Jiri Olsa, Namhyung Kim, Clark Williams, linux-kernel,
	linux-perf-users, Arnaldo Carvalho de Melo, Adrian Hunter,
	Luis Cláudio Gonçalves

From: Arnaldo Carvalho de Melo <acme@redhat.com>

By reusing the "connect" BPF collector.

Testing it system wide and stopping/starting sshd:

  # perf trace -e bind
  LLVM: dumping /home/acme/git/perf/tools/perf/examples/bpf/augmented_raw_syscalls.o
  DNS Res~er #18/15132 bind(243, { .family: PF_NETLINK }, 12)  = 0
  DNS Res~er #19/4833 bind(247, { .family: PF_NETLINK }, 12)  = 0
  DNS Res~er #19/4833 bind(238, { .family: PF_NETLINK }, 12)  = 0
  DNS Res~er #18/15132 bind(243, { .family: PF_NETLINK }, 12)  = 0
  DNS Res~er #18/10327 bind(258, { .family: PF_NETLINK }, 12)  = 0
  :6507/6507 bind(24, { .family: PF_NETLINK }, 12)   = 0
  DNS Res~er #19/4833 bind(238, { .family: PF_NETLINK }, 12)  = 0
  DNS Res~er #18/15132 bind(242, { .family: PF_NETLINK }, 12)  = 0
  sshd/6514 bind(3, { .family: PF_NETLINK }, 12)    = 0
  sshd/6514 bind(5, { .family: PF_INET, port: 22, addr: 0.0.0.0 }, 16) = 0
  sshd/6514 bind(7, { .family: PF_INET6, port: 22, addr: :: }, 28) = 0
  DNS Res~er #18/10327 bind(229, { .family: PF_NETLINK }, 12)  = 0
  DNS Res~er #18/15132 bind(231, { .family: PF_NETLINK }, 12)  = 0
  DNS Res~er #19/4833 bind(229, { .family: PF_NETLINK }, 12)  = 0
  ^C#

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Luis Cláudio Gonçalves <lclaudio@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lkml.kernel.org/n/tip-m2hmxqrckxxw2ciki0tu889u@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/builtin-trace.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/tools/perf/builtin-trace.c b/tools/perf/builtin-trace.c
index 010aa9e9a561..d403b09812d1 100644
--- a/tools/perf/builtin-trace.c
+++ b/tools/perf/builtin-trace.c
@@ -710,8 +710,10 @@ static struct syscall_fmt {
 	  .arg = { [0] = { .scnprintf = SCA_X86_ARCH_PRCTL_CODE, /* code */ },
 		   [1] = { .scnprintf = SCA_PTR, /* arg2 */ }, }, },
 	{ .name	    = "bind",
+	  .bpf_prog_name = { .sys_enter = "!syscalls:sys_enter_connect", },
 	  .arg = { [0] = { .scnprintf = SCA_INT, /* fd */ },
-		   [1] = { .scnprintf = SCA_SOCKADDR, /* umyaddr */ }, }, },
+		   [1] = { .scnprintf = SCA_SOCKADDR, /* umyaddr */ },
+		   [2] = { .scnprintf = SCA_INT, /* addrlen */ }, }, },
 	{ .name	    = "bpf",
 	  .arg = { [0] = STRARRAY(cmd, bpf_cmd), }, },
 	{ .name	    = "brk",	    .hexret = true,
-- 
2.21.0


^ permalink raw reply related	[flat|nested] 38+ messages in thread

* [PATCH 27/37] perf stat: Always separate stalled cycles per insn
  2019-07-22 17:38 [GIT PULL] perf/core improvements and fixes Arnaldo Carvalho de Melo
                   ` (25 preceding siblings ...)
  2019-07-22 17:38 ` [PATCH 26/37] perf trace beauty: Beautify bind's " Arnaldo Carvalho de Melo
@ 2019-07-22 17:38 ` Arnaldo Carvalho de Melo
  2019-07-22 17:38 ` [PATCH 28/37] perf session: Fix loading of compressed data split across adjacent records Arnaldo Carvalho de Melo
                   ` (9 subsequent siblings)
  36 siblings, 0 replies; 38+ messages in thread
From: Arnaldo Carvalho de Melo @ 2019-07-22 17:38 UTC (permalink / raw)
  To: Ingo Molnar, Thomas Gleixner
  Cc: Jiri Olsa, Namhyung Kim, Clark Williams, linux-kernel,
	linux-perf-users, Cong Wang, Andi Kleen,
	Arnaldo Carvalho de Melo

From: Cong Wang <xiyou.wangcong@gmail.com>

The "stalled cycles per insn" is appended to "instructions" when the CPU
has this hardware counter directly. We should always make it a separate
line, which also aligns to the output when we hit the "if (total &&
avg)" branch.

Before:

  $ sudo perf stat --all-cpus --field-separator , --log-fd 1 -einstructions,cycles -- sleep 1
  4565048704,,instructions,64114578096,100.00,1.34,insn per cycle,,
  3396325133,,cycles,64146628546,100.00,,

After:

  $ sudo ./tools/perf/perf stat --all-cpus --field-separator , --log-fd 1 -einstructions,cycles -- sleep 1
  6721924,,instructions,24026790339,100.00,0.22,insn per cycle
  ,,,,,0.00,stalled cycles per insn
  30939953,,cycles,24025512526,100.00,,

Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Andi Kleen <ak@linux.intel.com>
Link: http://lkml.kernel.org/r/20190517221039.8975-1-xiyou.wangcong@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/util/stat-shadow.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/tools/perf/util/stat-shadow.c b/tools/perf/util/stat-shadow.c
index 656065af4971..accb1bf1cfd8 100644
--- a/tools/perf/util/stat-shadow.c
+++ b/tools/perf/util/stat-shadow.c
@@ -819,7 +819,8 @@ void perf_stat__print_shadow_stats(struct perf_stat_config *config,
 					"stalled cycles per insn",
 					ratio);
 		} else if (have_frontend_stalled) {
-			print_metric(config, ctxp, NULL, NULL,
+			out->new_line(config, ctxp);
+			print_metric(config, ctxp, NULL, "%7.2f ",
 				     "stalled cycles per insn", 0);
 		}
 	} else if (perf_evsel__match(evsel, HARDWARE, HW_BRANCH_MISSES)) {
-- 
2.21.0


^ permalink raw reply related	[flat|nested] 38+ messages in thread

* [PATCH 28/37] perf session: Fix loading of compressed data split across adjacent records
  2019-07-22 17:38 [GIT PULL] perf/core improvements and fixes Arnaldo Carvalho de Melo
                   ` (26 preceding siblings ...)
  2019-07-22 17:38 ` [PATCH 27/37] perf stat: Always separate stalled cycles per insn Arnaldo Carvalho de Melo
@ 2019-07-22 17:38 ` Arnaldo Carvalho de Melo
  2019-07-22 17:38 ` [PATCH 29/37] perf trace beauty: Add BPF augmenter for the 'rename' syscall Arnaldo Carvalho de Melo
                   ` (8 subsequent siblings)
  36 siblings, 0 replies; 38+ messages in thread
From: Arnaldo Carvalho de Melo @ 2019-07-22 17:38 UTC (permalink / raw)
  To: Ingo Molnar, Thomas Gleixner
  Cc: Jiri Olsa, Namhyung Kim, Clark Williams, linux-kernel,
	linux-perf-users, Alexey Budankov, Alexander Shishkin,
	Andi Kleen, Peter Zijlstra, Arnaldo Carvalho de Melo

From: Alexey Budankov <alexey.budankov@linux.intel.com>

Fix decompression failure found during the loading of compressed trace
collected on larger scale systems (>48 cores).

The error happened due to lack of decompression space for a mmaped
buffer data chunk split across adjacent PERF_RECORD_COMPRESSED records.

  $ perf report -i bt.16384.data --stats
  failed to decompress (B): 63869 -> 0 : Destination buffer is too small
  user stack dump failure
  Can't parse sample, err = -14
  0x2637e436 [0x4080]: failed to process type: 9
  Error:
  failed to process sample

  $ perf test 71
  71: Zstd perf.data compression/decompression              : Ok

Signed-off-by: Alexey Budankov <alexey.budankov@linux.intel.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/4d839e1b-9c48-89c4-9702-a12217420611@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/util/session.c | 22 ++++++++++++++--------
 tools/perf/util/session.h |  1 +
 tools/perf/util/zstd.c    |  4 ++--
 3 files changed, 17 insertions(+), 10 deletions(-)

diff --git a/tools/perf/util/session.c b/tools/perf/util/session.c
index d0fd6c614e68..37efa1f43d8b 100644
--- a/tools/perf/util/session.c
+++ b/tools/perf/util/session.c
@@ -36,10 +36,16 @@ static int perf_session__process_compressed_event(struct perf_session *session,
 	void *src;
 	size_t decomp_size, src_size;
 	u64 decomp_last_rem = 0;
-	size_t decomp_len = session->header.env.comp_mmap_len;
+	size_t mmap_len, decomp_len = session->header.env.comp_mmap_len;
 	struct decomp *decomp, *decomp_last = session->decomp_last;
 
-	decomp = mmap(NULL, sizeof(struct decomp) + decomp_len, PROT_READ|PROT_WRITE,
+	if (decomp_last) {
+		decomp_last_rem = decomp_last->size - decomp_last->head;
+		decomp_len += decomp_last_rem;
+	}
+
+	mmap_len = sizeof(struct decomp) + decomp_len;
+	decomp = mmap(NULL, mmap_len, PROT_READ|PROT_WRITE,
 		      MAP_ANONYMOUS|MAP_PRIVATE, -1, 0);
 	if (decomp == MAP_FAILED) {
 		pr_err("Couldn't allocate memory for decompression\n");
@@ -47,10 +53,10 @@ static int perf_session__process_compressed_event(struct perf_session *session,
 	}
 
 	decomp->file_pos = file_offset;
+	decomp->mmap_len = mmap_len;
 	decomp->head = 0;
 
-	if (decomp_last) {
-		decomp_last_rem = decomp_last->size - decomp_last->head;
+	if (decomp_last_rem) {
 		memcpy(decomp->data, &(decomp_last->data[decomp_last->head]), decomp_last_rem);
 		decomp->size = decomp_last_rem;
 	}
@@ -61,7 +67,7 @@ static int perf_session__process_compressed_event(struct perf_session *session,
 	decomp_size = zstd_decompress_stream(&(session->zstd_data), src, src_size,
 				&(decomp->data[decomp_last_rem]), decomp_len - decomp_last_rem);
 	if (!decomp_size) {
-		munmap(decomp, sizeof(struct decomp) + decomp_len);
+		munmap(decomp, mmap_len);
 		pr_err("Couldn't decompress data\n");
 		return -1;
 	}
@@ -255,15 +261,15 @@ static void perf_session__delete_threads(struct perf_session *session)
 static void perf_session__release_decomp_events(struct perf_session *session)
 {
 	struct decomp *next, *decomp;
-	size_t decomp_len;
+	size_t mmap_len;
 	next = session->decomp;
-	decomp_len = session->header.env.comp_mmap_len;
 	do {
 		decomp = next;
 		if (decomp == NULL)
 			break;
 		next = decomp->next;
-		munmap(decomp, decomp_len + sizeof(struct decomp));
+		mmap_len = decomp->mmap_len;
+		munmap(decomp, mmap_len);
 	} while (1);
 }
 
diff --git a/tools/perf/util/session.h b/tools/perf/util/session.h
index dd8920b745bc..863dbad87849 100644
--- a/tools/perf/util/session.h
+++ b/tools/perf/util/session.h
@@ -46,6 +46,7 @@ struct perf_session {
 struct decomp {
 	struct decomp *next;
 	u64 file_pos;
+	size_t mmap_len;
 	u64 head;
 	size_t size;
 	char data[];
diff --git a/tools/perf/util/zstd.c b/tools/perf/util/zstd.c
index 23bdb9884576..d2202392ffdb 100644
--- a/tools/perf/util/zstd.c
+++ b/tools/perf/util/zstd.c
@@ -99,8 +99,8 @@ size_t zstd_decompress_stream(struct zstd_data *data, void *src, size_t src_size
 	while (input.pos < input.size) {
 		ret = ZSTD_decompressStream(data->dstream, &output, &input);
 		if (ZSTD_isError(ret)) {
-			pr_err("failed to decompress (B): %ld -> %ld : %s\n",
-			       src_size, output.size, ZSTD_getErrorName(ret));
+			pr_err("failed to decompress (B): %ld -> %ld, dst_size %ld : %s\n",
+			       src_size, output.size, dst_size, ZSTD_getErrorName(ret));
 			break;
 		}
 		output.dst  = dst + output.pos;
-- 
2.21.0


^ permalink raw reply related	[flat|nested] 38+ messages in thread

* [PATCH 29/37] perf trace beauty: Add BPF augmenter for the 'rename' syscall
  2019-07-22 17:38 [GIT PULL] perf/core improvements and fixes Arnaldo Carvalho de Melo
                   ` (27 preceding siblings ...)
  2019-07-22 17:38 ` [PATCH 28/37] perf session: Fix loading of compressed data split across adjacent records Arnaldo Carvalho de Melo
@ 2019-07-22 17:38 ` Arnaldo Carvalho de Melo
  2019-07-22 17:38 ` [PATCH 30/37] perf trace: Forward error codes when trying to read syscall info Arnaldo Carvalho de Melo
                   ` (7 subsequent siblings)
  36 siblings, 0 replies; 38+ messages in thread
From: Arnaldo Carvalho de Melo @ 2019-07-22 17:38 UTC (permalink / raw)
  To: Ingo Molnar, Thomas Gleixner
  Cc: Jiri Olsa, Namhyung Kim, Clark Williams, linux-kernel,
	linux-perf-users, Arnaldo Carvalho de Melo, Adrian Hunter,
	Luis Cláudio Gonçalves

From: Arnaldo Carvalho de Melo <acme@redhat.com>

I.e. two strings:

  # perf trace -e rename
  systemd/1 rename("/run/systemd/units/.#invocation:dnf-makecache.service970761b7f2840dcc", "/run/systemd/units/invocation:dnf-makecache.service") = 0
  systemd-journa/715 rename("/run/systemd/journal/streams/.#9:17539785BJDblc", "/run/systemd/journal/streams/9:17539785") = 0
  mv/1936 rename("/tmp/build/perf/fd/.array.o.tmp", "/tmp/build/perf/fd/.array.o.cmd") = 0
  sh/1949 rename("/tmp/build/perf/.cpu.o.tmp", "/tmp/build/perf/.cpu.o.cmd") = 0
  mv/1954 rename("/tmp/build/perf/fs/.tracing_path.o.tmp", "/tmp/build/perf/fs/.tracing_path.o.cmd") = 0
  mv/1963 rename("/tmp/build/perf/common-cmds.h+", "/tmp/build/perf/common-cmds.h") = 0
  :1975/1975 rename("/tmp/build/perf/.exec-cmd.o.tmp", "/tmp/build/perf/.exec-cmd.o.cmd") = 0
  mv/1979 rename("/tmp/build/perf/fs/.fs.o.tmp", "/tmp/build/perf/fs/.fs.o.cmd") = 0
  mv/2005 rename("/tmp/build/perf/.debug.o.tmp", "/tmp/build/perf/.debug.o.cmd") = 0
  mv/2012 rename("/tmp/build/perf/.str_error_r.o.tmp", "/tmp/build/perf/.str_error_r.o.cmd") = 0
  mv/2019 rename("/tmp/build/perf/.help.o.tmp", "/tmp/build/perf/.help.o.cmd") = 0
  mv/2031 rename("/tmp/build/perf/.trace-seq.o.tmp", "/tmp/build/perf/.trace-seq.o.cmd") = 0
  make/2038  ... [continued]: rename())             = 0
  :2038/2038 rename("/tmp/build/perf/.event-plugin.o.tmp", "/tmp/build/perf/.event-plugin.o.cmd") ...
  ar/2035 rename("/tmp/build/perf/stzwBX3a", "/tmp/build/perf/libapi.a") = 0
  mv/2051 rename("/tmp/build/perf/.parse-utils.o.tmp", "/tmp/build/perf/.parse-utils.o.cmd") = 0
  mv/2069 rename("/tmp/build/perf/.subcmd-config.o.tmp", "/tmp/build/perf/.subcmd-config.o.cmd") = 0
  make/2080 rename("/tmp/build/perf/.parse-filter.o.tmp", "/tmp/build/perf/.parse-filter.o.cmd") = 0
  mv/2099 rename("/tmp/build/perf/.pager.o.tmp", "/tmp/build/perf/.pager.o.cmd") = 0
  :2124/2124 rename("/tmp/build/perf/.sigchain.o.tmp", "/tmp/build/perf/.sigchain.o.cmd") = 0
  make/2140 rename("/tmp/build/perf/.event-parse.o.tmp", "/tmp/build/perf/.event-parse.o.cmd") = 0
  mv/2164 rename("/tmp/build/perf/.kbuffer-parse.o.tmp", "/tmp/build/perf/.kbuffer-parse.o.cmd") = 0
  sh/2174 rename("/tmp/build/perf/.run-command.o.tmp", "/tmp/build/perf/.run-command.o.cmd") = 0
  mv/2190 rename("/tmp/build/perf/.tep_strerror.o.tmp", "/tmp/build/perf/.tep_strerror.o.cmd") = 0
  :2261/2261 rename("/tmp/build/perf/.event-parse-api.o.tmp", "/tmp/build/perf/.event-parse-api.o.cmd") = 0
  :2480/2480 rename("/tmp/build/perf/stLv3kG2", "/tmp/build/perf/libtraceevent.a") = 0
  ^C#

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Luis Cláudio Gonçalves <lclaudio@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lkml.kernel.org/n/tip-6hh2rl27uri6gsxhmk6q3hx5@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 .../examples/bpf/augmented_raw_syscalls.c     | 19 +++++++++++++++++++
 1 file changed, 19 insertions(+)

diff --git a/tools/perf/examples/bpf/augmented_raw_syscalls.c b/tools/perf/examples/bpf/augmented_raw_syscalls.c
index 9c2228b01ee6..79787cf4fce9 100644
--- a/tools/perf/examples/bpf/augmented_raw_syscalls.c
+++ b/tools/perf/examples/bpf/augmented_raw_syscalls.c
@@ -197,6 +197,25 @@ int sys_enter_openat(struct syscall_enter_args *args)
 	return perf_event_output(args, &__augmented_syscalls__, BPF_F_CURRENT_CPU, augmented_args, len);
 }
 
+SEC("!syscalls:sys_enter_rename")
+int sys_enter_rename(struct syscall_enter_args *args)
+{
+	int key = 0;
+	struct augmented_args_payload *augmented_args = bpf_map_lookup_elem(&augmented_args_tmp, &key);
+	const void *oldpath_arg = (const void *)args->args[0],
+		   *newpath_arg = (const void *)args->args[1];
+	unsigned int len = sizeof(augmented_args->args), oldpath_len;
+
+        if (augmented_args == NULL)
+                return 1; /* Failure: don't filter */
+
+	oldpath_len = augmented_filename__read(&augmented_args->filename, oldpath_arg, sizeof(augmented_args->filename.value));
+	len += oldpath_len + augmented_filename__read((void *)(&augmented_args->filename) + oldpath_len, newpath_arg, sizeof(augmented_args->filename.value));
+
+	/* If perf_event_output fails, return non-zero so that it gets recorded unaugmented */
+	return perf_event_output(args, &__augmented_syscalls__, BPF_F_CURRENT_CPU, augmented_args, len);
+}
+
 SEC("!syscalls:sys_enter_renameat")
 int sys_enter_renameat(struct syscall_enter_args *args)
 {
-- 
2.21.0


^ permalink raw reply related	[flat|nested] 38+ messages in thread

* [PATCH 30/37] perf trace: Forward error codes when trying to read syscall info
  2019-07-22 17:38 [GIT PULL] perf/core improvements and fixes Arnaldo Carvalho de Melo
                   ` (28 preceding siblings ...)
  2019-07-22 17:38 ` [PATCH 29/37] perf trace beauty: Add BPF augmenter for the 'rename' syscall Arnaldo Carvalho de Melo
@ 2019-07-22 17:38 ` Arnaldo Carvalho de Melo
  2019-07-22 17:38 ` [PATCH 31/37] perf trace: Mark syscall ids that are not allocated to avoid unnecessary error messages Arnaldo Carvalho de Melo
                   ` (6 subsequent siblings)
  36 siblings, 0 replies; 38+ messages in thread
From: Arnaldo Carvalho de Melo @ 2019-07-22 17:38 UTC (permalink / raw)
  To: Ingo Molnar, Thomas Gleixner
  Cc: Jiri Olsa, Namhyung Kim, Clark Williams, linux-kernel,
	linux-perf-users, Arnaldo Carvalho de Melo, Adrian Hunter,
	Luis Cláudio Gonçalves

From: Arnaldo Carvalho de Melo <acme@redhat.com>

We iterate thru the syscall table produced from the kernel syscall
tables reading info, propagate the error and add to the debug message.

This helps in fixing further bugs, such as failing to read the
"sendfile" syscall info when it really should try the aliasm
"sendfile64".

  Problems reading syscall 40: 2 (No such file or directory)(sendfile) information

  # grep sendfile /tmp/build/perf/arch/x86/include/generated/asm/syscalls_64.c
	[40] = "sendfile",
  #

I.e. in the tracefs format file for the syscall tracepoints we have it
as sendfile64:

  # find /sys -type f -name format | grep sendfile
  /sys/kernel/debug/tracing/events/syscalls/sys_enter_sendfile64/format
  /sys/kernel/debug/tracing/events/syscalls/sys_exit_sendfile64/format
  #

But as "sendfile" in the file used to build the syscall table used in
perf:

  $ grep sendfile arch/x86/entry/syscalls/syscall_64.tbl
  40	common	sendfile		__x64_sys_sendfile64
  $

So we need to add, in followup patches, aliases in 'perf trace' syscall
data structures to cope with thie.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Luis Cláudio Gonçalves <lclaudio@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lkml.kernel.org/n/tip-w3eluap63x9je0bb8o3t79tz@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/builtin-trace.c | 15 +++++++++------
 1 file changed, 9 insertions(+), 6 deletions(-)

diff --git a/tools/perf/builtin-trace.c b/tools/perf/builtin-trace.c
index d403b09812d1..5dae7b172291 100644
--- a/tools/perf/builtin-trace.c
+++ b/tools/perf/builtin-trace.c
@@ -1492,13 +1492,13 @@ static int trace__read_syscall_info(struct trace *trace, int id)
 	const char *name = syscalltbl__name(trace->sctbl, id);
 
 	if (name == NULL)
-		return -1;
+		return -EINVAL;
 
 	if (id > trace->syscalls.max) {
 		struct syscall *nsyscalls = realloc(trace->syscalls.table, (id + 1) * sizeof(*sc));
 
 		if (nsyscalls == NULL)
-			return -1;
+			return -ENOMEM;
 
 		if (trace->syscalls.max != -1) {
 			memset(nsyscalls + trace->syscalls.max + 1, 0,
@@ -1525,10 +1525,10 @@ static int trace__read_syscall_info(struct trace *trace, int id)
 	}
 
 	if (syscall__alloc_arg_fmts(sc, IS_ERR(sc->tp_format) ? 6 : sc->tp_format->format.nr_fields))
-		return -1;
+		return -ENOMEM;
 
 	if (IS_ERR(sc->tp_format))
-		return -1;
+		return PTR_ERR(sc->tp_format);
 
 	sc->args = sc->tp_format->format.fields;
 	/*
@@ -1789,6 +1789,7 @@ typedef int (*tracepoint_handler)(struct trace *trace, struct perf_evsel *evsel,
 static struct syscall *trace__syscall_info(struct trace *trace,
 					   struct perf_evsel *evsel, int id)
 {
+	int err = 0;
 
 	if (id < 0) {
 
@@ -1811,9 +1812,10 @@ static struct syscall *trace__syscall_info(struct trace *trace,
 	}
 
 	if ((id > trace->syscalls.max || trace->syscalls.table[id].name == NULL) &&
-	    trace__read_syscall_info(trace, id))
+	    (err = trace__read_syscall_info(trace, id)) != 0)
 		goto out_cant_read;
 
+	err = -EINVAL;
 	if ((id > trace->syscalls.max || trace->syscalls.table[id].name == NULL))
 		goto out_cant_read;
 
@@ -1821,7 +1823,8 @@ static struct syscall *trace__syscall_info(struct trace *trace,
 
 out_cant_read:
 	if (verbose > 0) {
-		fprintf(trace->output, "Problems reading syscall %d", id);
+		char sbuf[STRERR_BUFSIZE];
+		fprintf(trace->output, "Problems reading syscall %d: %d (%s)", id, -err, str_error_r(-err, sbuf, sizeof(sbuf)));
 		if (id <= trace->syscalls.max && trace->syscalls.table[id].name != NULL)
 			fprintf(trace->output, "(%s)", trace->syscalls.table[id].name);
 		fputs(" information\n", trace->output);
-- 
2.21.0


^ permalink raw reply related	[flat|nested] 38+ messages in thread

* [PATCH 31/37] perf trace: Mark syscall ids that are not allocated to avoid unnecessary error messages
  2019-07-22 17:38 [GIT PULL] perf/core improvements and fixes Arnaldo Carvalho de Melo
                   ` (29 preceding siblings ...)
  2019-07-22 17:38 ` [PATCH 30/37] perf trace: Forward error codes when trying to read syscall info Arnaldo Carvalho de Melo
@ 2019-07-22 17:38 ` Arnaldo Carvalho de Melo
  2019-07-22 17:38 ` [PATCH 32/37] perf trace: Preallocate the syscall table Arnaldo Carvalho de Melo
                   ` (5 subsequent siblings)
  36 siblings, 0 replies; 38+ messages in thread
From: Arnaldo Carvalho de Melo @ 2019-07-22 17:38 UTC (permalink / raw)
  To: Ingo Molnar, Thomas Gleixner
  Cc: Jiri Olsa, Namhyung Kim, Clark Williams, linux-kernel,
	linux-perf-users, Arnaldo Carvalho de Melo, Adrian Hunter,
	Luis Cláudio Gonçalves

From: Arnaldo Carvalho de Melo <acme@redhat.com>

There are holes in syscall tables with IDs not associated with any
syscall, mark those when trying to read information for syscalls, which
could happen when iterating thru all syscalls from 0 to the highest
numbered syscall id.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Luis Cláudio Gonçalves <lclaudio@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lkml.kernel.org/n/tip-cku9mpcrcsqaiq0jepu86r68@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/builtin-trace.c | 25 +++++++++++++++++++------
 1 file changed, 19 insertions(+), 6 deletions(-)

diff --git a/tools/perf/builtin-trace.c b/tools/perf/builtin-trace.c
index 5dae7b172291..765b998755ce 100644
--- a/tools/perf/builtin-trace.c
+++ b/tools/perf/builtin-trace.c
@@ -976,6 +976,7 @@ static struct syscall_fmt *syscall_fmt__find_by_alias(const char *alias)
  * is_exit: is this "exit" or "exit_group"?
  * is_open: is this "open" or "openat"? To associate the fd returned in sys_exit with the pathname in sys_enter.
  * args_size: sum of the sizes of the syscall arguments, anything after that is augmented stuff: pathname for openat, etc.
+ * nonexistent: Just a hole in the syscall table, syscall id not allocated
  */
 struct syscall {
 	struct tep_event    *tp_format;
@@ -987,6 +988,7 @@ struct syscall {
 	}		    bpf_prog;
 	bool		    is_exit;
 	bool		    is_open;
+	bool		    nonexistent;
 	struct tep_format_field *args;
 	const char	    *name;
 	struct syscall_fmt  *fmt;
@@ -1491,9 +1493,6 @@ static int trace__read_syscall_info(struct trace *trace, int id)
 	struct syscall *sc;
 	const char *name = syscalltbl__name(trace->sctbl, id);
 
-	if (name == NULL)
-		return -EINVAL;
-
 	if (id > trace->syscalls.max) {
 		struct syscall *nsyscalls = realloc(trace->syscalls.table, (id + 1) * sizeof(*sc));
 
@@ -1512,8 +1511,15 @@ static int trace__read_syscall_info(struct trace *trace, int id)
 	}
 
 	sc = trace->syscalls.table + id;
-	sc->name = name;
+	if (sc->nonexistent)
+		return 0;
 
+	if (name == NULL) {
+		sc->nonexistent = true;
+		return 0;
+	}
+
+	sc->name = name;
 	sc->fmt  = syscall_fmt__find(sc->name);
 
 	snprintf(tp_name, sizeof(tp_name), "sys_enter_%s", sc->name);
@@ -1811,14 +1817,21 @@ static struct syscall *trace__syscall_info(struct trace *trace,
 		return NULL;
 	}
 
+	err = -EINVAL;
+
 	if ((id > trace->syscalls.max || trace->syscalls.table[id].name == NULL) &&
 	    (err = trace__read_syscall_info(trace, id)) != 0)
 		goto out_cant_read;
 
-	err = -EINVAL;
-	if ((id > trace->syscalls.max || trace->syscalls.table[id].name == NULL))
+	if (id > trace->syscalls.max)
 		goto out_cant_read;
 
+	if (trace->syscalls.table[id].name == NULL) {
+		if (trace->syscalls.table[id].nonexistent)
+			return NULL;
+		goto out_cant_read;
+	}
+
 	return &trace->syscalls.table[id];
 
 out_cant_read:
-- 
2.21.0


^ permalink raw reply related	[flat|nested] 38+ messages in thread

* [PATCH 32/37] perf trace: Preallocate the syscall table
  2019-07-22 17:38 [GIT PULL] perf/core improvements and fixes Arnaldo Carvalho de Melo
                   ` (30 preceding siblings ...)
  2019-07-22 17:38 ` [PATCH 31/37] perf trace: Mark syscall ids that are not allocated to avoid unnecessary error messages Arnaldo Carvalho de Melo
@ 2019-07-22 17:38 ` Arnaldo Carvalho de Melo
  2019-07-22 17:38 ` [PATCH 33/37] perf trace: Reuse BPF augmenters from syscalls with similar args signature Arnaldo Carvalho de Melo
                   ` (4 subsequent siblings)
  36 siblings, 0 replies; 38+ messages in thread
From: Arnaldo Carvalho de Melo @ 2019-07-22 17:38 UTC (permalink / raw)
  To: Ingo Molnar, Thomas Gleixner
  Cc: Jiri Olsa, Namhyung Kim, Clark Williams, linux-kernel,
	linux-perf-users, Arnaldo Carvalho de Melo, Adrian Hunter,
	Brendan Gregg, Luis Cláudio Gonçalves

From: Arnaldo Carvalho de Melo <acme@redhat.com>

We'll continue reading its details from tracefs as we need it, but
preallocate the whole thing otherwise we may realloc and end up with
pointers to the previous buffer.

I.e. in an upcoming algorithm we'll look for syscalls that have function
signatures that are similar to a given syscall to see if we can reuse
its BPF augmenter, so we may be at syscall 42, having a 'struct syscall'
pointing to that slot in trace->syscalls.table[] and try to read the
slot for an yet unread syscall, which would realloc that table to read
the info for syscall 43, say, which would trigger a realoc of
trace->syscalls.table[], and then the pointer we had for syscall 42
would be pointing to the previous block of memory. b00m.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Luis Cláudio Gonçalves <lclaudio@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lkml.kernel.org/n/tip-m3cjzzifibs13imafhkk77a0@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/builtin-trace.c   | 29 +++++++----------------------
 tools/perf/util/syscalltbl.c |  1 +
 tools/perf/util/syscalltbl.h |  1 +
 3 files changed, 9 insertions(+), 22 deletions(-)

diff --git a/tools/perf/builtin-trace.c b/tools/perf/builtin-trace.c
index 765b998755ce..d8565c9a18a2 100644
--- a/tools/perf/builtin-trace.c
+++ b/tools/perf/builtin-trace.c
@@ -79,7 +79,6 @@ struct trace {
 	struct perf_tool	tool;
 	struct syscalltbl	*sctbl;
 	struct {
-		int		max;
 		struct syscall  *table;
 		struct bpf_map  *map;
 		struct { // per syscall BPF_MAP_TYPE_PROG_ARRAY
@@ -1493,21 +1492,10 @@ static int trace__read_syscall_info(struct trace *trace, int id)
 	struct syscall *sc;
 	const char *name = syscalltbl__name(trace->sctbl, id);
 
-	if (id > trace->syscalls.max) {
-		struct syscall *nsyscalls = realloc(trace->syscalls.table, (id + 1) * sizeof(*sc));
-
-		if (nsyscalls == NULL)
+	if (trace->syscalls.table == NULL) {
+		trace->syscalls.table = calloc(trace->sctbl->syscalls.nr_entries, sizeof(*sc));
+		if (trace->syscalls.table == NULL)
 			return -ENOMEM;
-
-		if (trace->syscalls.max != -1) {
-			memset(nsyscalls + trace->syscalls.max + 1, 0,
-			       (id - trace->syscalls.max) * sizeof(*sc));
-		} else {
-			memset(nsyscalls, 0, (id + 1) * sizeof(*sc));
-		}
-
-		trace->syscalls.table = nsyscalls;
-		trace->syscalls.max   = id;
 	}
 
 	sc = trace->syscalls.table + id;
@@ -1819,11 +1807,11 @@ static struct syscall *trace__syscall_info(struct trace *trace,
 
 	err = -EINVAL;
 
-	if ((id > trace->syscalls.max || trace->syscalls.table[id].name == NULL) &&
-	    (err = trace__read_syscall_info(trace, id)) != 0)
+	if (id > trace->sctbl->syscalls.max_id)
 		goto out_cant_read;
 
-	if (id > trace->syscalls.max)
+	if ((trace->syscalls.table == NULL || trace->syscalls.table[id].name == NULL) &&
+	    (err = trace__read_syscall_info(trace, id)) != 0)
 		goto out_cant_read;
 
 	if (trace->syscalls.table[id].name == NULL) {
@@ -1838,7 +1826,7 @@ static struct syscall *trace__syscall_info(struct trace *trace,
 	if (verbose > 0) {
 		char sbuf[STRERR_BUFSIZE];
 		fprintf(trace->output, "Problems reading syscall %d: %d (%s)", id, -err, str_error_r(-err, sbuf, sizeof(sbuf)));
-		if (id <= trace->syscalls.max && trace->syscalls.table[id].name != NULL)
+		if (id <= trace->sctbl->syscalls.max_id && trace->syscalls.table[id].name != NULL)
 			fprintf(trace->output, "(%s)", trace->syscalls.table[id].name);
 		fputs(" information\n", trace->output);
 	}
@@ -3922,9 +3910,6 @@ int cmd_trace(int argc, const char **argv)
 		NULL
 	};
 	struct trace trace = {
-		.syscalls = {
-			. max = -1,
-		},
 		.opts = {
 			.target = {
 				.uid	   = UINT_MAX,
diff --git a/tools/perf/util/syscalltbl.c b/tools/perf/util/syscalltbl.c
index 022a9c670338..820fceeb19a9 100644
--- a/tools/perf/util/syscalltbl.c
+++ b/tools/perf/util/syscalltbl.c
@@ -79,6 +79,7 @@ static int syscalltbl__init_native(struct syscalltbl *tbl)
 
 	qsort(tbl->syscalls.entries, nr_entries, sizeof(struct syscall), syscallcmp);
 	tbl->syscalls.nr_entries = nr_entries;
+	tbl->syscalls.max_id	 = syscalltbl_native_max_id;
 	return 0;
 }
 
diff --git a/tools/perf/util/syscalltbl.h b/tools/perf/util/syscalltbl.h
index c8e7e9ce0f01..9172613028d0 100644
--- a/tools/perf/util/syscalltbl.h
+++ b/tools/perf/util/syscalltbl.h
@@ -6,6 +6,7 @@ struct syscalltbl {
 	union {
 		int audit_machine;
 		struct {
+			int max_id;
 			int nr_entries;
 			void *entries;
 		} syscalls;
-- 
2.21.0


^ permalink raw reply related	[flat|nested] 38+ messages in thread

* [PATCH 33/37] perf trace: Reuse BPF augmenters from syscalls with similar args signature
  2019-07-22 17:38 [GIT PULL] perf/core improvements and fixes Arnaldo Carvalho de Melo
                   ` (31 preceding siblings ...)
  2019-07-22 17:38 ` [PATCH 32/37] perf trace: Preallocate the syscall table Arnaldo Carvalho de Melo
@ 2019-07-22 17:38 ` Arnaldo Carvalho de Melo
  2019-07-22 17:38 ` [PATCH 34/37] perf trace: Add "sendfile64" alias to the "sendfile" syscall Arnaldo Carvalho de Melo
                   ` (3 subsequent siblings)
  36 siblings, 0 replies; 38+ messages in thread
From: Arnaldo Carvalho de Melo @ 2019-07-22 17:38 UTC (permalink / raw)
  To: Ingo Molnar, Thomas Gleixner
  Cc: Jiri Olsa, Namhyung Kim, Clark Williams, linux-kernel,
	linux-perf-users, Arnaldo Carvalho de Melo, Adrian Hunter,
	Luis Cláudio Gonçalves

From: Arnaldo Carvalho de Melo <acme@redhat.com>

We have an augmenter for the "open" syscall, which has just one pointer,
in the first argument, a "const char *", so any other syscall that has
just one pointer and that is the first can reuse the "open" BPF
augmenter program.

Even more, syscalls that get two pointers with the first being a string
can reuse "open"'s BPF augmenter till we have an augmenter that better
matches that syscall with two pointers.

With this the few augmenters we have, for open (first arg is a string),
openat (2nd arg is a string), renameat (2nd and 4th are strings) can be
reused by a lot of syscalls, ditto for "bind" reusing "connect" because
both have the 2nd argument as a sockaddr and the 3rd as its len.

Lets see how this makes the "bind" syscall reuse the "connect" BPF prog
augmenter found in tools/perf/examples/bpf/augmented_raw_syscalls.c:

  # perf trace -e bind,connect systemctl restart sshd
  connect(3, { .family: PF_LOCAL, path: /run/systemd/private }, 23) = 0
  #

Oh, it just connects to some daemon, so we better do it system wide and then
stop/start sshd:

  # perf trace -e bind,connect
  systemctl/10124 connect(3, { .family: PF_LOCAL, path: /run/systemd/private }, 23) = 0
  sshd/10102 connect(7, { .family: PF_LOCAL, path: /dev/log }, 110) = 0
  systemctl/10126 connect(3, { .family: PF_LOCAL, path: /run/systemd/private }, 23) = 0
  systemd/10128  ... [continued]: connect())            = 0
  (sshd)/10128 connect(3, { .family: PF_LOCAL, path: /run/systemd/journal/stdout }, 30) ...
  sshd/10128 bind(3, { .family: PF_NETLINK }, 12)    = 0
  sshd/10128 connect(4, { .family: PF_LOCAL, path: /var/run/nscd/socket }, 110) = -1 ENOENT (No such file or directory)
  sshd/10128 connect(3, { .family: PF_INET6, port: 22, addr: :: }, 28) = 0
  sshd/10128 connect(3, { .family: PF_UNSPEC }, 16)  = 0
  sshd/10128 connect(3, { .family: PF_INET, port: 22, addr: 0.0.0.0 }, 16) = 0
  sshd/10128 connect(3, { .family: PF_LOCAL, path: /var/run/nscd/socket }, 110) = -1 ENOENT (No such file or directory)
  sshd/10128 connect(3, { .family: PF_LOCAL, path: /var/run/nscd/socket }, 110) = -1 ENOENT (No such file or directory)
  sshd/10128 connect(5, { .family: PF_LOCAL, path: /var/run/nscd/socket }, 110) = -1 ENOENT (No such file or directory)
  sshd/10128 connect(5, { .family: PF_LOCAL, path: /var/run/nscd/socket }, 110) = -1 ENOENT (No such file or directory)
  sshd/10128 bind(4, { .family: PF_INET, port: 22, addr: 0.0.0.0 }, 16) = 0
  sshd/10128 connect(6, { .family: PF_LOCAL, path: /dev/log }, 110) = 0
  sshd/10128 bind(6, { .family: PF_INET6, port: 22, addr: :: }, 28) = 0
  sshd/10128 connect(7, { .family: PF_LOCAL, path: /dev/log }, 110) = 0
  ^C#

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Luis Cláudio Gonçalves <lclaudio@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lkml.kernel.org/n/tip-zfley2ghs4nim1uq4nu6ed3l@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/builtin-trace.c | 154 ++++++++++++++++++++++++++++++++++++-
 1 file changed, 152 insertions(+), 2 deletions(-)

diff --git a/tools/perf/builtin-trace.c b/tools/perf/builtin-trace.c
index d8565c9a18a2..200fbe33d5de 100644
--- a/tools/perf/builtin-trace.c
+++ b/tools/perf/builtin-trace.c
@@ -709,7 +709,6 @@ static struct syscall_fmt {
 	  .arg = { [0] = { .scnprintf = SCA_X86_ARCH_PRCTL_CODE, /* code */ },
 		   [1] = { .scnprintf = SCA_PTR, /* arg2 */ }, }, },
 	{ .name	    = "bind",
-	  .bpf_prog_name = { .sys_enter = "!syscalls:sys_enter_connect", },
 	  .arg = { [0] = { .scnprintf = SCA_INT, /* fd */ },
 		   [1] = { .scnprintf = SCA_SOCKADDR, /* umyaddr */ },
 		   [2] = { .scnprintf = SCA_INT, /* addrlen */ }, }, },
@@ -879,7 +878,6 @@ static struct syscall_fmt {
 	  .arg = { [0] = { .scnprintf = SCA_FDAT, /* olddirfd */ },
 		   [2] = { .scnprintf = SCA_FDAT, /* newdirfd */ }, }, },
 	{ .name	    = "renameat2",
-	  .bpf_prog_name = { .sys_enter = "!syscalls:sys_enter_renameat", },
 	  .arg = { [0] = { .scnprintf = SCA_FDAT, /* olddirfd */ },
 		   [2] = { .scnprintf = SCA_FDAT, /* newdirfd */ },
 		   [4] = { .scnprintf = SCA_RENAMEAT2_FLAGS, /* flags */ }, }, },
@@ -2910,6 +2908,94 @@ static int trace__init_syscalls_bpf_map(struct trace *trace)
 	return __trace__init_syscalls_bpf_map(trace, enabled);
 }
 
+static struct bpf_program *trace__find_usable_bpf_prog_entry(struct trace *trace, struct syscall *sc)
+{
+	struct tep_format_field *field, *candidate_field;
+	int id;
+
+	/*
+	 * We're only interested in syscalls that have a pointer:
+	 */
+	for (field = sc->args; field; field = field->next) {
+		if (field->flags & TEP_FIELD_IS_POINTER)
+			goto try_to_find_pair;
+	}
+
+	return NULL;
+
+try_to_find_pair:
+	for (id = 0; id < trace->sctbl->syscalls.nr_entries; ++id) {
+		struct syscall *pair = trace__syscall_info(trace, NULL, id);
+		struct bpf_program *pair_prog;
+		bool is_candidate = false;
+
+		if (pair == NULL || pair == sc ||
+		    pair->bpf_prog.sys_enter == trace->syscalls.unaugmented_prog)
+			continue;
+
+		for (field = sc->args, candidate_field = pair->args;
+		     field && candidate_field; field = field->next, candidate_field = candidate_field->next) {
+			bool is_pointer = field->flags & TEP_FIELD_IS_POINTER,
+			     candidate_is_pointer = candidate_field->flags & TEP_FIELD_IS_POINTER;
+
+			if (is_pointer) {
+			       if (!candidate_is_pointer) {
+					// The candidate just doesn't copies our pointer arg, might copy other pointers we want.
+					continue;
+			       }
+			} else {
+				if (candidate_is_pointer) {
+					// The candidate might copy a pointer we don't have, skip it.
+					goto next_candidate;
+				}
+				continue;
+			}
+
+			if (strcmp(field->type, candidate_field->type))
+				goto next_candidate;
+
+			is_candidate = true;
+		}
+
+		if (!is_candidate)
+			goto next_candidate;
+
+		/*
+		 * Check if the tentative pair syscall augmenter has more pointers, if it has,
+		 * then it may be collecting that and we then can't use it, as it would collect
+		 * more than what is common to the two syscalls.
+		 */
+		if (candidate_field) {
+			for (candidate_field = candidate_field->next; candidate_field; candidate_field = candidate_field->next)
+				if (candidate_field->flags & TEP_FIELD_IS_POINTER)
+					goto next_candidate;
+		}
+
+		pair_prog = pair->bpf_prog.sys_enter;
+		/*
+		 * If the pair isn't enabled, then its bpf_prog.sys_enter will not
+		 * have been searched for, so search it here and if it returns the
+		 * unaugmented one, then ignore it, otherwise we'll reuse that BPF
+		 * program for a filtered syscall on a non-filtered one.
+		 *
+		 * For instance, we have "!syscalls:sys_enter_renameat" and that is
+		 * useful for "renameat2".
+		 */
+		if (pair_prog == NULL) {
+			pair_prog = trace__find_syscall_bpf_prog(trace, pair, pair->fmt ? pair->fmt->bpf_prog_name.sys_enter : NULL, "enter");
+			if (pair_prog == trace->syscalls.unaugmented_prog)
+				goto next_candidate;
+		}
+
+		pr_debug("Reusing \"%s\" BPF sys_enter augmenter for \"%s\"\n", pair->name, sc->name);
+		return pair_prog;
+	next_candidate:
+		continue;
+	}
+
+	return NULL;
+}
+
 static int trace__init_syscalls_bpf_prog_array_maps(struct trace *trace)
 {
 	int map_enter_fd = bpf_map__fd(trace->syscalls.prog_array.sys_enter),
@@ -2935,6 +3021,70 @@ static int trace__init_syscalls_bpf_prog_array_maps(struct trace *trace)
 			break;
 	}
 
+	/*
+	 * Now lets do a second pass looking for enabled syscalls without
+	 * an augmenter that have a signature that is a superset of another
+	 * syscall with an augmenter so that we can auto-reuse it.
+	 *
+	 * I.e. if we have an augmenter for the "open" syscall that has
+	 * this signature:
+	 *
+	 *   int open(const char *pathname, int flags, mode_t mode);
+	 *
+	 * I.e. that will collect just the first string argument, then we
+	 * can reuse it for the 'creat' syscall, that has this signature:
+	 *
+	 *   int creat(const char *pathname, mode_t mode);
+	 *
+	 * and for:
+	 *
+	 *   int stat(const char *pathname, struct stat *statbuf);
+	 *   int lstat(const char *pathname, struct stat *statbuf);
+	 *
+	 * Because the 'open' augmenter will collect the first arg as a string,
+	 * and leave alone all the other args, which already helps with
+	 * beautifying 'stat' and 'lstat''s pathname arg.
+	 *
+	 * Then, in time, when 'stat' gets an augmenter that collects both
+	 * first and second arg (this one on the raw_syscalls:sys_exit prog
+	 * array tail call, then that one will be used.
+	 */
+	for (key = 0; key < trace->sctbl->syscalls.nr_entries; ++key) {
+		struct syscall *sc = trace__syscall_info(trace, NULL, key);
+		struct bpf_program *pair_prog;
+		int prog_fd;
+
+		if (sc == NULL || sc->bpf_prog.sys_enter == NULL)
+			continue;
+
+		/*
+		 * For now we're just reusing the sys_enter prog, and if it
+		 * already has an augmenter, we don't need to find one.
+		 */
+		if (sc->bpf_prog.sys_enter != trace->syscalls.unaugmented_prog)
+			continue;
+
+		/*
+		 * Look at all the other syscalls for one that has a signature
+		 * that is close enough that we can share:
+		 */
+		pair_prog = trace__find_usable_bpf_prog_entry(trace, sc);
+		if (pair_prog == NULL)
+			continue;
+
+		sc->bpf_prog.sys_enter = pair_prog;
+
+		/*
+		 * Update the BPF_MAP_TYPE_PROG_SHARED for raw_syscalls:sys_enter
+		 * with the fd for the program we're reusing:
+		 */
+		prog_fd = bpf_program__fd(sc->bpf_prog.sys_enter);
+		err = bpf_map_update_elem(map_enter_fd, &key, &prog_fd, BPF_ANY);
+		if (err)
+			break;
+	}
+
+
 	return err;
 }
 #else
-- 
2.21.0


^ permalink raw reply related	[flat|nested] 38+ messages in thread

* [PATCH 34/37] perf trace: Add "sendfile64" alias to the "sendfile" syscall
  2019-07-22 17:38 [GIT PULL] perf/core improvements and fixes Arnaldo Carvalho de Melo
                   ` (32 preceding siblings ...)
  2019-07-22 17:38 ` [PATCH 33/37] perf trace: Reuse BPF augmenters from syscalls with similar args signature Arnaldo Carvalho de Melo
@ 2019-07-22 17:38 ` Arnaldo Carvalho de Melo
  2019-07-22 17:38 ` [PATCH 35/37] perf probe: Set pev->nargs to zero after freeing pev->args entries Arnaldo Carvalho de Melo
                   ` (2 subsequent siblings)
  36 siblings, 0 replies; 38+ messages in thread
From: Arnaldo Carvalho de Melo @ 2019-07-22 17:38 UTC (permalink / raw)
  To: Ingo Molnar, Thomas Gleixner
  Cc: Jiri Olsa, Namhyung Kim, Clark Williams, linux-kernel,
	linux-perf-users, Arnaldo Carvalho de Melo, Adrian Hunter,
	Luis Cláudio Gonçalves

From: Arnaldo Carvalho de Melo <acme@redhat.com>

We were looking in tracefs for:

  /sys/kernel/debug/tracing/events/syscalls/sys_enter_sendfile/format when

what is there is just

  /sys/kernel/debug/tracing/events/syscalls/sys_enter_sendfile/format

Its the same id, 40 in x86_64, so just add an alias and let the existing
logic take care of that.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Luis Cláudio Gonçalves <lclaudio@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lkml.kernel.org/n/tip-km2hmg7hru6u4pawi5fi903q@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/builtin-trace.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/tools/perf/builtin-trace.c b/tools/perf/builtin-trace.c
index 200fbe33d5de..ca28c48f812e 100644
--- a/tools/perf/builtin-trace.c
+++ b/tools/perf/builtin-trace.c
@@ -895,6 +895,7 @@ static struct syscall_fmt {
 	  .arg = { [0] = { .scnprintf = SCA_SECCOMP_OP,	   /* op */ },
 		   [1] = { .scnprintf = SCA_SECCOMP_FLAGS, /* flags */ }, }, },
 	{ .name	    = "select", .timeout = true, },
+	{ .name	    = "sendfile", .alias = "sendfile64", },
 	{ .name	    = "sendmmsg",
 	  .arg = { [3] = { .scnprintf = SCA_MSG_FLAGS, /* flags */ }, }, },
 	{ .name	    = "sendmsg",
-- 
2.21.0


^ permalink raw reply related	[flat|nested] 38+ messages in thread

* [PATCH 35/37] perf probe: Set pev->nargs to zero after freeing pev->args entries
  2019-07-22 17:38 [GIT PULL] perf/core improvements and fixes Arnaldo Carvalho de Melo
                   ` (33 preceding siblings ...)
  2019-07-22 17:38 ` [PATCH 34/37] perf trace: Add "sendfile64" alias to the "sendfile" syscall Arnaldo Carvalho de Melo
@ 2019-07-22 17:38 ` Arnaldo Carvalho de Melo
  2019-07-22 17:38 ` [PATCH 36/37] perf probe: Avoid calling freeing routine multiple times for same pointer Arnaldo Carvalho de Melo
  2019-07-22 17:38 ` [PATCH 37/37] perf build: Do not use -Wshadow on gcc < 4.8 Arnaldo Carvalho de Melo
  36 siblings, 0 replies; 38+ messages in thread
From: Arnaldo Carvalho de Melo @ 2019-07-22 17:38 UTC (permalink / raw)
  To: Ingo Molnar, Thomas Gleixner
  Cc: Jiri Olsa, Namhyung Kim, Clark Williams, linux-kernel,
	linux-perf-users, Arnaldo Carvalho de Melo, Adrian Hunter,
	Masami Hiramatsu

From: Arnaldo Carvalho de Melo <acme@redhat.com>

So that, when perf_add_probe_events() fails, like in:

  # perf probe icmp_rcv:64 "type=icmph->type"
  Failed to find 'icmph' in this function.
    Error: Failed to add events.
  Segmentation fault (core dumped)
  #

We don't segfault.

clear_perf_probe_event() was zeroing the whole pev, and since the switch
to zfree() for the members in the pev, that memset() was removed, which
left nargs with its original value, in the above case 1.

With the memset the same pev could be passed to clear_perf_probe_event()
multiple times, since all it would have would be zeroes, and free()
accepts zero, the loop would not happen and we would just memset it
again to zeroes.

Without it we got that segfault, so zero nargs to keep it like it was,
next cset will avoid calling clear_perf_probe_event() for the same pevs
in case of failure.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Fixes: d8f9da240495 ("perf tools: Use zfree() where applicable")
Link: https://lkml.kernel.org/n/tip-802f2jypnwqsvyavvivs8464@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/util/probe-event.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/tools/perf/util/probe-event.c b/tools/perf/util/probe-event.c
index 0c3b55d0617d..4acd3457d39d 100644
--- a/tools/perf/util/probe-event.c
+++ b/tools/perf/util/probe-event.c
@@ -2219,6 +2219,7 @@ void clear_perf_probe_event(struct perf_probe_event *pev)
 			field = next;
 		}
 	}
+	pev->nargs = 0;
 	zfree(&pev->args);
 }
 
-- 
2.21.0


^ permalink raw reply related	[flat|nested] 38+ messages in thread

* [PATCH 36/37] perf probe: Avoid calling freeing routine multiple times for same pointer
  2019-07-22 17:38 [GIT PULL] perf/core improvements and fixes Arnaldo Carvalho de Melo
                   ` (34 preceding siblings ...)
  2019-07-22 17:38 ` [PATCH 35/37] perf probe: Set pev->nargs to zero after freeing pev->args entries Arnaldo Carvalho de Melo
@ 2019-07-22 17:38 ` Arnaldo Carvalho de Melo
  2019-07-22 17:38 ` [PATCH 37/37] perf build: Do not use -Wshadow on gcc < 4.8 Arnaldo Carvalho de Melo
  36 siblings, 0 replies; 38+ messages in thread
From: Arnaldo Carvalho de Melo @ 2019-07-22 17:38 UTC (permalink / raw)
  To: Ingo Molnar, Thomas Gleixner
  Cc: Jiri Olsa, Namhyung Kim, Clark Williams, linux-kernel,
	linux-perf-users, Arnaldo Carvalho de Melo, Adrian Hunter,
	Masami Hiramatsu

From: Arnaldo Carvalho de Melo <acme@redhat.com>

When perf_add_probe_events() we call cleanup_perf_probe_events() for the
pev pointer it receives, then, as part of handling this failure the main
'perf probe' goes on and calls cleanup_params() and that will again call
cleanup_perf_probe_events()for the same pointer, so just set nevents to
zero when handling the failure of perf_add_probe_events() to avoid the
double free.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lkml.kernel.org/n/tip-x8qgma4g813z96dvtw9w219q@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/builtin-probe.c | 10 ++++++++++
 1 file changed, 10 insertions(+)

diff --git a/tools/perf/builtin-probe.c b/tools/perf/builtin-probe.c
index 6418782951a4..3d0ffd41fb55 100644
--- a/tools/perf/builtin-probe.c
+++ b/tools/perf/builtin-probe.c
@@ -698,6 +698,16 @@ __cmd_probe(int argc, const char **argv)
 
 		ret = perf_add_probe_events(params.events, params.nevents);
 		if (ret < 0) {
+
+			/*
+			 * When perf_add_probe_events() fails it calls
+			 * cleanup_perf_probe_events(pevs, npevs), i.e.
+			 * cleanup_perf_probe_events(params.events, params.nevents), which
+			 * will call clear_perf_probe_event(), so set nevents to zero
+			 * to avoid cleanup_params() to call clear_perf_probe_event() again
+			 * on the same pevs.
+			 */
+			params.nevents = 0;
 			pr_err_with_code("  Error: Failed to add events.", ret);
 			return ret;
 		}
-- 
2.21.0


^ permalink raw reply related	[flat|nested] 38+ messages in thread

* [PATCH 37/37] perf build: Do not use -Wshadow on gcc < 4.8
  2019-07-22 17:38 [GIT PULL] perf/core improvements and fixes Arnaldo Carvalho de Melo
                   ` (35 preceding siblings ...)
  2019-07-22 17:38 ` [PATCH 36/37] perf probe: Avoid calling freeing routine multiple times for same pointer Arnaldo Carvalho de Melo
@ 2019-07-22 17:38 ` Arnaldo Carvalho de Melo
  36 siblings, 0 replies; 38+ messages in thread
From: Arnaldo Carvalho de Melo @ 2019-07-22 17:38 UTC (permalink / raw)
  To: Ingo Molnar, Thomas Gleixner
  Cc: Jiri Olsa, Namhyung Kim, Clark Williams, linux-kernel,
	linux-perf-users, Arnaldo Carvalho de Melo, Andrii Nakryiko,
	Adrian Hunter

From: Arnaldo Carvalho de Melo <acme@redhat.com>

As it is too strict, see https://lkml.org/lkml/2006/11/28/253 and
https://gcc.gnu.org/gcc-4.8/changes.html, that takes into account
Linus's comments (search for Wshadow) for the reasoning about -Wshadow
not being interesting before gcc 4.8.

Acked-by: Andrii Nakryiko <andrii.nakryiko@gmail.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lkml.kernel.org/r/20190719183417.GQ3624@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/scripts/Makefile.include | 9 ++++++++-
 1 file changed, 8 insertions(+), 1 deletion(-)

diff --git a/tools/scripts/Makefile.include b/tools/scripts/Makefile.include
index 495066bafbe3..ded7a950dc40 100644
--- a/tools/scripts/Makefile.include
+++ b/tools/scripts/Makefile.include
@@ -32,7 +32,6 @@ EXTRA_WARNINGS += -Wno-system-headers
 EXTRA_WARNINGS += -Wold-style-definition
 EXTRA_WARNINGS += -Wpacked
 EXTRA_WARNINGS += -Wredundant-decls
-EXTRA_WARNINGS += -Wshadow
 EXTRA_WARNINGS += -Wstrict-prototypes
 EXTRA_WARNINGS += -Wswitch-default
 EXTRA_WARNINGS += -Wswitch-enum
@@ -69,8 +68,16 @@ endif
 # will do for now and keep the above -Wstrict-aliasing=3 in place
 # in newer systems.
 # Needed for the __raw_cmpxchg in tools/arch/x86/include/asm/cmpxchg.h
+#
+# See https://lkml.org/lkml/2006/11/28/253 and https://gcc.gnu.org/gcc-4.8/changes.html,
+# that takes into account Linus's comments (search for Wshadow) for the reasoning about
+# -Wshadow not being interesting before gcc 4.8.
+
 ifneq ($(filter 3.%,$(MAKE_VERSION)),)  # make-3
 EXTRA_WARNINGS += -fno-strict-aliasing
+EXTRA_WARNINGS += -Wno-shadow
+else
+EXTRA_WARNINGS += -Wshadow
 endif
 
 ifneq ($(findstring $(MAKEFLAGS), w),w)
-- 
2.21.0


^ permalink raw reply related	[flat|nested] 38+ messages in thread

end of thread, other threads:[~2019-07-22 17:43 UTC | newest]

Thread overview: 38+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-07-22 17:38 [GIT PULL] perf/core improvements and fixes Arnaldo Carvalho de Melo
2019-07-22 17:38 ` [PATCH 01/37] perf include bpf: Add bpf_tail_call() prototype Arnaldo Carvalho de Melo
2019-07-22 17:38 ` [PATCH 02/37] perf bpf: Do not attach a BPF prog to a tracepoint if its name starts with ! Arnaldo Carvalho de Melo
2019-07-22 17:38 ` [PATCH 03/37] perf evsel: Store backpointer to attached bpf_object Arnaldo Carvalho de Melo
2019-07-22 17:38 ` [PATCH 04/37] perf trace: Add pointer to BPF object containing __augmented_syscalls__ Arnaldo Carvalho de Melo
2019-07-22 17:38 ` [PATCH 05/37] perf trace: Look up maps just on the __augmented_syscalls__ BPF object Arnaldo Carvalho de Melo
2019-07-22 17:38 ` [PATCH 06/37] perf trace: Order -e syscalls table Arnaldo Carvalho de Melo
2019-07-22 17:38 ` [PATCH 07/37] perf trace: Add BPF handler for unaugmented syscalls Arnaldo Carvalho de Melo
2019-07-22 17:38 ` [PATCH 08/37] perf trace: Allow specifying the bpf prog to augment specific syscalls Arnaldo Carvalho de Melo
2019-07-22 17:38 ` [PATCH 09/37] perf trace: Put the per-syscall entry/exit prog_array BPF map infrastructure in place Arnaldo Carvalho de Melo
2019-07-22 17:38 ` [PATCH 10/37] perf trace: Handle raw_syscalls:sys_enter just like the BPF_OUTPUT augmented event Arnaldo Carvalho de Melo
2019-07-22 17:38 ` [PATCH 11/37] perf augmented_raw_syscalls: Add handler for "openat" Arnaldo Carvalho de Melo
2019-07-22 17:38 ` [PATCH 12/37] perf augmented_raw_syscalls: Switch to using BPF_MAP_TYPE_PROG_ARRAY Arnaldo Carvalho de Melo
2019-07-22 17:38 ` [PATCH 13/37] perf augmented_raw_syscalls: Support copying two string syscall args Arnaldo Carvalho de Melo
2019-07-22 17:38 ` [PATCH 14/37] perf trace: Look for default name for entries in the syscalls prog array Arnaldo Carvalho de Melo
2019-07-22 17:38 ` [PATCH 15/37] perf augmented_raw_syscalls: Rename augmented_args_filename to augmented_args_payload Arnaldo Carvalho de Melo
2019-07-22 17:38 ` [PATCH 16/37] perf script: Fix --max-blocks man page description Arnaldo Carvalho de Melo
2019-07-22 17:38 ` [PATCH 17/37] perf script: Improve man page description of metrics Arnaldo Carvalho de Melo
2019-07-22 17:38 ` [PATCH 18/37] perf script: Fix off by one in brstackinsn IPC computation Arnaldo Carvalho de Melo
2019-07-22 17:38 ` [PATCH 19/37] perf tools: Fix proper buffer size for feature processing Arnaldo Carvalho de Melo
2019-07-22 17:38 ` [PATCH 20/37] perf stat: Fix segfault for event group in repeat mode Arnaldo Carvalho de Melo
2019-07-22 17:38 ` [PATCH 21/37] perf augmented_raw_syscalls: Augment sockaddr arg in 'connect' Arnaldo Carvalho de Melo
2019-07-22 17:38 ` [PATCH 22/37] perf trace beauty: Make connect's addrlen be printed as an int, not hex Arnaldo Carvalho de Melo
2019-07-22 17:38 ` [PATCH 23/37] perf trace beauty: Disable fd->pathname when close() not enabled Arnaldo Carvalho de Melo
2019-07-22 17:38 ` [PATCH 24/37] perf trace beauty: Do not try to use the fd->pathname beautifier for bind/connect fd arg Arnaldo Carvalho de Melo
2019-07-22 17:38 ` [PATCH 25/37] perf trace beauty: Beautify 'sendto's sockaddr arg Arnaldo Carvalho de Melo
2019-07-22 17:38 ` [PATCH 26/37] perf trace beauty: Beautify bind's " Arnaldo Carvalho de Melo
2019-07-22 17:38 ` [PATCH 27/37] perf stat: Always separate stalled cycles per insn Arnaldo Carvalho de Melo
2019-07-22 17:38 ` [PATCH 28/37] perf session: Fix loading of compressed data split across adjacent records Arnaldo Carvalho de Melo
2019-07-22 17:38 ` [PATCH 29/37] perf trace beauty: Add BPF augmenter for the 'rename' syscall Arnaldo Carvalho de Melo
2019-07-22 17:38 ` [PATCH 30/37] perf trace: Forward error codes when trying to read syscall info Arnaldo Carvalho de Melo
2019-07-22 17:38 ` [PATCH 31/37] perf trace: Mark syscall ids that are not allocated to avoid unnecessary error messages Arnaldo Carvalho de Melo
2019-07-22 17:38 ` [PATCH 32/37] perf trace: Preallocate the syscall table Arnaldo Carvalho de Melo
2019-07-22 17:38 ` [PATCH 33/37] perf trace: Reuse BPF augmenters from syscalls with similar args signature Arnaldo Carvalho de Melo
2019-07-22 17:38 ` [PATCH 34/37] perf trace: Add "sendfile64" alias to the "sendfile" syscall Arnaldo Carvalho de Melo
2019-07-22 17:38 ` [PATCH 35/37] perf probe: Set pev->nargs to zero after freeing pev->args entries Arnaldo Carvalho de Melo
2019-07-22 17:38 ` [PATCH 36/37] perf probe: Avoid calling freeing routine multiple times for same pointer Arnaldo Carvalho de Melo
2019-07-22 17:38 ` [PATCH 37/37] perf build: Do not use -Wshadow on gcc < 4.8 Arnaldo Carvalho de Melo

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).