linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 0/8] perf tools: Assorted fixes
@ 2019-03-05 15:25 Jiri Olsa
  2019-03-05 15:25 ` [PATCH 1/8] perf c2c: Fix c2c report for empty numa node Jiri Olsa
                   ` (7 more replies)
  0 siblings, 8 replies; 30+ messages in thread
From: Jiri Olsa @ 2019-03-05 15:25 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: lkml, Ingo Molnar, Namhyung Kim, Alexander Shishkin,
	Peter Zijlstra, Jonas Rabenstein, Nageswara R Sastry,
	Ravi Bangoria, Andi Kleen

hi,
sending assorted general fixes that queued
up in my other branches.

It contains all my previous posted single patches,
(except for the dir data patchset).

Also available in here:
  https://git.kernel.org/pub/scm/linux/kernel/git/jolsa/perf.git
  perf/fixes

thanks,
jirka


---
Jiri Olsa (8):
      perf c2c: Fix c2c report for empty numa node
      perf tools: Add error path into hist_entry__init
      perf hist: Fix memory leak of srcline
      perf tools: Read and store caps/max_precise in perf_pmu
      perf tools: Get precise_ip from the pmu config
      perf tools: Probe for precise_ip with simple attr
      perf tools: Fix double free in perf_data__close
      perf tools: Force perf_data__open|close zero data->file.path

 tools/perf/builtin-c2c.c  |  8 ++++++--
 tools/perf/util/data.c    |  4 ++--
 tools/perf/util/evlist.c  | 25 ++++++++++++++++++++-----
 tools/perf/util/evsel.c   | 36 +++++++++++++++++++++++++++---------
 tools/perf/util/hist.c    | 51 ++++++++++++++++++++++++++++++---------------------
 tools/perf/util/pmu.c     | 14 ++++++++++++++
 tools/perf/util/pmu.h     |  1 +
 tools/perf/util/session.c |  4 +---
 tools/perf/util/setup.py  |  2 +-
 9 files changed, 102 insertions(+), 43 deletions(-)

^ permalink raw reply	[flat|nested] 30+ messages in thread

* [PATCH 1/8] perf c2c: Fix c2c report for empty numa node
  2019-03-05 15:25 [PATCH 0/8] perf tools: Assorted fixes Jiri Olsa
@ 2019-03-05 15:25 ` Jiri Olsa
  2019-03-09 20:05   ` [tip:perf/urgent] " tip-bot for Jiri Olsa
  2019-03-05 15:25 ` [PATCH 2/8] perf tools: Add error path into hist_entry__init Jiri Olsa
                   ` (6 subsequent siblings)
  7 siblings, 1 reply; 30+ messages in thread
From: Jiri Olsa @ 2019-03-05 15:25 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: lkml, Ingo Molnar, Namhyung Kim, Alexander Shishkin,
	Peter Zijlstra, Jonas Rabenstein, Nageswara R Sastry,
	Ravi Bangoria, Andi Kleen

From: Jiri Olsa <jolsa@redhat.com>

Ravi Bangoria reported that we fail with empty
numa node with following message:

  $ lscpu
  NUMA node0 CPU(s):
  NUMA node1 CPU(s):   0-4

  $ sudo ./perf c2c report
  node/cpu topology bugFailed setup nodes

Fixing this by detecting empty node and keeping
its cpu set empty.

Reported-by: Nageswara R Sastry <nasastry@in.ibm.com>
Tested-by: Ravi Bangoria <ravi.bangoria@linux.ibm.com>
Link: http://lkml.kernel.org/n/tip-dyq5jo6pn1j3yqavb5ukjwwu@git.kernel.org
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
---
 tools/perf/builtin-c2c.c | 8 ++++++--
 1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/tools/perf/builtin-c2c.c b/tools/perf/builtin-c2c.c
index 4272763a5e96..9e6cc868bdb4 100644
--- a/tools/perf/builtin-c2c.c
+++ b/tools/perf/builtin-c2c.c
@@ -2056,6 +2056,12 @@ static int setup_nodes(struct perf_session *session)
 		if (!set)
 			return -ENOMEM;
 
+		nodes[node] = set;
+
+		/* empty node, skip */
+		if (cpu_map__empty(map))
+			continue;
+
 		for (cpu = 0; cpu < map->nr; cpu++) {
 			set_bit(map->map[cpu], set);
 
@@ -2064,8 +2070,6 @@ static int setup_nodes(struct perf_session *session)
 
 			cpu2node[map->map[cpu]] = node;
 		}
-
-		nodes[node] = set;
 	}
 
 	setup_nodes_header();
-- 
2.17.2


^ permalink raw reply related	[flat|nested] 30+ messages in thread

* [PATCH 2/8] perf tools: Add error path into hist_entry__init
  2019-03-05 15:25 [PATCH 0/8] perf tools: Assorted fixes Jiri Olsa
  2019-03-05 15:25 ` [PATCH 1/8] perf c2c: Fix c2c report for empty numa node Jiri Olsa
@ 2019-03-05 15:25 ` Jiri Olsa
  2019-03-09 20:06   ` [tip:perf/urgent] perf hist: " tip-bot for Jiri Olsa
  2019-03-05 15:25 ` [PATCH 3/8] perf hist: Fix memory leak of srcline Jiri Olsa
                   ` (5 subsequent siblings)
  7 siblings, 1 reply; 30+ messages in thread
From: Jiri Olsa @ 2019-03-05 15:25 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: lkml, Ingo Molnar, Namhyung Kim, Alexander Shishkin,
	Peter Zijlstra, Jonas Rabenstein, Nageswara R Sastry,
	Ravi Bangoria, Andi Kleen

Adding error path into hist_entry__init to unify error
handling, so every new member does not need to free
everything else.

Link: http://lkml.kernel.org/n/tip-v0ssws4hsr0tozb7lm0k5f70@git.kernel.org
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
---
 tools/perf/util/hist.c | 39 +++++++++++++++++++--------------------
 1 file changed, 19 insertions(+), 20 deletions(-)

diff --git a/tools/perf/util/hist.c b/tools/perf/util/hist.c
index 669f961316f0..74e307d17c49 100644
--- a/tools/perf/util/hist.c
+++ b/tools/perf/util/hist.c
@@ -396,11 +396,8 @@ static int hist_entry__init(struct hist_entry *he,
 		 * adding new entries.  So we need to save a copy.
 		 */
 		he->branch_info = malloc(sizeof(*he->branch_info));
-		if (he->branch_info == NULL) {
-			map__zput(he->ms.map);
-			free(he->stat_acc);
-			return -ENOMEM;
-		}
+		if (he->branch_info == NULL)
+			goto err;
 
 		memcpy(he->branch_info, template->branch_info,
 		       sizeof(*he->branch_info));
@@ -419,21 +416,8 @@ static int hist_entry__init(struct hist_entry *he,
 
 	if (he->raw_data) {
 		he->raw_data = memdup(he->raw_data, he->raw_size);
-
-		if (he->raw_data == NULL) {
-			map__put(he->ms.map);
-			if (he->branch_info) {
-				map__put(he->branch_info->from.map);
-				map__put(he->branch_info->to.map);
-				free(he->branch_info);
-			}
-			if (he->mem_info) {
-				map__put(he->mem_info->iaddr.map);
-				map__put(he->mem_info->daddr.map);
-			}
-			free(he->stat_acc);
-			return -ENOMEM;
-		}
+		if (he->raw_data == NULL)
+			goto err_infos;
 	}
 	INIT_LIST_HEAD(&he->pairs.node);
 	thread__get(he->thread);
@@ -444,6 +428,21 @@ static int hist_entry__init(struct hist_entry *he,
 		he->leaf = true;
 
 	return 0;
+
+err_infos:
+	if (he->branch_info) {
+		map__put(he->branch_info->from.map);
+		map__put(he->branch_info->to.map);
+		free(he->branch_info);
+	}
+	if (he->mem_info) {
+		map__put(he->mem_info->iaddr.map);
+		map__put(he->mem_info->daddr.map);
+	}
+err:
+	map__zput(he->ms.map);
+	free(he->stat_acc);
+	return -ENOMEM;
 }
 
 static void *hist_entry__zalloc(size_t size)
-- 
2.17.2


^ permalink raw reply related	[flat|nested] 30+ messages in thread

* [PATCH 3/8] perf hist: Fix memory leak of srcline
  2019-03-05 15:25 [PATCH 0/8] perf tools: Assorted fixes Jiri Olsa
  2019-03-05 15:25 ` [PATCH 1/8] perf c2c: Fix c2c report for empty numa node Jiri Olsa
  2019-03-05 15:25 ` [PATCH 2/8] perf tools: Add error path into hist_entry__init Jiri Olsa
@ 2019-03-05 15:25 ` Jiri Olsa
  2019-03-09 20:07   ` [tip:perf/urgent] " tip-bot for Jiri Olsa
  2019-03-05 15:25 ` [PATCH 4/8] perf tools: Read and store caps/max_precise in perf_pmu Jiri Olsa
                   ` (4 subsequent siblings)
  7 siblings, 1 reply; 30+ messages in thread
From: Jiri Olsa @ 2019-03-05 15:25 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: lkml, Ingo Molnar, Namhyung Kim, Alexander Shishkin,
	Peter Zijlstra, Jonas Rabenstein, Nageswara R Sastry,
	Ravi Bangoria, Andi Kleen

From: Jiri Olsa <jolsa@redhat.com>

We can't allocate he->srcline unconditionaly, only when
new hist_entry is created. Moving he->srcline allocation
into hist_entry__init function.

Original-patch-by: Jonas Rabenstein <jonas.rabenstein@studium.uni-erlangen.de>
Suggested-by: Namhyung Kim <namhyung@kernel.org>
Link: http://lkml.kernel.org/n/tip-2ijaiiwlzmk4cae3zs97pq0x@git.kernel.org
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
---
 tools/perf/util/hist.c | 14 ++++++++++++--
 1 file changed, 12 insertions(+), 2 deletions(-)

diff --git a/tools/perf/util/hist.c b/tools/perf/util/hist.c
index 74e307d17c49..f9eb95bf3938 100644
--- a/tools/perf/util/hist.c
+++ b/tools/perf/util/hist.c
@@ -419,6 +419,13 @@ static int hist_entry__init(struct hist_entry *he,
 		if (he->raw_data == NULL)
 			goto err_infos;
 	}
+
+	if (he->srcline) {
+		he->srcline = strdup(he->srcline);
+		if (he->srcline == NULL)
+			goto err_rawdata;
+	}
+
 	INIT_LIST_HEAD(&he->pairs.node);
 	thread__get(he->thread);
 	he->hroot_in  = RB_ROOT_CACHED;
@@ -429,6 +436,9 @@ static int hist_entry__init(struct hist_entry *he,
 
 	return 0;
 
+err_rawdata:
+	free(he->raw_data);
+
 err_infos:
 	if (he->branch_info) {
 		map__put(he->branch_info->from.map);
@@ -605,7 +615,7 @@ __hists__add_entry(struct hists *hists,
 			.map	= al->map,
 			.sym	= al->sym,
 		},
-		.srcline = al->srcline ? strdup(al->srcline) : NULL,
+		.srcline = (char *) al->srcline,
 		.socket	 = al->socket,
 		.cpu	 = al->cpu,
 		.cpumode = al->cpumode,
@@ -962,7 +972,7 @@ iter_add_next_cumulative_entry(struct hist_entry_iter *iter,
 			.map = al->map,
 			.sym = al->sym,
 		},
-		.srcline = al->srcline ? strdup(al->srcline) : NULL,
+		.srcline = (char *) al->srcline,
 		.parent = iter->parent,
 		.raw_data = sample->raw_data,
 		.raw_size = sample->raw_size,
-- 
2.17.2


^ permalink raw reply related	[flat|nested] 30+ messages in thread

* [PATCH 4/8] perf tools: Read and store caps/max_precise in perf_pmu
  2019-03-05 15:25 [PATCH 0/8] perf tools: Assorted fixes Jiri Olsa
                   ` (2 preceding siblings ...)
  2019-03-05 15:25 ` [PATCH 3/8] perf hist: Fix memory leak of srcline Jiri Olsa
@ 2019-03-05 15:25 ` Jiri Olsa
  2019-03-09 20:07   ` [tip:perf/urgent] " tip-bot for Jiri Olsa
  2019-03-05 15:25 ` [PATCH 5/8] perf tools: Get precise_ip from the pmu config Jiri Olsa
                   ` (3 subsequent siblings)
  7 siblings, 1 reply; 30+ messages in thread
From: Jiri Olsa @ 2019-03-05 15:25 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: lkml, Ingo Molnar, Namhyung Kim, Alexander Shishkin,
	Peter Zijlstra, Jonas Rabenstein, Nageswara R Sastry,
	Ravi Bangoria, Andi Kleen

Read the caps/max_precise value and store it in struct
perf_pmu to be used when setting the maximum precise_ip
field in following patch.

Link: http://lkml.kernel.org/n/tip-y47h37fzs5t8hy9t0j5g9pyj@git.kernel.org
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
---
 tools/perf/util/pmu.c | 14 ++++++++++++++
 tools/perf/util/pmu.h |  1 +
 2 files changed, 15 insertions(+)

diff --git a/tools/perf/util/pmu.c b/tools/perf/util/pmu.c
index 51d437f55d18..6199a3174ab9 100644
--- a/tools/perf/util/pmu.c
+++ b/tools/perf/util/pmu.c
@@ -752,6 +752,19 @@ perf_pmu__get_default_config(struct perf_pmu *pmu __maybe_unused)
 	return NULL;
 }
 
+static int pmu_max_precise(const char *name)
+{
+	char path[PATH_MAX];
+	int max_precise = -1;
+
+	scnprintf(path, PATH_MAX,
+		 "bus/event_source/devices/%s/caps/max_precise",
+		 name);
+
+	sysfs__read_int(path, &max_precise);
+	return max_precise;
+}
+
 static struct perf_pmu *pmu_lookup(const char *name)
 {
 	struct perf_pmu *pmu;
@@ -784,6 +797,7 @@ static struct perf_pmu *pmu_lookup(const char *name)
 	pmu->name = strdup(name);
 	pmu->type = type;
 	pmu->is_uncore = pmu_is_uncore(name);
+	pmu->max_precise = pmu_max_precise(name);
 	pmu_add_cpu_aliases(&aliases, pmu);
 
 	INIT_LIST_HEAD(&pmu->format);
diff --git a/tools/perf/util/pmu.h b/tools/perf/util/pmu.h
index 47253c3daf55..bd9ec2704a57 100644
--- a/tools/perf/util/pmu.h
+++ b/tools/perf/util/pmu.h
@@ -26,6 +26,7 @@ struct perf_pmu {
 	__u32 type;
 	bool selectable;
 	bool is_uncore;
+	int max_precise;
 	struct perf_event_attr *default_config;
 	struct cpu_map *cpus;
 	struct list_head format;  /* HEAD struct perf_pmu_format -> list */
-- 
2.17.2


^ permalink raw reply related	[flat|nested] 30+ messages in thread

* [PATCH 5/8] perf tools: Get precise_ip from the pmu config
  2019-03-05 15:25 [PATCH 0/8] perf tools: Assorted fixes Jiri Olsa
                   ` (3 preceding siblings ...)
  2019-03-05 15:25 ` [PATCH 4/8] perf tools: Read and store caps/max_precise in perf_pmu Jiri Olsa
@ 2019-03-05 15:25 ` Jiri Olsa
  2019-03-05 16:13   ` Andi Kleen
  2019-03-05 15:25 ` [PATCH 6/8] perf tools: Probe for precise_ip with simple attr Jiri Olsa
                   ` (2 subsequent siblings)
  7 siblings, 1 reply; 30+ messages in thread
From: Jiri Olsa @ 2019-03-05 15:25 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: lkml, Ingo Molnar, Namhyung Kim, Alexander Shishkin,
	Peter Zijlstra, Jonas Rabenstein, Nageswara R Sastry,
	Ravi Bangoria, Andi Kleen

Getting precise_ip field from the perf_pmu::max_precise
config read from sysfs. If it's not available falling
back to current detection function.

This fixes perf c2c usage of 'P' modifier on memory events,
where the standard detection fails, because of way too many
new perf_event_attr features added recently.

Use PYTHON_MOD to distinguish python perf module build
to exclude pmu.o object call being compiled in. There's
the flex/bison (and other) object dependencies, that
we'd need to teach setup.py about.

Link: http://lkml.kernel.org/n/tip-zyegnaq8xlkhh4sgfl4tnvui@git.kernel.org
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
---
 tools/perf/util/evsel.c  | 28 +++++++++++++++++++++++++++-
 tools/perf/util/setup.py |  2 +-
 2 files changed, 28 insertions(+), 2 deletions(-)

diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
index dfe2958e6287..eec542bab815 100644
--- a/tools/perf/util/evsel.c
+++ b/tools/perf/util/evsel.c
@@ -870,6 +870,32 @@ static bool is_dummy_event(struct perf_evsel *evsel)
 	       (evsel->attr.config == PERF_COUNT_SW_DUMMY);
 }
 
+#ifndef PYTHON_MOD
+static int pmu_get_precise(const char *pmu_name)
+{
+	struct perf_pmu *pmu = pmu_name ? perf_pmu__find(pmu_name) : NULL;
+
+	/* If unsupported pmu->max_precise is -1. */
+	return pmu ? pmu->max_precise : -1;
+}
+#else
+static int pmu_get_precise(const char *pmu_name __maybe_unused)
+{
+	return 0;
+}
+#endif
+
+static void perf_evsel__set_max_precise_ip(struct perf_evsel *evsel)
+{
+	int max_precise = pmu_get_precise(evsel->pmu_name);
+	struct perf_event_attr *attr = &evsel->attr;
+
+	if (max_precise >= 0)
+		attr->precise_ip = max_precise;
+	else
+		perf_event_attr__set_max_precise_ip(attr);
+}
+
 /*
  * The enable_on_exec/disabled value strategy:
  *
@@ -1091,7 +1117,7 @@ void perf_evsel__config(struct perf_evsel *evsel, struct record_opts *opts,
 	}
 
 	if (evsel->precise_max)
-		perf_event_attr__set_max_precise_ip(attr);
+		perf_evsel__set_max_precise_ip(evsel);
 
 	if (opts->all_user) {
 		attr->exclude_kernel = 1;
diff --git a/tools/perf/util/setup.py b/tools/perf/util/setup.py
index 5b5a167b43ce..9eee3467ac5d 100644
--- a/tools/perf/util/setup.py
+++ b/tools/perf/util/setup.py
@@ -37,7 +37,7 @@ class install_lib(_install_lib):
 
 cflags = getenv('CFLAGS', '').split()
 # switch off several checks (need to be at the end of cflags list)
-cflags += ['-fno-strict-aliasing', '-Wno-write-strings', '-Wno-unused-parameter', '-Wno-redundant-decls' ]
+cflags += ['-fno-strict-aliasing', '-Wno-write-strings', '-Wno-unused-parameter', '-Wno-redundant-decls', '-DPYTHON_MOD' ]
 if cc != "clang":
     cflags += ['-Wno-cast-function-type' ]
 
-- 
2.17.2


^ permalink raw reply related	[flat|nested] 30+ messages in thread

* [PATCH 6/8] perf tools: Probe for precise_ip with simple attr
  2019-03-05 15:25 [PATCH 0/8] perf tools: Assorted fixes Jiri Olsa
                   ` (4 preceding siblings ...)
  2019-03-05 15:25 ` [PATCH 5/8] perf tools: Get precise_ip from the pmu config Jiri Olsa
@ 2019-03-05 15:25 ` Jiri Olsa
  2019-03-09 20:08   ` [tip:perf/urgent] perf evsel: " tip-bot for Jiri Olsa
  2019-03-05 15:25 ` [PATCH 7/8] perf tools: Fix double free in perf_data__close Jiri Olsa
  2019-03-05 15:25 ` [PATCH 8/8] perf tools: Force perf_data__open|close zero data->file.path Jiri Olsa
  7 siblings, 1 reply; 30+ messages in thread
From: Jiri Olsa @ 2019-03-05 15:25 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: lkml, Ingo Molnar, Namhyung Kim, Alexander Shishkin,
	Peter Zijlstra, Jonas Rabenstein, Nageswara R Sastry,
	Ravi Bangoria, Andi Kleen

Currently we probe for precise_ip with user specified
perf_event_attr, which might fail because of unsupported
kernel features, which would get disabled during the
open time anyway.

Switching the probe to take place on simple hw cycles,
so the following record sets proper precise_ip:

  # perf record -e cycles:P ls
  # perf evlist -v
  cycles:P: size: 112, ... precise_ip: 3, ...

Link: http://lkml.kernel.org/n/tip-rwncfxifbhnmd89yx0va5zg0@git.kernel.org
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
---
 tools/perf/util/evlist.c | 25 ++++++++++++++++++++-----
 tools/perf/util/evsel.c  |  8 --------
 2 files changed, 20 insertions(+), 13 deletions(-)

diff --git a/tools/perf/util/evlist.c b/tools/perf/util/evlist.c
index 08cedb643ea6..ed20f4379956 100644
--- a/tools/perf/util/evlist.c
+++ b/tools/perf/util/evlist.c
@@ -230,18 +230,33 @@ void perf_evlist__set_leader(struct perf_evlist *evlist)
 	}
 }
 
-void perf_event_attr__set_max_precise_ip(struct perf_event_attr *attr)
+void perf_event_attr__set_max_precise_ip(struct perf_event_attr *pattr)
 {
-	attr->precise_ip = 3;
+	struct perf_event_attr attr = {
+		.type		= PERF_TYPE_HARDWARE,
+		.config		= PERF_COUNT_HW_CPU_CYCLES,
+		.exclude_kernel	= 1,
+		.precise_ip	= 3,
+	};
 
-	while (attr->precise_ip != 0) {
-		int fd = sys_perf_event_open(attr, 0, -1, -1, 0);
+	event_attr_init(&attr);
+
+	/*
+	 * Unnamed union member, not supported as struct member named
+	 * initializer in older compilers such as gcc 4.4.7
+	 */
+	attr.sample_period = 1;
+
+	while (attr.precise_ip != 0) {
+		int fd = sys_perf_event_open(&attr, 0, -1, -1, 0);
 		if (fd != -1) {
 			close(fd);
 			break;
 		}
-		--attr->precise_ip;
+		--attr.precise_ip;
 	}
+
+	pattr->precise_ip = attr.precise_ip;
 }
 
 int __perf_evlist__add_default(struct perf_evlist *evlist, bool precise)
diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
index eec542bab815..2ef229e24b6f 100644
--- a/tools/perf/util/evsel.c
+++ b/tools/perf/util/evsel.c
@@ -294,20 +294,12 @@ struct perf_evsel *perf_evsel__new_cycles(bool precise)
 
 	if (!precise)
 		goto new_event;
-	/*
-	 * Unnamed union member, not supported as struct member named
-	 * initializer in older compilers such as gcc 4.4.7
-	 *
-	 * Just for probing the precise_ip:
-	 */
-	attr.sample_period = 1;
 
 	perf_event_attr__set_max_precise_ip(&attr);
 	/*
 	 * Now let the usual logic to set up the perf_event_attr defaults
 	 * to kick in when we return and before perf_evsel__open() is called.
 	 */
-	attr.sample_period = 0;
 new_event:
 	evsel = perf_evsel__new(&attr);
 	if (evsel == NULL)
-- 
2.17.2


^ permalink raw reply related	[flat|nested] 30+ messages in thread

* [PATCH 7/8] perf tools: Fix double free in perf_data__close
  2019-03-05 15:25 [PATCH 0/8] perf tools: Assorted fixes Jiri Olsa
                   ` (5 preceding siblings ...)
  2019-03-05 15:25 ` [PATCH 6/8] perf tools: Probe for precise_ip with simple attr Jiri Olsa
@ 2019-03-05 15:25 ` Jiri Olsa
  2019-03-09 20:09   ` [tip:perf/urgent] perf session: " tip-bot for Jiri Olsa
  2019-03-05 15:25 ` [PATCH 8/8] perf tools: Force perf_data__open|close zero data->file.path Jiri Olsa
  7 siblings, 1 reply; 30+ messages in thread
From: Jiri Olsa @ 2019-03-05 15:25 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: lkml, Ingo Molnar, Namhyung Kim, Alexander Shishkin,
	Peter Zijlstra, Jonas Rabenstein, Nageswara R Sastry,
	Ravi Bangoria, Andi Kleen

We can't call perf_data__close and subsequently
perf_session__delete, because it will call
perf_data__close again and cause double free
for data->file.path.

  $ perf report -i .
  incompatible file format (rerun with -v to learn more)
  free(): double free detected in tcache 2
  Aborted (core dumped)

In fact we don't need to call perf_data__close
at all, because at the time the got out_close
is reached, session->data is already initialized,
so the perf_data__close call will be triggered
from perf_session__delete.

Fixes: 2d4f27999b88 ("perf data: Add global path holder")
Link: http://lkml.kernel.org/n/tip-2g1zaoxmgmaxbuz8yvopq32v@git.kernel.org
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
---
 tools/perf/util/session.c | 4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)

diff --git a/tools/perf/util/session.c b/tools/perf/util/session.c
index c764bbc91009..db643f3c2b95 100644
--- a/tools/perf/util/session.c
+++ b/tools/perf/util/session.c
@@ -140,7 +140,7 @@ struct perf_session *perf_session__new(struct perf_data *data,
 
 		if (perf_data__is_read(data)) {
 			if (perf_session__open(session) < 0)
-				goto out_close;
+				goto out_delete;
 
 			/*
 			 * set session attributes that are present in perf.data
@@ -181,8 +181,6 @@ struct perf_session *perf_session__new(struct perf_data *data,
 
 	return session;
 
- out_close:
-	perf_data__close(data);
  out_delete:
 	perf_session__delete(session);
  out:
-- 
2.17.2


^ permalink raw reply related	[flat|nested] 30+ messages in thread

* [PATCH 8/8] perf tools: Force perf_data__open|close zero data->file.path
  2019-03-05 15:25 [PATCH 0/8] perf tools: Assorted fixes Jiri Olsa
                   ` (6 preceding siblings ...)
  2019-03-05 15:25 ` [PATCH 7/8] perf tools: Fix double free in perf_data__close Jiri Olsa
@ 2019-03-05 15:25 ` Jiri Olsa
  2019-03-09 20:09   ` [tip:perf/urgent] perf data: " tip-bot for Jiri Olsa
  7 siblings, 1 reply; 30+ messages in thread
From: Jiri Olsa @ 2019-03-05 15:25 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: lkml, Ingo Molnar, Namhyung Kim, Alexander Shishkin,
	Peter Zijlstra, Jonas Rabenstein, Nageswara R Sastry,
	Ravi Bangoria, Andi Kleen

Making sure the data->file.path is zeroed on perf_data__open
error path and in perf_data__close, so we don't double free
it in case someone call it twice.

Link: http://lkml.kernel.org/n/tip-yt0yz1y9h5vdppsj8nkiuh07@git.kernel.org
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
---
 tools/perf/util/data.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/tools/perf/util/data.c b/tools/perf/util/data.c
index 7bd5ddeb7a41..e098e189f93e 100644
--- a/tools/perf/util/data.c
+++ b/tools/perf/util/data.c
@@ -237,7 +237,7 @@ static int open_file(struct perf_data *data)
 	     open_file_read(data) : open_file_write(data);
 
 	if (fd < 0) {
-		free(data->file.path);
+		zfree(&data->file.path);
 		return -1;
 	}
 
@@ -270,7 +270,7 @@ int perf_data__open(struct perf_data *data)
 
 void perf_data__close(struct perf_data *data)
 {
-	free(data->file.path);
+	zfree(&data->file.path);
 	close(data->file.fd);
 }
 
-- 
2.17.2


^ permalink raw reply related	[flat|nested] 30+ messages in thread

* Re: [PATCH 5/8] perf tools: Get precise_ip from the pmu config
  2019-03-05 15:25 ` [PATCH 5/8] perf tools: Get precise_ip from the pmu config Jiri Olsa
@ 2019-03-05 16:13   ` Andi Kleen
  2019-03-05 16:28     ` Jiri Olsa
  0 siblings, 1 reply; 30+ messages in thread
From: Andi Kleen @ 2019-03-05 16:13 UTC (permalink / raw)
  To: Jiri Olsa
  Cc: Arnaldo Carvalho de Melo, lkml, Ingo Molnar, Namhyung Kim,
	Alexander Shishkin, Peter Zijlstra, Jonas Rabenstein,
	Nageswara R Sastry, Ravi Bangoria

On Tue, Mar 05, 2019 at 04:25:33PM +0100, Jiri Olsa wrote:
> Getting precise_ip field from the perf_pmu::max_precise
> config read from sysfs. If it's not available falling
> back to current detection function.

max_precise depends on the event.  This won't work for all
events. For example only instructions and cycles support
ppp

The previous method handled it by event.

-Andi

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH 5/8] perf tools: Get precise_ip from the pmu config
  2019-03-05 16:13   ` Andi Kleen
@ 2019-03-05 16:28     ` Jiri Olsa
  2019-03-05 16:40       ` Andi Kleen
  0 siblings, 1 reply; 30+ messages in thread
From: Jiri Olsa @ 2019-03-05 16:28 UTC (permalink / raw)
  To: Andi Kleen
  Cc: Jiri Olsa, Arnaldo Carvalho de Melo, lkml, Ingo Molnar,
	Namhyung Kim, Alexander Shishkin, Peter Zijlstra,
	Jonas Rabenstein, Nageswara R Sastry, Ravi Bangoria

On Tue, Mar 05, 2019 at 08:13:19AM -0800, Andi Kleen wrote:
> On Tue, Mar 05, 2019 at 04:25:33PM +0100, Jiri Olsa wrote:
> > Getting precise_ip field from the perf_pmu::max_precise
> > config read from sysfs. If it's not available falling
> > back to current detection function.
> 
> max_precise depends on the event.  This won't work for all
> events. For example only instructions and cycles support
> ppp

I'm getting  precise_ip=3 on mem-* events as well, that's why I
was fixing this.. now it's not working for any event

> 
> The previous method handled it by event.

how about we use empty template with just type/config
of the original event, like in the change below,
that way we eliminate unsupported features failing
the probing.. maybe we could use oldest attr version

also if precise_ip is event based, we shouldn't use the
caps/max_precise then

jirka


---
diff --git a/tools/perf/util/evlist.c b/tools/perf/util/evlist.c
index ed20f4379956..cee2f83feb89 100644
--- a/tools/perf/util/evlist.c
+++ b/tools/perf/util/evlist.c
@@ -233,8 +233,8 @@ void perf_evlist__set_leader(struct perf_evlist *evlist)
 void perf_event_attr__set_max_precise_ip(struct perf_event_attr *pattr)
 {
 	struct perf_event_attr attr = {
-		.type		= PERF_TYPE_HARDWARE,
-		.config		= PERF_COUNT_HW_CPU_CYCLES,
+		.type		= pattr->type,
+		.config		= pattr->config,
 		.exclude_kernel	= 1,
 		.precise_ip	= 3,
 	};

^ permalink raw reply related	[flat|nested] 30+ messages in thread

* Re: [PATCH 5/8] perf tools: Get precise_ip from the pmu config
  2019-03-05 16:28     ` Jiri Olsa
@ 2019-03-05 16:40       ` Andi Kleen
  2019-03-07 15:35         ` [PATCHv2 " Jiri Olsa
  0 siblings, 1 reply; 30+ messages in thread
From: Andi Kleen @ 2019-03-05 16:40 UTC (permalink / raw)
  To: Jiri Olsa
  Cc: Jiri Olsa, Arnaldo Carvalho de Melo, lkml, Ingo Molnar,
	Namhyung Kim, Alexander Shishkin, Peter Zijlstra,
	Jonas Rabenstein, Nageswara R Sastry, Ravi Bangoria

On Tue, Mar 05, 2019 at 05:28:54PM +0100, Jiri Olsa wrote:
> On Tue, Mar 05, 2019 at 08:13:19AM -0800, Andi Kleen wrote:
> > On Tue, Mar 05, 2019 at 04:25:33PM +0100, Jiri Olsa wrote:
> > > Getting precise_ip field from the perf_pmu::max_precise
> > > config read from sysfs. If it's not available falling
> > > back to current detection function.
> > 
> > max_precise depends on the event.  This won't work for all
> > events. For example only instructions and cycles support
> > ppp
> 
> I'm getting  precise_ip=3 on mem-* events as well, that's why I
> was fixing this.. now it's not working for any event

I don't think it means anything for mem-*

There's some support for it on Goldmont plus for other events,
but it doesn't support mem-*. On big core it's only
for instructions and cycles, all implemented with the same
event. All other PEBS events only have two levels
switching between the two IPs.

-Andi

^ permalink raw reply	[flat|nested] 30+ messages in thread

* [PATCHv2 5/8] perf tools: Get precise_ip from the pmu config
  2019-03-05 16:40       ` Andi Kleen
@ 2019-03-07 15:35         ` Jiri Olsa
  2019-03-07 16:51           ` Andi Kleen
  0 siblings, 1 reply; 30+ messages in thread
From: Jiri Olsa @ 2019-03-07 15:35 UTC (permalink / raw)
  To: Andi Kleen
  Cc: Jiri Olsa, Arnaldo Carvalho de Melo, lkml, Ingo Molnar,
	Namhyung Kim, Alexander Shishkin, Peter Zijlstra,
	Jonas Rabenstein, Nageswara R Sastry, Ravi Bangoria

On Tue, Mar 05, 2019 at 08:40:17AM -0800, Andi Kleen wrote:
> On Tue, Mar 05, 2019 at 05:28:54PM +0100, Jiri Olsa wrote:
> > On Tue, Mar 05, 2019 at 08:13:19AM -0800, Andi Kleen wrote:
> > > On Tue, Mar 05, 2019 at 04:25:33PM +0100, Jiri Olsa wrote:
> > > > Getting precise_ip field from the perf_pmu::max_precise
> > > > config read from sysfs. If it's not available falling
> > > > back to current detection function.
> > > 
> > > max_precise depends on the event.  This won't work for all
> > > events. For example only instructions and cycles support
> > > ppp
> > 
> > I'm getting  precise_ip=3 on mem-* events as well, that's why I
> > was fixing this.. now it's not working for any event
> 
> I don't think it means anything for mem-*
> 
> There's some support for it on Goldmont plus for other events,
> but it doesn't support mem-*. On big core it's only
> for instructions and cycles, all implemented with the same
> event. All other PEBS events only have two levels
> switching between the two IPs.

ok, so how about this, it's the change I posted merged with the patch

jirka


---
Currently we probe for precise_ip with user specified
perf_event_attr, which might fail because of unsupported
kernel features, which would get disabled during the
open time anyway.

Switching the probe to take place on simple event,
only configured with users type/config, so the
following record sets proper precise_ip:

  # perf record -e cycles:P ls
  # perf evlist -v
  cycles:P: size: 112, ... precise_ip: 3, ...

Link: http://lkml.kernel.org/n/tip-rwncfxifbhnmd89yx0va5zg0@git.kernel.org
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
---
 tools/perf/util/evlist.c | 25 ++++++++++++++++++++-----
 tools/perf/util/evsel.c  |  8 --------
 2 files changed, 20 insertions(+), 13 deletions(-)

diff --git a/tools/perf/util/evlist.c b/tools/perf/util/evlist.c
index 08cedb643ea6..cee2f83feb89 100644
--- a/tools/perf/util/evlist.c
+++ b/tools/perf/util/evlist.c
@@ -230,18 +230,33 @@ void perf_evlist__set_leader(struct perf_evlist *evlist)
 	}
 }
 
-void perf_event_attr__set_max_precise_ip(struct perf_event_attr *attr)
+void perf_event_attr__set_max_precise_ip(struct perf_event_attr *pattr)
 {
-	attr->precise_ip = 3;
+	struct perf_event_attr attr = {
+		.type		= pattr->type,
+		.config		= pattr->config,
+		.exclude_kernel	= 1,
+		.precise_ip	= 3,
+	};
 
-	while (attr->precise_ip != 0) {
-		int fd = sys_perf_event_open(attr, 0, -1, -1, 0);
+	event_attr_init(&attr);
+
+	/*
+	 * Unnamed union member, not supported as struct member named
+	 * initializer in older compilers such as gcc 4.4.7
+	 */
+	attr.sample_period = 1;
+
+	while (attr.precise_ip != 0) {
+		int fd = sys_perf_event_open(&attr, 0, -1, -1, 0);
 		if (fd != -1) {
 			close(fd);
 			break;
 		}
-		--attr->precise_ip;
+		--attr.precise_ip;
 	}
+
+	pattr->precise_ip = attr.precise_ip;
 }
 
 int __perf_evlist__add_default(struct perf_evlist *evlist, bool precise)
diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
index eec542bab815..2ef229e24b6f 100644
--- a/tools/perf/util/evsel.c
+++ b/tools/perf/util/evsel.c
@@ -294,20 +294,12 @@ struct perf_evsel *perf_evsel__new_cycles(bool precise)
 
 	if (!precise)
 		goto new_event;
-	/*
-	 * Unnamed union member, not supported as struct member named
-	 * initializer in older compilers such as gcc 4.4.7
-	 *
-	 * Just for probing the precise_ip:
-	 */
-	attr.sample_period = 1;
 
 	perf_event_attr__set_max_precise_ip(&attr);
 	/*
 	 * Now let the usual logic to set up the perf_event_attr defaults
 	 * to kick in when we return and before perf_evsel__open() is called.
 	 */
-	attr.sample_period = 0;
 new_event:
 	evsel = perf_evsel__new(&attr);
 	if (evsel == NULL)
-- 
2.17.2


^ permalink raw reply related	[flat|nested] 30+ messages in thread

* Re: [PATCHv2 5/8] perf tools: Get precise_ip from the pmu config
  2019-03-07 15:35         ` [PATCHv2 " Jiri Olsa
@ 2019-03-07 16:51           ` Andi Kleen
  2019-03-07 22:32             ` Jiri Olsa
  2019-03-14 14:01             ` Jiri Olsa
  0 siblings, 2 replies; 30+ messages in thread
From: Andi Kleen @ 2019-03-07 16:51 UTC (permalink / raw)
  To: Jiri Olsa
  Cc: Jiri Olsa, Arnaldo Carvalho de Melo, lkml, Ingo Molnar,
	Namhyung Kim, Alexander Shishkin, Peter Zijlstra,
	Jonas Rabenstein, Nageswara R Sastry, Ravi Bangoria

On Thu, Mar 07, 2019 at 04:35:00PM +0100, Jiri Olsa wrote:
> On Tue, Mar 05, 2019 at 08:40:17AM -0800, Andi Kleen wrote:
> > On Tue, Mar 05, 2019 at 05:28:54PM +0100, Jiri Olsa wrote:
> > > On Tue, Mar 05, 2019 at 08:13:19AM -0800, Andi Kleen wrote:
> > > > On Tue, Mar 05, 2019 at 04:25:33PM +0100, Jiri Olsa wrote:
> > > > > Getting precise_ip field from the perf_pmu::max_precise
> > > > > config read from sysfs. If it's not available falling
> > > > > back to current detection function.
> > > > 
> > > > max_precise depends on the event.  This won't work for all
> > > > events. For example only instructions and cycles support
> > > > ppp
> > > 
> > > I'm getting  precise_ip=3 on mem-* events as well, that's why I
> > > was fixing this.. now it's not working for any event
> > 
> > I don't think it means anything for mem-*
> > 
> > There's some support for it on Goldmont plus for other events,
> > but it doesn't support mem-*. On big core it's only
> > for instructions and cycles, all implemented with the same
> > event. All other PEBS events only have two levels
> > switching between the two IPs.
> 
> ok, so how about this, it's the change I posted merged with the patch

Still seems like a hack. Even though I don't know of a case that
would break now. But it would be better to have the precise probing
in the open retry loop, instead of trying to reinvent it here.

-Andi

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCHv2 5/8] perf tools: Get precise_ip from the pmu config
  2019-03-07 16:51           ` Andi Kleen
@ 2019-03-07 22:32             ` Jiri Olsa
  2019-03-14 14:01             ` Jiri Olsa
  1 sibling, 0 replies; 30+ messages in thread
From: Jiri Olsa @ 2019-03-07 22:32 UTC (permalink / raw)
  To: Andi Kleen
  Cc: Jiri Olsa, Arnaldo Carvalho de Melo, lkml, Ingo Molnar,
	Namhyung Kim, Alexander Shishkin, Peter Zijlstra,
	Jonas Rabenstein, Nageswara R Sastry, Ravi Bangoria

On Thu, Mar 07, 2019 at 08:51:23AM -0800, Andi Kleen wrote:
> On Thu, Mar 07, 2019 at 04:35:00PM +0100, Jiri Olsa wrote:
> > On Tue, Mar 05, 2019 at 08:40:17AM -0800, Andi Kleen wrote:
> > > On Tue, Mar 05, 2019 at 05:28:54PM +0100, Jiri Olsa wrote:
> > > > On Tue, Mar 05, 2019 at 08:13:19AM -0800, Andi Kleen wrote:
> > > > > On Tue, Mar 05, 2019 at 04:25:33PM +0100, Jiri Olsa wrote:
> > > > > > Getting precise_ip field from the perf_pmu::max_precise
> > > > > > config read from sysfs. If it's not available falling
> > > > > > back to current detection function.
> > > > > 
> > > > > max_precise depends on the event.  This won't work for all
> > > > > events. For example only instructions and cycles support
> > > > > ppp
> > > > 
> > > > I'm getting  precise_ip=3 on mem-* events as well, that's why I
> > > > was fixing this.. now it's not working for any event
> > > 
> > > I don't think it means anything for mem-*
> > > 
> > > There's some support for it on Goldmont plus for other events,
> > > but it doesn't support mem-*. On big core it's only
> > > for instructions and cycles, all implemented with the same
> > > event. All other PEBS events only have two levels
> > > switching between the two IPs.
> > 
> > ok, so how about this, it's the change I posted merged with the patch
> 
> Still seems like a hack. Even though I don't know of a case that
> would break now. But it would be better to have the precise probing
> in the open retry loop, instead of trying to reinvent it here.

right, I guess we could just set precise_ip hard to 3 for :P
and add fallback part for the precise_ip in perf_evsel__open

jirka

^ permalink raw reply	[flat|nested] 30+ messages in thread

* [tip:perf/urgent] perf c2c: Fix c2c report for empty numa node
  2019-03-05 15:25 ` [PATCH 1/8] perf c2c: Fix c2c report for empty numa node Jiri Olsa
@ 2019-03-09 20:05   ` tip-bot for Jiri Olsa
  0 siblings, 0 replies; 30+ messages in thread
From: tip-bot for Jiri Olsa @ 2019-03-09 20:05 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: jolsa, acme, alexander.shishkin, mingo, jonas.rabenstein,
	linux-kernel, tglx, hpa, nasastry, namhyung, jolsa, ak, peterz,
	ravi.bangoria

Commit-ID:  e34c940245437f36d2c492edd1f8237eff391064
Gitweb:     https://git.kernel.org/tip/e34c940245437f36d2c492edd1f8237eff391064
Author:     Jiri Olsa <jolsa@redhat.com>
AuthorDate: Tue, 5 Mar 2019 16:25:29 +0100
Committer:  Arnaldo Carvalho de Melo <acme@redhat.com>
CommitDate: Wed, 6 Mar 2019 18:15:24 -0300

perf c2c: Fix c2c report for empty numa node

Ravi Bangoria reported that we fail with an empty NUMA node with the
following message:

  $ lscpu
  NUMA node0 CPU(s):
  NUMA node1 CPU(s):   0-4

  $ sudo ./perf c2c report
  node/cpu topology bugFailed setup nodes

Fix this by detecting the empty node and keeping its CPU set empty.

Reported-by: Nageswara R Sastry <nasastry@in.ibm.com>
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: Ravi Bangoria <ravi.bangoria@linux.ibm.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jonas Rabenstein <jonas.rabenstein@studium.uni-erlangen.de>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20190305152536.21035-2-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/builtin-c2c.c | 8 ++++++--
 1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/tools/perf/builtin-c2c.c b/tools/perf/builtin-c2c.c
index 4272763a5e96..9e6cc868bdb4 100644
--- a/tools/perf/builtin-c2c.c
+++ b/tools/perf/builtin-c2c.c
@@ -2056,6 +2056,12 @@ static int setup_nodes(struct perf_session *session)
 		if (!set)
 			return -ENOMEM;
 
+		nodes[node] = set;
+
+		/* empty node, skip */
+		if (cpu_map__empty(map))
+			continue;
+
 		for (cpu = 0; cpu < map->nr; cpu++) {
 			set_bit(map->map[cpu], set);
 
@@ -2064,8 +2070,6 @@ static int setup_nodes(struct perf_session *session)
 
 			cpu2node[map->map[cpu]] = node;
 		}
-
-		nodes[node] = set;
 	}
 
 	setup_nodes_header();

^ permalink raw reply related	[flat|nested] 30+ messages in thread

* [tip:perf/urgent] perf hist: Add error path into hist_entry__init
  2019-03-05 15:25 ` [PATCH 2/8] perf tools: Add error path into hist_entry__init Jiri Olsa
@ 2019-03-09 20:06   ` tip-bot for Jiri Olsa
  0 siblings, 0 replies; 30+ messages in thread
From: tip-bot for Jiri Olsa @ 2019-03-09 20:06 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: namhyung, ravi.bangoria, peterz, hpa, linux-kernel,
	alexander.shishkin, jonas.rabenstein, ak, tglx, mingo, acme,
	nasastry, jolsa

Commit-ID:  c57589106fd6d996dbf3757708baa4a3fb91850f
Gitweb:     https://git.kernel.org/tip/c57589106fd6d996dbf3757708baa4a3fb91850f
Author:     Jiri Olsa <jolsa@kernel.org>
AuthorDate: Tue, 5 Mar 2019 16:25:30 +0100
Committer:  Arnaldo Carvalho de Melo <acme@redhat.com>
CommitDate: Wed, 6 Mar 2019 18:16:30 -0300

perf hist: Add error path into hist_entry__init

Adding error path into hist_entry__init to unify error handling, so
every new member does not need to free everything else.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jonas Rabenstein <jonas.rabenstein@studium.uni-erlangen.de>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@linux.ibm.com>
Cc: nageswara r sastry <nasastry@in.ibm.com>
Link: http://lkml.kernel.org/r/20190305152536.21035-3-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/util/hist.c | 39 +++++++++++++++++++--------------------
 1 file changed, 19 insertions(+), 20 deletions(-)

diff --git a/tools/perf/util/hist.c b/tools/perf/util/hist.c
index 669f961316f0..74e307d17c49 100644
--- a/tools/perf/util/hist.c
+++ b/tools/perf/util/hist.c
@@ -396,11 +396,8 @@ static int hist_entry__init(struct hist_entry *he,
 		 * adding new entries.  So we need to save a copy.
 		 */
 		he->branch_info = malloc(sizeof(*he->branch_info));
-		if (he->branch_info == NULL) {
-			map__zput(he->ms.map);
-			free(he->stat_acc);
-			return -ENOMEM;
-		}
+		if (he->branch_info == NULL)
+			goto err;
 
 		memcpy(he->branch_info, template->branch_info,
 		       sizeof(*he->branch_info));
@@ -419,21 +416,8 @@ static int hist_entry__init(struct hist_entry *he,
 
 	if (he->raw_data) {
 		he->raw_data = memdup(he->raw_data, he->raw_size);
-
-		if (he->raw_data == NULL) {
-			map__put(he->ms.map);
-			if (he->branch_info) {
-				map__put(he->branch_info->from.map);
-				map__put(he->branch_info->to.map);
-				free(he->branch_info);
-			}
-			if (he->mem_info) {
-				map__put(he->mem_info->iaddr.map);
-				map__put(he->mem_info->daddr.map);
-			}
-			free(he->stat_acc);
-			return -ENOMEM;
-		}
+		if (he->raw_data == NULL)
+			goto err_infos;
 	}
 	INIT_LIST_HEAD(&he->pairs.node);
 	thread__get(he->thread);
@@ -444,6 +428,21 @@ static int hist_entry__init(struct hist_entry *he,
 		he->leaf = true;
 
 	return 0;
+
+err_infos:
+	if (he->branch_info) {
+		map__put(he->branch_info->from.map);
+		map__put(he->branch_info->to.map);
+		free(he->branch_info);
+	}
+	if (he->mem_info) {
+		map__put(he->mem_info->iaddr.map);
+		map__put(he->mem_info->daddr.map);
+	}
+err:
+	map__zput(he->ms.map);
+	free(he->stat_acc);
+	return -ENOMEM;
 }
 
 static void *hist_entry__zalloc(size_t size)

^ permalink raw reply related	[flat|nested] 30+ messages in thread

* [tip:perf/urgent] perf hist: Fix memory leak of srcline
  2019-03-05 15:25 ` [PATCH 3/8] perf hist: Fix memory leak of srcline Jiri Olsa
@ 2019-03-09 20:07   ` tip-bot for Jiri Olsa
  0 siblings, 0 replies; 30+ messages in thread
From: tip-bot for Jiri Olsa @ 2019-03-09 20:07 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: jolsa, ak, hpa, jolsa, ravi.bangoria, jonas.rabenstein,
	linux-kernel, namhyung, peterz, acme, nasastry, tglx, mingo,
	alexander.shishkin

Commit-ID:  2634958586368dcbf09c0d2a17dee02d1fc53e0d
Gitweb:     https://git.kernel.org/tip/2634958586368dcbf09c0d2a17dee02d1fc53e0d
Author:     Jiri Olsa <jolsa@redhat.com>
AuthorDate: Tue, 5 Mar 2019 16:25:31 +0100
Committer:  Arnaldo Carvalho de Melo <acme@redhat.com>
CommitDate: Wed, 6 Mar 2019 18:16:57 -0300

perf hist: Fix memory leak of srcline

We can't allocate he->srcline unconditionaly, only when new hist_entry
is created. Moving he->srcline allocation into hist_entry__init
function.

Original-patch-by: Jonas Rabenstein <jonas.rabenstein@studium.uni-erlangen.de>
Suggested-by: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Nageswara R Sastry <nasastry@in.ibm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@linux.ibm.com>
Link: http://lkml.kernel.org/r/20190305152536.21035-4-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/util/hist.c | 14 ++++++++++++--
 1 file changed, 12 insertions(+), 2 deletions(-)

diff --git a/tools/perf/util/hist.c b/tools/perf/util/hist.c
index 74e307d17c49..f9eb95bf3938 100644
--- a/tools/perf/util/hist.c
+++ b/tools/perf/util/hist.c
@@ -419,6 +419,13 @@ static int hist_entry__init(struct hist_entry *he,
 		if (he->raw_data == NULL)
 			goto err_infos;
 	}
+
+	if (he->srcline) {
+		he->srcline = strdup(he->srcline);
+		if (he->srcline == NULL)
+			goto err_rawdata;
+	}
+
 	INIT_LIST_HEAD(&he->pairs.node);
 	thread__get(he->thread);
 	he->hroot_in  = RB_ROOT_CACHED;
@@ -429,6 +436,9 @@ static int hist_entry__init(struct hist_entry *he,
 
 	return 0;
 
+err_rawdata:
+	free(he->raw_data);
+
 err_infos:
 	if (he->branch_info) {
 		map__put(he->branch_info->from.map);
@@ -605,7 +615,7 @@ __hists__add_entry(struct hists *hists,
 			.map	= al->map,
 			.sym	= al->sym,
 		},
-		.srcline = al->srcline ? strdup(al->srcline) : NULL,
+		.srcline = (char *) al->srcline,
 		.socket	 = al->socket,
 		.cpu	 = al->cpu,
 		.cpumode = al->cpumode,
@@ -962,7 +972,7 @@ iter_add_next_cumulative_entry(struct hist_entry_iter *iter,
 			.map = al->map,
 			.sym = al->sym,
 		},
-		.srcline = al->srcline ? strdup(al->srcline) : NULL,
+		.srcline = (char *) al->srcline,
 		.parent = iter->parent,
 		.raw_data = sample->raw_data,
 		.raw_size = sample->raw_size,

^ permalink raw reply related	[flat|nested] 30+ messages in thread

* [tip:perf/urgent] perf tools: Read and store caps/max_precise in perf_pmu
  2019-03-05 15:25 ` [PATCH 4/8] perf tools: Read and store caps/max_precise in perf_pmu Jiri Olsa
@ 2019-03-09 20:07   ` tip-bot for Jiri Olsa
  0 siblings, 0 replies; 30+ messages in thread
From: tip-bot for Jiri Olsa @ 2019-03-09 20:07 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: peterz, jonas.rabenstein, mingo, ravi.bangoria, linux-kernel,
	namhyung, jolsa, ak, hpa, tglx, nasastry, acme,
	alexander.shishkin

Commit-ID:  90a86bde97ba050cb3c9ccb215252ee2d2d705fa
Gitweb:     https://git.kernel.org/tip/90a86bde97ba050cb3c9ccb215252ee2d2d705fa
Author:     Jiri Olsa <jolsa@kernel.org>
AuthorDate: Tue, 5 Mar 2019 16:25:32 +0100
Committer:  Arnaldo Carvalho de Melo <acme@redhat.com>
CommitDate: Wed, 6 Mar 2019 18:18:17 -0300

perf tools: Read and store caps/max_precise in perf_pmu

Read the caps/max_precise value and store it in struct perf_pmu to be
used when setting the maximum precise_ip field in following patch.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jonas Rabenstein <jonas.rabenstein@studium.uni-erlangen.de>
Cc: Nageswara R Sastry <nasastry@in.ibm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@linux.ibm.com>
Link: http://lkml.kernel.org/r/20190305152536.21035-5-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/util/pmu.c | 14 ++++++++++++++
 tools/perf/util/pmu.h |  1 +
 2 files changed, 15 insertions(+)

diff --git a/tools/perf/util/pmu.c b/tools/perf/util/pmu.c
index 51d437f55d18..6199a3174ab9 100644
--- a/tools/perf/util/pmu.c
+++ b/tools/perf/util/pmu.c
@@ -752,6 +752,19 @@ perf_pmu__get_default_config(struct perf_pmu *pmu __maybe_unused)
 	return NULL;
 }
 
+static int pmu_max_precise(const char *name)
+{
+	char path[PATH_MAX];
+	int max_precise = -1;
+
+	scnprintf(path, PATH_MAX,
+		 "bus/event_source/devices/%s/caps/max_precise",
+		 name);
+
+	sysfs__read_int(path, &max_precise);
+	return max_precise;
+}
+
 static struct perf_pmu *pmu_lookup(const char *name)
 {
 	struct perf_pmu *pmu;
@@ -784,6 +797,7 @@ static struct perf_pmu *pmu_lookup(const char *name)
 	pmu->name = strdup(name);
 	pmu->type = type;
 	pmu->is_uncore = pmu_is_uncore(name);
+	pmu->max_precise = pmu_max_precise(name);
 	pmu_add_cpu_aliases(&aliases, pmu);
 
 	INIT_LIST_HEAD(&pmu->format);
diff --git a/tools/perf/util/pmu.h b/tools/perf/util/pmu.h
index 47253c3daf55..bd9ec2704a57 100644
--- a/tools/perf/util/pmu.h
+++ b/tools/perf/util/pmu.h
@@ -26,6 +26,7 @@ struct perf_pmu {
 	__u32 type;
 	bool selectable;
 	bool is_uncore;
+	int max_precise;
 	struct perf_event_attr *default_config;
 	struct cpu_map *cpus;
 	struct list_head format;  /* HEAD struct perf_pmu_format -> list */

^ permalink raw reply related	[flat|nested] 30+ messages in thread

* [tip:perf/urgent] perf evsel: Probe for precise_ip with simple attr
  2019-03-05 15:25 ` [PATCH 6/8] perf tools: Probe for precise_ip with simple attr Jiri Olsa
@ 2019-03-09 20:08   ` tip-bot for Jiri Olsa
  0 siblings, 0 replies; 30+ messages in thread
From: tip-bot for Jiri Olsa @ 2019-03-09 20:08 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: acme, alexander.shishkin, ravi.bangoria, namhyung, peterz, tglx,
	linux-kernel, jonas.rabenstein, ak, nasastry, hpa, jolsa, mingo

Commit-ID:  5b61adb16599be04346e7e943c1b5113b57485ad
Gitweb:     https://git.kernel.org/tip/5b61adb16599be04346e7e943c1b5113b57485ad
Author:     Jiri Olsa <jolsa@kernel.org>
AuthorDate: Tue, 5 Mar 2019 16:25:34 +0100
Committer:  Arnaldo Carvalho de Melo <acme@redhat.com>
CommitDate: Wed, 6 Mar 2019 18:19:45 -0300

perf evsel: Probe for precise_ip with simple attr

Currently we probe for precise_ip with user specified perf_event_attr,
which might fail because of unsupported kernel features, which would get
disabled during the open time anyway.

Switching the probe to take place on simple hw cycles, so the following
record sets proper precise_ip:

  # perf record -e cycles:P ls
  # perf evlist -v
  cycles:P: size: 112, ... precise_ip: 3, ...

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jonas Rabenstein <jonas.rabenstein@studium.uni-erlangen.de>
Cc: Nageswara R Sastry <nasastry@in.ibm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@linux.ibm.com>
Link: http://lkml.kernel.org/r/20190305152536.21035-7-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/util/evlist.c | 25 ++++++++++++++++++++-----
 tools/perf/util/evsel.c  |  8 --------
 2 files changed, 20 insertions(+), 13 deletions(-)

diff --git a/tools/perf/util/evlist.c b/tools/perf/util/evlist.c
index 08cedb643ea6..ed20f4379956 100644
--- a/tools/perf/util/evlist.c
+++ b/tools/perf/util/evlist.c
@@ -230,18 +230,33 @@ void perf_evlist__set_leader(struct perf_evlist *evlist)
 	}
 }
 
-void perf_event_attr__set_max_precise_ip(struct perf_event_attr *attr)
+void perf_event_attr__set_max_precise_ip(struct perf_event_attr *pattr)
 {
-	attr->precise_ip = 3;
+	struct perf_event_attr attr = {
+		.type		= PERF_TYPE_HARDWARE,
+		.config		= PERF_COUNT_HW_CPU_CYCLES,
+		.exclude_kernel	= 1,
+		.precise_ip	= 3,
+	};
 
-	while (attr->precise_ip != 0) {
-		int fd = sys_perf_event_open(attr, 0, -1, -1, 0);
+	event_attr_init(&attr);
+
+	/*
+	 * Unnamed union member, not supported as struct member named
+	 * initializer in older compilers such as gcc 4.4.7
+	 */
+	attr.sample_period = 1;
+
+	while (attr.precise_ip != 0) {
+		int fd = sys_perf_event_open(&attr, 0, -1, -1, 0);
 		if (fd != -1) {
 			close(fd);
 			break;
 		}
-		--attr->precise_ip;
+		--attr.precise_ip;
 	}
+
+	pattr->precise_ip = attr.precise_ip;
 }
 
 int __perf_evlist__add_default(struct perf_evlist *evlist, bool precise)
diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
index dfe2958e6287..3bbf73e979c0 100644
--- a/tools/perf/util/evsel.c
+++ b/tools/perf/util/evsel.c
@@ -294,20 +294,12 @@ struct perf_evsel *perf_evsel__new_cycles(bool precise)
 
 	if (!precise)
 		goto new_event;
-	/*
-	 * Unnamed union member, not supported as struct member named
-	 * initializer in older compilers such as gcc 4.4.7
-	 *
-	 * Just for probing the precise_ip:
-	 */
-	attr.sample_period = 1;
 
 	perf_event_attr__set_max_precise_ip(&attr);
 	/*
 	 * Now let the usual logic to set up the perf_event_attr defaults
 	 * to kick in when we return and before perf_evsel__open() is called.
 	 */
-	attr.sample_period = 0;
 new_event:
 	evsel = perf_evsel__new(&attr);
 	if (evsel == NULL)

^ permalink raw reply related	[flat|nested] 30+ messages in thread

* [tip:perf/urgent] perf session: Fix double free in perf_data__close
  2019-03-05 15:25 ` [PATCH 7/8] perf tools: Fix double free in perf_data__close Jiri Olsa
@ 2019-03-09 20:09   ` tip-bot for Jiri Olsa
  0 siblings, 0 replies; 30+ messages in thread
From: tip-bot for Jiri Olsa @ 2019-03-09 20:09 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: nasastry, acme, mingo, linux-kernel, ravi.bangoria, namhyung,
	hpa, ak, jonas.rabenstein, jolsa, alexander.shishkin, peterz,
	tglx

Commit-ID:  befa09b61f8bf1d7c34b8e6405f08d804640573c
Gitweb:     https://git.kernel.org/tip/befa09b61f8bf1d7c34b8e6405f08d804640573c
Author:     Jiri Olsa <jolsa@kernel.org>
AuthorDate: Tue, 5 Mar 2019 16:25:35 +0100
Committer:  Arnaldo Carvalho de Melo <acme@redhat.com>
CommitDate: Wed, 6 Mar 2019 18:20:33 -0300

perf session: Fix double free in perf_data__close

We can't call perf_data__close and subsequently perf_session__delete,
because it will call perf_data__close again and cause double free for
data->file.path.

  $ perf report -i .
  incompatible file format (rerun with -v to learn more)
  free(): double free detected in tcache 2
  Aborted (core dumped)

In fact we don't need to call perf_data__close at all, because at the
time the got out_close is reached, session->data is already initialized,
so the perf_data__close call will be triggered from
perf_session__delete.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jonas Rabenstein <jonas.rabenstein@studium.uni-erlangen.de>
Cc: Nageswara R Sastry <nasastry@in.ibm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@linux.ibm.com>
Fixes: 2d4f27999b88 ("perf data: Add global path holder")
Link: http://lkml.kernel.org/r/20190305152536.21035-8-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/util/session.c | 4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)

diff --git a/tools/perf/util/session.c b/tools/perf/util/session.c
index c764bbc91009..db643f3c2b95 100644
--- a/tools/perf/util/session.c
+++ b/tools/perf/util/session.c
@@ -140,7 +140,7 @@ struct perf_session *perf_session__new(struct perf_data *data,
 
 		if (perf_data__is_read(data)) {
 			if (perf_session__open(session) < 0)
-				goto out_close;
+				goto out_delete;
 
 			/*
 			 * set session attributes that are present in perf.data
@@ -181,8 +181,6 @@ struct perf_session *perf_session__new(struct perf_data *data,
 
 	return session;
 
- out_close:
-	perf_data__close(data);
  out_delete:
 	perf_session__delete(session);
  out:

^ permalink raw reply related	[flat|nested] 30+ messages in thread

* [tip:perf/urgent] perf data: Force perf_data__open|close zero data->file.path
  2019-03-05 15:25 ` [PATCH 8/8] perf tools: Force perf_data__open|close zero data->file.path Jiri Olsa
@ 2019-03-09 20:09   ` tip-bot for Jiri Olsa
  0 siblings, 0 replies; 30+ messages in thread
From: tip-bot for Jiri Olsa @ 2019-03-09 20:09 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: ravi.bangoria, acme, tglx, linux-kernel, namhyung,
	alexander.shishkin, jolsa, jonas.rabenstein, mingo, peterz, ak,
	hpa, nasastry

Commit-ID:  b8f7d86b5849ea7bb84bddc0345a3799049764d4
Gitweb:     https://git.kernel.org/tip/b8f7d86b5849ea7bb84bddc0345a3799049764d4
Author:     Jiri Olsa <jolsa@kernel.org>
AuthorDate: Tue, 5 Mar 2019 16:25:36 +0100
Committer:  Arnaldo Carvalho de Melo <acme@redhat.com>
CommitDate: Wed, 6 Mar 2019 18:21:00 -0300

perf data: Force perf_data__open|close zero data->file.path

Making sure the data->file.path is zeroed on perf_data__open error path
and in perf_data__close, so we don't double free it in case someone call
it twice.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jonas Rabenstein <jonas.rabenstein@studium.uni-erlangen.de>
Cc: Nageswara R Sastry <nasastry@in.ibm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@linux.ibm.com>
Link: http://lkml.kernel.org/r/20190305152536.21035-9-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/util/data.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/tools/perf/util/data.c b/tools/perf/util/data.c
index 7bd5ddeb7a41..e098e189f93e 100644
--- a/tools/perf/util/data.c
+++ b/tools/perf/util/data.c
@@ -237,7 +237,7 @@ static int open_file(struct perf_data *data)
 	     open_file_read(data) : open_file_write(data);
 
 	if (fd < 0) {
-		free(data->file.path);
+		zfree(&data->file.path);
 		return -1;
 	}
 
@@ -270,7 +270,7 @@ int perf_data__open(struct perf_data *data)
 
 void perf_data__close(struct perf_data *data)
 {
-	free(data->file.path);
+	zfree(&data->file.path);
 	close(data->file.fd);
 }
 

^ permalink raw reply related	[flat|nested] 30+ messages in thread

* Re: [PATCHv2 5/8] perf tools: Get precise_ip from the pmu config
  2019-03-07 16:51           ` Andi Kleen
  2019-03-07 22:32             ` Jiri Olsa
@ 2019-03-14 14:01             ` Jiri Olsa
  2019-03-14 15:49               ` Andi Kleen
  1 sibling, 1 reply; 30+ messages in thread
From: Jiri Olsa @ 2019-03-14 14:01 UTC (permalink / raw)
  To: Andi Kleen
  Cc: Jiri Olsa, Arnaldo Carvalho de Melo, lkml, Ingo Molnar,
	Namhyung Kim, Alexander Shishkin, Peter Zijlstra,
	Jonas Rabenstein, Nageswara R Sastry, Ravi Bangoria

On Thu, Mar 07, 2019 at 08:51:23AM -0800, Andi Kleen wrote:
> On Thu, Mar 07, 2019 at 04:35:00PM +0100, Jiri Olsa wrote:
> > On Tue, Mar 05, 2019 at 08:40:17AM -0800, Andi Kleen wrote:
> > > On Tue, Mar 05, 2019 at 05:28:54PM +0100, Jiri Olsa wrote:
> > > > On Tue, Mar 05, 2019 at 08:13:19AM -0800, Andi Kleen wrote:
> > > > > On Tue, Mar 05, 2019 at 04:25:33PM +0100, Jiri Olsa wrote:
> > > > > > Getting precise_ip field from the perf_pmu::max_precise
> > > > > > config read from sysfs. If it's not available falling
> > > > > > back to current detection function.
> > > > > 
> > > > > max_precise depends on the event.  This won't work for all
> > > > > events. For example only instructions and cycles support
> > > > > ppp
> > > > 
> > > > I'm getting  precise_ip=3 on mem-* events as well, that's why I
> > > > was fixing this.. now it's not working for any event
> > > 
> > > I don't think it means anything for mem-*
> > > 
> > > There's some support for it on Goldmont plus for other events,
> > > but it doesn't support mem-*. On big core it's only
> > > for instructions and cycles, all implemented with the same
> > > event. All other PEBS events only have two levels
> > > switching between the two IPs.
> > 
> > ok, so how about this, it's the change I posted merged with the patch
> 
> Still seems like a hack. Even though I don't know of a case that
> would break now. But it would be better to have the precise probing
> in the open retry loop, instead of trying to reinvent it here.

how about something like below?

jirka


---
 tools/perf/util/evlist.c | 29 ----------------
 tools/perf/util/evlist.h |  2 --
 tools/perf/util/evsel.c  | 73 +++++++++++++++++++++++++++++++++-------
 3 files changed, 60 insertions(+), 44 deletions(-)

diff --git a/tools/perf/util/evlist.c b/tools/perf/util/evlist.c
index ed20f4379956..3a243c249c29 100644
--- a/tools/perf/util/evlist.c
+++ b/tools/perf/util/evlist.c
@@ -230,35 +230,6 @@ void perf_evlist__set_leader(struct perf_evlist *evlist)
 	}
 }
 
-void perf_event_attr__set_max_precise_ip(struct perf_event_attr *pattr)
-{
-	struct perf_event_attr attr = {
-		.type		= PERF_TYPE_HARDWARE,
-		.config		= PERF_COUNT_HW_CPU_CYCLES,
-		.exclude_kernel	= 1,
-		.precise_ip	= 3,
-	};
-
-	event_attr_init(&attr);
-
-	/*
-	 * Unnamed union member, not supported as struct member named
-	 * initializer in older compilers such as gcc 4.4.7
-	 */
-	attr.sample_period = 1;
-
-	while (attr.precise_ip != 0) {
-		int fd = sys_perf_event_open(&attr, 0, -1, -1, 0);
-		if (fd != -1) {
-			close(fd);
-			break;
-		}
-		--attr.precise_ip;
-	}
-
-	pattr->precise_ip = attr.precise_ip;
-}
-
 int __perf_evlist__add_default(struct perf_evlist *evlist, bool precise)
 {
 	struct perf_evsel *evsel = perf_evsel__new_cycles(precise);
diff --git a/tools/perf/util/evlist.h b/tools/perf/util/evlist.h
index 744906dd4887..cb15e3366eba 100644
--- a/tools/perf/util/evlist.h
+++ b/tools/perf/util/evlist.h
@@ -303,8 +303,6 @@ void perf_evlist__to_front(struct perf_evlist *evlist,
 void perf_evlist__set_tracking_event(struct perf_evlist *evlist,
 				     struct perf_evsel *tracking_evsel);
 
-void perf_event_attr__set_max_precise_ip(struct perf_event_attr *attr);
-
 struct perf_evsel *
 perf_evlist__find_evsel_by_str(struct perf_evlist *evlist, const char *str);
 
diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
index 3bbf73e979c0..7cd70f99201a 100644
--- a/tools/perf/util/evsel.c
+++ b/tools/perf/util/evsel.c
@@ -295,7 +295,6 @@ struct perf_evsel *perf_evsel__new_cycles(bool precise)
 	if (!precise)
 		goto new_event;
 
-	perf_event_attr__set_max_precise_ip(&attr);
 	/*
 	 * Now let the usual logic to set up the perf_event_attr defaults
 	 * to kick in when we return and before perf_evsel__open() is called.
@@ -305,6 +304,8 @@ struct perf_evsel *perf_evsel__new_cycles(bool precise)
 	if (evsel == NULL)
 		goto out;
 
+	evsel->precise_max = true;
+
 	/* use asprintf() because free(evsel) assumes name is allocated */
 	if (asprintf(&evsel->name, "cycles%s%s%.*s",
 		     (attr.precise_ip || attr.exclude_kernel) ? ":" : "",
@@ -1083,7 +1084,7 @@ void perf_evsel__config(struct perf_evsel *evsel, struct record_opts *opts,
 	}
 
 	if (evsel->precise_max)
-		perf_event_attr__set_max_precise_ip(attr);
+		attr->precise_ip = 3;
 
 	if (opts->all_user) {
 		attr->exclude_kernel = 1;
@@ -1749,6 +1750,60 @@ static bool ignore_missing_thread(struct perf_evsel *evsel,
 	return true;
 }
 
+static void display_attr(struct perf_event_attr *attr)
+{
+	if (verbose >= 2) {
+		fprintf(stderr, "%.60s\n", graph_dotted_line);
+		fprintf(stderr, "perf_event_attr:\n");
+		perf_event_attr__fprintf(stderr, attr, __open_attr__fprintf, NULL);
+		fprintf(stderr, "%.60s\n", graph_dotted_line);
+	}
+}
+
+static int perf_event_open(struct perf_evsel *evsel,
+			   pid_t pid, int cpu, int group_fd,
+			   unsigned long flags)
+{
+	int precise_ip = evsel->attr.precise_ip;
+	int fd;
+
+	while (1) {
+		pr_debug2("sys_perf_event_open: pid %d  cpu %d  group_fd %d  flags %#lx",
+			  pid, cpu, group_fd, flags);
+
+		fd = sys_perf_event_open(&evsel->attr, pid, cpu, group_fd, flags);
+		if (fd >= 0)
+			break;
+
+		/*
+		 * Do quick precise_ip fallback if requested:
+		 *  - there is some precise_ip
+		 *  - maximum precise is requested
+		 *  - we failed with ENOTSUP error, which is
+		 *    associated with wrong precise_ip config
+		 */
+		if (!precise_ip || !evsel->precise_max || (errno != ENOTSUP))
+			break;
+
+		/*
+		 * We tried all the precise_ip values, and it's
+		 * still failing, so leave it to standard fallabck.
+		 */
+		if (!evsel->attr.precise_ip) {
+			evsel->attr.precise_ip = precise_ip;
+			break;
+		}
+
+		pr_debug2("\nsys_perf_event_open failed, error %d\n", -ENOTSUP);
+
+		evsel->attr.precise_ip--;
+		pr_debug2("decreasing precise_ip by one (%d)\n", evsel->attr.precise_ip);
+		display_attr(&evsel->attr);
+	}
+
+	return fd;
+}
+
 int perf_evsel__open(struct perf_evsel *evsel, struct cpu_map *cpus,
 		     struct thread_map *threads)
 {
@@ -1824,12 +1879,7 @@ int perf_evsel__open(struct perf_evsel *evsel, struct cpu_map *cpus,
 	if (perf_missing_features.sample_id_all)
 		evsel->attr.sample_id_all = 0;
 
-	if (verbose >= 2) {
-		fprintf(stderr, "%.60s\n", graph_dotted_line);
-		fprintf(stderr, "perf_event_attr:\n");
-		perf_event_attr__fprintf(stderr, &evsel->attr, __open_attr__fprintf, NULL);
-		fprintf(stderr, "%.60s\n", graph_dotted_line);
-	}
+	display_attr(&evsel->attr);
 
 	for (cpu = 0; cpu < cpus->nr; cpu++) {
 
@@ -1841,13 +1891,10 @@ int perf_evsel__open(struct perf_evsel *evsel, struct cpu_map *cpus,
 
 			group_fd = get_group_fd(evsel, cpu, thread);
 retry_open:
-			pr_debug2("sys_perf_event_open: pid %d  cpu %d  group_fd %d  flags %#lx",
-				  pid, cpus->map[cpu], group_fd, flags);
-
 			test_attr__ready();
 
-			fd = sys_perf_event_open(&evsel->attr, pid, cpus->map[cpu],
-						 group_fd, flags);
+			fd = perf_event_open(evsel, pid, cpus->map[cpu],
+					     group_fd, flags);
 
 			FD(evsel, cpu, thread) = fd;
 
-- 
2.17.2


^ permalink raw reply related	[flat|nested] 30+ messages in thread

* Re: [PATCHv2 5/8] perf tools: Get precise_ip from the pmu config
  2019-03-14 14:01             ` Jiri Olsa
@ 2019-03-14 15:49               ` Andi Kleen
  2019-03-15 12:15                 ` [PATCH] perf tools: Move precise_ip detection into perf_evsel__open Jiri Olsa
  0 siblings, 1 reply; 30+ messages in thread
From: Andi Kleen @ 2019-03-14 15:49 UTC (permalink / raw)
  To: Jiri Olsa
  Cc: Jiri Olsa, Arnaldo Carvalho de Melo, lkml, Ingo Molnar,
	Namhyung Kim, Alexander Shishkin, Peter Zijlstra,
	Jonas Rabenstein, Nageswara R Sastry, Ravi Bangoria

> > Still seems like a hack. Even though I don't know of a case that
> > would break now. But it would be better to have the precise probing
> > in the open retry loop, instead of trying to reinvent it here.
> 
> how about something like below?

Yes looks reasonable.

-Andi

^ permalink raw reply	[flat|nested] 30+ messages in thread

* [PATCH] perf tools: Move precise_ip detection into perf_evsel__open
  2019-03-14 15:49               ` Andi Kleen
@ 2019-03-15 12:15                 ` Jiri Olsa
  2019-03-15 14:05                   ` Andi Kleen
  2019-03-15 14:35                   ` Arnaldo Carvalho de Melo
  0 siblings, 2 replies; 30+ messages in thread
From: Jiri Olsa @ 2019-03-15 12:15 UTC (permalink / raw)
  To: Andi Kleen
  Cc: Jiri Olsa, Arnaldo Carvalho de Melo, lkml, Ingo Molnar,
	Namhyung Kim, Alexander Shishkin, Peter Zijlstra,
	Jonas Rabenstein, Nageswara R Sastry, Ravi Bangoria

On Thu, Mar 14, 2019 at 08:49:11AM -0700, Andi Kleen wrote:
> > > Still seems like a hack. Even though I don't know of a case that
> > > would break now. But it would be better to have the precise probing
> > > in the open retry loop, instead of trying to reinvent it here.
> > 
> > how about something like below?
> 
> Yes looks reasonable.

full patch attached,

jirka


---
After discussion with Andi, moving the precise_ip detection
for maximum precise config (via :P modifier or for default
cycles event) into perf_evsel__open code.

The current detection code in perf_event_attr__set_max_precise_ip
function is tricky, because precise_ip config is specific for
given event and it currently checks only hw cycles.

We now check for valid precise_ip value right after failing
sys_perf_event_open for specific event, before any of the
perf_event_attr fallback code gets executed. This way we get
the proper config in perf_event_attr together with allowed
precise_ip settings.

We can see that code activity with -vv, like:

  $ perf record -vv ls
  ...
  ------------------------------------------------------------
  perf_event_attr:
    size                             112
    { sample_period, sample_freq }   4000
    ...
    precise_ip                       3
    sample_id_all                    1
    exclude_guest                    1
    mmap2                            1
    comm_exec                        1
    ksymbol                          1
  ------------------------------------------------------------
  sys_perf_event_open: pid 9926  cpu 0  group_fd -1  flags 0x8
  sys_perf_event_open failed, error -95
  decreasing precise_ip by one (2)
  ------------------------------------------------------------
  perf_event_attr:
    size                             112
    { sample_period, sample_freq }   4000
    ...
    precise_ip                       2
    sample_id_all                    1
    exclude_guest                    1
    mmap2                            1
    comm_exec                        1
    ksymbol                          1
  ------------------------------------------------------------
  sys_perf_event_open: pid 9926  cpu 0  group_fd -1  flags 0x8 = 4
  ...

Suggested-by: Andi Kleen <ak@linux.intel.com>
Link: http://lkml.kernel.org/n/tip-dkvxxbeg7lu74155d4jhlmc9@git.kernel.org
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
---
 tools/perf/util/evlist.c | 29 ----------------
 tools/perf/util/evlist.h |  2 --
 tools/perf/util/evsel.c  | 72 ++++++++++++++++++++++++++++++++--------
 3 files changed, 59 insertions(+), 44 deletions(-)

diff --git a/tools/perf/util/evlist.c b/tools/perf/util/evlist.c
index ed20f4379956..3a243c249c29 100644
--- a/tools/perf/util/evlist.c
+++ b/tools/perf/util/evlist.c
@@ -230,35 +230,6 @@ void perf_evlist__set_leader(struct perf_evlist *evlist)
 	}
 }
 
-void perf_event_attr__set_max_precise_ip(struct perf_event_attr *pattr)
-{
-	struct perf_event_attr attr = {
-		.type		= PERF_TYPE_HARDWARE,
-		.config		= PERF_COUNT_HW_CPU_CYCLES,
-		.exclude_kernel	= 1,
-		.precise_ip	= 3,
-	};
-
-	event_attr_init(&attr);
-
-	/*
-	 * Unnamed union member, not supported as struct member named
-	 * initializer in older compilers such as gcc 4.4.7
-	 */
-	attr.sample_period = 1;
-
-	while (attr.precise_ip != 0) {
-		int fd = sys_perf_event_open(&attr, 0, -1, -1, 0);
-		if (fd != -1) {
-			close(fd);
-			break;
-		}
-		--attr.precise_ip;
-	}
-
-	pattr->precise_ip = attr.precise_ip;
-}
-
 int __perf_evlist__add_default(struct perf_evlist *evlist, bool precise)
 {
 	struct perf_evsel *evsel = perf_evsel__new_cycles(precise);
diff --git a/tools/perf/util/evlist.h b/tools/perf/util/evlist.h
index 744906dd4887..cb15e3366eba 100644
--- a/tools/perf/util/evlist.h
+++ b/tools/perf/util/evlist.h
@@ -303,8 +303,6 @@ void perf_evlist__to_front(struct perf_evlist *evlist,
 void perf_evlist__set_tracking_event(struct perf_evlist *evlist,
 				     struct perf_evsel *tracking_evsel);
 
-void perf_event_attr__set_max_precise_ip(struct perf_event_attr *attr);
-
 struct perf_evsel *
 perf_evlist__find_evsel_by_str(struct perf_evlist *evlist, const char *str);
 
diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
index 3bbf73e979c0..f606e8a81f0e 100644
--- a/tools/perf/util/evsel.c
+++ b/tools/perf/util/evsel.c
@@ -295,7 +295,6 @@ struct perf_evsel *perf_evsel__new_cycles(bool precise)
 	if (!precise)
 		goto new_event;
 
-	perf_event_attr__set_max_precise_ip(&attr);
 	/*
 	 * Now let the usual logic to set up the perf_event_attr defaults
 	 * to kick in when we return and before perf_evsel__open() is called.
@@ -305,6 +304,8 @@ struct perf_evsel *perf_evsel__new_cycles(bool precise)
 	if (evsel == NULL)
 		goto out;
 
+	evsel->precise_max = true;
+
 	/* use asprintf() because free(evsel) assumes name is allocated */
 	if (asprintf(&evsel->name, "cycles%s%s%.*s",
 		     (attr.precise_ip || attr.exclude_kernel) ? ":" : "",
@@ -1083,7 +1084,7 @@ void perf_evsel__config(struct perf_evsel *evsel, struct record_opts *opts,
 	}
 
 	if (evsel->precise_max)
-		perf_event_attr__set_max_precise_ip(attr);
+		attr->precise_ip = 3;
 
 	if (opts->all_user) {
 		attr->exclude_kernel = 1;
@@ -1749,6 +1750,59 @@ static bool ignore_missing_thread(struct perf_evsel *evsel,
 	return true;
 }
 
+static void display_attr(struct perf_event_attr *attr)
+{
+	if (verbose >= 2) {
+		fprintf(stderr, "%.60s\n", graph_dotted_line);
+		fprintf(stderr, "perf_event_attr:\n");
+		perf_event_attr__fprintf(stderr, attr, __open_attr__fprintf, NULL);
+		fprintf(stderr, "%.60s\n", graph_dotted_line);
+	}
+}
+
+static int perf_event_open(struct perf_evsel *evsel,
+			   pid_t pid, int cpu, int group_fd,
+			   unsigned long flags)
+{
+	int precise_ip = evsel->attr.precise_ip;
+	int fd;
+
+	while (1) {
+		pr_debug2("sys_perf_event_open: pid %d  cpu %d  group_fd %d  flags %#lx",
+			  pid, cpu, group_fd, flags);
+
+		fd = sys_perf_event_open(&evsel->attr, pid, cpu, group_fd, flags);
+		if (fd >= 0)
+			break;
+
+		/*
+		 * Do quick precise_ip fallback if:
+		 *  - there is precise_ip set in perf_event_attr
+		 *  - maximum precise is requested
+		 *  - sys_perf_event_open failed with ENOTSUP error,
+		 *    which is associated with wrong precise_ip
+		 */
+		if (!precise_ip || !evsel->precise_max || (errno != ENOTSUP))
+			break;
+
+		/*
+		 * We tried all the precise_ip values, and it's
+		 * still failing, so leave it to standard fallback.
+		 */
+		if (!evsel->attr.precise_ip) {
+			evsel->attr.precise_ip = precise_ip;
+			break;
+		}
+
+		pr_debug2("\nsys_perf_event_open failed, error %d\n", -ENOTSUP);
+		evsel->attr.precise_ip--;
+		pr_debug2("decreasing precise_ip by one (%d)\n", evsel->attr.precise_ip);
+		display_attr(&evsel->attr);
+	}
+
+	return fd;
+}
+
 int perf_evsel__open(struct perf_evsel *evsel, struct cpu_map *cpus,
 		     struct thread_map *threads)
 {
@@ -1824,12 +1878,7 @@ int perf_evsel__open(struct perf_evsel *evsel, struct cpu_map *cpus,
 	if (perf_missing_features.sample_id_all)
 		evsel->attr.sample_id_all = 0;
 
-	if (verbose >= 2) {
-		fprintf(stderr, "%.60s\n", graph_dotted_line);
-		fprintf(stderr, "perf_event_attr:\n");
-		perf_event_attr__fprintf(stderr, &evsel->attr, __open_attr__fprintf, NULL);
-		fprintf(stderr, "%.60s\n", graph_dotted_line);
-	}
+	display_attr(&evsel->attr);
 
 	for (cpu = 0; cpu < cpus->nr; cpu++) {
 
@@ -1841,13 +1890,10 @@ int perf_evsel__open(struct perf_evsel *evsel, struct cpu_map *cpus,
 
 			group_fd = get_group_fd(evsel, cpu, thread);
 retry_open:
-			pr_debug2("sys_perf_event_open: pid %d  cpu %d  group_fd %d  flags %#lx",
-				  pid, cpus->map[cpu], group_fd, flags);
-
 			test_attr__ready();
 
-			fd = sys_perf_event_open(&evsel->attr, pid, cpus->map[cpu],
-						 group_fd, flags);
+			fd = perf_event_open(evsel, pid, cpus->map[cpu],
+					     group_fd, flags);
 
 			FD(evsel, cpu, thread) = fd;
 
-- 
2.17.2


^ permalink raw reply related	[flat|nested] 30+ messages in thread

* Re: [PATCH] perf tools: Move precise_ip detection into perf_evsel__open
  2019-03-15 12:15                 ` [PATCH] perf tools: Move precise_ip detection into perf_evsel__open Jiri Olsa
@ 2019-03-15 14:05                   ` Andi Kleen
  2019-03-15 14:35                   ` Arnaldo Carvalho de Melo
  1 sibling, 0 replies; 30+ messages in thread
From: Andi Kleen @ 2019-03-15 14:05 UTC (permalink / raw)
  To: Jiri Olsa
  Cc: Jiri Olsa, Arnaldo Carvalho de Melo, lkml, Ingo Molnar,
	Namhyung Kim, Alexander Shishkin, Peter Zijlstra,
	Jonas Rabenstein, Nageswara R Sastry, Ravi Bangoria

On Fri, Mar 15, 2019 at 01:15:46PM +0100, Jiri Olsa wrote:
> On Thu, Mar 14, 2019 at 08:49:11AM -0700, Andi Kleen wrote:
> > > > Still seems like a hack. Even though I don't know of a case that
> > > > would break now. But it would be better to have the precise probing
> > > > in the open retry loop, instead of trying to reinvent it here.
> > > 
> > > how about something like below?
> > 
> > Yes looks reasonable.
> 
> full patch attached,

Acked-by: Andi Kleen <ak@linux.intel.com>

-Andi

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH] perf tools: Move precise_ip detection into perf_evsel__open
  2019-03-15 12:15                 ` [PATCH] perf tools: Move precise_ip detection into perf_evsel__open Jiri Olsa
  2019-03-15 14:05                   ` Andi Kleen
@ 2019-03-15 14:35                   ` Arnaldo Carvalho de Melo
  2019-03-15 14:52                     ` Jiri Olsa
  1 sibling, 1 reply; 30+ messages in thread
From: Arnaldo Carvalho de Melo @ 2019-03-15 14:35 UTC (permalink / raw)
  To: Jiri Olsa
  Cc: Andi Kleen, Jiri Olsa, lkml, Ingo Molnar, Namhyung Kim,
	Alexander Shishkin, Peter Zijlstra, Jonas Rabenstein,
	Nageswara R Sastry, Ravi Bangoria

Em Fri, Mar 15, 2019 at 01:15:46PM +0100, Jiri Olsa escreveu:
> On Thu, Mar 14, 2019 at 08:49:11AM -0700, Andi Kleen wrote:
> > > > Still seems like a hack. Even though I don't know of a case that
> > > > would break now. But it would be better to have the precise probing
> > > > in the open retry loop, instead of trying to reinvent it here.
> > > 
> > > how about something like below?
> > 
> > Yes looks reasonable.
> 
> full patch attached,
> 
> jirka
> 
> 
> ---
> After discussion with Andi, moving the precise_ip detection
> for maximum precise config (via :P modifier or for default
> cycles event) into perf_evsel__open code.
> 
> The current detection code in perf_event_attr__set_max_precise_ip
> function is tricky, because precise_ip config is specific for
> given event and it currently checks only hw cycles.
> 
> We now check for valid precise_ip value right after failing
> sys_perf_event_open for specific event, before any of the
> perf_event_attr fallback code gets executed. This way we get
> the proper config in perf_event_attr together with allowed
> precise_ip settings.
> 
> We can see that code activity with -vv, like:
> 
>   $ perf record -vv ls
>   ...
>   ------------------------------------------------------------
>   perf_event_attr:
>     size                             112
>     { sample_period, sample_freq }   4000
>     ...
>     precise_ip                       3
>     sample_id_all                    1
>     exclude_guest                    1
>     mmap2                            1
>     comm_exec                        1
>     ksymbol                          1
>   ------------------------------------------------------------
>   sys_perf_event_open: pid 9926  cpu 0  group_fd -1  flags 0x8
>   sys_perf_event_open failed, error -95
>   decreasing precise_ip by one (2)
>   ------------------------------------------------------------
>   perf_event_attr:
>     size                             112
>     { sample_period, sample_freq }   4000
>     ...
>     precise_ip                       2
>     sample_id_all                    1
>     exclude_guest                    1
>     mmap2                            1
>     comm_exec                        1
>     ksymbol                          1
>   ------------------------------------------------------------
>   sys_perf_event_open: pid 9926  cpu 0  group_fd -1  flags 0x8 = 4
>   ...
> 
> Suggested-by: Andi Kleen <ak@linux.intel.com>
> Link: http://lkml.kernel.org/n/tip-dkvxxbeg7lu74155d4jhlmc9@git.kernel.org
> Signed-off-by: Jiri Olsa <jolsa@kernel.org>
> ---
>  tools/perf/util/evlist.c | 29 ----------------
>  tools/perf/util/evlist.h |  2 --
>  tools/perf/util/evsel.c  | 72 ++++++++++++++++++++++++++++++++--------
>  3 files changed, 59 insertions(+), 44 deletions(-)
> 
> diff --git a/tools/perf/util/evlist.c b/tools/perf/util/evlist.c
> index ed20f4379956..3a243c249c29 100644
> --- a/tools/perf/util/evlist.c
> +++ b/tools/perf/util/evlist.c
> @@ -230,35 +230,6 @@ void perf_evlist__set_leader(struct perf_evlist *evlist)
>  	}
>  }
>  
> -void perf_event_attr__set_max_precise_ip(struct perf_event_attr *pattr)
> -{
> -	struct perf_event_attr attr = {
> -		.type		= PERF_TYPE_HARDWARE,
> -		.config		= PERF_COUNT_HW_CPU_CYCLES,
> -		.exclude_kernel	= 1,
> -		.precise_ip	= 3,
> -	};
> -
> -	event_attr_init(&attr);
> -
> -	/*
> -	 * Unnamed union member, not supported as struct member named
> -	 * initializer in older compilers such as gcc 4.4.7
> -	 */
> -	attr.sample_period = 1;
> -
> -	while (attr.precise_ip != 0) {
> -		int fd = sys_perf_event_open(&attr, 0, -1, -1, 0);
> -		if (fd != -1) {
> -			close(fd);
> -			break;
> -		}
> -		--attr.precise_ip;
> -	}
> -
> -	pattr->precise_ip = attr.precise_ip;
> -}
> -
>  int __perf_evlist__add_default(struct perf_evlist *evlist, bool precise)
>  {
>  	struct perf_evsel *evsel = perf_evsel__new_cycles(precise);
> diff --git a/tools/perf/util/evlist.h b/tools/perf/util/evlist.h
> index 744906dd4887..cb15e3366eba 100644
> --- a/tools/perf/util/evlist.h
> +++ b/tools/perf/util/evlist.h
> @@ -303,8 +303,6 @@ void perf_evlist__to_front(struct perf_evlist *evlist,
>  void perf_evlist__set_tracking_event(struct perf_evlist *evlist,
>  				     struct perf_evsel *tracking_evsel);
>  
> -void perf_event_attr__set_max_precise_ip(struct perf_event_attr *attr);
> -
>  struct perf_evsel *
>  perf_evlist__find_evsel_by_str(struct perf_evlist *evlist, const char *str);
>  
> diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
> index 3bbf73e979c0..f606e8a81f0e 100644
> --- a/tools/perf/util/evsel.c
> +++ b/tools/perf/util/evsel.c
> @@ -295,7 +295,6 @@ struct perf_evsel *perf_evsel__new_cycles(bool precise)
>  	if (!precise)
>  		goto new_event;
>  
> -	perf_event_attr__set_max_precise_ip(&attr);
>  	/*
>  	 * Now let the usual logic to set up the perf_event_attr defaults
>  	 * to kick in when we return and before perf_evsel__open() is called.
> @@ -305,6 +304,8 @@ struct perf_evsel *perf_evsel__new_cycles(bool precise)
>  	if (evsel == NULL)
>  		goto out;
>  
> +	evsel->precise_max = true;
> +
>  	/* use asprintf() because free(evsel) assumes name is allocated */
>  	if (asprintf(&evsel->name, "cycles%s%s%.*s",
>  		     (attr.precise_ip || attr.exclude_kernel) ? ":" : "",
> @@ -1083,7 +1084,7 @@ void perf_evsel__config(struct perf_evsel *evsel, struct record_opts *opts,
>  	}
>  
>  	if (evsel->precise_max)
> -		perf_event_attr__set_max_precise_ip(attr);
> +		attr->precise_ip = 3;
>  
>  	if (opts->all_user) {
>  		attr->exclude_kernel = 1;
> @@ -1749,6 +1750,59 @@ static bool ignore_missing_thread(struct perf_evsel *evsel,
>  	return true;
>  }
>  
> +static void display_attr(struct perf_event_attr *attr)
> +{
> +	if (verbose >= 2) {
> +		fprintf(stderr, "%.60s\n", graph_dotted_line);
> +		fprintf(stderr, "perf_event_attr:\n");
> +		perf_event_attr__fprintf(stderr, attr, __open_attr__fprintf, NULL);
> +		fprintf(stderr, "%.60s\n", graph_dotted_line);
> +	}
> +}
> +
> +static int perf_event_open(struct perf_evsel *evsel,
> +			   pid_t pid, int cpu, int group_fd,
> +			   unsigned long flags)


The patch is ok, but I think the naming of this function is too generic,
so I'm renaming it to:

static int perf_evsel__open_adjust_precise_ip(struct perf_evsel *evsel,
					      pid_t pid, int cpu, int group_fd,
					      unsigned long flags)

Ok?

The perf_evsel__open() code is already complex with that fallback
mechanism, this is just one more way of fallbacking when asking the
kernel for something that may fail.

In fact what happens if the precise_ip that is being asked _is_
supported but sys_perf_event_open() fails because some other
perf_event_attr attribute that is set is not supported? 

I see, it gets it back restored to what the user asked so that the
standard fallback is tried, ok, I'll apply with just the rename for this
function,

Thanks,

- Arnaldo

> +{
> +	int precise_ip = evsel->attr.precise_ip;
> +	int fd;
> +
> +	while (1) {
> +		pr_debug2("sys_perf_event_open: pid %d  cpu %d  group_fd %d  flags %#lx",
> +			  pid, cpu, group_fd, flags);
> +
> +		fd = sys_perf_event_open(&evsel->attr, pid, cpu, group_fd, flags);
> +		if (fd >= 0)
> +			break;
> +
> +		/*
> +		 * Do quick precise_ip fallback if:
> +		 *  - there is precise_ip set in perf_event_attr
> +		 *  - maximum precise is requested
> +		 *  - sys_perf_event_open failed with ENOTSUP error,
> +		 *    which is associated with wrong precise_ip
> +		 */
> +		if (!precise_ip || !evsel->precise_max || (errno != ENOTSUP))
> +			break;
> +
> +		/*
> +		 * We tried all the precise_ip values, and it's
> +		 * still failing, so leave it to standard fallback.
> +		 */
> +		if (!evsel->attr.precise_ip) {
> +			evsel->attr.precise_ip = precise_ip;
> +			break;
> +		}
> +
> +		pr_debug2("\nsys_perf_event_open failed, error %d\n", -ENOTSUP);
> +		evsel->attr.precise_ip--;
> +		pr_debug2("decreasing precise_ip by one (%d)\n", evsel->attr.precise_ip);
> +		display_attr(&evsel->attr);
> +	}
> +
> +	return fd;
> +}
> +
>  int perf_evsel__open(struct perf_evsel *evsel, struct cpu_map *cpus,
>  		     struct thread_map *threads)
>  {
> @@ -1824,12 +1878,7 @@ int perf_evsel__open(struct perf_evsel *evsel, struct cpu_map *cpus,
>  	if (perf_missing_features.sample_id_all)
>  		evsel->attr.sample_id_all = 0;
>  
> -	if (verbose >= 2) {
> -		fprintf(stderr, "%.60s\n", graph_dotted_line);
> -		fprintf(stderr, "perf_event_attr:\n");
> -		perf_event_attr__fprintf(stderr, &evsel->attr, __open_attr__fprintf, NULL);
> -		fprintf(stderr, "%.60s\n", graph_dotted_line);
> -	}
> +	display_attr(&evsel->attr);
>  
>  	for (cpu = 0; cpu < cpus->nr; cpu++) {
>  
> @@ -1841,13 +1890,10 @@ int perf_evsel__open(struct perf_evsel *evsel, struct cpu_map *cpus,
>  
>  			group_fd = get_group_fd(evsel, cpu, thread);
>  retry_open:
> -			pr_debug2("sys_perf_event_open: pid %d  cpu %d  group_fd %d  flags %#lx",
> -				  pid, cpus->map[cpu], group_fd, flags);
> -
>  			test_attr__ready();
>  
> -			fd = sys_perf_event_open(&evsel->attr, pid, cpus->map[cpu],
> -						 group_fd, flags);
> +			fd = perf_event_open(evsel, pid, cpus->map[cpu],
> +					     group_fd, flags);
>  
>  			FD(evsel, cpu, thread) = fd;
>  
> -- 
> 2.17.2

-- 

- Arnaldo

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH] perf tools: Move precise_ip detection into perf_evsel__open
  2019-03-15 14:35                   ` Arnaldo Carvalho de Melo
@ 2019-03-15 14:52                     ` Jiri Olsa
  2019-03-23 15:04                       ` Jiri Olsa
  0 siblings, 1 reply; 30+ messages in thread
From: Jiri Olsa @ 2019-03-15 14:52 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: Andi Kleen, Jiri Olsa, lkml, Ingo Molnar, Namhyung Kim,
	Alexander Shishkin, Peter Zijlstra, Jonas Rabenstein,
	Nageswara R Sastry, Ravi Bangoria

On Fri, Mar 15, 2019 at 11:35:04AM -0300, Arnaldo Carvalho de Melo wrote:

SNIP

> > +static void display_attr(struct perf_event_attr *attr)
> > +{
> > +	if (verbose >= 2) {
> > +		fprintf(stderr, "%.60s\n", graph_dotted_line);
> > +		fprintf(stderr, "perf_event_attr:\n");
> > +		perf_event_attr__fprintf(stderr, attr, __open_attr__fprintf, NULL);
> > +		fprintf(stderr, "%.60s\n", graph_dotted_line);
> > +	}
> > +}
> > +
> > +static int perf_event_open(struct perf_evsel *evsel,
> > +			   pid_t pid, int cpu, int group_fd,
> > +			   unsigned long flags)
> 
> 
> The patch is ok, but I think the naming of this function is too generic,
> so I'm renaming it to:
> 
> static int perf_evsel__open_adjust_precise_ip(struct perf_evsel *evsel,
> 					      pid_t pid, int cpu, int group_fd,
> 					      unsigned long flags)
> 
> Ok?

ok

> 
> The perf_evsel__open() code is already complex with that fallback
> mechanism, this is just one more way of fallbacking when asking the
> kernel for something that may fail.
> 
> In fact what happens if the precise_ip that is being asked _is_
> supported but sys_perf_event_open() fails because some other
> perf_event_attr attribute that is set is not supported? 

it's outside the scope of standard feature fallback code,
so we will try it for any possible fallback variant, so:

we will try all possible precise_ip (3,2,1,0) and they will
all fail because of the unsupported attribute - so we will
restore the precise_ip back and continue in standard fallback
code that will eventualy switch that attribute off

> 
> I see, it gets it back restored to what the user asked so that the
> standard fallback is tried, ok, I'll apply with just the rename for this
> function,

thanks,
jirka

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH] perf tools: Move precise_ip detection into perf_evsel__open
  2019-03-15 14:52                     ` Jiri Olsa
@ 2019-03-23 15:04                       ` Jiri Olsa
  2019-03-25 14:53                         ` Arnaldo Carvalho de Melo
  0 siblings, 1 reply; 30+ messages in thread
From: Jiri Olsa @ 2019-03-23 15:04 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: Andi Kleen, Jiri Olsa, lkml, Ingo Molnar, Namhyung Kim,
	Alexander Shishkin, Peter Zijlstra, Jonas Rabenstein,
	Nageswara R Sastry, Ravi Bangoria

On Fri, Mar 15, 2019 at 03:52:25PM +0100, Jiri Olsa wrote:
> On Fri, Mar 15, 2019 at 11:35:04AM -0300, Arnaldo Carvalho de Melo wrote:
> 
> SNIP
> 
> > > +static void display_attr(struct perf_event_attr *attr)
> > > +{
> > > +	if (verbose >= 2) {
> > > +		fprintf(stderr, "%.60s\n", graph_dotted_line);
> > > +		fprintf(stderr, "perf_event_attr:\n");
> > > +		perf_event_attr__fprintf(stderr, attr, __open_attr__fprintf, NULL);
> > > +		fprintf(stderr, "%.60s\n", graph_dotted_line);
> > > +	}
> > > +}
> > > +
> > > +static int perf_event_open(struct perf_evsel *evsel,
> > > +			   pid_t pid, int cpu, int group_fd,
> > > +			   unsigned long flags)
> > 
> > 
> > The patch is ok, but I think the naming of this function is too generic,
> > so I'm renaming it to:
> > 
> > static int perf_evsel__open_adjust_precise_ip(struct perf_evsel *evsel,
> > 					      pid_t pid, int cpu, int group_fd,
> > 					      unsigned long flags)
> > 
> > Ok?
> 
> ok
> 
> > 
> > The perf_evsel__open() code is already complex with that fallback
> > mechanism, this is just one more way of fallbacking when asking the
> > kernel for something that may fail.
> > 
> > In fact what happens if the precise_ip that is being asked _is_
> > supported but sys_perf_event_open() fails because some other
> > perf_event_attr attribute that is set is not supported? 
> 
> it's outside the scope of standard feature fallback code,
> so we will try it for any possible fallback variant, so:
> 
> we will try all possible precise_ip (3,2,1,0) and they will
> all fail because of the unsupported attribute - so we will
> restore the precise_ip back and continue in standard fallback
> code that will eventualy switch that attribute off
> 
> > 
> > I see, it gets it back restored to what the user asked so that the
> > standard fallback is tried, ok, I'll apply with just the rename for this
> > function,
> 
> thanks,
> jirka

ping, there's rebased version in my perf/fixes branch

jirka

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH] perf tools: Move precise_ip detection into perf_evsel__open
  2019-03-23 15:04                       ` Jiri Olsa
@ 2019-03-25 14:53                         ` Arnaldo Carvalho de Melo
  0 siblings, 0 replies; 30+ messages in thread
From: Arnaldo Carvalho de Melo @ 2019-03-25 14:53 UTC (permalink / raw)
  To: Jiri Olsa
  Cc: Andi Kleen, Jiri Olsa, lkml, Ingo Molnar, Namhyung Kim,
	Alexander Shishkin, Peter Zijlstra, Jonas Rabenstein,
	Nageswara R Sastry, Ravi Bangoria

Em Sat, Mar 23, 2019 at 04:04:22PM +0100, Jiri Olsa escreveu:
> On Fri, Mar 15, 2019 at 03:52:25PM +0100, Jiri Olsa wrote:
> > On Fri, Mar 15, 2019 at 11:35:04AM -0300, Arnaldo Carvalho de Melo wrote:
> > > The perf_evsel__open() code is already complex with that fallback
> > > mechanism, this is just one more way of fallbacking when asking the
> > > kernel for something that may fail.

> > > In fact what happens if the precise_ip that is being asked _is_
> > > supported but sys_perf_event_open() fails because some other
> > > perf_event_attr attribute that is set is not supported? 

> > it's outside the scope of standard feature fallback code,
> > so we will try it for any possible fallback variant, so:

> > we will try all possible precise_ip (3,2,1,0) and they will
> > all fail because of the unsupported attribute - so we will
> > restore the precise_ip back and continue in standard fallback
> > code that will eventualy switch that attribute off
 
> > > I see, it gets it back restored to what the user asked so that the
> > > standard fallback is tried, ok, I'll apply with just the rename for this
> > > function,
 
> ping, there's rebased version in my perf/fixes branch

Thanks, applied.

- Arnaldo

^ permalink raw reply	[flat|nested] 30+ messages in thread

end of thread, other threads:[~2019-03-25 14:53 UTC | newest]

Thread overview: 30+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-03-05 15:25 [PATCH 0/8] perf tools: Assorted fixes Jiri Olsa
2019-03-05 15:25 ` [PATCH 1/8] perf c2c: Fix c2c report for empty numa node Jiri Olsa
2019-03-09 20:05   ` [tip:perf/urgent] " tip-bot for Jiri Olsa
2019-03-05 15:25 ` [PATCH 2/8] perf tools: Add error path into hist_entry__init Jiri Olsa
2019-03-09 20:06   ` [tip:perf/urgent] perf hist: " tip-bot for Jiri Olsa
2019-03-05 15:25 ` [PATCH 3/8] perf hist: Fix memory leak of srcline Jiri Olsa
2019-03-09 20:07   ` [tip:perf/urgent] " tip-bot for Jiri Olsa
2019-03-05 15:25 ` [PATCH 4/8] perf tools: Read and store caps/max_precise in perf_pmu Jiri Olsa
2019-03-09 20:07   ` [tip:perf/urgent] " tip-bot for Jiri Olsa
2019-03-05 15:25 ` [PATCH 5/8] perf tools: Get precise_ip from the pmu config Jiri Olsa
2019-03-05 16:13   ` Andi Kleen
2019-03-05 16:28     ` Jiri Olsa
2019-03-05 16:40       ` Andi Kleen
2019-03-07 15:35         ` [PATCHv2 " Jiri Olsa
2019-03-07 16:51           ` Andi Kleen
2019-03-07 22:32             ` Jiri Olsa
2019-03-14 14:01             ` Jiri Olsa
2019-03-14 15:49               ` Andi Kleen
2019-03-15 12:15                 ` [PATCH] perf tools: Move precise_ip detection into perf_evsel__open Jiri Olsa
2019-03-15 14:05                   ` Andi Kleen
2019-03-15 14:35                   ` Arnaldo Carvalho de Melo
2019-03-15 14:52                     ` Jiri Olsa
2019-03-23 15:04                       ` Jiri Olsa
2019-03-25 14:53                         ` Arnaldo Carvalho de Melo
2019-03-05 15:25 ` [PATCH 6/8] perf tools: Probe for precise_ip with simple attr Jiri Olsa
2019-03-09 20:08   ` [tip:perf/urgent] perf evsel: " tip-bot for Jiri Olsa
2019-03-05 15:25 ` [PATCH 7/8] perf tools: Fix double free in perf_data__close Jiri Olsa
2019-03-09 20:09   ` [tip:perf/urgent] perf session: " tip-bot for Jiri Olsa
2019-03-05 15:25 ` [PATCH 8/8] perf tools: Force perf_data__open|close zero data->file.path Jiri Olsa
2019-03-09 20:09   ` [tip:perf/urgent] perf data: " tip-bot for Jiri Olsa

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).